Figure out a pattern for cutting down noisy validation errors #39

Open
danbri opened this issue Apr 1, 2022 · 3 comments

@danbri
Contributor

danbri commented Apr 1, 2022

I forget the terminology, but the AND/NOT/OR complexity in the schema below serves to suppress errors that are downstream of some more fundamental error. @ericprud et al. have plans for doing this within ShEx in a more standardized way, so that the actual intended shape content doesn't get lost in all the boolean trickery.

PREFIX : <http://schema.org/>
PREFIX validate: <https://google.com/search/validation/valid>

<S1> {
  :url . + %validate:url{console.log('some url checking code here')%}
} AND {
  :datePublished . ? %validate:date-time{console.log('some datetime checking code here')%}
} AND {
  :claimReviewed . 
} AND {
  :itemReviewed {
    a [:CreativeWork]
  } AND (
    NOT {
      a [:CreativeWork]
    } OR {
      :author (
        {
          a [:Organization]
        } OR {
          a [:Person]
        }
      ) AND (
        NOT (
          {
            a [:Organization]
          } OR {
            a [:Person]
          }
        ) OR {
          :name . ?
        }
      )?
    } AND {
      :datePublished . ? %validate:date-time{console.log('some datetime checking code here')%}
    }
  )?
} AND {
  :author (
    {
      a [:Organization]
    } OR {
      a [:Person]
    }
  ) AND (
    NOT (
      {
        a [:Organization]
      } OR {
        a [:Person]
      }
    ) OR (
      {
        :name . 
      } OR {
        :url . 
      }
    ) AND {
      :url . * %validate:url{console.log('some url checking code here')%}
    }
  )?
} AND {
  :reviewRating {
    a [:Rating]
  } AND (
    NOT {
      a [:Rating]
    } OR {
      :alternateName . 
    } AND (
      (
        NOT {
          :name . 
        } OR {
          :alternateName . ?
        }
      ) AND (
        NOT (
          NOT {
            :name . 
          }
        ) OR {
          :alternateName . +
        }
      )
    ) AND NOT (
      {
        :alternateName . 
      } AND {
        :name . 
      }
    ) AND (
      NOT (
        (
          {
            :ratingValue . 
          } OR {
            :bestRating . 
          } OR {
            :worstRating . 
          }
        ) AND NOT (
          {
            :ratingValue /-1/ 
          } AND {
            :bestRating /-1/ 
          } AND {
            :worstRating /-1/ 
          }
        )
      ) OR {
        :ratingValue /([0-9]+[\.,]?[0-9]*)\/([0-9]+[\.,]?[0-9]*)/  OR /([0-9]+[\.,]?[0-9]*)%/  OR /([0-9]+[\.,]?[0-9]*)/ +
      } AND (
        NOT {
          :ratingValue /([0-9]+[\.,]?[0-9]*)/ +
        } OR {
          
        } %validate:rating%
      )
    )
  )+
}
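
The recurring idiom in the schema above is `A AND (NOT A OR B)`: the guard `A` (for example `a [:Rating]`) is asserted once, and the dependent constraints `B` only count when the guard holds, so a node that fails `A` does not also report every failure inside `B`. Below is a minimal Python sketch of that reporting behaviour. It is not ShEx and does not reflect how any ShEx engine is implemented; `check_guarded`, the dict-based nodes, and the two example checks are hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, List

Node = Dict[str, object]
# A check returns a list of error messages; an empty list means the node conforms.
Check = Callable[[Node], List[str]]


def check_guarded(guard: Check, dependent: Check) -> Check:
    """Model `guard AND (NOT guard OR dependent)` from an error-reporting view."""
    def run(node: Node) -> List[str]:
        guard_errors = guard(node)
        if guard_errors:
            # Guard failed: `NOT guard` holds, so the second conjunct is
            # satisfied and the dependent checks contribute no errors.
            return guard_errors
        # Guard passed: only now are the dependent errors meaningful.
        return dependent(node)
    return run


def is_rating(node: Node) -> List[str]:
    # Hypothetical stand-in for `a [:Rating]`.
    return [] if node.get("type") == "Rating" else ["expected type Rating"]


def has_rating_value(node: Node) -> List[str]:
    # Hypothetical stand-in for `:ratingValue .`.
    return [] if "ratingValue" in node else ["missing ratingValue"]


review_rating = check_guarded(is_rating, has_rating_value)

print(review_rating({"type": "Thing"}))                          # ['expected type Rating']
print(review_rating({"type": "Rating"}))                         # ['missing ratingValue']
print(review_rating({"type": "Rating", "ratingValue": "4/5"}))   # []
```

Logically `A AND (NOT A OR B)` is equivalent to `A AND B`; the extra boolean structure matters only for which failures get reported, which is why it is hard to read in the schema itself.
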
@danbri
Contributor Author

danbri commented Apr 1, 2022

See https://github.com/shexSpec/shex/blob/master/status.md under "discriminators".

As described by @ericprud back in 2020:

the idea is that if you're validating something as a CreativeWork and it has a type of Recipe but doesn't actually satisfy Recipe, ShEx won't drown you in errors about every kind of CreativeWork that it fails, but instead it will just tell you why it doesn't satisfy Recipe
(and yes, danbri, i know it's not "failing", but you get the idea)
if y'all think the above description is missing some use case or nuance, let us know

Draft spec and examples: https://hackmd.io/1fpnYHxoSYOQhvYxHXddjA
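
To make the quoted idea concrete, here is a small Python sketch of discriminator-style reporting: when a node declares a type that has its own shape, only that shape's errors are returned, instead of the failures for every alternative. This is not the draft ShEx syntax or semantics from the linked document; `SHAPES_BY_TYPE`, `validate_creative_work`, and the choice of required properties are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List

Node = Dict[str, object]
Check = Callable[[Node], List[str]]  # empty list means the node conforms


def check_recipe(node: Node) -> List[str]:
    # Illustrative requirement only, not Google's actual Recipe rules.
    return [] if "recipeIngredient" in node else ["missing recipeIngredient"]


def check_article(node: Node) -> List[str]:
    # Illustrative requirement only.
    return [] if "headline" in node else ["missing headline"]


# Hypothetical table mapping a declared type (the discriminator) to its shape.
SHAPES_BY_TYPE: Dict[str, Check] = {
    "Recipe": check_recipe,
    "Article": check_article,
}


def validate_creative_work(node: Node) -> List[str]:
    declared = node.get("type")
    shape = SHAPES_BY_TYPE.get(declared) if isinstance(declared, str) else None
    if shape is not None:
        # Discriminator matched: report only why this one shape is not satisfied.
        return [f"{declared}: {error}" for error in shape(node)]
    # No usable discriminator: fall back to trying every alternative,
    # which is the noisy behaviour the issue wants to avoid.
    errors: List[str] = []
    for name, candidate in SHAPES_BY_TYPE.items():
        candidate_errors = candidate(node)
        if not candidate_errors:
            return []
        errors.extend(f"{name}: {error}" for error in candidate_errors)
    return errors


print(validate_creative_work({"type": "Recipe", "name": "Soup"}))
print(validate_creative_work({"name": "Untyped thing"}))
```

The first call yields a single Recipe error; the second, with no declared type, yields one error per alternative shape.
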

@danbri
Contributor Author

danbri commented Sep 14, 2022

ShEx service-specific shape examples for Recipe and Dataset, https://github.com/google/schemarama/tree/main/demo/validation/shex/specific/ServiceB

@danbri
Contributor Author

danbri commented Sep 14, 2022

We want the bit of ShEx that checks for 'name' property in the shape to be able to link to something like https://developers.google.com/search/docs/advanced/structured-data/recipe#the_bit_of_the_docs_that_talks_about_name_property.

Realistically, this may initially need to be done via out-of-band info rather than assuming that all the ShEx from Google carries such details. That is why shapepath / IDs are an issue.
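
One way the out-of-band approach could look, sketched in Python: a separate lookup table keyed by some stable identifier for "the part of the shape that checks this property", consulted when an error is reported. The key format (`"Recipe/name"`), the `DOC_LINKS` table, and the error dict layout are all hypothetical; the URL is the placeholder from the comment above, not a real anchor.

```python
from typing import Dict, Optional

# Hypothetical out-of-band table, maintained separately from the ShEx itself.
# Keys are made-up "<shape>/<property>" identifiers standing in for whatever
# shapepath or ID scheme ends up being used.
DOC_LINKS: Dict[str, str] = {
    "Recipe/name": (
        "https://developers.google.com/search/docs/advanced/structured-data/"
        "recipe#the_bit_of_the_docs_that_talks_about_name_property"
    ),
}


def doc_link_for(shape: str, prop: str) -> Optional[str]:
    """Return the documentation link for a shape/property pair, if one is known."""
    return DOC_LINKS.get(f"{shape}/{prop}")


def annotate(error: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the error, with a `docs` link added when one is available."""
    annotated = dict(error)
    link = doc_link_for(error["shape"], error["property"])
    if link is not None:
        annotated["docs"] = link
    return annotated


print(annotate({"shape": "Recipe", "property": "name", "message": "missing name"}))
print(annotate({"shape": "Recipe", "property": "image", "message": "missing image"}))
```

The sketch only works if the validator can emit a stable identifier for the failing constraint, which is the shapepath / ID question raised above.
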
