Restore demos, document their config API and how to set up static-hosted and docker installations #34

Open
3 tasks
danbri opened this issue Mar 18, 2022 · 2 comments

danbri commented Mar 18, 2022

We should have documentation showing how to set this up for (a) static serving (b) docker serving, and a live installation of at least one of these.

Background

The original demos used server side processing for 3 things:

  • headless browser, so that JS-injected markup can be extracted
  • URL fetching, since a web page can't retrieve arbitrary URLs
  • Some simple Python/Flask code to map URL paths to file paths, generate HTML from templates, and JSON-ify a representation of the examples.

The whole thing can be run as a docker container but it would be good to have a simplified pure static version that could be run by anyone very easily. To do this:

  • Convert the file-mapping and JSONifying steps into pre-publication file management, i.e. serve the SHACL, ShEx, Turtle and JSON files the same way we serve up JS, CSS and HTML statically.
  • Make a simple HTML page from templates/scc.html etc.
  • Document the app's configuration API, i.e. the files it needs to load, which are currently served by the app.py and basic_app.py Flask Python server examples.

I made a first attempt at the documentation below.

Draft documentation

Config API

SchemaramaJS configures itself with various files loaded from relative URIs:

  • /shacl/shapes - an aggregation of SHACL shape definitions.
  • /shacl/subclasses - a Turtle file listing rdfs:subClassOf relationships.
  • /shex/shapes - an aggregation of ShEx shape definitions.
  • /hierarchy - a JSON description that groups shape definitions in a hierarchy of associated services/projects.
  • /services/map - JSON associating services from hierarchy with shapes and patterns from SHACL and ShEx.
  • /tests - a JSON list of tests, where each is a piece of text using JSON-LD, RDFa or Microdata.

An installation will also typically serve icons associated with the hierarchy of services, e.g. the initial demo uses:

  • /static/images/services/Schema.png
  • /static/images/services/ServiceA.png
  • /static/images/services/ServiceB.png
  • /static/images/services/ServiceBProduct1.png
  • /static/images/services/ServiceBProduct2.png
  • /static/images/services/ServiceBProduct3.png
  • /static/images/services/ServiceC.png
  • /static/images/services/ServiceD.png
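
One detail matters for static hosting: the loader in core.js requests these files without a leading slash (`shacl/shapes`, not `/shacl/shapes`), so they resolve relative to the page rather than the site root. The following is a minimal sketch of that resolution, assuming a hypothetical deployment at `https://example.org/demo/`; the function name `configUrls` is made up for illustration.

```javascript
// Config paths as listed above, written without a leading slash,
// which is how the loader in core.js requests them.
const CONFIG_PATHS = [
  "shacl/shapes",
  "shacl/subclasses",
  "shex/shapes",
  "hierarchy",
  "services/map",
  "tests",
];

// Resolve each config path against the URL of the page hosting the demo.
function configUrls(pageUrl) {
  return CONFIG_PATHS.map((path) => new URL(path, pageUrl).href);
}

// Hypothetical deployment under a sub-path.
console.log(configUrls("https://example.org/demo/index.html"));
```

With that page URL the first entry resolves to `https://example.org/demo/shacl/shapes`. Written with a leading slash, the same path would resolve to `https://example.org/shacl/shapes`, which is worth keeping in mind when hosting under a project sub-path.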

Config details

The original demo shows a mix of shapes - some basic structures from Schema.org's definitions, and some associated with example online services. SchemaramaJS will try to load these upon initialization.

/shacl/shapes

This can be quite large, e.g. looking at headers using

curl -s -D - -o /dev/null http://127.0.0.1:3002/shacl/shapes

Content-Disposition: inline; filename=full.shacl
Content-Type: application/octet-stream
Content-Length: 223194

We get a large dump of SHACL in RDF/Turtle syntax.

/shex/shapes

Here we are served (in the demo configuration):

HTTP/1.0 200 OK
Content-Disposition: inline; filename=full.shexj
Content-Type: application/octet-stream
Content-Length: 633692
Last-Modified: Wed, 09 Mar

As with SHACL, we get a large dump, this time of ShEx in ShExJ syntax.

/shacl/subclasses

curl -s -D - http://127.0.0.1:3002/shacl/subclasses

This data file reproduces rdfs:subClassOf assertions from relevant schemas. It is in Turtle format, and is not tightly linked to SHACL, except that only the SHACL validator uses it; it is not passed to the ShEx validator during setup. In principle it could be used for other purposes, and we could change the file/URL path accordingly.

In the demo configuration, it contains every subtype-supertype relationship defined in schema.org, so a type can be listed with several supertypes. Here are the lines relating to the ComedyClub type:

curl -s -D - http://127.0.0.1:3002/shacl/subclasses | grep ComedyClub

schema:ComedyClub rdfs:subClassOf schema:Place .
schema:ComedyClub rdfs:subClassOf schema:EntertainmentBusiness .
schema:ComedyClub rdfs:subClassOf schema:Organization .
schema:ComedyClub rdfs:subClassOf schema:LocalBusiness .
schema:ComedyClub rdfs:subClassOf schema:Thing .
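
To make the shape of this file concrete, here is a small sketch that groups supertypes by type. It assumes the one-triple-per-line layout shown in the excerpt and is not a general Turtle parser; `supertypesByType` is a made-up helper, not part of SchemaramaJS.

```javascript
// Collect supertypes per type from one-triple-per-line Turtle, as in the
// excerpt above. Lines in any other form are ignored.
function supertypesByType(turtle) {
  const triple = /^\s*(\S+)\s+rdfs:subClassOf\s+(\S+)\s*\.\s*$/;
  const result = new Map();
  for (const line of turtle.split("\n")) {
    const match = triple.exec(line);
    if (!match) continue; // skip prefixes, blank lines, other statements
    const [, subtype, supertype] = match;
    if (!result.has(subtype)) result.set(subtype, []);
    result.get(subtype).push(supertype);
  }
  return result;
}

const sample = `
schema:ComedyClub rdfs:subClassOf schema:Place .
schema:ComedyClub rdfs:subClassOf schema:EntertainmentBusiness .
schema:ComedyClub rdfs:subClassOf schema:Organization .
schema:ComedyClub rdfs:subClassOf schema:LocalBusiness .
schema:ComedyClub rdfs:subClassOf schema:Thing .
`;

console.log(supertypesByType(sample).get("schema:ComedyClub"));
```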

/hierarchy

SchemaramaJS loads a JSON configuration file defining a hierarchy of services/applications that can be associated with the various validations being checked. In turn this file can include image URLs.

The demo config is:

{
  "nested": [
    {
      "service": "ServiceA"
    },
    {
      "nested": [
        {
          "service": "ServiceBProduct1"
        },
        {
          "service": "ServiceBProduct2"
        },
        {
          "service": "ServiceBProduct3"
        }
      ],
      "service": "ServiceB"
    },
    {
      "service": "ServiceC"
    },
    {
      "service": "ServiceD"
    }
  ],
  "service": "Schema"
}
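
As an illustration of how this structure can be consumed, the sketch below walks the demo hierarchy depth-first and pairs each service with an icon path. The `static/images/services/<service>.png` convention is an assumption inferred from the demo's image list, and `listServices` and `iconPath` are hypothetical helpers rather than functions from core.js.

```javascript
const hierarchy = {
  service: "Schema",
  nested: [
    { service: "ServiceA" },
    {
      service: "ServiceB",
      nested: [
        { service: "ServiceBProduct1" },
        { service: "ServiceBProduct2" },
        { service: "ServiceBProduct3" },
      ],
    },
    { service: "ServiceC" },
    { service: "ServiceD" },
  ],
};

// Depth-first walk returning each service with its nesting depth.
function listServices(node, depth = 0) {
  const children = node.nested || [];
  return [
    { service: node.service, depth },
    ...children.flatMap((child) => listServices(child, depth + 1)),
  ];
}

// Assumed icon convention, inferred from the demo's image paths.
function iconPath(service) {
  return `static/images/services/${service}.png`;
}

for (const { service, depth } of listServices(hierarchy)) {
  console.log(`${"  ".repeat(depth)}${service} -> ${iconPath(service)}`);
}
```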

/services/map

SchemaramaJS also uses a JSON service mapping file, which associates validation shapes (named in common across
SHACL and ShEx) with the services described in /hierarchy:

{
  "ValidSchemaAboutPage": "Schema",
  "ValidSchemaAcceptAction": "Schema",
  "ValidSchemaAccommodation": "Schema",
  "ValidSchemaAccountingService": "Schema",
  "ValidSchemaAchieveAction": "Schema",
  "ValidSchemaAction": "Schema",
  "ValidSchemaActionAccessSpecification": "Schema",
  "ValidSchemaActionStatusType": "Schema",
  "ValidSchemaActivateAction": "Schema",
  "ValidSchemaAddAction": "Schema",
  "ValidSchemaAdministrativeArea": "Schema",
  "ValidSchemaAdultEntertainment": "Schema",
  "ValidSchemaAggregateOffer": "Schema",
  "ValidSchemaAgreeAction": "Schema",
  "ValidSchemaAirline": "Schema",
  "ValidSchemaAirport": "Schema", [...etc etc...]
  "ValidSchemaWriteAction": "Schema",
  "ValidSchemaXPathType": "Schema",
  "ValidSchemaZoo": "Schema",
  "ValidServiceBRecipe": "ServiceB",
  "ValidServiceBProduct1Recipe": "ServiceBProduct1",
  "ValidServiceBProduct2Recipe": "ServiceBProduct2",
  "ValidServiceBProduct3Recipe": "ServiceBProduct3",
  "ValidServiceARecipe": "ServiceA",
  "ValidServiceDRecipe": "ServiceD",
  "ValidServiceCRecipe": "ServiceC" 
}
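
Because the map refers to services by name, a typo in either file would silently detach a shape from the hierarchy. The sketch below shows one possible pre-publication check, using abbreviated versions of the two files; the helper names and the deliberately wrong "ServiceZ" entry are made up for illustration.

```javascript
// Collect every service name that appears in a /hierarchy document.
function serviceNames(node, names = new Set()) {
  names.add(node.service);
  for (const child of node.nested || []) serviceNames(child, names);
  return names;
}

// Return the shape names whose service is missing from the hierarchy.
function unmappedShapes(shapeToService, hierarchy) {
  const known = serviceNames(hierarchy);
  return Object.entries(shapeToService)
    .filter(([, service]) => !known.has(service))
    .map(([shape]) => shape);
}

// Abbreviated versions of the demo files; "ServiceZ" is a deliberate error.
const hierarchy = { service: "Schema", nested: [{ service: "ServiceA" }] };
const shapeToService = {
  ValidSchemaZoo: "Schema",
  ValidServiceARecipe: "ServiceA",
  ValidServiceZRecipe: "ServiceZ",
};

console.log(unmappedShapes(shapeToService, hierarchy));
```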

/tests

Finally, SchemaramaJS loads a collection of example tests. Each is an appropriately escaped text value,
structured in a very plain JSON file:

{ 
  "tests": [ 
     "escaped markup here e.g. json-ld...", 
     "second example here e.g. microdata..." 
  ]
}

No additional metadata is included; SchemaramaJS will try to figure out how to parse it.
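
The escaping does not have to be done by hand. As a sketch, a pre-publication step could build the file with `JSON.stringify`, which escapes quotes and newlines in each example. The three markup snippets here are invented placeholders, not the demo's real tests.

```javascript
// Raw markup examples (invented); quotes and newlines need escaping in JSON.
const examples = [
  `{"@context": "https://schema.org", "@type": "Recipe", "name": "Soup"}`,
  `<div itemscope itemtype="https://schema.org/Recipe">
  <span itemprop="name">Soup</span>
</div>`,
  `<div vocab="https://schema.org/" typeof="Recipe">
  <span property="name">Soup</span>
</div>`,
];

// JSON.stringify performs the escaping; the result is the /tests body.
const testsJson = JSON.stringify({ tests: examples }, null, 2);
console.log(testsJson);
```

Parsing the output with `JSON.parse` gives back the original strings unchanged, so the round trip is safe for markup containing quotes.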

Config-using Validator code

These files are all loaded by static/js/scc/core.js:

$(document).ready(async () => {
    $.getJSON("https://api.ipify.org/?format=json", function(e) {
        ip = e.ip;
    });
    await $.get(`shacl/shapes`, (res) => shaclShapes = res);
    await $.get(`shacl/subclasses`, (res) => subclasses = res);
    await $.get(`shex/shapes`, (res) => shexShapes = JSON.parse(res));
    await $.get(`hierarchy`, (res) => {
        hierarchy = res;
        constructHierarchySelector(hierarchy, 0);
    });
    await $.get(`services/map`, (res) => shapeToService = res);
    $.get(`tests`, (res) => initTests(res.tests));
    shexValidator = new schemarama.ShexValidator(shexShapes, {annotations: annotations});
    shaclValidator = new schemarama.ShaclValidator(shaclShapes, {
        annotations: annotations,
        subclasses: subclasses,
    });
});

danbri self-assigned this Mar 18, 2022

danbri commented Mar 18, 2022

Started a rough script that copies things into the right place in an ephemeral "_serving" folder.
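
For trying such a folder locally, a very small Node.js server is one option. This sketch is not the script mentioned above. It assumes the folder mirrors the URL paths (a file at `_serving/hierarchy`, and so on) and an `index.html` entry page. Since the config paths have no file extension, it assigns their media types explicitly. `shex/shapes` is left as plain text on purpose: jQuery parses anything labelled as JSON before calling the callback, and core.js already calls `JSON.parse` on that response itself.

```javascript
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

// Media types for the extension-less config paths (assumed choices).
const CONFIG_TYPES = {
  "/shacl/shapes": "text/turtle",
  "/shacl/subclasses": "text/turtle",
  "/shex/shapes": "text/plain", // core.js runs JSON.parse on this itself
  "/hierarchy": "application/json",
  "/services/map": "application/json",
  "/tests": "application/json",
};

const EXTENSION_TYPES = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".css": "text/css",
  ".png": "image/png",
};

function mediaTypeFor(urlPath) {
  return (
    CONFIG_TYPES[urlPath] ||
    EXTENSION_TYPES[path.extname(urlPath)] ||
    "application/octet-stream"
  );
}

// Serve files from `root` (for example the "_serving" folder).
function createStaticServer(root) {
  return http.createServer((req, res) => {
    const { pathname } = new URL(req.url, "http://localhost");
    // Assumed entry page name; adjust to the real HTML file.
    const urlPath = pathname === "/" ? "/index.html" : pathname;
    const filePath = path.join(root, path.normalize(urlPath));
    fs.readFile(filePath, (err, body) => {
      if (err) {
        res.writeHead(404).end("Not found");
        return;
      }
      res.writeHead(200, { "Content-Type": mediaTypeFor(urlPath) }).end(body);
    });
  });
}

// To try it: createStaticServer("_serving").listen(3002);
```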

danbri commented Mar 18, 2022

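Possible diagnosis and fix for this not running: we're using very simple static HTTP servers that aren't sending the right media type headers for things that are in JSON (or any other format, for that matter). Without a JSON Content-Type, $.get hands the callback a plain string rather than a parsed object, so res.tests is undefined.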

I tried

    $.get(`tests`, (res) => { 
        let jres = $.parseJSON(res);    
        initTests(jres.tests)
    });

... in core.js line 39 and it seems to work.

Another gotcha: the demo currently assumes that /tests returns at least 3 tests.
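
A slightly more defensive variant of the `$.parseJSON` workaround is sketched below. It accepts either a parsed object or raw text, so it works whether or not the server labels the response as JSON, and it fails loudly if fewer than 3 tests arrive. `asJson` and `checkTests` are hypothetical helpers, not existing SchemaramaJS functions.

```javascript
// Accept either a parsed object or raw JSON text, depending on whether the
// server labelled the response as JSON.
function asJson(res) {
  return typeof res === "string" ? JSON.parse(res) : res;
}

// Hypothetical guard for the "at least 3 tests" assumption.
function checkTests(res, minimum = 3) {
  const tests = asJson(res).tests;
  if (!Array.isArray(tests) || tests.length < minimum) {
    throw new Error(`expected at least ${minimum} tests from /tests`);
  }
  return tests;
}

// Both forms of response yield the same array.
console.log(checkTests('{"tests": ["a", "b", "c"]}'));
console.log(checkTests({ tests: ["a", "b", "c"] }));
```

In core.js this could be wired in as `$.get("tests", (res) => initTests(checkTests(res)))`. Another option is to force the type on the client with `$.getJSON`, which the same file already uses for a different request.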
