This repository has been archived by the owner on Mar 31, 2023. It is now read-only.

Convert the database back to RDF #12

Closed
KonradHoeffner opened this issue Jan 19, 2021 · 13 comments

@KonradHoeffner
Contributor

KonradHoeffner commented Jan 19, 2021

Try Ontop

Try Sparqlify

  • .deb package links are dead right now, created an issue
  • installing via Maven failed, created an issue and spoke with Claus Stadler, the maintainer

Ontop was ultimately used successfully.

@KonradHoeffner KonradHoeffner self-assigned this Jan 19, 2021
@KonradHoeffner KonradHoeffner added the enhancement New feature or request label Jan 19, 2021
@KonradHoeffner
Contributor Author

KonradHoeffner commented Mar 3, 2021

Tables

Citation Tables

citation Table Columns

  • suffix
  • swp_suffix (uses the view swp_citation_rdf)
  • label
  • comment
  • type
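The actual conversion was done with Ontop. Purely as an illustration of the row-to-triple idea, the following minimal sketch turns one citation row into N-Triples lines. The base IRI, the `citationOf` property, and the choice to map `type` to `rdf:type` are all hypothetical placeholders, not the project's real vocabulary or mapping.

```python
# All IRIs below under BASE are hypothetical placeholders, not the project's real vocabulary.
BASE = "http://example.org/hito/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
CITATION_OF = BASE + "citationOf"  # hypothetical property linking a citation to its product


def escape_literal(text):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def citation_row_to_ntriples(row):
    """Turn one citation row (a dict keyed by column name) into N-Triples lines."""
    subject = f"<{BASE}{row['suffix']}>"
    lines = [
        f"{subject} <{RDF_TYPE}> <{BASE}{row['type']}> .",
        f"{subject} <{CITATION_OF}> <{BASE}{row['swp_suffix']}> .",
        f'{subject} <{RDFS_LABEL}> "{escape_literal(row["label"])}" .',
    ]
    if row.get("comment"):  # the comment column is treated as optional in this sketch
        lines.append(f'{subject} <{RDFS_COMMENT}> "{escape_literal(row["comment"])}" .')
    return lines
```

Emitting one triple per line keeps the output easy to sort and compare line by line.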

citation_has_classified Table Columns

  • citation_suffix
  • classified_suffix
  • properly construct the property based on the type of the citation

Uses the view citation_classified_rdf.

Softwareproduct Tables

  • softwareproduct

  • swp_has_child

  • swp_has_client

  • swp_has_programminglibrary

  • swp_has_language

  • swp_has_interoperabilitystandard

  • swp_has_programminglanguage

  • swp_has_operatingsystem

  • swp_has_license

  • swp_has_databasesystem

  • feature_supports_function

Other Source of Truth

These tables are not converted: they are not edited in the database frontend, since their source of truth resides elsewhere.

@KonradHoeffner
Contributor Author

All the tables are converted now, so the data can now be validated and fed back into the SPARQL endpoint.

@KonradHoeffner
Contributor Author

KonradHoeffner commented Mar 3, 2021

feature_supports_function values are already present in the endpoint from the catalogues and the connection is on the classified level, not the citation level. I think we don't need that table at all as its source of truth is the catalogues. Spun out into its own issue #18.

@KonradHoeffner
Contributor Author

KonradHoeffner commented Mar 3, 2021

Next Step: Diff between endpoint and converted database data, starting with softwareproduct.

@KonradHoeffner
Contributor Author

KonradHoeffner commented Mar 4, 2021

Diff of citations contains 2135 lines. Minimize.

| date | time | action | different lines | endpoint only | database only | common |
|---|---|---|---|---|---|---|
| 2021-03-05 | | | 2474 | | | |
| 2021-03-05 | | | 2457 | | | |
| 2021-03-05 | 11:34 | | 1759 | | | |
| 2021-03-05 | 11:59 | update citation_has_classified from SPARQL endpoint | 1442 | | | |
| 2021-03-05 | 13:43 | exclude studies from the diff SPARQL query | 1371 | | | |
| 2021-03-10 | 14:34 | | | 60 | 1300 | 1030 |
| 2021-03-11 | 15:13 | rename Abott-IStat to AbbotIStat | | 50 | 1290 | 1040 |
| 2021-03-11 | 15:23 | rename AgfaPacsImpax features | | 45 | 1285 | 1045 |
| 2021-03-11 | 15:32 | add agfa ris elefante future | | 42 | 1285 | 1048 |
| 2021-03-11 | 15:36 | fix label typo | | 40 | 1283 | 1050 |
| 2021-03-11 | 15:59 | Fix Bahmni base organizational unit classified property. | | 37 | 1283 | 1053 |
| 2021-03-12 | 15:37 | feature supports function citation | | 8 | 1283 | 1082 |
| 2021-04-06 | 13:50 | fix individual.ttl references | | 0 | 1340 | 1089 |
| 2021-04-06 | 13:59 | add two citations | | 0 | 1334 | 1095 |
| 2021-04-06 | 14:11 | some products.ttl | | 0 | 1327 | 1102 |
| 2021-04-06 | 18:41 | db.ttl and gnuhealth.ttl | | 16 | 1186 | 1598 |
| 2021-04-06 | 19:17 | more products and citations | | 16 | 1183 | 1654 |
| 2021-04-07 | 10:43 | products, citations, fix citation suffixes in db | | 380 | 1544 | 1351 |
| 2021-04-07 | 15:23 | | | 138 | 306 | 2586 |
| 2021-04-08 | 11:55 | merge the rest of the products and citations | | 8 | 2 | 2944 |
| 2021-04-19 | 10:50 | investigate, why so many new triples | | 23 | 1867 | 2928 |
| 2021-04-19 | 15:04 | merged most of the new triples | | 33 | 47 | 4748 |
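The three counts tracked in the log (endpoint only, database only, common) can be computed with plain set operations. This is a minimal sketch, not the procedure actually used in this issue. It assumes both sides are exported as N-Triples with one triple per line, serialized identically, and without blank nodes; the function name `diff_counts` is made up for illustration.

```python
def diff_counts(endpoint_lines, database_lines):
    """Count triples that are only in the endpoint, only in the database, or in both.

    Both arguments are iterables of N-Triples lines; blank lines are ignored
    and duplicates collapse because the lines are collected into sets.
    """
    endpoint = {line.strip() for line in endpoint_lines if line.strip()}
    database = {line.strip() for line in database_lines if line.strip()}
    return {
        "endpoint only": len(endpoint - database),
        "database only": len(database - endpoint),
        "common": len(endpoint & database),
    }


if __name__ == "__main__":
    # Tiny made-up example; real input would be read from two exported files.
    endpoint = ["<a> <b> <c> .", "<a> <b> <d> ."]
    database = ["<a> <b> <d> .", "<a> <b> <e> ."]
    print(diff_counts(endpoint, database))
```

A set-based comparison ignores line order, so neither export needs to be sorted first, unlike a line-based textual diff.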

@KonradHoeffner
Contributor Author

I uploaded the missing Bahmni features to the database, but the diff is still large. Investigate whether the upload worked.

KonradHoeffner added a commit that referenced this issue Mar 5, 2021
KonradHoeffner added a commit that referenced this issue Mar 10, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 7, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 7, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 7, 2021
…y. Add Bahmni to swp.ttl. Still need to remove bahmni.ttl but check first if something is missing. Part of hitontology/database#12.
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 7, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 7, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 7, 2021
@KonradHoeffner
Contributor Author

KonradHoeffner commented Apr 8, 2021

Mass renaming primary keys in PSQL is possible with queries such as the following:

UPDATE "public"."citation"
SET "suffix"=REPLACE("suffix",'RIS','Ris')
WHERE "suffix" LIKE 'RIS%'

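Postgres keeps foreign key references intact by updating the referencing rows as well. This requires the foreign keys to be declared with ON UPDATE CASCADE; otherwise the update is rejected while referencing rows exist.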

See https://www.postgresqltutorial.com/postgresql-replace/.
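The cascading rename can be reproduced in a small sketch. It uses SQLite through Python's `sqlite3` module instead of PostgreSQL so that it runs without a server. The foreign key declaration with `ON UPDATE CASCADE` and the sample rows are assumptions for illustration, not the project's actual schema.

```python
import sqlite3


def rename_demo():
    """Rename a primary key and return the referencing rows afterwards."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE citation (suffix TEXT PRIMARY KEY);
        -- Assumed declaration: the cascade is what propagates the rename.
        CREATE TABLE citation_has_classified (
            citation_suffix TEXT REFERENCES citation(suffix) ON UPDATE CASCADE,
            classified_suffix TEXT
        );
        INSERT INTO citation VALUES ('RISExample');
        INSERT INTO citation_has_classified VALUES ('RISExample', 'SomeClassified');
    """)
    # Same shape as the query above, applied to the made-up sample row.
    conn.execute(
        "UPDATE citation SET suffix = REPLACE(suffix, 'RIS', 'Ris') "
        "WHERE suffix LIKE 'RIS%'"
    )
    rows = conn.execute(
        "SELECT citation_suffix FROM citation_has_classified"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(rename_demo())
```

Note that `REPLACE` substitutes every occurrence of `RIS` in a matching suffix, not only the leading one; the `WHERE` clause merely selects which rows are touched.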

KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 8, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 8, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 8, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 8, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 8, 2021
KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 8, 2021
@KonradHoeffner
Contributor Author

KonradHoeffner commented Apr 8, 2021

The citations and software products are now fully merged. Closing this issue. If anything important crops up, please reopen.

@KonradHoeffner
Contributor Author

Suddenly there are 1865 new triples in the database. Where do they come from? Investigate and add them.

@KonradHoeffner
Contributor Author

KonradHoeffner commented Apr 19, 2021

UPDATE "public"."citation"
SET "suffix"=REPLACE("suffix",'LIMS','Lims')
WHERE "suffix" LIKE 'BikaLIMS%'

@KonradHoeffner
Contributor Author

KonradHoeffner commented Apr 19, 2021

Remove all special characters from URIs.

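UPDATE "public"."citation"
SET "suffix"=REGEXP_REPLACE("suffix",'[^\w]','','g')
WHERE "suffix" ~ '.*[^\w].*'

The 'g' flag makes REGEXP_REPLACE replace every match. The original query, without that flag, had to be run twice because it replaces only the first occurrence per row on each run.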
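The difference between replacing only the first match and replacing all matches can be illustrated outside the database. This is a minimal sketch using Python's `re` module, assuming ASCII semantics for `\w` (PostgreSQL's `\w` depends on the database locale, so results for non-ASCII letters may differ); the function names and the sample suffix are made up.

```python
import re

# re.ASCII restricts \w to [A-Za-z0-9_]; this is an assumption of the sketch.
NON_WORD = re.compile(r"[^\w]", flags=re.ASCII)


def strip_first_special(suffix):
    """Remove only the first non-word character, like a regex replace without a global flag."""
    return NON_WORD.sub("", suffix, count=1)


def strip_special(suffix):
    """Remove every non-word character, like a regex replace with a global flag."""
    return NON_WORD.sub("", suffix)


if __name__ == "__main__":
    sample = "Agfa-PACS (Impax)"  # made-up suffix with several special characters
    print(strip_first_special(sample))
    print(strip_special(sample))
```

With more than two special characters in a suffix, even two runs of the first-match variant would not be enough, which is why replacing all matches in one pass is the safer choice.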

@KonradHoeffner
Contributor Author

Replace umlauts:

UPDATE "public"."citation"
SET "suffix"=REPLACE("suffix",'ä','ae')
WHERE "suffix" LIKE '%ä%'

Likewise for ö and ü.
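The three umlaut replacements can also be expressed as a single translation table. This is a minimal sketch in Python rather than SQL; it covers only the lowercase ä, ö and ü mentioned here, and the function name is made up for illustration.

```python
# Translation table for the three lowercase umlauts named in this comment.
# Uppercase umlauts and ß are deliberately not covered by this sketch.
UMLAUTS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def replace_umlauts(suffix):
    """Return the suffix with ä, ö and ü transliterated to ae, oe and ue."""
    return suffix.translate(UMLAUTS)


if __name__ == "__main__":
    print(replace_umlauts("Müller"))
```

Doing all three substitutions in one pass avoids running a separate query per character.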

@KonradHoeffner
Contributor Author

UPDATE "public"."citation"
SET "suffix"=REPLACE("suffix",'RIS','Ris')
WHERE "suffix" LIKE 'RIS%'

KonradHoeffner added a commit to hitontology/ontology that referenced this issue Apr 19, 2021