Construct queries for DataIds #2

Closed
holycrab13 opened this issue Oct 12, 2021 · 3 comments

Comments

holycrab13 commented Oct 12, 2021

Starting from the subject of type dataid:Dataset, I want to construct a graph that contains only the subjects linked to that Dataset subject.

For example: all triples whose subject is of type Dataset (?dataset), plus all triples whose subject is ?proof (where ?dataset sec:proof ?proof):

?dataset a dataid:Dataset .
?dataset ?p ?o .

?dataset sec:proof ?proof .
?proof ?q ?r .

Doing this only for ?proof works fine, but it has to be repeated for several other properties (the full CONSTRUCT query is below the example data).

Example data:

{
  "@context" : "http://localhost:3000/system/context.jsonld",
  "@graph" : [
    {
      "@id": "http://localhost:3000/jan/examples",
      "@type": "dataid:Group",
      "title": { "@value" : "Example Group", "@language" : "en" },
      "abstract": { "@value" : "This is an example group for API testing.", "@language" : "en" },
      "description": { "@value" : "This is an example group for API testing.", "@language" : "en" }
    },
    {
      "@id": "http://localhost:3000/jan/examples/dbpedia-ontology-example",
      "@type": "dataid:Artifact"
    },
    {
      "@id": "http://localhost:3000/jan/examples/dbpedia-ontology-example/2021-10-12",
      "@type": "dataid:Version"
    },
    {
      "@id": "http://localhost:3000/jan/examples/dbpedia-ontology-example/2021-10-12#Dataset",
      "@type": "dataid:Dataset",
      "title": { "@value" : "DBpedia Ontology Example", "@language" : "en" },
      "abstract": { "@value" : "This is an example for API testing.", "@language" : "en" },
      "description": { "@value" : "This is an example for API testing.", "@language" : "en" },
      "publisher": "http://localhost:3000/jan#this",
      "group": "http://localhost:3000/jan/examples",
      "artifact": "http://localhost:3000/jan/examples/dbpedia-ontology-example",
      "version": "http://localhost:3000/jan/examples/dbpedia-ontology-example/2021-10-12",
      "hasVersion": "2021-10-12",
      "issued": "2021-10-12T13:19:05Z",
      "license": "http://creativecommons.org/licenses/by/4.0/",
      "distribution": [
        {
          "@id": "http://localhost:3000/jan/examples/dbpedia-ontology-example/2021-10-12#ontology--DEV_type=parsed_sorted.nt",
          "@type": "dataid:SingleFile",
          "issued": "2021-10-12T13:19:05Z",
          "file": "http://localhost:3000/jan/general/test/2021-10-11/ontology--DEV_type=parsed_sorted.nt",
          "format": "nt",
          "compression": "none",
          "downloadURL": "https://akswnc7.informatik.uni-leipzig.de/dstreitmatter/archivo/dbpedia.org/ontology--DEV/2021.07.09-070001/ontology--DEV_type=parsed_sorted.nt",
          "byteSize": "4439722",
          "sha256sum": "b3aa40e4a832e69ebb97680421fbeff968305931dafdb069a8317ac120af0380",
          "hasVersion": "2021-10-12"
        }
      ]
    }
  ]
}

Construct Query:

PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>
PREFIX dataid-cv: <http://dataid.dbpedia.org/ns/cv#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX databus: <https://databus.dbpedia.org/system/ontology#>
PREFIX sec: <https://w3id.org/security#>

# Creates a collection
CONSTRUCT
{
  ?dataset a dataid:Dataset .
  ?dataset ?p ?o .
  ?proof ?q ?r .
  ?version a dataid:Version .
  ?artifact a dataid:Artifact .
  ?distribution a dataid:SingleFile .
  ?distribution ?e ?f .
  ?cvProperty ?j ?k .
}
WHERE
{
  ?dataset a dataid:Dataset .
  ?dataset ?p ?o .

  ?dataset sec:proof ?proof .
  ?proof ?q ?r .

  ?dataset dataid:version ?version .
  ?version a dataid:Version .

  ?dataset dataid:artifact ?artifact .
  ?artifact a dataid:Artifact .

  ?dataset dcat:distribution ?distribution .
  ?distribution a dataid:SingleFile .
  ?distribution ?e ?f .

  ?distribution ?cvProperty ?cvLiteral .
  ?cvProperty ?j ?k .
  ?cvProperty rdfs:subPropertyOf dataid:contentVariant .
}

Problem: Running the query is very slow!
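
A likely contributor to the slowness (an assumption, nothing in this thread measures it): the open-ended patterns `?dataset ?p ?o`, `?proof ?q ?r`, `?distribution ?e ?f` and `?cvProperty ?j ?k` share no variables with each other, so joining them in one group yields one solution per combination of their matches. The Python sketch below only imitates that counting with made-up row counts; it is not a SPARQL engine and the numbers are hypothetical.

```python
from itertools import product

def made_up_triples(subject, n):
    """Return n placeholder (subject, predicate, object) triples."""
    return [(subject, f"p{i}", f"o{i}") for i in range(n)]

# Hypothetical match counts for the four open-ended patterns in the query.
dataset_rows = made_up_triples("dataset", 20)            # ?dataset ?p ?o
proof_rows = made_up_triples("proof", 8)                 # ?proof ?q ?r
distribution_rows = made_up_triples("distribution", 10)  # ?distribution ?e ?f
cv_rows = made_up_triples("cvProperty", 5)               # ?cvProperty ?j ?k
groups = [dataset_rows, proof_rows, distribution_rows, cv_rows]

# Joined in one group: one solution per combination of rows.
joined_solutions = list(product(*groups))
joined_triples = {t for solution in joined_solutions for t in solution}

# Evaluated independently: one solution per row.
separate_solutions = [t for g in groups for t in g]
separate_triples = set(separate_solutions)

print(len(joined_solutions))               # 8000
print(len(separate_solutions))             # 43
print(joined_triples == separate_triples)  # True
```

Both approaches produce the same 43 distinct triples here, but the joined form has to enumerate 8000 solutions to get them.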

kurzum commented Oct 14, 2021

@holycrab13 I have the following questions:

  1. Where is the query run? I thought this was the query that restricts the POSTed DataId on registration, to prevent extra triples from being added. Or is it the one that creates the collections?
  2. The sample data does not contain sec:proof, though that may not matter.
  3. Could the query not look like this:
CONSTRUCT {
?d a dataid:Dataset .
?d dct:title ?title . 
?d dcat:distribution ?distribution . 
# [... all allowed props]
?distribution dct:title ?distributiontitle .
# [... all allowed props]
} WHERE {
?d a dataid:Dataset .
?d dct:title ?title . 
?d dcat:distribution ?distribution . 
# [...]
?distribution dct:title ?distributiontitle .
# [...]
}

or

CONSTRUCT {} WHERE { ?s ?p1 ?o1 . ?o1 ?p2 ?o2 . FILTER (?p1 IN (dcat:distribution, dataid:version, ....)) FILTER (?p2 IN (...)) }
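
To make the second suggestion concrete, here is a minimal Python sketch of the same idea: walk two hops from each subject and keep only triples whose predicates are on a whitelist. The whitelists, the `two_hop` helper and the toy triples are all hypothetical (the thread leaves the actual property lists elided), and the sketch assumes the CONSTRUCT template would emit both matched triples.

```python
# Hypothetical whitelists; the real lists of allowed properties are not given in the thread.
ALLOWED_P1 = {"dcat:distribution", "dataid:version"}
ALLOWED_P2 = {"dct:title"}

def two_hop(triples, allowed_p1, allowed_p2):
    """Imitate: ?s ?p1 ?o1 . ?o1 ?p2 ?o2 . FILTER on ?p1 and ?p2."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))

    out = set()
    for s, p1, o1 in triples:
        if p1 not in allowed_p1:
            continue
        for p2, o2 in by_subject.get(o1, []):
            if p2 in allowed_p2:
                # Both hops matched, so both triples are kept.
                out.add((s, p1, o1))
                out.add((o1, p2, o2))
    return out

triples = [
    ("ds", "dcat:distribution", "dist"),
    ("dist", "dct:title", "File"),
    ("dist", "ex:extra", "x"),        # second hop not whitelisted
    ("ds", "ex:injected", "bad"),     # first hop not whitelisted
    ("ds", "dataid:version", "ver"),  # "ver" has no triples of its own
]

result = two_hop(triples, ALLOWED_P1, ALLOWED_P2)
print(sorted(result))
```

Because both hops are joined, a first-hop triple such as `ds dataid:version ver` is dropped when its object has no whitelisted second-hop triple. Whether that is acceptable depends on the data.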

holycrab13 commented Oct 14, 2021

  1. The query is run in a separate module:

https://github.com/dbpedia/databus/blob/master/server/app/common/execute-construct.js
Adding extra triples is an issue for both POST and PUT, so both use it.

The entire publishing process, including the calls to execute-construct, is here:
https://github.com/dbpedia/databus/blob/master/server/app/publish/publish-dataid.js
This is done for POST right now, but I will also route the PUT request through this code.

The CONSTRUCT query is here: https://github.com/dbpedia/databus/blob/master/server/app/common/queries/constructs/construct-version.sparql

  2. Testing works without sec:proof. I can add it, but it is not important for this issue.
  3. I will give it a try!

holycrab13 commented Oct 18, 2021

I had to use OPTIONAL in the construct queries. Searching for optional clauses in construct queries led me to this article:
http://www.snee.com/bobdc.blog/2014/10/dropping-optional-blocks-from.html

It suggests using UNION instead for better performance. That is somewhat unrelated, since I wasn't using OPTIONAL before, but I wrote this query:

PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>
PREFIX dataid-cv: <http://dataid.dbpedia.org/ns/cv#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX databus: <https://databus.dbpedia.org/system/ontology#>
PREFIX sec: <https://w3id.org/security#>

# Creates a collection
CONSTRUCT
{
  ?dataset a dataid:Dataset .
  ?dataset ?p ?o .
  ?proof ?q ?r .
  ?version a dataid:Version .
  ?artifact a dataid:Artifact .
  ?distribution a dataid:SingleFile .
  ?distribution ?e ?f .
  ?cvProperty ?j ?k .
}
WHERE
{
  ?dataset a dataid:Dataset .
  ?dataset dcat:distribution ?distribution .

  {
    ?dataset ?p ?o .
  }
  UNION
  {
    ?distribution a dataid:SingleFile .
    ?distribution ?e ?f .
  }
  UNION
  {
    ?dataset dataid:version ?version .
    ?version a dataid:Version .
  }
  UNION
  {
    ?dataset dataid:artifact ?artifact .
    ?artifact a dataid:Artifact .
  }
  UNION
  {
    ?dataset sec:proof ?proof .
    ?proof ?q ?r .
  }
  UNION
  {
    ?distribution ?cvProperty ?cvLiteral .
    ?cvProperty ?j ?k .
    ?cvProperty rdfs:subPropertyOf dataid:contentVariant .
  }
}

... and it's super fast and appears to be correct.
UNION is okay to use here because a SHACL validation runs after this step.
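
Besides speed, the two query shapes differ when a linked resource is missing, as with the example data above, which has no sec:proof. The Python sketch below imitates that difference with a toy pattern matcher; `match`, `construct_join`, `construct_union` and the triples are hypothetical, and only the dataset and proof branches are modelled.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching the fixed positions (None acts as a wildcard)."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Toy data without any sec:proof triple.
triples = [
    ("ds", "rdf:type", "dataid:Dataset"),
    ("ds", "dct:title", "Example"),
    ("ds", "dcat:distribution", "dist"),
]

def construct_join(triples):
    """All patterns in one group: every pattern must match."""
    out = set()
    for ds, _, _ in match(triples, p="rdf:type", o="dataid:Dataset"):
        for _, _, proof in match(triples, s=ds, p="sec:proof"):
            for dataset_triple in match(triples, s=ds):
                for proof_triple in match(triples, s=proof):
                    out.update([dataset_triple, proof_triple])
    return out

def construct_union(triples):
    """One branch per linked resource: each branch contributes on its own."""
    out = set()
    for ds, _, _ in match(triples, p="rdf:type", o="dataid:Dataset"):
        out.update(match(triples, s=ds))            # { ?dataset ?p ?o }
        for _, _, proof in match(triples, s=ds, p="sec:proof"):
            out.update(match(triples, s=proof))     # { ?dataset sec:proof ?proof . ?proof ?q ?r }
    return out

print(len(construct_join(triples)))   # 0
print(len(construct_union(triples)))  # 3
```

With a proof present, both functions return the same triples. Without one, the joined form returns nothing, while the UNION form still returns the dataset triples and leaves it to a later check (SHACL, in this thread) to reject incomplete input.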

gitbook-com bot pushed a commit that referenced this issue Jan 25, 2022
manonthegithub added a commit that referenced this issue May 31, 2023