UDD export as RDF data through D2R and a corresponding mapping definition

(this is a copy of a post I made on the debian-qa list)

I’ve been pursuing previous ideas about the use of Semantic Web standards for publication of facts about Debian using the UDD database as a source.

FYI, I have a running prototype (unfortunately only on my laptop at the moment, hopefully on the Web some day soon) which uses D2R, a server which can publish RDBMS tables as RDF documents, using a mapping description.

I’m documenting bits of what I’ve done in https://picoforge.int-evry.fr/cgi-bin/twiki/view/Helios_wp3/Web/UltimateDebianDatabaseToRDF should you be interested.

This first prototype allows one to express queries such as:

SELECT DISTINCT ?b ?s WHERE {
?b vocab:bugs_package <http://localhost:2020/resource/package/sympa> .
?b debbugs:hasSubmitter ?s
}

which returns the list of all bug ids and the corresponding bug reporters, i.e. the “bugs.submitters” (From) addresses, for the package sympa, in the form of links to resources like:

http://localhost:2020/resource/debbug/169102 - http://localhost:2020/resource/foafdebbug/Olivier+Berger+%3Colivier.berger%40it-sudparis.eu%3E

Each of these links points to its own RDF document, for example:

http://localhost:2020/resource/debbug/169102 :
debbugs:bugUrl
debbugs:hasSubmitter db:foafdebbug/Olivier+Berger+%3Colivier.berger%40it-sudparis.eu%3E
debbugs:number 169102
debbugs:summary "sympa: Lack of exim configuration for pipe transport"
vocab:bugs_arrival "2002-11-14T16:48:05"^^xsd:dateTime
vocab:bugs_package db:package/sympa
vocab:bugs_severity debbugs:Important
rdf:type debbugs:Debbug
rdfs:label "Debian bug #169102"

and http://localhost:2020/resource/foafdebbug/Olivier+Berger+%3Colivier.berger%40it-sudparis.eu%3E :

rdf:type foaf:Person
foaf:mbox
foaf:mbox_sha1sum "6688a14521cd97db162af8f9757f2e2232300e50"
foaf:name "Olivier Berger"

and with inverse relationships :
debbugs:hasSubmitter db:debbug/169102
debbugs:hasSubmitter db:debbug/208203
...

These descriptions are obtained with a simple mapping of data from the bugs table in UDD to several ontologies, such as “foaf” and “debbugs” (a made-up one).
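To make the idea of such a mapping concrete, here is a minimal Python sketch of what it conceptually does to one row of the bugs table. It is not the D2RQ mapping file itself, only an illustration: the `row_to_triples` helper and the dictionary layout of the row are hypothetical, and the column names `id`, `submitter` and `package` are inferred from the examples in this post. The point it shows is how the raw submitter string is URL-encoded into the resource URI.

```python
from urllib.parse import quote_plus

# Base URI of the resources, as seen in the example URLs of this post.
BASE = "http://localhost:2020/resource/"


def bug_uri(bug_id):
    """URI of the resource describing one bug."""
    return BASE + "debbug/" + str(bug_id)


def submitter_uri(submitter):
    """URI of the submitter: the raw 'Name <email>' string, URL-encoded."""
    return BASE + "foafdebbug/" + quote_plus(submitter)


def row_to_triples(row):
    """Hypothetical helper: turn one bugs row into (subject, predicate, object) tuples."""
    bug = bug_uri(row["id"])
    return [
        (bug, "rdf:type", "debbugs:Debbug"),
        (bug, "rdfs:label", "Debian bug #" + str(row["id"])),
        (bug, "debbugs:number", str(row["id"])),
        (bug, "debbugs:hasSubmitter", submitter_uri(row["submitter"])),
        (bug, "vocab:bugs_package", BASE + "package/" + row["package"]),
    ]


# Assumed shape of a row; only these three columns are used here.
row = {
    "id": 169102,
    "package": "sympa",
    "submitter": "Olivier Berger <olivier.berger@it-sudparis.eu>",
}
for triple in row_to_triples(row):
    print(triple)
```

In the real setup D2R does this on the fly from the mapping description, so no triples are stored anywhere.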

When running such SPARQL queries, or navigating the URLs provided by D2R, the mapping is applied and the corresponding SQL queries are run against the UDD database, for example:

SELECT DISTINCT 1 FROM "d2r_bugsubmitters" WHERE
"d2r_bugsubmitters"."submitter" = 'Olivier Berger <olivier.berger@it-sudparis.eu>'

SELECT DISTINCT "d2r_bugs"."id" FROM "d2r_bugs" WHERE
"d2r_bugs"."submitter" = 'Olivier Berger <olivier.berger@it-sudparis.eu>'

Of course, it would be great if such a service were on the (Semantic) Web, with the URIs of resources becoming real URLs that can be navigated to discover links to other resources (for instance, searching the Web for my foaf:mbox_sha1sum would find my other FOAF profiles, such as http://www-public.it-sudparis.eu/~berger_o/foaf.rdf).
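Once such a server is reachable over HTTP, a query like the one near the top of this post could also be sent from a script. The following Python sketch builds a SPARQL protocol GET request and parses a JSON result document. Several things in it are assumptions: the endpoint path `/sparql` (D2R Server's default layout), the two namespace URIs in the `PREFIX` lines (placeholders, since the real ones live in the mapping file), and that the server honours the standard `application/sparql-results+json` media type. The helper names `build_request`, `bindings` and `run` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: D2R Server's default layout, with the SPARQL endpoint at /sparql.
ENDPOINT = "http://localhost:2020/sparql"

# Assumption: placeholder namespace URIs; the real ones come from the mapping.
QUERY = """
PREFIX vocab: <http://localhost:2020/vocab/resource/>
PREFIX debbugs: <http://example.org/debbugs#>
SELECT DISTINCT ?b ?s WHERE {
  ?b vocab:bugs_package <http://localhost:2020/resource/package/sympa> .
  ?b debbugs:hasSubmitter ?s
}
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings(results_json):
    """Extract (bug, submitter) URI pairs from a SPARQL JSON results document."""
    doc = json.loads(results_json)
    return [(b["b"]["value"], b["s"]["value"]) for b in doc["results"]["bindings"]]


def run(endpoint=ENDPOINT, query=QUERY):
    """Send the query and return the pairs; needs a reachable server."""
    with urlopen(build_request(endpoint, query)) as response:
        return bindings(response.read().decode("utf-8"))
```

Calling `run()` only works while the D2R server is up; `build_request` and `bindings` can be exercised without any network access.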

I hope this will trigger some interest from the UDD maintainers, so that we can continue discussing it and maybe plan to have a D2R server alongside UDD on udd.debian.org some day.

2 thoughts on “UDD export as RDF data through D2R and a corresponding mapping definition”

  1. Below is a copy of a follow-up I’ve sent to debian-qa:

    FYI, here’s an update on our tool for exporting some UDD facts as RDF.

    Here’s a server we have set up, which provides RDF descriptions of bugs based on the contents of the UDD database:
    http://testforge.int-evry.fr/d2r-server/

    This is work in progress and not meant for end users. For reference, here is how we installed D2R to deploy it:
    https://picoforge.int-evry.fr/cgi-bin/twiki/view/Helios_wp3/Web/UltimateDebianDatabaseToRDF

    One will find metadata about bugs, but also some (minimal) metadata about Debian packages, about Debbugs users (bug submitters), and about Debian developers (taken from carnivore), as SIOC and FOAF bits.

    Given a bug number, one may get an RDF/XML version using:
    curl -H 'Accept: application/rdf+xml' http://testforge.int-evry.fr/d2r-server/data/debbugs/585740

    or the N3 version with :
    curl -H 'Accept: text/rdf+n3' http://testforge.int-evry.fr/d2r-server/data/debbugs/585740

    Btw, on our D2R server, one can also run custom queries using SPARQL. For instance, this link:
    http://testforge.int-evry.fr/d2r-server/snorql/?query=SELECT+DISTINCT+*+WHERE+{%0D%0A++%3Fs+%3Fp+%3Fo.%0D%0A++%3Fs+owl%3AsameAs+%3Chttp%3A%2F%2Fbugs.debian.org%2F585740%3E.%0D%0A}%0D%0ALIMIT+10
    looks up the properties of bug #585740 with the query:
    SELECT DISTINCT * WHERE {
    ?s ?p ?o.
    ?s owl:sameAs <http://bugs.debian.org/585740> .
    }
    LIMIT 10

    About people, for an example, here’s my FOAF profile :
    http://testforge.int-evry.fr/d2r-server/resource/foafcarnivore/6688a14521cd97db162af8f9757f2e2232300e50

    Given one’s email address, it’s easy to access one’s FOAF profile there with for instance :
    curl -H 'Accept: application/rdf+xml' http://testforge.int-evry.fr/d2r-server/data/foafcarnivore/`echo -n "mailto:$DEBEMAIL" | sha1sum | sed 's/ .*$//'`

    FYI, on related topics, I’ve filed two wishlist bugs, for debbugs and the PTS respectively:

    * #590931: Would be great to integrate RDFa metadata into debbugs pages

    * #585740: Would be great if the PTS could provide RDFa metadata
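The lookup of a FOAF profile by email address described in the follow-up above (hash the `mailto:` URI with SHA-1, then ask for a given RDF serialization through the `Accept` header) can be sketched in Python as follows. This mirrors the `curl` and `sha1sum` one-liner rather than adding anything new; the helper names `mbox_sha1sum` and `profile_request` are hypothetical, and the sketch only builds the request without sending it.

```python
import hashlib
from urllib.request import Request

# Base URL of the carnivore-derived FOAF data, as used in the curl example.
BASE = "http://testforge.int-evry.fr/d2r-server/data/foafcarnivore/"


def mbox_sha1sum(email):
    """SHA-1 hex digest of the 'mailto:' URI, as sha1sum would print it."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


def profile_request(email, accept="application/rdf+xml"):
    """Build (but do not send) the request for one person's FOAF profile."""
    return Request(BASE + mbox_sha1sum(email), headers={"Accept": accept})


# Example with a placeholder address; pass accept="text/rdf+n3" for N3.
req = profile_request("someone@example.org")
print(req.full_url)
```

Passing the resulting request to `urllib.request.urlopen` would perform the actual fetch, provided the server is still online.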
