Experimenting with Linked Open Data about FLOSS projects: matching Debian upstream projects

I’ve been experimenting with Linked Open Data about FLOSS projects harvested from different sources of DOAP or ADMS.SW descriptions. I’ve tried to match the upstream projects of Debian packages with upstream projects hosted at Apache, Gnome, or Alioth.debian.org, or catalogued on Pypi.

I’m matching them on identical values of the Homepage field (comparing the Homepage Control field set by Debian packagers with the doap:homepage meta-data in the RDF documents harvested from the upstream project catalogues).
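To make the matching step concrete, here is a minimal sketch of a join on identical homepage strings. It is not the code used for the experiment: the function name, the (name, homepage) pair layout and the sample data are all made up for illustration, and the real data comes from RDF documents rather than Python lists.

```python
from collections import defaultdict

def match_by_homepage(debian_packages, upstream_projects):
    """Join two lists of (name, homepage) pairs on identical homepage strings.

    Hypothetical helper: the pair layout is an assumption, not the
    structure of the harvested RDF data.
    """
    # Index upstream projects by homepage so each lookup is a dict access.
    by_homepage = defaultdict(list)
    for name, homepage in upstream_projects:
        by_homepage[homepage].append(name)

    matches = []
    for package, homepage in debian_packages:
        # A homepage may be shared by several upstream entries (duplicates).
        for upstream in by_homepage.get(homepage, []):
            matches.append((package, upstream, homepage))
    return matches

# Made-up sample data, for illustration only.
debian = [
    ("python-foo", "http://example.org/foo"),
    ("bar", "http://example.org/bar"),
    ("baz", "http://example.org/baz"),
]
upstream = [
    ("Foo", "http://example.org/foo"),
    ("Bar", "http://example.org/bar"),
    ("qux", "http://example.org/qux"),
]

matches = match_by_homepage(debian, upstream)
for package, project, homepage in matches:
    print(package, "<->", project, "via", homepage)
```

Since the comparison is a plain string equality, two homepages that differ only by a trailing slash will not match in this sketch.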

Here are the initial results of my little experiment: the number of matched projects per catalogue, and how often the project names are the same:

Upstream catalogue   Total matching projects   Exact same project name   Same project name (case-insensitive)
apache               31                        0 (0 %)                   0 (0 %)
alioth               16                        13 (81 %)                 13 (81 %)
pypi                 439                       217 (49 %)                273 (62 %)
gnome                21                        0 (0 %)                   7 (33 %)
Total                507                       230 (45 %)                293 (58 %)

The data set contains tens of thousands of projects, probably with many duplicates, but out of all of these, only 507 share a homepage between a Debian package and an upstream catalogue entry.

As you can see, in some cases the Debian source package name matches the upstream project name (sometimes with lower/upper case variants), but in general the names aren’t identical, which is why it is interesting to try to match them by homepage instead.
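The name comparison itself is simple. The following sketch shows one way to classify a (package name, project name) pair and to recompute the percentages of the Total row from its raw counts. The function name and the sample pairs are hypothetical; only the counts 230, 293 and 507 come from the results above.

```python
from collections import Counter

def classify_names(package_name, upstream_name):
    """Classify a name pair as 'exact', 'case-insensitive' or 'different'."""
    if package_name == upstream_name:
        return "exact"
    if package_name.casefold() == upstream_name.casefold():
        return "case-insensitive"
    return "different"

# Made-up pairs, for illustration only.
pairs = [("bar", "bar"), ("foo", "Foo"), ("python-baz", "baz")]
tally = Counter(classify_names(package, project) for package, project in pairs)

# Percentages of the Total row, recomputed from its raw counts.
total = 507
exact_pct = round(100 * 230 / total)
# The case-insensitive column also includes the exact matches.
caseless_pct = round(100 * 293 / total)
print(tally, exact_pct, caseless_pct)
```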

For the curious ones, the Apache, Gnome and Pypi project catalogues have been providing RDF metadata for quite some time. More recently, we introduced ADMS.SW metadata for Debian source packages, and even more recently for the Alioth projects (through the ADMS.SW exporter plugin for FusionForge).

There is still room for improvement, for instance by normalizing homepage URLs, which tend to vary (trailing slashes, or different HTTP/HTTPS schemes).
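As an illustration of what such a normalization could look like, here is a minimal sketch based on Python’s standard urllib.parse module. The function name is hypothetical, and the rules (folding https into http, lowercasing the host, dropping trailing slashes and fragments) are assumptions about what is worth normalizing; they have not been tested against the harvested data.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_homepage(url):
    """Return a canonical form of a homepage URL for comparison purposes.

    Assumed rules: http and https are treated as the same scheme, the host
    is compared case-insensitively, and trailing slashes and fragments are
    ignored.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    if scheme == "https":
        scheme = "http"  # only used as a comparison key, not for fetching
    host = parts.netloc.lower()
    path = parts.path.rstrip("/")
    return urlunsplit((scheme, host, path, parts.query, ""))

variants = [
    "http://example.org/foo",
    "http://example.org/foo/",
    "https://Example.org/foo",
]
keys = {normalize_homepage(url) for url in variants}
print(keys)
```

The normalized string would only serve as a matching key; the original URL should be kept for anything that actually dereferences the homepage.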

Stay tuned for more details.

Periodic notification of received mails processed by dovecot’s sieve

I’m using dovecot’s sieve to filter incoming mail into different folders. As this works in the background (through fetchmail + procmail + dovecot’s deliver), I may not know that new mail is available until I check in Gnus or notmuch.

So I’d like to get notification popups on my Gnome desktop (I’m using gnome3’s fallback session, FWIW).

I’ve added the following crontab entry (with crontab -e), which runs every minute:

* * * * * ~/bin/dovecot-logs-stats.sh

This is a shell script that checks the new lines added to dovecot’s sieve log file and counts how many new mails have been delivered to each folder.
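The idea can be sketched as follows. This is not the shell script itself, but a Python illustration of the same two steps: read only what was appended to the log since the last run, then count deliveries per folder. The log line format (a "stored mail into mailbox '…'" message) and all function names are assumptions; adjust the regular expression to whatever your log actually contains.

```python
import re
from collections import Counter

# Assumed format of a delivery line in the sieve log.
STORED_RE = re.compile(r"stored mail into mailbox '([^']+)'")

def read_new_lines(log_path, offset):
    """Return (lines appended after byte `offset`, new offset).

    Log rotation is not handled in this sketch.
    """
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
    return data.decode("utf-8", errors="replace").splitlines(), offset + len(data)

def count_by_folder(lines):
    """Count delivered mails per folder among the given log lines."""
    counts = Counter()
    for line in lines:
        match = STORED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Made-up log lines, for illustration only.
sample = [
    "deliver(me): Info: sieve: msgid=<1@example.org>: stored mail into mailbox 'INBOX'",
    "deliver(me): Info: sieve: msgid=<2@example.org>: stored mail into mailbox 'lists/debian'",
    "deliver(me): Info: sieve: msgid=<3@example.org>: stored mail into mailbox 'INBOX'",
]
counts = count_by_folder(sample)
summary = ", ".join(f"{folder}: {n}" for folder, n in sorted(counts.items()))
print(summary)
```

The offset returned by read_new_lines would have to be saved between two cron runs (in a small state file, for instance), and the summary string could then be handed to a desktop notification tool.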

Videos of some talks from the Development track of RMLL 2009

Free Electrons has just published videos of the talks they filmed at RMLL 2009.
Some of the videos cover the “Development” track, which I coordinated.

Here is the list of recorded talks:

  • Migrating the GNOME project to Git, by Frédéric Peters (GNOME)
  • Voting methods: how to consult a project’s developers without skewing the result before asking the question, by Lucas Nussbaum (Debian)
  • Open Source development: a report from the trenches, the Jajuk project, by Bertrand Florat (Jajuk)
  • How Open Source tools and spirit help make better projects, by Erlé Le Gac, Bertrand Florat (Capgemini)

The sound isn’t always great, but it was done with the means at hand, by volunteers 😉

You’ll find the pointers to the videos and the slides at Free Electrons (at the bottom of the list).

And don’t forget: the 2010 edition is doing it again, with a call for contributions that is still open.

Quick visualization of the social network of the Gnome projects’ maintainers

Thanks to Frédéric Peters, I’ve discovered that Gnome is using DOAP to describe projects in its Git.

At http://git.gnome.org/repositories.doap, one will find the complete aggregated DOAP and FOAF description of the Gnome projects and their maintainers.

My colleague Vu, who is researching such social networks, has drawn the following graph:
It shows projects in red, and their maintainers in green.
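For those who want to try something similar, here is a minimal sketch that extracts (project, maintainer) pairs from a DOAP document and writes them out in Graphviz DOT syntax, with the same red/green colouring. It is not how the image above was produced. The embedded sample is made up, and its layout (foaf:Person elements nested in doap:maintainer) is an assumption about how the file is serialized; a real RDF parser would be more robust than plain XML parsing.

```python
import xml.etree.ElementTree as ET

NS = {
    "doap": "http://usefulinc.com/ns/doap#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Made-up sample, assuming maintainers are nested foaf:Person elements.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://usefulinc.com/ns/doap#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <Project>
    <name>alpha</name>
    <maintainer><foaf:Person><foaf:name>Alice</foaf:name></foaf:Person></maintainer>
    <maintainer><foaf:Person><foaf:name>Bob</foaf:name></foaf:Person></maintainer>
  </Project>
  <Project>
    <name>beta</name>
    <maintainer><foaf:Person><foaf:name>Alice</foaf:name></foaf:Person></maintainer>
  </Project>
</rdf:RDF>"""

def maintainer_edges(doap_xml):
    """Return a list of (project name, maintainer name) pairs."""
    root = ET.fromstring(doap_xml)
    edges = []
    for project in root.iter("{http://usefulinc.com/ns/doap#}Project"):
        name = project.findtext("doap:name", namespaces=NS)
        for person in project.findall("doap:maintainer/foaf:Person", NS):
            edges.append((name, person.findtext("foaf:name", namespaces=NS)))
    return edges

def to_dot(edges):
    """Render the bipartite graph as DOT: projects red, maintainers green.

    Names are not escaped, which is good enough for a sketch only.
    """
    lines = ["graph maintainers {"]
    for project in sorted({p for p, _ in edges}):
        lines.append(f'  "p:{project}" [label="{project}", color=red];')
    for person in sorted({m for _, m in edges}):
        lines.append(f'  "m:{person}" [label="{person}", color=green];')
    for project, person in edges:
        lines.append(f'  "p:{project}" -- "m:{person}";')
    lines.append("}")
    return "\n".join(lines)

edges = maintainer_edges(SAMPLE)
dot = to_dot(edges)
print(dot)
```

The resulting text can be fed to a Graphviz layout tool to get a picture; maintainers shared by several projects show up as the nodes that tie the network together.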

More analysis later maybe, but the image is already nice on its own 🙂

Update: a more detailed representation is available (see comments below).