Building a lab VM based on Debian for a MOOC, using Vagrant + VirtualBox

We’ve been busy setting up a Virtual Machine (VM) image to be used by participants of a MOOC on Relational Databases that’s opening in early September at Telecom SudParis.

We’ve chosen Vagrant and VirtualBox to build, distribute and run the box: they make the build scriptable (and therefore reproducible) and keep the box portable across most operating systems.

The VM itself contains a Debian (jessie) minimal system which runs (in the background) PostgreSQL, Apache + mod_php, phpPgAdmin, and a few applications of our own to play with example databases already populated in PostgreSQL.

Debian docker containers using a modified baseimage-docker

I have been testing Docker for a few weeks now, and have investigated the use of baseimage-docker, which provides support for supervising services with runit and includes OpenSSH, among other things, on top of an Ubuntu base system. Of course, I’m interested in a Debian counterpart.

I had initially followed instructions provided by Steve Kemp, who also prepared a Debian image including OpenSSH and runit, but it appears that baseimage-docker provides more of the small pieces that save me from reinventing the wheel.

I then forked baseimage-docker to do a quick and dirty adaptation for Debian. There’s a sid variant (my ‘debian’ branch) and a wheezy one (my ‘wheezy’ branch, unsurprisingly). I haven’t used all the features of baseimage-docker, so some things may well break.

For the record, I’m playing with it as a base image to construct a docker-based container running the FusionForge test suite.

Did I warn you it’s quick and dirty and comes without any warranty? I hope it’s useful anyway.

Experimenting with Linked Open Data about FLOSS projects: matching Debian upstream projects

I’ve been experimenting with Linked Open Data about FLOSS projects harvested from different sources of DOAP or ADMS.SW descriptions. I’ve tried to match the upstream projects of Debian packages with upstream projects hosted at Apache, GNOME, or Alioth.debian.org, or catalogued on PyPI.

I’m matching them on identical values of the Homepage field (comparing the Homepage Control field set by Debian packagers with the doap:homepage meta-data in the RDF documents harvested from the upstream project catalogues).
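As an illustration only, the matching step can be sketched as a join on the homepage string. This is a minimal sketch, not the code used in the experiment: the function name, the (name, homepage) tuples and the example.org URLs are all made up, and it assumes the homepages have already been extracted from the Debian Control fields and from the doap:homepage values of the RDF documents.

```python
from collections import defaultdict


def match_by_homepage(debian_packages, upstream_projects):
    """Pair Debian packages with upstream projects that have an identical homepage.

    Both arguments are iterables of (name, homepage) tuples (hypothetical shape).
    Returns a list of (debian_name, upstream_name, homepage) tuples.
    """
    # Index upstream projects by homepage so each lookup is a dict access.
    by_homepage = defaultdict(list)
    for name, homepage in upstream_projects:
        by_homepage[homepage].append(name)

    matches = []
    for package, homepage in debian_packages:
        # Strict string equality: no URL normalization is applied here.
        for upstream in by_homepage.get(homepage, []):
            matches.append((package, upstream, homepage))
    return matches


# Made-up sample data, for illustration only.
debian = [
    ("python-foo", "http://example.org/foo"),
    ("bar", "http://example.org/bar"),
]
upstream = [
    ("Foo", "http://example.org/foo"),
    ("baz", "http://example.org/baz"),
]

matches = match_by_homepage(debian, upstream)
```

With this sample data, only the package whose homepage string is exactly the same as an upstream one ends up in matches; the project names play no role in the join.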

Here are the initial results of my little experiment: the number of matched projects per catalogue, and how often the project names are also the same:

Upstream catalogue   Total matching projects   Exact same project name   Same project name (case-independent)
apache                31                        0 (0 %)                   0 (0 %)
alioth                16                       13 (81 %)                 13 (81 %)
pypi                 439                      217 (49 %)                273 (62 %)
gnome                 21                        0 (0 %)                   7 (33 %)
Total                507                      230 (45 %)                293 (58 %)

The data set contains tens of thousands of projects, probably with many duplicates, but out of all of these, only 507 have a homepage in common.

As you can see, in some cases, the Debian source package names match the upstream project name (sometimes with lower/upper case variants), but in general, the project names aren’t identical, so it is interesting to try and match them by homepage.

For the curious, the Apache, GNOME and PyPI project catalogues have been providing RDF meta-data for quite some time. More recently, we introduced ADMS.SW meta-data for Debian source packages, and even more recently for the Alioth projects (through the ADMS.SW exporter plugin for FusionForge).

There is still room for improvement, for instance by normalizing homepage URLs, which tend to vary (trailing slashes, or HTTP vs. HTTPS schemes).
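A possible normalization along those lines, as a minimal sketch using Python’s standard urllib.parse: the function name is hypothetical, and lower-casing the host name is an extra assumption of mine that goes beyond the two variations mentioned above.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_homepage(url):
    """Return a canonical form of a homepage URL, for comparison purposes only."""
    parts = urlsplit(url.strip())

    # Treat http and https as equivalent by mapping both to "http".
    scheme = "http" if parts.scheme in ("http", "https") else parts.scheme

    # Assumption: host names are case-insensitive, so lower-case them.
    host = parts.netloc.lower()

    # Drop trailing slashes so ".../foo/" and ".../foo" compare equal.
    path = parts.path.rstrip("/")

    return urlunsplit((scheme, host, path, parts.query, parts.fragment))


# Made-up URLs, for illustration only.
a = normalize_homepage("https://Example.org/foo/")
b = normalize_homepage("http://example.org/foo")
```

Here a and b are the same string, so two homepages that differ only by scheme, host case or a trailing slash would now match. The normalized value is only meant as a comparison key, not as a URL to publish or fetch.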

Stay tuned for more details.