Building a lab VM based on Debian for a MOOC, using Vagrant + VirtualBox

We’ve been busy setting up a Virtual Machine (VM) image for the participants of a MOOC on Relational Databases at Telecom SudParis, which opens in early September.

We’ve chosen Vagrant and VirtualBox to build, distribute and run the box. Together they make the build scriptable (hence reproducible) and keep the box portable across most operating systems.

The VM itself contains a Debian (jessie) minimal system which runs (in the background) PostgreSQL, Apache + mod_php, phpPgAdmin, and a few applications of our own to play with example databases already populated in PostgreSQL.

As the MOOC’s language will be French, we expect the box to be used mostly on machines with AZERTY keyboards. This and other context elements led us to add some customizations (locale, APT mirror) in the provisioning scripts that run during box creation.

At the moment, we generate 2 variants of the box, one for a 32-bit kernel (i686) and one for a 64-bit kernel (amd64), each of which weighs between 300 and 350 MB once compressed.
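For readers wondering which of the two variants to pick, here is a minimal Python sketch that maps the host's reported machine type to one of the two kernel flavours named above. The mapping table and the `box_variant` helper are our own illustration, not part of the MOOC tooling, and the result is only a default: a 32-bit box should also run on a 64-bit host.

```python
import platform

# Hypothetical mapping from common machine strings to the two box
# variants mentioned in the post (i686 and amd64).
VARIANTS = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "i386": "i686",
    "i586": "i686",
    "i686": "i686",
    "x86": "i686",
}


def box_variant(machine: str) -> str:
    """Return the box variant matching a machine string, case-insensitively."""
    key = machine.strip().lower()
    if key not in VARIANTS:
        raise ValueError(f"no box variant for machine type {machine!r}")
    return VARIANTS[key]


if __name__ == "__main__":
    try:
        print(box_variant(platform.machine()))
    except ValueError as err:
        # Non-x86 hosts (ARM, for instance) are not covered by either box.
        print(err)
```

Note that `platform.machine()` describes the running operating system, so a 32-bit OS installed on 64-bit hardware will be reported as 32-bit.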

The resulting boxes are uploaded to a self-hosted site, and distributed through Vagrant Cloud. Once the VMs are created in VirtualBox, the typical VMDK drive file is around 1.3 GB.

We use our own Debian base boxes containing a minimal Debian jessie/testing, instead of relying on someone else’s, and recreate them using (the development branch version of) bootstrap-vz. This ensures we can put more trust in the content, as it’s a native Debian package installation without MITM intervention.

The VMs are meant to be run headless for the moment, which keeps their size to a minimum, even though we also provide a script to install and configure a desktop environment based on XFCE4.

The applications are used either through vagrant ssh (for instance for the SQL command line in psql) or in the Web browser, for our own Web-based SQL exerciser or phpPgAdmin (see a demo screencast (in French, w/ English subtitles)). Everything runs locally, so participants can work even off-line, and our IT staff doesn’t need to keep any servers available.
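Since the Web applications are reached from the host's browser, a quick way to tell whether the VM's Web server is reachable is to try a TCP connection from the host. The sketch below is only an illustration: the `is_port_open` helper is ours, and port 8080 is an assumed forwarded port, not necessarily the one configured in our box.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    port = 8080  # assumption: the host port forwarded to Apache in the VM
    state = "reachable" if is_port_open("127.0.0.1", port) else "not reachable"
    print(f"Web server on 127.0.0.1:{port} is {state}")
```

A successful connection only shows that something is listening on that port; it doesn't prove that Apache or phpPgAdmin is answering correctly.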

The MOOC includes a section on PHP + SQL programming, whose exercises can be performed using a shared sub-folder of /vagrant/. This allows editing on the host with one’s favourite native editor/IDE, while running PHP inside the VM’s Apache + mod_php.

The sources of our environment are available as free software, if you’re interested in replicating a similar environment for another project.

As we’re still polishing the environment before the MOOC opens (on September 10th), I’m not mentioning the box URLs, but they shouldn’t be too hard to find if you’re investigating (referring to the FusionForge project’s web site).

We don’t know yet how suitable this environment will be for learning SQL and database design and programming, or whether Vagrant will bring more difficulties than benefits. Still, we hope that the participants will find it practical: they can work on the labs and exercises whenever and wherever they choose, without the pain of installing and configuring an RDBMS on their machines, and without having to be connected to a cloud or to our overloaded servers. Of course, one limitation will be the requirements on the host machines, which will need to be reasonably modern in order to run a virtualized Linux system. Another is access to high bandwidth for downloading the boxes, but this is kind of a requirement already for downloading/watching the videos of the MOOC classes 😉
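To make these two limitations a bit more concrete, here is a small Python sketch that checks whether the two tools are found on the host's PATH, and estimates how long a box download would take at a given bandwidth. Both helpers are our own illustration. The command names `vagrant` and `VBoxManage` are the usual ones, but on some systems (Windows in particular) VirtualBox may be installed without `VBoxManage` being on the PATH, so a "missing" result is only a hint.

```python
import shutil


def missing_tools(tools=("vagrant", "VBoxManage")):
    """Return the names of the tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def download_seconds(size_mb: float, mbit_per_s: float) -> float:
    """Estimate the download time of size_mb megabytes at mbit_per_s megabits/s."""
    if mbit_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    # 1 byte = 8 bits, so megabytes * 8 gives megabits.
    return size_mb * 8 / mbit_per_s


if __name__ == "__main__":
    missing = missing_tools()
    print("Missing tools:", ", ".join(missing) if missing else "none")
    # 350 MB is the upper bound of the compressed box size given above;
    # 8 Mbit/s is an arbitrary example bandwidth.
    print(f"Estimated download: {download_seconds(350, 8):.0f} s")
```

The estimate ignores protocol overhead and server-side limits, so real downloads will take somewhat longer.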

Big thanks go to our intern Stéphane Germain, who joined us this summer to work on this virtualized environment.

10 thoughts on “Building a lab VM based on Debian for a MOOC, using Vagrant + VirtualBox”

  1. Hi

    We’re also looking at using a headless Linux VM originally built using vagrant/puppet for use in an Open University distance education course on data and databases. Our VM includes PostgreSQL and MongoDB, with IPython notebooks delivering the activities. As with you, we’re looking at making 32 and 64 bit versions available. Students will download and run VMs on their own computers.

    I put tty.sh into the box as well to provide ssh access if required.

    At the moment the plan is to get students to boot a prebuilt VM using vagrant. The script will provision running services but not do updates etc. I guess we could instead give a prebuilt box that autoruns services, but the vagrant startup route provides us with some flexibility aside from shipping new boxes.

    Some old notes documenting my thinking several months ago can be found at:
    http://blog.ouseful.info/2014/05/15/confused-again-about-vm-ecology-i-blame-not-blogging/
    and a more recent review at:
    https://docs.google.com/document/d/1A4voiCM22-3KmHkq_FhKUxSXVI4mFKV8_A0-brB2IgE/edit?usp=sharing

    We’re also looking at using docker within the VM to fire up several containers containing separate mongo instances to demonstrate mongo replica sets in one particular activity.

    I’d be really keen to hear how you got on….:-)

  2. Hi Tony.

    Glad you noticed this post of mine 🙂

    Actually, I think the experiment went quite well, as far as I can judge (you don’t hear so much from satisfied participants in the MOOCs, or at least less than from those having issues 😉)

    Still there have been a few failures, linked to some VirtualBox networking issues, mainly, AFAIU, but it’s hard to determine exactly what breaks when you’re not a powerful Windows user, and when the only support channel is the MOOC forums :-/

    We’re currently preparing a communication (a paper to be submitted to a forthcoming CfP) that will try to summarize the experiment.

    I’ll read your documents and will get back to you.

  3. Re: “there have been a few failures, linked to some VirtualBox networking issues” – our VM will undergo some quite serious testing as part of the OU quality process, so I’m looking forward (?!, erm, that’s not quite right, is it?!) to what they come back with.

    The proof will come in the first presentation – what I’m hoping is that if we get any issues some of the students who may encounter them are willing to work with us in trying to properly debug them. I will then blog whatever gotchas we find and what fixes we come up with.

  4. I’m not sure whether it’s possible to get assistance from Oracle for instance, on VirtualBox. In our case, that wouldn’t be possible given the free (as in free beer) MOOC and low available budget on our side, but YMMV 😉
