Tagged a first version of the TWiki to FusionForge’s MediaWiki converter

As announced previously, I’ve been hacking on a migration tool that imports the contents of a TWiki wiki, converted on the fly, into the MediaWiki of a FusionForge project.

I’ve successfully imported a first project (from PicoForge to FusionForge) using the tool, so I’ve decided to tag a first release and make the Git repo accessible.

More details at: https://fusionforge.int-evry.fr/projects/pytwiki2mediawi/

Feel free to ask in the comments here, or by email, if you need anything.

And, yes, my Python is most likely awful, but at least it works, and it is much more featureful than the existing tools I could test.

Working on a TWiki to MediaWiki converter (targeting FusionForge wikis)

I’m currently working on a wiki converter that will let me transfer old TWiki wikis (hosted on PicoForge) to MediaWikis hosted on FusionForge.

Unlike the existing tools I’ve found that more or less target the same needs, mine addresses two peculiarities:

  • using MediaWiki’s API to perform the import, where many existing tools seem to use direct SQL queries: this should allow non-administrator users to do the job,
  • importing into the wikis of projects hosted on FusionForge instances, even when the project is not public, which means that the API calls need to authenticate against FusionForge first (a minimal sketch of what that looks like follows below).
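For illustration, here is roughly what such an import step can look like in Python with the requests library. The forge login URL, the api.php location and the login form field names below are placeholders, not what the tool actually uses; adapt them to your forge.

#! /usr/bin/env python
# Minimal illustrative sketch (not the pytwiki2mediawi code itself): write one
# page through the MediaWiki API, after first authenticating against the forge
# so that a private project's wiki is reachable.  The forge login URL, api.php
# location and form field names are assumptions and will differ between
# FusionForge installations.
import requests

FORGE_LOGIN = "https://fusionforge.example.org/account/login.php"                  # hypothetical
API = "https://fusionforge.example.org/plugins/mediawiki/wiki/myproject/api.php"   # hypothetical
USER, PASSWORD = "me", "secret"

s = requests.Session()

# 1. Forge-level login: get the forge session cookie needed for non-public projects.
s.post(FORGE_LOGIN, data={"form_loginname": USER, "form_pw": PASSWORD})  # field names are guesses

# 2. MediaWiki login: the classic two-step action=login handshake.
r = s.post(API, data={"action": "login", "lgname": USER, "lgpassword": PASSWORD,
                      "format": "json"}).json()
s.post(API, data={"action": "login", "lgname": USER, "lgpassword": PASSWORD,
                  "lgtoken": r["login"]["token"], "format": "json"})

# 3. Fetch an edit token (older API style; newer MediaWiki uses meta=tokens) and save the page.
q = s.get(API, params={"action": "query", "prop": "info", "intoken": "edit",
                       "titles": "ImportedPage", "format": "json"}).json()
token = list(q["query"]["pages"].values())[0]["edittoken"]
s.post(API, data={"action": "edit", "title": "ImportedPage",
                  "text": "Converted wiki text goes here", "token": token,
                  "format": "json"})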

The tool is written in Python and will include my own crappy wiki syntax converter, instead of spawning existing Perl scripts as others do.
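To give an idea of what the syntax conversion involves, here is a deliberately partial sketch of the kind of line-oriented regex rewriting needed (headings, bullets, bold and italic only); it is not the converter shipped in the tool.

#! /usr/bin/env python
# Illustrative sketch of a tiny TWiki-to-MediaWiki markup pass.
import re

def twiki_to_mediawiki(text):
    out = []
    for line in text.splitlines():
        # ---+ Title / ---++ Subtitle ...  ->  = Title = / == Subtitle == ...
        m = re.match(r'^---(\++)\s*(.*)$', line)
        if m:
            eq = '=' * len(m.group(1))
            line = '%s %s %s' % (eq, m.group(2), eq)
        else:
            # TWiki bullets are indented by multiples of 3 spaces -> '*', '**', ...
            m = re.match(r'^((?:   )+)\*\s+(.*)$', line)
            if m:
                line = '*' * (len(m.group(1)) // 3) + ' ' + m.group(2)
        # *bold* -> '''bold'''   and   _italic_ -> ''italic''
        line = re.sub(r"(?<!\w)\*([^*\n]+)\*(?!\w)", r"'''\1'''", line)
        line = re.sub(r"(?<!\w)_([^_\n]+)_(?!\w)", r"''\1''", line)
        out.append(line)
    return '\n'.join(out)

print(twiki_to_mediawiki("---+ Welcome\n   * first item\n      * nested, with *bold* text"))

TWiki markup is largely line-oriented, so a per-line pass like this gets reasonably far; the harder part is everything it ignores here (verbatim blocks, tables, WikiWord links, attachments).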

It may happen to work for Foswiki too, but for a start I don’t intend to use it beyond our old TWiki installations.

Stay tuned for more progress updates.

Edit: I’ve now released a first version.

Offline backup of a MediaWiki with httrack

I recently needed to restore the contents of a wiki running MediaWiki. Unfortunately there were no backups, and my only option was to restore from an outdated version that was available in Google’s cache.

The problem was that I only had the HTML “output” version, and copy-pasting it into the wiki sources at restore time lost all formatting and links.

So I’ve come up with the following script, which is cron-ed to make systematic backups in the background, both of an offline-viewable version of the wiki, as static HTML pages, and of the wiki pages’ sources, for eventual restoration.

It uses the marvelous httrack and wget tools.

Here we go:

#! /bin/sh

site=wiki.my.site                           # hostname of the wiki to back up
topurl=http://$site

backupdir=/home/me/backup-websites/$site    # where the mirror and raw sources go

# Mirror the wiki starting from Special:Allpages.  The +/- filters skip special,
# user, talk and help pages as well as per-revision and edit URLs, but keep
# Special:Recentchanges, stylesheets and page histories.
httrack -%i -w $topurl/index.php/Special:Allpages \
-O "$backupdir" -%P -N0 -s0 -p7 -S -a -K0 -%k -A25000 \
-F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F '' \
-%s -x -%x -%u \
"-$site/index.php/Special:*" \
"-$site/index.php?title=Special:*" \
"+$site/index.php/Special:Recentchanges" \
"-$site/index.php/Utilisateur:*" \
"-$site/index.php/Discussion_Utilisateur:*" \
"-$site/index.php/Aide:*" \
"+*.css" \
"-$site/index.php?title=*&oldid=*" \
"-$site/index.php?title=*&action=edit" \
"-$site/index.php?title=*&curid=*" \
"+$site/index.php?title=*&action=history" \
"-$site/index.php?title=*&action=history&*" \
"-$site/index.php?title=*&curid=*&action=history*" \
"-$site/index.php?title=*&limit=*&action=history"

# Extract the list of mirrored page names from httrack's log, then fetch each
# page's raw wiki source next to its HTML copy, for eventual restoration.
for page in $(grep "link updated: $site/index.php/" $backupdir/hts-log.txt | sed "s,^.*link updated: $site/index.php/,," | sed 's/ ->.*//' | grep -v Special:)
do
wget -nv -O "$backupdir/$site/index.php/${page}_raw.txt" "$topurl/index.php?title=$page&action=raw"
done
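When a restore is actually needed, the saved *_raw.txt sources can be pushed back through the MediaWiki API rather than copy-pasted by hand. Here is a rough sketch of that in Python with the requests library; the api.php URL, credentials and paths are placeholders to adapt to your wiki.

#! /usr/bin/env python
# Rough restore sketch: re-inject the *_raw.txt sources saved by the backup
# script through the MediaWiki API.  URL, credentials and paths are placeholders.
import glob, os, requests

API = "http://wiki.my.site/api.php"                                  # adjust to the real api.php
RAWDIR = "/home/me/backup-websites/wiki.my.site/wiki.my.site/index.php"

s = requests.Session()
# classic two-step MediaWiki login (action=login)
r = s.post(API, data={"action": "login", "lgname": "me", "lgpassword": "secret",
                      "format": "json"}).json()
s.post(API, data={"action": "login", "lgname": "me", "lgpassword": "secret",
                  "lgtoken": r["login"]["token"], "format": "json"})

for path in glob.glob(os.path.join(RAWDIR, "*_raw.txt")):
    title = os.path.basename(path)[:-len("_raw.txt")]
    text = open(path).read()
    # fetch a per-page edit token (older API style; newer MediaWiki uses meta=tokens)
    q = s.get(API, params={"action": "query", "prop": "info", "intoken": "edit",
                           "titles": title, "format": "json"}).json()
    token = list(q["query"]["pages"].values())[0]["edittoken"]
    s.post(API, data={"action": "edit", "title": title, "text": text,
                      "token": token, "format": "json"})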

Hope this helps,