Training Bayesian filters to recognize discussion topics in mailing list archives?

We discussed this idea with Sander Striker at the OSS2008 WoPDaSD 2008 workshop, and I’d be curious to know whether anyone has tried analysing the contents of mailing lists with SpamAssassin’s Bayesian filter in order to detect particular topics of discussion, instead of filtering out spam.

Researchers studying FLOSS projects sometimes analyse mailing lists to detect communication patterns.

Maybe SpamAssassin could be used as a common Bayesian analyser, trained to recognize subjects of discussion, in order to help detect common topics in mailing lists? Of course this may already have been tried… I’d be curious to read your comments on that idea.
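
To make the idea a bit more concrete, here is a rough sketch of what “training a filter on a topic instead of on spam” could look like. It is a plain textbook naive Bayes classifier with add-one smoothing, not a reimplementation of what SpamAssassin or bogofilter do internally. The `TopicFilter` class, its method names and the sample messages are all made up for illustration.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


class TopicFilter:
    """Hypothetical two-class naive Bayes filter.

    "topic" plays the role that "spam" plays in a spam filter,
    "other" plays the role of "ham".
    """

    def __init__(self):
        self.counts = {"topic": Counter(), "other": Counter()}
        self.messages = {"topic": 0, "other": 0}

    def train(self, text, on_topic):
        label = "topic" if on_topic else "other"
        self.counts[label].update(tokenize(text))
        self.messages[label] += 1

    def _log_score(self, tokens, label, vocab_size):
        # Smoothed log prior for the class.
        total_msgs = sum(self.messages.values())
        score = math.log((self.messages[label] + 1) / (total_msgs + 2))
        # Smoothed log likelihood of each token given the class.
        total_words = sum(self.counts[label].values())
        for tok in tokens:
            score += math.log(
                (self.counts[label][tok] + 1) / (total_words + vocab_size)
            )
        return score

    def probability(self, text):
        """Return an estimate of P(on topic | text), between 0 and 1."""
        tokens = tokenize(text)
        vocab = set(self.counts["topic"]) | set(self.counts["other"])
        vocab_size = len(vocab) or 1
        diff = (self._log_score(tokens, "other", vocab_size)
                - self._log_score(tokens, "topic", vocab_size))
        if diff > 700:  # avoid overflow in math.exp
            return 0.0
        return 1.0 / (1.0 + math.exp(diff))


# Made-up training messages: the topic to observe is "releases".
release_filter = TopicFilter()
release_filter.train("We should cut the release candidate this week", True)
release_filter.train("Vote on the 2.0 release tarball and release notes", True)
release_filter.train("The build fails on Windows with a linker error", False)
release_filter.train("Patch for the parser bug attached", False)

on_topic_score = release_filter.probability(
    "Please vote on the release candidate")
off_topic_score = release_filter.probability(
    "Linker error in the build on Windows")
print(on_topic_score, off_topic_score)
```

With a real archive, one such filter would presumably be trained per topic of interest, with hand-labelled messages as the training set, and every message in the archive would then be scored against each filter.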

Update 2008/09/22: more accurate title, as it is not all about SpamAssassin (see comment below).

2 thoughts on “Training Bayesian filters to recognize discussion topics in mailing list archives?”

  1. Actually, SpamAssassin is of course not the only application that could be used for this… As we were speaking with an Apache project representative, SpamAssassin came to mind, but any Bayesian filter solution would do.

    Using bogofilter might even be more convenient, as SpamAssassin doesn’t rely only on Bayesian techniques to filter spam.

    Of course, the filter would have to be trained on a separate set of “spam” messages (that is, messages on the topic) for each conversation that needs to be observed.

    I wonder what it would take to develop something like that for Evolution or Thunderbird, for instance…
