[SLL] Mailman/pipermail plain text archives: convert to mbox?

Paul paul at oz.net
Mon Feb 20 22:22:26 EST 2006


I think I figured it out.

First, as you found out, my script had a bug generating the very first
line feed.  I didn't catch it because pine is more forgiving.  :p

Second, it looks like mutt is stricter about the mbox format than pine;
in particular, it is picky about the date format and it requires a real
email address in the envelope From.

  http://www.mutt.org/doc/manual/manual-4.html#ss4.6
  http://lists.debian.org/debian-68k/2003/08/msg00027.html
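
For illustration, here is a minimal sketch of building an envelope From
line in the conventional asctime-style layout.  The helper name
envelope_from is made up for this example, and the assumption that this
exact layout is what mutt accepts comes from the links above rather than
from testing every case:

```python
import time

def envelope_from(address, when):
    """Build an mbox separator line (hypothetical helper).

    address: a real address such as "paul@oz.net"
    when:    a time.struct_time for the message date
    """
    # time.asctime() yields e.g. "Mon Feb 20 22:22:26 2006"
    return "From %s %s\n" % (address, time.asctime(when))
```

The date carries no timezone and no comma, unlike the Date: header inside
the message.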

Third, the particular mbox archives you were looking at had obfuscated
the sender email addresses (e.g. from "foo@bar.com" to "foo at bar.com").
Mutt doesn't like this.
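
A rough sketch of undoing that obfuscation on the envelope line only.
The function name fix_envelope and the regular expression are
assumptions for this example; it assumes the separator has the shape
"From user at host  <date>" and leaves every other line alone:

```python
import re

# Assumed shape of an obfuscated separator: "From user at host  <date>"
OBFUSCATED = re.compile(r"^From (\S+) at (\S+)")

def fix_envelope(line):
    """Rewrite 'From user at host ...' as 'From user@host ...'."""
    # count=1 so only the address at the start of the line is touched
    return OBFUSCATED.sub(r"From \1@\2", line, count=1)
```

A body line that happens to look like "From here at home" would also be
rewritten, so this is a heuristic, not a parser.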

So attached below is a revised Python script.  It's not a complete
solution (it doesn't bother to unobfuscate all email addresses, just
the envelope From), but it seems to work:

  $ ./to-mbox.py 2004-January.txt  out
  $ mutt -f out                        # (shows the email thread)
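
If you'd rather check the result without opening mutt, something like
the following should work, assuming a Python that ships the standard
mailbox module with its mbox class.  The function name count_messages is
made up for this example:

```python
import mailbox

def count_messages(path):
    """Return how many messages the mbox file at path contains."""
    # create=False: fail instead of silently creating an empty mailbox
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()
```

Calling count_messages("out") should give the number of emails in the
archive; a count of 0 or 1 for a busy month suggests the separators were
not recognized.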

Cheers,

-- Paul

$ cat to-mbox.py
#!/usr/bin/env python
"""
to-mbox.py:  Insert line feeds to create mbox format

Usage:   ./to-mbox.py  infile outfile
"""
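import sys

if len(sys.argv) != 3:
    print(__doc__)
    sys.exit(1)

infile = open(sys.argv[1])
out = open(sys.argv[2], "w")

first = True
for line in infile:
    # An envelope "From " line marks the start of each message.
    if line.startswith("From "):
        # Put a blank line between messages, but not before the first one.
        if first:
            first = False
        else:
            out.write("\n")
        # Undo the obfuscation of the envelope address only.
        line = line.replace(" at ", "@", 1)
    out.write(line)

infile.close()
out.close()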



