mhrecoll

nmh/MH frontend for recoll queries

Installation

Install recoll, its Python 2.7 API, and the Python 2.7 interpreter. Supporting Python 3 should require only small changes, but I have not tried it yet because the recoll package I have installed (from the FreeBSD package repository) builds only the Python 2.7 API.

You do not need nmh.

mhrecoll itself is distributed as a single file that you can download and put in your path. Something like this will work, where 0.3 is the mhrecoll version.

wget -O /usr/local/bin/mhrecoll https://apenstaartje.catsy.org/fossil/mhrecoll/uv/build/mhrecoll-0.3
chmod +x /usr/local/bin/mhrecoll

Configuration

mhrecoll itself uses only the mh_profile as a configuration file. You may need to edit this file even if you are using a different mail user agent.

You may also want to edit the recoll.conf.

mh_profile

mhrecoll uses only one parameter from this file, Path, and even that is optional. Set it to the root directory of your mail. The results are saved in a folder under this directory, and you can choose that folder with -o.

If the Path is relative, it is treated as relative to the home directory. If no Path is set, Mail is used as the path.
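
The Path rules above can be sketched in a few lines of Python. This is only an illustration, not mhrecoll's own code: the function name resolve_mail_root is hypothetical, and matching the component name case-insensitively is an assumption of the sketch.

```python
import os


def resolve_mail_root(profile_text, home):
    """Return the mail root implied by the text of an mh_profile.

    Hypothetical helper for illustration; not part of mhrecoll.
    """
    path = "Mail"  # default when the profile has no Path entry
    for line in profile_text.splitlines():
        name, sep, value = line.partition(":")
        # Assumption: the component name is matched case-insensitively.
        if sep and name.strip().lower() == "path" and value.strip():
            path = value.strip()
            break
    # os.path.join discards `home` when `path` is absolute, so a
    # relative Path ends up under the home directory.
    return os.path.join(home, path)


print(resolve_mail_root("Path: mh\nEditor: vi\n", "/home/krullen"))
```

With a profile that has no Path line at all, the same call falls back to the Mail directory under the home directory.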

recoll.conf

If you want to search just your emails, you could use a ~/.recoll/recoll.conf like this. The default pyloglevel is 3, which results in more logging than I like.

topdirs = ~/Mail
skippedNames = ,*
pyloglevel = 1

(I search all my files, so my recoll.conf is much longer.)

The recoll configuration above is similar to the mairix configuration below.

base=~/Mail
mh=...
database=~/.cache/mairix
mfolder=mairix
mformat=mh

Usage

The user interface is inspired by that of mairix.

mhrecoll query ...
scan +recoll

For example, here are some emails, some linked from individual files in MH directories and others unpacked from mbox files.

$ mhrecoll recipient:nmh-workers date:2019-09-09/
$ scan +recoll -width 72
   1   Mon *Krullen Van De Tr  Re: Your confirmation is required to join
   2   Mon *Krullen Van De Tr  Re: Your confirmation is required to join
   3  11:23 Ralph Corderoy     Re: [nmh-workers] mhrecoll<<Return-Path: 
   4  20:00 Andy Bradford      [nmh-workers] inc: Unable to find a line 
   5   Mon  Krullen Van De Tr  [nmh-workers] mhrecoll<<Return-Path: <nmh
   6  21:45 Michael Richardso  Re: [nmh-workers] strace show # pauses/ta
   7  00:50 Ken Hornstein      Re: [nmh-workers] strace show # pauses/ta
   8  21:04 Ken Hornstein      Re: [nmh-workers] inc: Unable to find a l
   9   Mon  nmh@trodman.com    Re: [nmh-workers] strace show # pauses/ta
  10  20:10 Michael Richardso  Re: [nmh-workers] strace show # pauses/ta

Next are two copies each of two different files from the recoll source code, as found on my computer. The first two messages have From and Subject fields because the original files are HTML and have appropriate annotations. The latter two messages are C++ files; recoll recognizes no special annotation in their content for the From and Subject headers, so these headers are set to the file owner's name and the last component of the file's path, respectively.

$ mhrecoll filename:mh_mail.cpp/helpernotes.html
$ scan +recoll -width 72
   1  Nov18 Jean-Francois Doc  RECOLL: a personal text search system for
   2  Dec18 Jean-Francois Doc  RECOLL: a personal text search system for
   3  Nov18 krullen            mh_mail.cpp<<Content-Type: text/x-c Conte
   4  Dec18 krullen            mh_mail.cpp<<Content-Type: text/x-c Conte
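
The fallback for files without recognizable annotations could look roughly like the following Python sketch. It is an assumption about the general idea, not mhrecoll's actual code; the names file_owner and fallback_headers and the example path are hypothetical.

```python
import os
import pwd


def file_owner(path):
    """Login name of the user who owns `path` (Unix only)."""
    return pwd.getpwuid(os.stat(path).st_uid).pw_name


def fallback_headers(path, owner_name):
    """Headers for a file in which no author or title was recognized."""
    return {
        "From": owner_name,
        # Last component of the path, e.g. "mh_mail.cpp".
        "Subject": os.path.basename(path),
    }


# The path below is made up for the example; for a file that exists,
# file_owner(path) would supply the owner name.
headers = fallback_headers("/shared/src/recoll/mh_mail.cpp", "krullen")
print(headers["From"], headers["Subject"])
```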

Now we have files from the mhrecoll package. Message 1 is a file named __init__.py, and message 3 is the same content but inside a zip archive called mhrecoll-0.2.zip. Message 2 is the file from mhrecoll version 0.1.

$ mhrecoll rfc822 recoll external_body
$ scan -width 72
   1  12Aug krullen            __init__.py<<Content-Type: text/x-python 
   2  12Aug krullen            mhrecoll/__init__.py<<#!/usr/bin/env pyth
   3+  Sun  krullen            mhrecoll/__init__.py<<#!/usr/bin/env pyth
$ mhpath first-last | xargs head
==> /home/krullen/mh/recoll/1 <==
From: krullen
To: mhrecoll
Subject: __init__.py
Date: Mon, 12 Aug 2019 02:50:35 -0000
Content-Type: message/external-body; access-type=local-file;
 name="/shared/src/mhrecoll/mhrecoll/__init__.py"

Content-Type: text/x-python
Content-Disposition: attachment; filename="__init__.py"
Content-Description: __init__.py

==> /home/krullen/mh/recoll/2 <==
From: krullen
To: build
Subject: mhrecoll/__init__.py
Date: Mon, 12 Aug 2019 02:52:07 -0000
Content-Type: text/x-python
Content-Disposition: attachment; filename="mhrecoll-0.1.zip"
Content-Description: mhrecoll-0.1.zip

#!/usr/bin/env python2.7
from os.path import join

==> /home/krullen/mh/recoll/3 <==
From: krullen
To: build
Subject: mhrecoll/__init__.py
Date: Sun, 08 Sep 2019 23:35:48 -0000
Content-Type: text/x-python
Content-Disposition: attachment; filename="mhrecoll-0.2.zip"
Content-Description: mhrecoll-0.2.zip

#!/usr/bin/env python2.7
__all__ = ('main',)
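
Message 1 above is a message/external-body stub (RFC 2046) that points at a local file instead of copying its content. A minimal Python sketch of how such a stub can be assembled follows. The function name external_body_stub is hypothetical, the Date header is left out for brevity, and this is not necessarily how mhrecoll builds its messages.

```python
from email import message_from_string


def external_body_stub(path, sender, recipient, content_type):
    """Build a message that refers to a local file rather than embedding it."""
    name = path.rsplit("/", 1)[-1]
    lines = [
        "From: " + sender,
        "To: " + recipient,
        "Subject: " + name,
        "Content-Type: message/external-body; access-type=local-file;",
        ' name="%s"' % path,  # folded continuation of Content-Type
        "",
        # Headers that describe the external file itself.
        "Content-Type: " + content_type,
        'Content-Disposition: attachment; filename="%s"' % name,
        "Content-Description: " + name,
        "",
    ]
    return "\n".join(lines) + "\n"


stub = external_body_stub(
    "/shared/src/mhrecoll/mhrecoll/__init__.py",
    "krullen", "mhrecoll", "text/x-python",
)
# Parse the stub back to check that it is well formed.
parsed = message_from_string(stub)
print(parsed.get_content_type(), parsed.get_param("access-type"))
```

Keeping only a reference means the results folder stays small, while a MIME-aware viewer can still follow the name parameter to the original file.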

Tips

Mount a RAM filesystem on +recoll. On FreeBSD I do this.

directory=/home/krullen/mh/recoll

own="$(stat -f %u:%g "$directory")"
mod="$(stat -f %p "$directory" | tr -d \\n | tail -c3)"
dev="/dev/$(mdconfig -s 1g)"
newfs -U "$dev"
mount "$dev" "$directory"
chown "$own" "$directory"
chmod "$mod" "$directory"

You can configure how to open the documents in mh_profile with the mhshow-show-(mimetype) parameters.

When you incorporate new mail, you can also index just the new messages with recollindex -i.
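
If you drive this from a script, a wrapper along these lines may help. It is a sketch under assumptions: the function name index_new_mails is hypothetical, the message path in the example is made up, and the sketch assumes that recollindex -i accepts the files to index as command-line arguments.

```python
import subprocess


def index_new_mails(paths, run=subprocess.check_call):
    """Ask recollindex to index only the given message files.

    `run` can be replaced, for example by print, to inspect the
    command without executing it.
    """
    paths = list(paths)
    if not paths:
        return None  # nothing was incorporated, so nothing to index
    # Assumption: recollindex -i takes the file names as arguments.
    return run(["recollindex", "-i"] + paths)


# Dry run: show the command instead of running recollindex.
index_new_mails(["/home/krullen/mh/inbox/42"], run=print)
```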

See also