Last modified 2005 MAR 30 15:48:42 GMT
See also the PROCMAIL UNSUBSCRIBE FAQ, PROCMAIL BLOG, PROCMAIL SMTP FILTERING FAQ, SPAM filtering using PROCMAIL, and How to configure procmail as a sendmail rule-invoked filter.
So, you think my response on the Procmail discussion list was rude? Please think again. (FTR that document is authored by the same chap who wrote the fetchmail tool commonly used to retrieve email from other hosts).
If you reply to one of my posts with a message whose body is an attachment, chances are good I won't bother reading it. If you wonder why you don't get a reply from me (or perhaps other procmail list participants), particularly if you were following up on a discussion in which I was participating, that may very well be the reason.
On the procmail mailing list, people often post questions asking for assistance with accomplishing certain tasks. More than occasionally, I will take my time and post a procmail recipe or framework which should be suitable for that person's needs. As I am not being paid to do this, and I already expend a large amount of my time debugging other people's problems, my posts are not always complete solutions, but rather the framework necessary for that user to proceed to completing their own work.
Unless otherwise indicated in that post, all such recipes are untested, and work only in theory. Typos may exist, or specific file permissions may have been overlooked, etc. Nevertheless, the recipe should serve as a good start to the solution to your problem, and is worth at least twice what you paid me for it, if not more.
It is important to remember that there are usually many ways to accomplish something, and that not all platforms are the same.
This author typically runs the current stable RELEASE version of procmail (i.e. not the DEVELOPMENT version), and at worst, one rev prior, on his various servers. If you are running an archaic version of Procmail, some features may not be available to you. (Your version is difficult for this author to determine, as most individuals DO NOT post their procmail version or any other system specifics - or seem too preoccupied with describing how they're using RedHat version-somethingorother, which is meaningless to people who long ago quit keeping track of the excuses that RedHat calls releases.) This author also runs Sendmail as his MTA, rather than Postfix, Qmail or some of the other SMTP agents now available for UNIX-based OSes - your mileage may vary depending on the SMTP agent and specific configuration your host is run with (including whether procmail is the LDA (as it is here), or must be invoked via a .forward file).
You can determine your procmail version by executing the following at a shell prompt:
procmail -v
You may be advised to check the procmail FAQs, which are linked from the procmail homepage, or to search the list archives (also linked from the same page). The man pages are also a useful resource, and directly answer an alarming number of the inquiries posted on the procmail list:
man procmail
man procmailex
man procmailrc
man procmailsc
man formail
There is NO EXCUSE for failing to check the documentation or list archives before asking other people to do your work for you. Failure to RTFM will make you appear to many to be a mooch - here expecting others to do your work for you because you don't want to spend your time reading the provided documentation (which includes the various FAQs and quickstarts).
There's a book, The Procmail Companion, written by a list regular, Martin McCarthy. If you find that the manpages don't explain things well enough for you, the book should help a lot.
Experienced procmail users (indeed, experienced computer users) know that placing a script into active use without first testing it in a controlled environment such as a sandbox is a foolhardy thing to do and often leads to disaster (in procmail, it is trivial to generate a mail loop which can consume all disk space and generate volumes of email - a move which generally makes you unpopular with your sysadm as well as anyone who may have been receiving your messages). Thus, running standalone procmail scripts (with a few key changes, such as mail folder and forward addresses - or no changes whatsoever if you have a good sandbox wrapper) manually against a static mailbox containing test messages is generally the expected test environment. Advanced procmail users will feel more at home making certain changes directly to their live procmail configurations, but this is something which only advanced users should do - beginning users, especially when putting an entirely new-to-them recipe into use, should never simply make it the live config without first thoroughly testing it.
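A minimal sketch of such a sandbox follows. It assumes a POSIX shell and mktemp; the recipe, the folder name testlist, and the test message are hypothetical stand-ins for whatever you are actually testing. Every path in the rcfile points inside the throwaway directory, so nothing touches your live mail. The procmail run itself is skipped if procmail is not installed.

```shell
# Build a throwaway sandbox: its own rcfile, mail directory, and one test message.
sandbox=$(mktemp -d)
mkdir "$sandbox/Mail"

# The rcfile under test. All paths stay inside the sandbox.
cat > "$sandbox/test.rc" <<EOF
MAILDIR=$sandbox/Mail
DEFAULT=\$MAILDIR/inbox
LOGFILE=$sandbox/procmail.log
VERBOSE=on

# Hypothetical recipe under test: file list traffic into its own folder.
:0:
* ^Subject:.*\[test-list\]
testlist
EOF

# A static test message in mbox form (From line with no colon first).
cat > "$sandbox/test.msg" <<'EOF'
From sender@example.com Fri Mar 28 16:24:00 2003
From: sender@example.com
To: you@example.com
Subject: [test-list] sandbox check

Body of the test message.
EOF

# Run the rcfile by hand against the static message, if procmail is available.
if command -v procmail >/dev/null 2>&1; then
    procmail "$sandbox/test.rc" < "$sandbox/test.msg" \
        || echo "procmail exited with status $?"
    echo "Review $sandbox/procmail.log and the folders under $sandbox/Mail"
else
    echo "procmail not found; sandbox files are in $sandbox"
fi
```

To replay a whole mailbox of test messages rather than a single one, formail -s can split the mailbox and pipe each message to the same procmail command.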
When you see posts containing expressions similar to:
* ^From:[ ]*\/[^ ].*
(specifically the [ ] or [^ ] parts, where the brackets appear to contain more than one whitespace), it is generally safe to assume that the bracket contains both a SPACE and a TAB (although the message may have been reformatted somewhere between the original author and you). Those familiar with regular expressions know that there is no purpose to having more than one occurrence of the same character within the brackets, since it defines a character CLASS, not a discrete string. This expression is so commonly used that this author now rarely expressly points out that "the brackets contain a space and a tab."
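If you want to see the space-and-tab class at work outside of procmail, the following sketch builds it explicitly with printf, so no editor or mail client can quietly turn the tab into spaces. It uses sed rather than procmail, so it only imitates what the \/ operator does (procmail places the text matched to the right of \/ into MATCH); the function name extract_from is made up for this illustration. Note also that sed is case-sensitive here, which procmail by default is not.

```shell
# Character class contents: exactly one SPACE and one TAB.
ws=$(printf ' \t')

# Print whatever follows "From:" and any leading spaces/tabs,
# roughly what procmail would leave in MATCH for the condition shown above.
extract_from() {
    sed -n "s/^From:[$ws]*\([^$ws].*\)\$/\1/p"
}

# A header whose separator is a TAB followed by a SPACE.
printf 'From:\t user@example.com\n' | extract_from   # prints: user@example.com
```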
It is certainly NOT the responsibility of this author to ensure that you follow accepted procedure and test any scripts which this author may publish, either to the procmail list, via the web, or in direct communication. That responsibility rests squarely on the shoulders of the individual sitting between your chair and keyboard.
I wrote the procmail diagnostic script - a shell script which attempts to collect data about your configuration, and offers up some warnings about file security if noted. The output of this script helps those on the procmail list from having to play "20 questions" with you to extract information about your config which may be at the root of a problem you are experiencing. If you're having a procmail problem - not a script logic problem, but a problem actually invoking procmail, which may appear in your system logfiles, etc. - you should retrieve this script, review it (view the file, don't just run it because I said to), and when you're confident that it isn't bogus, run it as per the included documentation and evaluate the output. If you're still unclear, THEN post a description of your problem, along with the output of the script (NOT AS AN ATTACHMENT) to the procmail list. Please DO NOT send the output to me directly expecting one-on-one support unless you're planning to pay me for my time.
Not everything about your email can be automated in procmail. While procmail is a powerful mail processing tool, it does have limits - chief among them is that procmail IS NOT an SMTP agent, and therefore cannot reject mail at the time of the SMTP transaction (thus, BOUNCING SPAM is a dumb idea - you're not REJECTING the SPAM during the SMTP transaction, but sending it back out to an envelope address you must assume to be valid). Further, not all SMTP configurations are created the same, and when you use procmail to try to handle multiple mail aliases through one account, multiple CC's and BCC's will pose problems, as procmail will often not be advised of the true addressee of this individual message. Both of these broad issues are generally resolved within the SMTP agent configuration, which is the proper place to do these things.
Procmail may not always be configured as the LDA (Local Delivery Agent) in your MTA (Mail Transfer Agent, typically Sendmail) configuration. If not, you will need to create a .forward file to invoke procmail. The manpages should contain sufficient information to accomplish this, but as long as you're reading this page, I'll show you here:
Create a file in your home directory named .forward, containing EXACTLY the following (doublequotes, apostrophes, and spaces), replacing only your userid and correcting the path to procmail to match that on your system (use which procmail to easily locate it):
"|IFS=' ' && exec /usr/local/bin/procmail -Yf- || exit 75 #youruserid"
or, better, an alternative invocation:
"|IFS=' ' && p=/usr/local/bin/procmail && test -f $p && exec $p -Yf- || exit 75 #youruserid"
Then chmod 704 ~/.forward to properly secure the file from other users. As per the procmail manpage, .forward MUST be world readable.
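The steps above can be sketched as a small shell function. This is only an illustration: the name write_forward and its three arguments are made up here, and the demonstration writes to a scratch file rather than to your real ~/.forward, so running it changes nothing about your mail delivery.

```shell
# write_forward PROCMAIL_PATH USERID TARGET_FILE
# Writes the alternative .forward invocation and sets mode 704 on the result.
write_forward() {
    cat > "$3" <<EOF
"|IFS=' ' && p=$1 && test -f \$p && exec \$p -Yf- || exit 75 #$2"
EOF
    chmod 704 "$3"
}

# Demonstration against a scratch file, NOT the live ~/.forward.
scratch=$(mktemp)
write_forward /usr/local/bin/procmail youruserid "$scratch"
cat "$scratch"
ls -l "$scratch"
```

Once the output looks right, the same call with the path reported by which procmail, your own userid, and "$HOME/.forward" as the target would produce the real file.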
The || exit 75 bit causes the LDA to return the message to the mail queue to retry delivery again later (in the event that you have a problem with your procmail script or configuration). The magic number 75 can be found in the sendmail (or other MTA) sourcecode, as EX_TEMPFAIL:
# define EX_TEMPFAIL 75 /* temp failure; user is invited to retry */
The alternative invocation merely assigns the procmail path to a variable (for consistency between the two uses of it), tests that the file exists (test -f checks only for a regular file; test -x would check that it is executable), and if so, invokes it. It is otherwise functionally the same as the first example provided. It is perhaps more appropriate for use within an environment where you can't rely on procmail actually being present all the time (say, because you have a clueless sysadmin who is terrified of procmail, or moves it because someone abuses it in some fashion). Additionally, with the latter invocation, if procmail isn't present, the command exits with EX_TEMPFAIL and the mail is simply held in the queue for a later delivery attempt. With the former invocation (the one not actually testing for the presence of the procmail binary), if procmail isn't present, the message will be BOUNCED to the sender with an error, which isn't exactly an optimal way to handle a local error, is it?
The -Y commandline parm which appears is optional - I include it here because in a typical Procmail-as-LDA configuration (in the MTA config), it's an option which generally appears there. See the procmail manpage for an explanation.
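The difference between the two invocations when the binary is missing can be observed directly from a shell, without involving the MTA at all. The sketch below runs both command strings through sh against a deliberately nonexistent path (the path and the function names are hypothetical) and reports the exit status each one hands back. In a POSIX shell, a failed exec terminates a non-interactive shell on the spot, so the || exit 75 in the first form never gets a chance to run.

```shell
missing=/nonexistent/bin/procmail   # hypothetical path where no binary exists

# First form: exec the path blindly.
first_form() {
    sh -c 'IFS=" " && exec "$1" -Yf- || exit 75' sh "$missing" \
        </dev/null 2>/dev/null
}

# Alternative form: test for the file before trying to exec it.
alt_form() {
    sh -c 'IFS=" " && p=$1 && test -f "$p" && exec "$p" -Yf- || exit 75' sh "$missing" \
        </dev/null 2>/dev/null
}

first_rc=0; first_form || first_rc=$?
alt_rc=0;   alt_form   || alt_rc=$?

echo "first form exit status:       $first_rc"
echo "alternative form exit status: $alt_rc"
```

The alternative form should report 75 (EX_TEMPFAIL); the first form should report some other non-zero status (commonly 127), which the MTA has no reason to treat as temporary.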
Here's an example of the mail queue if the procmail binary were to become unavailable when using the alternative invocation:
-----Q-ID----- --Size-- -----Q-Time----- ------------Sender/Recipient-----------
h2T0OQZe032536      234 Fri Mar 28 16:24 <fromaddress@from.domain.tld>
                 (Deferred: prog mailer (/bin/sh) exited with EX_TEMPFAIL)
                                         "|IFS=' ' && p=/usr/local/bin/procmail
The #youruserid which trails the .forward syntax is intended to ensure that each .forward in a system is sufficiently unique (without it, your .forward and those of dozens of other users may actually be identical). If each .forward isn't unique, it is actually possible that some MTAs will optimize the duplicate invocations (normally expecting an address list) and discard some. While you shouldn't necessarily care whether some other sod on your server loses his mail because he's doing things improperly, YOU don't want that to happen to YOUR mail, so be sure to put your own userid there.
An unfortunate limitation in the newer releases of Sendmail is that for EX_TEMPFAIL deliveries, for some odd reason, it doesn't actually re-fetch the .forward contents (or for that matter, even check that a .forward still exists), having already resolved them once and stored the original content of the .forward into the mail queue file. Thus, if you have an error in the .forward itself (say, an invalid path to a program called from there), editing the contents of, or even completely removing the .forward file, WILL NOT correct the problem - Sendmail will refer to its saved version of the original .forward when attempting redelivery of a previously queued file. Of course, NEW messages arriving after the .forward has been fixed will not suffer this limitation, as Sendmail will evaluate the .forward anew. One can hope that the Sendmail development team will resolve this in future versions of Sendmail. The same may hold true with other MTAs as well. The bottom line: doublecheck the validity of your .forward syntax before you put it into play.
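One cheap check before putting a .forward into play is to have the shell parse the command without running it. The sketch below is a rough illustration only: the function name check_forward is made up, it assumes the file holds a single "|command" line in the style shown earlier, and it catches only shell syntax errors such as an unbalanced quote. It will not notice a wrong path to procmail, so check that separately.

```shell
# check_forward FILE
# Strips the surrounding "| ... " from the .forward line and asks sh to
# parse (not execute) what is left. Returns non-zero on a syntax error.
check_forward() {
    cmd=$(sed -e 's/^"|//' -e 's/"$//' "$1")
    printf '%s\n' "$cmd" | sh -n
}

# Demonstration against a scratch copy of the alternative invocation.
example=$(mktemp)
cat > "$example" <<'EOF'
"|IFS=' ' && p=/usr/local/bin/procmail && test -f $p && exec $p -Yf- || exit 75 #youruserid"
EOF

if check_forward "$example"; then
    echo "shell syntax looks OK"
else
    echo "shell syntax error - fix it before installing"
fi
```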
Email processing has a variety of curious acronyms. Although the definitions are readily available, I'll briefly define the common ones which see frequent use on the procmail list:
MTA | Mail Transfer Agent - the program which manages inter-system message delivery (and which also determines which type of delivery should be used, even for messages determined to be for local delivery). Sendmail, Postfix, Qmail, etc are all examples of an MTA. |
MSA | Mail Submission Agent - in newer versions of Sendmail, messages delivered to the mail system from the local host are queued through this process, which in turn passes them to the MTA. |
LDA | Local Delivery Agent - the delivery agent used by the MTA to deliver messages to local users. procmail may be configured in the MTA as the LDA, but may not necessarily be, in which case users must invoke it via .forward (presuming of course that the configured LDA pays any attention to the .forward file). |
MDA | Mail Delivery Agent - often assumed to be synonymous with LDA, but actually, this may refer to one of several delivery agents, including program delivery and file delivery. |
MUA | Mail User Agent - the program you use to read and write email (Elm, Pine, Emacs, Mutt, etc). |
SMTP | Simple Mail Transfer Protocol - the primary protocol through which email is passed from one mail host to another. Also what most remote MUAs use to initially send a message. |
ESMTP | Extended Simple Mail Transfer Protocol. An enhanced version of SMTP, offering some additional features. |
POP/POP3 | Post Office Protocol, or POP version 3 (which an unqualified "POP" is generally taken to be these days). This is the protocol which some MUAs - notably those which are remote from the mail host, use to retrieve mail from the user mailbox. |
IMAP | Internet Message Access Protocol. Sort of a combined remote MUA interface for SMTP and POP functionality, permitting mailboxes and retention of sorted email. |
mbox | Unix Mailbox format - a From line (with no colon) containing an address and a date precedes each message. |
DNSBL | DNS BlackList - a system through which either the MTA or a script running at the LDA can check the IP address of the sending host (and in some cases, intermediate hosts as well) to see whether it is published in a DNS-managed list of hosts which meet criteria as specified by each maintainer of that given DNSBL zone. DNSBLs implemented at the MTA level can be very effective at blocking spam, since you do not accept the spam body, but rather reject the message during or after initial SMTP identification handshaking. |
greenlist/whitelist | terms for an address list which is considered implicitly trusted. |
redlist/blacklist | terms for an address list which is considered explicitly distrusted. See also DNSBL. |
greylist | (or greylisting), a process whereby a message is temporarily refused during reception by the MTA, with the expectation that a valid sender will requeue the message, but a spammer will simply fail and not return (on THIS message at least). |
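To make the DNSBL entry above a little more concrete: a DNSBL check for an IPv4 address is conventionally done by reversing the octets of the address, appending the DNSBL zone, and looking that name up in DNS. The sketch below only builds the name and performs no lookup; the function name dnsbl_name and the zone dnsbl.example.org are placeholders, not a real list.

```shell
# dnsbl_name IPV4_ADDRESS ZONE
# Prints the DNS name that would be queried for this address in this zone.
dnsbl_name() {
    printf '%s\n' "$1" |
        awk -F. -v zone="$2" 'NF == 4 { print $4 "." $3 "." $2 "." $1 "." zone }'
}

dnsbl_name 192.0.2.99 dnsbl.example.org   # prints: 99.2.0.192.dnsbl.example.org
```

Whether a listing exists is then a matter of whether that name resolves, which is a question for your resolver and the policy of the particular list.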
If you or your company are looking for a tested and fully functional procmail script for some special purpose, this author is available at his standard consulting rates to develop such solutions for you.
Professional Software Engineering
EMail to: PSE-L@mail.professional.org