Python Email Libraries, part 2: IMAP (Page 4)
Fetching Message Information
The first article in this series discussed how to access a POP3 server with a Python script. While that protocol is useful for learning the basics of how email works, IMAP is the protocol most used today. This article covers this more complicated protocol.
Getting message information from the server is the most common task with IMAP, and it is also one of the more complicated ones. To get various parts of a message or information about a message, you use the “fetch” method. However, what makes this somewhat confusing is that the programmer must supply the actual IMAP protocol arguments to the “fetch” method, similar to “search.” That is, the programmer must create a string of arguments that get sent verbatim to the server as the actual arguments for that command.
This means that we are no longer working purely in Python: we are also dealing with the network protocol itself and the structure of the messages sent over the wire. Luckily, these arguments can usually be read as a relatively simple list of the objects or data items that the server should return. Some of these arguments take nested sub-arguments, and each of those is handled individually, which adds to the complexity.
The “fetch” command takes two arguments, one defining the group of messages to be retrieved and another describing what parts of the messages should be retrieved. The first argument should be a list of message sequence numbers, not UIDs. It is possible to use UIDs, but that requires another command, the “uid” command, which tells the server to process the command that follows using UIDs rather than message sequence numbers.
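As a rough sketch of the UID variant, imaplib exposes the “uid” command as a method that takes the wrapped command name followed by its arguments. The helper name fetch_by_uid is hypothetical, and the sketch assumes Python 3 and a server object that is already logged in with a mailbox selected.

```python
import imaplib


def fetch_by_uid(server, uid_set, items):
    """Run FETCH addressed by UID instead of by sequence number.

    `server` is assumed to be a connected imaplib.IMAP4 (or IMAP4_SSL)
    object with a mailbox already selected. This helper is illustrative
    and not part of imaplib.
    """
    # IMAP4.uid(command, *args) sends "UID <command> <args>" to the server.
    typ, data = server.uid('FETCH', uid_set, items)
    if typ != 'OK':
        raise imaplib.IMAP4.error('UID FETCH failed: %r' % (data,))
    return data
```

A call such as fetch_by_uid(server, '4827', '(FLAGS)') would then address the message whose UID is 4827, whatever its current sequence number happens to be.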
The first argument can be either an inclusive range, giving just the first and last sequence numbers separated by a “:”, or a comma-separated string listing each number. This means that you can take the result of a “search” query, replace the spaces in the data string with commas, and use it as an argument to “fetch.” While this may not be the most compact form, it is often much simpler than compressing the list down to ranges of sequence numbers.
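Both approaches can be sketched in a few lines. The function names below are hypothetical, and the code assumes Python 3, where “search” returns its data as a one-element list of bytes such as [b'1 2 5 6'].

```python
def to_message_set(search_data):
    """Turn the data returned by search() into a comma-separated set."""
    raw = search_data[0]
    if isinstance(raw, bytes):
        raw = raw.decode('ascii')
    # split() with no argument also copes with an empty result.
    return ','.join(raw.split())


def compress_message_set(numbers):
    """Collapse consecutive runs such as 2,3,4 into ranges such as 2:4."""
    nums = sorted({int(n) for n in numbers})
    parts = []
    start = prev = None
    for n in nums:
        if start is None:
            start = prev = n
        elif n == prev + 1:
            prev = n
        else:
            parts.append(str(start) if start == prev else '%d:%d' % (start, prev))
            start = prev = n
    if start is not None:
        parts.append(str(start) if start == prev else '%d:%d' % (start, prev))
    return ','.join(parts)
```

If the search matched nothing, both functions return an empty string, so it is worth checking for that before passing the result on to “fetch.”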
The second argument is a string that lists what information you want from the server. There are many different choices here, and the full listing is found in section 6.4.5 of RFC 2060 (since superseded by RFC 3501, which keeps the same section number). However, some important ones are:
RFC822 – Grabs entire message, useful for importing into the Python email parser.
UID – Gets the UID, which is the best way to identify messages across different connection contexts.
BODY[<part>] – This is the main workhorse. By replacing the “<part>” section, you can access the main text body of a message, attachments, and headers.
INTERNALDATE – Useful for getting the date the message was received.
FLAGS – This gives you the server’s own metadata about the message, like whether it’s been seen yet, or whether it is marked for deletion.
ENVELOPE – This is a quick way to get all of the default RFC822 fields for a particular message parsed into a list of strings of text.
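Two of these items map directly onto standard library helpers, sketched here on hand-written sample data rather than a live server response. The sample bytes are assumptions made for illustration, and the code assumes Python 3.

```python
import email
import imaplib

# Hand-written stand-in for the literal that an RFC822 fetch returns.
SAMPLE_RFC822 = (
    b'From: alice@example.com\r\n'
    b'Subject: Lunch\r\n'
    b'\r\n'
    b'Noon at the usual place?\r\n'
)

# The email package parses the raw bytes into a Message object.
message = email.message_from_bytes(SAMPLE_RFC822)
subject = message['Subject']
sender = message['From']
body = message.get_payload()

# Hand-written stand-in for the response line of a FLAGS fetch.
SAMPLE_FLAGS = b'6 (FLAGS (\\Seen \\Deleted))'

# imaplib.ParseFlags pulls the flag names out as a tuple of bytes.
flags = imaplib.ParseFlags(SAMPLE_FLAGS)
```

In a real script the two sample values would be replaced by the corresponding pieces of the data returned by “fetch.”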
Some examples are useful in these cases:
r, data = server.fetch('6', '(UID BODY[TEXT])')
r, data = server.fetch('6', '(UID ENVELOPE)')
r, data = server.fetch('2:5', '(BODY[HEADER.FIELDS (SUBJECT FROM)])')
These lines range from simple to fairly complex. The first line gets the UID and the body text of message number 6. The second line grabs the envelope and UID for message 6, and the third line gets the newline-delimited Subject and From header fields for messages 2 through 5. The returned data is structured differently depending upon what you requested; it is far easier to look through the returned information and learn it directly than to describe it in words. Suffice it to say that the returned information is a nested structure: a list whose items are either plain response strings or tuples pairing a response header with the literal data that follows it.
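A minimal sketch of walking that structure follows. The sample response is hand-written to resemble what imaplib returns under Python 3 for a header-fields fetch of two messages; the exact bytes a real server sends will differ, and the helper name literals is hypothetical.

```python
def literals(fetch_data):
    """Yield (response_header, payload) pairs from fetch() data."""
    for item in fetch_data:
        # Items that carry a literal arrive as 2-tuples; everything else
        # is a bare response string such as the closing b')'.
        if isinstance(item, tuple):
            yield item[0], item[1]


# Illustrative sample only, not captured from a real server.
SAMPLE_DATA = [
    (b'2 (BODY[HEADER.FIELDS (SUBJECT FROM)] {43}',
     b'Subject: Lunch\r\nFrom: alice@example.com\r\n\r\n'),
    b')',
    (b'3 (BODY[HEADER.FIELDS (SUBJECT FROM)] {45}',
     b'Subject: Re: Lunch\r\nFrom: bob@example.com\r\n\r\n'),
    b')',
]

for header, payload in literals(SAMPLE_DATA):
    # The sequence number is the first token of the response header.
    seq = header.split()[0].decode('ascii')
    print(seq, payload.decode('ascii').strip().splitlines())
```

Printing the raw data from a real server in the same way is a quick method for learning how a particular request comes back.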
Overall, the most powerful abilities of IMAP lie in the capability to create virtual file systems on the remote server and the ability to filter messages before actually downloading their data. The searching capability is priceless when writing an automated utility for email, allowing you to quickly filter down the data. IMAP is a complex but very useful protocol. Before working with the Python libraries, make sure you are well versed in the protocol definition itself, as this will smooth the road when learning how to interact with the server in Python.