By: zimbra
In this article you will learn how to import emails and folders from Outlook PST files into Zimbra using the command line. This article is aimed at system administrators who are interested in automating PST imports. Some knowledge of Bash and PowerShell scripting is recommended, as you may want to adapt the scripts used in this article to fit your automation needs.
This article provides a two-step approach to importing PST files into Zimbra. Step one is the conversion of the Outlook PST file into a Zimbra TGZ file.
This first step can be done on a Windows workstation with Zimbra Desktop installed or on a Linux machine with the readpst command installed.
It is recommended to use Windows/Zimbra Desktop for the first step, as this leverages a commercially supported PST parser which may provide better Outlook/PST compatibility.
Please note that the conversion script does not require Zimbra Desktop to be configured to an account or Zimbra server. It just needs to be installed on your Windows machine so that the script can use Zimbra Desktop components.
The second step is always executed on the Zimbra Mailbox server.
Prepare your Windows machine:
To perform the conversion from PST to TGZ open a PowerShell window and execute the script as follows:
.\pst-to-tgz.ps1 "C:\Users\user\Downloads\rogers b.pst"
Change the example path to the path to your .pst file. Once completed, the script will put a .tgz file next to the original .pst file. So in this example a file rogers b.pst.tgz will be created.
Please note that the logs and the EML export remain in C:\tmp\pst-to-tgz. Please remove these after manual verification.
Not recommended
Prepare your Linux machine:
To perform the conversion from PST to TGZ open a terminal and execute the script as follows:
./pst-to-tgz.sh /path/to/export.pst
Change the example path to the path of your .pst file. Once completed, the script will put a .tgz file next to the original .pst file. So in this example a file export.pst.tgz will be created.
Please note that the logs and the EML export remain in /tmp/pst-to-tgz. Please remove these after manual verification.
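For administrators automating this step, the following is a minimal sketch of converting every PST file in a directory, one at a time. The function name convert_all_psts and the CONVERTER variable are hypothetical names introduced for this example. The only behavior assumed from the conversion script is what is described above: it takes the PST path as its argument and writes a .tgz file next to it.

```shell
#!/usr/bin/env bash
# Sketch: convert every .pst file in a directory, one at a time.
# CONVERTER is a hypothetical variable; it defaults to the article's script.
CONVERTER="${CONVERTER:-./pst-to-tgz.sh}"

convert_all_psts() {
    local dir="$1" pst failed=0
    for pst in "$dir"/*.pst; do
        [ -e "$pst" ] || continue      # the glob matched no .pst files
        "$CONVERTER" "$pst"
        # The result is expected next to the source file, as <name>.pst.tgz.
        if [ -s "$pst.tgz" ]; then
            echo "OK      $pst.tgz"
        else
            echo "MISSING $pst.tgz"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling convert_all_psts /path/to/pst-directory prints one line per PST file and returns a non-zero status if any expected .tgz file is missing or empty.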
This step should be done on a Zimbra Mailbox server and all commands are run as the OS user zimbra.
Prepare your Zimbra machine:
To perform the import execute the script as follows:
./import-tgz.sh /path/to/import.tgz admin@example.com
The first argument is the location of the TGZ file and the second argument is the target mailbox.
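If several mailboxes need to be imported, the calls can be driven from a list. The sketch below assumes a hypothetical tab-separated file with one TGZ path and one target mailbox per line. The function name import_from_list and the IMPORTER variable are also introduced only for this example. Imports run one after another, which seems the safer choice given that the import script logs to a single fixed location.

```shell
#!/usr/bin/env bash
# Sketch: run the import script once per line of a tab-separated list.
# IMPORTER is a hypothetical variable; it defaults to the article's script.
IMPORTER="${IMPORTER:-./import-tgz.sh}"

import_from_list() {
    local list="$1" tgz mailbox
    # Each line: <path to .tgz><TAB><target mailbox>
    while IFS=$'\t' read -r tgz mailbox; do
        [ -n "$tgz" ] || continue                # skip blank lines
        case "$tgz" in \#*) continue ;; esac     # skip comment lines
        if [ ! -r "$tgz" ]; then
            echo "SKIP $tgz (not readable)"
            continue
        fi
        echo "IMPORT $tgz -> $mailbox"
        # </dev/null keeps the import from consuming the rest of the list.
        "$IMPORTER" "$tgz" "$mailbox" </dev/null
    done < "$list"
}
```

A list file could contain a line such as /path/to/import.tgz, a tab character, and admin@example.com.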
Notes:
Run the following commands in a separate terminal to see errors in real time:
tail -f /opt/zimbra/log/mailbox.log
tail -f /tmp/zmmailbox-screen/output.log
tail -f /opt/zimbra/log/mailbox.log | grep -i error | grep "ParsedMessage"
This script was tested with 2GB PST files with thousands of emails and folders.
Extensive logging in /tmp/zmmailbox-screen/output.log will help you track import failures. If one item fails to import, the script continues and imports the next item.
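Besides following the logs live, it can be useful to take a one-off count of parser errors before and after an import. The sketch below reuses the same grep filters as the tail command above. The function name count_parse_errors is hypothetical, and the count covers the whole log file, so unrelated or older entries are included.

```shell
#!/usr/bin/env bash
# Sketch: count log lines that mention both "error" and "ParsedMessage".
# Defaults to the mailbox log path used elsewhere in this article.
count_parse_errors() {
    local log="${1:-/opt/zimbra/log/mailbox.log}"
    # grep -c exits non-zero when the count is 0; "|| true" hides that status.
    grep -i error "$log" | grep -c "ParsedMessage" || true
}

# Example use around an import (run as the zimbra user):
#   before=$(count_parse_errors)
#   ./import-tgz.sh /path/to/import.tgz admin@example.com
#   after=$(count_parse_errors)
#   echo "new ParsedMessage errors: $((after - before))"
```

The difference is only meaningful if the log was not rotated and no other import ran in between.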
The scripts in step one convert a PST file into a folder structure equal to the original mailbox structure. Inside the folder structure each email is saved as an EML file. The entire structure is then compressed into a single large TGZ file.
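To verify a converted file before importing it, the archive can be listed without extracting it. The following sketch counts EML files per folder inside a TGZ file. The function name summarize_tgz is hypothetical, and the sketch assumes only the layout described above: folders containing files that end in .eml.

```shell
#!/usr/bin/env bash
# Sketch: print the number of .eml files per folder inside a .tgz archive.
summarize_tgz() {
    tar -tzf "$1" \
      | grep -i '\.eml$' \
      | sed -e 's|^[^/]*$|.|' -e 's|/[^/]*$||' \
      | sort | uniq -c
}
```

Each output line shows a count followed by a folder path, which can be compared against the folder sizes shown in Outlook.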
Zimbra insiders will be tempted to import this TGZ file directly into Zimbra, but this will often fail. In addition, since there is no Zimbra metadata in the TGZ file, the date for all emails will be shown as the import date in the email list view.
The second step, performed by the import-tgz.sh script, uses existing APIs but with a different approach. The script works as follows:
It uses a screen session with the zmmailbox command in interactive mode. Then for each TGZ file:
Finally:
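The exact steps performed by import-tgz.sh are not reproduced here. As a rough illustration only, the sketch below shows one way a per-folder import could be expressed as zmmailbox commands, using its createFolder and addMessage commands. It is a dry run: it only prints the commands for an already extracted folder tree and does not talk to Zimbra. The function name emit_zmmailbox_commands, the /import base folder, and the quoting of folder names are assumptions made for this example, not the article's actual script.

```shell
#!/usr/bin/env bash
# Sketch (dry run): print one createFolder command per directory and one
# addMessage command per directory that directly contains .eml files.
# $1 = extracted folder tree (no trailing slash), $2 = base folder in Zimbra.
emit_zmmailbox_commands() {
    local root="$1" base="$2" dir rel
    # Sorting lists parent directories before their children.
    find "$root" -mindepth 1 -type d | sort | while IFS= read -r dir; do
        rel="${dir#"$root"/}"
        printf 'createFolder "%s/%s"\n' "$base" "$rel"
        if find "$dir" -maxdepth 1 -type f -iname '*.eml' | grep -q .; then
            printf 'addMessage "%s/%s" "%s"\n' "$base" "$rel" "$dir"
        fi
    done
}
```

The printed lines can be reviewed before being tried against a test mailbox. Processing one folder at a time in this way keeps each failure tied to a specific folder, which matches the goal of logs that are easy to correlate.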
Benefits of this approach as compared to regular TGZ import
Why is this better?
The Zimbra REST API does allow importing a TGZ file with a complete mailbox, even if the TGZ is many GBs in size. However, the import will often fail to complete, resulting in an error similar to:
2024-03-27 16:47:22,384 WARN [qtp921760190-341:https://zimbra10.barrydegraaff.nl/home/admin@barrydegraaff.nl/?fmt=tgz&subfolder=import&timestamp=0&resolve=skip] [name=admin@barrydegraaff.nl;mid=2;oip=192.168.1.98;port=33166;] misc - ArchiveFormatter addError:Early EOF: path=zl_beck-s_000.pst/beck-s/Australia/9822.eml
com.zimbra.cs.service.formatter.FormatterServiceException: Early EOF
at com.zimbra.cs.service.formatter.FormatterServiceException.UNKNOWN_ERROR(FormatterServiceException.java:118) ~[zimbrastore.jar:10.0.7_GA_4598]
at com.zimbra.cs.service.formatter.ArchiveFormatter.addData(ArchiveFormatter.java:1742) ~[zimbrastore.jar:10.0.7_GA_4598]
at com.zimbra.cs.service.formatter.ArchiveFormatter.saveCallback(ArchiveFormatter.java:976) ~[zimbrastore.jar:10.0.7_GA_4598]
at com.zimbra.cs.service.formatter.Formatter.save(Formatter.java:162) ~[zimbrastore.jar:10.0.7_GA_4598]
This error does NOT indicate a broken TGZ or EML file. My suspicion is that Java has to do this import entirely in memory and that some memory has been purged by garbage collection before the import was completed. The problem with this error is that it is almost impossible to debug because it occurs randomly, which makes it very hard to recover from.
With the new approach the TAR formatter exception should be avoided. Should it still happen, the import will continue while providing meaningful logs that are easy to correlate to the TGZ/EML file, making it easier to debug.
Once the TAR formatter has done its work, Zimbra's EML parser needs to parse each EML file. Some EML items stored in a PST file are not actually emails. These can be recognized by missing headers that are required in email. Zimbra does not understand such items and cannot import them. These errors are not returned by zmmailbox and can only be seen via:
tail -f /opt/zimbra/log/mailbox.log | grep -i error | grep "ParsedMessage"
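Items of this kind can also be looked for before the import, in the extracted EML export. The sketch below lists EML files whose header block lacks a From or Date header. Treating those two headers as the required ones follows the email message format standard and is an assumption of this example; it is not necessarily the exact check Zimbra performs. The function names has_header and list_suspect_eml are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list .eml files that lack a From or Date header.

# $1 = file, $2 = header name. Only the header block is searched,
# that is, everything up to the first empty line (LF or CRLF endings).
has_header() {
    awk '/^\r?$/ { exit } { print }' "$1" | grep -qi "^$2:"
}

# $1 = directory containing the extracted EML export.
list_suspect_eml() {
    local dir="$1" f
    find "$dir" -type f -iname '*.eml' | sort | while IFS= read -r f; do
        has_header "$f" From && has_header "$f" Date || echo "$f"
    done
}
```

Running list_suspect_eml /tmp/pst-to-tgz on the Linux conversion machine prints the files worth checking by hand. A file on this list is only a candidate for failure, not a confirmed one.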
In addition, some very old attachment formats and encoding types are not supported by Zimbra, and these cannot be imported. These documents should be converted to an archival format (such as PDF) and stored as documents in Zimbra. Alternatively, if only a small percentage of your EML files fall into this category, it may be an option to store an archival copy of the PST after importing into Zimbra.
Latest Version | 0.0.1 |
Categories | Migration |
Compatibility | ZCS 10.0.x , ZCS 8.8.x , ZCS 9.0.x |
License | NA |
Created | on 5/25/24 |
Updated | on 5/25/24 |