Exchange 2010 Message compression and de-duplication


  • I recently migrated my company's e-mail from a hosted Exchange 2003 environment to an in-house Exchange 2010 server. In my experience, when you export to PST you lose the de-duplication that Exchange is capable of, so individual mailbox sizes and the overall store size can increase. This is the first time I've done a 2003 > 2010 migration by exporting mail from the old server and importing it into the new one while paying close attention to the mailbox sizes. What I've noticed this time is that the sizes vary. One mailbox that I exported and imported myself, and verified to have no sync conflicts or errors, was 605MB on the 2003 server; the exported PST was 1.26GB, and the imported size is 541.7MB. The item count is higher on the new mailbox, indicating a full import plus mail received since the cutover.

    Another user's mailbox was 844MB on the 2003 server. The imported item count is higher in the new mailbox, but the size of the new mailbox is only 374MB.

    Edit: After reviewing the list and comparing mailboxes, it seems virtually every mailbox is smaller on the 2010 server than it was on the 2003 server. Based on item count, some of the imports don't look complete, but that's a different matter.

    Is this normal behavior after importing mail that was exported to PST? Is Exchange 2010 capable of recognizing identical attachments in imported mail the way it is for messages sent to users?

    Is there any way for me to verify on the server side that a user's cached copy is in sync with the server?
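
To put the figures quoted above in perspective, here is a minimal Python sketch of the arithmetic. The helper name `percent_change` and the mailbox labels are made up for illustration, and 1.26GB is assumed to mean 1.26 × 1024 MB.

```python
def percent_change(old_mb, new_mb):
    """Return the relative size change from old_mb to new_mb, in percent."""
    return (new_mb - old_mb) / old_mb * 100.0

# (size on the 2003 server, size after import into 2010), both in MB,
# taken from the figures quoted in the question.
mailboxes = {
    "first mailbox": (605.0, 541.7),
    "second mailbox": (844.0, 374.0),
}

changes = {
    name: round(percent_change(old, new), 1)
    for name, (old, new) in mailboxes.items()
}

# Assumption: 1.26GB is read as 1.26 * 1024 MB.
pst_growth = round(percent_change(605.0, 1.26 * 1024), 1)

for name, change in changes.items():
    print(f"{name}: {change:+.1f}% after import")
print(f"first mailbox PST vs. 2003 size: {pst_growth:+.1f}%")
```

Under those assumptions, the first mailbox shrank by roughly a tenth and the second by more than half, while the intermediate PST was more than double the original mailbox size.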



    Monday, June 13, 2011 6:19 PM


  • Hi Dave,

    Everything looks OK from what you've written.

    An important point is that there is no deduplication in Exchange 2010. In Exchange 2007 and previous versions, Single Instance Storage (SIS) worked at a per-database level. However, people still had many databases, which reduced the amount of dedupe possible. Plus, you would always plan for no dedupe anyway.

    BTW, SIS was dropped from Exchange 2010 for a few reasons, mainly because dropping it helps enable the use of big, slower disks, like 2TB MDL SAS/SATA. By ditching SIS and using compression instead, Exchange can make better use of sequential DB writes to disk. So it's a win because you can use very large disks.

    There isn't a server-side method I am aware of to compare the Outlook OST to the mailbox.
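
The trade-off between single instancing and per-copy compression described above can be illustrated with a toy Python model. This is only a sketch: it does not reflect how the Exchange store is actually implemented, `zlib` stands in for whatever compression Exchange uses, and all function names, mailbox names, and the sample message are hypothetical.

```python
import hashlib
import zlib

def single_instance_size(deliveries):
    """Bytes stored when identical payloads in one database are kept once."""
    unique = {}
    for _mailbox, payload in deliveries:
        unique[hashlib.sha256(payload).hexdigest()] = len(payload)
    return sum(unique.values())

def per_copy_compressed_size(deliveries):
    """Bytes stored when each mailbox keeps its own, individually compressed copy."""
    return sum(len(zlib.compress(payload)) for _mailbox, payload in deliveries)

def flat_copy_size(deliveries):
    """Bytes stored when each mailbox keeps a full, uncompressed copy."""
    return sum(len(payload) for _mailbox, payload in deliveries)

# Hypothetical HTML message delivered to three mailboxes in the same database.
body = b"<html><body>" + b"<p>Quarterly report attached.</p>" * 200 + b"</body></html>"
deliveries = [("alice", body), ("bob", body), ("carol", body)]

print("single instance:      ", single_instance_size(deliveries))
print("per-copy, compressed: ", per_copy_compressed_size(deliveries))
print("per-copy, uncompressed:", flat_copy_size(deliveries))
```

In this toy model, the uncompressed per-mailbox copies are the largest (loosely analogous to exporting each mailbox on its own), while compressing each copy can recover much of the space without sharing data between mailboxes. How much real mail compresses depends heavily on its content.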


    Steve Goodman

    Monday, June 13, 2011 8:22 PM