Temp and Mail.que growing very large

    Question

  • For some reason the mail.que file in my TransportRoles\data\Queue folder keeps growing and is currently over 14.5 GB, and the Temp folder has files dating back 7 months and has grown to about 15 GB. I didn't see any events in the event logs that would explain it. Any idea how to shrink these directories back down to normal? It looks like whatever should be pruning them is not working.


    Vincent Sprague

    Tuesday, January 07, 2014 6:35 PM

Answers

  • Here is the latest from Microsoft Support:

    Hello Will,

    I tried calling you on phone numbers (xxx) xxx-xxx, (xxx) xxx-xxx to discuss on this case and reached your voice message.

    Regarding the mail.que growing, as you mentioned it is normal in exchange 2013 server. You can have adequate disk space on the drive where the queue is located. http://blogs.technet.com/b/exchange/archive/2013/05/06/ask-the-perf-guy-sizing-exchange-2013-deployments.aspx

    Once the queue database grows to the point where transport is unable to process it, mail.que will be renamed and a new queue database will be created automatically. Then you can delete the old queue database manually to free up the disk space.

    You also wanted to transfer the case to development team, but we cannot involve them on this case since it is by-design behavior (http://support.microsoft.com/gp/proffaq/zh-tw). Please let me know your convenient time and phone number to discuss on this.

    Thanks and Regards,
    Dinesh


    Needless to say, I was furious with this response! Here is what I sent back to him:

    Dinesh,
    This is the first time any Microsoft engineer has said this behavior is by design! Not only that, this “by-design” behavior is flawed for several reasons:

    • There is no guidance on how much disk space will be needed or how to calculate the needed space.
    • In my case, it moves the mail.que but does not build a new one so mail flow is interrupted.
    • It does not move the old mail.que log files, so even if a new mail.que is built, the Transport service cannot start because the mail.que database and the log files are out of sync and the log files must be removed!


    Now, the link to the TechNet blog article you sent discusses sizing for the transaction logs, but does not mention sizing for the mail.que database or how it will wildly grow out of control! So how can any Exchange Administrator properly size their installation?

    Now, I have been supporting Exchange since 5.5 and in environments much larger than this (20,000+ mailboxes across multiple countries and continents) and you cannot tell me that this behavior is by design. I have worked too long with Microsoft and Microsoft support to agree with your stance that this behavior is by design! Additionally, if this behavior were “by design”, don’t you think that in all the TechNet support forum posts some Microsoft MVP or support engineer would have mentioned that?

    Dinesh, I have copied my account rep (XXXXXXX) on this e-mail and I will be posting the response you sent me below to the Technet forum posts as well. Please make sure that you understand my issue and the behavior we are experiencing and then please involve a more senior engineer OR transfer it to the development team as I requested!


    I have already spoken to my account rep and he is escalating this!

    By design my butt!!!


    Will Smothers


    Wednesday, February 12, 2014 7:46 PM

All replies

  • Hi Baron,

    There could be any number of reasons for mail.que growing.

    1) Third-party antivirus/antispam software acting on the queue database and causing it to grow. You can disable the third-party transport agents to find out whether they cause the issue: run Get-TransportAgent to see which third-party agents are installed, then run Disable-TransportAgent -Identity "agent name" to disable each one.

    2) Monitor your queue for bursts of spam. The queue database may be growing because unwanted spam is flooding into the queue.

    3) Check whether any large attachments are stuck in an Outbox, which can cause the transport and mailbox databases to grow.

    If you want to shrink these directories back to normal, follow the steps below:

    1) Stop the Microsoft Exchange Transport service.

    2) Locate the Queue folder, which is in the following path:

    Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue (V15 is the Exchange 2013 folder; Exchange 2010 uses V14)

    3) Rename it to Queue.old and then start the Microsoft Exchange Transport service. This will create a new mail.que of normal size. Keep monitoring the queue database for 2 days to check whether it grows again.
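
    The rename in steps 2 and 3 can be sketched in Python. This is only an illustration: rotate_queue_folder is a hypothetical helper, not an Exchange tool. It assumes the Transport service has already been stopped (step 1), and it appends a timestamp to the ".old" name so that repeated runs do not collide, which is a choice made here rather than anything Exchange requires.

```python
import time
from pathlib import Path


def rotate_queue_folder(queue_dir, suffix=None):
    """Rename queue_dir to '<name>.old-<suffix>' and return the new path.

    Hypothetical helper: it only renames the folder. The Transport
    service must be stopped before calling it and started afterwards.
    """
    queue_dir = Path(queue_dir)
    if not queue_dir.is_dir():
        raise FileNotFoundError(f"queue folder not found: {queue_dir}")
    if suffix is None:
        # Default suffix is the current local time, e.g. 20140108030100.
        suffix = time.strftime("%Y%m%d%H%M%S")
    target = queue_dir.with_name(f"{queue_dir.name}.old-{suffix}")
    if target.exists():
        raise FileExistsError(f"refusing to overwrite: {target}")
    queue_dir.rename(target)
    return target
```

    Once mail flow has been confirmed with the new database, the renamed folder can be deleted by hand to reclaim the disk space.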

    Please mark as helpful if you find my contribution useful or as an answer if it does answer your question.
    That will encourage me - and others - to take time out to help you.
    Regards,
    Sathish

    Wednesday, January 08, 2014 3:01 AM
  • Hi,

    Let’s begin with the following thread:
    http://social.technet.microsoft.com/Forums/exchange/en-US/c14861ab-cc0f-4c01-ad50-5427ae3ced23/exchange-2013-transportrolesdatatemp-filling-up-disk

    Additionally, mail.que is an ESE database with circular logging, hence it won't shrink. In Exchange 2013, Safety Net also uses the mail.que database: http://msexchangeguru.com/2013/04/15/safetynet/
    However, we can check whether Pipeline Tracing is enabled, since it can make the database grow rapidly:
    http://msexchangeguru.com/2011/06/02/mail-que/
    Note: Microsoft is providing this information as a convenience to you. The sites are not controlled by Microsoft. Microsoft cannot make any representations regarding the quality, safety, or suitability of any software or information found there. Please make sure that you completely understand the risk before retrieving any suggestions from the above link.

    Thanks,

     


    Angela Shi
    TechNet Community Support

    Thursday, January 09, 2014 9:46 AM
  • I created the new mail.que database and so far it's staying fairly small. I'm still keeping an eye on it. I deleted files out of the Temp directory to save space, but I'm still concerned that there is an issue, so I'm watching that as well. I had the A/V admin put in an exception for the entire TransportRoles directory, so I'm hoping that helps with the Temp directory and mail.que going forward. I will update this thread again in a day with status.

    Vincent Sprague

    Thursday, January 09, 2014 1:53 PM
  • Thanks for the update, Vincent. Hopefully your mail.que will not keep growing from now on.

    Monday, January 13, 2014 4:16 AM
  • OK, I've given it a few days and checked on it today: the mail.que has grown to over 9 GB in just a few days. I also deleted several months' worth of temp files from the TransportRoles\data\Temp directory, but several temp files are still being created every day. There were 40 temp files yesterday alone.
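
    A periodic cleanup of old temp files like the one described here could be sketched as follows. Both function names are hypothetical, the age threshold is an arbitrary parameter, and the sketch assumes that files older than the threshold are no longer in use by the Transport service, which should be verified before deleting anything. It defaults to a dry run that only lists candidates.

```python
import time
from pathlib import Path


def find_stale_files(temp_dir, max_age_days, now=None):
    """Return files directly under temp_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for entry in Path(temp_dir).iterdir():
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            stale.append(entry)
    return sorted(stale)


def remove_stale_files(temp_dir, max_age_days, dry_run=True, now=None):
    """List stale files and, unless dry_run is True, delete them."""
    stale = find_stale_files(temp_dir, max_age_days, now=now)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

    Running it with dry_run=True first shows what would be removed, for example remove_stale_files(r"D:\TransportTemp", 30), where the path is a placeholder.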

    Vincent Sprague

    Tuesday, January 14, 2014 7:15 PM
  • I'm more inclined to think this is a bug in 2013 (the temp directory growing in size, that is).

    I never had this problem in 2007/2010 but see it in 2013 with both CU2 and CU3. I have simply been cleaning it out every few weeks. If you are running out of space, consider moving the temp directory to another local drive with free space: stop the Transport service, then edit the EdgeTransport.exe.config file under \Program Files\Microsoft\Exchange Server\V15\Bin

    and change the value of <add key="TemporaryStoragePath" value = "%ExchangeInstallDir%TransportRoles\data\Temp" /> to another directory.

    Be sure to make a backup of that file before making any changes if you go that route.
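
    For anyone who prefers to script that edit, here is a rough Python sketch. The function name is hypothetical, and it assumes the key sits in an <appSettings> element directly under the root of the config file, as in a typical .NET application config. It writes a .bak copy before changing anything. Comments inside the file are kept, but exact whitespace and layout may change when the file is rewritten, so compare against the backup afterwards.

```python
import shutil
import xml.etree.ElementTree as ET


def set_temporary_storage_path(config_path, new_value):
    """Back up the config file, then set the TemporaryStoragePath key.

    Returns the previous value, or None if the key was not found
    (in which case the file is left unchanged).
    """
    config_path = str(config_path)
    shutil.copy2(config_path, config_path + ".bak")

    # Keep XML comments inside the document when parsing (Python 3.8+).
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    tree = ET.parse(config_path, parser=parser)

    app_settings = tree.getroot().find("appSettings")
    if app_settings is None:
        raise ValueError("no <appSettings> section found")

    for node in app_settings.findall("add"):
        if node.get("key") == "TemporaryStoragePath":
            previous = node.get("value")
            node.set("value", new_value)
            tree.write(config_path, encoding="utf-8", xml_declaration=True)
            return previous
    return None
```

    The Transport service should be stopped before the edit and started again afterwards so that it picks up the new location.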


    Please Note: My posts are provided “AS IS” without warranty of any kind, either expressed or implied.

    Tuesday, January 14, 2014 7:29 PM
    Moderator
  • It is a CU2 environment, so that's possible. Rather annoying; I would have thought Microsoft could have resolved this by now. My other 2013 servers are still on CU1. I've been waiting for CU4/SP1 before upgrading any of the CU1 servers.

    Vincent Sprague

    Tuesday, January 14, 2014 8:01 PM
  • Has anyone opened a case with Microsoft on this? I have been running Exchange since 5.5 and I have never seen an issue like this. We moved to Exchange 2013, and in the three months it has been in production it has crashed from this bug five times!!!

    Microsoft has got to fix this issue!


    Will Smothers

    Tuesday, January 21, 2014 8:22 PM
  • Just so everyone knows, I have opened a case with Microsoft on this.

    Will Smothers

    Friday, January 24, 2014 5:23 PM
  • Excellent, please let us know what you find out.

    Vincent Sprague

    Friday, January 24, 2014 9:17 PM
  • As some additional information, it happened again this morning to me. What I found is that the mail.que file had grown to 38,805,568 KB in size. At that point, Exchange did something and moved the mail.que database to a folder named "Messaging.old-20140129135432" which causes the transport service to crash. I have pulled the "corrupted" mail.que database off the server and I am going to upload it to Microsoft for their review.

    I will let everyone know as I get more information.


    Will Smothers

    Wednesday, January 29, 2014 2:45 PM
  • I heard back from Microsoft today and this was their response:

    From your description, I know you have copied the 38GB mail queue to another server to do a database cleanup. But it seems the database is in a dirty shutdown state with one log missing. As the log is already missing, if there’s no backup for the log, we can only use eseutil /p to bring the database to a clean shutdown state. But this command will cause some data loss. See more details in the following link: http://technet.microsoft.com/en-us/library/aa997215(v=exchg.65).aspx.

    On the other hand, I know that after rebuilding the mail.que file, the file is still growing fast. Here, I would like to say that this is normal behavior in Exchange 2013. As far as the mail.que file is concerned, we cannot compare Exchange 2013 with Exchange 2007 or 2010. Exchange 2010 and 2007 do not have the feature called Safety Net. Safety Net is the feature responsible for the large size of mail.que in Exchange 2013.

    • In Exchange 2013, transport high availability is more than just a best effort for message redundancy. Exchange 2013 attempts to guarantee message redundancy. Because of this, you can't specify a maximum size limit for Safety Net. You can only specify how long Safety Net stores messages before they're automatically deleted.
    • SafetyNetHoldTime controls how long successfully processed primary messages are stored in Primary Safety Net and acknowledged shadow messages are stored in Shadow Safety Net. Unacknowledged shadow messages eventually expire from Shadow Safety Net.
    • So we also need to consider that any change in the number of messages or the amount of mail flow on a given day will affect the size of mail.que for an extended period, compared with Exchange 2007 or Exchange 2010.

    See more details in the following link: http://blogs.technet.com/b/exchange/archive/2013/05/06/ask-the-perf-guy-sizing-exchange-2013-deployments.aspx

    Given the situation, we can set SafetyNetHoldTime to a small value such as 5 minutes to see whether mail.que is still large. See more details in the following link: http://technet.microsoft.com/en-us/library/jj657495.aspx

    In addition, I see that you want to change the retry value. In general, we can change it via QueueGlitchRetryCount and QueueGlitchRetryInterval. See more details in the section “Configure the transient failure retry attempts, the transient failure retry interval, and the outbound connection failure retry interval” in the following link: http://technet.microsoft.com/en-us/library/aa998043(v=exchg.150).aspx.

    In addition, I have created a workspace for you, you can upload the information to this workspace: [Workspace information:]

    1. You will receive another Email named from, please check your Inbox and Junk Emails.
    2. Please download the Microsoft Data Transfer and Management tool (DTM.exe) and install it. You can then log on to the DTM tool with the temporary logon information mentioned in the email above.
    3. You will be asked to change the temporary password when you first log on. If you forget the password you have set, please use the following link to reset it. https://filetransfer.support.microsoft.com/EFTClient/Account/LostPassword.htm
    4. Once you have logged on to the DTM tool, please upload the related data to us. Please see the information above, and if anything is unclear, feel free to contact us.

    Now, I checked my SafetyNetHoldTime and it was set to 2 days (the default) as is my ShadowMessageAutoDiscardInterval. That totals processed mail being held for up to four days. I do not think this explains why the mail.que database continually grows up to a certain point and then crashes.

    I have another theory on why this is happening: I do not think that the Jet engine powering mail.que is recovering its whitespace as it should. I have tried to test this theory by running eseutil /p against my 38GB "corrupt" mail.que database to bring it back to a clean state so that I can defrag it, but the /p has been unsuccessful to this point.

    I am uploading the entire 38GB "corrupt" mail.que and its log files to Microsoft for analysis. Hopefully, they will have a better solution than for me to turn down my SafetyNetHoldTime!!!


    Will Smothers

    Thursday, January 30, 2014 4:03 PM
  • Any word back from Microsoft on this Will?  We are still running Exchange 2007 but ran across this issue this week.  After spending an entire day on the phone with Microsoft support, they recommended migrating all users to a new Information Store database insisting that corruption in the current database is what's causing our mail.que to grow at such a rapid pace.
    Friday, February 07, 2014 7:58 AM
  • Will

    I can see it being by design to a certain extent, but it should be calculable.

    In theory, any sent message goes through this queue, whether or not it makes it outside or results in an NDR, so if the default value is 2 days for SafetyNetHoldTime and 2 days for the NDR, then it stands to reason that the maximum a message could be in that queue is 4 days. Thus, if you have enough disk space for the mail.que file to grow to 4 days' worth of sent mail, you should be good to go.
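
    As a back-of-the-envelope version of that reasoning, the sketch below multiplies daily volume by average message size and by the number of days a message might be retained. The function name, the overhead factor, and the sample figures are all made up for illustration. This is not Microsoft's sizing formula, and it ignores database whitespace and transaction logs.

```python
def estimate_queue_size_gb(messages_per_day, avg_message_kb,
                           hold_days=2, expiration_days=2,
                           overhead_factor=1.0):
    """Rough upper bound on mail.que size in GB (1 GB = 1024 * 1024 KB).

    hold_days and expiration_days mirror the 2 + 2 day defaults
    discussed in the thread; overhead_factor is a hypothetical fudge
    factor for database overhead.
    """
    retention_days = hold_days + expiration_days
    total_kb = messages_per_day * avg_message_kb * retention_days * overhead_factor
    return total_kb / (1024 * 1024)


# Hypothetical load: 100,000 messages per day averaging 75 KB each.
print(round(estimate_queue_size_gb(100_000, 75), 2))  # 28.61
```

    If the real file keeps growing well past an estimate like this, that would support the suspicion that space is not being reused.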

    Now I'm in the same boat as you: mine seems exponential in growth, not just 4 days' worth. If it were truly acting like an EDB and recycling its available space, then past the 4-day mark it shouldn't really grow, or at least not by much.

    I put a file size check on that file so I can graph the rate of increase and see whether the growth is steady or more exponential.
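
    A file size check of this kind might look like the following Python sketch. The function names and the 25% tolerance are arbitrary choices for illustration. Comparing the first and last growth rates is only a crude way to tell steady growth from accelerating growth, and a proper graph over many samples would tell more.

```python
import os
import time


def sample_size(path):
    """Return (time in days since the epoch, file size in bytes) for path."""
    return (time.time() / 86400, os.path.getsize(path))


def growth_rates(samples):
    """Given time-sorted (day, size) samples, return bytes-per-day for each interval."""
    rates = []
    for (t0, s0), (t1, s1) in zip(samples, samples[1:]):
        rates.append((s1 - s0) / (t1 - t0))
    return rates


def classify_growth(samples, tolerance=0.25):
    """Compare the first and last interval rates to label the trend."""
    rates = growth_rates(samples)
    if len(rates) < 2:
        return "insufficient data"
    first, last = rates[0], rates[-1]
    if last > first + abs(first) * tolerance:
        return "accelerating"
    if last < first - abs(first) * tolerance:
        return "slowing"
    return "steady"
```

    Collecting one sample_size reading per day for the mail.que file and passing the list to classify_growth gives a first impression of the trend.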

    Tuesday, February 18, 2014 10:15 PM
  • I have also seen this issue on two environments now. Did you get any answers?

     
    Friday, April 04, 2014 1:29 PM
  • This is a recurring issue for me, as well. The mail.que file is currently 37GB and is filling the system partition repeatedly, even after adding 10GB to that drive twice, followed by adding another 20GB. This file acts like a tumor, continuing to grow without bounds. I fail to understand how this is "behaving as designed."

    Every ESE database I've managed fails to reclaim the whitespace efficiently without an offline defrag or moving all the mailboxes to a new database and deleting the original, bloated one. I do not understand why Microsoft thought this was a good idea.

    Has anyone learned whether this is resolved in CU4/SP1? I have downloaded it but not yet installed it. CU5 should be out before long.

    Wednesday, May 21, 2014 5:27 PM
  • Cumulative Update 5 for Exchange Server 2013

    http://support.microsoft.com/kb/2936880/


    This update resolves the issues that are described in the following Microsoft Knowledge Base (KB) articles:
    • 2963590 Message routing latency if IPv6 is enabled in Exchange Server 2013
    • 2963566 Outlook Web App accessibility improvement for UI appearance in Exchange Server 2013
    • 2962439 You cannot sync contacts or tasks in Microsoft CRM client for Outlook in an Exchange Server 2013 environment
    • 2962435 CRM synchronization fails if the time zone name of a meeting is not set in an Exchange Server 2013 environment
    • 2962434 Slow performance in Outlook Web App when Lync is integrated with Exchange Server 2013
    • 2958430 "Some or all Identity references could not be translated" error when you manage DAG in Exchange Server 2013 SP1 in a disjoint namespace domain
    • 2957592 IME is disabled in Outlook Web App when you press Tab to move the focus in an email message in Exchange Server 2013
    • 2942609 Exchange ActiveSync proxy does not work from Exchange Server 2013 to Exchange Server 2007
    • 2941221 EWS integration for Lync works incorrectly in an Exchange Server 2013 and 2007 coexistence environment
    • 2926742 Plain-text message body is cleared when writing in Outlook Web App by using Internet Explorer 8 in Exchange Server 2013
    • 2926308 Sender's email address is broken after importing a PST file into an Exchange Server 2013 mailbox
    • 2925559 Users always get the FBA page when they access OWA or ECP in Exchange Server 2013
    • 2924519 "SyncHealth\Hub" folder is created unexpectedly after installing Cumulative Update 2 for Exchange Server 2013
    • 2916113 Cannot open .tif files from email messages by using Windows-based applications in an Exchange Server 2013 environment
    • 2592398 Email messages in the Sent Items folder have the same PR_INTERNET_MESSAGE_ID property in an Exchange Server 2010 environment
    • 2938292 Default apps in Outlook Web App do not work if Exchange is installed in Window Server 2012 R2

    Has anyone installed CU5 and can report whether the problem is still present?
    Thanks,

    Riccardo

    • Edited by ricger Wednesday, June 04, 2014 9:47 AM
    Wednesday, June 04, 2014 9:45 AM
  • Yes, this occurs with Exchange 2013 CU5.  My client's server just filled up the default location drive and stopped receiving SMTP mail.  I have temporarily moved the location of the transport database using these instructions: 

    http://technet.microsoft.com/en-us/library/bb125177(v=exchg.150).aspx

    After things were running again, I double-checked the Exchange 2013 Storage Calculator spreadsheet for the transport database size requirements.  Sure enough, I found it in a closed part of the Role Requirements sheet.  Take a look at cell B185.  I assume this is the size required to support the transport database.  Any thoughts?

    -Chris

    Wednesday, June 11, 2014 7:24 PM
  • The reason is that in Exchange 2013, Microsoft replaced the transport dumpster with Safety Net.
    Shadow redundancy keeps a redundant copy of the message while the message is in transit. Safety Net keeps a redundant copy of a message after the message is successfully processed. In other words, Safety Net begins where shadow redundancy ends.
    The primary Safety Net exists on the Mailbox server that held the primary message before the message was successfully processed by the Transport service.
    Message resubmissions from Safety Net are initiated by the Active Manager component of the Microsoft Exchange Replication service that manages DAGs and mailbox database copies. No manual actions are required to resubmit messages from Safety Net.
    Safety Net is a queue that's associated with the Transport service on a Mailbox server. This queue stores copies of messages that were successfully processed by the server. Because Safety Net keeps these copies in mail.que (2 days' worth by default), expect the mail.que database to be larger than in previous versions of Exchange.
    The mail.que database file uses the Extensible Storage Engine (ESE), the same database technology that is used by the mailbox databases themselves.
    Similarities between Safety Net and Transport Dumpster
    • Just like the transport dumpster, Safety Net is a queue that is associated with the Transport service on a Mailbox server
    • It stores copies of messages already processed by the server
    • The duration for which the messages remain in the queue can be specified, as in the dumpster. The default is 2 days
    Why Safety Net is better than Transport Dumpster
    • Unlike the transport dumpster, Safety Net applies not just to DAGs but also to Public Folders and other mailboxes that are not part of a DAG
    • Due to the redundant nature of Safety Net, it is never a single point of failure. Because there is both a Primary Safety Net and a Shadow Safety Net, if the Primary Safety Net is unavailable for more than 12 hours, resubmit requests become shadow resubmit requests and messages are re-delivered from the Shadow Safety Net, ensuring message delivery even if one of the Safety Nets fails
    • Another advantage of Safety Net is that it does not limit message storage by size, only by duration. For example, if you set 12 days as the duration limit, the messages will be deleted only after 12 days in the queue
    • Safety Net does not require manual resubmission of messages. Message resubmission is initiated by the Active Manager component of the Microsoft Exchange Replication service
    • Another improvement of Exchange 2013 Safety Net over Transport Dumpster is redundancy. Safety Net itself is now redundant and is no longer a single point of failure. This introduces the concept of the Primary Safety Net and the Shadow Safety Net. If the Primary Safety Net is unavailable for more than 12 hours, resubmit requests become shadow resubmit requests, and messages are re-delivered from the Shadow Safety Net.
    Thursday, June 19, 2014 10:00 AM

  • Hi

    "For the by design behavior, you can refer to the following workarounds:

    1. Move the queue database to another drive.
    2. Reduce the SafetyNetHoldTime and MessageExpirationTimeout values to one day
    (default values are 2 days). For this we can run the commands below:
    Set-TransportConfig -SafetyNetHoldTime 1.00:00:00
    Get-TransportService | Set-TransportService -MessageExpirationTimeout 1.00:00:00"
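
    The 1.00:00:00 values in those commands appear to follow the .NET TimeSpan layout d.hh:mm:ss (days, hours, minutes, seconds). As a small aid for working out other values, here is a hypothetical Python helper that formats a duration in that layout. It is only a formatting sketch and does not talk to Exchange.

```python
from datetime import timedelta


def to_timespan(delta):
    """Format a non-negative timedelta as d.hh:mm:ss (whole seconds only)."""
    total = int(delta.total_seconds())
    if total < 0:
        raise ValueError("negative durations are not supported")
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}.{hours:02d}:{minutes:02d}:{seconds:02d}"


print(to_timespan(timedelta(days=1)))     # 1.00:00:00
print(to_timespan(timedelta(minutes=5)))  # 0.00:05:00
```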

    But on an Edge 2013 CU5 server with EdgeSync enabled, SafetyNetHoldTime is 7
    (seven) days by default and is not configurable. SafetyNetHoldTime=2 days is the default
    only for INTERNAL transport servers! How do I configure SafetyNetHoldTime on
    an Edge server with EdgeSync, or is SafetyNetHoldTime=7 also "by design"?

    Thanks

    Saturday, August 02, 2014 3:04 PM
  • Hello,

    We had exactly the same problem and found a solution in our case.

    We added our internal domains to "BypassedSenderDomains" in ContentFilterConfig.

    Since then, the mail queue has stayed at the same size.

    Hope it helps.


    Mac

    Thursday, September 11, 2014 12:43 PM
  • Hi MacKrK

    Please let me know whether adding internal domains to "BypassedSenderDomains" in ContentFilterConfig has to be done on the Edge or the MBX servers.

    And did you find a root cause for this behavior?

    Best Regards,

    K2

    • Edited by K 2 Sunday, January 18, 2015 12:42 PM
    Sunday, January 18, 2015 12:41 PM
  • Still an issue to this day. My DB grows for no reason.

    David Jenkins

    Wednesday, March 22, 2017 3:12 PM
  • Is this by DESIGN?
    Friday, August 04, 2017 2:29 AM