Abstract: You discovered that something on your Exchange Server (2013/2016) keeps eating your disk space. During troubleshooting you found out that the mail.que file has grown to many GB.
By default, the transport queue database (mail.que) is located at C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue.
In this example the mail.que file is already 140 GB in size, which is very large and caused serious disk space problems. According to Microsoft, mail.que is an ESE database, so it is similar to the *.edb databases that hold the mailboxes. There is currently no built-in option to "shrink" mail.que to a smaller file; the only way to get a smaller mail.que is to rebuild it. But before we run through the steps to rebuild it, you should check some settings which might have caused the issue:
1.) Check the SafetyNetHoldTime value. This parameter specifies how long a copy of a successfully processed message is retained in Safety Net (as written here). Unacknowledged shadow copies of messages auto-expire from Safety Net after the sum of the SafetyNetHoldTime and MessageExpirationTimeout values.
Run the command below to check whether your SafetyNetHoldTime was changed or is still at 2 days:
Get-TransportConfig | select SafetyNetHoldTime
The default value here is "2.00:00:00" (2 days). If it was changed to a larger value, your mail.que file may grow unexpectedly. If you wish to reset it to the default value, run:
Set-TransportConfig -SafetyNetHoldTime 2.00:00:00
2.) Check the ShadowMessageAutoDiscardInterval. This parameter specifies how long a server retains discard events for shadow messages (as written here). A primary server queues discard events until queried by the shadow server. However, if the shadow server doesn't query the primary server for the duration specified in this parameter, the primary server deletes the queued discard events.
Run the command below to check whether your ShadowMessageAutoDiscardInterval was changed or is still at 2 days:
Get-TransportConfig | select ShadowMessageAutoDiscardInterval
The default value here is "2.00:00:00" (2 days). If you wish to reset that back to the default value run:
Set-TransportConfig -ShadowMessageAutoDiscardInterval 2.00:00:00
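The two checks above can be combined into one report. The following is a minimal sketch, not an Exchange feature: Compare-TransportRetention is a hypothetical helper name, and it assumes the values returned by Get-TransportConfig convert cleanly to a timespan via their string form.

```powershell
# Sketch: report whether the two retention settings are still at the 2-day default.
# Compare-TransportRetention is a hypothetical helper, not an Exchange cmdlet.
function Compare-TransportRetention {
    param(
        [Parameter(Mandatory)] $Config   # object returned by Get-TransportConfig
    )
    $default = [timespan]'2.00:00:00'
    foreach ($name in 'SafetyNetHoldTime', 'ShadowMessageAutoDiscardInterval') {
        # Cast via string so that live and deserialized values are handled alike.
        $current = [timespan][string]$Config.$name
        [pscustomobject]@{
            Setting   = $name
            Current   = $current
            IsDefault = ($current -eq $default)
        }
    }
}

# Usage in the Exchange Management Shell:
# Compare-TransportRetention -Config (Get-TransportConfig)
```

Any row with IsDefault set to False points to a setting that is worth reviewing before rebuilding the database.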
3.) Another setting mentioned here which you should check when you troubleshoot disk space issues is PipelineTracingEnabled. This parameter specifies whether pipeline tracing is enabled (as mentioned here). Pipeline tracing captures message snapshot files that record the changes made to the message by each transport agent configured in the transport service on the server. Pipeline tracing creates verbose log files that accumulate quickly.
You can check the status via:
Get-TransportServer -Identity exch01 | select PipelineTracingEnabled
If this is enabled you can disable it via:
Set-TransportServer -Identity exch01 -PipelineTracingEnabled $false
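If you have more than one server, the same check can be run against all of them at once. This is a small sketch under the assumption that the cmdlet is called without -Identity so that it returns every transport server; Select-PipelineTracingEnabled is a hypothetical filter name.

```powershell
# Sketch: keep only the servers on which pipeline tracing is still enabled.
# Select-PipelineTracingEnabled is a hypothetical helper, not an Exchange cmdlet.
filter Select-PipelineTracingEnabled {
    if ($_.PipelineTracingEnabled) {
        # PipelineTracingPath shows where the snapshot files are written.
        $_ | Select-Object Name, PipelineTracingEnabled, PipelineTracingPath
    }
}

# Usage in the Exchange Management Shell:
# Get-TransportServer | Select-PipelineTracingEnabled
```

The folder shown in PipelineTracingPath is the place to look for accumulated snapshot files.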
Once you have checked the settings above, which could cause this kind of issue, you can rebuild mail.que with the following steps (preferably during a time window with a low workload on the server):
1.) Run the following PowerShell command and make a note of the count of messages on the server:
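One way to get the message count for this step is sketched here, assuming the Get-Queue cmdlet is what is intended. Get-QueuedMessageTotal is a hypothetical helper that sums the MessageCount property of the queue objects passed to it.

```powershell
# Sketch: total number of messages across all queues on the server.
# Get-QueuedMessageTotal is a hypothetical helper, not an Exchange cmdlet.
function Get-QueuedMessageTotal {
    param([object[]] $Queue)   # queue objects, e.g. the output of Get-Queue

    if (-not $Queue) { return 0 }   # no queues passed in: nothing is pending
    $sum = ($Queue | Measure-Object -Property MessageCount -Sum).Sum
    [int]$sum
}

# Usage in the Exchange Management Shell:
# Get-Queue | Format-Table Identity, Status, MessageCount
# Get-QueuedMessageTotal -Queue (Get-Queue)
```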
2.) Now go to services.msc and pause (not stop!) the "Microsoft Exchange Transport" service, or use PowerShell:
Suspend-Service -DisplayName "Microsoft Exchange Transport"
3.) Run the same PowerShell command as in step 1 again.
4.) If the queues are empty (double-check this to avoid losing emails), stop the "Microsoft Exchange Transport" service:
Stop-Service -DisplayName "Microsoft Exchange Transport"
5.) Move all files inside the C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue folder into a backup folder (e.g. on another HD)
6.) Start the "Microsoft Exchange Transport" service
Start-Service -DisplayName "Microsoft Exchange Transport"
The Exchange server will now recreate the files that we moved out of the queue folder.
7.) Run the same PowerShell command as in step 1 again and make sure that the queues and mail flow are working as expected.
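The steps above can be sketched as a single function. This is only an illustration under several assumptions: Reset-TransportQueueDatabase and its parameters are hypothetical names, Get-Queue is assumed to be the queue check used in steps 1, 3 and 7, and the queue check runs only once, so you may have to call the function again after the queues have drained. MSExchangeTransport is the service name behind the "Microsoft Exchange Transport" display name. Test it in a lab before using it on a production server.

```powershell
# Sketch of the rebuild procedure; hypothetical helper, not an Exchange cmdlet.
function Reset-TransportQueueDatabase {
    param(
        [string] $QueueDir = 'C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue',
        [Parameter(Mandatory)] [string] $BackupDir   # e.g. a folder on another disk
    )
    $service = 'MSExchangeTransport'

    # Step 2: pause (not stop) the service so that no new messages are accepted.
    Suspend-Service -Name $service

    # Steps 3-4: refuse to continue while any queue still holds messages.
    $pending = (Get-Queue | Measure-Object -Property MessageCount -Sum).Sum
    if ($pending -gt 0) {
        Write-Warning "$pending message(s) still queued; the service stays paused."
        return
    }
    Stop-Service -Name $service

    # Step 5: move the old database and its log files into the backup folder.
    New-Item -ItemType Directory -Path $BackupDir -Force | Out-Null
    Move-Item -Path (Join-Path $QueueDir '*') -Destination $BackupDir

    # Steps 6-7: start the service, which creates fresh files, and show the queues.
    Start-Service -Name $service
    Get-Queue
}

# Usage (the backup path is an example):
# Reset-TransportQueueDatabase -BackupDir 'D:\QueueBackup'
```

The early return leaves the service paused on purpose, so that the queues can keep draining before the next attempt.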
The second way to deal with the growing disk usage is to change the location of the queue database, as explained here, and put it on a drive which has enough disk space.
As Exchange 2010 didn't have the Exchange 2013/2016 feature called Safety Net (more here), the default value of 2 days might be too high for your environment, so you might try a smaller value, for example 1 day. Safety Net is an improved version of the transport dumpster from Exchange 2007/2010. In Exchange 2010, the transport dumpster helped protect against data loss by maintaining a queue of successfully delivered messages that hadn't replicated to the passive mailbox database copies in the DAG. When a mailbox database or server failure required the promotion of an out-of-date copy of the mailbox database, the messages in the transport dumpster were automatically resubmitted to the new active copy of the mailbox database (as written here).
Set-TransportConfig -SafetyNetHoldTime 1.00:00:00
Currently you can change this only organization-wide; there is no per-server setting!
If you do that you might see the following warning:
WARNING: Setting 'SafetyNetHoldTime' to a value lower than 'ReplayLagTime' in a Mailbox database copy can cause irrecoverable data loss. Please ensure that the 'SafetyNetHoldTime' parameter is set to a value equal or greater than the 'ReplayLagTime' parameter, which is set using the Set-MailboxDatabaseCopy cmdlet.
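So keep in mind that the ReplayLagTime of a lagged database copy must not exceed SafetyNetHoldTime. You can adjust it via Set-MailboxDatabaseCopy, for example to 2 days, which matches the default SafetyNetHoldTime (with a 1-day SafetyNetHoldTime it would have to be 1 day or less):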
Set-MailboxDatabaseCopy -Identity DB2\MBX1 -ReplayLagTime 2.0:0:0
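The rule from the warning can be written down as a simple comparison. This is a minimal sketch; Test-SafetyNetCoversLag is a hypothetical helper, and the values are typed in by hand rather than read from the Exchange configuration.

```powershell
# Sketch: SafetyNetHoldTime must be equal to or greater than ReplayLagTime.
# Test-SafetyNetCoversLag is a hypothetical helper, not an Exchange cmdlet.
function Test-SafetyNetCoversLag {
    param(
        [Parameter(Mandatory)] [timespan] $SafetyNetHoldTime,
        [Parameter(Mandatory)] [timespan] $ReplayLagTime
    )
    $SafetyNetHoldTime -ge $ReplayLagTime
}

Test-SafetyNetCoversLag -SafetyNetHoldTime '1.00:00:00' -ReplayLagTime '2.0:0:0'   # False
Test-SafetyNetCoversLag -SafetyNetHoldTime '2.00:00:00' -ReplayLagTime '2.0:0:0'   # True
```

A result of False corresponds to the situation the warning describes: the lagged copy could need messages that Safety Net has already discarded.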