Recently, authors were reporting that no more Analytics data was being displayed since a certain date.
When analyzing the logfiles of the processing/reporting server, there were many of these entries:
Exception: Sitecore.Xdb.Processing.Queue.ProcessingPoolException Message: Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
The reason for this was that the Processing.Pools.InteractionLiveProcessingPool table had ~120’000 entries in it and was constantly growing. The attempts count on the top records was very high:
The processing server was running into a timeout when accessing this large amount of entries. Copying the entries into a temporary table and remove all entries from the InteractionLiveProcessingPool table helped fix this temporarily.
INSERT INTO tmp_ InteractionLiveProcessingPool SELECT * from InteractionLiveProcessingPool
Processing immediately started working again as new interactions were processed and processed entries were removed from the table every few seconds.
I then tried to copy blocks of the tmp_ InteractionLiveProcessingPool table back:
INSERT INTO [xdb_processing_pools].[InteractionLiveProcessingPool] SELECT TOP 1000 * FROM dbo.tmp_InteractionLiveProcessingPool ORDER BY Created DESC</code> SELECT count(*) FROM [xdb_processing_pools].[InteractionLiveProcessingPool] DELETE FROM dbo.tmp_InteractionLiveProcessingPool WHERE [tmp_InteractionLiveProcessingPool].InteractionId IN (SELECT TOP 1000 InteractionId FROM dbo.tmp_InteractionLiveProcessingPool ORDER BY Created DESC)
But processing was very slow. It could hardly keep up with the live interactions that were being generated from site traffic.
To optimize this, I had to increase processing power by
- Increasing the batch size
- Increasing aggregation worker threads
<aggregation role:require="Standalone or Processing"> <aggregator> <!-- Increase processing batch size from 64 to 256 --> <param type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" desc="maximumBatchSize" factoryMethod="ToShort" arg0="256"/> </aggregator> <module type="Sitecore.Analytics.Aggregation.AggregationModule" singleInstance="true"> <BackgroundServices hint="list:Add"> <aggregator type="Sitecore.Analytics.Core.AsyncBackgroundService"> <!-- Increase aggregation worker threads to 4 --> <param desc="maxAgents" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToInt" arg0="4" /> </aggregator> </BackgroundServices> </module> </aggregation>
Watch your Processing Server performance after these changes!
If you’d have the same issue during a Reporting DB rebuild (InteractionHistoryProcessingPool) table, you would need to tweak the historyWorker background service instead. For details, see https://doc.sitecore.com/developers/91/sitecore-experience-platform/en/configure-processing-agents.html
Caveat: The documentation states:
«Adjust the MaxPoolSize property for the reporting database. The pool capacity should be at least about 120% of the number of agents (including processing, history, automation workers, and clean-up) that you have configured»
Not having the maxPoolSize set explicitly on the connection string means that the default value of 100 will be used. My total number of workers was still nowhere near this value hence the maxPoolSize didn’t need to be adjusted.
After these changes, processing was noticeably faster. I copied back all the entries from the temporary table and processing took roughly 6 hours to handle the remaining ~100’000 interactions.
INSERT INTO [xdb_processing_pools].[InteractionLiveProcessingPool] SELECT * FROM dbo.tmp_InteractionLiveProcessingPool ORDER BY Created DESC SELECT count(*) FROM [xdb_processing_pools].[InteractionLiveProcessingPool] DELETE FROM dbo.tmp_InteractionLiveProcessingPool
And voilà, the reports were back:
Thank you, you saved my bacon 🙂