Answered by:
Query Store sporadically goes in and out of the ERROR state

-
Hello,
We have a SQL Server that has a database with Query Store enabled.
For some unknown reason, every now and then, this Query Store enters the ERROR state. Often times, it also goes back to the READWRITE state on its own after a while. It happens on seemingly random times and we can't correlate it with anything.
Based on Microsoft's documentation, this should not happen normally. But on this particular server it's relatively frequent.
Unfortunately, I couldn't find anything else in Microsoft's documentation about this ERROR state and how to troubleshoot it or even know why it happens.
I tried creating an extended event session to record all QueryStore-related failure events that I could find, in order to better investigate the reason for the errors. But no events are captured there when the problem re-occurs.
I uploaded to GitHub the script that I used (I don't know, maybe I did something wrong?):
https://gist.github.com/EitanBlumin/fb0c4d0a1e3e34610765b50e970af974
So, basically what I'm asking is:
How can I investigate why a Query Store enters the ERROR state? Is it even possible??
SQL Server Version: 2016 SP2 Enterprise Edition
Thanks,
Eitan
Eitan Blumin; SQL Server Consultant - Madeira Data Solutions
- Edited by EitanBlumin Monday, July 9, 2018 11:31 AM
Question
Answers
-
Thanks Teige.
I submitted a suggestion in Azure feedback for this, as per your recommendation.
Anyone reading this, please vote for it here:
The suggestion I submitted is:
Add logging events for Query Store entering the ERROR state
As of right now, it's impossible to know why a Query Store enters the ERROR state. Microsoft's documentation only states that it happens due to "internal problems", but we don't know what these problems are, nor which one of these problems is the real cause for when a Query Store enters the ERROR state. It would be extremely helpful if, for example, there were extended events fired when a Query Store enters the ERROR state, with the reason for the error being part of the event data.
While Microsoft's documentation states that this happens only extremely rarely, I've personally seen situations where the Query Store enters the ERROR state periodically on a server, and we don't know how to stop it from happening because we don't know what's causing it. I've seen this happen on more than one server.
Please vote here: https://feedback.azure.com/forums/908035-sql-server/suggestions/34789006-add-logging-events-for-query-store-entering-the-er
Eitan Blumin; SQL Server Consultant - Madeira Data Solutions
- Marked as answer by EitanBlumin Tuesday, July 10, 2018 6:12 AM
All replies
-
Hi Eitan
Hmmm, I have ten databases on SQL Server 2016 SP2 Enterprise eanbled QS and have not seen such issue.
What are your QS settings? Have you seen something in the ERROR.LOG? I can see
2018-07-09 14:48:03.53 spid60 Setting database option query_store to on for database 'PerformanceV3'.
2018-07-09 14:48:03.62 spid60 Setting database option query_store to read_write for database 'PerformanceV3'.
2018-07-09 14:48:32.79 spid62 Setting database option query_store to read_only for database 'PerformanceV3'.Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/
MS SQL optimization: MS SQL Development and Optimization
MS SQL Consulting: Large scale of database and data cleansing
Remote DBA Services: Improves MS SQL Database Performance
SQL Server Integration Services: Business Intelligence
-
Hi Uri,
The only similar thing I see in the ERROR.LOG is the message "Query Store settings initialized with enabled = 1" upon the server startup.
I understand that having QS entering the ERROR state is a very rare occurence.
Like I said, this only happens in one of our servers.
What I want to know is if/how the QS error state can be troubleshooted, in general.
Eitan Blumin; SQL Server Consultant - Madeira Data Solutions
- Edited by EitanBlumin Monday, July 9, 2018 12:03 PM
-
>>>>What I want to know is if/how the QS error state can be troubleshooted, in general.
Have you performed SET QUERY_STORE CLEAR? Still error state ?
https://docs.microsoft.com/en-us/sql/relational-databases/performance/best-practice-with-the-query-store?view=sql-server-2017
/*
The difference between the
actual_state_desc
anddesired_state_desc
indicates that a change of operations mode occurred automatically. The most common change is for the Query Store to silently switch to read-only mode. In extremely rarely circumstances, Query Store can end up in the ERROR state because of internal errors.When the actual state is read-only, use the readonly_reason column to determine the root cause. Typically you will find that Query Store transitioned to read-only mode because the size quota was exceeded. In that case the readonly_reason is set to 65536. For other reasons, see sys.database_query_store_options (Transact-SQL).
*/
Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/
MS SQL optimization: MS SQL Development and Optimization
MS SQL Consulting: Large scale of database and data cleansing
Remote DBA Services: Improves MS SQL Database Performance
SQL Server Integration Services: Business Intelligence
-
Uri, I think you misunderstood my predicament.
My problem is not that I can't get the Query Store out of the ERROR state.
I very much can and it works fine, and sometimes it even happens on its own.The problem is that every once in a while it goes back to the ERROR state, and I don't know why this happens. I want to know WHY this happens.
As you've quoted yourself, about the only thing explained about the ERROR state of the Query Store is the sentence "Query Store can end up in the ERROR state because of internal errors.". That is hardly helpful, I'm sure you would agree.
Eitan Blumin; SQL Server Consultant - Madeira Data Solutions
-
Hi EitanBlumin,
Based on our test, we will not find any error message related to it when the problem "Query Store ERROR state" occurs, as mentioned by the error message itself, this problem can be caused by some internal errors like failover/restart happens on QDS cleanup.
As a workaround, we can use the code in https://blogs.msdn.microsoft.com/vikas_rana/2018/06/27/query-store-entering-error-state/ to check the database automatically.
By the way, I would suggest you opening a feedback about this problem in https://feedback.azure.com/forums/908035-sql-server
Best Regards,
Teige
MSDN Community Support
Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com. -
Thanks Teige.
I submitted a suggestion in Azure feedback for this, as per your recommendation.
Anyone reading this, please vote for it here:
The suggestion I submitted is:
Add logging events for Query Store entering the ERROR state
As of right now, it's impossible to know why a Query Store enters the ERROR state. Microsoft's documentation only states that it happens due to "internal problems", but we don't know what these problems are, nor which one of these problems is the real cause for when a Query Store enters the ERROR state. It would be extremely helpful if, for example, there were extended events fired when a Query Store enters the ERROR state, with the reason for the error being part of the event data.
While Microsoft's documentation states that this happens only extremely rarely, I've personally seen situations where the Query Store enters the ERROR state periodically on a server, and we don't know how to stop it from happening because we don't know what's causing it. I've seen this happen on more than one server.
Please vote here: https://feedback.azure.com/forums/908035-sql-server/suggestions/34789006-add-logging-events-for-query-store-entering-the-er
Eitan Blumin; SQL Server Consultant - Madeira Data Solutions
- Marked as answer by EitanBlumin Tuesday, July 10, 2018 6:12 AM
-
-
Hi Conor,
I already created an XE session to capture all query-store related events that I could think of, query_store_severe_error_shutdown included, and yet nothing was captured when the problem recurred.
As mentioned by Teige earlier, the error state could also happen due to failovers and restarts and such. Would such cases also be addressable with updates/patches?
Eitan Blumin; SQL Server Consultant - Madeira Data Solutions