Query Store Hints, how can it help in performance issues?

Hi there!

Last December I got lucky since after upgrading from SQL Server 2012 to 2022, one of my clients had a process running in a particular database that went from around 10 minutes to around 10 hours!

You might be thinking: “It got 60 times worse, where’s the luck in that?”

Well, it presented me with a great opportunity to try a new feature in SQL Server 2022 called “Query Store Hints” (in a real production environment) and confirm the fabulous results that can be obtained by using it, with no effort at all. I thought it could be worth to share this experience and that’s what I’ll present in this post 🙂

Throughout this post, I’ll mention databases Compatibility Level (CL) several times. I’m assuming that you’re familiar with it, in the way that it sets T-SQL and query processing behaviors to be compatible with the specified version of the SQL engine. If you’re not familiar with it, you can find useful and extensive information about it here.

The Problem

An upgrade in place was made from SQL Server 2012 to 2022. By the way, as a side note I believe is worth mentioning, an upgrade in place will keep the master database with the original CL, while elevating the compatibility level for the other system databases, so after this in place upgrade, master kept CL = 110 whereas tempdb, model and msdb had CL changed automatically to 160. This behavior is by design:

I must say that I can’t fully agree with the statement: “This won’t affect user database behavior“, specially if there are users running T-SQL statements containing cross-database queries from the master over user databases (using three part name), it can really affect the behavior. But as I said, this is just a side note, not the problem I wish to explore in this post, we simply elevated the CL for the master database manually, no worries here.

Regarding the user database, immediately after the upgrade, we turned on Query Store and kept the original CL = 110 (SQL 2012) on purpose, for a couple of weeks. This database holds data for a software product implemented by a specific vendor that stated that we could elevate CL to 140 (SQL 2017). So, after a few weeks, having both the environment stable (with important metrics inside query store) and the vendor confirmation that we could elevate CL to 140, we actually did it.

A daily night process that used to take 10 minutes to execute, took 10 hours the first night after elevating CL to 140 and the exactly same thing happened on the second night! Thanks to the contents of query store, it was not difficult to pinpoint the query from the night process that used to take around 4 minutes and started to take 10 hours to execute right after elevating CL to 140. Yep, amazingly (and fortunately) this huge increase in time execution was all due to a single query!

The Solution

At this point, I already had the query responsible by the excessive execution time that suffered a tremendous regression, right after elevating CL from 110 to 140.

Since SQL Server 2014, the Cardinality Estimator (an important component of the Query Optimizer) had several improvements, resulting most of the times in better performance for the queries running in SQL Server, however it’s a known fact that for some queries, changing CL from SQL Server versions <= 2012 to >= 2014 can cause regression.

My first step was to confirm if this was the actual problem. Fortunately we can do that with a very simple test. I picked the “problematic” query from the query store, added the following hint at the end of it and executed in a testing environment (in a copy of the prod database) :

--available since SQL Server 2016 SP1, I placed this after the query
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

This query hint is available since SQL Server 2016 SP1 and basically it will force the use of the “old” legacy cardinality estimator (the one used until SQL 2012) at the query level, not at database level, so it’s very granular (I can’t show the actual query here)

As I somewhat expected (and truly desired), this time the query only took around 4 minutes to run. Then I removed the query hint and there it was running “forever”. I tested it a few times, the behavior was always the same. Ok, now the question is:

How can I fix this, with the lowest impact?

It’s important to notice that the databases Compatibility Level does not “only” set query processing behaviors, it can also determine whether certain T-SQL statements will run or not. For example if the database CL is 110, certain functions will not work, such as STRING_SPLIT, OPENJSON or GENERATE_SERIES:

So, putting the CL back to 110 (it was now at 140) would mean that the database could not benefit from a very significant number of SQL Server improvements that became available since 2012, at the query processing level and on top of that, denying the possibility of using certain T-SQL statements that can be quite useful in new developments.

One of the many great things SQL Server 2016 brought us was “Database Scoped Configurations”, one of them being “LEGACY_CARDINALITY_ESTIMATION”. This option would already be better because it would allow us to maintain the database CL = 140 (supported by the product vendor) while taking care of that query 🙂

Yes, by now you might be thinking something like: “But that will still affect all the queries sent to the database and only one query regressed after CL elevation!” You’re right! It would still have a considerable impact… Let’s see, I couldn’t apply the query hint above, the one I used for troubleshooting purposes, since no changes could be made to this query because it was inside the software product.

So, before SQL Server 2022, I guess I’d choose to create a Plan Guide, yep that could do it but since I’m in SQL Server 2022, I’ve decided to use the new Query Store Hints feature.

Assuming that you’re already familiar with query store, using query store hints couldn’t be simpler, we just need the query_id of the target query, obtained directly from the query store (via sys.query_store_query) and the hint (or hints) we want to apply to it. No changes will be made to the query itself, we’ll use a stored procedure named sys.sp_query_store_set_hints to save this “instruction” inside the query store in the database, so that when this specific query executes, the respective hints are automatically applied to it.

So, in my case it resulted in running this simple statement:

EXEC sys.sp_query_store_set_hints @query_id= 1187, @query_hints = N'OPTION(USE HINT(''FORCE_LEGACY_CARDINALITY_ESTIMATION''))';

Done! As soon as it’s been created on the database, the query execution changed back to its previous behavior, using the legacy cardinality estimation and taking around 4 minutes (instead of 10 hours), thus minimum impact (this single query is affected) with maximum benefit, what a quick win, and it’s been like this for the past 3 months now 🙂

Alerts

While this was truly interesting for me, as real life experience that I’m sharing with you, there are some alerts I’d like to leave here:

  • There is a limited set of hints that can be used, so we shouldn’t think of this as a “silver bullet” for all the problems, far from it;
  • Even when it is applicable, this approach should be based on tested evidences not some basic assumptions;
  • Be well aware of the circumstances that made the solution based on the hint to work, because if the underlying circumstances change, the hint may not respond so well in the future and probably something will have to be done about it;
  • This solution tends to become “invisible” overtime. After six months, there’s a risk that nobody knows or remembers its existence anymore, therefore I strongly suggest that when used, all this gets properly documented stating the original problem, the rationale behind the solution and the evidence collected before and after its implementation. I believe this is crucial;
  • Be careful with the possibility of having queries that use 3 part-naming, in the form <target database>.<schema>.<table> because if they’re invoked from another database, the query store hint of the target database will be ignored;
  • After its implementation, it’s important to monitor the situation on a regular basis and check if the benefits obtained are still valid. Among others, there is a system view that gives us some information regarding the existence of query store hints in one database, called sys.query_store_query_hints:

Conclusion:

In the end, for this particular situation, we came up with a good result, the CL was elevated to 140, with all its benefits, and the negative impact caused by that single query completely disappeared without any side-effect, thanks to this quick win called Query Store Hint, present for the first time in SQL 2022.

Oh yes, I’ll keep in mind all the alerts I mentioned above and act accordingly 🙂

Cheers!

Author: Etienne

Hello, My name is Etienne Lopes (https://pt.linkedin.com/in/etiennelopes) and I currently work as a Database Performance Architect at ORAMIX. I'm Working with SQL Server since 1997: versions 6.5 to 2019, member of sqlport community (www.sqlport.com), regular speaker since 2012 at SQL Server events such as: - SQL Saturday; - 24 Hours of Pass; above all I like SQL Server ;)

One thought on “Query Store Hints, how can it help in performance issues?”

Leave a comment