Key Lookups at the Poker Table – #TSQL2sDay
|Cheers to Rob Farley ( blog | @rob_farley ) for hosting June’s T-SQL Tuesday #43, and for asking us to talk about plan operators. I read his host post a week ago and was excited, because I immediately knew that I wanted to write about Key Lookups and explain the rationale for my choice.
The Key Lookup is My Interviewing Ace in the Hole
I haven’t interviewed many SQL Server DBAs recently, but I have done my fair share over the last several years. Interviews are like poker, and the questions are your cards. After I have gotten to know a little about the applicant (i.e., dealt the cards), I start asking technical questions. One of my favorite questions to ask shouldn’t be a surprise to anyone. It is somewhere along the line of ‘What is your process for tuning the performance of a poorly running query?” This is where the applicants put on their best poker face, and I would estimate 95% include the words ‘look at the execution plan’ as part of their answer. This is one of the things I want to hear, so I don’t probe into their answers at this point. I just continue to deal the cards. Eventually, I decided to create a simple question specifically about reading an execution plan, since I was getting so many people that ‘knew’ how to read them.
Using Execution Plans to Call Their Bluff
There are a few exceptions, but I’ve learned most of what I know about execution plans from Grant Fritchey ( blog | @GFritchey ), either through his presentations or writings. As I was trying to come up with an execution plan that I could include as an interview question, I paged through his book, SQL Server Execution Plans. I wanted to keep it simple, and flipped to early in the book, where he shows the first plan that involves more than one plan operator. That seemed like the perfect basis for a question to me. In my question, I provide the DDL from AdventureWorks.Sales.SalesOrderDetail, including the indexes:
CREATE TABLE Sales.SalesOrderDetail( SalesOrderID int NOT NULL, SalesOrderDetailID int IDENTITY(1,1) NOT NULL, CarrierTrackingNumber nvarchar(25) NULL, OrderQty smallint NOT NULL, ProductID int NOT NULL, SpecialOfferID int NOT NULL, UnitPrice money NOT NULL, UnitPriceDiscount money NOT NULL, LineTotal AS (isnull((UnitPrice*((1.0)-UnitPriceDiscount))*OrderQty,(0.0))), rowguid uniqueidentifier ROWGUIDCOL NOT NULL, ModifiedDate datetime NOT NULL, CONSTRAINT PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID PRIMARY KEY CLUSTERED (SalesOrderID ASC, SalesOrderDetailID ASC) ) CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_ProductID ON Sales.SalesOrderDetail ( ProductID ASC)
And, this basic DML:
SELECT ProductID , OrderQty , UnitPrice FROM Sales.SalesOrderDetail WHERE ProductID = 897
I’m a big fan of using SQL Sentry’s Plan Explorer over what is provided in SQL Server Management Studio (SSMS). So, I include a screenshot of the execution plan and ask them to explain what improvements could be made (i.e., at this point the pot has gotten big and I ‘call’ to find out if they are bluffing about being able to read execution plans):
Now, I mentioned that 95% of applicants say that they read execution plans to tune queries, but 95% of applicants also don’t know how to eliminate this simple Key Lookup.
More Plan Details
More information is displayed when you move the mouse over plan operators, in either SSMS or Plan Explorer, giving us important clues about what a Key Lookup does. First, let’s look at the Index Seek details:
From the information above, you can see that the query optimizer is checking the IX_SalesOrderDetail_ProductID index to find the rows WHERE ProductID = 897 in the Seek Predicates section. That makes sense because that index is on the ProductID (SalesOrderID and SalesOrderDetailID are also in the Output List section because they are the fields in the Primary Key). An Index Seek is great here, but we also wanted OrderQty and UnitPrice returned, and they aren’t in this index. What is the fastest way to get those? Let’s look at the Key Lookup operator details:
We see that the Seek Predicate section shows using SalesOrderID and SalesOrderDetailID from the Clustered Index to return the OrderQty and UnitPrice in the Output List section. Since IX_SalesOrderDetail_ProductID didn’t contain OrderQty or UnitPrice, a second operation was needed to return those values.
The Index Seek allowed the optimizer to quickly filter down to only the rows that contain ProductID 897, and the Key Lookup allowed the optimizer to find the values not available in the Index Seek. If our goal is to eliminate the Key Lookup here, we want to add the OrderQty and UnitPrice to the IX_SalesOrderDetail_ProductID index:
CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_ProductID ON Sales.SalesOrderDetail (ProductID ASC) INCLUDE (OrderQty, UnitPrice)
Now, when we re-execute the same query, we get the following plan, with no Key Lookup:
In this small example the impact was trivial, but Key Lookups have been known to contribute significantly to query execution times, and are one of the easier problems to fix with proper indexing.