Friday, July 15, 2011

Good old-fashioned SQL optimzation using an index

Sometimes simple, basic stuff really works.

This query was taking too long to execute:
SELECT ISNULL(SUM(Quantity),0) as 'count'
FROM dbo.AllOrderItems WITH (NOLOCK)
WHERE CustomerID=@customerID AND ItemID=@itemID

AllOrderItems has nearly 5 million rows. I executed this query with a customerID and itemID that corresponding to 3 records. It took 23 seconds to run.

I created an index on the CustomerID and ItemID columns like this:
CREATE NONCLUSTERED INDEX [Customer_Item] ON [dbo].[ArcOrderItems]
(
[CustomerID] ASC,
[ItemID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

With this index, the execution time was reduced from 23 seconds to less than one second.

Moral of the story: sometimes the obvious fix is the right one when it comes to database optimization.

Bonus tip: When testing the speed of a query, it will often run faster the second and subsequent times, because SQL caches the results. These two commands clear the cache, ensuring a realistic measurement of the execution time:
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

Interesting book about engineering

When I attended June's Google NYC Tech Talk on performance bugs, speaker Jon Bentley recommended the book To Engineer Is Human by Henry Petroski. I enjoyed Jon's presentation so much, I read the book.

This book is not about software. It's primarily about civil and aeronautical engineering. But the lessons it imparts about how to learn from failure and create more reliable products apply nicely to software engineering.

The writing style was a bit formal, and some of the examples -- this book is from the 1980s -- are dated. But overall I enjoyed it, and it made me think about how to be a better software developer.