As a BI developer, I wanted to read a book about query performance tuning so I can make my reports run faster and my ETL run times shorter. There are some books about query tuning in SQL Server out there, one of them being the excellent Query Tuning Optimization by SQL Server MVP Grant Fritchey (blog | twitter). However, the book is over a whopping 500 pages long and some content is a bit too hardware/administrator-focused for a BI developer like me, so I was looking for a book in a more digestible format (in other words: a bit easier and shorter). If you’re the technical guy wanting to know a lot about the internals of query tuning, definitely check out Grant’s book.
Luckily I found what I was looking for in the following title: Microsoft SQL Server 2014 Query Tuning & Optimization by SQL Server MVP Benjamin Nevarez (blog | twitter). I have to say I was very impressed by the book and I definitely learned a lot. It’s a little less than 400 pages long but it’s written in a way that makes it content quite readable, even though it’s technical.
About the Author
Benjamin Nevarez has over 20 years’ experience with relational databases and is a SQL Server MVP for many years. He has written several books about the internals of SQL Server and is a speaker at many SQL Server conferences, including the SQL Server PASS Conference, the largest SQL Server-related conference in the world. In short, I’d trust the content he writes about SQL Server.
Let’s take a look at what’s inside the book.
The first chapter gives a general but thorough introduction to the architecture of the query processor: parsing, binding, optimization, assessing costs, and so on. The different types of representation of query plans are discussed as well (graphical, text and XML): how to read them, deal with warnings, and get them either from the plan cache, from a trace, or from extended events. SET STATISTICS IO and TIME are briefly introduced. As in many chapters in this book, a lot of useful SQL statements are presented that retrieve essential information from the system, such as how to get an execution plan from the DMVs (dynamic management views). You can add those queries immediately to your toolbox.
MCSE Training – Resources (Intense)
The second chapter talks about troubleshooting queries. The main part of the chapter deals with getting information from various sources, such as DMVs, a SQL trace, extended events, or the data collector. If you want to know how to find the most expensive queries on your system, this is the chapter you want to read. This chapter builds on Chapter 1, as it doesn’t discuss execution plans, for example, to troubleshoot queries.
The query optimizer is the subject of Chapter 3 and is described in great detail. It talks about the different phases the optimizer goes through in order to build a good enough execution plan. For people who quickly want to learn how to make their queries go faster, this chapter might be a bit too detailed. You can write decent performing SQL statements without ever knowing what the memo structure is, for example. However, if you really want to know what’s going on inside the optimizer, this chapter is a must-read.
Chapter 4 explains the most common operators in an execution plan. This is one of the most useful chapters in the book and I definitely learned a lot by reading it. Finally I know the exact difference between a nested loop join, a merge join, and a hash join.
Chapter 5 is all about indexes and has anything you’d expect from it: the different types of indexes, their advantages and disadvantages, the database tuning advisor (which actually most DBAs avoid), missing and unused indexes, and index fragmentation.
Statistics are discussed in Chapter 6 and once again in very great detail. This chapter is the one I learned most from in the book. About every aspect of statistics is treated by the author and he really does a good job in presenting everything in a way it’s easy to understand. You really get the message why statistics are so important in the creation of execution plans and also why they can have such a big impact on performance if the estimates are not good enough. A must-read in my opinion.
Chapter 7 introduces In-Memory OLTP (maybe better known by its codename, Hekaton). To be honest, I skipped this chapter because, as a BI developer, the chances are slim I will use this feature in the near future. Also, in my opinion, this feature has too many restrictions in the current release (SQL Server 2014). I hope Hekaton will have fewer limitations in the next releases of SQL Server.
Chapter 8 talks about plan caching. When is a plan recompiled and when is it reused? The biggest part of the chapter deals with parameterization and parameter sniffing. This last one has gotten a bad reputation over the years because of the potential performance problems, but it is actually quite a useful feature. Definitely worth checking out if you need some research on parameter sniffing.
Chapter 9 is a chapter I was definitely interested in, as it deals with data warehouses. Unfortunately the chapter is quite short. It gives a short introduction to the concept of data warehouses and then it details two important concepts: star join optimization and columnstore indexes. Both features are described really well, but they are both Enterprise edition features so, for readers with another edition, this chapter hasn’t got that much too offer (unless they want to learn about those features of course). Personally I’d hoped there be a little more content, such as, for example, indexing strategies for fact tables. But maybe that’s a bit out of scope for a general book about query tuning and optimization.
The last chapter talks about a few topics: join order (why it complicates query optimization so much), breaking down of complex queries (a last resort when the query itself is too complex to be efficiently optimized) and query hints (with a lot of practical examples).
I personally liked this book very much. It taught me a lot about the inner workings of the SQL Server query optimizer and how statistics work and why they are so important. It never is too overly technical or complicated, and the author really succeeds in bringing across his vast knowledge of the subject areas. The book also has a lot useful queries ready for troubleshooting tasks that you can immediately add to your toolbox. An advantage, in my opinion, is that the book is independent of hardware and server configuration; it only talks about SQL Server itself. The only downside I could find is the lack of case studies. I would have liked, for example, a small chapter (or maybe an appendix) where the author starts with a few problematic queries and then applies all the knowledge of the previous chapters to troubleshoot them and find the root cause of the performance issue so he can suggest a solutions for these problems (adding an index, different database settings, rewriting the query, etc.)
This book will be useful for everyone who needs to write queries on SQL Server and wants to know how to start with query tuning and optimization. It is also a great set of reference material on deeper technical subjects, such as the optimizer, query plans, and statistics. Even the seasoned T-SQL developer will probably find something useful in this book. I definitely recommend this book.