SQL Server query tuning and optimization (Packt)


Page 1 of 4

Author: Benjamin Nevarez
Publisher: Packt Publishing Pages: 446
ISBN: 9781803242620
Print: 1803242620
Kindle: B0B42SVBFY
Audience: DBAs and intermediate to advanced developers
Rating: 4.7
Reviewer: Ian Stirk

This book aims to give you the tools and knowledge to get optimal performance from your queries, how does that happen?

The book, which is largely an update of the author Microsoft SQL Server 2014 query tuning and optimization published by McGraw-Hill, with two new chapters, covers query tuning and optimization, with the goal of giving you the tools and knowledge to get optimal performance from your queries and applications. The optimizer creates an execution plan, detailing how the query will be satisfied, the quality of the plan often depends on the data given to it (e.g. database design, available indexes, etc.). This book examines these factors with the goal of improving query performance.

The book is intended for SQL Server professionals, database developers, database administrators, and data architects. This assumes some knowledge of SQL Server and familiarity with the SQL language. It covers all supported versions of SQL Server, including SQL Server 2022.

Below is a chapter-by-chapter exploration of the topics covered.

Chapter 1 Introduction to Query Tuning and Optimization

The chapter opens with the idea that the more you know about how SQL Server works, the better positioned you are to troubleshoot SQL problems. It continues with a brief overview of the main architectural components: the storage engine (controls getting data, concurrency, and integrity) and the relational engine (aka the query processor, which creates a plan, executes the query and returns the results). This chapter examines the operation of the query processor.

The request handling process is broken down and its steps discussed, namely:

  • Parsing and binding (parsing ensures SQL syntax is valid, binding is primarily name resolution)

  • Query optimization:

  • Generation of candidate execution plans (we want a good enough plan. often potentially a lot of plans, uses rules and heuristics, we must limit the optimization time)

  • Evaluate the cost of each plan (cost of the physical operator [CPU, IO etc] with estimated lines [cardinality] reflects total cost)

  • Query execution and plan caching (execution plan stored in plan cache. Cache checked first to see if plan exists, as optimization can be relatively expensive)

The chapter then focuses on execution plans, their components, and how to read them. They are the primary method of interacting with the query processor. The different formats of the plans are discussed, namely: graphic, XML and text – it should be noted that the latter is obsolete. Various plane attributes are discussed (eg, STATEMENTOPTMLEVEL). Plans often contain warning messages (eg NoJoinPredicate), which are helpful indications of a possibly problematic or suboptimal plan. The possibility of obtaining the plan from a trace or from the plan cache is discussed with the use of the relevant tool. Finally, the use of SET STATISTICS TIME and IO is discussed with examples, providing another useful setting method.

This chapter gives a very good overview of how the query processor works (parsing, binding, optimizing, executing), as well as a useful overview of the concepts that are used in the rest of the book. There are useful diagrams showing request processing and compilation and recompilation processes. There is some useful code to create an Extended Events (XE) session to save show plan information.

The chapter has some shortcomings. First, it should have been updated to refer to execution plans held in Query Store. Second, to get the top 10 most expensive queries by CPU usage, the author sorts by average CPU – that’s wrong, you have to sort by total_worker_Time. This is the second book review in a row where I see this error. The explanation of my fix is: if a query takes 20 minutes of CPU to run and runs twice, its average CPU usage is 20 minutes and its total CPU usage is 40 minutes, if another query takes 2 minutes of CPU to execute, and is executed 1000 times, its average CPU usage is 2 minutes, and its total CPU usage is 2000 minutes. If you sort the results by average CPU usage, the 20-minute query appears on top, even though it’s not the most CPU-intensive query.

There is a useful tip on examining the schema associated with the XML plan to discover any additional plan properties (eg StatementOptmEarlyAbortReason, NonParallelPlanReason). There is good example code to illustrate the concepts discussed, good links to related chapters, and website links for more information – these traits apply throughout the book.


Comments are closed.