ColdFusion Summit East Notes: Faster Queries in ColdFusion and SQL Server, Eric Cobb

All tips work on SQL 2005 and up

Public Stack Overflow Database
powered by SQL server
Stack Overflow periodically make copies of their DB for people to use in presentations!
Good free data set to play with for demos, etc.

SentryOne Plan Explorer
great tool for looking at execution plans in SQL Server

CFML query rules to live by -
Always use CFQueryParam
- security, data validation, better performance
- uses “bind variables” when passing SQL to the server, to prevent sql injection, also allows server to reuse the execution plan

Always look for caching opportunities —
ex: list of US states doesn’t change, can cache that
can also cache things that are not long term
think of it as reducing hits to the DB even if just for a short amount of time
can have a big impact on perf of the app

blog post “Good Developers Practice Safe Query Caching”
Mark Kruger, ColdFusion Muse

ex:
page on busy site receiving over 200 views per minute
query populated choices in a dropdown list that changed frequently
by caching query results for -three- -minutes-, the num of db hits when form 12000 to 20 per hour!
short term caching can be really beneficial!

execution count
every time you run a SQL query
server will look at query, analyze it and generate an execution plan
that’s how it tries to find the data you’re reading, deleting, etc.
this plan tells the server how to go get the data

once sql server generates the execution plan
it saves that plan in a section of memory called “plan cache”
it’s expensive to generate these plans, so it saves them
this is 1 of the reasons why queries run faster the 2nd time vs the 1st time, execution plan is already generated on 2nd run, 3rd run, etc.

Query in a loop —

new “execution_count” for EVERY query because we have hard-coded values in the query
also wastes memory on the SQL server
swapping out the hard-coded values to CFQueryParam lets us reuse the execution plan
uses “bind variables” to pass an parameterized query to SQL from ColdFusion

(there’s a blog post on coldfusion.adobe.com with Eric’s query to show execution plans on the server)

Caching —

short term caches can be very beneficial
got a query in a loop that runs over and over?
try not to do that, bad form
but “cachedwithin” for 30 seconds can be a big speed improvement if you’re stuck in that spot

Common Query Problems —

ORDER BY, GROUP BY, and DISTINCT may be slowing your query down
if you’re running a query that deals with a LOT of data or is very complex (lots of joins, sub queries, etc)
these operators could be causing problems
when a query comes into the server, a “memory grant” is issued
that says “based on the analysis of this query, it will get X number of megs of memory to run”
if the emmory grant is not large enough (ex: it’s 10 megs but you needed 50)
the grant can NOT be expanded
if the memory isn’t available in the grant, the server will do a “temp db spill”
this creates a temp table in temp db and will continue processing in there for the rest of your query stuff
this is what happens a lot with large ORDER BY, GROUP BY, DISTINCT, etc.

when that happens, queries can get significantly slower
very easy to figure out if that’s happening
if you have a query using one of these operations, take out the ORDER BY, DISTINCT, etc, and see if the query is much faster
if it is, you know you’re getting a temp db spill
then, look at some indexing
or try returning the data WITHOUT the “order by” and try to filter the data elsewhere (via QueryOfQuery in CF maybe? via JavaScript table sorting, etc.)
it will depend on your data, but test this and see what happens.

Data Type Mismatches—
when you’re comparing columns in SQL, or 2 columns to each other, or variable to a column, etc
want to make sure the data types match
don’t want to compare an INT to a VARCHAR, for example
SQL has to have an EXACT match to compare these things
ex: WHERE customerName = some variable
SQL has to convert 1 to match the other for the comparison
this is an “implicit conversion” SQL has to do to make the comparison work
happens behind the scenes
if querying against a db in a table with a LTO of data
if it converts EVERY record in that table, you’ll see a big performance hit
happens a lot with “date” types
and with varchar, nvarchar, etc comparisons
into to big int isn’t that big of a deal.

Indexing —
“Poorly designed indexes and a lack of indexes are primary sources of database application bottlenecks”

proper indexing is 1 of the BEST performance enhancements you can make to your database

various types of indexes
clustered, columnstore, nonclustered, hash, spacial, xml, full text, etc.

the generic “index” term means “nonclustered” ALWAYS
all the OTHER index types will be referred to by NAME (“a clustered columnstore index”)
“index” is “nonclustered”

Clustered -
tells SQL how to physically sort the records on disk
the most important index you can apply to a table
Primary Key = clustered index (usually) if you don’t specify a Clustered Index SQL automatically creates one on a table’s Primary Key

clustered and nonclustered are most common for typical web app

nonclustered-
these are the guys you write to make queries faster
think of it as a mini version of your table
copies the values from the specified columns
points to the actual data rows (via clustered index or Heap Row ID)
can have multiple nonclustered indexes on a table
SQL 2005 allows up to 249 per table; sql server 2008+ allows up to 999 per table…but don’t actually make that many!!!

can only have 1 clustered index, multiple nonclustered
clustered describes how the data is physically sorted, hence only 1 of them allowed

“Include Actual Execution Plan” button in SQL Enterprise Manager
typically, read the plan beginning top-right, over to the left
top right will be 1st table sql hits
below will be other tables it hit
as you go to the left you’ll see how it progressed to build your SELECT statement

“clustered index SCAN”
SCAN = sql did NOT know exactly where the data was so it had to SCAN to find it

“index SEEK”
SEEK = “I know exactly where that info is, and i’m going to get it”
much faster than a SCAN

“nested loop” for the inner join

“key lookup”
Key Lookup = i had PART of the data from the index but I had to go to the table to get more info

SentryOne - company that makes SQL products
“Plan Explorer”
products that allows you to analyze execution plans
FREE download!
don’t even need to give email addy

“Missing Index Found”
SQL will TELL YOU if it can find an index and it will tel you WHICH index will make it faster

right-click “Missing index details” and it will write the index FOR YOU

SQL is very good at understanding WHEN an index will help
it’s not good at recommending WHAT the index SHOULD be
about a 50/50 chance of getting it right
it doesn’t take a lot of things into account eg. if another existing index could do the same thing
doesn’t look at how busy yr system is
it looks at this query with blinders on

github.com/ericcobb
sqlnuggets.com
@cfgears - personal twitter
@sqlnugg - for SQL tips