We just finished the fourth day of SQLskills learning. Today was all on indexes and statistics. We learned about choosing a clustering key, what makes a good one, what makes a bad one. We also have to keep in mind the consequences to non-clustered indexes when choosing the clustered index. We learned about why SQL will choose to use a certain index and why it might use one we wouldn't expect.

Tipping points are interesting. These are the point at which SQL says, it's cheaper to do a scan with sequential IOs instead of a lookup with random IOs. I was amazed at how small a percentage is required before it tips to a scan. The narrower the table, the quicker the tip. This isn't necessarily a bad thing, but you have to be aware of it. Covering indexes aren't affected by tipping points.

Which brings up covering indexes. Be careful of creating too many and too narrow of covering indexes. You really need to consider the entire workload before creating any new indexes. You should also periodically review indexes for consolidation.

There seems to be many misconceptions hanging around or old best practices that have been superseded when 7.0 / 2000 came out. For example, you don't necessarily have to list the most selective column in a NC index. List in order of use frequency.

Tonight Paul & Kimberly opened the floor to anyone wanting to present a 20 minute topic. Kendra Little (blog | twitter), Brian D. Knight (twitter), Trevor Barkhouse (blog | twitter), Martin Hill (blog | twitter) and myself presented. I know I flubbed my presentation by going down a rabbit hole. Twenty minutes isn't enough to go down those holes. I need better presentation discipline and posture.

Today was a bit more difficult staying focused. I think it's just being in the same environment for almost a week. Hopefully I'll be more focused tomorrow.

Indexing for Performance (Index Internals)

Indexing for Performance (Internals & Data Access)

  • Have an idea of each table's tipping points, they can be smaller than you think
  • Dense tables will have smaller tipping points, possibly < 1%
  • Forced parameterization could conflict with filtered indexes / stats
  • Always explicitly specify session options
  • Filtering indexes / stats can offer widely covering indexes when highly selective
  • Most selective column shouldn't necessarily be first in index
  • If equality based, order columns in index by frequency used

Indexing for Performance (Statistics)

  • Query predicates can be affected by order, for large numbers of predicates
  • Histograms are usually key
  • Update stats before rebuilding indexes as it affects parallelization
  • sp_createstats 'indexonly' to create stats on non-statted columns of indexes
  • Out of date stats can affect query plan query cost relative to batch