Art of the DBA Rotating Header Image

defrag

Database Oil Changes – Part 2

She’s comin’ apart at the seams!

The other big maintenance task that should be regularly done on your databases is managing your index fragmentation. For those wondering, index fragmentation is how the page splits in your indexes will cause the data pages that make up your indexes to become logically disorganized, so that when the SQL Server engine goes to use the indexes, it will have to make more calls to the physical disk to get the appropriate index pages, thus requiring more work (ultimately meaning your queries will perform slower).

For the full skinny on what index fragmentation is, take an hour and watch Paul Randal’s(b|t) MCM training video.

In order to address index fragmentation, you have a two options:

  • REORGANIZE – This will reorganize the data pages in your index on the leaf level. While it will compact and organize data on that leaf level, header records won’t be adjusted, meaning you might still have index fragmentation.
  • REBUILD – This is exactly what it says it is, a complete rebuild of the index. This is pretty time and resource intensive (especially if your table is large), but once it’s done you will have a clean and shiny new index with little or no fragmentation.

The general rule of thumb on these two options is to REORGANIZE when your fragmentation is low, REBUILD when it gets higher. The numbers typically used are to REORGANIZE when your fragmentation is between 10% and 30%, REBUILD when you exceed 30%, but your mileage may vary.  As with many aspects of SQL Server, there are no real hard and fast rules, so make sure you evaluate your environment when deciding between these two options.

A Level 3 Diagnostic

So how do we find fragmentation? Fortunately, there’s a dynamic management object just for this purpose: sys.dm_db_index_physical_stats.   CAUTION: This is one of the few super-costly DMOs out there, as it will scan all your index pages to find out the exact fragmentation. I recommend you read up on the different options before using it and do not run it on production servers during high activity times.

To get you started with this view, here’s a basic query to get your fragmented indexes in the current database:

SELECT
      db.name AS databaseName
    , ps.OBJECT_ID AS objectID
    , object_name(ps.OBJECT_ID,db.database_id) objectName
    , ps.index_id indexID
    , idx.name
    , ps.partition_number partitionNumber
    , ps.avg_fragmentation_in_percent fragmentation
    , ps.page_count/128.0 index_size
FROM sys.databases db
  JOIN sys.dm_db_index_physical_stats (db_id(),NULL, NULL , NULL, N'Limited') ps
      ON db.database_id = ps.database_id
  left join sys.indexes idx on (ps.object_id = idx.object_id and ps.index_id = idx.index_id)
WHERE ps.index_id > 0 --Skip heap tables
	and ps.avg_fragmentation_in_percent >= 30 --Fragmentation >= 30%
order by fragmentation desc
OPTION (MaxDop 1);

Make it so

So armed with our tools, now we need to set up a regular job to find and defrag our indexes. The frequency with which you do this is completely dependent on your environment, so it might be once a day, once a week, or even once a month. The key, as with the statistics maintenance, is to do it in relation to how frequently your data changes. I try to rebuild my indexes daily (well, only the ones that need it) in my OLTP systems. In a data warehouse, I’ll set up my job to run whenever a major ETL load completes, as that will be just after my data has changed significantly.

And, because the SQL Server community is so awesome, you don’t even have to write your own script! Go take a look at Michelle Ufford’s(b|t) Index Defrag script. It covers many of the possible scenarios for index rebuilding and, with all its parameters, it’s very configurable. You can easily take this stored procedure, implement it into a SQL Agent job, and *BAM*, instant maintenance!

What this will do, as with statistics maintenance, is keep your data in the best shape possible for querying. By reducing the fragmentation in your indexes, you reduce the I/O hit you’ll take whenever you query against those indexes and have one less thing to worry about when troubleshooting performance problems.

One more post to go! We’ll wrap up with me talking(well, writing, really) a little bit about transaction log backups and then your overall general maintenance plan.