SQL Server Index fragmentation

Now this was a new one for me…

If you have had to deal with big systems, you know when a table peaks to a couple of million rows and counting, things tend to get a tad slow unless you’re planning on doing some heavy optimizations however, there’s always room to get things to some new extremes when you start handling that much data.

One of the first logical things you check on a database is that you have the correct indexes on each table and if you’re searching over one field, then that field is indexed if it can be so search gets a bit faster. An index is quite the good thing to work with because basically it stores an index (Duh!) of how to access records (rows) on the table, where the numeration on those indexes is precisely that field you chose. This article is a very good and quick way to know how indexes work, if you’re a developer and don’t know, please take 5 minutes of your time to read this, don’t worry, I’ll wait for you…

Back? Very good, give yourself a pat on the back and lets move on. Now, as you may know now indexes are stored in trees and as you add or delete (fundamentally delete) rows you create gaps on the tree that would be way too costly to fix as part of those insert or delete operations and those gaps are called fragmentation. SQL Server being such a fan-fricking-tastic tool as it is (no Microsoft does no pay me) has a way for you to query off a database which of its indexes is fragmented and how fragmented it is.

The key is on the sys.dm_db_index_physical_stats function:

sys.dm_db_index_physical_stats (DB_ID(@'MY_DB'), OBJECT_ID(@'MY_TABLE'), 1, NULL, NULL)

However, that provides information for just one index, so with a bit of SQL Magic, we get this far:

SELECT DISTINCT frag.* FROM
	( SELECT 
			TableName = t.name, IndexName = ind.name, IndexId = ind.index_id, ColumnId = ic.index_column_id, ColumnName = col.name,
			(
				SELECT avg_fragmentation_in_percent
				FROM sys.dm_db_index_physical_stats (DB_ID(NULL), OBJECT_ID(t.Name), ind.index_id, NULL, NULL) AS a
				JOIN sys.indexes AS b 
				ON a.object_id = b.object_id AND a.index_id = b.index_id
				WHERE avg_fragmentation_in_percent >= 0.1
			) AS Fragmentation
	FROM 
			sys.indexes ind 
	INNER JOIN sys.index_columns ic ON  ind.object_id = ic.object_id and ind.index_id = ic.index_id 
	INNER JOIN sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id 
	INNER JOIN sys.tables t ON ind.object_id = t.object_id 
	WHERE 
			ind.is_primary_key = 0 
			AND ind.is_unique = 0 
			AND ind.is_unique_constraint = 0 
			AND t.is_ms_shipped = 0
	) frag
ORDER BY frag.Fragmentation DESC

Whoa! That’s a lot bigger isn’t it? Well, yes, because this one will go through all the indexes on our database and call that function and return everything on a nice little table where you’ll know if there are indexes that should take your attention or not.

Defragmenting the bad ones.

Ok, so you know now which index(es) are misbehaving on your database, now we need to defragment them, so for that we’re going to use a simple ALTER statement and be done with it. To make a quick sample, if we had an index called MY_INDEX_I on a table called MY_TABLE the alter statement would be

ALTER INDEX MY_INDEX_I ON MY_TABLE REBUILD

Now this is a bit extreme because it will drop the index stored and will rebuild the whole thing which can potentially take quite some time depending on your data and the amount of rows you have. There is another way to defragment indexes but because it doesn’t work with SQL Azure and this particular investigation was done for a server over that version, I didn’t looked into it, however the call is not too far. On top of that, this whole rebuild is an overkill for any table with more than 10% of fragmentation, but will always work so it’s a very good safe bet.

Now we have a way to do it manually, however, what happens on a production database with maybe 90 or 100 indexes that gets heavily fragmented every few weeks? You just can’t do it manually in case you were wondering. For that we’re going to use a very handy stored procedure called sp_executesql that lets us invoke dynamically created SQL. All we need is a cursor to go through the indexes and then generate the sql to defragment it and voila.  It goes a bit something like this:

 

DECLARE @MySQL NVARCHAR(4000)
DECLARE @Now DATETIME
DECLARE @Message VARCHAR(255)
DECLARE @TableName VARCHAR(255)
DECLARE @IndexName   NVARCHAR(255)
DECLARE @IndexId   INTEGER
DECLARE @ColumnId   INTEGER
DECLARE @ColumnName   NVARCHAR(255)
DECLARE @Fragmentation FLOAT

DECLARE myCursor CURSOR FOR SELECT DISTINCT frag.* FROM
	( SELECT 
			TableName = t.name, IndexName = ind.name, IndexId = ind.index_id, ColumnId = ic.index_column_id, ColumnName = col.name,
			(
				SELECT avg_fragmentation_in_percent
				FROM sys.dm_db_index_physical_stats (DB_ID(NULL), OBJECT_ID(t.Name), ind.index_id, NULL, NULL) AS a
				JOIN sys.indexes AS b 
				ON a.object_id = b.object_id AND a.index_id = b.index_id
				WHERE avg_fragmentation_in_percent >= 0.1
			) AS Fragmentation
	FROM 
			sys.indexes ind 
	INNER JOIN sys.index_columns ic ON  ind.object_id = ic.object_id and ind.index_id = ic.index_id 
	INNER JOIN sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id 
	INNER JOIN sys.tables t ON ind.object_id = t.object_id 
	WHERE 
			ind.is_primary_key = 0 
			AND ind.is_unique = 0 
			AND ind.is_unique_constraint = 0 
			AND t.is_ms_shipped = 0
	) frag
ORDER BY frag.Fragmentation DESC
OPEN myCursor

FETCH NEXT FROM myCursor INTO @TableName, @IndexName, @IndexId, @ColumnId, @ColumnName, @Fragmentation

WHILE @@FETCH_STATUS = 0
BEGIN
	SET @Now = GETDATE()
	SET @MySQL ='ALTER INDEX ' + @IndexName + ' ON [' + @TableName + '] REBUILD'

	SET @Message = 'Reindex: ' + @IndexName + ' ON ' + @TableName + ': ' + CONVERT(NVARCHAR(28), @Now, 21) + ' (' + @MySQL + ')'
	PRINT @IndexName + ' ' + @TableName + '(' + CONVERT(NVARCHAR(28), @Fragmentation, 21) + ')'

	EXEC sp_executesql @MySQL

	FETCH NEXT FROM myCursor INTO @TableName, @IndexName, @IndexId, @ColumnId, @ColumnName, @Fragmentation
END

CLOSE myCursor
DEALLOCATE myCursor

I know right? But well, this works like a charm and because it’s fully automated, you can just call it from any of your systems as a cron job and no need to worry about fragmented indexes anymore. Did you tried it? Well, give me a shout and let me know how you got on.