Art of the DBA

June, 2011:

It’s a heterogeneous world

Here’s my big “DUH” statement to kick things off: the information technology world is one of constant change. Shocking, I know! For most of us, that’s why we love this business. There’s always something new to learn and fresh challenges to solve.

But it’s also a world where the old things don’t go away. I’m sure many of us have stories of having to keep some Access application afloat or support a legacy website with most of its back end hard coded. When companies make investments, they want to get the most out of that investment that they can, which means platforms stay around for years because it’s too resource intensive to replace things that are already working.

This is especially noticeable if you work in a company that grows by acquisition. Rarely do different companies have the same systems, so every acquisition means having to adapt to a new set of software. I’ve had to deal with acquisitions in every job I’ve held, and over that time I’ve had to support five different database platforms. My focus has always been on SQL Server, but I’ve had to learn to adapt to these other technologies in order to support a migration, a legacy application, or a proof of concept system.

So what am I getting at here? Mostly, that it’s important not to become too focused on or attached to any one platform, particularly when it comes to databases. Only the very lucky will work with only SQL Server (or any other single database system) for their entire careers. Here are a couple of tips to keep in mind so that you don’t short circuit yourself when challenged with a new system.

Be Flexible

If a new technology comes into the company, don’t be afraid to take it on. As far as databases go, you have a couple of things working to your advantage. While some of the underlying mechanics may be different, such as storage and memory use, many of the fundamental concepts are the same. You’re still going to have tables, keys, and indexes. You’re still going to need to take backups, update statistics, and manage access. Chances are that this is most of what you will be required to do to support a new platform, so don’t be intimidated by how “foreign” a new system might be.

Be Aware

The next stage beyond flexibility is awareness. How many database systems can you name? Three? Four? Even if you just know the names of some other systems, you’ll have an edge. Take some time and research what else is available. Once you know some names, you can then take it a step further and get a general idea of the capabilities of these platforms, as well as what separates them from each other. If you can get to the point where you can describe a platform in a paragraph, not only will you be more comfortable learning to support a new system, you can also provide high level feedback on transitioning a platform into your enterprise.

Be Ready

Don’t get blindsided by a new situation. If you can get ahead of the game and at least have cursory knowledge of how to do things in a different database system, your life will be a whole lot easier. Expand on your basic knowledge and take time to figure out how to do some of the basics, like:

  • Taking a backup
  • Creating a database
  • Importing/exporting data

You never know if or when you will need to do any of these tasks. You don’t need to be a certified expert, but if you take the time to at least experiment with other platforms, then you won’t feel like you’re floundering for a life preserver when a new acquisition gets thrown at you. If you don’t know some of the basics when you have a new database thrust upon you, make it a priority to learn them first: take the basic tasks you perform on the database you’re familiar with and find out how to do them with the new technology.
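If it helps to have a reference point, here’s a rough T-SQL sketch of those basics on the platform most of us already know; the database name, table, and file paths (SandboxDB, dbo.Customers, the X:\ paths) are just placeholders, so adjust them for your own environment. Once you know how you’d do each of these in SQL Server, finding the equivalent commands on another platform is mostly a documentation exercise.

-- Creating a database
CREATE DATABASE SandboxDB;

-- Taking a backup
BACKUP DATABASE SandboxDB TO DISK = 'X:\Backups\SandboxDB.bak';

-- Importing data from a flat file into an existing table
BULK INSERT dbo.Customers FROM 'X:\Imports\customers.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- Exporting data (run from a command prompt, not a query window)
-- bcp "SELECT * FROM SandboxDB.dbo.Customers" queryout "X:\Exports\customers.csv" -c -t, -T -S MyServer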

As data professionals, we are expected to do our part for the company and take on these new challenges. Don’t be afraid of having to adapt to other database solutions. From my perspective, this is a great way to be the hero for your department, simply by taking a little time to get ahead of the game. And by knowing more about other technologies, you are also prepared to help plan your company’s acquisition and integration efforts, ultimately furthering your career and preparing yourself for the next stage.

Meme Monday – Dumb Questions

Here I am, jumping in again for this month’s Meme Monday.  For June, Tom LaRock(t|b) is asking for our dumb SQL questions.

Can you disable the transaction log?

The answer is, emphatically, “NO”. Most people I’ve run into ask this because they struggle with delays caused by locking and deadlocks, or because they’re looking for ways to speed up inserts. The fact is that without the transaction log and everything that comes with it, you lose the ACIDity (Atomicity, Consistency, Isolation, Durability) within the database. In other words, you lose all the elements that control and maintain the integrity of your data while thousands of different statements are trying to get at it.
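If you want to show someone that the log is always in play, one quick demonstration (just a sketch, and best run somewhere non-critical) is to look at log space usage; every database on the instance, even one in SIMPLE recovery, reports a log file that SQL Server is actively writing to:

-- Reports log size and percent used for every database on the instance
DBCC SQLPERF(LOGSPACE);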

So the next time someone comes to you asking about just disabling this key feature, send ’em to this great article by Steve Jones.

Database Oil Changes – Part 3

Captain’s Log

The last bit of maintenance is one I think most folks know about, and that is transaction log backups. The reason I consider this maintenance is that it’s how you manage the overall size and transactional state of your database when it’s not in SIMPLE recovery mode. Without the proper jobs set up, you could be impacting your overall space usage and recovery time on your servers.

SQL Server supports three recovery models:

  • Simple – The most basic recovery model; SQL Server regularly cleans up the transaction log. Transactional history is not maintained, so no point-in-time recovery is available. NOTE: SQL Server still uses the transaction log in the same way as in FULL mode.
  • Full – Transactional history is maintained in the log, as SQL Server only performs clean up on the transaction log when a transaction log backup occurs. Since transactional history is maintained, point-in-time recovery is possible.
  • Bulk logged – Similar to FULL, transactional history is retained in the log, but certain actions within the database are minimally logged, meaning that you don’t get the transactional detail, but must recover the entire action that occurred.
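Checking which model a database is in, and switching between them, is straightforward; the database name below is only an example:

-- Check the current recovery model for every database on the instance
SELECT name, recovery_model_desc FROM sys.databases;

-- Switch a database between models
ALTER DATABASE AdventureWorks SET RECOVERY FULL;
-- ALTER DATABASE AdventureWorks SET RECOVERY SIMPLE;
-- ALTER DATABASE AdventureWorks SET RECOVERY BULK_LOGGED;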

What all this means is that in FULL and BULK-LOGGED modes, our transaction logs are going to fill up and grow unless we back up our transactions and signal to SQL Server that it can clean up the log. If you neglect your log backups, you could easily fill up whatever drives your log files live on. Also, when you have to bring a database online, SQL Server will have to go through your log files to reconcile the data in your database with all of the transactions. The larger your log file, the longer this recovery phase will take.

Backing up your log is pretty simple:

BACKUP LOG AdventureWorks TO DISK='X:\Backups\logs.bak'

Another option is to use the Maintenance Plans. Whichever method you choose, you should be taking transaction log backups on a regular basis in your databases, dependent (again) on your particular situation. For example, we have a pretty active OLTP system and I have regular log backups scheduled to occur every 15 minutes. Your server might only be active enough to need one every hour or so. The factors to keep in mind when setting your schedule are:

  • How active is your database? The more transactions you have, the more information will be put into the log, meaning it is apt to grow more under a heavy transaction load.
  • How much time are you willing to lose? You’ll only be able to restore back to the point of your last transaction log backup, so keep that in mind. If your service level agreements (SLAs) allow you to lose up to an hour of data, then that’s the longest you should go between log backups.
  • Performance considerations. Remember, this IS a backup job, so your server will take a hit whenever you run it. Not a tremendous one, but if you make your backup jobs too frequent you could be causing unnecessary contention.

Also remember that log backups are dependent on your full backups. Transaction log backups need to be done in conjunction with full database backups to maintain your disaster recovery information.
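To make that dependency concrete, here’s a rough sketch of how the chain fits together; the file paths and the STOPAT time are placeholders, and restoring over an existing database would also need WITH REPLACE or a different target, so treat this as an outline rather than a ready-to-run script.

-- The full backup anchors the chain
BACKUP DATABASE AdventureWorks TO DISK = 'X:\Backups\AdventureWorks_full.bak';

-- Periodic log backups capture everything since the previous log (or full) backup
BACKUP LOG AdventureWorks TO DISK = 'X:\Backups\AdventureWorks_log1.trn';
BACKUP LOG AdventureWorks TO DISK = 'X:\Backups\AdventureWorks_log2.trn';

-- To recover, restore the full backup WITH NORECOVERY, then roll the logs forward
RESTORE DATABASE AdventureWorks FROM DISK = 'X:\Backups\AdventureWorks_full.bak' WITH NORECOVERY;
RESTORE LOG AdventureWorks FROM DISK = 'X:\Backups\AdventureWorks_log1.trn' WITH NORECOVERY;
RESTORE LOG AdventureWorks FROM DISK = 'X:\Backups\AdventureWorks_log2.trn'
    WITH STOPAT = '2011-06-15 12:30:00', RECOVERY; -- point-in-time recovery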

Engage!

As you’ve probably figured out, our maintenance revolves around managing changes in our databases. This, above all, should be the defining rule of thumb when creating and scheduling maintenance tasks. Again, if we think of it in terms of taking care of a car, you (hopefully) change your oil after a certain amount of use. If your car sits in the garage a lot, you probably won’t have to do much to keep it in shape, but if you’re out on the road every day, you definitely can’t let it slide.

It’s not at all different with your databases, so when you sit down to set up your jobs and schedules, ask yourself the following questions:

  • What’s my environment like? Is it an OLTP system with many, many, many small changes to the data? Is it a data warehouse where changes are few and far between, but significant when they occur?
  • What kind of activity does my server see? Are there periods of “down time” when I can run these kinds of jobs uninterrupted, or is that time minimal or non-existent?

Just as a database needs proper planning and design when you build out your tables and indexes, the upkeep of a database requires a similar amount of planning. If you do it right, though, you drastically reduce the factors that could impede your server’s performance. Hopefully, this also reduces the number of phone calls you get from users asking why the database is so slow!

Database Oil Changes – Part 2

She’s comin’ apart at the seams!

The other big maintenance task that should be regularly done on your databases is managing your index fragmentation. For those wondering, index fragmentation happens when page splits leave the data pages that make up your indexes logically disorganized, so that when the SQL Server engine goes to use the indexes, it has to make more calls to the physical disk to get the appropriate index pages. More calls means more work, which ultimately means your queries will perform slower.

For the full skinny on what index fragmentation is, take an hour and watch Paul Randal’s(b|t) MCM training video.

In order to address index fragmentation, you have two options:

  • REORGANIZE – This will reorganize the data pages in your index on the leaf level. While it will compact and organize data on that leaf level, header records won’t be adjusted, meaning you might still have index fragmentation.
  • REBUILD – This is exactly what it says it is, a complete rebuild of the index. This is pretty time and resource intensive (especially if your table is large), but once it’s done you will have a clean and shiny new index with little or no fragmentation.

The general rule of thumb on these two options is to REORGANIZE when your fragmentation is low, REBUILD when it gets higher. The numbers typically used are to REORGANIZE when your fragmentation is between 10% and 30%, REBUILD when you exceed 30%, but your mileage may vary.  As with many aspects of SQL Server, there are no real hard and fast rules, so make sure you evaluate your environment when deciding between these two options.
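In T-SQL, both options are variations on ALTER INDEX; the index and table names below come from the AdventureWorks sample database and are only examples, and the thresholds are the usual starting points rather than hard rules.

-- Light fragmentation (roughly 10-30%): reorganize the leaf level in place
ALTER INDEX IX_SalesOrderDetail_ProductID ON Sales.SalesOrderDetail REORGANIZE;

-- Heavier fragmentation (over roughly 30%): rebuild the index from scratch
ALTER INDEX IX_SalesOrderDetail_ProductID ON Sales.SalesOrderDetail REBUILD;

-- Or hit every index on the table at once
ALTER INDEX ALL ON Sales.SalesOrderDetail REBUILD;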

A Level 3 Diagnostic

So how do we find fragmentation? Fortunately, there’s a dynamic management object just for this purpose: sys.dm_db_index_physical_stats. CAUTION: This is one of the few super-costly DMOs out there, as it will scan all your index pages to find out the exact fragmentation. I recommend you read up on the different options before using it and do not run it on production servers during high activity times.

To get you started with this function, here’s a basic query to get your fragmented indexes in the current database:

SELECT
      db.name AS databaseName
    , ps.object_id AS objectID
    , OBJECT_NAME(ps.object_id, db.database_id) AS objectName
    , ps.index_id AS indexID
    , idx.name AS indexName
    , ps.partition_number AS partitionNumber
    , ps.avg_fragmentation_in_percent AS fragmentation
    , ps.page_count/128.0 AS index_size_mb --128 pages per MB
FROM sys.databases db
  JOIN sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, N'LIMITED') ps
      ON db.database_id = ps.database_id
  LEFT JOIN sys.indexes idx
      ON ps.object_id = idx.object_id
      AND ps.index_id = idx.index_id
WHERE ps.index_id > 0 --Skip heap tables
  AND ps.avg_fragmentation_in_percent >= 30 --Fragmentation >= 30%
ORDER BY fragmentation DESC
OPTION (MAXDOP 1);

Make it so

So armed with our tools, now we need to set up a regular job to find and defrag our indexes. The frequency with which you do this is completely dependent on your environment, so it might be once a day, once a week, or even once a month. The key, as with the statistics maintenance, is to do it in relation to how frequently your data changes. I try to rebuild my indexes daily (well, only the ones that need it) in my OLTP systems. In a data warehouse, I’ll set up my job to run whenever a major ETL load completes, as that will be just after my data has changed significantly.

And, because the SQL Server community is so awesome, you don’t even have to write your own script! Go take a look at Michelle Ufford’s(b|t) Index Defrag script. It covers many of the possible scenarios for index rebuilding and, with all its parameters, it’s very configurable. You can easily take this stored procedure, implement it into a SQL Agent job, and *BAM*, instant maintenance!

What this will do, as with statistics maintenance, is keep your data in the best shape possible for querying. By reducing the fragmentation in your indexes, you reduce the I/O hit you’ll take whenever you query against those indexes and have one less thing to worry about when troubleshooting performance problems.

One more post to go! We’ll wrap up with me talking (well, writing, really) a little bit about transaction log backups and then your overall general maintenance plan.