Art of the DBA

SQL Server

Effectively Using Invoke-SqlCmd

I really enjoy interacting with the SQL community on Twitter. Whether it is exchanging silly one-liners, sharing interesting articles and blog posts, or trying to answer a tricky problem, the conversation is always engaging. One such conversation last week involved a community member who was struggling with the output he wanted out of Invoke-SqlCmd. I was able to help him out, but the conversation got me thinking about how people use this cmdlet within their scripts and processes.

The focal point of the confusion is how Invoke-SqlCmd relates to sqlcmd.exe. Sqlcmd has been around since SQL 2005, replacing the antiquated osql executable. Many administrators have gotten used to the robustness of sqlcmd and how they can leverage it for their backend automation. I’ve used sqlcmd for many tasks over the years.

Then along came Powershell and Invoke-SqlCmd. This cmdlet, included with the SQL Server provider, is intended to cover a lot of the functionality that an administrator can find in sqlcmd, but now as part of Powershell. What people get hung up on is that Invoke-SqlCmd is not a direct replacement of sqlcmd and that it can’t (and shouldn’t) be used in the same way.

Thinking Differently

The challenge when thinking about sqlcmd versus Invoke-SqlCmd focuses on outputs. One of the cardinal rules of Powershell is that everything is an object. When comparing sqlcmd with Invoke-SqlCmd, this difference becomes quite stark. First, let’s look at sqlcmd’s output, capturing it into a variable and then examining the types that make up the output:

[Screenshot: sqlcmd output]

Using the GetType() method, we can see that the collected output of sqlcmd is an array, with each element of the array being a string. This means that each line of output from sqlcmd (including the header and the row count output) is a single string, with all column elements concatenated into that string. As you can imagine, this is clumsy to handle and would require a lot of ugly string parsing to separate out the elements.
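
For reference, a quick way to reproduce what that screenshot showed (the query and instance name here are only placeholders) is to capture the sqlcmd output and inspect it:

#Capture sqlcmd.exe output into a variable and examine the types
$out = sqlcmd -S localhost -Q 'SELECT 1'
$out.GetType()      #System.Object[] - the whole output is an array
$out[0].GetType()   #System.String - each line (headers, data, row count) is just a string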

Now let’s look at the same command using Invoke-Sqlcmd:

[Screenshot: Invoke-SqlCmd output]

Right off the bat, the output is noticeably different. The entire output is now a single DataRow object, with the individual column being an Int32 type. You'll also note that there is no item count. This is because we have a singular object as output (which has no count or length properties). If we further investigate the output using Get-Member, we'll find that the columns of our query are actual properties of the output object (in this case, Column1, since we didn't name our column).
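
Again, a minimal sketch of that same check against Invoke-SqlCmd (placeholder query and instance):

#Invoke-SqlCmd hands back objects instead of text
$result = Invoke-Sqlcmd -ServerInstance localhost -Query 'SELECT 1'
$result.GetType()           #System.Data.DataRow
$result.Column1.GetType()   #System.Int32 - the unnamed column comes back as Column1
$result | Get-Member -MemberType Properties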

A New Perspective

This helps us because we can now start treating our outputs as data, not just a mass of text that has to be parsed. For example, let’s consider the following query:

SELECT
SERVERPROPERTY('InstanceName') as InstanceName
,SERVERPROPERTY('MachineName') as MachineName
,SERVERPROPERTY('ProductVersion') as SQLVersion
,SERVERPROPERTY('Edition') as SQLEdition
,SERVERPROPERTY('ProductLevel') as SQLReleaseLevel

If you want to make use of these values in Powershell and you use sqlcmd, you would have to perform all sorts of string manipulations to separate out the values. To keep it easy, we'd likely make a separate call to sqlcmd.exe for each value. Using Invoke-SqlCmd changes the game, allowing you to easily access each column as a property of your output:

$sql=@"
SELECT
SERVERPROPERTY('InstanceName') as InstanceName
,SERVERPROPERTY('MachineName') as MachineName
,SERVERPROPERTY('ProductVersion') as SQLVersion
,SERVERPROPERTY('Edition') as SQLEdition
,SERVERPROPERTY('ProductLevel') as SQLReleaseLevel
"@

$sqlstats = Invoke-Sqlcmd -Query $sql

cls

'Instance: ' + $sqlstats.InstanceName
'Host: ' + $sqlstats.MachineName
'Product Info: ' + $sqlstats.SQLVersion +'('+$sqlstats.SQLReleaseLevel+') - ' + $sqlstats.SQLEdition


[Screenshot: SERVERPROPERTY script output]

This ability to reference columns as properties gets even more powerful when we deal with larger data sets for our scripts. A common technique I use is to leverage a collection of databases to drive work I need to do. With my collection as a series of objects, each row element is easier to access and make use of for my processing. To see a detailed example of this, run the following commands and study the output of the Get-Member call:

$dbs = Invoke-Sqlcmd -Query 'select name,recovery_model_desc,page_verify_option_desc from sys.databases;'
$dbs | Get-Member
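
To illustrate the idea (the logic here is purely an example), once those rows come back as objects we can drive work straight off their properties:

#Each column from the query is now a property on the row object
foreach($db in $dbs){
    if($db.recovery_model_desc -eq 'SIMPLE'){
        Write-Output "$($db.name) is in SIMPLE recovery with $($db.page_verify_option_desc) page verification."
    }
}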

Data Driven Decisions

Invoke-SqlCmd lets you integrate your data with your scripts. When you make your outputs discrete objects, data becomes readily accessible for your automation. It allows you to move those clunky parts of your processing out of T-SQL and into Powershell, making your scripts more robust. You can use T-SQL for what T-SQL is good at, allowing Powershell to take over where appropriate.

This does not replace sqlcmd. Another key difference is that Invoke-SqlCmd isn't interactive, where sqlcmd is. If you need a command line tool for using SQL Server (you're on Server Core or you need an emergency connection), sqlcmd will be your weapon of choice.

The fuzzy area is around compiled SQL scripts you might use. Both Invoke-SqlCmd and sqlcmd.exe can accept these scripts and allow you to perform sqlcmd's variable replacement for parameterizing them. Choosing between the two, in my mind, comes back to outputs. If your script output is simply a matter of logging the entire execution for later review, sqlcmd is a perfectly acceptable method. However, if you need to integrate your scripts into a larger process and make use of that output within the script execution, Invoke-SqlCmd could be your better option.
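
To show the overlap, here is a rough sketch of the same parameterized script run both ways; the script name and variable are hypothetical:

#sqlcmd.exe style - output comes back as plain text
sqlcmd -S localhost -i .\deploy.sql -v TargetDB="AdventureWorks"

#Invoke-SqlCmd style - same $(TargetDB) token in the script, but output comes back as objects
Invoke-Sqlcmd -ServerInstance localhost -InputFile .\deploy.sql -Variable "TargetDB=AdventureWorks"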

Powershell is about options. Everyone has their own methods and patterns; the idea behind Powershell scripting is not to give you a one-size-fits-all solution, but to provide you with the components for building your scripts. The challenge we all have is to make sure we're using the right components in the right way and to avoid the proverbial problem of driving a screw with a hammer. When presented with a new tool set, don't try to use it the same way as other tools you've had in the past. Instead, learn the tool on its own so you can make the best use of it for your process.

Using #Powershell to Measure SQL Transactions

A common question that comes up for SQL Server administrators is "how do I measure performance?" This is a tough question, and there are a variety of ways to do it.  Just off the top of my head, you could measure wait statistics, logical reads and writes for a query, or query duration. When we talk about system performance, we need more general metrics, and the one that business folks tend to look at is transactions per second.

For SQL Server, this metric is captured within the SQL Server performance counters.  There are actually two counters you could use: Batch Requests/sec and Transactions/sec. There are some significant differences between these two counters, and this SQL Magazine article covers them both in great detail.  For my work, I like to go with Batch Requests/sec because it captures more of the transactions going on within the instance, making it a more informative value for measuring system performance.

What's the best way to capture this? Setting up a perfmon data collection is pretty easy, but then we have to create the collection, manage the output, and make sense of it. This can be a bear to handle and takes a fair amount of manual work to get in place, especially if we just need a quick check on what's going on in the system.

Powershell provides the Get-Counter cmdlet to query Windows perfmon counters, allowing us rapid access to the information without a lot of hassle. With this cmdlet we can easily measure system performance:

Get-Counter -Counter '\SQLServer:SQL Statistics\Batch Requests/sec'

The output is a PerformanceCounterSampleSet object that is a collection of samples for the call. Using the above command only gives us a single sample, so that makes it simple. We’ll talk about multiple samples in a bit.

 

The key properties of a sample to focus on are:

  • Timestamp – When the sample was collected
  • RawValue – Running cumulative counter for the object
  • CookedValue – The “actual” value, calculated from the previous RawValue and the current RawValue.

For reference, if you've ever looked at these values in sys.dm_os_performance_counters, you have seen the RawValue. This means you are also probably used to doing your own math to measure these values. Powershell and the perfmon object model will do this math for us, making the counters easier to use.
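
To see those properties side by side, we can grab a few samples and compare the raw and cooked values (the counter path below assumes a default instance):

#Three samples, five seconds apart
$set = Get-Counter -Counter '\SQLServer:SQL Statistics\Batch Requests/sec' -SampleInterval 5 -MaxSamples 3
$set | ForEach-Object { $_.CounterSamples | Select-Object Timestamp,RawValue,CookedValue }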

How do we leverage this for quickly measuring our instance’s transaction performance? I’ve put together the following function to do this for me.  The approach is simple: I give it an instance name and a duration in seconds, and my code will build a Get-Counter call to calculate the Batch Requests/sec information on my instance.

function Get-SQLTxnCount{

param([string]$InstanceName='localhost'
    ,[int]$DurationSec)

#Resolve the physical host name for the perfmon call
$smo = New-Object ('Microsoft.SqlServer.Management.Smo.Server') $InstanceName
$ComputerName = $smo.ComputerNamePhysicalNetBIOS

#One sample every 5 seconds for the requested duration
$Samples = [Math]::Ceiling($DurationSec/5)
$output = New-Object System.Object
#Named instances use a different counter path than the default instance
if($smo.InstanceName.Length -gt 0){
    $Counters = @('\MSSQL$'+$smo.InstanceName+':SQL Statistics\Batch Requests/sec')
}
else{
    $Counters = @('\SQLServer:SQL Statistics\Batch Requests/sec')
}

$Txns = Get-Counter -ComputerName $ComputerName -Counter $Counters -SampleInterval 5 -MaxSamples $Samples
$Summary = $Txns.CounterSamples | Measure-Object -Property CookedValue -Minimum -Maximum -Average

$output | Add-Member -MemberType NoteProperty -Name InstanceName -Value $smo.DomainInstanceName
$output | Add-Member -MemberType NoteProperty -Name AvgTxnPerSecond -Value $Summary.Average
$output | Add-Member -MemberType NoteProperty -Name MinTxnPerSecond -Value $Summary.Minimum
$output | Add-Member -MemberType NoteProperty -Name MaxTxnPerSecond -Value $Summary.Maximum
return $output
}

Get-SQLTxnCount -InstanceName 'PICARD' -DurationSec 30

This function expands on the simple Get-Counter call in a couple of ways.  First, I make sure I've narrowed the counters to only the specific named instance I want to measure. The second change is the duration of the test and the calculation of how many 5 second samples will be collected over that duration. Finally, I create a custom output with the Measure-Object cmdlet to show the minimum, maximum, and average Batch Requests/sec for my samples.  This all comes together as a simple transactional measurement for my instance.

This isn’t a replacement for detailed benchmarks and analysis. For this, I recommend some of what Laerte Junior(@LaerteSQLDBA) has written. He has a much more detailed overview of Get-Counter and using it with SQL Server that I would highly recommend (P.S. His article gave me my start on this function).  Benchmarks are extremely important and necessary to make the information coming out of my function useful.

Just like a carpenter has his square or tape measure, this function lets me measure my SQL Server with just a few keystrokes. I still need the full context of what's happening on my instance, but at least I'll know its dimensions without a lot of effort or additional work. I challenge you to start thinking about your own Powershell in the same way. Don't think of simply creating scripts; consider how you can take your work and build out your own squares, calipers, or other tools to improve your skills as a craftsman.

Building Availability Groups with #Powershell

A couple weeks ago, I went over automating your SQL Server builds. That’s the kind of process that will benefit many SQL Server administrators. But why stop there?  I say that the more we can automate, the better.  So now let’s talk about automating another aspect of building SQL Servers: availability groups.

Availability groups were the killer feature in SQL 2012 and saw more enhancements with SQL 2014. In my opinion, this is a key feature that allows SQL Server to step into the horizontally scalable world. The problem is that it is not an easy feature to set up or configure. Availability groups have a lot of pitfalls and gotchas to navigate. This is where automation comes in as our ally to build a consistent, repeatable process to deal with these hurdles. Remember, automation is primarily about consistency and we can achieve that by scripting out as much of our build process as possible.

As before, let’s talk about the process we’re going to automate:

  1. Build a Windows failover cluster consisting of two nodes.
    1. The two nodes will already have SQL Server installed.
    2. We will add a file share witness into the cluster (because an even number of quorum votes is bad).
    3. The cluster will be simple, so we will skip things like heartbeat networks and other more advanced configurations.
  2. Build an availability group across both nodes of the cluster.
    1. The availability group will use standard endpoints.
    2. Both nodes will be configured for synchronous commit and automatic failover.
    3. A listener will be created with a static IP address.

Building the Cluster

To ease the scripting process, Microsoft provides several cmdlets in the Failover Cluster module.  Using these, we can create our cluster.

Import-Module FailoverClusters

New-Cluster -Name 'NC1701' -StaticAddress '192.168.10.100' -NoStorage -Node @('PICARD','RIKER')| Set-ClusterQuorum -FileShareWitness '\\hikarudc\qwitness'

As you can see, the cluster creation is pretty easy. Keep in mind this is an extremely simple cluster with none of the usual items we include in production. While you can use this process for your production environment, make sure you work with a clustering expert to define all the additional configurations you will need to build in to make your cluster stable. I also want to call out that I created the fileshare for the witness outside of this script as well.

Before we proceed, I want to validate the cluster.  As with everything else we will use Powershell:

Test-Cluster -Name 'NC1701'

Firewall Ports

An easy thing to overlook is the firewall ports for availability group communication. I know, I forget it all the time! We need to open up 1433 for SQL Server and listener communication and port 5022 for the availability group endpoints.  Some quick Powershell resolves this for us:

Invoke-Command -ComputerName RIKER -ScriptBlock {New-NetFirewallRule -DisplayName 'SQL Server' -LocalPort 1433 -Direction Inbound -Protocol TCP}
Invoke-Command -ComputerName PICARD -ScriptBlock {New-NetFirewallRule -DisplayName 'SQL Server' -LocalPort 1433 -Direction Inbound -Protocol TCP}

Invoke-Command -ComputerName RIKER -ScriptBlock {New-NetFirewallRule -DisplayName 'AG Endpoint' -LocalPort 5022 -Direction Inbound -Protocol TCP}
Invoke-Command -ComputerName PICARD -ScriptBlock {New-NetFirewallRule -DisplayName 'AG Endpoint' -LocalPort 5022 -Direction Inbound -Protocol TCP}

Building the Availability Group

Microsoft provides some cmdlets for building availability groups in the SQLPS module. The process is fairly straightforward, since we’re only declaring two synchronous nodes with automatic failover. If we were to have additional nodes, we would need to put in additional logic for determining which nodes would perform which roles.

Import-Module SQLPS -DisableNameChecking
$AGName = 'ENTERPRISE'
$PrimaryNode = 'PICARD'
$IP = '192.168.10.101/255.255.255.0'
$replicas = @()

$cname = (Get-Cluster -name $PrimaryNode).name
$nodes = (get-clusternode -Cluster $cname).name

$sqlperms = @"
use [master];
GRANT ALTER ANY AVAILABILITY GROUP TO [NT AUTHORITY\SYSTEM];
GRANT CONNECT SQL TO [NT AUTHORITY\SYSTEM];
GRANT VIEW SERVER STATE TO [NT AUTHORITY\SYSTEM];
"@

foreach($node in $nodes){
    Enable-SqlAlwaysOn -Path "SQLSERVER:\SQL\$node\DEFAULT" -Force
    Invoke-Sqlcmd -ServerInstance $node -Database master -Query $sqlperms
    $replicas += New-SqlAvailabilityReplica -Name $node -EndpointUrl "TCP://$($node):5022" -AvailabilityMode 'SynchronousCommit' -FailoverMode 'Automatic' -AsTemplate -Version 12
}

New-SqlAvailabilityGroup -Name $AGName -Path "SQLSERVER:\SQL\$PrimaryNode\DEFAULT" -AvailabilityReplica $replicas

$nodes | Where-Object {$_ -ne $PrimaryNode} | ForEach-Object {Join-SqlAvailabilityGroup -path "SQLSERVER:\SQL\$_\DEFAULT" -Name $AGName}

New-SqlAvailabilityGroupListener -Name $AGName -staticIP $IP -Port 1433 -Path "SQLSERVER:\Sql\$PrimaryNode\DEFAULT\AvailabilityGroups\$AGName"

There are a couple cmdlets we’re using to make this all work:

  • Enable-SQLAlwaysOn enables Always On with the SQL Server service. This must be done before we can create our AGs.
  • New-SqlAvailabilityReplica creates a SMO object that represents our node for the availability group. We signify that we are creating SMO objects by using the -AsTemplate switch. Without that switch, the cmdlet would try creating the actual replica, which we can’t do since the availability group doesn’t actually exist at this point.
  • New-SqlAvailabilityGroup requires two items: the primary node the availability group will be created on (declared as a SQL provider path) and the collection of replica objects.  Running the cmdlet creates the availability group on the primary with those replicas defined.
  • Join-SqlAvailabilityGroup is used to join the replicas to the availability group. Weirdly enough, when you create the availability group, it won’t join the other replicas, so we have to take an additional step to join the non-primary nodes.

You’ll note we wrap the replica process in a foreach loop built on the Get-ClusterNode output. I use this approach so that, no matter how many nodes my cluster contains, I can add them all without explicit calls to the node names. The loop will add the SMO objects to a collection that will eventually represent all the nodes for the availability group (in this case, only two).

A gotcha to call out is the SQL statement I execute on each node. When working on this, I kept encountering error 41131. This surprised me, because I believe these permissions should be granted when you enable Always On for the service. However, most of the time this does not happen, so I've included the SQL statement to guarantee that my permissions are correct for creating the availability group.

Once we’ve created the availability group, we need to create the listener. To do this, we need the listener name, the port we will use, the static IP that will be assigned to the listener name, and the availability group (declared as the provider path). To ease the implementation, I use the availability group name as the listener name and port 1433. This actually takes the longest of the steps, because the cmdlet has to validate the IP isn’t already in use, but as long as the address is available your listener will be created and your availability group built.

At this point, you should have a functional availability group and can add databases to it.

Another approach

The interesting thing about these cmdlets is they all work using the SMO and related objects. That means that if you want a more controlled approach where you get down and dirty with the code, you can go that route. Thomas Stringer(@sqllife) has a great post on this method and I actually was going down that route originally.

The question I had to ask myself was whether it was worth reinventing the wheel for my availability group builds. I'm not a big fan of how the SQLPS cmdlets rely on the SQL Server provider. Additionally, if you build your script using Thomas' method, your script will work anywhere you have .Net and doesn't rely on you installing the SQL Server client tools to get SQLPS. However, using the SMO/.Net method means I have to do a lot more work to get my script running, where the SQLPS cmdlets have taken care of most of that work for me.

Which approach should you use? That’s for you to decide, but I encourage you to look at both methods and evaluate which approach works better for you. When all was said and done, I went with the SQLPS cmdlets because it meant I didn’t have to spend a lot of time writing additional code.

Just as with our SQL Server build process, we’re building consistency and repeatability into our environment. The added benefit is, using this script, I can create my availability group in about a minute. While I may not be creating availability groups often, my script ensures that when I do create them I can maintain my build standards and keep my systems aligned.

 

#Powershell and Automating SQL Server Builds-Part 3

Hopefully you've seen over the last two posts some basic techniques for automating and codifying your SQL Server builds.  There are just two more items I want to cover here, questions you're probably asking yourself already.

The build script

Let’s start with reviewing the whole build script to this point. While I wouldn’t recommend running this in production yourself, the finalized version looks something like this:

#Run this from the install binary location
[Reflection.Assembly]::LoadWithPartialName("System.Web")
$SAPassword = [System.Web.Security.Membership]::GeneratePassword(16,4)
.\setup.exe /CONFIGURATIONFILE=<config file location> /SAPASSWORD=$SAPassword /IACCEPTSQLSERVERLICENSETERMS

#Configure the OS
New-Item -ItemType Directory G:\MSSQL\Data
New-Item -ItemType Directory H:\MSSQL\Logs
New-Item -ItemType Directory I:\MSSQL\TempDB

#Configure Instant File Initialization
$svcaccount = 'SDF\sqlsvc'
secedit /export /areas USER_RIGHTS /cfg C:\templocalsec.cfg
$privline = Get-Content C:\templocalsec.cfg | Select-String 'SeManageVolumePrivilege'
(Get-Content C:\templocalsec.cfg).Replace($privline,"$privline,$svcaccount") | Out-File C:\newlocalsec.cfg
secedit /configure /db secedit.sdb /cfg C:\newlocalsec.cfg

#Open the firewall for 1433
New-NetFirewallRule -DisplayName "Allow SQL Server" -Direction Inbound -LocalPort 1433 -Protocol TCP -Action Allow

#Set Server configurations
$smosrv = new-object ('Microsoft.SqlServer.Management.Smo.Server') localhost
$smosrv.Configuration.MaxServerMemory.ConfigValue = 4000
$smosrv.Configuration.MinServerMemory.ConfigValue = 2000
$smosrv.Configuration.MaxDegreeOfParallelism.ConfigValue = 4
$smosrv.Configuration.OptimizeAdhocWorkloads.ConfigValue = 1
$smosrv.DefaultFile = 'G:\MSSQL\Data'
$smosrv.DefaultLog = 'H:\MSSQL\Logs'
$smosrv.Alter()

#disable sa
$smosrv.Logins['sa'].Disable()

Next Steps

After this, we should consider deploying maintenance jobs, restoring databases, or building out things like availability groups. While I don’t cover these here, keep in mind these tasks can be scripted out as well.  I encourage you to consider how you could do this.  For example, if you leverage either Ola Hallengren’s scripts or the new Minion Reindex from the Midnight DBAs(@MidnightDBA), you have another piece you can automate. The key is that you have a repeatable process.
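
As one hedged example of what that could look like, deploying Ola Hallengren's maintenance solution during a build might be as simple as pointing Invoke-SqlCmd at the downloaded script (the path here is hypothetical):

#Deploy the maintenance solution as part of the build; adjust the path to wherever you keep the script
Invoke-Sqlcmd -ServerInstance localhost -Database master -InputFile 'C:\Build\MaintenanceSolution.sql' -QueryTimeout 600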

At Xero we have all our maintenance jobs as part of our administrative database deployment. This is handled through SQL Server Data Tools and .dacpacs. We achieve consistency and deployment speed by managing that solution and deploying the administrative database project. By maintaining the database in source control we manage our tool set, keeping it standardized.  When we add or update it, we can apply the changes out to our environment. This also means when we build a new server, we have a standardized way to install our admin tools with a minimum of muss and fuss.

The Point

Automation is more about consistency than speed, but speed is a nice side benefit. By assembling these components into a single script, we have a repeatable build process for any SQL Server in our environment. Our instances will be built the same way, every time, so long as we aren’t changing our script. The bonus is, since it’s all scripted, there’s no fumbling with wizards, dialog boxes, and making sure we type in the right values, so it all just happens. And in a matter of minutes. At Xero, I can take a server from nothing to ready for databases in about 20 minutes using these techniques.

Now, we could go pretty crazy with how we build our automation script. In fact, there are some pretty cool tools out there that will help you with this. I'll let you explore those on your own. Just make sure that, before you do, you know all the steps it takes to build your SQL Server. It's fine and dandy to be able to script things out, but without a plan or process to automate first, most of the scripts you could write won't do you much good because you can't use them again.

P.S. HUGE thanks out to Melody Zacharias(@SQLMelody) for helping me with this series of posts.

#Powershell and Automating SQL Server Builds-Part 2

When last we left our heroes, we were discussing how best to go about building SQL Servers quickly and consistently using Powershell.  The thing is, we hadn’t quite gotten to the Powershell part yet and only really covered the build process plus installing SQL Server.  Let’s change that and move on to the next steps.

Configure the OS

It’s time to flex Powershell. Because Powershell works directly with the OS, it makes it really easy to perform tasks that SQL Server isn’t so good at. Let’s start with creating three directories for our database files:

New-Item -ItemType Directory G:\MSSQL\Data
New-Item -ItemType Directory H:\MSSQL\Logs
New-Item -ItemType Directory I:\MSSQL\TempDB

This assumes that our sysadmins have built the box with those LUNs. There are, of course, ways to script that out as well, but we’ll skip over that for the sake of brevity.

Enabling Instant File Initialization is a little tricky, because editing the local security policy is not something Powershell can do directly. Insert a harumph here. Anyway, we can do some tricks using the secedit tool. This technique, borrowed from Kyle Neier(@Kyle_Neier), basically dumps the local security policy out to a text file, updates it, then re-imports it to the local security policy. Kyle provides a full function to use, but you can boil it down to the following script:

$svcaccount = 'SDF\sqlsvc'
secedit /export /areas USER_RIGHTS /cfg C:\templocalsec.cfg
$privline = Get-Content C:\templocalsec.cfg | Select-String 'SeManageVolumePrivilege'
(Get-Content C:\templocalsec.cfg).Replace($privline,"$privline,$svcaccount") | Out-File C:\newlocalsec.cfg
secedit /configure /db secedit.sdb /cfg C:\newlocalsec.cfg

As a quick aside, you might want to consider working with your sysadmins to actually set this permission in your domain Group Policy. That way, you'll get it without having to configure it yourself.

Finally, we want to make sure we open the firewall for port 1433:

New-NetFirewallRule -DisplayName "Allow SQL Server" -Direction Inbound -LocalPort 1433 -Protocol TCP -Action Allow

Configuring SQL Server

Now with the OS configured, it's time to configure our SQL Server. For this, we're going to leverage our friend and pal, the .Net SMO. As with many things in the SMO, the beauty of it is that we get a consistent API to interface with our SQL Server. For our purposes, we want to look at the Configuration collection of the Microsoft.SqlServer.Management.Smo.Server object. In here we can find everything you'd find in the sys.configurations view through T-SQL. Now let's configure our max and min memory, max degree of parallelism, and optimize for ad hoc queries.

$smosrv = new-object ('Microsoft.SqlServer.Management.Smo.Server') localhost
$smosrv.Configuration.MaxServerMemory.ConfigValue = 4000
$smosrv.Configuration.MinServerMemory.ConfigValue = 2000
$smosrv.Configuration.MaxDegreeOfParallelism.ConfigValue = 4
$smosrv.Configuration.OptimizeAdhocWorkloads.ConfigValue = 1
$smosrv.Alter()

Simple enough, right? You can get pretty fancy and add all sorts of logic to calculate these values, giving you a flexible setup script.
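
As a sketch of that kind of logic (the 4GB-or-10% reserve here is just an example policy, not a recommendation), you could base max server memory on the host's RAM, which the SMO reports in MB:

#Example only: reserve 4GB or 10% of RAM for the OS, whichever is larger
$totalMB = $smosrv.PhysicalMemory
$reserveMB = [Math]::Max(4096,[Math]::Floor($totalMB * 0.1))
$smosrv.Configuration.MaxServerMemory.ConfigValue = $totalMB - $reserveMB
$smosrv.Alter()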

You could just as easily write these as a T-SQL script for a series of sp_configure executions to set your values. I've done it and know plenty of people who do. There are two reasons I'm moving away from this. The first is that some of my configurations, such as the default database directories, need some specialized techniques (usually involving xp_regwrite or something similar) to implement in T-SQL. I don't like that approach because the extended stored procedures are usually unsupported and can have lots of issues. Powershell is much more graceful and (the second reason), using it across the board gives me a common interface for my configuration management.

In fact, let's look at the default database directory settings. As stated, if you want to do this programmatically in T-SQL, you are going to need to use xp_regwrite to update the registry keys. Using the SMO is much, much easier:

$smosrv.DefaultFile = 'G:\MSSQL\Data'
$smosrv.DefaultLog = 'H:\MSSQL\Logs'
$smosrv.Alter()

Piece of cake, right? One last thing to do now, and that is disable the sa account:

$smosrv.Logins['sa'].Disable()

And we’re done!

To be concluded…

This builds the core of our server, but there’s a couple more odds and ends to discuss.  We’ll wrap up tomorrow with a few things, like how to handle maintenance jobs and other administrative tools.  We’ll also talk a little bit about the “why” of all of this.  Stay tuned for the thrilling conclusion!

(Well, I think it’s thrilling.)

#Powershell and Automating SQL Server Builds

Last week, my friend John Morehouse(@SQLrUs) and I had a bit of twitter banter about breaking and fixing things. Of course, my joke is usually “I have a script for that”, to which John replied:

Amusingly enough, I kind of do. It's not quite the way you would think, though. Lately I've had a different attitude about "fixing" things in my environments, because sometimes it takes waaaaaaay more time to try and repair stuff than to just build it from scratch. Especially if you are scripting out your processes and automating your builds. This is what I would say to John: if he walked into my data center and abused my poor server with a wooden implement, I'd simply build a new one.

Before we get much further here, let’s talk about the fine print. Everything I’m going to talk about doesn’t get around backups, DBCC checks, or other processes we need to address as DBAs. We’re not talking about the data here, simply the server that hosts it all. It’s still vital that you do your due diligence to protect your data.

Building SQL Servers

So let’s talk about building our SQL Servers. Firstly, you should always have a defined build process for your environments even if you haven’t had a chance to automate or script it. Checklists and runbooks are the starting points for consistency in any IT shop and if you don’t have them, make them.

For our purposes, let’s assume the following build process. This is pretty basic and there’s lots of possible elements we could have, but I want to start simple. Here’s what we’re going to do:

  1. Install SQL Server
  2. Configure the OS
    1. Directory structures
    2. Instant File Initialization
    3. Open the firewall for SQL Server
  3. Configure SQL Server
    1. Configure max and min server memory, our maxdop setting, and optimize for adhoc queries
    2. Set default database directories
    3. Disable the sa account

I'm not going to give you a comprehensive script at this time, but instead show you the how, so you can use the same tricks in your environment. The reason for this is that everyone builds their servers a little differently, but if you define and script your process, you can have your servers built and configured in 15-20 minutes.

Install SQL Server

This step is actually the easiest and has been around for a while, but I want to add a few twists to it. I always install SQL Server using a configuration.ini file. By using this approach, you can install your SQL Servers the same way, every time. You’ll find plenty of articles on the Googles on how to do this, so I won’t dive into it, but I want to address the matter of passwords.
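
I won't walk through the file itself, but for orientation, a bare-bones fragment written out via a here-string might look something like this (every value is a placeholder; the real file is usually generated by running setup once and keeping the ConfigurationFile.ini it produces):

#Illustrative only - placeholder values for a minimal configuration file
$ini = @"
[OPTIONS]
ACTION="Install"
FEATURES=SQLENGINE
INSTANCENAME="MSSQLSERVER"
SQLSVCACCOUNT="SDF\sqlsvc"
SQLSYSADMINACCOUNTS="SDF\DBAdmins"
QUIET="True"
"@
$ini | Out-File .\ConfigurationFile.ini -Encoding ascii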

The problem with using the .ini file is most folks (and this includes me) will put the passwords in plaintext in the file. This has numerous bad security implications. Depending on how accessible this file is, you could be giving away your service account and sa passwords and not even know it.

One step I take to manage this is I no longer include my sa password in the file, but instead randomly generate it when I install SQL Server. I went down this path because my friend Argenis Fernandez(@DBArgenis) got me thinking about how the sa account really gets used. The fact of the matter is that I rarely, if ever, log in to the server as sa. In my opinion, no one should even be using that account. This is why I will disable the account on my servers. I’m going to forgo the debate about whether this is a good idea or not (plenty of opinions on either side), but instead just work on the assumption that I need to provide some password to the account for the install. I still want a strong password, though, so I use the following code in Powershell to create my password and pass it as a command line switch. Feel free to read more up on the System.Web.Security.Membership GeneratePassword method for the details on how this works:

[Reflection.Assembly]::LoadWithPartialName("System.Web")
$SAPassword = [System.Web.Security.Membership]::GeneratePassword(16,4)
.\setup.exe /CONFIGURATIONFILE=<config file location> /SAPASSWORD=$SAPassword /IACCEPTSQLSERVERLICENSETERMS

This gets my install complete and, honestly, is the longest part of the process. There are ways to cheat around that, but I’ll save that for another post.

To Be Continued…

When I first started writing this, I figured it would be a quick post. Suddenly I found I was at 5 pages and going. So we’re going to break it up to make it a little more manageable.  Tune in tomorrow when we cover configuring the OS and SQL Server itself.

Growing Log Files Responsibly

One of the commonly referenced blogs out there is Kimberly Tripp's(@KimberlyLTripp) seminal post on Virtual Log Files.  If you haven't read it yet, I highly recommend it.  It provides a good description of one of those quirks with your log files and how you should manage it.

Of course, while folks are familiar with this, it’s often not a practice that gets followed.  Why?  Because it’s hard to manage.  Usually it’s just easier to set the autogrowth settings and let the log find its own size.  This is where I hope I can help you out.

Just like anything we do, if we need to execute it more than once, it should be scripted.  As such, I figured I'd share a T-SQL script I've put together for growing my log files out in 8GB chunks.  I've used this quite a bit when managing my own systems, whether I need to expand a log file or resize it after an accidental blowout.

--Set variables
declare @dbname sysname = 'test'
,@limit int = 32000
,@filename sysname
,@currsize int
,@growth int
,@v_sql nvarchar(1000)

--Get initial settings
select @currsize = convert(int,floor(size/128.0))
,@filename = name
,@growth = 8000
from sys.master_files
where database_id = db_id(@dbname) and file_id = 2

--Grow file
while @currsize < @limit
begin
  select @growth = CASE WHEN @limit - @currsize < 8000 THEN @limit - @currsize ELSE 8000 END
  select @currsize += @growth
  select @v_sql = N'alter database '+@dbname+' modify file (name='+@filename+',size='+convert(nvarchar(10),@currsize)+'MB);'
  exec sp_executesql @v_sql
end

This is a fairly handy script and pretty easy to follow.  I set the database name and limit, then let the loop keep adding 8000 MB chunks until I get to the size I want.  No strange black voodoo magic, right?  That part is next.

Hopefully everyone reading this blog knows that I’m a big fan of Powershell.  If not, well…I’m a big fan of Powershell.  So I gave myself a little challenge to re-write that T-SQL script as a Powershell function to see how it would work.  Here’s the end result:

#load assemblies
[System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SMO') | out-null
$ErrorActionPreference = 'Inquire'

function Expand-SqlLogFile{
  param(
  [string]$InstanceName = 'localhost',
  [parameter(Mandatory=$true)][string] $DatabaseName,
  [parameter(Mandatory=$true)][int] $LogSizeMB)

#Convert MB to KB (SMO works in KB)
[int]$LogFileSize = $LogSizeMB*1024

#Set base information
$srv = New-Object -TypeName Microsoft.SqlServer.Management.Smo.Server $InstanceName
$logfile = $srv.Databases[$DatabaseName].LogFiles[0]
$CurrSize = $logfile.Size

#grow file
while($CurrSize -lt $LogFileSize){
  if(($LogFileSize - $CurrSize) -lt 8192000){$CurrSize = $LogFileSize}
  else{$CurrSize += 8192000}
  $logfile.Size = $CurrSize
  $logfile.Alter()
  }
}
#Call the function
Expand-SqlLogFile -DatabaseName 'test' -LogSizeMB 35000

You'll note that it's about the same number of lines and does more or less the same thing.  For those of you familiar with the SMO, you'll also know that behind the scenes all it's doing is running T-SQL.  So why write it this way?

First, it's an interesting challenge just to help understand Powershell and how to write it.  It's these kinds of challenges that help me learn and expand my knowledge.  The second, though, is to explore the differences between the methods.  We know that in the IT field you can solve a problem in any number of ways.  While I love Powershell, I wouldn't use it everywhere.  The only way to know where to draw the line, though, is to experiment and solve the same problem in different ways.

A PoSH way to skin the cat

Today my friend John Morehouse(@SqlrUs) posted a handy T-SQL script to configure your SQL Server memory based on some rules provided by Jonathan Kehayias(@SQLPoolBoy).  This is a great script and helps automate a lot of the work a DBA needs to do when setting up a new instance. Hopefully, you know by now about my passion for automating and standardization.  However, when reading John’s post, I had the following internal conversation:

“Wow, self, John’s got a neat little script here.”

“Yeah, but T-SQL is so passé.  Everyone does it.  Plus, not sure how easily we could add that into a server deployment script if we wanted total hands-off-ness.”

“Self, I know what you mean.  Nothing wrong with it, it just doesn’t fit my own particular….oh…particular….”

Idiom?

“Yes!  Idiom!  Hrmmm…how about a Powershell script?”

“A Powershell script?  Brilliant!”

And so I sat down and cranked out the following script, which essentially does what John’s script does, just using Powershell.

<#
.SYNOPSIS
 Configures a SQL Server instance per the Jonathan Kehayias' guidelines.
.DESCRIPTION
 This script will configure your SQL Server instance per the guidelines
 found in Jonathan Kehayias' blog post: http://www.sqlskills.com/blogs/jonathan/how-much-memory-does-my-sql-server-actually-need/
 The rules are:
 - 1GB for initial OS reserve
 - +1GB per 4GB server RAM up to 16GB
 - +1GB per 8GB server RAM above 16
.PARAMETER instance
 SQL instance name, i.e. localhost\SQL2012, DBASERVER01
.PARAMETER apply
 Switch parameter, call if you want to actually apply the changes. Otherwise, a report will be produced.
.EXAMPLE
 Configure-SQLMemory -instance DBASERVER01 -apply
#>

param([parameter(Mandatory=$true)][string] $instance
 , [Switch] $apply
 )

#load SMO
[System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SMO') | out-null

if($instance.Contains("\")){
 $sqlhost = $instance.Split("\") | Select -First 1
 }
else{
 $sqlhost = $instance
 }

#set memory variables
$totalmem = (gwmi Win32_ComputerSystem -computername $sqlhost).TotalPhysicalMemory/1GB
$sqlmem = [math]::floor($totalmem)

#calculate memory
while($totalmem -gt 0){
 if($totalmem -gt 16){
 $sqlmem -= [math]::floor(($totalmem-16)/8)
 $totalmem=16
 }
 elseif($totalmem -gt 4){
 $sqlmem -= [math]::floor(($totalmem)/4)
 $totalmem = 4
 }
 else{
 $sqlmem -= 1
 $totalmem = 0
 }
}

#if not in debug mode, alter config. Otherwise report current and new values.
$srv = New-Object -TypeName Microsoft.SqlServer.Management.Smo.Server $instance
 "Instance:" + $instance
 "Max Memory:" + $srv.Configuration.MaxServerMemory.ConfigValue/1024 + " -> " + $sqlmem
 "Min Memory:" + $srv.Configuration.MinServerMemory.ConfigValue/1024 + " -> " + $sqlmem/2
if($apply){
 $srv.Configuration.MaxServerMemory.ConfigValue = $sqlmem * 1024
 $srv.Configuration.MinServerMemory.ConfigValue = $sqlmem/2 * 1024
 $srv.Configuration.Alter()
 "Configuration Complete!"
 }

Now it should be noted that I'm not going to tell you which of these scripts is "better". Neither is, really; they're just different ways of approaching the problem. And that's the fun of it. When working in technology, cats can be skinned in multiple ways; we just have to pick one. Whether it's T-SQL, Powershell, C#, or something else, the point is that you want scripts to automate your work.

What was also cool about this was it gave me another exercise to practice Powershell.  After all, practice makes perfect, and when learning something, any excuse to use it is a good one.  John and I actually had a conversation shortly after I wrote the script about these sorts of opportunities, and we'll continue to look for ways to challenge each other for learning.

Edit:  Well, that was embarrassing.  Amazing how sometimes you can get the math a little screwy.  I've updated this script as of 9:44 PM MST to correct having the math backwards.  If you grabbed or reviewed this script beforehand, please make sure you grab the updated version.

Why I Work With SQL Server

Hot on the heels of my NoSQL posts, I wanted to add a counterpoint to the discussion.  After all, even though I see the value of non-relational technologies, I think it’s important not to lose sight of the value relational databases offer.  In the tech world, it’s too easy to chase those squirrels of new tech (though it’s also easy to get stuck in our old patterns as well).  It always helps to take a step back and see the forest for the trees so we can choose the right path for our enterprise.

It is an understood fact that the tech world gets pretty dogmatic:  Oracle vs. SQL Server, Windows vs. Linux, Java vs. C#, etc.  People will dig their heels in about their choices and why those choices are considered “right” when, at the end of the day, each platform is simply a different approach to various higher concepts.  I tend to view most of these debates as Ford vs. Chevrolet and the only real question to answer is what tool is best for the job.

And believe me when I say that I know that this isn’t a groundbreaking opinion, but it is mine.  :)

That being said, we all have good reasons for selecting the platforms we work with.  For relational databases, it's fairly evident that my choice is SQL Server.  Before I get into that, let's first talk about why I lean towards relational over non-relational.  Don't get me wrong, non-relational is an effective tool, but it's still a very young technology.  The platforms for it are still growing and maturing, and they are still missing a lot of the reliability we've come to expect from our relational platforms.

Couple that with the nature of relational databases:  joins, keys, and constraints do more for us than simply organize data; they provide functionality to implement and control business logic.  Data integrity is extremely important for many applications, and a proper database design will provide you with all the rules to keep your data clean and ordered.  Just as with choosing non-relational stores, it's a matter of choosing the appropriate tool for the job.  Sometimes that job requires tight control over your data, something that you just can't get in a NoSQL database.

As for SQL Server as my relational platform of choice, there are a lot of reasons why I favor it over other platforms.  It isn't just because it's what I've worked with (for the record, I've put some serious time in with Oracle as well).  There are really three main reasons why I will promote SQL Server as the database I think people should work with.

Maturity

Let's face it, SQL Server has been around for a while and Microsoft has had a lot of time to refine it.  Over the past 15 years I've worked with it, I've seen the addition of lots of neat features that enhance the core RDBMS offering.  At the same time, SQL Server is still a solid relational database and gives users a solid, reliable platform for storing their data.  It's not perfect and I'll be the last person to tell you it is, but it certainly is on par with Oracle and PostgreSQL.

Adaptability

Microsoft has seen the writing on the wall.  Some of it is from their own hand, some of it is how the data world is evolving.  However, "the cloud", in-memory structures, and big data are ubiquitous in today's tech landscape.  Looking at recent versions of SQL Server, it's apparent that Microsoft is trying to mold the product to live in this new realm.  Consider Hekaton, the ability to span databases between Azure and on-premise, and improvements to columnstore (along with other updates). Microsoft is making investments to keep pace with the changes we're seeing in the larger technology world, and I appreciate the vision they have for the product.

Accessibility

This is the big one for me.  The other two basically tell me that, in going with SQL Server, I’m going to have an established RDBMS platform I can rely on along with Microsoft continuing to improve things to keep pace with other products.  What sets SQL Server apart is the fact that it’s so much easier to work with, for both new folks and seasoned professionals.

First, let's look at the fact that it's Windows.  Now while we all lament SQL Server's default settings, the fact is that almost anyone with minimal experience can get their own SQL Server instance up and running in short order.  This means the door is open a little wider for people wanting to get into the database world, even if they don't have supporting skill sets for Linux or registry hacking.  SQL Server ships with wizards and graphical tools to get folks going.  Just make sure you talk to a professional before getting too far.  :)

And that's the second thing to talk about.  Maybe I'm biased because I've been involved in the SQL Server community for so long, but I'm continually amazed by the amount of free training material and best practices provided by this community, from blogs to Twitter to a user group likely near you where you can ask questions of people using SQL Server in your area.  It's so easy to get started with SQL Server.

Yeah, I know I sound like a fanboy at this point (squee!).  Just so we’re on the level, I am well aware of SQL Server’s flaws.  There’s a lot of things that aren’t perfect or were added a couple versions ago but were never finished up (*cough* Management Data Warehouse).  And let’s not get into what’s in Standard Edition versus Enterprise.  Trust me, I get it.  Even with that, though, I feel that SQL Server is the preferred offering at this point for companies that are looking for a solid relational platform. 

Lines In The Sand

When working with databases, the question of performance always comes up.  Whether it’s a new build or an existing setup, we need some sort of radar gun to measure our instances to make sure they’re up to snuff.  The idea is simple:  We need some sort of benchmark that tells us, at least in a relative sense, how well our systems are performing.

Of course, the tools and approaches available to us seem as numerous as stars in the sky.  Want to measure storage performance?  Take a look at SQLIO or IOmeter.  Need a full end-to-end testing suite?  Paid tools like LoadRunner and Benchmark Factory can help you.  Anything else?  A quick Google search will provide plenty of options.  So where do you start?

For my part, I have become a big fan of HammerDB.  Discovered via Kendra Little(@Kendra_Little), HammerDB is a handy tool that allows you to run a TPC-C benchmark against your instances (and not just SQL Server!).  You can specify several different factors to customize your testing, yet the tool has an easy-to-use interface.  It's also very well documented, with some good instruction on how to set up your own load tests.

While the HammerDB documentation and Kendra’s post are fairly comprehensive, I did want to share a little on how I setup and run my own HammerDB tests.  This is mostly because there’s a couple gotchas that aren’t covered (after all, the guy who wrote it is an Oracle dude, so I’ll cut him some slack).  First off, let’s talk about the initial schema build.  We can let HammerDB create everything from scratch, but because it uses all of SQL Server’s defaults for that, we get a poorly configured database.  When I am setting up a test (approximately 100 warehouses), I will specifically create an empty database with my files right-sized and in SIMPLE mode, and then use HammerDB to create the schema within that database:

create database tpcc
on primary (name=tpcc_data01, filename='c:\dbdata\tpcc_data01.mdf',size=10gb,maxsize=unlimited,filegrowth=1gb)
log on (name=tpcc_log, filename='c:\dbdata\tpcc_log.ldf',size=4gb,maxsize=unlimited,filegrowth=1gb);

alter database tpcc set recovery simple;

I use SIMPLE mode primarily so I don’t have to bother with log backups on the benchmark database.  Keep in mind this doesn’t actually affect how transactions are processed, only on how they’re removed from the log file.  By properly sizing the log file, we avoid messing around with too many VLFs, which could adversely affect our numbers.

At this point, I’ll kick off the schema build, usually with 10 concurrent users.  It still could take a while to build, so patience is key.  I have had issues where the tool sometimes flubs on finalizing the schema (when creating the indexes and stored procedures), so if you run into that you can find them in a SQL script here.

Once the schema is in place, it's a simple matter to run the test.  We can run an ad hoc test easily enough, but for a solid benchmark we need to use HammerDB's autopilot functionality.  The autopilot will run sequential timed tests using different concurrent user counts to iteratively increase the load.

[Screenshot: HammerDB driver script options]

First thing to do is make sure, under the Driver Script options, we’ve selected “Timed Test Driver Script”.  Then I tweak a couple things here, mostly around the length of each test.  HammerDB defaults to a ramp up time of 2 minutes, followed by a 5 minute test.  Keep in mind that having a ramp up time is very important, because it allows the system to “warm the cache” and give you more consistent numbers.  That being said, I adjust these values, going with a ramp up of 1 minute and a 5 minute test.  Our schema is not tremendously big (10 GB), so this is usually enough.  It doesn’t hurt to have larger values if you want, but 1 and 5 suit my purposes.  Once this is all set, I load the script and I’m ready to move on to the autopilot settings.

[Screenshot: HammerDB autopilot options]

Next, I need to enable the autopilot.  I'll leave the concurrent user batches at their defaults, as they give you a nice battery of threads for usage.  However, this is where the first gotcha is.  When setting the time for each test, this has to exceed the total of the ramp up time and the test time, plus some time for setting up and tearing down each test.  Since I go with settings that give me a total of 6 minutes for a single test run, I'll set the test time in the autopilot settings to 8 minutes to give the application time to do all the other setup work it needs.  Having this buffer time doesn't affect anything with the test itself, but if you cut it too short, the application will essentially discard your results and move on to the next test.  As a nicety, I like to check "Show Virtual User Output" so I can sanity check that everything is running ok during the test.  The info isn't very informative (like watching the Matrix code), but you'll at least see errors in the run.

For the tests themselves, I’ll actually run the full autopilot test batch three different times.  I should get results that are relatively the same, but I do want to measure for consistency.  Then I chart out the results, looking at three numbers:

  • Concurrent users (note this is the number in HammerDB minus 1, because HammerDB has one user acting as a controller)
  • Transactions Per Minute (TPM)
  • Transactions Per Minute Per User (TPM/Concurrent Users)

Naturally, we should see the TPM increase as the number of connections rises.  What I'm particularly looking for is the behavior of the TPM/user curve, which should remain steady.
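
As a trivial example of that last calculation (the numbers below are made up, not real results):

#Made-up results purely to show the arithmetic; one virtual user is subtracted for the controller
$results = @(
    [pscustomobject]@{VirtualUsers=3; TPM=24000},
    [pscustomobject]@{VirtualUsers=5; TPM=36000}
)
$results | Select-Object @{n='ConcurrentUsers';e={$_.VirtualUsers - 1}}, TPM, @{n='TPMPerUser';e={[Math]::Round($_.TPM / ($_.VirtualUsers - 1))}}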

[Chart: TPM and TPM per user by concurrent users]

If we take a look at this sample from a run on my laptop, you can see what I mean.  It is, of course, not a "good" thing that the average TPM/user drops as connections increase, but this is expected as my poor little laptop isn't made for enterprise OLTP.  We can at least see that my laptop could handle two concurrent users before falling over on its face.

Some final caveats to keep in mind.  The first is to remember that these numbers are relative.  There's no "good" value for this; it's all related to how your systems are performing.  The second caveat is that this is an unofficial TPC-C result.  Vendors and the TPC will only certify something as official under very controlled circumstances, mostly because it's their reputation on the line.  Finally, these values shouldn't stand in a vacuum, but should be measured alongside other metrics (i.e. CPU, Page Life Expectancy, etc.).

That being said, I find HammerDB an effective tool for giving me a general idea of system performance, using a standard that is application agnostic, thorough, and has a proven track record.