Art of the DBA Rotating Header Image

Availability Groups

#TSQL2SDAY: SQL 2016 Direct Seeding

It’s been awhile since I’ve written a T-SQL Tuesday post, but thanks to Michael Swart(@MJSwart) for hosting this month’s edition. The request is simple enough: blog about some aspect of SQL Server 2016. This is actually a good opportunity for me to write about a feature I’ve been meaning to get to for some time now and just haven’t had the chance.

As I tell many people, I’ve been working with SQL Server since 1999 (version 7.0) and I can’t remember being more excited about a SQL Server release. 2016 has SO many cool features, but more than that, they are features that many people who use SQL Server have been asking for a while now. It is hard for me to simply blog about one feature since there are so many. However, there has been an improvement to Availability Groups that has not gotten a lot of coverage and I think it is a game changer, especially for those of us who have to manage AGs on a day to day basis: direct seeding.

Setting Up Replicas

In SQL Server 2012 and 2014, creating an Availability Group could take a significant amount of work. One of the more tedious tasks is setting up your replica databases. This is because that you need to restore your database to your replica node in a state close enough to the primary to allow synchronization to happen. It can take several log backup restores to hit that magic window where you can join the database on the secondary node. Then, you get to do it again on the next replica!

Enter direct seeding in 2016. With this feature you no longer have to do any of the restores. You simply create your replicas with direct seeding enabled, then when you add a database to the AG, SQL Server will directly seed the database to your replica nodes. It’s surprisingly simple.

How do we enable this magic? It’s part of the AG replica creation. We still need the prerequisites, so we’ll start with the following:

  • Create a cluster
  • Enable AlwaysOn for the SQL Service
  • Create endpoints and grant permissions on those endpoints
  • The database we’re adding must be in FULL recovery mode

I, unsurprisingly, have a Powershell script that will set this all up for me. Then we can create the Availability Group with the direct seeding setting. Currently, this is only supported using T-SQL (go wizard or Powershell support), but the syntax is documented in MSDN. Let’s start by creating our Availability Group:

CREATE AVAILABILITY GROUP [ENTERPRISE]
FOR
REPLICA ON 'PICARD' WITH (
 ENDPOINT_URL = 'TCP://PICARD:5022', 
 AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, 
 FAILOVER_MODE = AUTOMATIC,
 SEEDING_MODE = AUTOMATIC
),
'RIKER' WITH (
 ENDPOINT_URL = 'TCP://RIKER:5022', 
 AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, 
 FAILOVER_MODE = AUTOMATIC,
 SEEDING_MODE = AUTOMATIC
)
LISTENER 'ENTERPRISE' (WITH IP (('10.10.10.101','255.255.255.0')))

ALTER AVAILABILITY GROUP [ENTERPRISE] GRANT CREATE ANY DATABASE

--Run these on the secondary node
ALTER AVAILABILITY GROUP [ENTERPRISE] JOIN
ALTER AVAILABILITY GROUP [ENTERPRISE] GRANT CREATE ANY DATABASE

 

There are two important lines to call out here:

  • SEEDING_MODE = Automatic
  • ALTER AVAILABILITY [ENTERPRISE] GRANT CREATE DATABASE

The SEEDING_MODE is what enables direct seeding, but we need to grant the AG permission to create databases on the node as well. Then we add the database to the AG. At this point, though, we only have node. It’s time to add a second node and watch the magic happen.

Hands Off

Once we’ve created the AG, we just add the database to it:

ALTER AVAILABILITY GROUP [ENTERPRISE]
ADD DATABASE [WideWorldImporters]

Once we’ve done this, SQL Server takes over and pushes the database out to the secondary replicas:

2016-06-14_10-27-36

That’s it! SQL Server handles all the replication for you. All is not sunshine and rainbows, however. There are still some gotchas to keep in mind:

  • You need to create the Availability Group with all its replicas at the start. You can not add replicas at a later point (I thought I had done this, but more recent testing was not working).
  • The AG must also be set up before you add any databases. You can add the database as part of the AG creation or after, but you have to have all your nodes configured before adding the database to allow direct seeding to work.
  • This is a data operation, so we “canna’ change tha’ laws a’ physics, Captain!” If you add a database to an AG with direct seeding, SQL Server still has to copy all the bits and bytes across the wire to the nodes. Which means nodes in other geographical locations could be a problem.

Overall, direct seeding is pretty cool, but still feels a little raw to me. Personally I’d like to be able to seed to a replica I add to the cluster if I desire, but that does not work right now. The lack of GUI and Powershell support (both cmdlets and SMO) is also disappointing. I can still use it, but I’d rather have a simpler way to do it.

For some additional reading, Erik Darling wrote about the exact same feature for T-SQL Tuesday as I did, so it gives you another view at it. Otherwise, that’s it for me on this T-SQL Tuesday. Thanks to Michael Swart for hosting this month. Make sure you check out the rest of the T-SQL Tuesday posts!

#TSQL2SDAY 65 – TIL Edition: Whaddya mean there’s no endpoint?

This month’s T-SQL Tuesday is from Doc Mike Donnely(@SQLMD) with a simple premise: show us something you learned. Considering that working in the information technology field means we are always learning (or at least should be), blogging on this should be a snap. It’s a great fit for the #SQLNewBlogger challenge if you’re in on it, or just a nice reason to write if you’re not.

My tidbit is somewhat Powershell related, but touches on general setup of Availability Groups. If you read my early short series on automating SQL Server builds, I show you how you can build an Availability Group using Powershell. When I was testing it, the process works just fine for building the Availability Group, but I ran into problems when adding databases. When I went through the troubleshooting, I discovered that while you specify an endpoint in the Powershell cmdlet call, that cmdlet doesn’t actually create said endpoint. It just records the endpoint as part of the config.

This should not really come as a surprise. If you ever use the wizard in SQL Server Management Studio to create an Availability Group, it will script out commands to create the proper endpoint. I assumed that the Powershell cmdlet would work similar to the wizard, but it does not. This means I will have to add a step in my process to actually create the endpoint.

There are two ways to do this. The first is to simply use T-SQL in my script and pass it to either Invoke-SqlCmd or sqlcmd:

$endpointsql = @"
CREATE ENDPOINT [HADR_endpoint]
STATE=STARTED
AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE
, ENCRYPTION = REQUIRED ALGORITHM AES);
"@

Invoke-SqlCmd -ServerInstance 'NODE1' -Database 'master' -Query $endpointsql

The second option is to use the SMO to create an endpoint:

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO") | Out-Null
$smonode = New-Object Microsoft.SqlServer.Management.Smo.Server ‘NODE1’

#create endpoint
if($smonode.Endpoints.Name -notcontains 'HADR_endpoint'){
    $EndPoint = New-Object Microsoft.SqlServer.Management.Smo.Endpoint($smonode, 'HADR_endpoint')
    $EndPoint.EndpointType = 'DatabaseMirroring'
    $EndPoint.ProtocolType = 'Tcp'
    $EndPoint.Protocol.Tcp.ListenerPort = 5022
    $EndPoint.Payload.DatabaseMirroring.ServerMirroringRole = 'All'
    $EndPoint.Payload.DatabaseMirroring.EndpointEncryption = 'Required'
    $EndPoint.Payload.DatabaseMirroring.EndpointEncryptionAlgorithm ='Aes'
    $EndPoint.Create()
    $EndPoint.Start()
}

The choice between these options is down to personal comfort, either one would be acceptable in my eyes. I lean towards the SMO method just because I’m likely to be using the SMO objects for other operations in a script.

The lesson here is the old line about assumptions being the mother of all screwups and why you should always understand what your tools are doing. If you get stuck in to using wizards all the time, you will not see what is happening under the covers. Using a tool is fine, even necessary, but you should always make sure you understand the mechanics of those tools in case something does not work as expected.

How can you learn those mechanics? Almost every action in SSMS can be scripted out into T-SQL. This key learning tool is something I always stress to new SQL Server folks. It is the easiest way to see what SQL Server is trying to do and helps build your skills. The next time you’re trying to do something in Management Studio, study the script, and teach yourself something new.

Update:

So there’s a third option and I’m mildly annoyed at myself for missing it. As Nic Cain(@SirSQL) points out in the comments, Microsoft provides cmdlets for endpoints. In keeping with the links above, here’s your use of them:

$endpoint = New-SqlHadrEndpoint HADR_Endpoint -Port 5022 -Path SQLSERVER:\SQL\NODE1\DEFAULT
Set-SqlHadrEndpoint -InputObject $endpoint -State "Started"

A LOT simpler using the SMO and remains consistent with the rest of the script. Probably the better way to go.  Thanks Nic!