The database owner SID recorded in the master database differs from the database owner SID recorded in database ‘msdb’. You should correct this situation by resetting the owner of database ‘msdb’ using the ALTER AUTHORIZATION statement.

-> We were working on an upgrade project. We were performing an upgrade of a standalone SQL server 2005 to SQL server 2012. The upgrade failed with the error “The database owner SID recorded in the master database differs from the database owner SID recorded in database ‘msdb’. You should correct this situation by resetting the owner of database ‘msdb’ using the ALTER AUTHORIZATION statement.”.

-> The SQL server version changed to SQL server 2012. We tried a repair and it failed.

-> We tried starting the SQL services and got the below error in the errorlog.

.
.
2016-01-05 20:35:00.25 spid5s      Error: 33009, Severity: 16, State: 2.
2016-01-05 20:35:00.25 spid5s      The database owner SID recorded in the master database differs from the database owner SID recorded in database ‘msdb’. You should correct this situation by resetting the owner of database ‘msdb’ using the ALTER AUTHORIZATION statement.
2016-01-05 20:35:00.25 spid5s      The failed batch of t-sql statements :
2016-01-05 20:35:00.25 spid5s      Error: 912, Severity: 21, State: 2.
2016-01-05 20:35:00.25 spid5s      Script level upgrade for database ‘master’ failed because upgrade step ‘sqlagent100_msdb_upgrade.sql’ encountered error 33009, state 2, severity 16. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.
2016-01-05 20:35:00.26 spid5s      Error: 3417, Severity: 21, State: 3.
2016-01-05 20:35:00.26 spid5s      Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.
2016-01-05 20:35:00.26 spid5s      SQL Trace was stopped due to server shutdown. Trace ID = ‘1’. This is an informational message only; no user action is required.

-> I added trace flag -T902 in the SQL server startup parameter and started the SQL services to see if it starts, it started without any issues.

902 Bypasses execution of database upgrade script when installing a Cumulative Update or Service Pack.

-> Went into SQL server management studio and ran the below query. “select * from sys.databases”.

-> The owner for master database was SA -> sid: 0x01. But MSDB was something else. Executed the below command and changed the owner of MSDB to SA.

use msdb
go
sp_changeobjectowner ‘sa’
go

-> Once this was done, I removed the trace flag -T902 from the SQL server startup parameter and started the SQL services and it started fine without any issues.

->  Please note that you cannot change the owner of system databases (Master, Model and tempdb) as per link http://technet.microsoft.com/en-us/library/ms190909(v=SQL.105).aspx.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

SQL resources not coming online after patching

-> I had a requirement to install service pack on a 2-node cluster server. It was an active-passive server.

-> I installed the service pack on the passive node successfully. I tried failing over the SQL services from Active node to Passive node and it failed.

-> Checked the cluster.log and found the below errors,

0001164.00001b70::2016/12/01-05:10:32.854 WARN [ClNet] Tcpip is not bound to adapter E8BF15A3-3617-4818-94E4-7BCC37A3B45D.
000014dc.00000868::2016/12/01-05:12:41.025 ERR  SQL Server <SQL Server>: [sqsrvres] OnlineThread: SQLClusterResourceWorkerWaitForCompletion failed (status 0)
0000101c.00001078::2016/12/01-05:12:41.025 WARN [FM] FmpHandleResourceTransition: Resource Name = 3c028454-7dc4-4943-8bb9-89712945d2f2 [SQL Server] old state=129 new state=4
0000101c.00001078::2016/12/01-05:12:41.025 WARN [FM] FmpHandleResourceTransition: Resource failed, post a work item
000014dc.00001b1c::2016/12/01-05:14:43.117 ERR  SQL Server <SQL Server>: [sqsrvres] OnlineThread: SQLClusterResourceWorkerWaitForCompletion failed (status 0)
0000101c.00001078::2016/12/01-05:14:43.117 WARN [FM] FmpHandleResourceTransition: Resource Name = 3c028454-7dc4-4943-8bb9-89712945d2f2 [SQL Server] old state=129 new state=4
0000101c.00001078::2016/12/01-05:14:43.117 WARN [FM] FmpHandleResourceTransition: Resource failed, post a work item
0000101c.00001078::2016/12/01-05:14:43.117 WARN [FM] Group failure for group <56e7dc42-6936-4227-9ed7-eea24233be74>. Create thread to take offline and move.
00001164.000013ac::2016/12/01-05:14:44.086 WARN Physical Disk <R: >: Offline, Locking volume failed, error 5.
00001164.000013ac::2016/12/01-05:14:45.164 WARN Physical Disk <R: >: Offline, Locking volume failed, error 5.
0000101c.00001028::2016/12/01-05:16:20.834 WARN [INIT] The cluster service is shutting down.
0000101c.00001028::2016/12/01-05:16:20.834 WARN [FM] Shutdown: Failover Manager requested to shutdown groups.

-> Eventviewer log contains the error as below,

 OnlineThread: SQLClusterResourceWorkerWaitForCompletion failed (status 0)

-> Tried starting the SQL service from services.msc. The SQL server came online after script upgrade mode. Connected the SQL server instance from SQL server management studio and the version changed as appropriate.

-> Enabled verbose logging for cluster by setting VerboseLogging to 1 in HKEY_LOCAL_MACHINE\Cluster\Groups\<GUID>\Parameters\VerboseLogging.

-> Started the SQL services again from cluadmin.msc. It failed and got the below message on Cluster.log.

00001184.000019d4::2016/12/04-07:20:30.771 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): Worker thread starting …
00001184.000019d4::2016/12/04-07:20:30.771 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): Entering SQLClusterSharedDataUpgradeWorker thread.
00001184.000019d4::2016/12/04-07:20:30.771 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): SqlDataRoot = ‘R:\MSSQL10.MSSQLSERVER\MSSQL’,         CurrentPatchLevel = ‘10.3.5500.0’,         SharedDataPatchLevel = ‘10.1.2531.0’
00001184.000019d4::2016/12/04-07:20:30.771 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): FTDataPath = ‘R:\MSSQL10.MSSQLSERVER\MSSQL\FTData’,         SQLGroup = ‘S-1-5-21-1897246555-1303978887-2451887125-1005’
00001184.000019d4::2016/12/04-07:20:30.771 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): Entering DoSQLDataRootApplyACL (R:\MSSQL10.MSSQLSERVER\MSSQL\Data).
00001184.000019d4::2016/12/04-07:20:30.771 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): ACL-ing (R:\MSSQL10.MSSQLSERVER\MSSQL\Data) as full-control to SQLGroup.
00001184.000019d4::2016/12/04-07:20:30.787 WARN SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): DoSQLDataRootApplyACL : Failed to apply security to SQLDataRoot (122).
00001184.000019d4::2016/12/04-07:20:30.787 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): Exiting DoSQLDataRootApplyACL (0).
00001184.000019d4::2016/12/04-07:20:30.787 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): Entering DoIFTSSharedDataUpgrade.
00001184.00001e54::2016/12/04-07:20:30.975 INFO SQL Server <SQL Server>: [sqsrvres] Worker Thread (F2DF70): Calling SQLClusterResourceWorker::WaitForCompletion (200)

-> The cause seems to be that the SQL resource DLL detects the installation of a patch during the first failover to the newly patched node. Thus it spawns a worker thread to update the databases accordingly (e.g. create updated system stored procedures and views, etc.). As a first step this worker thread will re-create the NTFS file permissions on the data folders of this SQL instance. As the resource DLL runs in the process RHS.EXE (Failover Cluster Resource Host Subsystem), it runs with the credentials of NT AUTHORITY\SYSTEM.

-> Provided full control to SYSTEM, cluster account, SQL service account to below location.

R:\MSSQL10.MSSQLSERVER\MSSQL\Data
R:\MSSQL10.MSSQLSERVER\MSSQL\
R:\

-> Tried running icacls R:\MSSQL10.MSSQLSERVER\MSSQL\Data, icacls R:\MSSQL10.MSSQLSERVER\MSSQL and icacls R:\ and was able to see that all the above account were having full permission.

-> Started a procmon and failed over the SQL services from active node to passive node. The SQL services failed in cluadmin.msc, but still clusterresourcemonitor was working with files on folder R:\MSSQL10.MSSQLSERVER\MSSQL\LOG.

-> It seems like the worker thread that recreates the NTFS file permissions on the log folder of this SQL instance is causing the issue.

-> The log folder was having 373275 files and was of size 23GB.

-> Tried renaming the folder R:\MSSQL10.MSSQLSERVER\MSSQL\LOG to R:\MSSQL10.MSSQLSERVER\MSSQL\LOG_old and created a dummy LOG folder.

-> Failed over the SQL services from active to passive node. SQL services came online without any issues.

-> It seems that during the first failover after a patch the worker thread while re-creating the NTFS file permissions on MSSQL folder had to move through huge lot of files and eventually timed out and that is the reason why this failed out earlier.

-> Patched the active node. Failover and failback worked fine.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

Database ‘model’ cannot be opened. It is in the middle of a restore.

-> SQL server was not coming online in one of the database server.

-> I checked the SQL server log and found the below error,

.
.
2016-12-25 19:43:43.02 spid12s     Starting up database ‘msdb’.
2016-12-25 19:43:43.03 spid10s     Starting up database ‘model’.
2016-12-25 19:43:43.05 spid10s     The database ‘model’ is marked RESTORING and is in a state that does not allow recovery to be run.
2016-12-25 19:43:43.05 spid10s     Error: 927, Severity: 14, State: 2.
2016-12-25 19:43:43.05 spid10s     Database ‘model’ cannot be opened. It is in the middle of a restore.
2016-12-25 19:43:43.05 spid10s     Could not create tempdb. You may not have enough disk space available. Free additional disk space by deleting other files on the tempdb drive and then restart SQL Server. Check for additional errors in the event log that may indicate why the tempdb files could not be initialized.
2016-12-25 19:43:43.05 spid10s     SQL Trace was stopped due to server shutdown. Trace ID = ‘1’. This is an informational message only; no user action is required.
.
.

-> As per the SQL server errorlog, it seems like the model database is restoring. But in reality we see this error when model database is corrupted. The error related to tempdb is because tempdb is recreated everytime SQL server starts using model database.

-> Open services.msc and get to the SQL server services. Right click the service and click on properties. In General tab, look for “path to executable”. Copy the SQLServr.exe location from there.

-> Open a command prompt and navigate to the location. Now type the below command,

sqlservr.exe -s<InstanceName> -T3608 -T3609 -c -f
3608 Recovers master database only. Skips automatic recovery (at startup) for all databases except the master database.
3609 Skips the creation of the tempdb database at startup. Use this trace flag if the tempdb database is problematic or problems exist in the model database.
 -c Start as a console and not service.
 -f Minimal configuration mode.

-> Open another command prompt and use the below command,

SQLCMD -E -SADMIN:<Hostname\InstanceName>

-> Execute the below command,

use master
go
sp_detach_db ‘model
go

-> Execute the below command to attach the database,

CREATE DATABASE [model] ON (FILENAME = N'<Location>\model.mdf’ ),( FILENAME = N'<Location>\modellog.ldf’ )
FOR ATTACH

-> If you get an error that the files cannot be attached. You will have to copy the MDF and LDF file from a different instance of same build and issue the above command again.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.