Why 4 Core is better than 1 Core IaaS Virtual machine with SQL Server in Azure

-> I see not much of a difference in terms of cost when you provision an 1, 2 or 4 Core Azure IaaS Virtual machine with SQL Server Enterprise or Standard edition.

-> I am using Azure pricing calculator to check this.

-> I am checking below IaaS Virtual Machines,

DS1 v2 (1 vCPU(s), 3.5 GB RAM) x 730 Hours
DS2 v2 (2 vCPU(s), 7 GB RAM) x 730 Hours
DS3 v2 (4 vCPU(s), 14 GB RAM) x 730 Hours

-> Checking DS1 v2 (1 vCPU(s), 3.5 GB RAM) x 730 Hours,

Blog92_1

-> Checking DS2 v2 (2 vCPU(s), 7 GB RAM) x 730 Hours
Blog92_2

-> Checking DS3 v2 (4 vCPU(s), 14 GB RAM) x 730 Hours
Blog92_3
-> If you compare SQL Server Enterprise Edition cost for 1 Core, 2 Cores or 4 Cores. It is all same. This is due to the fact that SQL Server license starts with a pack of 4 cores. Even though we use 1 or 2 cores you will be charged for 4 cores. This is applicable for Standard Edition.
-> There is a difference in overall cost between these servers, but to me it is worth to consolidate database Servers to a one (1) – 4 Core Virtual machine rather than provisioning 4 – 1 Core Virtual machines or 2 – 2 Core Virtual machines.
Please let me know your views on comments section.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

Rule “Valid DSN” and Rule “Valid Database compatibility level and successful connection” failed.

-> I was performing an In-Place upgrade from SQL Server 2014 to SQL Server 2016.

-> Below rules failed,

Blog91_1

-> Rule Check 1 -> Valid DSN

Blog91_2

—————————
Rule Check Result
—————————
Rule “Valid DSN” failed.

The report server configuration is not complete or is invalid. Use Reporting Services Configuration Manager to verify the report server configuration.
—————————
OK
—————————

-> Rule Check 2 -> Valid Database compatibility level and successful connection

Blog91_3
—————————
Rule Check Result
—————————
Rule “Valid Database compatibility level and successful connection” failed.

The report server database is not a supported compatibility level or a connection cannot be established. Use Reporting Services Configuration Manager to verify the report server configuration and SQL Server management tools to verify the compatibility level.
—————————
OK
—————————

-> SQL Services Reporting Services is installed, but not configured and it seems like thats the reason for this issue.

-> I configured SQL Services Reporting Services as below,

Blog91_4

Blog91_5

Blog91_6

Blog91_7

Blog91_8

Blog91_9

Blog91_10

Blog91_11

Blog91_12

-> Refresh the SQL Server 2014 rule window again and it will succeed this time,

Blog91_13

-> This allowed me to complete the upgrade without any further issue.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

 

Always On – Availability group not failing over automatically

-> Client applications not able to connect to the listener.

-> Environment setup is as below,

Backup_Setup

-> When I try connecting to the Listener using SQL Server management Studio (SSMS), I get below error,

Blog1_0

TITLE: Connect to Server
——————————
Cannot connect to JBAPP.
——————————
ADDITIONAL INFORMATION:
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 2)
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&EvtSrc=MSSQLServer&EvtID=2&LinkId=20476
——————————
The system cannot find the file specified
——————————
BUTTONS:
OK
——————————

-> I tried connecting to Database Server JBAG1 and JBAG2 manually using SSMS. Connection to JBAG1 worked, But connection to JBAG2 failed with below error,

Blog1_2

TITLE: Connect to Server
——————————
Cannot connect to JBAG2\IN01.
——————————
ADDITIONAL INFORMATION:
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 0 – The wait operation timed out.) (Microsoft SQL Server, Error: 258)
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&EvtSrc=MSSQLServer&EvtID=258&LinkId=20476
——————————
The wait operation timed out
——————————
BUTTONS:
OK
——————————

-> I connected to JBAG1 and opened a new query windows to one of the user database and tried a select statement, it worked. But queries are failing with below error when performing insert, delete or update.

Blog1_1

Msg 3906, Level 16, State 2, Line 4
Failed to update database “JBDB” because the database is read-only.

-> Latency between primary and secondary datacentre is 2 ms. Hence the setup is a synchronous replica with automatic failover. Please note that this setup will be a bad design if latency between datacentres are more and if there are frequent network glitches.

-> I checked further and below were my observations,

  1. JBAG2 is the PRIMARY Replica and SQL Server was down on it.
  2. JBAG1 was the secondary replica. It seems like automatic failover did not happen.

-> Always On Availability group on JBAG1 was in resolving state,

Blog1_3

-> I started the SQL Services on JBAG2 and in some time everything started working fine including Always On Availability groups.

-> Now comes the question, why automatic failover did not happen?

-> I opened cluadmin.msc and opened “Cluster Events” and found below errors,

Blog1_4

Log Name: System
Source: Microsoft-Windows-FailoverClustering
Date: 4/11/2020 8:29:08 AM
Event ID: 1069
Task Category: Resource Control Manager
Level: Error
Keywords:
User: SYSTEM
Computer: JBAG2.JBS.COM
Description:
Cluster resource ‘JBAG’ of type ‘SQL Server Availability Group’ in clustered role ‘JBAG’ failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
Event Xml:
<Event xmlns=”http://schemas.microsoft.com/win/2004/08/events/event”&gt;
<System>
<Provider Name=”Microsoft-Windows-FailoverClustering” Guid=”{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}” />
<EventID>1069</EventID>
<Version>1</Version>
<Level>2</Level>
<Task>3</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime=”2020-04-11T02:59:08.867171300Z” />
<EventRecordID>2818</EventRecordID>
<Correlation />
<Execution ProcessID=”2464″ ThreadID=”3496″ />
<Channel>System</Channel>
<Computer>JBAG2.JBS.COM</Computer>
<Security UserID=”S-1-5-18″ />
</System>
<EventData>
<Data Name=”ResourceName”>JBAG</Data>
<Data Name=”ResourceGroup”>JBAG</Data>
<Data Name=”ResTypeDll”>SQL Server Availability Group</Data>
</EventData>
</Event>

Blog1_5

Log Name: System
Source: Microsoft-Windows-FailoverClustering
Date: 4/11/2020 8:29:09 AM
Event ID: 1254
Task Category: Resource Control Manager
Level: Error
Keywords:
User: SYSTEM
Computer: JBAG2.JBS.COM
Description:
Clustered role ‘JBAG’ has exceeded its failover threshold. It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state. No additional attempts will be made to bring the role online or fail it over to another node in the cluster. Please check the events associated with the failure. After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.
Event Xml:
<Event xmlns=”http://schemas.microsoft.com/win/2004/08/events/event”&gt;
<System>
<Provider Name=”Microsoft-Windows-FailoverClustering” Guid=”{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}” />
<EventID>1254</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>3</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime=”2020-04-11T02:59:09.469214500Z” />
<EventRecordID>2822</EventRecordID>
<Correlation />
<Execution ProcessID=”2464″ ThreadID=”3496″ />
<Channel>System</Channel>
<Computer>JBAG2.JBS.COM</Computer>
<Security UserID=”S-1-5-18″ />
</System>
<EventData>
<Data Name=”ResourceGroup”>JBAG</Data>
</EventData>
</Event>

-> Above error shows that the failover did not happen since it reached the failover threshold. Checking cluster.log in JBAG2 to confirm this. Refer to this article if you want to know the command to generate cluster.log.

-> Cluster.log provides the same reason,

[Verbose] 000009a0.000014c4::2020/04/11-08:31:18.014 INFO [RCM] Resource JBAG is causing group JBAG to failover.
[Verbose] 000009a0.000014c4::2020/04/11-08:31:18.014 INFO [RCM] rcm::RcmGroup::Failover: (JBAG)
[Verbose] 000009a0.000014c4::2020/04/11-08:31:18.014 DBG [RCM] rcm::RcmGroup::FailedDueToError=> (JBAG, 5963, false)
[Verbose] 000009a0.000014c4::2020/04/11-08:31:18.014 DBG [RCM] rcm::RcmGroup::UpdateAndGetFailoverCount=> (1, 2020/04/11-08:25:33.204)
[Verbose] 000009a0.000014c4::2020/04/11-08:31:18.014 DBG [RCM] rcm::RcmGroup::ComputeFailoverThreshold=> (JBAG, 1, computed)
[Verbose] 000009a0.000014c4::2020/04/11-08:31:18.014 WARN [RCM] Not failing over group JBAG, failoverCount 2, failoverThresholdSetting 4294967295, lastFailover 2020/04/11-08:25:33.204
[Verbose] 000009a0.000014c4::2020/04/11-08:31:18.014 DBG [RCM] rcm::RcmGroup::FailedDueToError=> (JBAG, 5963, false)
[Verbose] 000009a0.000014c4::2020/04/11-08:31:18.014 DBG [RCM] rcm::RcmResource::DelayedRestart(JBAG_192.168.0.45)

-> Increasing the Failover threshold will fix this issue.

-> Open cluster administrator (cluadmin.msc). Click on Roles. Right Click the ROLE and click properties,

Blog1_6

-> As per below settings, Only 1 failover is allowed in last 6 hours. This makes sense why automatic failover did not happen.

Blog1_7

-> In my case I changed the value from 1 to 5. This resolved my issue with automatic failover.

Blog1_8

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.