ALWAYS ON – GROWING LOG SEND QUEUE ON SYNCHRONOUS REPLICA

-> Check this article for more details on high Log Send Queue on Asynchronous Replica.

-> Log Send Queue shows the amount of log records in the log files of the primary database that has not yet been sent to the secondary replica. This value is sent to the secondary replica from the primary replica. Queue size does not include FILESTREAM files that are sent to a secondary.

-> I faced this issue recently on a Synchronous replica. We are aware of the reason why the log send queue increased, that was due to an Alter Index job as part of weekly maintenance. But we wanted to understand if we can do to speed up the process. It all depends on the network link and if we have reached the maximum limit, then there is not much we can do here other than waiting.

-> I tried below actions,

Restart Always on Endpoints on Secondary and then on Primary

ALTER ENDPOINT endpoint_name STATE = STOPPED
ALTER ENDPOINT endpoint_name STATE = STARTED

-> After above action, Log send queue gradually decreased and Always On database came to sync in sometime.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

This secondary replica is not connected to the primary replica. The connected state is DISCONNECTED.

-> Availability group secondary database was “Not Synchronizing”.

-> Below is the view of Always On Dashboard,

-> SQL Server Error log

JBSAG1

2022-06-14 05:08:04.130 spid11s Always On: DebugTraceVarArgs AR ‘[HADR] [Primary] operation on replicas [CF81C5D4-0DBF-4573-AB9D-018423FB92B2]->[0E5B3832-3DDA-41CB-9730-0F119639B535], database [JBSAG12], remote endpoint [TCP://JBSAG2.JBSWIKI.com:5022], source operation [708829B5-E20D-4112-9D88-050797FE9D91]: Transitioning from [PENDING] to [CHECK_IF_SEEDING_NEEDED].’
2022-06-14 05:08:04.130 spid44s A connection for availability group ‘JBAG12’ from availability replica ‘JBSAG1’ with id [CF81C5D4-0DBF-4573-AB9D-018423FB92B2] to ‘JBSAG2’ with id [0E5B3832-3DDA-41CB-9730-0F119639B535] has been successfully established. This is an informational message only. No user action is required.
2022-06-14 05:08:04.130 spid37s Always On Availability Groups connection with secondary database established for primary database ‘JBSAG12’ on the availability replica ‘JBSAG2’ with Replica ID: {0e5b3832-3dda-41cb-9730-0f119639b535}. This is an informational message only. No user action is required.
2022-06-14 05:08:04.130 spid68s Always On: DebugTraceVarArgs AR ‘[HADR] [Primary] operation on replicas [CF81C5D4-0DBF-4573-AB9D-018423FB92B2]->[0E5B3832-3DDA-41CB-9730-0F119639B535], database [JBSAG12], remote endpoint [TCP://JBSAG2.JBSWIKI.com:5022], source operation [708829B5-E20D-4112-9D88-050797FE9D91]: Transitioning from [CHECK_IF_SEEDING_NEEDED] to [CATCHUP].’
2022-06-14 05:08:04.150 spid68s Always On: DebugTraceVarArgs AR ‘[HADR] [Primary] operation on replicas [CF81C5D4-0DBF-4573-AB9D-018423FB92B2]->[0E5B3832-3DDA-41CB-9730-0F119639B535], database [JBSAG12], remote endpoint [TCP://JBSAG2.JBSWIKI.com:5022], source operation [708829B5-E20D-4112-9D88-050797FE9D91]: Transitioning from [CATCHUP] to [COMPLETED].’
2022-06-14 05:08:04.170 spid37s [DbMgrPartnerCommitPolicy::SetReplicaInfoInAg] DbMgrPartnerCommitPolicy::SetSyncAndRecoveryPoint: 0E5B3832-3DDA-41CB-9730-0F119639B535:6:4
2022-06-14 05:08:04.170 spid37s DbMgrPartnerCommitPolicy::SetSyncState: 0E5B3832-3DDA-41CB-9730-0F119639B535:6:4
2022-06-14 05:08:41.350 spid68s A connection timeout has occurred on a previously established connection to availability replica ‘JBSAG2’ with id [0E5B3832-3DDA-41CB-9730-0F119639B535]. Either a networking or a firewall issue exists or the availability replica has transitioned to the resolving role.
2022-06-14 05:08:41.350 spid68s Always On Availability Groups connection with secondary database terminated for primary database ‘JBSAG12’ on the availability replica ‘JBSAG2’ with Replica ID: {0e5b3832-3dda-41cb-9730-0f119639b535}. This is an informational message only. No user action is required.
2022-06-14 05:08:41.360 spid83s [DbMgrPartnerCommitPolicy::SetReplicaInfoInAg] DbMgrPartnerCommitPolicy::SetSyncAndRecoveryPoint: 0E5B3832-3DDA-41CB-9730-0F119639B535:6:1
2022-06-14 05:08:41.360 spid83s DbMgrPartnerCommitPolicy::SetSyncState: 0E5B3832-3DDA-41CB-9730-0F119639B535:6:1

JBSAG2

2022-06-14 05:07:07.030 spid76s A connection timeout has occurred on a previously established connection to availability replica ‘JBSAG1’ with id [CF81C5D4-0DBF-4573-AB9D-018423FB92B2]. Either a networking or a firewall issue exists or the availability replica has transitioned to the resolving role.
2022-06-14 05:07:07.030 spid76s Always On Availability Groups connection with primary database terminated for secondary database ‘JBSAG12’ on the availability replica ‘JBSAG1’ with Replica ID: {cf81c5d4-0dbf-4573-ab9d-018423fb92b2}. This is an informational message only. No user action is required.
2022-06-14 05:07:17.040 spid76s A connection timeout has occurred while attempting to establish a connection to availability replica ‘JBSAG1’ with id [CF81C5D4-0DBF-4573-AB9D-018423FB92B2]. Either a networking or firewall issue exists, or the endpoint address provided for the replica is not the database mirroring endpoint of the host server instance.
2022-06-14 05:08:04.130 spid31s A connection for availability group ‘JBAG12’ from availability replica ‘JBSAG2’ with id [0E5B3832-3DDA-41CB-9730-0F119639B535] to ‘JBSAG1’ with id [CF81C5D4-0DBF-4573-AB9D-018423FB92B2] has been successfully established. This is an informational message only. No user action is required.
2022-06-14 05:08:04.130 spid12s Always On: DebugTraceVarArgs AR ‘[HADR] [Secondary] operation on replicas [CF81C5D4-0DBF-4573-AB9D-018423FB92B2]->[0E5B3832-3DDA-41CB-9730-0F119639B535], database [JBSAG12], remote endpoint [TCP://JBSAG1.JBSWIKI.com:5022], source operation [708829B5-E20D-4112-9D88-050797FE9D91]: Transitioning from [PENDING] to [CHECK_IF_SEEDING_NEEDED].’
2022-06-14 05:08:04.130 spid85s Error: 41145, Severity: 16, State: 1.
2022-06-14 05:08:04.130 spid85s Cannot join database ‘JBSAG12’ to availability group ‘JBAG12’. The database has already joined the availability group. This is an informational message. No user action is required.
2022-06-14 05:08:04.130 spid85s Always On: DebugTraceVarArgs AR ‘[HADR] [Secondary] operation on replicas [CF81C5D4-0DBF-4573-AB9D-018423FB92B2]->[0E5B3832-3DDA-41CB-9730-0F119639B535], database [JBSAG12], remote endpoint [TCP://JBSAG1.JBSWIKI.com:5022], source operation [708829B5-E20D-4112-9D88-050797FE9D91]: Transitioning from [CHECK_IF_SEEDING_NEEDED] to [CATCHUP].’
2022-06-14 05:08:04.130 spid85s Always On: DebugTraceVarArgs AR ‘Processing BuildReplicaCatchup event with HADR Role: [SECONDARY]’
2022-06-14 05:08:04.130 spid85s Always On: DebugTraceVarArgs AR ‘[HADR] [Secondary] operation on replicas [CF81C5D4-0DBF-4573-AB9D-018423FB92B2]->[0E5B3832-3DDA-41CB-9730-0F119639B535], database [JBSAG12], remote endpoint [TCP://JBSAG1.JBSWIKI.com:5022], source operation [708829B5-E20D-4112-9D88-050797FE9D91]: Transitioning from [CATCHUP] to [COMPLETED].’
2022-06-14 05:08:04.130 spid31s Always On Availability Groups connection with primary database established for secondary database ‘JBSAG12’ on the availability replica ‘JBSAG1’ with Replica ID: {cf81c5d4-0dbf-4573-ab9d-018423fb92b2}. This is an informational message only. No user action is required.
2022-06-14 05:08:04.130 spid31s The recovery LSN (37:1200:1) was identified for the database with ID 6. This is an informational message only. No user action is required.
2022-06-14 05:08:32.050 spid52 The Database Mirroring endpoint has stopped listening for connections.
2022-06-14 05:08:34.070 spid52 The Database Mirroring endpoint is in disabled or stopped state.
2022-06-14 05:08:34.070 spid77s The connection between server instances ‘JBSAG2’ with id [0E5B3832-3DDA-41CB-9730-0F119639B535] and ‘JBSAG1’ with id [CF81C5D4-0DBF-4573-AB9D-018423FB92B2] has been disabled because the database mirroring endpoint was either disabled or stopped. Restart the endpoint by using the ALTER ENDPOINT Transact-SQL statement with STATE = STARTED.
2022-06-14 05:08:34.080 spid77s Always On Availability Groups connection with primary database terminated for secondary database ‘JBSAG12’ on the availability replica ‘JBSAG1’ with Replica ID: {cf81c5d4-0dbf-4573-ab9d-018423fb92b2}. This is an informational message only. No user action is required.
2022-06-14 05:09:09.370 spid84 ALTER DB param option: SUSPEND
2022-06-14 05:09:09.370 spid84 Always On Availability Groups data movement for database ‘JBSAG12’ has been suspended for the following reason: “user” (Source ID 0; Source string: ‘SUSPEND_FROM_USER’). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online.
2022-06-14 05:09:26.920 spid81 ALTER DB param option: RESUME
2022-06-14 05:09:26.920 spid81 Always On Availability Groups data movement for database ‘JBSAG12’ has been resumed. This is an informational message only. No user action is required.

-> From the SQL Server error log on secondary JBSAG2, we can see that mirroring endpoint is stopped.

-> Lets check the DMV’s for more details,

select r.replica_server_name, r.endpoint_url,
rs.connected_state_desc, rs.last_connect_error_description,
rs.last_connect_error_number, rs.last_connect_error_timestamp
from sys.dm_hadr_availability_replica_states rs join sys.availability_replicas r
on rs.replica_id=r.replica_id

JBSAG1

Connection attempt failed with error: ‘10060(A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.)’.

JBSAG2

The Database Mirroring endpoint is in disabled or stopped state.

-> Lets try starting the mirroring endpoint using below command on Secondary JBSAG2,

ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED 

-> Once above command completes, I can see databases on Availability group as synchronized,

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

Unable to access availability database ‘JBSAG12’ because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later. (Microsoft SQL Server, Error: 983)

-> Application was failing with below error,

TITLE: Connect to Database Engine
Cannot connect to JBSAG1.
ADDITIONAL INFORMATION:

Unable to access availability database ‘JBSAG12’ because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later. (Microsoft SQL Server, Error: 983)
For help, click: https://docs.microsoft.com/sql/relational-databases/errors-events/mssqlserver-983-database-engine-error


Microsoft ODBC Driver 17 for SQL Server : Unable to access availability database ‘JBSAG12’ because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later.


-> I checked the database and Always ON availability group status. Database was “Not Synchronizing” and Availability group was resolving,

SQL Server Errorlog

JBSAG1

2022-06-13 06:42:24.070 spid58 Always On: The local replica of availability group ‘JBAG12’ is preparing to transition to the resolving role. This is an informational message only. No user action is required.
2022-06-13 06:42:24.070 Server The availability group ‘JBAG12’ is being asked to stop the lease renewal because the availability group is going offline. This is an informational message only. No user action is required.
2022-06-13 06:42:24.070 spid58 The state of the local availability replica in availability group ‘JBAG12’ has changed from ‘PRIMARY_NORMAL’ to ‘RESOLVING_NORMAL’. The state changed because the availability group is going offline. The replica is going offline because the associated availability group has been deleted, or the user has taken the associated availability group offline in Windows Server Failover Clustering (WSFC) management console, or the availability group is failing over to another SQL Server instance. For more information, see the SQL Server error log or cluster log. If this is a Windows Server Failover Clustering (WSFC) availability group, you can also see the WSFC management console.
2022-06-13 06:42:24.070 spid78s Always On Availability Groups connection with secondary database terminated for primary database ‘JBSAG12’ on the availability replica ‘JBSAG2’ with Replica ID: {0e5b3832-3dda-41cb-9730-0f119639b535}. This is an informational message only. No user action is required.
2022-06-13 06:42:24.080 spid42s The availability group database “JBSAG12” is changing roles from “PRIMARY” to “RESOLVING” because the mirroring session or availability group failed over due to role synchronization. This is an informational message only. No user action is required.
2022-06-13 06:42:24.090 spid42s State information for database ‘JBSAG12’ – Hardened Lsn: ‘(37:1184:1)’ Commit LSN: ‘(37:1144:5)’ Commit Time: ‘Jun 12 2022 4:48AM’
2022-06-13 06:42:24.090 spid42s Microsoft Distributed Transaction Coordinator (MS DTC) resource manager [JBSAG12] has been released. This is an informational message only. No user action is required.
2022-06-13 06:44:55.330 Logon Error: 983, Severity: 14, State: 1.
2022-06-13 06:44:55.330 Logon Unable to access availability database ‘JBSAG12’ because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later.
2022-06-13 06:55:06.730 Logon Error: 983, Severity: 14, State: 1.
2022-06-13 06:55:06.730 Logon Unable to access availability database ‘JBSAG12’ because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later.

JBSAG2

2022-06-13 06:36:11.050 spid25s The recovery LSN (37:1176:1) was identified for the database with ID 6. This is an informational message only. No user action is required.
2022-06-13 06:42:23.960 spid29s Always On: The local replica of availability group ‘JBAG12’ is going offline because the corresponding resource in the Windows Server Failover Clustering (WSFC) cluster is no longer online. This is an informational message only. No user action is required.
2022-06-13 06:42:23.960 spid29s The state of the local availability replica in availability group ‘JBAG12’ has changed from ‘SECONDARY_NORMAL’ to ‘RESOLVING_NORMAL’. The state changed because the availability group state has changed in Windows Server Failover Clustering (WSFC). For more information, see the SQL Server error log or cluster log. If this is a Windows Server Failover Clustering (WSFC) availability group, you can also see the WSFC management console.
2022-06-13 06:42:23.960 spid41s Always On Availability Groups connection with primary database terminated for secondary database ‘JBSAG12’ on the availability replica ‘JBSAG1’ with Replica ID: {cf81c5d4-0dbf-4573-ab9d-018423fb92b2}. This is an informational message only. No user action is required.
2022-06-13 06:42:23.960 spid47s The availability group database “JBSAG12” is changing roles from “SECONDARY” to “RESOLVING” because the mirroring session or availability group failed over due to role synchronization. This is an informational message only. No user action is required.
2022-06-13 06:42:23.960 spid47s State information for database ‘JBSAG12’ – Hardened Lsn: ‘(37:1184:1)’ Commit LSN: ‘(37:1144:5)’ Commit Time: ‘Jun 12 2022 4:48AM’

-> In my case, there was a network issue wherein the nodes were removed from cluster due to quorum loss and then came back online automatically. But the Always On Cluster role went down as shown in below screenshot,

-> I was sure that Cluster was online and also that the nodes are all available. I right clicked on Role “JBAG12” and clicked on “Start Role”. it came online without issues,

-> Availability group came online and everything was working fine,

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.