Select Query on Secondary Replica getting blocked with wait type HADR_DATABASE_WAIT_FOR_TRANSITION_TO_VERSIONING


-> Application team complained of simple select queries getting blocked. They advised that select queries that return a single row are running indefinitely.

-> These select queries are run against secondary replica. My first guess was that the REDO thread might be blocking these select queries.

-> I tried below query on secondary replica to see if there were any blockings and to my surprise there were no blockings.

select * from sysprocesses where blocked<>0

-> Application team advised that things were working fine before 30 minutes when they were executing the same select queries on primary replica. They are experiencing this issue when they started executing these select queries on secondary replica.

-> This is when I realized that “Readable Secondary” property was set to “Yes” recently as we wanted to move all Read-Only workloads to Secondary replica.

-> I tried executing a simple select that the application team is executing and was able to reproduce the behavior. My session_id was 60. I executed below query to check what this session was doing,

select DB_name(database_id) DBName, session_id,status, wait_type,last_wait_type,blocking_session_id 
from sys.dm_exec_requests where session_id =60

-> Wait type “HADR_DATABASE_WAIT_FOR_TRANSITION_TO_VERSIONING” happens when a query on object(s) in a readable secondary database of an Always On availability group is blocked on row versioning while waiting for commit or rollback of all transactions that were in-flight when the secondary replica was enabled for read workloads. This wait type guarantees that row versions are available before execution of a query under snapshot isolation. Please check this article for more details.

-> It is clear that the queries on secondary replica will be blocked when all active transactions active on primary replica at the time of enabling readable secondary have been committed and processed on the secondary replica.

-> I communicated this to the application team and understood from them that there are some active updates executing on Primary replica. I check the primary replica and was able to see the update statement active.

-> We waited for the update to complete and made sure the databases as part of Availability group are synchronized. After this we were able to execute the select statement without any issues.

-> From this instance we started enabling “Readable Secondary” during after business hours or during few/NO user activity.

Hope this helps!!!

Thank You,
Vivek Janakiraman

The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

Leave a Reply