CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the physical file ‘C:\SQL\JBDB1.ndf’

Environment

> Database server JBSAG1 and JBSAG2 are on primary datacentre located in East US.

-> Database server JBSAG3 is on secondary datacentre in Korea Central.

-> Database servers JBSAG1, JBSAG2 and JBSAG3 are part of failover cluster without shared storage.

-> Database JBDB on replicas JBSAG2 and JBSAG3 exhibits a status of “Not Synchronizing / Suspect”. Below is the snapshot from SQL Server Management Studio and Always on Dashboard,

-> Checking SQL server error log on Synchronous Replica JBSAG2,

2021-02-20 15:10:23.760 spid43s Always On Availability Groups connection with primary database established for secondary database ‘JBDB’ on the availability replica ‘JBSAG1’ with Replica ID: {66be35ff-3906-4c64-bfbd-d88618df4221}. This is an informational message only. No user action is required.
2021-02-20 15:10:23.760 spid43s The recovery LSN (37:4698:1) was identified for the database with ID 5. This is an informational message only. No user action is required.
2021-02-20 15:10:24.790 spid39s Error: 5123, Severity: 16, State: 1.
2021-02-20 15:10:24.790 spid39s CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the physical file ‘C:\SQL\JBDB1.ndf’.
2021-02-20 15:10:24.790 spid39s Error: 5123, Severity: 16, State: 1.
2021-02-20 15:10:24.790 spid39s CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the physical file ‘C:\SQL\JBDB1.ndf’.
2021-02-20 15:10:24.790 spid39s Error: 5123, Severity: 16, State: 1.
2021-02-20 15:10:24.790 spid39s CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the physical file ‘C:\SQL\JBDB1.ndf’.

2021-02-20 15:10:24.800 spid39s Error: 5183, Severity: 16, State: 1.
2021-02-20 15:10:24.800 spid39s Cannot create the file “JBDB1”. Use WITH MOVE to specify a usable physical file name. Use WITH REPLACE to overwrite an existing file.
2021-02-20 15:10:24.800 spid39s Always On Availability Groups data movement for database ‘JBDB’ has been suspended for the following reason: “system” (Source ID 2; Source string: ‘SUSPEND_FROM_REDO’). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online.
2021-02-20 15:10:24.800 spid39s Parallel redo is shutdown for database ‘JBDB’ with worker pool size [1].
2021-02-20 15:10:24.810 spid39s Error: 3313, Severity: 21, State: 2.
2021-02-20 15:10:24.810 spid39s During redoing of a logged operation in database ‘JBDB’ (page (0:0) if any), an error occurred at log record ID (37:4725:1). Typically, the specific failure is previously logged as an error in the operating system error log. Restore the database from a full backup, or repair the database.

-> Checking SQL server error log on Asynchronous Replica JBSAG3,

2021-02-20 15:13:56.700 spid43s Always On Availability Groups connection with primary database established for secondary database ‘JBDB’ on the availability replica ‘JBSAG1’ with Replica ID: {66be35ff-3906-4c64-bfbd-d88618df4221}. This is an informational message only. No user action is required.
2021-02-20 15:13:56.880 spid43s The recovery LSN (37:4698:1) was identified for the database with ID 5. This is an informational message only. No user action is required.
2021-02-20 15:13:58.200 spid58s Error: 5123, Severity: 16, State: 1.
2021-02-20 15:13:58.200 spid58s CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the physical file ‘C:\SQL\JBDB1.ndf’.
2021-02-20 15:13:58.200 spid58s Error: 5123, Severity: 16, State: 1.
2021-02-20 15:13:58.200 spid58s CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the physical file ‘C:\SQL\JBDB1.ndf’.
2021-02-20 15:13:58.200 spid58s Error: 5123, Severity: 16, State: 1.
2021-02-20 15:13:58.200 spid58s CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the physical file ‘C:\SQL\JBDB1.ndf’.

2021-02-20 15:13:58.200 spid58s Error: 5183, Severity: 16, State: 1.
2021-02-20 15:13:58.200 spid58s Cannot create the file “JBDB1”. Use WITH MOVE to specify a usable physical file name. Use WITH REPLACE to overwrite an existing file.
2021-02-20 15:13:58.220 spid58s Always On Availability Groups data movement for database ‘JBDB’ has been suspended for the following reason: “system” (Source ID 2; Source string: ‘SUSPEND_FROM_REDO’). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online.
2021-02-20 15:13:58.220 spid58s Parallel redo is shutdown for database ‘JBDB’ with worker pool size [1].
2021-02-20 15:13:58.220 spid58s Error: 3313, Severity: 21, State: 2.
2021-02-20 15:13:58.220 spid58s During redoing of a logged operation in database ‘JBDB’ (page (0:0) if any), an error occurred at log record ID (37:4725:1). Typically, the specific failure is previously logged as an error in the operating system error log. Restore the database from a full backup, or repair the database.

-> From the error message it is clear that an additional data file C:\SQL\JBDB1.ndf was added in the primary replica. The created secondary file is created fine on the primary replica, but it is not getting created on the secondary replica due to error “The system cannot find the path specified.”.

-> I checked if location C:\SQL is valid and present on JBSAG2 and JBSAG3. That location was not present in JBSAG2 and JBSAG3. I created those location on each of these servers.

-> I checked the status of the database JBDB and it was suspended due to this error,

-> I resumed the data movement using below query,

:CONNECT JBSAG2
ALTER DATABASE [JBDB] SET HADR RESUME;
GO

:CONNECT JBSAG3
ALTER DATABASE [JBDB] SET HADR RESUME;
GO

-> Issue got resolved. Below is the status of Availability group dashboard,

-> That was an easy fix. But what if the file that was created on primary replica on a specific drive that is not available on Secondary replicas. For example, NDF or LDF file is created on H:\SQL folder and H: drive is not present in secondary replicas.

-> One option will be to remove the database from Always On and then restore the database to required drive and configure Always ON all over again for that database. This process will take good amount of time depending on the database size and also the database will be left with no HA/DR till the configuration is completed.

-> Second option will be to Add drive H: on secondary replicas.

-> Third option will be to follow below steps. This can only be possible if you have not already added the additional data or log file already. You cannot follow this approach if you have already added the files and you are seeing “CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the physical file” message in secondary replica.

-> Consider a scenario where you have to add a secondary data file or log file to a location in Primary replica that is not available in secondary replica. In this case I will try reproducing the issue by adding an additional log file at primary on a location that is not available in secondary replicas. I am adding additional log file due to below reason,

  1. The drive that holds the LDF file is full. we dont have more space on that drive. Adding more disk space is not possible at this moment. Tlog backup cannot free up space due to Availability group.
  2. The size of log file has reached 2 TB. Tlog backup cannot free up space due to Availability group. Maximum size for a TLOG file can be 2 TB only.

-> Please note that adding additional log files to a database is not recommended and is done strictly for abover reasons.

-> The below procedure can be followed for both data and log file if the location where the new file that will be created in primary replica will not be available at secondary replica.

-> Remove the secondary database from Availability group using below query,

:CONNECT JBSAG2
ALTER DATABASE [JBDB] SET HADR OFF;
GO

:CONNECT JBSAG3
ALTER DATABASE [JBDB] SET HADR OFF;
GO

-> The database in secondary replica will be in restoring state,

-> Now add the additional log file in primary replica using below query.

USE [master]
GO
ALTER DATABASE [JBDB] ADD LOG FILE 
( NAME = N'JBDB1_Log', FILENAME = N'F:\SQL\LOG\JBDB1_Log.ldf' 
, SIZE = 512000KB , FILEGROWTH = 512000KB )
GO

-> Perform a log backup using below query on database JBDB in primary replica,

BACKUP LOG [JBDB] TO  DISK = N'\\jbsdc\Logship\JBDB_20Feb2021_addfile.trn' WITH NOFORMAT, NOINIT,  NAME = N'JBDB-Full Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 10
GO

-> Restore the log backup on secondary replica JBSAG2 and JBSAG3 using with Move option to move the LDF file to a location that is available in secondary replicas,

:CONNECT JBSAG2
RESTORE LOG [JBDB] 
FROM  DISK = N'\\jbsdc\Logship\JBDB_20Feb2021_addfile.trn' 
WITH  FILE = 1
,MOVE N'JBDB1_Log' TO N'G:\log\JBDB1_Log.ldf'
,NORECOVERY,  STATS = 10
GO

:CONNECT JBSAG3
RESTORE LOG [JBDB] 
FROM  DISK = N'\\jbsdc\Logship\JBDB_20Feb2021_addfile.trn' 
WITH  FILE = 1
,MOVE N'JBDB1_Log' TO N'G:\log\JBDB1_Log.ldf'
,NORECOVERY,  STATS = 10
GO

-> Join back the secondary databases to availability group,

-> Check the Always On availability group dashboard to check if everything works fine,

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

This secondary replica is not connected to the primary replica. The connected state is DISCONNECTED

Environment

-> Database server JBSAG2 and JBSAG1 are on primary datacentre located in East US.

-> Database server JBSAG3 is on secondary datacentre in Korea Central.

-> Database servers JBSAG1, JBSAG2 and JBSAG3 are part of failover cluster without shared storage.

-> Database in Asynchronous secondary replica JBSAG3 exhibits a status of “Not Synchronizing”. Below is the snapshot of Always on Dashboard,

-> Below are the messages that can be found,

This secondary replica is not connected to the primary replica. The connected state is DISCONNECTED.

At least one availability database on this availability replica has an unhealthy data synchronization state. If this is an asynchronous-commit availability replica, all availability databases should be in the SYNCHRONIZING state. If this is a synchronous-commit availability replica, all availability databases should be in the SYNCHRONIZED state.

The data synchronization state of this availability database is unhealthy. On an asynchronous-commit availability replica, every availability database should be in the SYNCHRONIZING state. On a synchronous-commit replica, every availability database should be in the SYNCHRONIZED state.

-> Details from SQL Server Error log in Primary JBSAG2.

2021-02-09 15:46:54.270 spid34s Automatic seeding of availability database ‘JBDB’ in availability group ‘JBSwiki’ failed with a transient error. The operation will be retried.
2021-02-09 15:46:54.270 spid34s Always On: DebugTraceVarArgs AR ‘[HADR] [Primary] operation on replicas [698E9144-D01D-4D34-B898-7FBC32873ABD]->[9B4A72A5-618B-4A9B-81D0-A54826AAEE36], database [JBDB], remote endpoint [TCP://11.0.2.41:5022], source operation [0EFB986D-7156-4BEE-82DC-162265BC3CD8]: Seeding task failed with result 0x80000000.’
2021-02-09 15:46:54.270 spid34s Always On: DebugTraceVarArgs AR ‘Seeding is canceled with cancelReason = 108
2021-02-09 15:46:54.270 spid34s Always On: DebugTraceVarArgs AR ‘[HADR] [Primary] operation on replicas [698E9144-D01D-4D34-B898-7FBC32873ABD]->[9B4A72A5-618B-4A9B-81D0-A54826AAEE36], database [JBDB], remote endpoint [TCP://11.0.2.41:5022], source operation [0EFB986D-7156-4BEE-82DC-162265BC3CD8]: Transitioning from [CHECK_IF_SEEDING_NEEDED] to [FAILED].’
2021-02-09 15:46:54.280 spid34s Always On: DebugTraceVarArgs AR ‘[HADR] [Primary] operation on replicas [698E9144-D01D-4D34-B898-7FBC32873ABD]->[9B4A72A5-618B-4A9B-81D0-A54826AAEE36], database [JBDB], remote endpoint [TCP://11.0.2.41:5022], source operation [0EFB986D-7156-4BEE-82DC-162265BC3CD8]: Seeding encountered a transient failure, state ‘108’, maximum retries exceeded’
2021-02-09 15:47:23.960 spid14s Always On: DebugTraceVarArgs AR ‘[HADR] [Primary] operation on replicas [698E9144-D01D-4D34-B898-7FBC32873ABD]->[9B4A72A5-618B-4A9B-81D0-A54826AAEE36], database [JBDB], remote endpoint [TCP://11.0.2.41:5022], source operation [D9301C18-F2D7-4CBE-A2AD-D688D02EA614]: Transitioning from [PENDING] to [CHECK_IF_SEEDING_NEEDED].’

-> Executing below query and checking the status of all replicas,

select r.replica_server_name, r.endpoint_url,
rs.connected_state_desc, rs.last_connect_error_description,
rs.last_connect_error_number, rs.last_connect_error_timestamp
from sys.dm_hadr_availability_replica_states rs join sys.availability_replicas r
on rs.replica_id=r.replica_id

-> Last_Connect_error_description is as below,

Connection attempt failed with error: ‘10060(A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.)’.

-> Last_Connect_error_number is 10060 which refers to “connection timed out”.

-> It seems like connection to secondary replica cannot be made by availability group.

-> Messages in SQL server error log shows that there is an issue with endpoint URL [TCP://11.0.2.41:5022] which is for secondary replica JBSAG3.

-> I checked the hosts file in location C:\Windows\System32\drivers\etc to make sure there are no invalid entries for JBSAG3 replica. As expected, there were no bad entries for JBSAG3 in hosts file.

-> I tried pinging JBSAG3 from command prompt to see if it is resolving to correctly.

-> The IP address for replica JBSAG3 is 11.0.2.4. But it seems like the endpoint URL is TCP://11.0.2.41:5022. It seems like the endpoint URL is wrong.

-> I changed it to reflect the correct IP using below query on Primary replica JBSAG2,

USE MASTER
GO
ALTER AVAILABILITY GROUP [JBSwiki] 
MODIFY REPLICA ON 'JBSAG3' WITH (ENDPOINT_URL = 'TCP://11.0.2.4:5022');

-> Once after the above changes. Always On availability group turned healthy.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

Always On – Modify Availability group Endpoint URL

Environment

->  JBSAG1, JBSAG2 and JBSAG3 are part of a failover cluster without shared storage. Alwayson Availability group is configured between JBSAG1, JBSAG2 and JBSAG3. JBSAG1 is the current primary, JBSAG2 and JBSAG3 are synchronous secondary.

-> The requirement is to change the endpoint url to a different IP.

-> The reason is that we have a separate network link just for replication related activities that can be utilized for Always On and that has fewer checkpoints\firewall which will improve performance.

-> Below screenshot shows the current endpoint URL’s,

-> On the primary replica, I executed below TSQL,

USE MASTER
GO

ALTER AVAILABILITY GROUP [JBSwiki] 
MODIFY REPLICA ON 'JBSAG1' WITH (ENDPOINT_URL = 'TCP://10.0.2.4:5022');

ALTER AVAILABILITY GROUP [JBSwiki] 
MODIFY REPLICA ON 'JBSAG2' WITH (ENDPOINT_URL = 'TCP://10.0.2.6:5022');

ALTER AVAILABILITY GROUP [JBSwiki] 
MODIFY REPLICA ON 'JBSAG3' WITH (ENDPOINT_URL = 'TCP://11.0.2.4:5022');

-> I just have a single Availability group configured and the above code did the trick. If you have multiple AG’s, then you have to modify the above to script to properly reflect the AG name and execute for each available Availability group.

-> Checked the Always On Availability group dashboard and it was all fine.

-> Below screenshot shows the endpoint URL’s after above change,

-> It was an online operation. I did not even suspended the data movement. Everything went fine.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.