SQL Server 2022 In-Place Upgrade Fails with "Install the .NET Framework 2.0 or 4.0"

Introduction:
Upgrading SQL Server is a crucial task that keeps your database platform current and secure. Recently, during an in-place upgrade from SQL Server 2016 to SQL Server 2022, an unexpected error surfaced and halted the process.

Error Encountered:
The upgrade process halted with the following error message: "The Microsoft .NET Framework 2.0 or 4.0 must be installed on your computer before you can install Microsoft SQL Server 2022 Setup Support Files. Please install the .NET Framework 2.0 or 4.0 and then install Microsoft SQL Server 2022 Setup Support Files."

The following error was found in the installation summary:

Feature: Database Engine Services
Status: Failed
Reason for failure: An error occurred for a dependency of the feature causing the setup process for the feature to fail.
Next Step: Use the following information to resolve the error, and then try the setup process again.
Component name: SQL Server Setup Support Files
Component error code: -2147483648
Component log file: C:\Program Files\Microsoft SQL Server\160\Setup Bootstrap\Log\20231027_113110\SqlSupport_Cpu64_1.log
Error description: The Microsoft .NET Framework 2.0 or 4.0 must be installed on your computer before you can install Microsoft SQL Server 2022 Setup Support Files. Please install the .NET Framework 2.0 or 4.0 and then install Microsoft SQL Server 2022 Setup Support Files.
Error help link: https://go.microsoft.com/fwlink?LinkId=20476&ProdName=Microsoft+SQL+Server&EvtSrc=setup.rll&EvtID=50000&ProdVer=16.0.1000.6&EvtType=SqlSupport.msi%40CA_ErrorPrereqDotNet20Or40%40-2147483648

-> The following messages were found in C:\Program Files\Microsoft SQL Server\160\Setup Bootstrap\Log\20231027_113110\SqlSupport_Cpu64_1.log:

MSI (s) (74:04) [12:15:03:056]: Doing action: CA_ErrorPrereqDotNet20Or40
Action ended 12:15:03: IsDotNet20Or40Installed. Return value 1.
Action start 12:15:03: CA_ErrorPrereqDotNet20Or40.
MSI (s) (74:04) [12:18:22:923]: Product: Microsoft SQL Server 2022 Setup (English) -- The Microsoft .NET Framework 2.0 or 4.0 must be installed on your computer before you can install Microsoft SQL Server 2022 Setup Support Files. Please install the .NET Framework 2.0 or 4.0 and then install Microsoft SQL Server 2022 Setup Support Files.

The Microsoft .NET Framework 2.0 or 4.0 must be installed on your computer before you can install Microsoft SQL Server 2022 Setup Support Files. Please install the .NET Framework 2.0 or 4.0 and then install Microsoft SQL Server 2022 Setup Support Files.
Action ended 12:18:22: CA_ErrorPrereqDotNet20Or40. Return value 3.
Action ended 12:18:22: INSTALL. Return value 3.
Property(S): DiskPrompt = [1]
Property(S): InstallMode = Typical
Property(S): UpgradeCode = {DFF9DA09-88B8-4A80-BE2B-78A4DA9835FA}
Property(S): WelcomeBmp = WelcomeBmp
Property(S): OPENREADME = 1
Property(S): TARGETDIR = C:\

Investigation:
On inspection, the server appeared to meet the requirement: .NET Framework 4.8 was installed, which satisfies the 4.0 prerequisite. Puzzled by this discrepancy, we examined the server configuration more closely.
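
If you want to verify the installed .NET Framework version yourself, the registry is one way to do it. Below is a minimal PowerShell sketch; the Release-to-version mapping follows Microsoft's published table, where a Release value of 528040 or higher indicates .NET Framework 4.8 or later.

    # Read the .NET Framework 4.x Release number from the registry.
    $key = 'HKLM:\SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full'
    $release = (Get-ItemProperty -Path $key -Name Release -ErrorAction SilentlyContinue).Release

    if ($release -ge 528040) { ".NET Framework 4.8 or later is installed (Release = $release)." }
    elseif ($release)        { ".NET Framework 4.x is installed but older than 4.8 (Release = $release)." }
    else                     { "No .NET Framework 4.5 or later was detected." }

Here the check confirmed a Release value corresponding to .NET Framework 4.8, which made the prerequisite failure all the more puzzling.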

Troubleshooting Steps:

  1. Service disabling: Two third-party services, Avecto Defendpoint Service and Defendpoint BeyondInsight, were identified as potential culprits. Both were stopped and disabled to rule out interference with the installer (see the sketch after this list).
  2. Elevated run: The upgrade was restarted with elevated privileges by launching setup with "Run as administrator", so that the installer had full administrative access.
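
The same steps can be scripted. A hedged PowerShell sketch, run from an elevated session; the Defendpoint display-name wildcard and the D:\setup.exe media path are assumptions, so substitute your own values:

    # Stop and disable the privilege-management services (display names assumed).
    $services = Get-Service -DisplayName '*Defendpoint*' -ErrorAction SilentlyContinue
    foreach ($svc in $services) {
        Stop-Service -Name $svc.Name -Force
        Set-Service  -Name $svc.Name -StartupType Disabled
    }

    # Relaunch SQL Server setup with elevation from the installation media.
    Start-Process -FilePath 'D:\setup.exe' -Verb RunAs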

Resolution:
Executing the upgrade with administrative privileges resolved the issue. The process completed successfully, and SQL Server 2022 was installed without further errors. Once the upgrade is done, remember to re-enable any services that were disabled during troubleshooting.
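
To undo the temporary change afterwards, a small sketch along the same lines (again assuming the Defendpoint display-name pattern, and that the services were originally set to start automatically):

    # Re-enable and restart the services that were disabled for the upgrade.
    Get-Service -DisplayName '*Defendpoint*' -ErrorAction SilentlyContinue | ForEach-Object {
        Set-Service   -Name $_.Name -StartupType Automatic
        Start-Service -Name $_.Name
    }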

Conclusion:
Unexpected errors can surface during system upgrades. Methodical troubleshooting, such as disabling conflicting third-party services and running setup with administrative privileges, can lead to a quick resolution. This experience underscores the importance of careful preparation and proactive problem-solving in the face of unforeseen obstacles. 🚀

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided "AS IS" with no warranties and confer no rights.

Always On Availability Group in RESOLVING State

Issue

-> The application team reported that their application could not connect to a database that is part of an Always On Availability Group.

-> The Always On Availability Group was in RESOLVING state in SQL Server Management Studio.
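
-> The replica state can also be checked without the SSMS dashboard by querying the availability group DMVs. A minimal sketch using Invoke-Sqlcmd from the SqlServer PowerShell module, connecting to one of the nodes named later in this post (adjust the instance name as needed):

    $query = '
    SELECT ag.name AS ag_name,
           ars.role_desc,
           ars.operational_state_desc,
           ars.synchronization_health_desc
    FROM sys.dm_hadr_availability_replica_states AS ars
    JOIN sys.availability_groups AS ag ON ag.group_id = ars.group_id;
    '
    Invoke-Sqlcmd -ServerInstance 'JBSAG1' -Query $query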

-> The Availability Group role was in Pending state in Failover Cluster Manager.
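
-> The same pending state is visible from PowerShell with the Get-ClusterResource cmdlet that the cluster events below also mention:

    # Requires the FailoverClusters module (part of the failover clustering tools).
    Get-ClusterGroup -Name 'JBSAG' | Get-ClusterResource |
        Select-Object Name, ResourceType, State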

-> The following messages were found in the Event Viewer:

Event ID: 41144
The local availability replica of availability group ‘JBSAG’ is in a failed state. The replica failed to read or update the persisted configuration data (SQL Server error: 41005). To recover from this failure, either restart the local Windows Server Failover Clustering (WSFC) service or restart the local instance of SQL Server.

Event ID: 1205
The Cluster service failed to bring clustered role ‘JBSAG’ completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.

Event ID: 1069
Cluster resource ‘JBSAG’ of type ‘SQL Server Availability Group’ in clustered role ‘JBSAG’ failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Event ID: 7043
The Cluster Service service did not shut down properly after receiving a preshutdown control.

-> The following errors were found in the SQL Server error log:

Error: 41022, Severity: 16, State: 0.
Failed to create a Windows Server Failover Clustering (WSFC) notification port with notification filter 778567686 and notification key 3 (Error code 5073). If this is a WSFC availability group, the WSFC service may not be running or may not be accessible in its current state, or the specified arguments are invalid. Otherwise, contact your primary support provider. For information about this error code, see “System Error Codes” in the Windows Development documentation.
Always On: The availability replica manager is going offline because the local Windows Server Failover Clustering (WSFC) node has lost quorum. This is an informational message only. No user action is required.
Always On: The local replica of availability group ‘JBSAG’ is stopping. This is an informational message only. No user action is required.

Error: 41066, Severity: 16, State: 0.
Cannot bring the Windows Server Failover Clustering (WSFC) resource (ID ‘ee50bbc1-93ab-4f25-85e5-a7d245555183’) online (Error code 126). If this is a WSFC availability group, the WSFC service may not be running or may not be accessible in its current state, or the WSFC resource may not be in a state that could accept the request. Otherwise, contact your primary support provider. For information about this error code, see “System Error Codes” in the Windows Development documentation.

Error: 41160, Severity: 16, State: 0.
Failed to designate the local availability replica of availability group ‘JBSAG’ as the primary replica. The operation encountered SQL Server error 41066 and has been terminated. Check the preceding error and the SQL Server error log for more details about the error and corrective actions.

Error: 41017, Severity: 16, State: 1.
Failed to add a node to the possible owner list of a Windows Server Failover Clustering (WSFC) resource (Error code 5908). If this is a WSFC availability group, the WSFC service may not be running or may not be accessible in its current state, or the specified cluster resource or node handle is invalid. Otherwise, contact your primary support provider.

-> Cluster.log did not contain many details beyond the AG role failing.

-> All of the messages and errors were generic and did not point to a root cause.

-> We ran Process Monitor and found that the DBA team had renamed C:\windows\system32\hadrres.dll to hadrres_old.dll on both JBSAG1 and JBSAG2. They had renamed it because patching was failing with an error that hadrres.dll was in use by another process, and they forgot to rename it back afterwards. With the Availability Group cluster resource DLL missing, the cluster could not bring the AG online; in hindsight, error code 126 in the 41066 message above is the Windows ERROR_MOD_NOT_FOUND code ("The specified module could not be found"). Renaming the file back to hadrres.dll resolved the issue.
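
-> For anyone hitting the same symptoms, a quick hedged check for this specific failure mode, run elevated on each cluster node:

    # hadrres.dll is the Availability Group cluster resource DLL.
    $dll = 'C:\Windows\System32\hadrres.dll'
    $old = 'C:\Windows\System32\hadrres_old.dll'

    if (-not (Test-Path $dll) -and (Test-Path $old)) {
        # Put the renamed resource DLL back so the AG resource can load again.
        Rename-Item -Path $old -NewName 'hadrres.dll'
    }
    Get-Item -Path $dll | Select-Object Name, Length, LastWriteTime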


Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided "AS IS" with no warranties and confer no rights.

Always On – Troubleshooting Availability Group Database Stuck in Initializing / In Recovery Mode

Introduction

In this blog post, we will discuss an issue encountered with an Always On Availability Group where a database remained stuck in the Initializing / In Recovery state on the secondary replica after a failover. We will explore the steps taken to resolve the problem and provide insights into the underlying causes.

Issue

During a failover from the primary to the secondary replica in an Always On Availability Group, one particular database entered the Initializing / In Recovery state on the secondary replica. Despite restarting SQL services and attempting to suspend data movement, the database remained stuck in this state.
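
To see exactly what the secondary reports for the affected database, the database replica state DMV can be queried. A minimal sketch using Invoke-Sqlcmd from the SqlServer PowerShell module, with a hypothetical instance name (replace SECONDARY-REPLICA with your secondary replica):

    $query = '
    SELECT DB_NAME(drs.database_id) AS database_name,
           drs.synchronization_state_desc,
           drs.database_state_desc
    FROM sys.dm_hadr_database_replica_states AS drs
    WHERE drs.is_local = 1;
    '
    Invoke-Sqlcmd -ServerInstance 'SECONDARY-REPLICA' -Query $query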

Resolution

  1. Logging off users and restarting SQL services: Initially, all users were logged off from the secondary replica, and the SQL services on the replica were restarted. However, these steps did not resolve the issue.
  2. Suspending data movement: The next approach was to suspend data movement for the problematic database from SQL Server Management Studio on the secondary replica. However, this operation failed with a "lock request time out period exceeded" error. (The equivalent T-SQL command appears in the sketch after this list.)
  3. Identifying the blocking process: To investigate further, the following query was executed to identify blocking processes:
    SELECT * FROM sys.sysprocesses WHERE blocked > 0;
    It showed that the DB STARTUP thread for the affected database was blocking the SPID that was attempting to suspend data movement (a DMV-based alternative appears in the sketch after this list).
  4. Checking recovery status: The SQL Server error log was examined to determine if the database’s recovery had completed. The log indicated that recovery for the database had finished in 43 seconds, with no user action required.
    Recovery completed for database JB_DB (database ID 7) in 43 second(s) (analysis 64 ms, redo 324 ms, undo 0 ms.) This is an informational message only. No user action is required.
  5. Analyzing DB Startup thread wait type: Monitoring the DB Startup thread revealed that it was in a wait state with the HADR_RECOVERY_WAIT_FOR_UNDO type. This wait type occurs when the secondary database is waiting for the reverting and initializing phases to synchronize with the primary database after failover.
  6. Monitoring undo progress: The "Log remaining for undo" performance monitor counter was used to track the amount of log, in kilobytes, remaining for the undo phase. Surprisingly, the counter showed 0, indicating that there was nothing to undo, and the recovery-completed message in the SQL Server error log confirmed that the undo phase took 0 milliseconds. (This counter can also be read via a DMV; see the sketch after this list.)
  7. Failing over to another synchronous secondary replica: To further troubleshoot the issue, a failover was performed from the primary replica to another synchronous secondary replica. This time, all databases returned to a synchronized state without any issues.
  8. Restarting the SQL Server instance on the primary replica: If there is no additional synchronous secondary replica to fail over to, restarting the SQL Server instance on the primary replica works around the issue.
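
For convenience, the checks from steps 2, 3, 5 and 6 above can be consolidated into one script against the stuck secondary. A sketch using Invoke-Sqlcmd from the SqlServer PowerShell module; the instance name is hypothetical, while database JB_DB comes from the error-log excerpt in step 4:

    $instance = 'SECONDARY-REPLICA'   # replace with the stuck secondary replica

    # Steps 3 and 5: blocking sessions, plus the DB STARTUP thread and its wait type.
    $blocking = "
    SELECT session_id, blocking_session_id, command, wait_type, wait_time
    FROM sys.dm_exec_requests
    WHERE blocking_session_id > 0 OR command = 'DB STARTUP';
    "
    Invoke-Sqlcmd -ServerInstance $instance -Query $blocking

    # Step 6: the 'Log remaining for undo' counter (in KB), exposed as a DMV.
    $undo = "
    SELECT instance_name AS database_name, cntr_value AS log_remaining_for_undo_kb
    FROM sys.dm_os_performance_counters
    WHERE counter_name LIKE 'Log remaining for undo%';
    "
    Invoke-Sqlcmd -ServerInstance $instance -Query $undo

    # Step 2: suspend (and later resume) data movement in T-SQL instead of SSMS.
    Invoke-Sqlcmd -ServerInstance $instance -Query 'ALTER DATABASE [JB_DB] SET HADR SUSPEND;'
    # Invoke-Sqlcmd -ServerInstance $instance -Query 'ALTER DATABASE [JB_DB] SET HADR RESUME;'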

Summary

The issue of a database stuck in the Initializing / In Recovery state after an Always On Availability Group failover was resolved by failing over to another synchronous secondary replica. Although the root cause and exact timing of the issue remain unknown, this post aims to help others in the community who encounter similar behavior. This problem has been identified as a bug and is documented here.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided "AS IS" with no warranties and confer no rights.