Azure Databricks Series: Displaying All Serverless SQL Warehouses in Your Workspace

When working with Azure Databricks, it’s often necessary to programmatically retrieve and manage metadata about your compute resources. One such resource is the Serverless SQL Warehouse, designed for cost-effective and scalable interactive analytics.

In this blog, we’ll walk through a step-by-step Python script that helps you list all Serverless SQL Warehouses in your Databricks workspace using the REST API and persist the results in a Delta table for further analysis.

πŸ” Why Monitor Serverless SQL Warehouses?

Serverless SQL Warehouses are a key part of many organizations’ data strategies due to:

  • Auto-scaling capabilities
  • No infrastructure management
  • Pay-per-use pricing model

By tracking your serverless SQL endpoints, you can gain insights into:

  • Who created them
  • Their sizes and memory footprints
  • Auto-stop configurations
  • Their current state (running/stopped)

πŸ› οΈ Solution Overview

We’ll use the Databricks SQL Endpoints API to get the list of all SQL Warehouses, filter out only the serverless ones, enrich the data with approximate memory, and save it into a Delta table using PySpark.


πŸ§ͺ Code Walkthrough

import requests
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

instance = "https://adb-1311537494242340.26.azuredatabricks.net/"
token = "dapia***************************"

headers = {
    "Authorization": f"Bearer {token}"
}

url = f"{instance}/api/2.0/sql/endpoints"
response = requests.get(url, headers=headers)
data = response.json()

def size_to_memory(cluster_size):
    mapping = {
        "2X-Small": "64 GB",
        "X-Small": "128 GB",
        "Small": "256 GB",
        "Medium": "512 GB",
        "Large": "1 TB",
        "X-Large": "2 TB",
        "2X-Large": "4 TB",
        "3X-Large": "8 TB",
        "4X-Large": "16 TB"
    }
    return mapping.get(cluster_size, "Unknown")

# Prepare the data
records = []
for endpoint in data.get("endpoints", []):
    if endpoint.get("enable_serverless_compute", False):
        records.append({
            "name": endpoint["name"],
            "id": endpoint["id"],
            "cluster_size": endpoint["cluster_size"],
            "approx_memory": size_to_memory(endpoint["cluster_size"]),
            "auto_stop_mins": endpoint["auto_stop_mins"],
            "creator": endpoint["creator_name"],
            "state": endpoint["state"]
        })
# Create Spark DataFrame
df = spark.createDataFrame(records)

# Save to Delta (overwrite or append as needed)
df.write.format("delta").mode("overwrite").saveAsTable("default.serverless_sql_warehouses")

display(spark.table("default.serverless_sql_warehouses"))

βœ… Sample Output

πŸ’‘ Pro Tips

  • πŸ” Never hardcode tokens in production scripts. Use Azure Key Vault or Databricks secrets to securely manage secrets.
  • πŸ›‘ Consider implementing pagination if your workspace has many warehouses.
  • πŸ“Š Use this Delta table as a source for monitoring dashboards in Power BI or Databricks SQL.

πŸ“š Conclusion

With just a few lines of code, you can automate the discovery of all serverless SQL warehouses, store their metadata in a Delta Lake, and use it for reporting, auditing, or monitoring purposes. This is particularly useful in large-scale environments where managing SQL compute efficiently is crucial.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided β€œAS IS” with no warranties, and confers no rights.

Azure Data Factory Series: Invoke ADF Pipeline from App Service with User-Assigned Managed Identity

Watch the step by step implementation as a You tube Video.

Step 1: Create an Azure App Service (Web App)

  1. Go to Azure Portal β†’ Search for “App Services” β†’ Click “Create”.
  2. Choose Subscription and Resource Group (or create a new one).
  3. Set the following values:
    • Name: my-adf-web-app
    • Publish: Code
    • Runtime Stack: .NET 8 (or any preferred language)
    • Operating System: Windows or Linux
    • Region: Choose a region close to your resources.
    • Plan: Choose a basic plan (B1) for testing.
  4. Click Review + Create β†’ Click Create.

Step 2: Enable Managed Identity for Web App

  1. In Azure Portal, go to App Services β†’ Select my-adf-web-app.
  2. Click on Identity under Settings.
  3. Under User Assigned, toggle the switch to On.
  4. Click Save, and copy the Client ID (you’ll need it later).

Step 3: Grant Permissions to Managed Identity on ADF

  1. Go to Azure Data Factory in the Azure Portal.
  2. Click on Access Control (IAM) β†’ Click Add Role Assignment.
  3. Select:
    • Role: Data Factory Contributor (or Data Factory Operator for limited access).
    • Assign Access To: Managed Identity.
    • Select Members: Choose your Web App (my-adf-web-app).
  4. Click Save.

Step 4: Write Code in Azure App Service to Call ADF Pipeline

Use the following C# code inside your Web App to invoke the ADF pipeline using Managed Identity.

C# Code (ASP.NET Core)

using Microsoft.AspNetCore.Mvc;
using Azure.Identity;
using Azure.ResourceManager;
using Azure.ResourceManager.DataFactory;
using System.Threading.Tasks;
using System.Collections.Generic;
using System;

[Route("JB/[controller]")]
[ApiController]
public class ADFController : ControllerBase
{
    private readonly ArmClient _armClient;
    private readonly string _subscriptionId = "xxxx-xxxxx-xxxxx-xxxxx";
    private readonly string _resourceGroupName = "jbadf";
    private readonly string _dataFactoryName = "jbadfapp";
    private readonly string _pipelineName = "jb_Copydata";

    public ADFController()
    {
        // Set the Client ID of the User-Assigned Managed Identity (UMI)
        var userAssignedClientId = "xxxx-xxxxx-xxxxx-xxxxx"; // Replace this with actual Client ID

        var credential = new DefaultAzureCredential(new DefaultAzureCredentialOptions
        {
            ManagedIdentityClientId = userAssignedClientId
        });

        _armClient = new ArmClient(credential);
    }

    [HttpGet("CuriousBoy")]
    public async Task<IActionResult> TriggerPipeline()
    {
        try
        {
            // Construct the pipeline resource ID
            var pipelineResourceId = DataFactoryPipelineResource.CreateResourceIdentifier(
                _subscriptionId, _resourceGroupName, _dataFactoryName, _pipelineName);

            // Get the pipeline resource
            var pipelineResource = _armClient.GetDataFactoryPipelineResource(pipelineResourceId);

            // Define pipeline parameters (if required)
            var parameters = new Dictionary<string, BinaryData>
            {
                // Example: If your pipeline requires parameters, add them here.
                // { "param1", BinaryData.FromString("value1") },
                // { "param2", BinaryData.FromString("value2") }
            };

            // Trigger the pipeline run
            var runResponse = await pipelineResource.CreateRunAsync(parameters);

            return Ok($"ADF Pipeline triggered successfully on {DateTime.UtcNow}");
        }
        catch (Exception ex)
        {
            return StatusCode(500, $"Error triggering ADF pipeline: {ex.Message}");
        }
    }
}

Step 5: Deploy the Code to Azure App Service

  1. In Visual Studio, create an ASP.NET Core Web API project.
  2. Copy the above C# code into your Controller or Service.
  3. Deploy your code to Azure App Service using:
    • Right-click on the project β†’ Publish β†’ Azure App Service.

Step 6: Test the Web App

  1. Navigate to https://my-adf-web-app.azurewebsites.net.
  2. Trigger the endpoint that executes the above code.
  3. Your ADF pipeline should now run successfully!

This is the simplest way to invoke an ADF pipeline from Azure App Services using Managed Identity.

Possible Errors Expected,

Error triggering ADF pipeline: DefaultAzureCredential failed to retrieve a token from the included credentials. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/defaultazurecredential/troubleshoot

  • EnvironmentCredential authentication unavailable. Environment variables are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/environmentcredential/troubleshoot
  • WorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/workloadidentitycredential/troubleshoot
  • ManagedIdentityCredential authentication unavailable. No response received from the managed identity endpoint.
  • VisualStudioCredential authentication failed: Visual Studio Token provider can’t be accessed at D:DWASFilesSitesJBAPPLocalAppData.IdentityServiceAzureServiceAuthtokenprovider.json
  • AzureCliCredential authentication failed: Azure CLI not installed
  • AzurePowerShellCredential authentication failed: Az.Accounts module >= 2.2.0 is not installed.
  • AzureDeveloperCliCredential authentication failed: Azure Developer CLI could not be found.

Error triggering ADF pipeline: The client ’66a2225c-f1b5-439b-b375-78ae2b744f2f’ with object id ’66a2225c-f1b5-439b-b375-78ae2b744f2f’ does not have authorization to perform action ‘Microsoft.DataFactory/factories/pipelines/createRun/action’ over scope ‘/subscriptions/xxxx-xxxx-xxxx/resourceGroups/jbadf/providers/Microsoft.DataFactory/factories/jbadfapp/pipelines/jb_Copydata’ or the scope is invalid. If access was recently granted, please refresh your credentials.
Status: 403 (Forbidden)
ErrorCode: AuthorizationFailed

Content:
{“error”:{“code”:”AuthorizationFailed”,”message”:”The client ’66a2225c-f1b5-439b-b375-78ae2b744f2f’ with object id ’66a2225c-f1b5-439b-b375-78ae2b744f2f’ does not have authorization to perform action ‘Microsoft.DataFactory/factories/pipelines/createRun/action’ over scope ‘/subscriptions/xxxx-xxxx-xxxx/resourceGroups/jbadf/providers/Microsoft.DataFactory/factories/jbadfapp/pipelines/jb_Copydata’ or the scope is invalid. If access was recently granted, please refresh your credentials.”}}

Headers:
Cache-Control: no-cache
Pragma: no-cache
x-ms-failure-cause: REDACTED
x-ms-request-id: 896c9766-5ee3-4dbb-b3b6-71800e2ee564
x-ms-correlation-request-id: REDACTED
x-ms-routing-request-id: REDACTED
Strict-Transport-Security: REDACTED
X-Content-Type-Options: REDACTED
X-Cache: REDACTED
X-MSEdge-Ref: REDACTED
Date: Tue, 18 Feb 2025 08:23:20 GMT
Content-Length: 512
Content-Type: application/json; charset=utf-8
Expires: -1

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided β€œAS IS” with no warranties, and confers no rights.



Azure Data Factory Series: Invoking ADF Pipelines Securely with Managed Identity

Watch the step by step implementation as a You tube Video.

Step 1: Create an Azure App Service (Web App)

  1. Go to Azure Portal β†’ Search for “App Services” β†’ Click “Create”.
  2. Choose Subscription and Resource Group (or create a new one).
  3. Set the following values:
    • Name: my-adf-web-app
    • Publish: Code
    • Runtime Stack: .NET 8 (or any preferred language)
    • Operating System: Windows or Linux
    • Region: Choose a region close to your resources.
    • Plan: Choose a basic plan (B1) for testing.
  4. Click Review + Create β†’ Click Create.

Step 2: Enable Managed Identity for Web App

  1. In Azure Portal, go to App Services β†’ Select my-adf-web-app.
  2. Click on Identity under Settings.
  3. Under System Assigned, toggle the switch to On.
  4. Click Save, and copy the Object ID (you’ll need it later).

Step 3: Grant Permissions to Managed Identity on ADF

  1. Go to Azure Data Factory in the Azure Portal.
  2. Click on Access Control (IAM) β†’ Click Add Role Assignment.
  3. Select:
    • Role: Data Factory Contributor (or Data Factory Operator for limited access).
    • Assign Access To: Managed Identity.
    • Select Members: Choose your Web App (my-adf-web-app).
  4. Click Save.

Step 4: Write Code in Azure App Service to Call ADF Pipeline

Use the following C# code inside your Web App to invoke the ADF pipeline using Managed Identity.

C# Code (ASP.NET Core)

using Microsoft.AspNetCore.Mvc;
using Azure.Identity;
using System.Net.Http;
using System.Threading.Tasks;
using Azure.Core;

[Route("api/[controller]")]
[ApiController]
public class ADFController : ControllerBase
{
    private readonly HttpClient _httpClient;
    private readonly string _adfPipelineUrl = "https://management.azure.com/subscriptions/xxxx-xxxx-xxx-xxx/resourceGroups/jbadf/providers/Microsoft.DataFactory/factories/jbadfapp/pipelines/jb_Copydata/createRun?api-version=2018-06-01";

    public ADFController()
    {
        _httpClient = new HttpClient();
    }

    [HttpGet("trigger")]
    public async Task<IActionResult> TriggerPipeline()
    {
        var credential = new DefaultAzureCredential();
        var token = await credential.GetTokenAsync(new TokenRequestContext(new[] { "https://management.azure.com/.default" }));

        _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {token.Token}");

        var response = await _httpClient.PostAsync(_adfPipelineUrl, null);

        if (response.IsSuccessStatusCode)
        {
            return Ok("ADF Pipeline triggered successfully.");
        }
        return StatusCode((int)response.StatusCode, "Failed to trigger pipeline.");
    }
}

Step 5: Deploy the Code to Azure App Service

  1. In Visual Studio, create an ASP.NET Core Web API project.
  2. Copy the above C# code into your Controller or Service.
  3. Deploy your code to Azure App Service using:
    • Right-click on the project β†’ Publish β†’ Azure App Service.

Step 6: Test the Web App

  1. Navigate to https://my-adf-web-app.azurewebsites.net.
  2. Trigger the endpoint that executes the above code.
  3. Your ADF pipeline should now run successfully!

This is the simplest way to invoke an ADF pipeline from Azure App Services using Managed Identity.

Possible Errors Expected,

Error triggering ADF pipeline: DefaultAzureCredential failed to retrieve a token from the included credentials. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/defaultazurecredential/troubleshoot

  • EnvironmentCredential authentication unavailable. Environment variables are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/environmentcredential/troubleshoot
  • WorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/workloadidentitycredential/troubleshoot
  • ManagedIdentityCredential authentication unavailable. No response received from the managed identity endpoint.
  • VisualStudioCredential authentication failed: Visual Studio Token provider can’t be accessed at D:DWASFilesSitesJBAPPLocalAppData.IdentityServiceAzureServiceAuthtokenprovider.json
  • AzureCliCredential authentication failed: Azure CLI not installed
  • AzurePowerShellCredential authentication failed: Az.Accounts module >= 2.2.0 is not installed.
  • AzureDeveloperCliCredential authentication failed: Azure Developer CLI could not be found.

Error triggering ADF pipeline: The client ’66a2225c-f1b5-439b-b375-78ae2b744f2f’ with object id ’66a2225c-f1b5-439b-b375-78ae2b744f2f’ does not have authorization to perform action ‘Microsoft.DataFactory/factories/pipelines/createRun/action’ over scope ‘/subscriptions/xxxx-xxxx-xxxx/resourceGroups/jbadf/providers/Microsoft.DataFactory/factories/jbadfapp/pipelines/jb_Copydata’ or the scope is invalid. If access was recently granted, please refresh your credentials.
Status: 403 (Forbidden)
ErrorCode: AuthorizationFailed

Content:
{“error”:{“code”:”AuthorizationFailed”,”message”:”The client ’66a2225c-f1b5-439b-b375-78ae2b744f2f’ with object id ’66a2225c-f1b5-439b-b375-78ae2b744f2f’ does not have authorization to perform action ‘Microsoft.DataFactory/factories/pipelines/createRun/action’ over scope ‘/subscriptions/xxxx-xxxx-xxxx/resourceGroups/jbadf/providers/Microsoft.DataFactory/factories/jbadfapp/pipelines/jb_Copydata’ or the scope is invalid. If access was recently granted, please refresh your credentials.”}}

Headers:
Cache-Control: no-cache
Pragma: no-cache
x-ms-failure-cause: REDACTED
x-ms-request-id: 896c9766-5ee3-4dbb-b3b6-71800e2ee564
x-ms-correlation-request-id: REDACTED
x-ms-routing-request-id: REDACTED
Strict-Transport-Security: REDACTED
X-Content-Type-Options: REDACTED
X-Cache: REDACTED
X-MSEdge-Ref: REDACTED
Date: Tue, 18 Feb 2025 08:23:20 GMT
Content-Length: 512
Content-Type: application/json; charset=utf-8
Expires: -1

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided β€œAS IS” with no warranties, and confers no rights.