Mastering LAG and LEAD Functions in SQL Server 2022 with the IGNORE NULLS Option

SQL Server 2022 introduced a powerful enhancement to the LAG and LEAD functions with the IGNORE NULLS option. This feature allows for more precise analysis and reporting by skipping over NULL values in data sets. In this blog, weโ€™ll explore how to use these functions effectively using the JBDB database, and we’ll demonstrate their application with a detailed business use case.

Business Use Case: Sales Data Analysis

Imagine a retail company, JBStore, that wants to analyze its sales data to understand sales trends better. They aim to compare each month’s sales with the previous and next months, ignoring any missing data (represented by NULL values). This analysis will help identify trends and outliers, aiding in better decision-making.

Setting Up the JBDB Database

First, letโ€™s set up the JBDB database and create a SalesData table with some sample data, including NULL values to represent months with no sales data.

-- Create JBDB database
CREATE DATABASE JBDB;
GO

-- Use the JBDB database
USE JBDB;
GO

-- Create SalesData table
CREATE TABLE SalesData (
    SalesMonth INT,
    SalesAmount INT
);

-- Insert sample data, including NULLs
INSERT INTO SalesData (SalesMonth, SalesAmount)
VALUES
    (1, 1000),
    (2, 1500),
    (3, NULL),
    (4, 1800),
    (5, NULL),
    (6, 2000);
GO

LAG and LEAD Functions: A Quick Recap

The LAG function allows you to access data from a previous row in the same result set without the use of a self-join. Similarly, the LEAD function accesses data from a subsequent row. Both functions are part of the SQL window functions family and are particularly useful in time series analysis.

Using LAG and LEAD with IGNORE NULLS

The IGNORE NULLS option is a game-changer, as it allows you to skip over NULL values, providing more meaningful results. Here’s how you can use it with the LAG and LEAD functions:

Example 1: LAG Function with IGNORE NULLS
SELECT 
    SalesMonth,
    SalesAmount,
    LAG(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS PreviousMonthSales
FROM 
    SalesData;

In this example, LAG(SalesAmount, 1) IGNORE NULLS retrieves the sales amount from the previous month, skipping over any NULL values.

Example 2: LEAD Function with IGNORE NULLS
SELECT 
    SalesMonth,
    SalesAmount,
    LEAD(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS NextMonthSales
FROM 
    SalesData;

Here, LEAD(SalesAmount, 1) IGNORE NULLS retrieves the sales amount from the next month, again skipping over NULL values.

Practical Example: Analyzing Sales Trends

Letโ€™s combine these functions to analyze sales trends more effectively.

SELECT 
    SalesMonth,
    SalesAmount,
    LAG(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS PreviousMonthSales,
    LEAD(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS NextMonthSales
FROM 
    SalesData;

This query provides a complete view of each month’s sales, the previous month’s sales, and the next month’s sales, excluding any NULL values. This is incredibly useful for identifying patterns, such as periods of growth or decline.

Detailed Business Use Case: Data-Driven Decision Making

By utilizing the IGNORE NULLS option with LAG and LEAD functions, JBStore can:

  1. Identify Growth Periods: Detect months where sales increased significantly compared to the previous or next month.
  2. Spot Anomalies: Easily identify months with unusually high or low sales, excluding months with missing data.
  3. Trend Analysis: Understand longer-term trends by comparing sales over multiple months.

These insights can inform marketing strategies, inventory planning, and more.

Calculate Difference Between Current and Previous Month’s Sales:

SELECT SalesMonth, SalesAmount, SalesAmount - LAG(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS SalesDifference FROM SalesData;

Identify Months with Sales Decrease Compared to Previous Month:

WITH CTE AS (
    SELECT 
        SalesMonth,
        SalesAmount,
        LAG(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS PreviousMonthSales
    FROM 
        SalesData
)
SELECT 
    SalesMonth,
    SalesAmount,
    PreviousMonthSales
FROM 
    CTE
WHERE 
    SalesAmount < PreviousMonthSales;

Find the Second Previous Month’s Sales:

SELECT SalesMonth, SalesAmount, LAG(SalesAmount, 2) IGNORE NULLS OVER (ORDER BY SalesMonth) AS SecondPreviousMonthSales FROM SalesData;

Calculate the Rolling Average of the Last Two Months (Ignoring NULLs):

SELECT SalesMonth, SalesAmount, (SalesAmount + LAG(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth)) / 2 AS RollingAverage FROM SalesData;

Compare Sales Between Current Month and Two Months Ahead:

SELECT SalesMonth, SalesAmount, LEAD(SalesAmount, 2) IGNORE NULLS OVER (ORDER BY SalesMonth) AS SalesTwoMonthsAhead FROM SalesData;

Identify Consecutive Months with Sales Increase:

WITH CTE AS ( SELECT SalesMonth, SalesAmount, LAG(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS PreviousMonthSales FROM SalesData ) SELECT SalesMonth, SalesAmount FROM CTE WHERE SalesAmount > PreviousMonthSales;

Find Months with No Sales and Their Preceding Sales Month:

SELECT SalesMonth, SalesAmount, LAG(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS PrecedingMonthSales FROM SalesData WHERE SalesAmount IS NULL;

Calculate Cumulative Sales Sum Ignoring NULLs:

SELECT 
    SalesMonth,
    SalesAmount,
    SUM(ISNULL(SalesAmount, 0)) OVER (ORDER BY SalesMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CumulativeSales
FROM 
    SalesData;

Identify the First Month with Sales After a Month with NULL Sales:

SELECT SalesMonth, SalesAmount, LEAD(SalesAmount, 1) IGNORE NULLS OVER (ORDER BY SalesMonth) AS FirstNonNullSalesAfterNull FROM SalesData WHERE SalesAmount IS NULL;

    Conclusion ๐ŸŽ‰

    The LAG and LEAD functions with the IGNORE NULLS option in SQL Server 2022 offer a more refined way to analyze data, providing more accurate and meaningful results. Whether you’re analyzing sales data, customer behavior, or any other time series data, these functions can significantly enhance your analytical capabilities.

    Happy querying! ๐Ÿš€

    For more tutorials and tips on SQL Server, including performance tuning and database management, be sure to check out our JBSWiki YouTube channel.

    Thank You,
    Vivek Janakiraman

    Disclaimer:
    The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided โ€œAS ISโ€ with no warranties, and confers no rights.

    SQL Server 2022: Unleashing the Power of the GENERATE_SERIES Function

    In SQL Server 2022, the introduction of the GENERATE_SERIES function marks a significant enhancement, empowering developers and analysts with a flexible and efficient way to generate sequences of numbers. This feature, akin to similar functions in other database systems, simplifies tasks involving sequence generation, such as creating time series data, generating test data, and more.

    In this blog, we’ll explore the GENERATE_SERIES function in detail, using the JBDB database to demonstrate its capabilities. We’ll start with a practical business use case, followed by a comprehensive guide on how to use the function. Let’s dive in! ๐ŸŒŸ

    Business Use Case: Sales Forecasting ๐Ÿ“ˆ

    Imagine you are working for a retail company, and your task is to generate a sales forecast for the next year. You have historical sales data and need to project future sales based on trends. A crucial step in this process is to create a series of dates representing each day of the next year, which will serve as the basis for the forecast.

    The GENERATE_SERIES function can be a game-changer here, allowing you to quickly generate a range of dates without resorting to complex loops or recursive queries.

    Introducing the GENERATE_SERIES Function ๐Ÿ› ๏ธ

    The GENERATE_SERIES function generates a series of numbers or dates. Its syntax is straightforward:

    GENERATE_SERIES(start, stop, step)
    • start: The starting value of the sequence.
    • stop: The ending value of the sequence.
    • step: The increment value between each number in the series.

    Let’s see this in action with some practical examples!

    Example 1: Basic Numeric Series ๐Ÿ”ข

    To generate a series of numbers from 1 to 10:

    SELECT value
    FROM GENERATE_SERIES(1, 10, 1);

    Example 2: Date Series for Forecasting ๐Ÿ“…

    To generate a series of dates for each day of the next year, starting from January 1, 2023:

    SELECT CAST(value AS DATE) AS ForecastDate
    FROM GENERATE_SERIES('2023-01-01', '2023-12-31', 1);

    Generating a Series of Dates Using a CTE ๐Ÿ“…

    Since GENERATE_SERIES supports numeric sequences only, we use a recursive CTE to generate a series of dates. Hereโ€™s how to create a series of dates for the year 2023:

    -- Create a recursive CTE to generate a series of dates
    WITH DateSeries AS (
        -- Anchor member: start date
        SELECT CAST('2023-01-01' AS DATE) AS ForecastDate
        UNION ALL
        -- Recursive member: add one day to the previous date
        SELECT DATEADD(DAY, 1, ForecastDate)
        FROM DateSeries
        WHERE ForecastDate < '2023-12-31'
    )
    -- Query to select the generated dates
    SELECT ForecastDate
    FROM DateSeries
    OPTION (MAXRECURSION 0); -- Remove recursion limit
    

    Implementing the Use Case: Sales Forecasting ๐Ÿ“Š

    Let’s apply the GENERATE_SERIES function to our sales forecasting scenario. Suppose we have a table Sales in the JBDB database with historical sales data. Our goal is to project future sales for each day of the next year.

    Step 1: Creating the JBDB and Sales Table ๐Ÿ—๏ธ

    First, we create the JBDB database and the Sales table:

    CREATE DATABASE JBDB;
    GO
    
    USE JBDB;
    GO
    
    CREATE TABLE Sales (
        SaleDate DATE,
        Amount DECIMAL(10, 2)
    );

    Step 2: Inserting Historical Data ๐Ÿ“ฅ

    Next, let’s insert some historical data into the Sales table:

    INSERT INTO Sales (SaleDate, Amount)
    VALUES
    ('2022-01-01', 100.00),
    ('2022-01-02', 150.00),
    ('2022-01-03', 200.00),
    -- Additional data...
    ('2022-12-31', 250.00);

    Step 3: Generating Future Dates and Forecasting ๐Ÿ“…๐Ÿ”ฎ

    Now, we use GENERATE_SERIES to generate future dates and join it with our historical data to create a sales forecast:

    -- Generate a series of future dates
    WITH DateSeries AS (
        SELECT CAST('2023-01-01' AS DATE) AS ForecastDate
        UNION ALL
        SELECT DATEADD(DAY, 1, ForecastDate)
        FROM DateSeries
        WHERE ForecastDate < '2023-12-31'
    ),
    -- Combine with historical sales data
    SalesForecast AS (
        SELECT
            f.ForecastDate,
            ISNULL(s.Amount, 0) AS HistoricalAmount
        FROM
            DateSeries f
            LEFT JOIN Sales s ON f.ForecastDate = s.SaleDate
    )
    -- Project future sales
    SELECT
        ForecastDate,
        HistoricalAmount,
        -- Simple projection logic (for demonstration)
        HistoricalAmount * 1.05 AS ProjectedAmount
    FROM SalesForecast
    OPTION (MAXRECURSION 0); -- Remove recursion limit
    

    In this query:

    • We generate a series of dates for the year 2023 using GENERATE_SERIES.
    • We join these dates with the historical sales data to create a comprehensive sales forecast.
    • A simple projection logic is applied, assuming a 5% increase in sales.

    Generate a Series of Numbers with Custom Step Size

    Generate a sequence of numbers from 1 to 50 with a step size of 5:

    -- Generate a sequence of numbers with a custom step size
    SELECT value
    FROM GENERATE_SERIES(1, 50, 5);

    Generate a Series of Dates with Custom Step Size

    Generate a series of dates from today to 30 days into the future with a step size of 5 days:

    -- Generate a series of dates with a custom step size (5 days)
    WITH DateSeries AS (
        SELECT DATEADD(DAY, value * 5, CAST(GETDATE() AS DATE)) AS ForecastDate
        FROM GENERATE_SERIES(0, 6, 1) -- 0 to 6 will generate 7 dates
    )
    SELECT ForecastDate
    FROM DateSeries;

    Generate a Series of Random Numbers

    Generate a series of random numbers between 1 and 100:

    -- Generate a series of random numbers between 1 and 100
    SELECT ABS(CHECKSUM(NEWID())) % 100 + 1 AS RandomNumber
    FROM GENERATE_SERIES(1, 10, 1); -- Generate 10 random numbers

    Generate a Series of Time Intervals

    Generate a series of time intervals (every 15 minutes) for one hour:

    -- Generate a series of time intervals (15 minutes) for one hour
    WITH TimeSeries AS (
        SELECT DATEADD(MINUTE, value * 15, CAST('2024-01-01 00:00:00' AS DATETIME)) AS TimeStamp
        FROM GENERATE_SERIES(0, 3, 1) -- 0 to 3 will generate 4 intervals
    )
    SELECT TimeStamp
    FROM TimeSeries;

    Generate a Series of Sequential IDs

    Generate a series of sequential IDs from 1001 to 1010:

    -- Generate a sequence of sequential IDs
    SELECT value + 1000 AS SequentialID
    FROM GENERATE_SERIES(1, 10, 1);

    Generate a Series of Numeric Values with Non-Uniform Steps

    Generate a series of numbers with varying steps (e.g., 1, 2, 4, 8, …):

    -- Generate a series of numbers with varying steps (powers of 2)
    WITH NumberSeries AS (
        SELECT 1 AS value
        UNION ALL
        SELECT value * 2
        FROM NumberSeries
        WHERE value < 64
    )
    SELECT value
    FROM NumberSeries
    OPTION (MAXRECURSION 0);

    Generate a Series of Dates with Monthly Intervals

    Generate a series of dates with a monthly interval for one year:

    -- Generate a series of dates with monthly intervals for one year
    WITH MonthSeries AS (
        SELECT DATEADD(MONTH, value, CAST('2024-01-01' AS DATE)) AS MonthStart
        FROM GENERATE_SERIES(0, 11, 1) -- 0 to 11 will generate 12 months
    )
    SELECT MonthStart
    FROM MonthSeries;

    Generate a Series of Numbers and Calculate Cumulative Sum

    Generate a series of numbers and calculate their cumulative sum:

    -- Generate a series of numbers and calculate the cumulative sum
    WITH NumberSeries AS (
        SELECT value
        FROM GENERATE_SERIES(1, 10, 1)
    ),
    CumulativeSum AS (
        SELECT
            value,
            SUM(value) OVER (ORDER BY value) AS CumulativeSum
        FROM NumberSeries
    )
    SELECT value, CumulativeSum
    FROM CumulativeSum;

    Generate a Series of Custom Random Dates

    Generate a series of random dates within a specific range:

    — Generate a series of random dates within a specific range
    WITH RandomDates AS (
    SELECT DATEADD(DAY, ABS(CHECKSUM(NEWID())) % 365, CAST(‘2024-01-01’ AS DATE)) AS RandomDate
    FROM GENERATE_SERIES(1, 10, 1) — Generate 10 random dates
    )
    SELECT RandomDate
    FROM RandomDates;

    Generate a Series of Numbers and Create Custom Labels

    Generate a series of numbers and create custom labels:

    — Generate a series of numbers and create custom labels
    SELECT value AS Number, ‘Label_’ + CAST(value AS VARCHAR(10)) AS CustomLabel
    FROM GENERATE_SERIES(1, 10, 1);

    Conclusion ๐ŸŒŸ

    The GENERATE_SERIES function in SQL Server 2022 is a versatile tool that can significantly simplify the generation of sequences, whether for numeric ranges or date series. Its applications range from creating time series data for analytics to generating test data for development and testing purposes.

    By leveraging GENERATE_SERIES, businesses can streamline their data workflows, enhance forecasting accuracy, and improve decision-making processes. Whether you’re a database administrator, developer, or data analyst, this function is a valuable addition to your SQL toolkit.

    Feel free to experiment with GENERATE_SERIES and explore its potential in your projects! ๐ŸŽ‰

    For more tutorials and tips on SQL Server, including performance tuning and database management, be sure to check out our JBSWiki YouTube channel.

    Thank You,
    Vivek Janakiraman

    Disclaimer:
    The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided โ€œAS ISโ€ with no warranties, and confers no rights.