SQL Server 2022 introduces several new features, one of the most exciting being the APPROX_PERCENTILE_CONT function. This function allows for efficient and approximate calculation of percentiles in large datasets, which can be particularly useful for analytics and data-driven decision-making. In this blog, we will explore the APPROX_PERCENTILE_CONT function in detail, using the JBDB database for practical demonstrations. We’ll start with a business use case, dive into the function’s capabilities, and provide a range of T-SQL queries for you to try. Let’s get started! 🚀
Business Use Case: Customer Transaction Analysis 💼
Consider a retail company that wants to analyze customer spending behavior. The company has a vast amount of transaction data stored in the JBDB database. To optimize marketing strategies and tailor promotions, they want to identify spending patterns across different customer segments.
For example, the company might want to know the 90th percentile of spending amounts to target high-value customers with exclusive offers. Calculating this percentile accurately in a large dataset can be resource-intensive. The APPROX_PERCENTILE_CONT function offers a solution by providing an approximate, yet efficient, calculation of percentiles.
Understanding the APPROX_PERCENTILE_CONT Function 📊
The APPROX_PERCENTILE_CONT function is designed to compute approximate percentile values for a set of data. This function is particularly useful when dealing with large datasets, as it offers a performance advantage by using approximate algorithms.
Syntax:
APPROX_PERCENTILE_CONT ( percentile ) WITHIN GROUP ( ORDER BY numeric_expression )
percentile: A value between 0 and 1 that specifies the desired percentile.numeric_expression: The column or expression to calculate the percentile on.
Example 1: Basic Usage 🌟
Let’s calculate the 90th percentile of customer transaction amounts.
Setup:
USE JBDB;
GO
CREATE TABLE CustomerTransactions (
TransactionID INT PRIMARY KEY,
CustomerID INT,
TransactionAmount DECIMAL(18, 2),
TransactionDate DATE
);
INSERT INTO CustomerTransactions (TransactionID, CustomerID, TransactionAmount, TransactionDate)
VALUES
(1, 101, 50.00, '2023-01-15'),
(2, 102, 150.00, '2023-01-16'),
(3, 103, 300.00, '2023-01-17'),
(4, 101, 75.00, '2023-01-18'),
(5, 104, 200.00, '2023-01-19'),
(6, 105, 125.00, '2023-01-20'),
(7, 106, 400.00, '2023-01-21'),
(8, 102, 175.00, '2023-01-22');
GO
Query to Calculate 90th Percentile:
SELECT APPROX_PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY TransactionAmount) AS Approx90thPercentile
FROM CustomerTransactions;

This result indicates that 90% of transactions are below $375. This insight can help the company focus on high-value customers who spend above this threshold.
Example 2: Analyzing Different Percentiles 🔍
Let’s calculate different percentiles to understand the distribution of transaction amounts.
Query to Calculate Multiple Percentiles:
SELECT
APPROX_PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY TransactionAmount) AS Approx25thPercentile,
APPROX_PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY TransactionAmount) AS Approx50thPercentile,
APPROX_PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY TransactionAmount) AS Approx75thPercentile,
APPROX_PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY TransactionAmount) AS Approx90thPercentile
FROM CustomerTransactions;

These results provide a clear view of the transaction distribution, helping the company to tailor marketing strategies for different customer segments.
Comparing Percentile Results:
- Compare approximate and exact percentile calculations for the 90th percentile:
SELECT
APPROX_PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY TransactionAmount) AS Approx90thPercentile,
PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY TransactionAmount) OVER () AS Exact90thPercentile
FROM CustomerTransactions
group by TransactionAmount;
Segmenting Customers by Spending:
- Identify customers whose spending is in the top 10%:
SELECT CustomerID, TransactionAmount
FROM CustomerTransactions
WHERE TransactionAmount >= (SELECT APPROX_PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY TransactionAmount)
FROM CustomerTransactions);
Analyzing Spending Patterns Over Time:
- Calculate monthly spending percentiles to identify trends:
SELECT
DATEPART(MONTH, TransactionDate) AS Month,
APPROX_PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY TransactionAmount) AS MedianTransaction
FROM CustomerTransactions
GROUP BY DATEPART(MONTH, TransactionDate)
ORDER BY Month;
Combining Percentiles with Other Aggregations:
- Find the average transaction amount for each percentile group:
SELECT
PercentileGroup,
AVG(TransactionAmount) AS AvgTransactionAmount
FROM (
SELECT
TransactionAmount,
NTILE(4) OVER (ORDER BY TransactionAmount) AS PercentileGroup
FROM CustomerTransactions
) AS SubQuery
GROUP BY PercentileGroup;
Conclusion 🏁
The APPROX_PERCENTILE_CONT function in SQL Server 2022 is a powerful tool for efficiently computing approximate percentiles in large datasets. By using this function, businesses can gain valuable insights into data distributions and make informed decisions based on these insights. Whether you’re analyzing customer spending, sales trends, or any other data, the APPROX_PERCENTILE_CONT function offers a quick and efficient way to understand your data.
Happy querying! 😄
For more tutorials and tips on SQL Server, including performance tuning and database management, be sure to check out our JBSWiki YouTube channel.
Thank You,
Vivek Janakiraman
Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.
- analytics
- approximate percentiles
- APPROX_PERCENTILE_CONT
- APPROX_PERCENTILE_CONT demos
- APPROX_PERCENTILE_CONT examples
- BI
- big data
- business intelligence
- data analysis
- data analysis techniques
- Data Analysis Tools
- data analytics
- Data Engineering
- data insights
- data manipulation
- data processing
- data science
- data transformation
- data visualization
- data-driven decision making
- data-driven strategies
- database analytics
- database management
- Database Optimization
- database techniques
- database tools
- JBDB database
- percentile calculation
- Query Optimization
- SQL Queries
- SQL Server
- sql server 2022
- SQL Server 2022 analytics
- SQL Server 2022 analytics techniques
- SQL Server 2022 analytics tools
- SQL Server 2022 APPROX_PERCENTILE_CONT demos
- SQL Server 2022 APPROX_PERCENTILE_CONT examples
- SQL Server 2022 BI
- SQL Server 2022 capabilities
- SQL Server 2022 data analytics
- SQL Server 2022 data handling
- SQL Server 2022 data insights
- SQL Server 2022 data manipulation
- SQL Server 2022 data processing
- SQL Server 2022 data trends
- SQL Server 2022 data visualization
- SQL Server 2022 improvements
- SQL Server 2022 new features
- SQL Server 2022 tips
- SQL Server 2022 tricks
- SQL Server 2022 tutorials
- SQL Server 2022 updates
- SQL Server analytics
- SQL Server approximate percentiles
- SQL Server BI
- SQL Server business analytics
- SQL Server Business Intelligence
- SQL Server data analysis
- SQL Server data engineering
- SQL Server data handling
- SQL Server data insights
- SQL Server data manipulation
- SQL Server data manipulation techniques
- SQL Server data processing
- SQL Server data processing techniques
- SQL Server data processing tools
- SQL Server Data Science
- SQL Server data trends
- SQL Server data visualization
- SQL Server enhancements
- SQL Server Examples
- SQL Server exercises
- SQL Server features
- SQL Server Functions
- SQL Server improvements
- SQL Server learning
- SQL Server percentile analysis
- SQL Server percentile calculation
- SQL Server Performance
- SQL Server performance enhancement
- SQL Server performance optimization
- SQL Server performance tuning
- SQL Server practice
- SQL Server Tips
- SQL Server training
- SQL Server Tricks
- SQL Server Tutorials
- T-SQL