Partition Elimination in SQL Server

Back in our introduction to partitioning I’d mentioned that using partitions can also bring us benefits in terms of performance. That performance comes in the way of partition elimination with our queries. Before we get started let’s set the stage with a sample table and one million records of data which we’ve used before:

/* Create a partition function to use */
CREATE PARTITION FUNCTION MyPartFunc (DATE)
AS RANGE RIGHT
FOR VALUES ('2000-01-01', '2001-01-01', '2002-01-01', '2003-01-01', '2004-01-01', '2005-01-01', '2006-01-01', '2007-01-01', '2008-01-01',
    '2009-01-01', '2010-01-01', '2011-01-01', '2012-01-01', '2013-01-01', '2014-01-01', '2015-01-01', '2016-01-01',
    '2017-01-01', '2018-01-01', '2019-01-01', '2020-01-01', '2021-01-01', '2022-01-01', '2023-01-01', '2024-01-01');
 
/* Create a partition scheme for the function */
CREATE PARTITION SCHEME MyPartScheme  
AS PARTITION MyPartFunc  
ALL TO ([PRIMARY]);
 
/* Create the table */
CREATE TABLE SalesData (
    ID INT IDENTITY(1, 1),
    SalesDate DATE,
    SalesQuantity INT,
    SalesValue DECIMAL(9, 2)
);
 
/* Apply clustering key */
CREATE CLUSTERED INDEX cx_SalesDate
ON dbo.SalesData (SalesDate)
ON MyPartScheme (SalesDate);
  
/* Populate 100k random records */
WITH RandomData AS (
  
        SELECT 1 [Counter],
            DATEADD(DAY, CAST(RAND(CHECKSUM(NEWID())) * 10000 AS INT), '1999-01-01') [SalesDate],
            CAST(RAND(CHECKSUM(NEWID())) * 50 AS INT) [SalesQuantity],
            CAST(RAND(CHECKSUM(NEWID())) * 1000 AS DECIMAL(9, 2)) [SalesValue]
    UNION ALL
        SELECT [Counter] + 1,
            DATEADD(DAY, CAST(RAND(CHECKSUM(NEWID())) * 10000 AS INT), '1999-01-01'),
            CAST(RAND(CHECKSUM(NEWID())) * 50 AS INT),
            CAST(RAND(CHECKSUM(NEWID())) * 1000 AS DECIMAL(9, 2))
        FROM RandomData
        WHERE [Counter] < 1000000
)
INSERT INTO SalesData (SalesDate, SalesQuantity, SalesValue)
SELECT SalesDate, SalesQuantity, SalesValue
FROM RandomData
OPTION (MAXRECURSION 0);

What is partition elimination

Partition elimination is the process of excluding unnecessary partitions of data from a query plan. We could think of partitions like slices of pizza – they divide our data into smaller and more manageable pieces, and if we really like a certain topping (not pepperoni, that’s mine) then we can just take the slice we want with the most topping.

Translating this into the SQL Server world, we’ve got a partition function which tells us what data is stored in which partition. The query optimiser can use this information along with any relevant filters in our query to figure out which partitions contain the relevant data and exclude the others from the query plan. This has potential to improve the performance of queries by reducing the data that we need to read for them.

Elimination in action

Let’s take a look at the elimination in action with our sample data we created earlier. Firstly let’s get some idea about the size of the table so we’ll turn on STATISTICS IO to see how much data we’re reading and then we’ll count the records in the table:

SET STATISTICS IO ON

SELECT COUNT(SalesDate)
FROM dbo.SalesData;

Table ‘SalesData’. Scan count 26, logical reads 6452, …

You’ll see that we’ve done 26 scans – which are our partitions – and we’ve read nearly 6,500 pages of data in the process.

Let’s try something similar now but we’ll restrict the data volume down to a single year:

SELECT COUNT(SalesDate)
FROM dbo.SalesData
WHERE SalesDate BETWEEN '2023-01-01'
	AND '2023-12-31';

Table ‘SalesData’. Scan count 1, logical reads 236, …

Now look at that, we’re only scanning a single partition and the reads have decreased dramatically to under 250. If we look at the query plan produced for this query and check out the the INDEX SEEK which is retrieving our data you can see it shows that only a single partition was needed:

Showing the query only needed to access one of the partitions

It’s only an index

As helpful as the elimination can be, it’s not an easy win in all situations. Let’s not forget that our partitioned table is still an index at the end of the day.

The reason that I mention this is that the partition elimination still has some of the challenges you’ll have when finding data via an index. One example of this would be when we try to apply a function in our filter which restrictions the optimiser:

SELECT COUNT(SalesDate)
FROM dbo.SalesData
WHERE DATEPART(YEAR, SalesDate) = 2023;

Table ‘SalesData’. Scan count 26, logical reads 6452, …

This time we’re back to scanning the entire table and all of the partitions. Checking the query plan as before we see it reflected there too:

You’ll also see that the row estimates in this case are further away as the statistics for the table are being used differently to estimate the number of rows to be processed.

The partition function is applied to an index at the end of the day so the optimiser is still making decisions as it would with a regular index. In this instance the DATEPART function we’re applying can’t be pre-determined ahead of time so the engine is forced to scan the table.

The differences in row estimates and logical reads are very much the type of behavior we’d see with an index too. The difference here is that only part of that index needs to be read into memory as the query will only need to access the slice which we’re interested in – that’s where we can see benefits to our performance.

Wrap up

If we have sets of data of sufficient size where partitioning becomes useful, we can also see it helping the performance of our queries through partition elimination over and above what we might achieve with standard indexing alone.

The ability for this to appear in our query plans does come down to how our queries are designed. There would need to be some sort of filtering based on our partition column in the query to make use of this feature. Additionally things like functions or manipulation of the data may inhibit the functionality in the same way it would an index.

One of the nice benefits with partition elimination however is that we don’t need to add anything extra into our queries to make it work. If we were to partition an existing table then any existing queries which use that table have the possibility to use partition elimination as part of their query plans in the future.

What is partition elimination

Elimination in action

It’s only an index

Wrap up

Share this:

Related

Leave a comment Cancel reply