Categories
SQL

Blazing Fast (and Accurate) Searches Without an Index

Last week I demonstrated a fast binary search approach to quickly slice through large unindexed tables. I love the approach, but it fell short in two key areas – drop-in usage, and proper boundary handling. Why Let’s recap what we’re doing here: Large append-heavy tables – like logs or audits – often don’t have a useful index […]

Categories
SQL

Blazing Fast Searches Without an Index

I know, clickbait right? Hear me out. Searching vast log or audit tables without indexes is painful. Narrowing down a specific time range often means scanning millions or billions of rows. But that doesn’t have to be the case. Approach This approach is designed for tables with two particular characteristics: The combination of these characteristics […]

Categories
SQL

Understanding Expression Evaluation with ISNULL and COALESCE

When eliminating NULL values with SQL Server queries we typically reach for ISNULL or COALESCE to do the job. Generally speaking they’ll provide equivalent results, but they work in different ways. If you’re dealing with logic more complex than ISNULL(ColA, ColB) – for example using a function or subquery as part of the expression – then you might be in for a surprise. […]

Categories
SQL

Batching Large Data Changes Using Key Ranges

Handling data modifications to large tables can be tricky. We can end up with long running queries and large transactions. Efficiently handling them comes from batching. Effective batching in general helps us by: In this post we’ll take a look at a key-range approach to solve the issue. With this solution a batch will be […]

Categories
SSIS

Identifying Bottlenecks in SSIS Packages

Performance issues in SSIS packages can be tough to track down. You know something is off but trawling through pages of SSIS reporting and the validation noise is frustrating. Adding your own logging is a nice idea but not feasible at scale. An alternative? Querying internal tables within the SSIS database is a quicker way […]

Categories
SSIS

The Silent Killer of SSIS Performance: Paging to Disk

One particular performance issue with SSIS data flows can fly under the radar – spilling to disk. This isn’t clearly visible through regular debugging or execution so can go unnoticed. And it hurts. Paging to disk is bad for performance. Disks are much slower to access than memory, so we want to keep our data […]

Categories
SSIS

Improving Cache Performance in SSIS with Selective Caching

We’ve recently looked at how caching can improve performance and I wanted to show how we can eek even more performance out of caches by using a custom approach I’ll term Selective Caching. I’ll note here that there’s a potential gotcha with this approach which we’ll get to before the end of the post! Scenario Let’s say […]

Categories
SSIS

Using a File Cache for Isolation, Performance, and Reusability in SSIS

Last week we looked at using a cache to improve lookup performance. We saw how a cache improves performance by being able to reuse reference data repeatedly. That used a regular cache but it’s not the only option available to us. In this post we’re going to look at the File Cache option which can achieve […]

Categories
SSIS

Optimising Multiple Lookup Transformations with Caching in SSIS

Lookup transformations provide us a way to access related values from another source, such as retrieving surrogate keys in data warehousing. When we need multiple lookups to the same reference data we can improve performance through the use of a Cache. If we consider data warehousing, a prime example of this would be an order […]

Categories
SQL

Using Missing Index Recommendations in a Pinch

When running queries, reviewing execution plans, or diving into Query Store it’s not uncommon to see missing index recommendations, particularly for more complex queries. But are they any good? There are many factors when considering an index. However if you’re a situation where these missing index recommendations are enticing, I wanted to cover 3 key areas you […]