Categories
SQL

Complex Updates and Deletes

The UPDATE and DELETE statements allow us to manage the data within our databases and being able to use them effectively is key. There was recently a question from a junior developer asking about how to perform data changes when table joins are required so I thought it a good opportunity to put forward some more thought out details around this.

Simple changes

When we’re looking to perform an UPDATE or DELETE statement there are different ways you may be used to writing them, with one of the more straightforward being something like below:

UPDATE dbo.Users
SET Surname = 'Smith'
WHERE UserId = 123;

For the purpose of making this more complex I want to look at an alternative way of writing this query in a way that we build on. We’ll restructure it to include a FROM clause as we would see within a SELECT statement.

Below is an alternative way to write the UPDATE statement above:

UPDATE u
SET u.Surname = 'Smith'
FROM dbo.Users u
WHERE u.UserId = 123;

You’ll see we’ve used the alias u for the Users table. We’re able to do that as the syntax has been changed to include a FROM clause in the statement. Aliases become more useful when you’re working with a number of tables and don’t want to have to write the full names out each time they’re referenced. They can also help with readability.

The same approach can also be used with a DELETE statement. A DELETE statement will already include a FROM clause however we can still utilise aliases with these. Below you’ll see examples written without and then with an alias:

/* Regular Delete */
DELETE
FROM dbo.Users
WHERE UserId = 456;

/* Alternative with an Alias */
DELETE u
FROM dbo.Users u
WHERE u.UserId = 456;

It’s worth noting that I’m only using aliases to make the code more readable and aren’t required for any of the joins we’ll be doing. You could instead substitute those with the full table name.

Adding a join

Now that we’ve seen how we can rewrite a regular UPDATE or DELETE to utilise an alias we can build on this to add more tables into the updates. We can still only update one table, but we can use other tables to support in that update process.

Let’s look at an example below where we could update the status of the user based on the status of their account:

UPDATE u
SET u.IsActive = a.IsActive
FROM dbo.Users u
    INNER JOIN dbo.Accounts a ON u.UserId = a.UserId
WHERE u.UserId = 789;

You can see here that using the FROM clause allows us to define multiple tables to be used as part of the update. By adding the aliases into this it also makes the code much neater although they aren’t strictly required.

The same pattern also works for DELETE statements too, so if we wanted to remove any inactive user accounts it could look something like this:

DELETE u
FROM dbo.Users u
	INNER JOIN dbo.Accounts a ON u.UserId = a.UserId
WHERE a.IsActive = 0;

You’ll see in this example that in our WHERE clause we’re not even filtering by anything in our Users table, all of the changes are driven from the Accounts table which has been joined.

Validating changes

Whenever we’re looking to make changes to our data there’s always that small risk that something may go wrong – we may have missed a WHERE clause, or we could target the wrong table. When we’re using multiple tables in our modifications that risk is amplified.

One method to help mitigate against these risks is to take our completed scripts and change them into SELECT statements to see what data will be changed by the action. Of course there could be changes in the data between when you run your SELECT and when the change is made, but doing this should help give confidence that any filters or joins are set up correctly.

Doing this is pretty straightforward too, just remove the UPDATE or DELETE portion of the script and then SELECT any required fields. Let’s see how our previous statements may look:

/* The Update statement */
SELECT
    u.UserId,
    u.IsActive AS OldStatus,
    a.IsActive AS NewStatus
FROM dbo.Users u
    INNER JOIN dbo.Accounts a ON u.UserId = a.UserId
WHERE u.UserId = 789;

/* The Delete statement */
SELECT u.*
FROM dbo.Users u
	INNER JOIN dbo.Accounts a ON u.UserId = a.UserId
WHERE a.IsActive = 0;

In this instance I’ve changed the UPDATE to show me which user will be impacted and what their old and new statuses will be. For the DELETE statement I’ve just asked to see the user records which will be getting removed.

Using this would allow us chance to go back and make changes and then re-run our checks with updated logic if we were to spot any issues. These statements can be used to take a copy of the existing data if that’s how you’d like to approach a rollback if you ran into issues.

Wrap up

Being able to fully utilise the UPDATE and DELETE statements is a core skill for database development and being able to include joins in them is key. Hopefully I’ve shown how straight forward it is (in more words than may be necessary).

In summary – join them like a SELECT statement, try to use aliases to make code more readable, and make sure to check the records which may be impacted before executing the actual changes. Keep it simple and effective.

Leave a comment