Categories
SQL

Setting Default Values

Defaulting values is a typical part of data ingestion and processing within SQL. Handling NULL values can cause issues in a number of ways so there are time when we want to get these out of our data. Recently I had a question about setting default values for a field which was being processed as a NULL value and what the options were.

The Scenario

To set the stage I’m going to create a Users table and populate it with some data – with the AccessLevel for one of the records set as a NULL value:

CREATE TABLE #Users (
	Username VARCHAR(50) NOT NULL,
	AccessLevel INT NULL
);

INSERT INTO #Users (Username, AccessLevel)
VALUES ('Andy', 9), ('Paul', 0),
	('Lisa', 5), ('Lizzie', NULL);

SELECT * FROM #Users;
UsernameAccessLevel
Andy9
Paul0
Lisa5
LizzieNULL
Contents of #Users table

Now we’ll create a Destination table where we want to import this data into, and a procedure to copy the data over, nothing too fancy:

CREATE TABLE #Destination (
	ID INT IDENTITY(1, 1),
	UserName VARCHAR(50) NOT NULL,
	AccessLevel INT NULL
);
GO

CREATE OR ALTER PROCEDURE #ImportUsers AS
BEGIN

	/* Add new users */
	INSERT INTO #Destination (Username, AccessLevel)
	SELECT u.Username, u.AccessLevel
	FROM #Users u
		LEFT JOIN #Destination d ON u.Username = d.UserName
	WHERE d.ID IS NULL;

END

So lets make a start by running the procedure and looking at the table values afterwards:

EXEC #ImportUsers;

SELECT * FROM #Destination;
IDUsernameAccessLevel
1Andy9
2Paul0
3Lisa5
4LizNULL
Contents of #Destination table

This is giving us a direct copy of the data and what we want to try and do is to remove that NULL value which is popping into the AccessLevel field.

Adding a Default Value

The first approach which was mentioned was adding a default constraint to the table for that field, but let’s see what happens if we try that. We’ll remove the existing data, add in the constraint and then run the import again:

DELETE FROM #Destination;

ALTER TABLE #Destination
ADD CONSTRAINT df_AccessLevel
DEFAULT (0) FOR AccessLevel;

EXEC #ImportUsers;

SELECT * FROM #Destination;
IDUsernameAccessLevel
5Andy9
6Paul0
7Lisa5
8LizNULL
Contents of #Destination table

Oh, we’ve got the same result? Yep.

Default values don’t stop values being explicitly inserted into those fields, they’ll only default the value if the field isn’t specified as part of the INSERT statement. As a test for this we’ll insert just the Username into the table and see what happens:

INSERT INTO #Destination (UserName)
VALUES ('NullTest');

SELECT * FROM #Destination;
IDUsernameAccessLevel
5Andy9
6Paul0
7Lisa5
8LizNULL
9NullTest0
Contents of #Destination table

Now our new test record has correctly had the default applied to it as we’d have liked.

Default values are great for defaulting data when all of the new records being added need to have the value set, for example if you had a user with a Last Access timestamp and maybe you want to default new records to a particular point in time. If there are only some of the records which need the field to be defaulted then we want to tackle that in a different way.

Defaulting a Value

An alternative approach which would be more appropriate in the example of this procedure is with the use of the ISNULL method to replace any NULL values with one explicitly stated in the query. Lets update the procedure to make use of this:

CREATE OR ALTER PROCEDURE #ImportUsers AS
BEGIN

	/* Add new users */
	INSERT INTO #Destination (Username, AccessLevel)
	SELECT u.Username, ISNULL(u.AccessLevel, 0)
	FROM #Users u
		LEFT JOIN #Destination d ON u.Username = d.UserName
	WHERE d.ID IS NULL;

END

Here we’ve specified that any NULL values should be replaced with a default of 0. With that in place lets clear the data and try the process again:

DELETE FROM #Destination;

EXEC #ImportUsers

SELECT * FROM #Destination;
IDUsernameAccessLevel
10Andy9
11Paul0
12Lisa5
13Liz0
Contents of #Destination table

Success, we’ve got that value defaulted as we’d hoped! (it’s the small wins, eh)

I’ll just add a note at this point that if you were doing this type of approach on a production system it may not be feasible to truncate and reload all of the data just to apply a default. In that case it may also be worthwhile defaulting existing records and changing the column to be NOT NULL to avoid any other erroneous records creeping in:

/* Put the procedure into place first to avoid invalid new records */

/* Manually apply the default to existing records */
UPDATE #Destination
SET AccessLevel = 0
WHERE AccessLevel IS NULL;

/* Now set the column to be NOT NULL */
ALTER TABLE #Destination
ALTER COLUMN AccessLevel INT NOT NULL;

Wrap Up

Both of these approaches are great in the right use cases and will likely be used frequently if you’re involved with data ingestion or manipulation.

Applying default values are good for applying to all new records where you don’t need to specify that field as part of the INSERT statement. They can also be handy if you want to extend an existing table with a new column and by applying a default value to it you wouldn’t need to update any existing statements which are adding data to the table.

Typically when importing data we can find a lot of sources where the fields could be NULL values so the ISNULL method can be heavily used in those scenarios. As this is contained in the query too, it’s also really clear to see what’s happening with the inserts/updates at a later point when someone else might revisit the code.

Leave a comment