Categories
SQL Server

More Context for Copilot with SSMS v22.3

Last week, Database Instructions landed for GitHub Copilot as part of the SSMS v22.3 release. Database Instructions piggyback on extended properties to give Copilot specific context about database objects, which can in turn improve its responses.

For me though, this is much more than a step closer to succumbing to AI. I see this as the start of a shift to better collaboration for anyone who interacts with a database, regardless of role. Let’s dive in.

What are database instructions?

The new Database Instructions are text stored against database objects to add more context about the object and how it should be used. A simple example:

We find a Sales table with a Price column. Is that the price for a single unit or the line total? Does that include or exclude VAT? What about discounts?

This is where context is king, and Database Instructions allow us to annotate these details and remove the ambiguity.

The instructions are stored as extended properties. They use markdown for annotation, based on the AGENTS.md format, so they’re plain text. You don’t need to understand that though, we can ask Copilot to do the work for us:

Generate an instruction for the dbo.Users table to define a ‘good’ user as one with 10x UpVotes compared to DownVotes, and a Reputation of at least 1000. They should also have an associated Badge.

The result:

EXEC sp_addextendedproperty 
    @name = N'AGENTS.md',
    @value = N'Definition of a good user: A user row in the dbo.Users table is considered a "good" user when the value in the UpVotes column is at least ten times the value in the DownVotes column, the Reputation is at least 1000, and the user has at least one associated row in dbo.Badges (dbo.Badges.UserId = dbo.Users.Id). SQL predicate: WHERE UpVotes >= 10 * DownVotes AND Reputation >= 1000 AND EXISTS (SELECT 1 FROM dbo.Badges AS b WHERE b.UserId = dbo.Users.Id). Examples: UpVotes=100, DownVotes=9, Reputation=1200, has Badge => good; UpVotes=50, DownVotes=6, Reputation=1500, has Badge => 50 >= 60 false => not good; UpVotes=10, DownVotes=1, Reputation=1000, has Badge => good.',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'Users';

With that applied to our table if we ask Copilot how many good users we have:

SELECT COUNT(*)
FROM dbo.Users AS u
WHERE u.UpVotes >= 10 * u.DownVotes
  AND u.Reputation >= 1000
  AND EXISTS (SELECT 1 FROM dbo.Badges AS b WHERE b.UserId = u.Id);

That’s pretty cool.

Database Instructions come in two flavours – object specific, and database wide.

Above we saw object specific context which uses the AGENTS.md property. This is used to define context about what the objects are, for example a table name which might be abbreviated, what the value of a column means, or indicate a foreign key which is implied but does not have a constraint.

There is also database wide context which instead uses the property CONSTITUTION.md. This is used to define how the database should be used, for example naming standards for objects, retention periods for clean up tasks, or how data should be returned like avoiding PII as standard. You may see teething issues which will get ironed out.

Each of these properties are unique to each object, so once the context is set it would need to be managed like any other extended property, using sp_updateextendedproperty or sp_dropextendedproperty as needed.

If you want to read more about Database Instructions, check out Erin’s post about the release along with the documentation for Database Instructions.

Why are they important?

When approaching a new database, we’re presented with the schema at face value. It might be a legacy system, in a foreign language, or have unwritten rules. The knowledge gaps cause friction. Database Instructions let this knowledge live right in the database.

Knowledge bases aren’t new, but they aren’t consistent either – whether its Confluence-like libraries, readme files within a repository, or simply asking Steve (he’s been here the longest). Seeing Microsoft lock in with a standard which it’s building its own Copilot tooling around gives confidence this could be a longer term direction.

Whilst this knowledge is presented primarily to support Copilot, this context has the opportunity to change how we (real people) build solutions with the SQL Server platform. While the experiences aren’t there yet, humour me with these scenarios where this context could add real value:

  • DBAs doing maintenance are aware of explicit retention and compliance policies to ensure routines are conforming to policy
  • Developers get guardrails when building a proc like squiggle under SELECT * or a query hint because its against the database constitution
  • BI / Analytics developers can see implied relationships when building a model, even when explicit constraints aren’t present
  • Support engineers have visibility of rules and patterns when diagnosing behaviour and defects

There is, of course, the reality of where we’re at. The value of these is currently limited by the effort we put into populating instructions, and the singular piece of tooling which utilises them (Copilot). As time moves on – and with strong adoption – their presence will be felt.

Adoption is the key, and there’s a way to go before we get there. We’d need extended properties to become more prominent within the data lifecycle. There’s already a feedback item open to include them in CREATE and ALTER syntax, and it would be great to see prominence in other areas such as Database Projects, and ensuring they can be effectively deployed as part of CI/CD pipelines.

Clearly we’re early in the life of this feature. If this direction is supported across tooling – from DevOps to IDE to AI to end users – I feel it has potential to accelerate how we build data solutions on the platform.

Wrap up

In this post we’ve looked at the new Database Instructions added in SSMS v22.3 which let us define context to our databases and the objects inside of them. I see this as only the start of the journey to embedding this knowledge into data solutions.

From the AI perspective, it’s great to see deeper understanding than what can be scraped at schema-level, and how these instructions shape the behaviour to provide better solutions using the added context.

Away from AI solutions, I think there’s potential for (human) data developers to get real value from these instructions when tooling catches up. Database Instructions plus Copilot are the first meaningful step in that direction.

If you want to try them out for yourself, grab SSMS v22.3 and check out how your instructions shape your own use cases. Ultimately whether this gains traction or remains a niche feature will depend on how we choose to use it.

Leave a comment