Attaining Reusability in SQL Server: Part I

Multitool

15 Oct

Reusability. It’s hammered into the brains of Software Developers from day one of Software Engineering. You want to write the least amount of code to do the most work and to keep things as consistent as possible. You don’t want to have two functions that are supposed to do the same thing; you want one function that you reference whenever you need that bit of work done so that you are getting the same results across all the components that make up your system. And in general, it is pretty attainable. Unless you’re writing T-SQL.

 

Reusability in T-SQL can be pretty difficult.

Sure you have functions and stored procedures, but they tend to only get you so far. This is especially true when performing data transformations or migrations that need to happen in your database. For instance, you might have several stored procedures to perform transformations that are case specific. Most of these procedures will likely have a few selects, inserts or update statements that are the same amongst the, but you can’t really consolidate them unless you want one giant procedure to do all of them. In either case though, things become hard to maintain, take longer to develop and generally create some pretty messy databases.

 

So how can you attain reusability when writing T-SQL?

Enter User Defined Table Types (UDTTs) and Table Valued Functions (or TVFs). These are some pretty nifty objects made available in SQL Server that have a lot to offer, especially when used together.

UDTTs act as ‘templates’ or definitions for tables that you will be creating in memory at some point. You can declare a local table variable of the UDTT you have defined and pass it around from function to function. For anyone with a C background, it’s pretty analogous to a struct. The full spec for UDTTs can be found here.

TVFs are simply user defined functions that return a result-set instead of a scalar value. This means you can write a TVF to return a table that you can join to, filter on, etc. just like any other result-set. This has some cool implications around performance, which I will get to later. The big benefit from a reusability perspective is that TVFs can accept a Table-Valued Parameters of a UDTT, as well as return their result-set as one. The official spec from Microsoft is here.

When you utilize these two objects together, you can start to think more programmatically about your T-SQL statements. You can create functions that accept data it knows how to deal with, and can return data that the calling process can know how to deal with as well. If you want to return statistics on a UDTT variable table inside of a stored procedure, you could create a TVF that accepts that table type, does the work, and returns another UDTT data-set. You can use this same function everywhere so that you have the same statistics calculations going on wherever you need them.

The one caveat is that you have to use some special T-SQL commands to make this all work. Mainly this is the APPLY operator which has variances that you can read about here. It’s analogous to the JOIN operator, it just works specifically for these types of objects. Once you use the keyword a few times, it will feel just like writing a normal JOIN.

 

And the best part to all of this? Performance won’t suffer.

That’s right, you shouldn’t see performance bottlenecks when moving to this pattern. This is unlike scalar functions which are notorious for introducing performance problems (see references below). Using results-sets instead of scalar values allows the query optimizer to do work up front as if your query was all in one place instead of in TVFs. It treats the result-set of your TVF as a table (duh) and when you pass in a variable of your UDTT, it’s just a pointer to a, well, table. It’s not all that different from reading from a normal table in SQL Server, except that it’s in memory.

That’s the general gist. I will be following this up with a second post to give you some practical examples in the near future, so be on the lookout.

A few of many references and supporting articles for using Table Valued Functions and User Defined Table Types.

Scalar, Functions, Inlining and Performance

Don’t Use Scalar Value Functions

Use Table-Valued Parameters

Dan Goslen
Dan Goslen
dgoslen@bandwidth.com

Dan Goslen is a Software Developer at Bandwidth working on our Billing Platform. He went to NC State after being a raised a Wolfpack fan and graduated with a degree in Computer Science in 2013. He was first exposed to programming when his parents bought him a LEGO Mindstorms set when he was a kid and has been hooked ever since. Being passionate about new ideas and how they evolve, Dan is also helping with our Sprint Sabbatical program. When not coding, Dan can be found playing Ultimate Frisbee, or enjoying a cup of French Pressed coffee.

1Comment

Post A Comment