Overview

I'm in the process of reviewing a large number of SQL stored procedures, and occasionally I find that a table join isn't properly expressed.  This can result in hard to find bugs, but how can you make your SQL code simpler?  Isn't using a stored procedure best-practice anyway?

Scenario

It is quite common to have a series of related tables.  Consider a stock control system.  Perhaps you have tables for customers, orders, order lines and stock items.  In this simple scenario, you have four tables and clearly there are joins between several of the tables.

Common solution

A common solution is to describe the table joins in SQL code in each stored procedure.  This way the database is only loosely-coupled to the application code, but the stored procedure is tightly-coupled to the database schema.  Suppose the stored procedure returns the number of times a customer has ordered each of the items he/she has ordered.  The SQL might look like the following

SELECT items.item_description, COUNT(order_lines.item_id)
FROM         customers
  INNER JOIN orders ON customers.customer_id = orders.customer_id
  INNER JOIN order_lines ON orders.order_id = order_lines.order_id
  INNER JOIN items ON order_lines.item_id = items.item_id
WHERE customers.customer_name = @customer_name
GROUP BY order_lines.item_id, items.description

This isn't wrong, but it is easy to forget a join field, particularly where a join is on multiple fields, and you have to remember the joins every time.  Essentially, the stored procedure encompasses both schema knowledge and business logic.  Wouldn't it be better to separate the two?

A better solution

Key Point: Views allow you to express table relationships without affecting server performance.  They allow for reusable table expressions.

Let's build some useful views, getting closer to our original SQL as we go along:

Note: I've omitted several columns in the resulting view for brevity, but there is no reason to do so unless you are also using a clustered index on the view (in which case the columns determine the size of the index)

CREATE VIEW all_customer_orders AS
  SELECT customers.customer_name, orders.*
    FROM customers INNER JOIN orders
      ON customers.customer_id = orders.customer_id

CREATE VIEW all_customer_lines AS
  SELECT customer_id, customer_name, order_lines.*
    FROM all_customer_orders INNER JOIN order_lines
      ON all_customer_orders.order_id = order_lines.order_id

CREATE VIEW all_line_item_details AS
  SELECT order_lines.order_line_id,
         order_lines.item_id,
         items.item_description
    FROM order_lines INNER JOIN items
      ON order_lines.item_id = items.item_id

CREATE VIEW all_customer_line_item_details AS
  SELECT all_customer_lines.*,
         all_line_item_details.item_description
    FROM all_customer_lines INNER JOIN all_line_item_details
      ON all_customer_lines.order_line_id = all_line_item_details.order_line_id

Finally, the stored procedure code can be expressed as

SELECT item_description, COUNT(item_id)
  FROM all_customer_line_item_details
GROUP BY item_id, item_description
WHERE customer_name = @customer_name

This SQL no longer assumes any underlying knowledge of the table structure, it simply refers to a single view.  Also, each view only has a single join and is therefore much easier to both test and understand.

Which views to create is a matter of experience, but whenever you are joining several tables, consider using a view.  The view is easier to maintain and easier to test.  It is also reusable.  If you need to parameterise the view, consider using a table-valued function to represent it.

Remember as well that views can be indexed.  This can introduce significant performance benefits without the penalties of denormalised data.  The indexes aren't free from a performance perspective, but judicious use can result in optimal and simple queries.

If you are using SQL Server 2005 Enterprise edition you also get the benefit of having the SQL Server optimiser consider your views (and the associated indices) automatically.

References

Versions

Metadata


Bookmark with :
Digg It! DZone StumbleUpon Technorati Reddit Del.icio.us Newsvine Furl Blinklist
posted @ Monday, January 14, 2008 3:25 PM | in SQL Server Software Development IT Management

Comments

No comments posted yet.

Post Comment

Title *
Name *
Email
Url
Comment *  


Please add 4 and 4 and type the answer here: