Tuesday, August 4, 2009

Performance Tips - III

Guys,

Let's discuss a few more performance tips. Application performance is driven by a number of different factors. One of those items that typically affect performance is how long it takes SQL Server to process T-SQL statements. Sometimes the database design and the complicated query requirements hinder the speed at which T-SQL statements can run. Other times the way the T-SQL statements are coded cause SQL Server to perform extra work to resolve a query. As we know one query can be written in many ways, so by rewriting the statements differently, you can help the SQL Server engine optimize your queries and improve performance.

Identify Column Names in Your INSERT statements
We should explicitly identify the column names with an INSERT statement. Don’t code your INSERT statements like this:

INSERT INTO MyTable VALUES ('A','B','C');

When we use this coding style, SQL Server requires that only three columns be defined in MyTable, and the value “A” would go in the first column, “B” in the second column and “C” in the last column. If someone adds a new column to MyTable your application code will break with the following error:

Msg 213, Level 16, State 1, Line 1
Column name or number of supplied values does not match table definition.
Therefore, instead of using the above coding style for your INSERT statements you should code them like this:

INSERT INTO MyTable(FirstCode, SecondCode, ThirdCode) VALUES ('A','B','C');

By doing this when someone adds a new column named “FourthCode” to MyTable the above INSERT statement will continue to work, provided the “FourthCode” column was created with a DEFAULT value or allows NULLS.

Speed Up Your Searches by Prefixing Wild Card References
Appropriate use of wild card can improve the performance of your queries. Say we want to search the AdventureWorks.Person.Contact table for all the LastNames ending in “sen”. For a moment, let’s assume that we have also built an index on the LastName column. If we code your search like so:

SELECT Distinct LastName
FROM Person.Contact
WHERE LastName LIKE '%sen'

The code uses the wild card percent (%) character to match zero to many characters followed by the string “sen” in the LastName field. This will cause SQL Server to perform an index scan operation looking for all the names that end in “sen” to resolve this query. This makes sense because until the entire table is read (scanned) SQL Server can’t guarantee that it has found all the records where the LastName ends in “sen”.

In addition, if we were searching for LastNames that where exactly six characters long and ended in “sen” we could code our wild card search like this:

SELECT Distinct LastName
FROM Person.Contact
WHERE LastName LIKE '___sen'

Here we have used the underscore (_) wild card character. This wild card character is used to match a single character. This coding example is similar to the prior example, and uses an index scan operation to resolve it. Once again, the SQL Engine knows it has to scan the complete index before it knows it has found all the six character names that end in “sen” in the Person.Contact table.

SQL Server can return your results faster if it doesn’t have to read the entire index using a scan operation. SQL Server is smart enough to know when we place a prefix of some kind in front of your percent (%) and/or underbar (_) wild card characters that it can use an index seek operation to resolve the wild card search criteria. Here is an example of a statement that will return all the LastNames that start with “A” and end in “sen”:

SELECT Distinct LastName
FROM Person.Contact
WHERE LastName LIKE 'A%sen'

By putting the “A” character in front of the percent (%) sign in the search criteria SQL Server is now able to tell that it can use an index seek operation to resolve this query. This make sense because once SQL Server has gotten to the end of the “A” for last name , it knows there are no more last names that start “A” so it can stop processing.

Not all wildcard characters need to be prefixed in order to make SQL Server use an index seek operation to resolve a query. Here is an example of where we can use a range wildcard expression and still have SQL Server resolve this query using an index seek operation:

SELECT Distinct LastName
FROM Person.Contact
WHERE LastName LIKE '[A-M]%sen'

In this T-SQL statement, we are looking for all the LastNames that start with anything between an “A” through “M” and end with “sen”. Other wild card syntax that identifies a specific set of characters can also invoke an index scan operation to resolve a wild card search criteria.

Only Use DISTINCT If You Need It
Placing the DISTINCT clause on our SELECT statement takes the results of our query and removes the duplicates. It takes SQL Server an extra SORT operation to sort the data so it can identify and remove the duplicates. Therefore, if we already know that results will not contain duplicates than don’t put the DISTINCT clause on our T-SQL statement. By putting the DISTINCT clause on our query, we are telling SQL Server to perform the sort and un-duplication process. This is extra work for SQL Server and provides no value when our result set only contains unique sets of records in the first place.

Faster Code by Using Best Practices
There are many reasons to follow best practices. Some best practices help your application code not break when schema changes are made, while other best practices improve the performance of your queries, by minimizing CPU, I/O and network bandwidth. This set of best practices covered some simple things you can do to minimize the resources used by SQL Server when processing your SQL Server statements.


Will come back with more tips.

Happy SQL coding.