11/04/2011

5 Tips and Techniques for Avoiding Automatic GC Collections


Automatic memory management isn’t new, but it’s a wonderful thing for programmers. We bring you some tips and techniques to help you understand a bit more about how .NET’s memory management works, so that you can ensure that you write high-performance .NET code.
The .NET garbage collector is, in general, a wonderful thing. It has freed countless programmers from the drudgery both of writing allocation and de-allocation code, and also of tracking down heap corruption and memory leaks (due to bad pointers or de-allocation errors). It allows programmers to simply write code and focus on function without normally needing to care about how memory is actually managed. This tends to make it much easier for junior programmers to write correct (or at least functional) code, thereby reducing training costs and the time necessary to do things like code reviews.
However, for all that the garbage collector is a brilliant piece of software engineering, sometimes collections still occur at inconvenient times. If you need to write high performance, real-time code, in addition to measuring performance using tools like ANTS Performance Profiler, you generally want to try to avoid GC collections during times where execution speed is critical. For example: in an XNA game, you want to avoid collections during gameplay to prevent visual glitches in your animations; in a real time financial analysis and trading application, you want to avoid collections between when you begin analyzing incoming data and when you finish executing an algorithmically-controlled trading decision based on that data. Of course, sometimes GC collections at inconvenient times are unavoidable, and in those situations it is best to instead try to minimize the duration of individual collections (See: http://blogs.msdn.com/b/shawnhar/archive/2007/07/02/twin-paths-to-garbage-collector-nirvana.aspx). This article focuses mainly on avoiding collections, and while there is some overlap between optimizing to minimize GC frequency and optimizing to minimize GC latency, the two goals can also collide. This is certainly the case with object pooling, which can be very helpful with reducing frequency, but is almost guaranteed to increase latency (especially with non-generational GCs).
As you’re no doubt already aware, the GC runs automatically when certain defined conditions are met, and the specific conditions vary depending on the common language runtime in which the program is running. The .NET 4.0 Framework CLR has different conditions than the .NET 3.5 Compact Framework CLR, for instance. Nonetheless, the GCs of every CLR that I am familiar with all run a collection when a certain number of bytes have been allocated since the previous collection, and it’s this triggered collection that we will consider.
Note: Generational GCs are more complicated, as we saw in Clive Tong’s article, The Top 5 .NET Memory Management Misconceptions. The essence of a generational GC is that objects which survive a collection of their current generation are promoted to the next generation (if any). The higher generations are collected less frequently, such that the GC typically runs more quickly (it only needs to examine all the objects which are at or below the generation for which it is running a collection). For more, see: http://msdn.microsoft.com/en-us/library/ee787088.aspx and http://blogs.msdn.com/b/abhinaba/archive/2011/06/08/wp7-mango-mark-sweep-collection-and-how-does-a-generational-gc-help.aspx. Non-generational GCs maintain all objects at the same level such that every collection entails an examination of all live objects.
The .NET Framework 4.0 CLRs that run on Windows OSs have generational GCs, with algorithms that will vary the number of bytes that trigger an automatic collection in order to increase performance. On the other hand, the CLRs that run on the Xbox 360 and on Windows Phone 7.0 both run automatic collections whenever 1 MB has been allocated since the last collection. To be more specific, the Windows Phone Mango CLR has a generational GC that collects Gen0 whenever 1 MB has been allocated to Gen0, and collects Gen1 whenever 5 MB has been promoted to that generation (see: http://blogs.msdn.com/b/abhinaba/archive/2011/06/14/wp7-mango-the-new-generational-gc.aspx).
Understanding a bit more about how .NET’s memory management works can help you with both paths. These tips and techniques should help to both increase your understanding of what your code is doing, and to design your code to run faster with a minimum of extra work.

1. Know the difference between value types and reference types

There are two kinds of data types in .NET: value types and reference types (see http://msdn.microsoft.com/en-us/library/ms173104.aspx, specifically the material beginning at the “The Common Type System” subheading). The difference between these two types ultimately comes down to how they store their data. An instance of a value type holds its data within itself, whereas an instance of a reference type holds a reference to the memory location of its data. Both value types and reference types have a default value, which is assigned to the instance when it is created in various situations.
Typical situations where the default value is automatically assigned are static variables, member variables of structs when instantiated with the default constructor, and member variables of classes when a value is neither assigned in the constructor nor explicitly specified in the field’s declaration.
For value types, the default value is the result of a zeroing of the type’s member variables (e.g. for integers it’s zero, for booleans it’s false, for enumerations it’s zero (normally the first member of the enum), and for structs it’s the default value of each member of the struct). For reference types, the default value is null. For those who are curious, the value of null in .NET is a zeroed-out reference (the size of a reference depends on the runtime platform’s native memory address size and is thus only known at runtime).
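As a minimal illustration of those defaults (the Point2 and Widget types below are invented purely for this example):

public struct Point2 { public int X; public int Y; }

public class Widget { public int Id; }

public static class DefaultValueDemo
{
    // Static fields receive their default values automatically.
    private static Point2 defaultPoint;   // value type: zeroed members (X == 0, Y == 0)
    private static bool defaultFlag;      // value type: false
    private static Widget defaultWidget;  // reference type: null (a zeroed reference)

    public static void Print()
    {
        System.Console.WriteLine(defaultPoint.X);          // 0
        System.Console.WriteLine(defaultFlag);             // False
        System.Console.WriteLine(defaultWidget == null);   // True
    }
}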
So why does this matter? In truth, the difference is normally not too important, although sometimes you might trip over some peculiarities of value types (e.g. if you have a struct as a property, you cannot directly set any of its public fields or properties since the property’s get method returns a copy of the struct, instead of a reference to the struct). However, when writing to avoid or minimize GC collections, the difference becomes very important.
A value type either exists or does not exist. It cannot be null (the Nullable&lt;T&gt; struct simulates nullity by maintaining a bool that specifies whether or not the struct has a value and by overriding the Equals method inherited from System.Object to handle checking for null; it is never actually null, it just tests as equal to null in the appropriate circumstances) and exists for as long as the object that contains it does. As such, a value type has, at worst, a negligible effect on garbage collection. In fact, value types often do not exist within the GC’s heap. For example, a value type instance that is created within a method is not managed by the GC and, as such, the GC’s internal allocation-tracking counter does not increase. On the other hand, if the value type is a struct, and that struct has a reference type as a member, then that reference type (should it be instantiated) is still managed by the GC. However, the struct itself, when locally scoped in an ordinary method, is not subject to the GC’s purview (see http://blogs.msdn.com/b/ericlippert/archive/2010/09/30/the-truth-about-value-types.aspx).
Further, even when a value type is stored on the GC’s heap, e.g. when you create an array of value types, the GC does not need to check to see if those value type instances still exist when it does a collection since a value type cannot be null; the mere fact that the array still exists guarantees that they exist (and when the array is no longer reachable, they are also thereby not reachable, as their value nature precludes a reference to them existing in such circumstances).
References to value types are possible, but never ones that could survive the object that contained the value type itself. An example would be a method like: public void AddHalf(ref double value) { value += 0.5; } . The ref double parameter is legal (and saves 4 bytes in the method call in a 32-bit program since references are 4 bytes and doubles are 8) but it’s impossible to have an object that contains the referenced double (such as an array) be subject to collection so long as the reference to the double exists.
So when the GC, in the course of doing a collection, comes across an array of one million floats, it just needs to check to see if the array itself is still reachable. By contrast, for an array of one million class instances, it needs to check to see if the array itself is reachable, and then it needs to check each and every one of those one million elements to see if they are valid references or are null. The same holds true for generic collections. By using value types where appropriate, you can both help prevent GC allocations and also make the GC’s collections speed along more quickly.

2. Consider a Struct Instead of a Class.

When creating a type, especially one that you are going to use in an array or a generic collection, consider making it a struct. Since structs are value types, they cannot be null such that the GC does not manage them (though it does manage any reference types that are members of the struct). When a GC collection does run, it only checks reference types to see if they are still reachable. This is only a small benefit for individual struct instances, but the benefit grows quickly with arrays, and can help to significantly reduce collection times when an array is composed of struct instances rather than class instances.
In my tests, I’ve come across one possible exception. Where you have a struct that contains a reference type as a member and a class with the same members, if most class instances in an array are null then the .NET Framework 4.0 x86 CLR’s GC will be faster on the array of classes. If there are no reference types within the definition or if even a small number (less than one-tenth) of the class instances are non-null, then the struct array retains its advantage.
With an array of class instances, the GC has to check each item in that array to see if it is a live object or not (the same is true for generic collections, which use an internal array). With an array of structs, the GC just looks to see if the array itself is still a live object, since structs cannot be null (this is true even for Nullable<T> structs, which just use an internal tracking mechanism to determine nullity). So that is potentially thousands or even millions of items that the GC does not need to examine when a collection runs!
There are several downsides to structs, though, so don’t get too carried away. They cannot inherit from classes or even from other structs, so while they can implement interfaces, the lack of inheritance can lead to a fair bit of code duplication and can increase your development time as a result. They also, as the term ‘value type’ implies, are copied by value rather than passed by reference when used as method parameters or assigned to / retrieved from arrays, collections, and properties. As such, when they have a large number of members, the cost of copying begins to add up. Lastly, they cannot have destructors (also known as finalizers), which makes them a poor choice for items that need to implement IDisposable. Nonetheless, when you have a data structure of which many thousands of instances will be created, or for which it would be impractical to create an object pool, a struct can often be a big benefit in terms of GC performance and minimizing automatic GC collections.
The XNA Framework serves as a great example of appropriately using structs in preference to classes. To the extent that it was possible, the types which are commonly created and used in gameplay code, such as Vector2, GamepadState, and Color, have been implemented as structs made up of other value types. Types that aren’t commonly created during performance critical code or which need to be classes to implement IDisposable properly, such as BoundingFrustum and Texture2D, are implemented as classes.
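To make the idea concrete, here is a rough sketch of the kind of gameplay type this suggests. The Particle struct is entirely hypothetical and not part of XNA; it simply shows a struct composed of value types being used in a large array:

// A hypothetical gameplay type, written as a struct of value types. An array of
// these is a single object from the GC's point of view: the collector checks the
// array's reachability, not each element.
public struct Particle
{
    public float X;
    public float Y;
    public float VelocityX;
    public float VelocityY;
    public float Age;
}

public class ParticleSystem
{
    // One GC-tracked object (the array itself); the Particle values live inside it.
    private readonly Particle[] particles = new Particle[100000];

    public void Update(float elapsedSeconds)
    {
        for (int i = 0; i < particles.Length; i++)
        {
            // Operate on the array elements in place; array indexing gives direct
            // access to each struct, so no copies are made here.
            particles[i].X += particles[i].VelocityX * elapsedSeconds;
            particles[i].Y += particles[i].VelocityY * elapsedSeconds;
            particles[i].Age += elapsedSeconds;
        }
    }
}

Because the array is the only GC-tracked object here, a collection that encounters it has a single reachability check to make, regardless of how many particles it holds.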

3. Design classes so that instances are reusable and then use object pools where appropriate.

Regardless of the platform, most managed objects, excluding arrays and generic collections, tend to be small. Because of this, keeping instances of objects around when you are done with them and reusing them, rather than creating new instances every time, can be a useful trick. Doing so prevents the GC’s internal allocation counter from increasing, which helps prevent an automatic GC collection from running. Write classes you are likely to instantiate many times so that each overload of the constructor has an equivalent public method that you can call to reinitialize an existing instance. Then use a generic collection type such as a List&lt;T&gt; to pool object instances.
My own coding tends to be on platforms that use the .NET Compact Framework. The Xbox 360 and Windows Phone 7 both use versions of the .NET CF, and both automatically run a collection every time that 1 MB of GC-managed memory has been allocated since the previous collection. On the .NET Framework, the GC is more flexible on when it runs a collection, and also has separate heaps for large objects (the “LOH”) and for small objects, along with a generational GC which introduces efficiencies that make GC collections less intrusive. Object pools can also be very expensive if you are unable to eliminate GC collections during time critical code, since you are keeping around more objects that the GC will need to examine (this expense is especially a factor on platforms without a generational GC).
The following is an example of an object pool.
    /// <summary>
    /// A basic generic object pool. Objects are removed from the pool when requested
    /// and re-added when returned.
    /// </summary>
    /// <typeparam name="T">
    /// The type of object to pool. This must be a class and have a parameterless
    /// constructor.
    /// </typeparam>
    public class ObjectPool<T> where T : class, new()
    {
        /// <summary>
        /// The pool itself.
        /// </summary>
        private System.Collections.Generic.List<T> pool;

        /// <summary>
        /// A lock for synchronizing access to the pool.
        /// </summary>
        private readonly System.Object _poolLock;

        /// <summary>
        /// Returns the count of items currently in the pool.
        /// </summary>
        public int Count
        {
            get
            {
                lock (_poolLock)
                {
                    return pool.Count;
                }
            }
        }

        /// <summary>
        /// Creates the object pool with <paramref name="initialCount"/> number
        /// of instances created and added to the pool. The pool will have an
        /// initial capacity of <paramref name="initialCapacity"/>.
        /// </summary>
        /// <param name="initialCapacity">
        /// How large the initial capacity of the pool should be. Set this high
        /// enough that you won't exceed it, but not so high as to needlessly
        /// waste memory.
        /// </param>
        /// <param name="initialCount">
        /// How many object instances to create initially.
        /// </param>
        public ObjectPool(int initialCapacity, int initialCount)
        {
            _poolLock = new System.Object();

            pool = new System.Collections.Generic.List<T>(initialCapacity);

            for (int i = 0; i < initialCount; i++)
            {
                pool.Add(new T());
            }
        }

        /// <summary>
        /// Gets an instance of <typeparamref name="T"/> from the pool. If the pool is
        /// empty, a new instance of <typeparamref name="T"/> is created and returned
        /// to you instead.
        /// </summary>
        /// <returns>An unused instance of <typeparamref name="T"/>.</returns>
        public T GetObject()
        {
            lock (_poolLock)
            {
                if (pool.Count == 0)
                {
                    System.Diagnostics.Debug.WriteLine("Had to create a new item.");
                    return new T();
                }

                // Remove from the end of the pool to prevent an internal rearrangement.
                var item = pool[pool.Count - 1];
                pool.RemoveAt(pool.Count - 1);
                return item;
            }
        }

        /// <summary>
        /// Return (or add) an existing instance of <typeparamref name="T"/> to the pool.
        /// </summary>
        /// <param name="item">
        /// The instance of <typeparamref name="T"/> to return to the pool.
        /// </param>
        /// <exception cref="System.ArgumentNullException">
        /// Thrown when <paramref name="item"/> is null.
        /// </exception>
        /// <exception cref="System.InvalidOperationException">
        /// Thrown when <paramref name="item"/> is already in the object pool.
        /// </exception>
        /// <remarks>
        /// Do not return an object that you are still using. This will likely lead
        /// to the same object being "checked out" twice which will cause bugs. Also if
        /// <typeparamref name="T"/> implements <see cref="System.IDisposable"/>, do not
        /// return an object that has been disposed.
        /// </remarks>
        public void ReturnObject(T item)
        {
            if (item == null)
            {
                throw new System.ArgumentNullException("item");
            }

            lock (_poolLock)
            {
                if (!pool.Contains(item))
                {
                    pool.Add(item);
                }
                else
                {
                    throw new System.InvalidOperationException("The pool already contains this item.");
                }
            }
        }

        /// <summary>
        /// If <typeparamref name="T"/> implements <see cref="System.IDisposable"/>, this
        /// will call Dispose on all objects in the pool.
        /// </summary>
        /// <remarks>
        /// If <typeparamref name="T"/> does not implement
        /// <see cref="System.IDisposable"/>, this will do nothing. You should
        /// only call this method when you are completely finished with the pool as
        /// as you will otherwise get disposed objects back when calling
        /// <see cref="GetObject"/>.
        /// </remarks>
        public void DisposeAllObjects()
        {
            if (typeof(System.IDisposable).IsAssignableFrom(typeof(T)))
            {
                // Lock for consistency with the pool's other members.
                lock (_poolLock)
                {
                    foreach (var item in pool)
                    {
                        (item as System.IDisposable).Dispose();
                    }
                }
            }
        }
    }
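Usage might look something like the following sketch. The Enemy class and its Reset method are hypothetical stand-ins for whatever reusable type you design along the lines described above:

    /// <summary>
    /// A hypothetical reusable type; Reset mirrors a constructor overload so that
    /// a pooled instance can be reinitialized instead of replaced.
    /// </summary>
    public class Enemy
    {
        public int Health;

        public void Reset(int health)
        {
            Health = health;
        }
    }

    public static class PoolUsageExample
    {
        // Created at load time so the pool's own allocations happen before gameplay.
        private static readonly ObjectPool<Enemy> enemyPool = new ObjectPool<Enemy>(64, 32);

        public static Enemy Spawn()
        {
            // Reuse a pooled instance rather than allocating mid-gameplay.
            var enemy = enemyPool.GetObject();
            enemy.Reset(100);
            return enemy;
        }

        public static void Despawn(Enemy enemy)
        {
            // Only return an object once you are completely finished with it.
            enemyPool.ReturnObject(enemy);
        }
    }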

4. Use collection constructor overloads that take an initial capacity for collection types that have them.

Most of the generic collections have constructor overloads that take an ‘int initialCapacity’ parameter. Internally, List&lt;T&gt;, Dictionary&lt;TKey, TValue&gt;, and the other generic collections use one or more arrays along with a tracking variable that identifies how many items of the internal array hold valid data. Whenever adding an item to the collection would exceed the capacity of the internal array, a new array that is double the previous capacity is created. The existing array’s elements are then copied to the new array, and the new array then replaces the old internal array. All of this generates allocations that increase the GC’s internal allocation counter.
When you first create a collection using its parameterless constructor with the .NET Framework 4.0 CLR, the internal array’s length is 0, and the first addition causes it to resize to hold 4 elements. These are all implementation details, including the fact that there is an internal array at all, and they could change in the future, so you should not rely on them. The advice itself (use initial capacity constructors) is part of the public interface, though, so following it is safe.
If you know approximately how many elements will ever be in that collection at one time, you can use one of the constructor overloads that tells the collection to create its internal array(s) with a specific initial size, rather than its default size. If you know the exact number of elements your collection will hold, use that. If not, use an estimate with a small bit of extra capacity, just in case. If you don’t have any idea, consider using System.Diagnostics.Debug.WriteLine to write out the name, current Count, and current Capacity of the collection whenever you have just added an item to it. Then run your program as normal and make note of the debugging output. This will give you a good estimate of each collection’s capacity needs, which you can then use to decide on an initial capacity.
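As a rough sketch of both the sized constructor and the temporary instrumentation described above (the ScoreBoard class and the capacity of 512 are invented for illustration):

using System.Collections.Generic;

public class ScoreBoard
{
    // Suppose profiling showed the list never holds more than about 500 entries:
    // size it once up front and the internal array is never reallocated.
    private readonly List<int> scores = new List<int>(512);

    public void Add(int score)
    {
        scores.Add(score);

        // Temporary instrumentation to discover real capacity needs; remove it
        // once you have settled on an initial capacity.
        System.Diagnostics.Debug.WriteLine(
            "scores: Count=" + scores.Count + " Capacity=" + scores.Capacity);
    }
}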
Assuming you set your initial capacity high enough, that internal array will never need to be resized, meaning you can avoid generating new allocations. However, setting the initial capacity to be far in excess of your needs will just cause wasted memory (especially with collections of value types), and will cause GC collections to take longer when they do occur.

5. Be aware of common operations that generate allocations.

There are a few common operations in .NET that generate GC allocations in a way that might not be obvious. The two prime targets are the boxing of value types and most string operations. We’ll look at each in turn.
Boxing of value types refers to the act of casting a value type (such as an integer or a float) to a reference type (such as System.Object or an interface). One place this is commonly found is in string operation methods that take a ‘params object[]’ argument such as String.Format and Console.WriteLine. Since every System.Object has a ToString method, you can cast anything (including value types) to Object, and then pass them to a method that uses ToString to get that object’s string representation. Boxing also occurs when explicitly casting a value type to an interface. While interfaces cannot be instantiated, the CLR treats them as being reference types whenever you create an instance of an interface (through casting an existing data item that implements that interface).
There is also a corresponding ‘unbox’ CIL instruction for casting a boxed value type from Object back to its value type. This is commonly found when a method takes in a parameter as an Object instance and then casts it to its appropriate value type. Unbox does not generate GC allocations.
For value types, a reference type container (a ‘box’) needs to be created to hold the underlying value type when it is cast to Object or to an interface. If you use a disassembler tool such as ildasm.exe or .NET Reflector®, you can look for the Common Intermediate Language (“CIL”) ‘box’ instruction to find instances where you are boxing value types. It’s not always possible to eliminate boxing, and it may not be worthwhile to do so even when it is possible, but simply by knowing what it is and how to spot it, you are more aware of the memory implications of your code.
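As a small, self-contained sketch of where boxing sneaks in (none of this is taken from a real codebase):

public static class BoxingExample
{
    public static void Demonstrate()
    {
        int frame = 42;

        // Boxes 'frame': the params object[] parameter of String.Format forces the
        // int into a heap-allocated box (a 'box' CIL instruction appears here).
        string formatted = string.Format("Frame {0}", frame);

        // No box is created here: the int is converted straight to a string.
        string noBox = "Frame " + frame.ToString();

        System.Console.WriteLine(formatted);
        System.Console.WriteLine(noBox);
    }
}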
Strings are a strange thing in .NET. They are reference types and thus are managed by the GC. However they are immutable and are not subject to re-use due to the way they are stored (see http://blogs.msdn.com/b/ericlippert/archive/2011/07/19/strings-immutability-and-persistence.aspx). As such, virtually all operations that involve transforming a string will in some way result in GC allocations. String concatenation creates a new string, for instance, as does trimming whitespace with Trim and its cousins. Most other string operations (e.g. almost any one that returns a string) will generate allocations. Transforming objects into their string equivalents by using the ToString method also normally generates GC allocations. They’re hard to avoid.
One way to minimize allocations caused by string operations is to use System.Text.StringBuilder. StringBuilder is to the String class as List&lt;T&gt; is to an array. By this I mean that it uses an internal array that it automatically resizes when needed (using the same “create an array that is twice as big, copy over the data, and replace the existing array with the new array” mechanism). Unfortunately, not many classes in .NET that take a string will also take a StringBuilder. Nevertheless, you can use StringBuilder to compose strings from many different parts, and can reuse StringBuilder instances to cut down on allocations. It’s not often a large gain, but it can help save some allocations in code that performs a lot of string operations.
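A minimal sketch of the reuse pattern, assuming a hypothetical HudText class that rebuilds a small string every frame:

using System.Text;

public class HudText
{
    // Reused between calls, so the builder's internal buffer is allocated once
    // (as long as the composed text stays within its capacity).
    private readonly StringBuilder builder = new StringBuilder(64);

    public string BuildScoreText(int score)
    {
        builder.Length = 0;            // reset the builder without allocating
        builder.Append("Score: ");
        builder.Append(score);
        return builder.ToString();     // ToString produces the one string we actually need
    }
}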

Wrap Up

Automatic memory management is by no means a new invention (John McCarthy, the inventor of LISP, first came up with the concept in 1959). Nonetheless, it’s a wonderful thing for programmers, and is something that will likely find its way into many more venues as computing power continues to grow. Knowing a bit about how it works is something I think every .NET programmer could benefit from. I hope this has helped shed a bit of light on it for you.

Bonus Tip - 6. Force a GC collection at convenient times.

Using the GC class, you can force a collection by calling its static Collect method. In ordinary programming, it’s usually best to let the GC handle such matters itself. However, if you have a situation where a GC collection would be bad, e.g. during the middle of a level in a game, it usually makes sense to force a collection right after you have finished loading that level’s content (right before going into gameplay). Doing so will reset the GC’s internal allocation counter, and delay the next collection. If you have managed to limit, but not eliminate, garbage generation during gameplay, then you might find that your time-sensitive operation will complete before you pass the magic allocation mark at which an automatic collection occurs. So, if you can identify good places within your code to force a GC collection, consider doing it and measuring the performance results. A forced collection might be just the thing you need to stave off an unwanted automatic collection.
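For instance, a minimal sketch of the idea; the LevelManager class and LoadLevel method are hypothetical:

public class LevelManager
{
    public void LoadLevel(string levelName)
    {
        // ... load textures, sounds, and level data here ...

        // Force a collection now, while a pause is acceptable, so the GC's
        // allocation counter is reset just before gameplay begins.
        System.GC.Collect();
    }
}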

10/30/2011

SSMS Tools Pack 2.0


You know all those little things that make you crazy when you’re developing in SQL Server Management Studio? Those pieces of functionality that you just can’t understand why Microsoft didn’t include? Stuff like generating a set of CRUD statements (Create, Read, Update, Delete), or running custom scripts right from the SSMS GUI, or even a way to analyze execution plans rather than read through them? Yeah, Mladen Prajdic felt exactly the same way. Except unlike the rest of us, instead of sitting around moaning about how the software doesn’t do this or that for us, Mladen just started building it for himself. Then he went one step further: he let the rest of us have it too. That’s right, SSMS Tools Pack is a free piece of software that patches over a lot of the little cracks in SSMS (although I’m certain that Mladen wouldn’t mind contributions if you really love his work). It works with SQL Server versions from 2000 to Denali, and everything it does is done directly within SSMS, so you can stay well and truly inside your comfort zone, developing with the toolset that you’re used to. In fact, SSMS Tools Pack will quickly become one of those pieces of software that you won’t want to live without.

Download and Install

When I heard that a whole new version was becoming available, I couldn’t wait to try it out. I downloaded it the first day it was available. Here is the exhaustive list of actions and requirements to download the software:
That’s right. No registration, no email address, no home phone, none of the standard stuff that you have to supply in order to get a piece of software. Did I mention that this is free software and that Mladen doesn’t really want to know who you are in order to supply you with it?
After you launch the installer, it will recognize what version of SQL Server you have installed on your desktop. I tried the install on a 2008R2 instance and on a Denali instance and didn’t have any troubles with it whatsoever.
On the Denali instance, I had SQL Server Management Studio open when I did the install. I didn’t get any error messages, but when I went to SSMS I also didn’t see any of the new tools available. Upon restarting SSMS, the tools loaded right up.

Functionality

There’s a shocking amount of functionality built into this piece of software. Because it covers so much different functionality, trying to find a handle on how to approach the evaluation isn’t entirely simple. So, with the full set of functionality in front of me, I’m going to cover the stuff that I find cool or intriguing. If I leave out your favorite function, there’s your chance to post a comment or write up an article of your own.
Mind you, I’m not going right to the coolest stuff. We’ll start slow and build.

Generate CRUD

There’s a small market in code generators for SQL Server. Why? Because most of the queries, the CUD part of CRUD, can be built completely through generation. There just is no reason why SSMS can’t do this on its own. But, instead of having to go to an outside tool or work with code bases other than TSQL, you can just right-click on the table you’re interested in and one of the new context menu choices is “SSMS Tools”, and right under that, “Create CRUD…”. Clicking on it opens a new query window with a set of stored procedures all set up to add, update, remove or read the data from the selected table. It’s that easy. The statements are well formatted too, with DROP statements before the CREATE statements, so that you can rerun the script as needed when you change your structures.
It gets better though. You’re not dependent on the structures that Mladen has provided. If you go to the SSMS Tools menu, there’s a list of options to modify different functionality. CRUD Generation is right there with some of what you can control as shown in Figure 1:
CRUD generator options
Figure 1
As you can see, you can turn the functionality on or off. More importantly, you can control column order, set stored procedure prefixes and more. But the important stuff is on the next four tabs. You’re not dependent on Mladen’s choices for how to format and lay out your procedures (although I don’t have any problems with his choices). If you wanted to put a standard TRY/CATCH error handler in your Insert statements, all you have to do is click on the tab and modify the query you find there, just like you see in Figure 2:
Modifying the query
Figure 2
All you have to do is follow the instructions and you can modify these procedures so that they work exactly the way you’d want them to.
One change I’d like to see: make it so that I don’t have to generate the Read procedure, because those are mostly not simple, select-from-one-table affairs on the systems I’ve worked with.

Connection Coloring

I’ve never in my life done something like what I’m about to describe, but I’ve heard that others have done so. You hit execute on that RESTORE DATABASE script and you set it to restricted user and put ROLLBACK IMMEDIATE in there because the developers on the development server where you’re restoring the database are never logged out even though they say they are and then your eye drifts down to the bottom of the screen… wait, that’s not DEV02\BoringOldDevServer. It’s PROD42\YouBetterGetYourResumeCleanedUpServer. Oh… *******…
Not that I’ve ever done that, or at least, there are no living witnesses that I’ve ever done that. Same thing really.
That’s where Windows Connection Coloring comes in handy. If you go back to the SSMS Tools menu you can find the appropriate menu and open the options window. Once the window is open, you can supply a server name, or, you can use regular expressions to define a pattern, so that you can set a group of servers up to behave a certain way. Figure 3 shows how I configured it locally:
Configuring Connection Coloring
Figure 3
If you do type in a regular expression, make sure you also select the “IsRegex” checkbox. Once you have things set up appropriately, you’ll see a colored bar at the top of your query window so you can be very aware of which server or set of servers you’re currently connected to. Figure 4 shows the bar on display:
The colored bar displays
Figure 4
I realize this is a pretty simplistic little thing, but if a simple little thing can prevent a production outage, it’s pretty cool. In keeping with my campaign to irritate Mladen with more suggestions, though: since you can format the bottom color on connections individually, it would be good if this also updated that color, so that you could see the color chosen for a particular connection on both the top and the bottom of the query window.

Tab Sessions

New with Version 2 of SSMS Tools Pack is Tab Sessions. This little bit of functionality keeps track of the tabs that you have open in SSMS and the TSQL inside those tabs. On a periodic basis (set within the options screen) it will save the tabs and the code. Then, say after exiting and reentering SSMS, you can hit the Restore Session button on the toolbar and it will reopen your tabs. Which is really cool, but it gets better.
If you go to the SSMS Tools menu and to the “SQL History” menu there, you can see the “Tabs History Management” menu choice. Clicking on that will open up a collection of different sets of tabs going back in time. You can see an example in Figure 5:
Tab selections history
Figure 5
It’s cool that you can scroll through these and pick one to restore, but what’s really cool is that you can then click on the tabs, like I have on the center one, and scroll through the code on that tab to ensure it’s the one you want. I’m digging this in a major way.
One major issue that I do have with it though, and this isn’t just another ‘irritate Mladen’ request, is that while it knows which TSQL I have in a tab, it’s ignorant of the fact that that TSQL came from a file. It will restore the TSQL to the appropriate tab in your window, but it won’t link it back up with the file that you had it open from. If you continue editing that TSQL, it won’t be saved to the file until you choose to save it and overwrite what you saved originally.

Execution Plan Analyzer

I’ve saved the coolest new feature for last. SSMS Tools integrates with your execution plans. There’s a button on the SSMS toolbar, “Show/Hide Execution Plan Analyzer.” Click on that button when you have an execution plan open in SSMS, any plan, even from a file, and your execution plan screen will suddenly change to look like Figure 6:
The Execution Plan Display
Figure 6
What you’ve got are different sets of functionality. Across the top are three buttons, offering you the ability to identify the highest cost operation, search operators, and get suggestions on plan improvements. Yeah, you heard me, it’ll help you tune your queries.
The first piece of functionality, identifying the highest cost operation, is hardly worth talking about on a plan that looks like the one above. But how about a plan that looks like Figure 7?
Complicated Execution Plan
Figure 7:
Yeah, that’s one of the hairiest plans I’ve ever seen and I sure would have used this to browse through that plan (I used XQuery to find the costliest operators, which worked, but it wasn’t pretty like this). You can adjust it too, so that it shows more or fewer operations based on their cost, by using the little slider bar right above it. The tooltip will show the name of the operator and the cost, and clicking on it will take you to that operator within the plan. That’s extremely handy.
But you can adjust the default values, and you can pick other properties from the Options window and have them bubble to the top. More interested in high numbers of Estimated Rows? Change the property and you’ll see those operations that exceed the value you supply on the bar on the left side of the screen. Again, very handy.
Next, you can search for operators. Want to know if you have a Sort operation in that mess of icons? Click the button and type in the term. You’ll see any and all Sort operations. But it gets better. What if you’re curious whether the primary key was used in the query? Type that in and you’ll see results like Figure 8:
Displaying the Primary Key
Figure 8.
I left the tooltip up so that you can see that SSMS Tools Pack searches inside the operators to find the things you were interested in.
Now, what about this tuning help? I’ll quote Mladen on this one:
It goes through your plan finding most common problems and gives you the usual solution for each one. This feature isn't meant to replace manual analysis but it will find the usual problematic spots that you can fix up fast.
The query I gave it is very simple and has a couple of issues, primarily structural. You can see that SQL Server thinks I need a different index. Let’s see what SSMS Tools Pack says I ought to do about it.
Spotting the Errors
Figure 9:
It called out two issues, the Merge Join, and the Clustered Index scan that you can see in Figure 9. On both it pointed out that my estimated number of rows and my actual number of rows differed by a significant amount. This is true. I use this query to demonstrate bad parameter sniffing. The suggested solutions were to update the statistics on all tables used in the query or to check for missing indexes. These are excellent suggestions. Neither will work in this case because I’ve got a skewed data set (like I said, this is my demo code and it’s broken on purpose), but nonetheless, based on the information provided, these are the same recommendations I’d make myself. I’m impressed.

And More

Yes, there’s more. I can’t cover everything without generating about 50 pages of documentation to go through all the functionality and all the control over that functionality that you get. Here’s a quick run-down on some of the stuff I didn’t cover:
SQL Snippets
Search Table/View/Database Data
Run Scripts on Multiple Servers
Search Results in Grid
Custom Scripts from Object Explorer
Mladen did a nice job of documenting all the functionality, including the parts I didn’t mention.

Conclusion

I like this piece of software. A lot. I use it regularly. I’ve had to turn off a couple of pieces of functionality because they interfered with some of the work being done by my Red Gate tools (there’s a code formatter that capitalizes key words, but I’m getting that from SQL Prompt, and the snippet manager steps on SQL Prompt too. Neither is a bad thing, just that SQL Prompt does more, so I use that instead). But those were very minor pieces. All the major pieces of functionality integrate extremely well with SSMS and with my Red Gate tools. In fact, I was doing a demo of SSMS and I started talking about some of the SSMS Tools Pack functionality like it was a part of SSMS and then caught myself. It’s that good.

Window Functions in SQL Server


Hi Folks, I’m going to start a new series of three articles about Window functions, starting off by explaining what they are, and how they are used to provide details of an aggregation.

Window functions

Window functions belong to a type of function known as a ‘set function’, which means a function that applies to a set of rows. The word ‘window’ is used to refer to the set of rows that the function works on.
Windowing functions were added to the SQL:2003 standard, which is managed by the ISO, and were specified in more detail in SQL:2008. For some time, other DBMSs such as Oracle, Sybase and DB2 have had support for window functions. Even the open source RDBMS PostgreSQL has a full implementation. SQL Server has had only a partial implementation up to now, but a fuller one is coming in SQL Server 2012.
One of the most important benefits of window functions is that we can access the detail of the rows from an aggregation. To see an example, let’s first suppose we have this table and data:
USE tempdb
GO
IF OBJECT_ID('TestAggregation') IS NOT NULL
  DROP TABLE TestAggregation
GO
CREATE TABLE TestAggregation (ID INT, Value Numeric(18,2))
GO
INSERT INTO TestAggregation (ID, Value) VALUES
      (1, 50.3), (1, 123.3), (1, 132.9),
      (2, 50.3), (2, 123.3), (2, 132.9), (2, 88.9),
      (3, 50.3), (3, 123.3);
GO
This is what the data looks like:
SELECT * FROM TestAggregation
the initial data
If we sum the column Value grouping by ID, by using a conventional GROUP BY, we would have the following query and result:
SELECT ID, SUM(Value)
  FROM TestAggregation
 GROUP BY ID;
The Query Result
Here we can see a sample of the partitions of rows, or sets that the SUM aggregation function is working on.
The aggregation function
In blue we have partition 1, in green partition 2 and in red partition 3. Because we applied the aggregation function to the column Value, grouping the results by ID, we lose the details of the data. In this case the details are the values of the columns of the rows in partitions 1, 2 and 3.
Let’s suppose I need to write a query to return the total value of sales, the average value of sales and the quantity of sales for each ID, and still return the actual values of the rows. We might think that we could use something like this to return that data:
SELECT ID,
       Value,
       SUM(Value) AS "Sum",
       AVG(Value) AS "Avg",
       COUNT(Value) AS "Quantity"
  FROM TestAggregation
 GROUP BY ID;
Msg 8120, Level 16, State 1, Line 2
Column 'TestAggregation.Value' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Unfortunately, this is against the way that aggregations work. If you group by something, then you lose access to the details.
A very commonly used alternative is to write every aggregation into a subquery, and then correlate with the main query using a join, something like:
SELECT TestAggregation.ID,
       TestAggregation.Value,
       TabSum."Sum",
       TabAvg."Avg",
       TabCount."Quantity"
  FROM TestAggregation
 INNER JOIN (SELECT ID, SUM(Value) AS "Sum"
               FROM TestAggregation
              GROUP BY ID) AS TabSum
    ON TabSum.ID = TestAggregation.ID
 INNER JOIN (SELECT ID, AVG(Value) AS "Avg"
               FROM TestAggregation
              GROUP BY ID) AS TabAvg
    ON TabAvg.ID = TestAggregation.ID
 INNER JOIN (SELECT ID, COUNT(Value) AS "Quantity"
               FROM TestAggregation
              GROUP BY ID) AS TabCount
    ON TabCount.ID = TestAggregation.ID
Data from this query
But a neater and faster solution is this…
SELECT TestAggregation.ID,
       TestAggregation.Value,
       AggregatedValues.[Sum],
       AggregatedValues.[Avg],
       AggregatedValues.Quantity
  FROM TestAggregation
 INNER JOIN (SELECT ID,
                    SUM(Value) AS "Sum",
                    AVG(Value) AS "Avg",
                    COUNT(Value) AS "Quantity"
               FROM TestAggregation
              GROUP BY ID) AggregatedValues
    ON AggregatedValues.ID = TestAggregation.ID
A very elegant alternative is to use the OVER() clause implemented in the aggregation functions; it allows us to access the details of the rows that have been aggregated.
For instance we could try to get the result we want with this …
SELECT ID,
       Value,
       SUM(Value) OVER() AS "Sum",
       AVG(Value) OVER() AS "Avg",
       COUNT(Value) OVER() AS "Quantity"
  FROM TestAggregation
…but, with this query, we didn’t return the expected results. In fact, we returned the aggregate values for the entire table rather than for each ID. Using the OVER() clause with the aggregation functions (SUM, AVG and COUNT) we have accessed the details of the grouped data along with a total aggregation of the data but, in fact, we want the aggregates for the data grouped by ID. To do this, we should use the PARTITION BY clause. For instance:
SELECT ID,
       Value,
       SUM(Value) OVER(PARTITION BY ID) AS "Sum",
       AVG(Value) OVER(PARTITION BY ID) AS "Avg",
       COUNT(Value) OVER(PARTITION BY ID) AS "Quantity"
  FROM TestAggregation
The same data, but from a neater query
Now we can see the same results as in the subquery alternative, but with much simpler and more elegant code.
In the following picture we can see that, even though the results are grouped by ID, we can still access the details of the aggregated partitions through the OVER and PARTITION BY clauses.

Set based thinking

It can get tedious to hear about this set-based thing, but in fact it’s not so easy to think in a set-based way. Sometimes it’s very hard to find a set-based solution to a query we have. Window functions give us more set-based solutions to awkward problems.
The main point of windowing functions is that they were created to work with a set. SQL Server has never been good at processing queries row by row; that’s why you’re always hearing that ‘cursors are evil’, that they’re ‘not good for performance’, that ‘you should avoid them’ and so on. SQL Server was not built to work row by row.
Let’s illustrate this. SQL Server takes an average of 50 seconds to run a loop one hundred million times, against 100 milliseconds for a Win32 app.
DECLARE @i INT = 0, @Time Time(3) = GETDATE()
WHILE @i < 100000000
BEGIN
  SET @i += 1;
END

SELECT CONVERT(Time(3), GETDATE() - @Time) AS "Total Time to run the Loop"
Total time to run the loop
Delphi Code:
PROCEDURE TForm1.Button1Click(Sender: TObject);
Var
  i : Integer;
  Tempo : TDateTime;
BEGIN
  i := 0;
  Tempo := Now();
  WHILE i < 100000000 do
  BEGIN
    inc(i);
  END;
END;
Time to run the Delphi code
Of course it is an unfair comparison. The compiler of a Win32 application is totally different from SQL Server; what I wanted to show here is the fact that SQL Server was not meant to run row by row.
I was once in London doing training with my colleague Itzik Ben-Gan when I remember him saying: “There is no row-by-row code for which you cannot write an equivalent set-based version; the problem is that you haven’t figured out how to do it yet”. Yes, it’s a strong statement but, who could tell Itzik that this is not true? Not I!

Window functions on SQL Server 2005 and 2008

Since SQL Server 2005 we have had support for some window functions: ROW_NUMBER, RANK, DENSE_RANK and NTILE.
In this first article we’ll review how these functions work, and how they can help us to write better and more efficient set-based code.

Test database

To test the functions we’ll use a table called Tab1. The code to create the table is the following:
USE TempDB
GO
IF OBJECT_ID('Tab1') IS NOT NULL
  DROP TABLE Tab1
GO
CREATE TABLE Tab1 (Col1 INT)
GO
INSERT INTO Tab1 VALUES (5), (5), (3), (1)
GO

Row_Number()

The ROW_NUMBER function is used to generate a sequence of numbers based on a set in a specific order; in other words, it returns the sequence number of each row inside a set, in the order that you specify.
For instance:
-- RowNumber
SELECT Col1,
       ROW_NUMBER() OVER(ORDER BY Col1 DESC) AS "ROW_NUMBER()"
  FROM Tab1
data from this query
The column called “ROW_NUMBER()” contains a sequence of numbers created in the order of Col1 descending. The OVER(ORDER BY Col1 DESC) clause is used to specify the order in which the sequence numbers should be created. It is necessary because rows in a relational table have no ‘natural’ order.

Rank() & Dense_Rank()

These functions return the position in a ranking for each row inside a partition. The ranking is calculated as 1 plus the number of rows that precede the row in question.
It’s important to mention that the RANK function returns the result with a gap after a tie, whereas the DENSE_RANK function doesn’t. To understand this better, let’s see some samples.
-- Rank
SELECT Col1,
       RANK() OVER(ORDER BY Col1 DESC) AS "RANK()"
  FROM Tab1
GO
-- Dense_Rank
SELECT Col1,
       DENSE_RANK() OVER(ORDER BY Col1 DESC) AS "DENSE_RANK"
  FROM Tab1
Results of a tie
Notice that in the RANK result, we have the values 1, 1, 3 and 4. The value Col1 = “5” is duplicated, so any ordering will produce a ‘tie’ for position between those rows. They have the same position in the rank but, when the ordinal position for the value 3 is calculated, this position isn’t 2, because position 2 was already used by the value 5. In this case a gap is generated and the function returns the next value for the rank, in this case the value 3.

NTILE()

The NTILE function is used for calculating summary statistics. It distributes the rows within an ordered partition into a specified number of “buckets” or groups. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs. It makes it easy to calculate n-tile distributions such as percentiles.
Let’s see a sample:
-- NTILE
SELECT Col1,
       NTILE(3) OVER(ORDER BY Col1 DESC) AS "NTILE(3)"
  FROM Tab1
NTILE Results
In the result above we can see that when 4 rows are divided into 3 groups, each group gets 1 row, and the remaining row is added to the initial group. Let’s see another sample, without remaining rows.
-- NTILE
SELECT Col1,
       NTILE(2) OVER(ORDER BY Col1 DESC) AS "NTILE(2)"
  FROM Tab1
It is good practice to order on a unique key, to ensure that there are more rows than buckets, and to have an equal number of rows in each bucket.

The Power of Window functions

Now let’s see some examples where window functions could be used to return some complex queries.

Example 1

Let’s suppose I need to write a query to return those employees that receive more than their department’s average.
Let’s start by creating two sample tables called Departamentos (departments in Portuguese) and Funcionarios (employees in Portuguese):
IF OBJECT_ID('Departamentos') IS NOT NULL
  DROP TABLE Departamentos
GO
CREATE TABLE Departamentos (ID       INT IDENTITY(1,1) PRIMARY KEY,
                            Nome_Dep VARCHAR(200))
GO
INSERT INTO Departamentos (Nome_Dep) VALUES ('Vendas'), ('TI'), ('Recursos Humanos')
GO
IF OBJECT_ID('Funcionarios') IS NOT NULL
  DROP TABLE Funcionarios
GO
CREATE TABLE Funcionarios (ID      INT IDENTITY(1,1) PRIMARY KEY,
                           ID_Dep  INT,
                           Nome    VARCHAR(200),
                           Salario Numeric(18,2))
GO
INSERT INTO Funcionarios (ID_Dep, Nome, Salario) VALUES
      (1, 'Fabiano', 2000), (1, 'Amorim', 2500), (1, 'Diego', 9000),
      (2, 'Felipe', 2000), (2, 'Ferreira', 2500), (2, 'Nogare', 11999),
      (3, 'Laerte', 5000), (3, 'Luciano', 23500), (3, 'Zavaschi', 13999)
GO
This is what the data looks like:
Data from this query
To write this query I could do something like this:
SELECT Departamentos.Nome_Dep,
       Funcionarios.Nome AS Funcionario,
       Funcionarios.Salario,
       AVG(Funcionarios.Salario) OVER(PARTITION BY Departamentos.Nome_Dep) "Média por Departamento",
       Salario - AVG(Funcionarios.Salario) OVER(PARTITION BY Departamentos.Nome_Dep) "Diferença de Salário"
  FROM Funcionarios
 INNER JOIN Departamentos
    ON Funcionarios.ID_Dep = Departamentos.ID
 ORDER BY 5 DESC
The departmental averages
As we can see, the employees Luciano, Nogare and Diego are receiving much more than the average of their department.
Using the OVER clause, I was able to partition the data per each department, and then I could access the average with the details of the salary.
You can avoid using window functions by using a query like this.
SELECT Departamentos.Nome_Dep,
       Funcionarios.Nome AS Funcionario,
       Funcionarios.Salario,
       [Média por Departamento],
       Salario - [Média por Departamento] AS [Diferença de Salário]
  FROM Funcionarios
 INNER JOIN Departamentos
    ON Funcionarios.ID_Dep = Departamentos.ID
 INNER JOIN (SELECT ID_Dep, AVG(Funcionarios.Salario) AS [Média por Departamento]
               FROM Funcionarios
              GROUP BY ID_Dep) [Média]
    ON [Média].ID_Dep = Funcionarios.ID_Dep
 ORDER BY [Diferença de Salário] DESC

Example 2

Another very common problem is “running totals”. To show this I’ll use some very simple data. Let’s start by creating a table called tblLancamentos, which has a column with a date and a column with a numeric value.
IF OBJECT_ID('tblLancamentos') IS NOT NULL
  DROP TABLE tblLancamentos
GO
-- Table of entries to illustrate the running total
CREATE TABLE tblLancamentos (DataLancamento  Date,
                             ValorLancamento FLOAT)
GO
-- Insert the records
INSERT INTO tblLancamentos VALUES ('20080623', 100)
INSERT INTO tblLancamentos VALUES ('20080624', -250)
INSERT INTO tblLancamentos VALUES ('20080625', 380)
INSERT INTO tblLancamentos VALUES ('20080626', 200)
INSERT INTO tblLancamentos VALUES ('20080627', -300)
GO
One of the alternatives to return the running total of the column ValorLancamento ordered by DataLancamento is to write a query like the following:
SELECT DataLancamento,
       ValorLancamento,
       (SELECT SUM(ValorLancamento)
          FROM tblLancamentos
         WHERE DataLancamento <= QE.DataLancamento) AS Saldo
  FROM tblLancamentos AS QE
In the query above we use a correlated subquery against the same table to return the running total. Another easier, quicker and more elegant alternative would be:
SELECT DataLancamento,
       ValorLancamento,
       SUM(ValorLancamento) OVER(ORDER BY DataLancamento) AS Saldo
  FROM tblLancamentos AS QE
The same data from a simpler query
In the query above I’m using the OVER() clause with an ORDER BY. Unfortunately, this is only possible with the next version of SQL Server, SQL Server 2012.

Conclusion

In the next article I’ll show in more detail the limitations of the window functions in SQL Server 2008 R2, and compare the performance of a running aggregation on SQL Server 2012 (Denali).
In the third and final article I’ll show in detail how the window frame works. I’ll also show how the windowing functions LEAD, LAG, FIRST_VALUE, LAST_VALUE, PERCENT_RANK and CUME_DIST work in SQL Server.
That’s all folks, see you soon with the second part of this article.