by
Greg Larsen
I’ve been on a performance kick lately. I’ve been sharing with you how to use SQL Server 2005 DMVs and DMFs to identify the resource intensive queries for SQL Server 2005. Well now I’m going to switch gears and look at SQL Server 2008 and discuss one of the new concepts known as Management Data Warehouse (MDW) and the processes around managing data in the MDW. In this article I’ll give you an overview of what the MDW is, how to set it up using the wizard, and will discuss the different system Data Collections that are installed with the wizard.
What is the Management Data Warehouse?
The MDW is just a single database inside of a SQL Server 2008 instance, which holds various performance related statistics. The performance statistics in the MDW are gathered via special data gathering routines, known as Data Collections. The MDW can contain Data Collection information from a single instance, or can hold data collected from multiple instances. The Data Collection process uses SSIS and SQL Agent jobs to gather, schedule and purge information in the MDW.
SQL Server 2008 provides three different system Data Collection definitions. These Data Collections are Disk Usage, Query Activity, and Server Activity. Each of these Data Collection definitions identifies the data to be collected; how it often it should be collected and how long it should be stored in the MDW. Information for these system Data Collections can be found in the snapshots schema within the MDW. Microsoft has also provided standard report to allow you to drill down into data collected for each of these collections using SSMS. I will not be covering the standard reports in this article.
If you want to collect your own performance data periodically you are allowed to define your own data collection definitions to feed the MDW database. User defined Data Collection information will be stored in the custom_snapshots schema. For your own data collections you will need to build your own reporting process. You can do that with an application, or building custom reports that can be run directly from SQL Server Management Studio (SSMS) that references your data collected in the MDW.
What is a Data Collection?
A data collection is a container that allows you to specify what data to collect, when to collect it, and how long to keep it. It does this by using Data Collector. Everyone should be familiar with the concept of a Data Collector. It is someone or something that goes around and collects items from a source, using a specific method and then depositing the collected items in target location somewhere. The tax man is a collector. He takes taxes from you, via a tax return and then deposits them into the tax reserves. Well a SQL Server 2008 Data Collection does the same thing. It collects data from a source location (a SQL Server instance), using a Data Collector and deposits it in a target location, the MDW.
A given Data Collector collects a specific type of data. As of the February CTP of SQL Server 2008 there are only four different collector types. They are: T-SQL Query, SQL Trace, Performance Counters, and Query Activity. Each data collection is defined against the source where you want to collect the data, and then associated with a collector type to identify the type of data that will be collected.
The data can be collected continuously, on a predetermined schedule, or manually. A data collection also identifies how long the collected data will be retained in the MDW. The Data Collection process handles the loading, and purging of data in the MDW. When data is collected continuously SQL Server caches the information. By caching the information SQL Server 2008 optimize the performance impact of collecting data continuously.
All data collected by a Data Collection container is stored in the MDW that is identified for the specific instance of SQL Server where the Data Collection is defined. The MDW database for an instance might be stored physically within the instance, or store on another instance. Where it is stores the collected data depends on how the MDW is set up for a particular instance. When I discuss how to use the wizard to setup the MDW for an instance you should get a better understanding of how to identify a location for the MDW for a given SQL Server instance.
The Data Collection process requires SSIS to transform and store the data collection information and SQL Server Agent schedule to execute the data gathering process. Therefore if you plan to use the MDW then you will need to install SSIS and make sure both SSIS and SQL Server Agent services are running.
Setting Up the Management Data Warehouse
To set up the MDW you need to run through a wizard. The wizard is started by clicking on “Configure Management Data Warehouse” item, which can be found by right clicking on the “Data Collection” item under the “Management” folder in the “Object Explorer” pane within SSMS, as the screen shot below shows:
When the wizard is started a “Welcome to the Configuration Management Warehouse Wizard” window will be displayed. The welcome screen explains a little about the wizard, and the rights needed to create and manage the MDW. The kinds of rights needed by the wizard will depend on whether you are setting up a new MDW database, or just setting up a new SQL Server instance to populate an existing MDW. If you are setting up a new MDW database then you need to make sure the login you are running the wizard under has rights to create a database. If you are only using the wizard to set up a new SQL Server instance to use an existing MDW then the login only needs the rights to define users in the MDW database and associate the database users with appropriate roles (more on this later). When you click on the “Next>” button on the welcome screen the following window will be displayed:
On this screen you will identify the server and database where the MDW will live, or already lives if you are just configuring a new SQL Server instance to use and existing MDW database. The MDW database could live on the SQL Server instance being configured, or it could live on another instance on the same server, or a complete different SQL Server 2008 machine. If you have many SQL Server machines and/or instances you might want to consider using the same MDW database for all instances. This way you all your data collection information will be stored in a single MDW database. This is especially useful if you want to write your own reporting mechanism. By doing this you will be able simply the reporting process by housing all your Data Collection data in a single database.