Introduction

Each time a SQL Server execution thread has to wait before starting or continuing its work, SQL Server records the wait in the sys.dm_os_wait_stats DMV. It tracks both the source and the duration of each wait, and the values accumulate over time. A thread may have to wait, for example, for a resource such as the CPU, for data to be read from disk into memory, or to acquire a lock. By establishing a baseline for these wait statistics under normal workload, we can understand our typical wait patterns. By tracking the data over time, we can look for sudden changes in wait behavior, and so help determine the root cause of any issue we identify.

In this article, I’ll discuss approaches and techniques for capturing, reviewing, and managing wait statistics data, so that you can spot worrying trends, such as an upward drift in the prominence of a certain wait type, and investigate the cause quickly and efficiently.

To work through this article, you’ll need SQL Server 2005 or higher and a DistributionData database in which to store baseline data.

Short Background on Wait Statistics

I’m not going to attempt anything more than a brief introduction to wait statistics here, as I want to stay focused on the subject of capturing baselines. However, see the References at the end of the article for a collection of articles and posts that give a deeper background.

As mentioned in the introduction, SQL Server knows when execution threads have had to wait for a resource, and for how long. This data is exposed in the sys.dm_os_wait_stats DMV.

Waits and Queues

The values stored in sys.dm_os_wait_stats are running totals, accumulated across all sessions since the last server restart or the last manual reset of the statistics using the DBCC SQLPERF command (more on this shortly).

It is important to understand that every SQL Server instance will have waits, no matter how well you tune and optimize it. So, if this data accumulates over a long period and we then query the DMV for, say, the “top 10 waits for a SQL Server instance,” it will still be very hard to know whether any of these waits represent a potential problem or are simply “normal” for that instance. Our goal is to understand the usual waits for an instance. Once we know what’s normal, we can focus our tuning efforts, and we have a reference if performance degrades suddenly.
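As a rough illustration, a “top 10 waits” query of the kind described above might look like the following sketch. It reads only columns that sys.dm_os_wait_stats exposes; ordering by total wait time and the alias names are assumptions made for this example, and benign background waits are not filtered out.

```sql
-- Sketch: the ten wait types with the highest cumulative wait time
-- since the last restart or manual clear.
SELECT TOP ( 10 )
        [wait_type] ,
        [waiting_tasks_count] ,
        [wait_time_ms] / 1000.0 AS [Wait_S] ,         -- total wait time, in seconds
        [signal_wait_time_ms] / 1000.0 AS [Signal_S]  -- signal wait portion, in seconds
FROM    [sys].[dm_os_wait_stats]
ORDER BY [wait_time_ms] DESC;
```

On its own, this output says little about whether a given wait is a problem, which is why the rest of the article focuses on comparing it against a baseline.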

How to Clear Wait Stats

By default, SQL Server clears the cumulative wait statistics for an instance from the sys.dm_os_wait_stats DMV when the instance restarts. In addition, a DBA can clear the statistics manually using DBCC SQLPERF (http://msdn.microsoft.com/en-us/library/ms189768.aspx).
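For reference, the manual clear is a single statement. Note that it resets the counters for the whole instance, so it is worth agreeing on when it is run before using it.

```sql
-- Resets the cumulative values in sys.dm_os_wait_stats for the entire instance.
DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);
```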

It is not necessary to clear wait statistics regularly. However, to analyze this data meaningfully, both to understand what “normal” behavior is and to spot anomalies quickly when comparing data from a previous period with current data, it is important that DBAs adopt a clear policy on the timing and frequency of clearing wait statistics.

If one DBA clears wait statistics at 5 AM every day, and another DBA captures the data at 6 AM, the data represents only the waits accumulated during one hour’s workload, which may not represent the normal workload.

Ideally, we need to collect wait statistics for a period that represents normal activity, without clearing them. At the same time, however, DBAs will want to see how significant changes, such as the addition of new indexes or a change to a configuration setting, affect the pattern of waits for the instance. Clearing the wait statistics immediately after making the change can help us understand its impact.

There are other reasons that may prompt a DBA to clear wait statistics. For example, some organizations use a single third-party utility for all backups, regardless of the application, which results in extended completion times for SQL Server database backups. While there are alternatives that can perform fast SQL Server backups (e.g., native SQL backups, dedicated third-party SQL backup applications), the DBA can’t use them. In such cases, the DBA knows backup performance to be poor but can’t make changes, and instead may choose to clear wait statistics after every backup job finishes, to prevent waits from the backup itself from skewing the interpretation of the wait statistics as a whole. Alternatively, the DBA can filter BACKUP* waits from the output.
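A minimal sketch of the filtering alternative follows. It assumes that a simple prefix match on the wait type name is enough to exclude the backup-related waits; adjust the pattern if your environment needs something more specific.

```sql
-- Sketch: review cumulative waits while ignoring backup-related wait types.
SELECT  [wait_type] ,
        [waiting_tasks_count] ,
        [wait_time_ms]
FROM    [sys].[dm_os_wait_stats]
WHERE   [wait_type] NOT LIKE N'BACKUP%'   -- excludes every wait type whose name starts with BACKUP
ORDER BY [wait_time_ms] DESC;
```

Filtering leaves the underlying statistics intact, so other DBAs who capture the data are not affected by it.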

The queries in Listing 1 reveal when wait statistics were last cleared by an instance restart and also if, and when, someone last cleared them manually. Simply compare the two values to see whether wait statistics have been cleared manually since the last restart.

/*
   estimate when wait stats were last cleared, using the accumulated wait
   time of the SQLTRACE_INCREMENTAL_FLUSH_SLEEP background wait as an
   approximation of the time elapsed since the clear
*/
SELECT  [wait_type] ,
        [wait_time_ms] ,
        DATEADD(SS, -[wait_time_ms] / 1000, GETDATE()) AS "Date/TimeCleared" ,
        CASE WHEN [wait_time_ms] < 1000
             THEN CAST([wait_time_ms] AS VARCHAR(15)) + ' ms'
             WHEN [wait_time_ms] BETWEEN 1000 AND 60000
             THEN CAST(( [wait_time_ms] / 1000 ) AS VARCHAR(15)) + ' seconds'
             WHEN [wait_time_ms] BETWEEN 60001 AND 3600000
             THEN CAST(( [wait_time_ms] / 60000 ) AS VARCHAR(15)) + ' minutes'
             WHEN [wait_time_ms] BETWEEN 3600001 AND 86400000
             THEN CAST(( [wait_time_ms] / 3600000 ) AS VARCHAR(15)) + ' hours'
             WHEN [wait_time_ms] > 86400000
             THEN CAST(( [wait_time_ms] / 86400000 ) AS VARCHAR(15)) + ' days'
        END AS "TimeSinceCleared"
FROM    [sys].[dm_os_wait_stats]
WHERE   [wait_type] = 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP';

/*
   check SQL Server start time - 2008 and higher
*/
SELECT  [sqlserver_start_time]
FROM    [sys].[dm_os_sys_info];

/*
   check SQL Server start time - 2005 and higher   
*/
SELECT  [create_date]
FROM    [sys].[databases]
WHERE   [database_id] = 2;
Listing 1: When Were Wait Stats Last Cleared, Either Manually or by a Restart?
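The comparison of the two values can also be done in a single batch. The sketch below is one possible way to do it on SQL Server 2008 and higher. The variable names, the column aliases, and the 10-minute tolerance are assumptions made for this example, and the result is only an estimate, since the clear time itself is approximated.

```sql
DECLARE @StartTime DATETIME ,
        @ClearedTime DATETIME;

-- Instance start time (SQL Server 2008 and higher)
SELECT  @StartTime = [sqlserver_start_time]
FROM    [sys].[dm_os_sys_info];

-- Estimated time at which the wait stats were last cleared
SELECT  @ClearedTime = DATEADD(SS, -[wait_time_ms] / 1000, GETDATE())
FROM    [sys].[dm_os_wait_stats]
WHERE   [wait_type] = 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP';

-- If the estimated clear time is well after the start time, a manual clear is likely.
-- The 10-minute tolerance is an arbitrary value chosen for this sketch.
SELECT  @StartTime AS [InstanceStartTime] ,
        @ClearedTime AS [EstimatedClearTime] ,
        CASE WHEN DATEDIFF(MI, @StartTime, @ClearedTime) > 10
             THEN 'Probably cleared manually since the last restart'
             ELSE 'Probably not cleared since the last restart'
        END AS [Assessment];
```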

Ultimately, it is at the DBAs’ discretion to decide when to clear wait statistics. If in doubt, collect your wait statistics (see the next section) for a period that will capture a representative workload (for example, one month). The following month, collect the statistics on a more regular schedule, such as every Sunday, after scheduled code changes, and then immediately clear the sys.dm_os_wait_stats DMV. Compare each of the four one-week data sets to the one-month set: do different wait patterns exist (for example, perhaps the last week of the month, when various business reports run, has different waits), or are they consistent across all five sets? If you see differences, then you may want to consider clearing the statistics on a regular (e.g., weekly) basis.

Reviewing the collected wait stats is discussed in more detail in the Reviewing Wait Statistics Data section.

Collecting Wait Statistics for Analysis

To collect wait statistics on a regular schedule, the first step is to create a table to hold the information, as shown in Listing 2 (as described previously, this script assumes the DistributionData database exists).

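USE [DistributionData];
GO

-- Create the table only if it does not already exist, so the script can be re-run
IF NOT EXISTS ( SELECT  *
                FROM    [sys].[tables]
                WHERE   [name] = N'WaitStats'
                        AND [type] = N'U' ) 
    CREATE TABLE [dbo].[WaitStats]
        (
          [RowNum] [BIGINT] IDENTITY(1, 1) ,
          [CaptureDate] [DATETIME] ,
          [WaitType] [NVARCHAR](120) ,
          [Wait_S] [DECIMAL](14, 2) ,
          [Resource_S] [DECIMAL](14, 2) ,
          [Signal_S] [DECIMAL](14, 2) ,
          [WaitCount] [BIGINT] ,
          [Percentage] [DECIMAL](4, 2) ,
          -- Averages keep four decimal places, matching the precision computed in Listing 3
          [AvgWait_S] [DECIMAL](14, 4) ,
          [AvgRes_S] [DECIMAL](14, 4) ,
          [AvgSig_S] [DECIMAL](14, 4)
        );
GO

-- Likewise, create the clustered index only if it is missing
IF NOT EXISTS ( SELECT  *
                FROM    [sys].[indexes]
                WHERE   [name] = N'CI_WaitStats'
                        AND [object_id] = OBJECT_ID(N'dbo.WaitStats') )
    CREATE CLUSTERED INDEX CI_WaitStats ON [dbo].[WaitStats] ([RowNum], [CaptureDate]);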

Listing 2: Creating the dbo.WaitStats table

The second step is simply to schedule a query to run on a regular basis, which captures the wait data to this table. Listing 3 is derived from a query presented in Paul Randal’s wait statistics post. This query uses a CTE to capture the raw wait statistics data, and then manipulates the output to include averages: for example, average wait (AvgWait_S) and average signal wait (AvgSig_S). I included an additional, optional INSERT, as it separates each set of data collected when reviewing the output.

USE [DistributionData];
GO

INSERT  INTO dbo.WaitStats
        ( [WaitType]
         )
VALUES  ( 'Wait Statistics for ' + CAST(GETDATE() AS NVARCHAR(19))
         );

INSERT  INTO dbo.WaitStats
        ( [CaptureDate] ,
          [WaitType] ,
          [Wait_S] ,
          [Resource_S] ,
          [Signal_S] ,
          [WaitCount] ,
          [Percentage] ,
          [AvgWait_S] ,
          [AvgRes_S] ,
          [AvgSig_S] 
         )
        EXEC
            ( '
      WITH [Waits] AS
         (SELECT
            [wait_type],
            [wait_time_ms] / 1000.0 AS [WaitS],
            ([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS],
            [signal_wait_time_ms] / 1000.0 AS [SignalS],
            [waiting_tasks_count] AS [WaitCount],
            100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER() AS [Percentage],
            ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]
         FROM sys.dm_os_wait_stats
         WHERE [wait_type] NOT IN (
            N''CLR_SEMAPHORE'',   N''LAZYWRITER_SLEEP'',
            N''RESOURCE_QUEUE'',  N''SQLTRACE_BUFFER_FLUSH'',
            N''SLEEP_TASK'',      N''SLEEP_SYSTEMTASK'',
            N''WAITFOR'',         N''HADR_FILESTREAM_IOMGR_IOCOMPLETION'',
            N''CHECKPOINT_QUEUE'', N''REQUEST_FOR_DEADLOCK_SEARCH'',
            N''XE_TIMER_EVENT'',   N''XE_DISPATCHER_JOIN'',
            N''LOGMGR_QUEUE'',     N''FT_IFTS_SCHEDULER_IDLE_WAIT'',
            N''BROKER_TASK_STOP'', N''CLR_MANUAL_EVENT'',
            N''CLR_AUTO_EVENT'',   N''DISPATCHER_QUEUE_SEMAPHORE'',
            N''TRACEWRITE'',       N''XE_DISPATCHER_WAIT'',
            N''BROKER_TO_FLUSH'',  N''BROKER_EVENTHANDLER'',
            N''FT_IFTSHC_MUTEX'',  N''SQLTRACE_INCREMENTAL_FLUSH_SLEEP'',
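            N''DIRTY_PAGE_POLL'')
            -- Skip wait types that never occurred, to avoid dividing by zero below
            AND [waiting_tasks_count] > 0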
         )
      SELECT
         GETDATE(),
         [W1].[wait_type] AS [WaitType], 
         CAST ([W1].[WaitS] AS DECIMAL(14, 2)) AS [Wait_S],
         CAST ([W1].[ResourceS] AS DECIMAL(14, 2)) AS [Resource_S],
         CAST ([W1].[SignalS] AS DECIMAL(14, 2)) AS [Signal_S],
         [W1].[WaitCount] AS [WaitCount],
         CAST ([W1].[Percentage] AS DECIMAL(4, 2)) AS [Percentage],
         CAST (([W1].[WaitS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) AS [AvgWait_S],
         CAST (([W1].[ResourceS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) AS [AvgRes_S],
         CAST (([W1].[SignalS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) AS [AvgSig_S]
      FROM [Waits] AS [W1]
      INNER JOIN [Waits] AS [W2]
         ON [W2].[RowNum] <= [W1].[RowNum]
      GROUP BY [W1].[RowNum], [W1].[wait_type], [W1].[WaitS], 
         [W1].[ResourceS], [W1].[SignalS], [W1].[WaitCount], [W1].[Percentage]
      HAVING SUM ([W2].[Percentage]) - [W1].[Percentage] < 95;'
            );
GO

Listing 3: Capturing Wait Stats Data for Analysis

We should capture wait statistics regularly, at least once a week or once a month. We could do so more frequently, perhaps daily, but remember that, unless we are clearing the data regularly, they represent an accumulation of waits since the last restart. The longer the waits have been accumulating, the harder it may be to spot smaller changes in wait percentages. For example, suppose the system suffers a short period (one hour) of poor performance, during which the number and duration of a certain wait type increase significantly. This “spike” may be hard to spot if you’re analyzing waits accumulated over a long period (e.g., a month), since the spike might not significantly affect the overall wait percentage for that period.
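One way to make such spikes easier to see, when the statistics are not cleared between captures, is to look at the difference between consecutive captures rather than at the cumulative totals. The query below is a sketch of that idea against the dbo.WaitStats table from Listing 2. The CTE name, the aliases, and the Delta column names are assumptions made for this example. It compares only wait types that appear in both captures, and it skips pairs where the total went down, which would suggest a clear or restart in between.

```sql
USE [DistributionData];
GO

WITH [Captures] AS
    ( SELECT  [CaptureDate] ,
              [WaitType] ,
              [Wait_S] ,
              [WaitCount] ,
              -- Number the captures in chronological order (1, 2, 3, ...)
              DENSE_RANK() OVER ( ORDER BY [CaptureDate] ) AS [CaptureSeq]
      FROM    [dbo].[WaitStats]
      WHERE   [CaptureDate] IS NOT NULL   -- ignore the separator rows
    )
SELECT  [cur].[CaptureDate] ,
        [cur].[WaitType] ,
        [cur].[Wait_S] - [prev].[Wait_S] AS [DeltaWait_S] ,
        [cur].[WaitCount] - [prev].[WaitCount] AS [DeltaWaitCount]
FROM    [Captures] AS [cur]
        JOIN [Captures] AS [prev]
            ON [prev].[CaptureSeq] = [cur].[CaptureSeq] - 1
           AND [prev].[WaitType] = [cur].[WaitType]
WHERE   [cur].[Wait_S] >= [prev].[Wait_S]   -- a lower total suggests a clear or restart
ORDER BY [cur].[CaptureDate] ,
        [DeltaWait_S] DESC;
```

A self-join is used here instead of LAG so that the sketch also runs on versions earlier than SQL Server 2012.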

Reviewing Wait Statistics Data

You’ll need to review the wait stats for each SQL Server instance regularly. If you capture them once a week, then check the trending once a week. The simple SELECT in Listing 4 retrieves from the dbo.WaitStats table all the data captured over the previous 30 days.

SELECT  *
FROM    [dbo].[WaitStats]
WHERE   [CaptureDate] > GETDATE() - 30
ORDER BY [RowNum];

Listing 4: Reviewing the Last 30 Days of Data

If you need to view older data, adjust the number of days as necessary, or remove the predicate. In some cases, it may be ideal to look at only the top wait for each set of data collected, as shown in Listing 5 (again, change the number of days as needed).

SELECT  [w].[CaptureDate] ,
        [w].[WaitType] ,
        [w].[Percentage] ,
        [w].[Wait_S] ,
        [w].[WaitCount] ,
        [w].[AvgWait_S]
FROM    [dbo].[WaitStats] w
        JOIN ( SELECT   MIN([RowNum]) AS [RowNumber] ,
                        [CaptureDate]
               FROM     [dbo].[WaitStats]
               WHERE    [CaptureDate] IS NOT NULL
                        AND [CaptureDate] > GETDATE() - 30
               GROUP BY [CaptureDate]
             ) m ON [w].[RowNum] = [m].[RowNumber]
ORDER BY [w].[CaptureDate];

Listing 5: Reviewing the Top Wait for Each Collected Data Set
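Listing 5 identifies the top wait by taking the lowest RowNum in each capture, which relies on the rows having been inserted in descending order of wait time. As an alternative sketch that does not depend on insertion order, the query below ranks the rows within each capture explicitly. The CTE name and the WaitRank column are assumptions made for this example.

```sql
USE [DistributionData];
GO

WITH [Ranked] AS
    ( SELECT  [CaptureDate] ,
              [WaitType] ,
              [Percentage] ,
              [Wait_S] ,
              [WaitCount] ,
              [AvgWait_S] ,
              -- Rank the waits within each capture by total wait time, highest first
              ROW_NUMBER() OVER ( PARTITION BY [CaptureDate]
                                  ORDER BY [Wait_S] DESC ) AS [WaitRank]
      FROM    [dbo].[WaitStats]
      WHERE   [CaptureDate] > GETDATE() - 30   -- also excludes rows with a NULL CaptureDate
    )
SELECT  [CaptureDate] ,
        [WaitType] ,
        [Percentage] ,
        [Wait_S] ,
        [WaitCount] ,
        [AvgWait_S]
FROM    [Ranked]
WHERE   [WaitRank] = 1
ORDER BY [CaptureDate];
```

Changing the final predicate to, say, WaitRank <= 3 would return the top three waits per capture instead.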

There are many ways to examine this data, but your focus, initially, should be to understand the top waits in your system and make sure they’re consistent over time. Expect to tweak your capture and monitoring process in the first few weeks of implementation. As discussed earlier, it’s up to the DBAs to decide on a clear, consistent policy on how often to collect and analyze this data, and when to clear the sys.dm_os_wait_stats DMV. Here, by way of a starting point, I offer three possible options for clearing, capturing, and reviewing this data, as proposed by Erin Stellato:

Option 1:

  • Never clear wait statistics
  • Capture weekly (at the end of any business day)
  • Review weekly

Option 2:

  • Clear wait statistics on Sunday nights (or after a full weekly backup)
  • Capture daily at the end of the business day
  • Review daily, checking to see if the percentages for wait types vary throughout the week

Option 3:

  • Clear wait statistics nightly (after full or differential backups complete)
  • Capture daily, at the end of the business day (optional: capture after any evening or overnight processing)
  • Review daily, checking to see how the waits and their percentages vary throughout the week (and throughout the day if capturing more than once a day)

The critical point to remember is that you are capturing this data to establish a baseline and to understand “normal” wait patterns on your systems. However, it’s common, as you’re reviewing this data, to identify existing or potential bottlenecks. This is a good thing. Having this data allows you to investigate an unexpected or high wait type, and determine the possible source of the bottleneck that caused the wait, before it becomes a production problem.

Managing Historical Data

As with all baseline data, it will cease to be relevant after a certain point, and you can remove it from the DistributionData database. The query in Listing 6 removes data older than 90 days, but you can adjust this value as appropriate for your environment. The overall size of the dbo.WaitStats table will depend on the choices you make about how often to capture the data and how long to keep it.

DELETE  FROM [dbo].[WaitStats]
WHERE   [CaptureDate] < GETDATE() - 90;

Listing 6: Purging Data Older Than 90 Days

Conclusion

Wait statistics are one of the best places for a database administrator to start when tuning a SQL Server environment or troubleshooting a performance issue. While wait statistics alone won’t solve a problem, they are a key piece of information that will point you in the right direction, especially when you have baseline values that you can reference. The queries provided in this article should serve as a good starting point for any DBA to capture and review wait statistics. Finally, don’t forget to include the dbo.WaitStats table in your optimization tasks, and add indexes as needed to support new queries and reports.

This article has covered capturing wait statistics baselines on SQL Server instances. Success with baselines depends on having systems in place both to collect the required data and to review it quickly. Start with the data that is most vital to the management of your solutions, and then build on that foundation over time. Remember that the effort you put in initially to collect this data will save time in the end. It may also mean the difference between spending all night at the server, trying to troubleshoot a performance issue, and spending an hour or two analyzing the baseline data to pinpoint and fix the problem before returning to your own bed for a good night’s rest. Make the time, and make your job easier.

References