Big Table Advice (SQL Server)

I'm experiencing massive slowness when accessing one of my tables and I need some refactoring advice. Sorry if this is not the correct area for this sort of thing.

I'm working on a project that reports on performance statistics for our internal servers. I process Windows performance logs every night (12 servers, 10 performance counters, logging every 15 seconds). I'm storing the data in a table as follows:

CREATE TABLE [dbo].[log](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [logfile_id] [int] NOT NULL,
    [test_id] [int] NOT NULL,
    [timestamp] [datetime] NOT NULL,
    [value] [float] NOT NULL,
CONSTRAINT [PK_log] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH FILLFACTOR = 90 ON [PRIMARY]
) ON [PRIMARY]

There are currently 16,529,131 rows, and the table will keep on growing.

I access the data to produce reports and create graphs from ColdFusion like so:

SET NOCOUNT ON

CREATE TABLE ##RowNumber ( RowNumber int IDENTITY (1, 1), log_id char(9) )

INSERT ##RowNumber (log_id)
SELECT l.id
FROM log l, logfile lf
WHERE lf.server_id = #arguments.server_id#
and l.test_id = #arguments.test_id#
and l.timestamp >= #arguments.report_from#
and l.timestamp < #arguments.report_to#
and l.logfile_id = lf.id
order by l.timestamp asc

select rn.RowNumber, l.value, l.timestamp
from log l, logfile lf, ##RowNumber rn
where lf.server_id = #arguments.server_id#
and l.test_id = #arguments.test_id#
and l.logfile_id = lf.id
and rn.log_id = l.id
and ((rn.rownumber % #modu# = 0) or (rn.rownumber = 1)) 
order by l.timestamp asc

DROP TABLE ##RowNumber

SET NOCOUNT OFF

(for non-ColdFusion devs: #value# interpolates the variable, and ## escapes to a literal #)

I basically create a temporary table so that I can use the row number to select every x-th row. That way I only select the number of rows I can actually display. This helps, but it's still very slow.
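If the server is SQL Server 2005 or later, the same every-Nth-row sampling can be done with ROW_NUMBER() and no temp table at all. A minimal sketch, assuming local variables in place of the ColdFusion arguments:

```sql
-- Sketch, assuming SQL Server 2005+; the local variables stand in for the
-- ColdFusion arguments (#arguments.server_id# etc.) and for #modu#.
;WITH numbered AS (
    SELECT  l.value,
            l.[timestamp],
            ROW_NUMBER() OVER (ORDER BY l.[timestamp] ASC) AS RowNumber
    FROM    log l
    JOIN    logfile lf ON l.logfile_id = lf.id
    WHERE   lf.server_id  = @server_id
      AND   l.test_id     = @test_id
      AND   l.[timestamp] >= @report_from
      AND   l.[timestamp] <  @report_to
)
-- Keep the first row plus every @modu-th row, exactly like the temp-table version.
SELECT  RowNumber, value, [timestamp]
FROM    numbered
WHERE   RowNumber % @modu = 0
   OR   RowNumber = 1
ORDER BY [timestamp] ASC;
```

This does one pass over the data instead of two and avoids writing millions of rows into tempdb.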

SQL Server Management Studio tells me my indexes are as follows (I have pretty much no knowledge about using indexes properly):

IX_logfile_id (Non-Unique, Non-Clustered)
IX_test_id (Non-Unique, Non-Clustered)
IX_timestamp (Non-Unique, Non-Clustered)
PK_log (Clustered)

I would be very grateful to anyone who could give some advice to help me speed things up a bit. I don't mind reorganising things, and I have complete control of the project (though perhaps not of the server hardware).

Cheers (sorry for the long post)

Best Answer

Your problem is that you chose a bad clustered key. Nobody is ever interested in retrieving one particular log value by ID. If your system is like anything else I've seen, then all queries are going to ask for:

  • all counters for all servers over a range of dates
  • specific counter values over all servers for a range of dates
  • all counters for one server over a range of dates
  • specific counter for specific server over a range of dates

Given the size of the table, all your non-clustered indexes are useless. They are all going to hit the index tipping point, guaranteed, so they might just as well not exist. I assume each non-clustered index is defined as a simple index over the field in its name, with no included columns.

I'm going to pretend I actually know your requirements. You must forget common sense about storage and actually duplicate all your data in every non-clustered index. Here is my advice:

  • Drop the clustered index on [id]; it is as useless as it gets.
  • Organize the table with a clustered index on (logfile_id, test_id, timestamp).
  • Non-clustered index on (test_id, logfile_id, timestamp) INCLUDE (value)
  • Non-clustered index on (logfile_id, timestamp) INCLUDE (value)
  • Non-clustered index on (test_id, timestamp) INCLUDE (value)
  • Non-clustered index on (timestamp) INCLUDE (value)
  • Add maintenance tasks to reorganize all indexes periodically, as they are prone to fragmentation.

The clustered index covers the query "history of a specific counter value on a specific machine". The non-clustered indexes cover various other possible queries (all counters on one machine over time, a specific counter across all machines over time, etc.).
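As a rough sketch, the layout above might look like the following DDL. The index names are illustrative, INCLUDE columns require SQL Server 2005 or later, and the clustered primary key assumes (logfile_id, test_id, timestamp) is unique; if two samples can share a timestamp, create a non-unique clustered index instead:

```sql
-- Illustrative sketch only: clustered on the query dimensions,
-- with covering non-clustered indexes that INCLUDE the value column.
CREATE TABLE dbo.log (
    logfile_id  int      NOT NULL,
    test_id     int      NOT NULL,
    [timestamp] datetime NOT NULL,
    value       float    NOT NULL,
    CONSTRAINT PK_log PRIMARY KEY CLUSTERED (logfile_id, test_id, [timestamp])
);

CREATE NONCLUSTERED INDEX IX_test_logfile_time
    ON dbo.log (test_id, logfile_id, [timestamp]) INCLUDE (value);
CREATE NONCLUSTERED INDEX IX_logfile_time
    ON dbo.log (logfile_id, [timestamp]) INCLUDE (value);
CREATE NONCLUSTERED INDEX IX_test_time
    ON dbo.log (test_id, [timestamp]) INCLUDE (value);
CREATE NONCLUSTERED INDEX IX_time
    ON dbo.log ([timestamp]) INCLUDE (value);
```

Each non-clustered index can then satisfy its query shape entirely from the index, without key lookups back into the table.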

You'll notice I did not comment on your query script at all. That is because there isn't anything in the world you can do to make the queries run faster over the table structure you have.

Now, one thing you shouldn't do is actually implement my advice. I said I'm going to pretend I know your requirements, but I actually don't; I just gave an example of a possible structure. What you really should do is study the topic and figure out the correct index structure for your own requirements. A Google search on "covering index" will also bring up a lot of good articles.

And of course, at the end of the day storage is not free, so you'll have to balance the desire to have a non-clustered index on every possible combination against the need to keep the size of the database in check. Luckily you have a very small, narrow table, so duplicating it over many non-clustered indexes is no big deal. I also wouldn't be concerned about insert performance: 120 counters logged every 15 seconds means 8-9 inserts per second, which is nothing.

Other Answers

A couple things come to mind.

  1. Do you need to keep that much data? If not, consider creating an archive table if you want to keep it (but don't join it with the primary table every time you run a query).

  2. I would avoid using a temp table with so much data. See this article on temp table performance and how to avoid using them:

http://www.sql-server-performance.com/articles/per/derived_temp_tables_p1.aspx

  3. It looks like you are missing an index on the server_id field. I would consider creating a covering index using this field and others. Here is an article on that as well:

http://www.sql-server-performance.com/tips/covering_indexes_p1.aspx

Edit

  4. With that many rows in the table over such a short time frame, I would also check the indexes for fragmentation, which may be a cause of the slowness. In SQL Server 2000 you can use the DBCC SHOWCONTIG command.

See this link for info http://technet.microsoft.com/en-us/library/cc966523.aspx
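For example, a minimal fragmentation check on the log table might look like this (the 90 fill factor in the rebuild is illustrative, matching the table's FILLFACTOR):

```sql
-- SQL Server 2000: report fragmentation for every index on the log table
DBCC SHOWCONTIG ('log') WITH ALL_INDEXES, TABLERESULTS;

-- If "Logical Scan Fragmentation" comes back high (say, above 30%),
-- rebuild all indexes on the table:
DBCC DBREINDEX ('log', '', 90);
```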


Once, when I was still working with SQL Server 2000, I needed to do some paging, and I came across a method of paging that really blew my mind. Have a look at this method:

DECLARE @Table TABLE(
        TimeVal DATETIME
)

DECLARE @StartVal INT
DECLARE @EndVal INT

SELECT  @StartVal = 51, @EndVal = 100

SELECT  *
FROM    (
            SELECT  TOP (@EndVal - @StartVal + 1)
                    *
            FROM    (
                        --select up to end number
                        SELECT  TOP (@EndVal)
                                *
                        FROM    @Table
                        ORDER BY TimeVal ASC
                    ) PageReversed
            ORDER BY TimeVal DESC
        ) PageVals
ORDER BY TimeVal ASC

As an example:

SELECT  *
FROM    (
            SELECT  TOP (@EndVal - @StartVal + 1)
                    *
            FROM    (
                        SELECT TOP (@EndVal)
                                l.id,
                                l.timestamp
                        FROM log l, logfile lf
                        WHERE lf.server_id = #arguments.server_id#
                        and l.test_id = #arguments.test_id#
                        and l.timestamp >= #arguments.report_from#
                        and l.timestamp < #arguments.report_to#
                        and l.logfile_id = lf.id
                        order by l.timestamp asc
                    ) PageReversed ORDER BY timestamp DESC
        ) PageVals
ORDER BY timestamp ASC



