English 中文(简体)
Recommended way of querying multiple Versioned tables
原标题:

Have a win 2003 box with MSSQL 2005 running on it. There is a database which is populated every morning with new/modified SalesOrder made the previous day. The database has several tables: SalesOrder, SalesOrderItem, SalesOrderItemBom. Each with a corresponding Version table (i.e. SalesOrderVersion, SalesOrderItemVersion, SalesOrderItemBomVersion) which has the exact same fields but with 2 addition columns VersionStartDate, VersionEndDate. The non-Versioned tables have the latest data.

Also the VersionStartDate is part of the PK for the version tables, so for example:- SalesOrder has OrderID as PK and SalesOrderItem has VersionStartDate, OrderID as the PK.

An simplified example of how the version table works:

SalesOrder

OrderID, Amount 1, 100 2, 200

SalesOrderVersion

VersionStartDate, OrderID, VersionEndDate, Amount 20090101 13:00:00, 1, 20090103 08:00:00, 50 20090103 08:00:00, 1, 99991231 00:00:00, 100 20090101 09:00:00, 2, 20090105 15:00:00, 300 20090105 15:00:00, 2, 99991231 00:00:00, 200

whenever the a row in SalesOrder is changed, the current row VersionEndDate in SalesOrderVersion is updated and a new row is inserted into SalesOrderVerion with VersionEndDate 99991231

Notes: If a record in SalesOrderItem was changed, it does not necessarily result in a change in "parent" record in SalesOrder

Have been requested to do a report showing trend and daily increments in sales. Off the top of my head, I was thinking of creating 3 snapshot tables for SalesOrder, SalesOrderItem, SalesOrderItemBom, which capture the "latest data" as of the current day and thus build incremental snapshots to show trends. Besides requiring more disk space, is there any draw back to this method compared to doing a stored procedure which joins the Version tables, cos it seems to be a long and expensive query.

Any thoughts or recommendations?

问题回答

There s a whole lot of "it depends" in here. Here are some discussion ideas.

Is data only ever loaded in the morning? That would mean the data doesn t change during the course of the day, so long expensive queries would not cause locking or blocking issues. Is this database only used for reporting? If not--if the data gets updated throughout the day--additional redundant copies of the data may be necessary to support your requirements.

How frequently will the new reports be run? Will they be run many, many times for a given day, or only once or twice? If the "time-aggregated" data s going to be accessed over and over again, making a redundant copy (the snapshot) might be worthwhile, but if the reports are going to be run once or twice and then dumped, I don t know that I d bother.

How important is performance? Must the reports be generated and complete within two-three seconds after a phb hits a button? (pause) No, really, particularly once you explain to them the cost in dollars (extra hard drive space, extra backup space, extra time to prepare and backup and restore, whatever other stealth costs arise from data bloat). If they can wait a few minutes for once-a-day reports, then make it cheaper. (You still have the up-front cost of writing more complex code, but once it s done, it s done.)

On the flip side, adding a routine to generate the day s report after the day s data is loaded, and only keeping that one set of data (or maybe a set of data for the past week, four weeks, whatever) has a strong appeal. If you know they re going to run that 5 minute report, launch it in the AM so it s ready when they come in.

These are all based on guesses at your operational requirements. Like I said, much depends on what you have and what you need.





相关问题
How to write this T-SQL WHERE condition?

I ve got two tables: TableA Col1 Col2 TableB Col3 Col4 I want to join them together: SELECT * from TableA join TableB ON (...) Now, in place of ... I need to write an expression ...

Customer and Order Sql Statement

TSQL query to select all records from Customer that has an Order and also select all records from customer that does not have an Order. The table Customer contains a primary key of CustomerID. The ...

Recommended way of querying multiple Versioned tables

Have a win 2003 box with MSSQL 2005 running on it. There is a database which is populated every morning with new/modified SalesOrder made the previous day. The database has several tables: SalesOrder, ...

update duplicate record

I have a table with the following fields Id Name IsPublic i need to write a sql query that updates IsPublic to false where name has a duplicate. Only one of the duplicates should have IsPublic = ...

Define variable to use with IN operator (T-SQL)

I have a Transact-SQL query that uses the IN operator. Something like this: select * from myTable where myColumn in (1,2,3,4) Is there a way to define a variable to hold the entire list "(1,2,3,4)"? ...

Selecting records during recursive stored procedure

I ve got a content management system that contains a hierarchical structure of categories, with sub-categories subject to different ordering options at each level. Currently, that s retrieved by a (...

热门标签