SQL Table Size And Query Performance

We have a number of items coming in from a web service; each item containing an unknown number of properties. We are storing them in a database with the following Schema.

Items
- ItemID
- ItemName

Properties
- PropertyID
- PropertyName
- PropertyValue
- PropertyValueType
- TransmitTime
- ItemID [fk]
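
The schema above corresponds roughly to this T-SQL DDL. The column types and index names are assumptions; the post only lists column names:

```sql
CREATE TABLE Items (
    ItemID   INT IDENTITY(1,1) PRIMARY KEY,
    ItemName NVARCHAR(200) NOT NULL
);

CREATE TABLE Properties (
    PropertyID        INT IDENTITY(1,1) PRIMARY KEY,
    PropertyName      NVARCHAR(100) NOT NULL,
    PropertyValue     NVARCHAR(MAX) NULL,
    PropertyValueType NVARCHAR(50)  NOT NULL,
    TransmitTime      DATETIME2     NOT NULL,
    ItemID            INT NOT NULL REFERENCES Items(ItemID)
);

-- Lookups by item and archival by time both benefit from indexes here.
CREATE INDEX IX_Properties_ItemID       ON Properties (ItemID);
CREATE INDEX IX_Properties_TransmitTime ON Properties (TransmitTime);
```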

The properties table is growing quite large since it stores the properties for each item, each time the web service is called. My question is this: at what point should we stop adding new records to the Properties table, and archive older Property records according to their transmit time? When does the properties table become too large, and take too long to query? Is there a rule of thumb?

Thanks.

Best answer

There is no rule of thumb.

Some thoughts:

  • Define "large" (we have tables with 160 million rows).
  • Do you have a problem now? If not, don't fix it.
  • Have you run Profiler or some of the whizzy DMVs to find the bottlenecks (missing indexes, etc.)?
  • If you need the data to be on hand, then you can't archive it.
  • You could partition the table, though.
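
The missing-index DMVs mentioned above can be queried directly. A rough sketch — the DMV names are real SQL Server views, but the scoring expression is just one common heuristic, not an official formula:

```sql
-- Rank SQL Server's missing-index suggestions by estimated benefit.
SELECT TOP 20
    d.statement AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    s.user_seeks,
    s.avg_user_impact,
    s.user_seeks * s.avg_total_user_cost * (s.avg_user_impact / 100.0)
        AS improvement_score
FROM sys.dm_db_missing_index_details d
JOIN sys.dm_db_missing_index_groups g
    ON d.index_handle = g.index_handle
JOIN sys.dm_db_missing_index_group_stats s
    ON g.index_group_handle = s.group_handle
ORDER BY improvement_score DESC;
```

Note that these DMVs are reset when the instance restarts, so the suggestions only reflect the workload since the last restart.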
Other answers

I'm not sure about MS SQL Server, but most databases seem to have a way to partition tables: that is, make a virtual table out of many smaller tables and divide the data between them based on some simple rules.

This is very good for time-based data like this. Divide the table on a time period such as a day or an hour. Then, once per period, add a new table partition and drop the oldest one. That is much more efficient than doing a DELETE WHERE time < now - 1 hour, or whatever.

Or, instead of dropping the oldest partition, archive it or just let it stick around taking up space. As long as your queries always specify the date range, they can touch only the most appropriate sub-tables.
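
On SQL Server (2005 and later), the sliding-window idea described above looks roughly like this. The object names, the monthly granularity, and the boundary dates are made up for illustration:

```sql
-- Partition the Properties table by month on TransmitTime.
CREATE PARTITION FUNCTION pf_TransmitMonth (DATETIME2)
    AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-02-01', '2024-03-01');

CREATE PARTITION SCHEME ps_TransmitMonth
    AS PARTITION pf_TransmitMonth ALL TO ([PRIMARY]);

-- The table itself would be created ON ps_TransmitMonth(TransmitTime).

-- Each month: open a new partition at the head of the range...
ALTER PARTITION FUNCTION pf_TransmitMonth () SPLIT RANGE ('2024-04-01');

-- ...and retire the oldest by switching it into an archive table with
-- an identical structure (a metadata-only operation, far cheaper than DELETE).
ALTER TABLE Properties SWITCH PARTITION 1 TO PropertiesArchive;
ALTER PARTITION FUNCTION pf_TransmitMonth () MERGE RANGE ('2024-01-01');
```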

I don't think there's a golden rule for this. Your schema is pretty normalized, though normalization can significantly degrade performance.

Several factors to consider:
- Usage scenario
- Server hardware specs
- Nature of DB operations (e.g. more reads than writes? inserts but no updates?)

For your case, if the number of properties does not exceed a certain number, a single jagged table might be better, or maybe not. (I might get flamed for this statement :P)

Archiving strategy also depends on your business needs/requirements. You might need to pump up your hardware just to meet that need.

Depending on how many distinct "property types" you have, the observation pattern may be able to help.

In your example:
Item = Subject,
Property = Observation,
PropertyName = ObservationType.Name,
PropertyValueType = ObservationType.IsTrait

This way you do not repeat PropertyName and PropertyValueType in each record. Depending on your application, inserts will improve too if you can cache ObservationType and Subject in the app layer.

- Measurement and trait are types of observations. Measurement is a numeric observation, like height. Trait is a descriptive observation, like color.
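
A sketch of the observation-model tables implied by that mapping. All names and types here are assumptions based on the description, not the answerer's actual DDL:

```sql
CREATE TABLE ObservationType (
    ObservationTypeID INT IDENTITY(1,1) PRIMARY KEY,
    Name              NVARCHAR(100) NOT NULL, -- was Properties.PropertyName
    IsTrait           BIT NOT NULL            -- was Properties.PropertyValueType
);

CREATE TABLE Subject (                        -- was Items
    SubjectID INT IDENTITY(1,1) PRIMARY KEY,
    Name      NVARCHAR(200) NOT NULL
);

CREATE TABLE Observation (                    -- was Properties
    ObservationID     INT IDENTITY(1,1) PRIMARY KEY,
    SubjectID         INT NOT NULL REFERENCES Subject(SubjectID),
    ObservationTypeID INT NOT NULL
        REFERENCES ObservationType(ObservationTypeID),
    NumericValue      DECIMAL(18,4) NULL, -- used when IsTrait = 0 (measurement)
    TraitValue        NVARCHAR(200) NULL, -- used when IsTrait = 1 (trait)
    TransmitTime      DATETIME2 NOT NULL
);
```

Because each Observation row stores only two foreign keys plus a value, the repeated name and type strings from the original Properties table drop out of the hot table entirely.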

[Diagram: observation_model_02 — the observation model schema described above]




