English 中文(简体)
How do i add a column to a fact table after it is already deployed and populated?
原标题:

I have a SQL Server 2005 data-mart star schema with the usual fact and dimensions tables. This is deployed and being populated via an SSIS based ETL package. All fine so far.

I have just been approached by the customer with a new requirement. Never, I hear you say! This requirement will mean I need to add a new dimension table to the data-mart to measure a new aspect of the incoming facts which happen to be financial.

To be able to slice the facts by this new dimension I need to add a new foreign key column in the fact table linking to the new dimension.

I am unclear on the best way to do this. What should I do with the data that has already been captured? Just make the new column null-able and accept that old fact will have a NULL? Actually, as I am typing, its dawned on me that I could update old facts as well. Or maybe I should create a separate (child?) fact table which would just contain a link to each new (parent) fact and a link to the new dimension.

I ve not been able to find any information on a best practice for this type of change.

Any help would be much appreciated.

By the way. No Analysis Services used yet.

Thanks, Martin

最佳回答
  1. Add the new dimension table.

  2. Populate it.

  3. Add the nullable reference on your existing facts.

    Not all facts may be joinable to the new dimension. This is common when you have new information. If you have all of your original files, you may have all the information required to update all facts.

    If all facts cannot be mapped to the new dimension, add a "N/A" row to the dimension. Sometimes there s a good reason to have several N/A rows depending on what you know about your facts.

  4. Update your facts so they all reference the new dimension -- either the proper dimension value or the special N/A row.

Ideally, you ll modify your fact table to make the column not-nullable. Sometimes this takes a painfully long time, and it s easier to export the data, redefine the table, and reload the data.

问题回答

Well S.Lot answered most of it, I would just add that what to do with the old facts is a business decision. Make sure you ask them and get it in writing.

S.Lott has a great answer, if you combine that with your suggestion of a "child" fact table, you could have another fact table as a "bridge table" with the new dimension and all the other same dimensions, but think about it - if you are going to that trouble, you could just make a new fact table design identical to the old, but with a non-nullable new dimension (the only expense over your bridge table would be the space of the facts anyway). Populate the history appropriately into the new table - linking either to a valid dimension or the N/A dimension for history as S.Lott suggested. Then start populating new data. You never have to go back and change the nullable to non-nullable. Then you can just deprecate the old fact table - or swap your presentation layer views (if you have those, this gives you a lot mofr freedom underneath with your facts and dimensions).





相关问题
SQL Server database is not visible

I have installed ASP.NET application with database on our server. ASP.NET application created database using connection string below. The problem is that I do not see database in SQL Server Management ...

Most efficient way to store number 11.111 in SQL Server

What datatype is the most efficient to store a number such as 11.111. The numbers will have up 2 digits before the point and up to three after. Currently the column is a bigint , I am guessing that ...

表格和数据表

是否需要在提瓜表之后更新表格统计数据,还是自动更新?

Inconsistent Generate Change Script

I add a column of type tinyint and being set to not allow nulls in a table and generate the change scripts. The table has data in it at this time. The script has code that creates a temp table and ...

Performance of Sql subqueriesfunctions

I am currently working on a particularly complex use-case. Simplifying below :) First, a client record has a many-to-one relationship with a collection of services, that is, a single client may have ...

selecting one value out of an xml column

I have a particularly troublesome xml column to query data from. The schema is fixed by the Quebec Ministry of Revenue so "it is what it is" The important part of the query looks like this: with ...

Selecting records during recursive stored procedure

I ve got a content management system that contains a hierarchical structure of categories, with sub-categories subject to different ordering options at each level. Currently, that s retrieved by a (...

热门标签