I ve looked all over and haven t yet found an intelligent way to handle this, though I feel sure one is possible:
One table of historical data has quarterly information:
CREATE TABLE Quarterly (
unique_ID INT UNSIGNED NOT NULL,
date_posted DATE NOT NULL,
datasource TINYINT UNSIGNED NOT NULL,
data FLOAT NOT NULL,
PRIMARY KEY (unique_ID));
Another table of historical data (which is very large) contains daily information:
CREATE TABLE Daily (
unique_ID INT UNSIGNED NOT NULL,
date_posted DATE NOT NULL,
datasource TINYINT UNSIGNED NOT NULL,
data FLOAT NOT NULL,
qtr_ID INT UNSIGNED,
PRIMARY KEY (unique_ID));
The qtr_ID field is not part of the feed of daily data that populated the database - instead, I need to retroactively populate the qtr_ID field in the Daily table with the Quarterly.unique_ID row ID, using what would have been the most recent quarterly data on that Daily.date_posted for that data source.
For example, if the quarterly data is
101 2009-03-31 1 4.5
102 2009-06-30 1 4.4
103 2009-03-31 2 7.6
104 2009-06-30 2 7.7
105 2009-09-30 1 4.7
and the daily data is
1001 2009-07-14 1 3.5 ??
1002 2009-07-15 1 3.4 &&
1003 2009-07-14 2 2.3 ^^
then we would want the ?? qtr_ID field to be assigned 102 as the most recent quarter for that data source on that date, and && would also be 102 , and ^^ would be 104 .
The challenges include that both tables (particularly the daily table) are actually very large, they can t be normalized to get rid of the repetitive dates or otherwise optimized, and for certain daily entries there is no preceding quarterly entry.
I have tried a variety of joins, using datediff (where the challenge is finding the minimum value of datediff greater than zero), and other attempts but nothing is working for me - usually my syntax is breaking somewhere. Any ideas welcome - I ll execute any basic ideas or concepts and report back.