Type 2 surrogate key with type 3 attribute In many Type 2 and Type 6 SCD implementations, the
surrogate key from the dimension is put into the fact table in place of the
natural key when the fact data is loaded into the data repository. The surrogate key is selected for a given fact record based on its effective date and the Start_Date and End_Date from the dimension table. This allows the fact data to be easily joined to the correct dimension data for the corresponding effective date. Here is the Supplier table as we created it above using Type 6 Hybrid methodology: Once the Delivery table contains the correct Supplier_Key, it can easily be joined to the Supplier table using that key. The following SQL retrieves, for each fact record, the current supplier state and the state the supplier was located in at the time of the delivery: SELECT delivery.delivery_cost, supplier.supplier_name, supplier.historical_state, supplier.current_state FROM delivery INNER JOIN supplier ON delivery.supplier_key = supplier.supplier_key;
Pure type 6 implementation Having a Type 2 surrogate key for each time slice can cause problems if the dimension is subject to change. A pure Type 6 implementation does not use this, but uses a surrogate key for each
master data item (e.g. each unique supplier has a single surrogate key). This avoids any changes in the master data having an impact on the existing transaction data. It also allows more options when querying the transactions. Here is the Supplier table using the pure Type 6 methodology: The following example shows how the query must be extended to ensure a single supplier record is retrieved for each transaction. SELECT supplier.supplier_code, supplier.supplier_state FROM supplier INNER JOIN delivery ON supplier.supplier_key = delivery.supplier_key AND delivery.delivery_date >= supplier.start_date AND delivery.delivery_date A fact record with an effective date (Delivery_Date) of August 9, 2001 will be linked to Supplier_Code of ABC, with a Supplier_State of 'CA'. A fact record with an effective date of October 11, 2007 will also be linked to the same Supplier_Code ABC, but with a Supplier_State of 'IL'. While more complex, there are a number of advantages of this approach, including: •
Referential integrity by DBMS is now possible, but one cannot use Supplier_Code as
foreign key on Product table and using Supplier_Key as foreign key each product is tied on specific time slice. • If there is more than one date on the fact (e.g. Order_Date, Delivery_Date, Invoice_Payment_Date) one can choose which date to use for a query. • You can do "as at now", "as at transaction time" or "as at a point in time" queries by changing the date filter logic. • You don't need to reprocess the fact table if there is a change in the dimension table (e.g. adding additional fields retrospectively which change the time slices, or if one makes a mistake in the dates on the dimension table one can correct them easily). • You can introduce
bi-temporal dates in the dimension table. • You can join the fact to the multiple versions of the dimension table to allow reporting of the same information with different effective dates, in the same query. The following example shows how a specific date such as '2012-01-01T00:00:00' (which could be the current datetime) can be used. SELECT supplier.supplier_code, supplier.supplier_state FROM supplier INNER JOIN delivery ON supplier.supplier_key = delivery.supplier_key AND supplier.start_date '2012-01-01T00:00:00'; == Type 7: Hybrid - Both surrogate and natural key ==