Another technique to consider when you are building the data warehouse is
to use incrementing integer surrogate keys as the primary keys
in your data structures. Surrogate keys are values that are assigned
sequentially as needed to populate a dimension. They are very useful
because they can shield users from changes in the operational systems
that might invalidate the data in a warehouse (and thereby require
redesign and reloading). For example, if the operational system changes
its key length or type, a surrogate key remains valid, whereas an operational
key stored in the warehouse does not.
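The shielding effect described above can be illustrated with a minimal Python sketch. The table contents, the key formats, and the `migrate_operational_keys` helper are all hypothetical and exist only for illustration. They are not part of any product API.

```python
# Dimension: surrogate key -> operational (source-system) key. Values are made up.
customer_dim = {1: "1001", 2: "1002"}

# Fact rows reference only the surrogate key: (customer surrogate key, amount).
sales_fact = [(1, 250.0), (2, 99.5), (1, 40.0)]


def migrate_operational_keys(dim):
    """Hypothetical source-system change: 4-character keys become
    8-character keys with a 'CU' prefix and zero padding."""
    return {sk: "CU" + op_key.zfill(6) for sk, op_key in dim.items()}


facts_before = list(sales_fact)
customer_dim = migrate_operational_keys(customer_dim)

# Only the dimension's operational-key column changed; the fact rows,
# which join on the surrogate key, did not need to be reloaded.
assert sales_fact == facts_before
print(customer_dim)
```

In this sketch the change in operational key length touches one column of the dimension, while every row that joins on the integer surrogate key is left as it was.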
The SCD Type 2 transformation includes a surrogate key generator. You can
also plug in your own key-generation method that matches your business
environment and point the transformation to it. A Surrogate Key Generator
transformation can also be used to build incrementing integer surrogate keys.
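As a rough idea of what an incrementing integer key generator does, here is a minimal Python sketch. The `SurrogateKeyGenerator` class and the sample business keys are hypothetical. This is not how the SCD Type 2 or Surrogate Key Generator transformations are implemented, and it does not handle SCD Type 2 row versioning, where each new version of a row would receive its own key.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Hypothetical helper: assigns the next integer to each new business key."""

    def __init__(self, start=1):
        self._next_key = count(start)   # sequential integers: start, start+1, ...
        self._assigned = {}             # business key -> surrogate key

    def key_for(self, business_key):
        # Reuse the existing surrogate key, or assign the next one in sequence.
        if business_key not in self._assigned:
            self._assigned[business_key] = next(self._next_key)
        return self._assigned[business_key]


gen = SurrogateKeyGenerator()
keys = [gen.key_for(bk) for bk in ["C-1001", "C-1002", "C-1001"]]
print(keys)
```

Keys are assigned sequentially as new business keys arrive, and a business key that has already been seen gets its existing surrogate key back.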
Avoid character-based surrogate keys. In general, functions that are based
on integer keys are more efficient because they avoid the subsetting
or string partitioning that character-based keys might require.
Integer keys are also typically smaller than character strings,
thereby reducing the storage required in the warehouse.
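The size difference can be illustrated with a small Python sketch. It assumes a 4-byte integer encoding and a made-up 15-character key format. Actual storage depends on the column lengths and types defined in your warehouse.

```python
import struct

# Assumption: the surrogate key is stored as a 4-byte integer.
int_key_bytes = struct.calcsize("<i")

# Assumption: a hypothetical character key format, one byte per character.
char_key_bytes = len("CUST-0000012345".encode("ascii"))

rows = 10_000_000  # illustrative row count
saved_bytes = (char_key_bytes - int_key_bytes) * rows
print(int_key_bytes, char_key_bytes, saved_bytes)
```

Under these assumptions the saving applies once for every table that carries the key, so it grows with both the row count and the number of tables that reference the dimension.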