Another technique to consider when you
build the data warehouse is to use incrementing integer surrogate
keys as the primary keys in your data structures. Surrogate
keys are values that are assigned sequentially, as needed, to populate
a dimension. They are useful because they shield users from
changes in the operational systems that might otherwise invalidate the data
in the warehouse (and thereby require redesign and reloading). For example,
if the operational system changes the length or type of its key, a surrogate
key remains valid, whereas a stored operational key does not.
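The shielding effect can be illustrated with a minimal sketch. The table
layouts, key values, and the migrate function are hypothetical and are not
taken from any particular product. The fact rows refer only to the surrogate
key, so a change in the operational key format touches the dimension alone.

```python
# Dimension rows: surrogate key -> operational (business) key.
customer_dim = {1: "1001", 2: "1002"}

# Fact rows reference only the surrogate key: (customer surrogate key, amount).
sales_fact = [(1, 250.0), (2, 99.5), (1, 40.0)]
facts_before = list(sales_fact)

# Assumption for illustration: the operational system widens its key
# to a 10-character string with a "C" prefix.
def migrate(old_key: str) -> str:
    return "C" + old_key.zfill(9)

# Only the business-key attribute of the dimension is rewritten.
customer_dim = {sk: migrate(bk) for sk, bk in customer_dim.items()}

# The fact table is untouched and every surrogate key still resolves.
unchanged = sales_fact == facts_before
resolvable = all(sk in customer_dim for sk, _ in sales_fact)
```

If the fact rows had stored the operational key directly, every one of them
would have needed to be rewritten after the format change.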
The SCD Type 2 transformation
includes a surrogate key generator. You can also plug in your own
key-generation method that matches your business environment
and point the transformation to it. Alternatively, a Surrogate Key Generator
transformation can be used to build incrementing integer surrogate
keys.
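As a rough sketch of what an incrementing integer key generator does, the
following assigns the next integer to each business key the first time it is
seen and reuses that integer afterward. The class name, method name, and
sample keys are hypothetical. This is not the implementation of the
transformations named above.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Hypothetical helper: hands out sequential integer surrogate keys."""

    def __init__(self, start: int = 1) -> None:
        self._next = count(start)          # source of incrementing integers
        self._assigned: dict[str, int] = {}  # business key -> surrogate key

    def key_for(self, business_key: str) -> int:
        # Assign a new key only when the business key has not been seen.
        if business_key not in self._assigned:
            self._assigned[business_key] = next(self._next)
        return self._assigned[business_key]


gen = SurrogateKeyGenerator()
keys = [gen.key_for(k) for k in ["CUST-0042", "CUST-0007", "CUST-0042"]]
```

A production generator would normally start from the highest key already
present in the target dimension instead of from 1, so that reloads do not
reuse existing values.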
Avoid character-based
surrogate keys. In general, functions that operate on integer keys
are more efficient because they avoid the subsetting or string
partitioning that character-based keys might require. Integer
keys are also typically smaller than character strings, which reduces
the storage required in the warehouse.
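The storage difference can be estimated with simple arithmetic. The sketch
below assumes a fixed-width 32-bit integer key, a hypothetical 16-character
key, and an illustrative row count. Actual sizes depend on the column types
that your database or data set format uses.

```python
import struct

# Size in bytes of a fixed-width 32-bit integer key.
int_key_bytes = struct.calcsize("<i")

# Hypothetical character key, stored one byte per character.
char_key = "CUST-2024-000123"
char_key_bytes = len(char_key.encode("ascii"))

# Illustrative fact table size; the key is stored once per row.
rows = 10_000_000
bytes_saved = (char_key_bytes - int_key_bytes) * rows
```

The saving applies to each key column in each fact row, and again in any
index that is built on the key.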