Another technique to consider when you build the data warehouse is to use
incrementing integer surrogate keys as the primary keys in your data structures.
Surrogate keys are values that are assigned sequentially as needed to populate a
dimension. They are useful because they shield users from changes in the operational
systems that might otherwise invalidate the data in a warehouse (and thereby require
redesign and reloading). For example, if the operational system changes its key
length or type, a surrogate key remains valid, whereas an operational key does not.
The SCD Type 2 transformation includes a surrogate key generator. Alternatively, you
can generate the keys with your own method, one that matches your business
environment, and point the transformation to it. You can also use a Surrogate Key
Generator transformation to build incrementing integer surrogate keys.
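
The idea behind an incrementing integer surrogate key can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the logic of the SCD Type 2 or Surrogate Key Generator transformations: the class name SurrogateKeyGenerator, its methods key_for and rekey, and the sample customer keys are all hypothetical. The sketch assigns the next integer the first time an operational key is seen, and shows that the surrogate key stays the same when the operational key later changes its type and length.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Hypothetical helper that hands out incrementing integer surrogate keys."""

    def __init__(self, start=1):
        self._next_key = count(start)  # yields start, start + 1, start + 2, ...
        self._assigned = {}            # operational key -> surrogate key

    def key_for(self, operational_key):
        """Return the existing surrogate key, or assign the next integer."""
        if operational_key not in self._assigned:
            self._assigned[operational_key] = next(self._next_key)
        return self._assigned[operational_key]

    def rekey(self, old_operational_key, new_operational_key):
        """Record a changed operational key; the surrogate key is untouched."""
        self._assigned[new_operational_key] = self._assigned.pop(old_operational_key)


generator = SurrogateKeyGenerator()
first = generator.key_for("CUST-0042")   # new operational key, gets 1
second = generator.key_for("CUST-0043")  # new operational key, gets 2
repeat = generator.key_for("CUST-0042")  # already known, still 1

# Assumed scenario: the operational system switches from character keys
# to longer numeric keys. Only the lookup entry changes.
generator.rekey("CUST-0042", 1000042)
after_change = generator.key_for(1000042)

print(first, second, repeat, after_change)  # prints: 1 2 1 1
```

Because warehouse rows would reference only the integer (1 in this sketch), a change to the operational key touches the lookup and nothing else.
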
Avoid character-based surrogate keys. In general, functions that are based on
integer keys are more efficient because they avoid the subsetting or string
partitioning that character-based keys might require. Integer keys are also
typically smaller than character strings, which reduces the storage required
in the warehouse.