Refreshing a dynamic
cluster table uses a fraction of the disk space that a traditional
SPD Server table with the same amount of data uses. The dynamic cluster
table architecture enables users to refresh many large tables concurrently,
while conserving disk and I/O resources. With very large traditional
SAS or SPD Server tables, available disk space can limit the number
of tables that can be concurrently refreshed.
In the life cycle of
data warehouses, tables can be refreshed to recapture disk space after
rows have been updated or deleted. Refreshing tables can reorder data
for optimized performance. However, refreshing a table can temporarily
use twice the disk space of the table itself. With very large tables,
disk space can be a problem when updating a data warehouse or data
mart. When disk space is limited on a server, the amount of data that
can be simultaneously refreshed is constrained, and the window of time
that is required to load and refresh can become impractically long.
Because dynamic cluster
tables can be quickly unbound into smaller SPD Server tables, refreshing
dynamic cluster tables does not use twice the disk space of the original
table itself. Instead, only twice the disk space of the largest member
table in the dynamic cluster table is used.
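
The comparison can be illustrated with simple arithmetic. The following Python sketch is not part of SPD Server. The function names and the table sizes are hypothetical, and the sketch assumes that a refreshed copy is as large as the table that it replaces.

```python
def traditional_refresh_footprint(table_size_gb):
    """Space occupied while a traditional table is refreshed:
    the original table and its refreshed copy exist together."""
    return 2 * table_size_gb


def cluster_refresh_footprint(member_sizes_gb):
    """Space occupied while a dynamic cluster table is refreshed one
    member at a time: at worst, the largest member and its copy."""
    return 2 * max(member_sizes_gb)


# Hypothetical example: 600 GB of data stored as 12 members of 50 GB each.
members = [50] * 12
total = sum(members)

print("traditional:", traditional_refresh_footprint(total), "GB")
print("dynamic cluster:", cluster_refresh_footprint(members), "GB")
```

With these assumed sizes, the traditional refresh occupies 1200 GB at its peak, and the member-by-member refresh occupies 100 GB for the member that is being refreshed.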
After the dynamic cluster
table is unbound, free disk space equal to the size of the first member
table is required to perform a refresh. The refreshed copy of the member is
created, and then the old version is deleted, which frees disk space again.
The refresh process repeats for each successive member table until
all members in the dynamic cluster table have been refreshed and updated.
Then, the member tables are bound into a dynamic cluster table again.
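
The member-by-member cycle described above can be modeled as a small simulation. This is a minimal sketch, not SPD Server code: `simulate_sequential_refresh` and the member sizes are hypothetical, and each refreshed copy is assumed to be the same size as the member that it replaces.

```python
def simulate_sequential_refresh(member_sizes_gb):
    """Return the peak disk usage while members are refreshed in turn."""
    in_use = sum(member_sizes_gb)   # all unbound members are on disk
    peak = in_use
    for size in member_sizes_gb:
        in_use += size              # refreshed copy is written
        peak = max(peak, in_use)
        in_use -= size              # old version of the member is deleted
    return peak


# Hypothetical member sizes in GB.
sizes = [40, 60, 50]
peak = simulate_sequential_refresh(sizes)
extra_needed = peak - sum(sizes)

print("peak usage:", peak, "GB")
print("free space needed beyond the data itself:", extra_needed, "GB")
```

Under these assumptions, the free space that is needed beyond the data itself equals the size of the largest member (60 GB here), whereas refreshing the same data as one traditional table would need free space equal to the whole table (150 GB).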
When a server has enough
disk space and I/O resources to refresh more than one member table
at a time, the benefits of parallel processing can be realized.
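
How many member tables can be refreshed at the same time depends on the free disk space. The following sketch estimates that number conservatively by assuming that the largest members are the ones refreshed together. The function name, member sizes, and free-space figure are hypothetical, and I/O capacity is not modeled.

```python
def max_parallel_refreshes(member_sizes_gb, free_space_gb):
    """Largest number of members that can be refreshed concurrently,
    assuming each needs free space equal to its own size and that the
    largest members happen to be refreshed together (worst case)."""
    needed = 0
    count = 0
    for size in sorted(member_sizes_gb, reverse=True):
        if needed + size > free_space_gb:
            break
        needed += size
        count += 1
    return count


# Hypothetical example: three members and 120 GB of free disk space.
print(max_parallel_refreshes([40, 60, 50], free_space_gb=120))
```

In this example, two members can be refreshed concurrently, because the two largest members need 110 GB of free space and the third would push the requirement to 150 GB.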