The option entered in the Hold Buffer Size field in the Input pane on the
Options tab in the Clickstream Parse transformation can have a significant
effect on the performance of the transformation. When Web servers
write raw data to the logs, the records are typically written in chronological
order. The hold buffer size option represents the amount of this data
that is held in memory before it is written to the output table.
For example, the default value of 120 causes all records
that have a timestamp within the last 120 seconds of the latest timestamp
to be held in memory. With this value, any record whose timestamp falls
outside that 120-second range is written to the output table. The hold
buffer usually allows incoming records that are slightly out of
chronological order to be put back in order before they are written.
Thus, a subsequent sort of the data can generally be avoided.
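
The behavior described above can be illustrated with a minimal Python sketch. It is not the Clickstream Parse implementation; the function name hold_buffer, the (timestamp, payload) record shape, and the treatment of the window boundary are assumptions made for illustration only.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record shape: (timestamp in seconds, payload).
Record = Tuple[float, str]


def hold_buffer(records: Iterable[Record],
                hold_seconds: float = 120) -> Iterator[Record]:
    """Yield records, holding back those whose timestamp is within
    hold_seconds of the latest timestamp seen so far."""
    if hold_seconds <= 0:
        # Buffer switched off: records pass through in arrival order.
        yield from records
        return

    heap: List[Tuple[float, int, str]] = []
    latest = float("-inf")
    seq = 0  # tie-breaker that keeps arrival order for equal timestamps
    for ts, payload in records:
        latest = max(latest, ts)
        heapq.heappush(heap, (ts, seq, payload))
        seq += 1
        # Release every held record that has fallen outside the window.
        while heap and heap[0][0] < latest - hold_seconds:
            old_ts, _, old_payload = heapq.heappop(heap)
            yield (old_ts, old_payload)

    # End of input: flush what is still held, oldest first.
    while heap:
        old_ts, _, old_payload = heapq.heappop(heap)
        yield (old_ts, old_payload)
```

In this sketch, a record that arrives more than hold_seconds behind the latest timestamp is released immediately, after newer records have already been written, so the output is no longer fully ordered and a later sort would still be needed.
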
However, the default hold buffer size does not always work as expected. If
records in your incoming data are out of chronological order by more than
this 120-second threshold, you can increase the hold buffer
size. Note that a larger hold buffer increases the memory used by
the Clickstream Parse transformation because more data is held in
the buffer before it is sent to the output table.
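
One way to judge how far to increase the value is to measure how late the worst out-of-order record in a sample of the log arrives. The following Python sketch does that; the function name required_hold_seconds and the assumption that timestamps are already parsed to seconds are illustrative, not part of the transformation.

```python
from typing import Iterable


def required_hold_seconds(timestamps: Iterable[float]) -> float:
    """Return the largest gap, in seconds, between a record's timestamp
    and the latest timestamp that arrived before it."""
    latest = float("-inf")
    worst = 0.0
    for ts in timestamps:
        if ts > latest:
            latest = ts  # record is in order; it advances the high-water mark
        else:
            worst = max(worst, latest - ts)  # record arrived late
    return worst
```

A result above 120 suggests that the default setting would let some records through out of order for that sample. A very large result suggests that the memory cost of a matching buffer might outweigh the cost of a subsequent sort.
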
If the hold buffer functionality is consistently unable to prevent a sort,
it can be switched off with a value of 0.
This setting can result in a subsequent sort being required. However,
it removes some of the processing overhead that occurs in managing
the buffer.