The Clickstream Sessionize transformation reads data from the input
transformation (typically the Clickstream Parse transformation). Once
the input data is clean and you have identified a Visitor ID, you need
to identify sessions. A session is the series of a user's clicks from
the time that the user enters the Web site, through the pages that the
user visits, until the user exits the site.
The Clickstream
Sessionize transformation enables you to identify the user sessions,
identify spiders and other non-human visitors, and manage sessions
that span Web logs. The output goes to a table or continues within
the job for additional processing. The Clickstream Sessionize transformation
is shown in the following display.
Clickstream Sessionize Transformation
The Clickstream Sessionize transformation passes the
same set of columns that it receives on the input to the output. The
transformation also adds the following columns:
Clickstream Sessionize Generated Columns
| Column | Description | How It Is Derived |
|---|---|---|
| Session_ID* | Specifies the assigned session identifier for this visitor session. | If the Session_ID column is present on the input table and has a value, then this value is used as the identifier for this visitor's session. If the Session_ID value is blank or the Session_ID column is not present in the incoming table, then the identifier is derived from User-Defined Rules or the default configuration option. (This option combines CLK_Client_IP, CLK_cs_UserAgent, and date time.) |
| | Specifies whether the record belongs to an open or closed session. A value of 1 indicates that this record belongs to a closed session. A value of 0 indicates that this record belongs to an open session. | Is set to 1 when a session has exceeded the session timeout value. Otherwise, this value is set to 0. |
| entry_point | Specifies whether this is the first click of the visitor's session. A value of 1 indicates that this is the first click of the visitor's session. Otherwise, this value is set to 0. | Examines clicks in date_time order. The entry_point of the first click is set to 1; all others are set to 0. |
| exit_point | Specifies whether this is the last click of the visitor's session and whether it belongs to an open or closed session. A value of 1 indicates that this is the last click of the visitor's session and that it belongs to a closed session. A value of 2 indicates that this is the last click of the visitor's session and that it belongs to an open session. Otherwise, this value is set to 0. | Examines clicks in date_time order. The exit_point of the final click is set to 1 (closed session) or 2 (open session); all others are set to 0. |
| | Specifies the amount of time the visitor spent on the page before the next click. | Subtracts the date_time of the current click from the date_time of the subsequent click. The value for the last click in a session is set to missing. |

*Default name for the column identified as holding or representing the Session ID.
Note: Extra columns
that are on the input to the Clickstream Sessionize transformation
are passed through. The generated columns are added to the output
detail data table.
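The derivation rules for the generated columns can be sketched in a few lines of Python. This is an illustration only, not the transformation's actual implementation. It assumes that each click already carries a Session_ID and a date_time, and that a session counts as closed when its last click is older than the timeout relative to a reference time. The function name `add_session_columns`, the parameter `now`, the 30-minute default timeout, and the output names `session_closed` and `seconds_on_page` are hypothetical.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter


def add_session_columns(clicks, now, timeout=timedelta(minutes=30)):
    """Return copies of the click records with the generated columns added.

    clicks  -- iterable of dicts that each hold "Session_ID" and "date_time"
    now     -- reference time used to decide whether a session has timed out
               (an assumption of this sketch)
    timeout -- session timeout value (hypothetical default)
    """
    output = []
    # Examine clicks in date_time order within each session.
    ordered = sorted(clicks, key=itemgetter("Session_ID", "date_time"))
    for _, group in groupby(ordered, key=itemgetter("Session_ID")):
        session = list(group)
        # Closed (1) when the session has exceeded the timeout, else open (0).
        closed = 1 if now - session[-1]["date_time"] > timeout else 0
        for i, click in enumerate(session):
            is_last = i == len(session) - 1
            row = dict(click)  # input columns are passed through unchanged
            row["session_closed"] = closed
            row["entry_point"] = 1 if i == 0 else 0
            # Last click: 1 for a closed session, 2 for an open one; else 0.
            row["exit_point"] = (1 if closed else 2) if is_last else 0
            # Time on page: next click's date_time minus this click's
            # date_time; missing (None) for the last click in a session.
            if is_last:
                row["seconds_on_page"] = None
            else:
                delta = session[i + 1]["date_time"] - click["date_time"]
                row["seconds_on_page"] = delta.total_seconds()
            output.append(row)
    return output
```

Calling `add_session_columns(clicks, now=datetime(2024, 1, 1, 12, 0))` on a list of click records returns the same records with the four extra keys attached, which mirrors the way the transformation passes input columns through and appends the generated ones.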
Typical user tasks for the Clickstream Sessionize transformation include
the following:
- specifying the way that non-human visitors are detected and handled
- managing sessions that span Web logs
- specifying options for the Clickstream Sessionize transformation