Running a Page-Tagging ETL Job

Problem

You want to process the data collected by a clickstream collection server.

Solution

You can process the data by running the page tagging template job. Unlike the other template jobs, the page tagging template uses two Clickstream Parse transformations to extract the tagged data. The following overview shows the steps that the job executes:
  1. Clickstream Log: Reads in the tagged data from the raw tagged Web log.
  2. Checkpoint for Clickstream Log.
  3. Parse Tagged Data Items: Extracts all tagged data elements and generates output that is ready for the subsequent Clickstream Parse transformation.
  4. Checkpoint for Parse Tagged Data Items.
  5. Parse: The second Clickstream Parse transformation, which processes the data from the originally requested file.
  6. Checkpoint for Clickstream Parse.
  7. Clickstream Sessionize: Sessionizes the data as usual and includes the tagged data elements that were extracted in the first Clickstream Parse transformation.
  8. Checkpoint for Clickstream Sessionize.
With SAS Data Integration Studio 4.2 and later, you can add notes to the job. A Read Me First note in the job flow informs the user to open the job properties window and edit the default value for the Email Address for Checkpoint Notifications parameter on the Parameters tab. The value that you set is used by all the Checkpoint transformations in this job. These Checkpoint transformations notify you when errors occur at strategic points in the job.
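The flow described above (four processing steps, each followed by a checkpoint that reports to one shared e-mail address) can be illustrated with a small sketch. This is not SAS code and not how the template job is implemented. It is a minimal Python illustration under stated assumptions: the record layout, the query-string form of the tagged data, the visitor tag, the rule that a checkpoint fails when a step returns no rows, and the notify helper are all hypothetical. Sessionizing here ignores timing entirely.

```python
from typing import Any, Callable, Dict, List
from urllib.parse import parse_qsl, urlsplit

Record = Dict[str, Any]
sent_notifications: List[str] = []  # hypothetical stand-in for outgoing e-mail


def notify(address: str, message: str) -> None:
    """Hypothetical notifier: records the message instead of sending e-mail."""
    sent_notifications.append(f"to={address}: {message}")


def checkpoint(step: str, rows: List[Record], address: str) -> List[Record]:
    """Stop the job and notify the shared address if a step produced no rows."""
    if not rows:
        notify(address, f"Checkpoint for {step} failed: no rows")
        raise RuntimeError(f"{step} produced no rows")
    return rows


def clickstream_log(lines: List[str]) -> List[Record]:
    # Step 1: read each non-blank line of the raw tagged log into a record.
    return [{"request": line.strip()} for line in lines if line.strip()]


def parse_tagged_data_items(rows: List[Record]) -> List[Record]:
    # Step 3: extract the tagged data elements (assumed to be query parameters).
    return [
        {**r, "tags": dict(parse_qsl(urlsplit(r["request"]).query))}
        for r in rows
    ]


def clickstream_parse(rows: List[Record]) -> List[Record]:
    # Step 5: process the originally requested file (here, only its path).
    return [{**r, "path": urlsplit(r["request"]).path} for r in rows]


def clickstream_sessionize(rows: List[Record]) -> List[Record]:
    # Step 7: assign one session number per visitor tag, keeping the tags.
    sessions: Dict[str, int] = {}
    out: List[Record] = []
    for r in rows:
        visitor = r["tags"].get("visitor", "unknown")
        session = sessions.setdefault(visitor, len(sessions) + 1)
        out.append({**r, "session": session})
    return out


def run_job(lines: List[str], address: str) -> List[Record]:
    """Run every step in order, with a checkpoint after each one."""
    steps: List[Any] = [
        ("Clickstream Log", clickstream_log),
        ("Parse Tagged Data Items", parse_tagged_data_items),
        ("Clickstream Parse", clickstream_parse),
        ("Clickstream Sessionize", clickstream_sessionize),
    ]
    data: Any = lines
    for name, step in steps:
        data = checkpoint(name, step(data), address)
    return data
```

Passing the address once to run_job mirrors the single job parameter that every Checkpoint transformation shares: changing it in one place changes where all checkpoint failures are reported.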
Perform the following tasks to run the page tagging default job:

Tasks

Prepare the Job

If you have not done so already, run a copy of the setup job for the page tagging template, which is named clk_0010_setup_page_tagging. When you process the data, copy and rename the page tagging template job before you run it. For example, you might run a job named clk_0020_page_tagging_detail_Site1. Renaming a copy of the job ensures that you keep the original template job and retain access to its default values. (See Copying the Page Tagging Template.)
The following display shows a sample renamed template job.
Copied Page Tagging Template Job

Run the Job and Examine the Output

Perform the following steps to run the page tagging job and examine its output:
  1. Open the job and run it.
    The following display shows a successfully completed job.
    Completed Page Tagging Template Job
  2. If the job completes without error, right-click the Tagged_DDS table at the end of the job and click Open in the pop-up menu.
    The View Data window appears, as shown in the following display.
    Page Tagging Output