Running the Standard Web Log Tutorial Job

Overview

You can use the standard Web log tutorial job to process a single clickstream log that was produced by a standard Web server. Note that the standard Web logs basic template generally provides better performance because it uses parallel processing. However, the standard Web log tutorial job can help you learn about the structure of clickstream templates and the properties of clickstream transformations.
If you have not done so already, run a copy of the setup job for the tutorial Web log template, which is named clk_0010_tutorial_weblog_setup. The setup job creates test data and a folder called ClickstreamTemplates with subfolders to hold the output from the standard Web log tutorial. When you are ready to process the data, run a copy of the standard Web log tutorial job, which is named clk_0200_tutorial_weblog_load_weblog_detail. For information about how to copy the required folder structure, see Copying the Folder Structure of a Clickstream Job. By running a copy, you protect the original template.
Perform the following tasks to run the standard Web log tutorial job:

Review and Prepare the Job

You can examine the standard Web log tutorial job on the Diagram tab of the SAS Data Integration Studio Job Editor before you run it. You can also specify the location of the clickstream log to process.
The following display shows a sample renamed tutorial job:
Copied Standard Web Log Tutorial Job
Note that the following tables are created as additional outputs in the Clickstream Parse and Clickstream Sessionize transformations:
  • UNIQUEPARMS: the unique parameters (as well as the type of parameter) found while processing the data
  • SESSIONS: information about the sessions found while processing the data
  • SPIDERS: information about the non-human visitor sessions found while processing the data
  • SPIDER DETAIL: the detailed activity of the non-human visitor sessions identified in the SPIDERS table
These tables are created during the processing necessary to produce the final WEBLOG_DETAIL output table. They are stored in the Additional Output library that is specified on the Clickstream Parse transformation or the Clickstream Sessionize transformation.
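The Clickstream Parse and Clickstream Sessionize transformations do this work inside SAS Data Integration Studio, but the underlying ideas can be illustrated outside the tool. The following Python sketch is a conceptual illustration only, not the SAS implementation: it collects unique query parameters from log records, groups hits into sessions per client with an inactivity timeout, and flags likely non-human sessions by user-agent keywords, roughly the kinds of results captured in the UNIQUEPARMS, SESSIONS, and SPIDERS tables. All field names, sample data, and thresholds here are assumptions for illustration.

```python
# Conceptual illustration only: NOT the SAS Clickstream Parse/Sessionize code.
# Collects unique query parameters, builds per-client sessions with a
# 30-minute inactivity timeout, and flags sessions with crawler-like agents.
from urllib.parse import urlsplit, parse_qsl

SESSION_TIMEOUT = 30 * 60                        # seconds of inactivity
SPIDER_KEYWORDS = ("bot", "spider", "crawler")   # illustrative heuristic

# Each record: (client_ip, epoch_seconds, requested_url, user_agent)
hits = [
    ("10.0.0.1", 1000, "/shop?item=42&color=red", "Mozilla/5.0"),
    ("10.0.0.1", 1300, "/cart?item=42", "Mozilla/5.0"),
    ("10.0.0.2", 1100, "/robots.txt", "Googlebot/2.1"),
    ("10.0.0.1", 4000, "/shop?item=7", "Mozilla/5.0"),  # gap > timeout
]

unique_params = {}   # parameter name -> set of values seen (UNIQUEPARMS-like)
sessions = []        # closed sessions (SESSIONS-like)
current = {}         # open session per client_ip

for ip, ts, url, ua in sorted(hits, key=lambda h: (h[0], h[1])):
    # Parse query parameters out of the requested URL.
    for name, value in parse_qsl(urlsplit(url).query):
        unique_params.setdefault(name, set()).add(value)
    sess = current.get(ip)
    if sess and ts - sess["end"] <= SESSION_TIMEOUT:
        sess["end"] = ts                 # same session: extend it
        sess["hits"] += 1
    else:
        if sess:                         # inactivity gap: close old session
            sessions.append(sess)
        current[ip] = {"ip": ip, "start": ts, "end": ts, "hits": 1, "ua": ua}
sessions.extend(current.values())        # close whatever is still open

# Flag likely non-human sessions (SPIDERS-like).
spiders = [s for s in sessions
           if any(k in s["ua"].lower() for k in SPIDER_KEYWORDS)]

print(sorted(unique_params))             # ['color', 'item']
print(len(sessions), len(spiders))       # 3 1
```

The 30-minute timeout and keyword list are common defaults in clickstream analysis generally, not settings read from the template; in the tutorial job, the equivalent behavior is controlled through the properties of the Clickstream Sessionize transformation.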
Perform the following steps to review and prepare the job:
  1. Open the renamed copy of the standard Web log tutorial job.
  2. Scroll through the job on the Diagram tab and review the following items:
    • the section that processes the source clickstream log
    • the section that parses the data into meaningful columns
    • the section that creates sessions and generates an output table
  3. Open the File Location tab in the properties window for the Clickstream Log transformation and review the file path to the clickstream log in the File name field. Specify another path if you need to process a different log. Click OK to close the properties window when you are finished.
    Note: You can click the Preview button to view the first few lines of the file and confirm that you have selected a valid path.
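The Preview button performs this sanity check inside the Job Editor. Outside the tool, the same check amounts to printing the first few lines of the file at the configured path. The following Python snippet is a stand-in for that preview; the log path and sample content are hypothetical, so substitute your own file.

```python
# Quick stand-in for the Preview check: read the first few lines of a
# clickstream log to confirm the path is valid and the format looks right.
from itertools import islice
from pathlib import Path

def preview(path, n=5):
    """Return up to the first n lines of the file; raises if the path is bad."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return [line.rstrip("\n") for line in islice(f, n)]

# Hypothetical sample log created here so the snippet is self-contained.
log = Path("weblog_sample.log")
log.write_text('10.0.0.1 - - [01/Jan/2024:00:00:00] "GET / HTTP/1.1" 200\n' * 3)

for line in preview(log):
    print(line)
```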

Run the Job and Examine the Output

Perform the following steps to run a standard Web log tutorial job and examine its output:
  1. Run the job.
  2. If the job completes without error, right-click the OUTPUT_DETAIL table at the end of the job and select Open from the pop-up menu.
    The View Data window appears, as shown in the following display.
    Standard Web Log Tutorial Job Output