Running a Subsite Job

Problem

You want to run a job that isolates the data for one or more subsites in a clickstream log.

Solution

You can process the clickstream log with the subsite job template. If you have not done so already, first run a copy of the setup job for the subsite template, which is named clk_0010_setup_sub_site. To process the data, run a copy of the subsite job, which is named clk_0200_create_sub_site_tables. Running a copy protects the original template. For information about running the setup job and creating a copy of the original job, see Copying the Sub Site Templates Folder.
Perform the following tasks to run the template:

Tasks

Review and Prepare the Job

You can examine the subsite job on the Diagram tab of the SAS Data Integration Studio Job Editor before you run it. You can also configure the job to change the file location of the clickstream log that you process and adjust the global rules that are applied to the log before the subsites are processed.
Perform the following steps to make these adjustments:
  1. Open the renamed subsite job.
  2. Scroll through the job on the Diagram tab.
    Note the following components:
    • the section that validates the source clickstream log and applies global rules
    • the transformations that isolate subsites, identify sessions, and generate subsite output tables
    • the section that parses the source clickstream log as a whole, identifies sessions, and generates an output table for all of the log data
    For an overview of how the job is processed, see Stages in the Subsite Template Job.
  3. Open the File Location tab in the properties window for the Clickstream Log transformation and review the file path to the clickstream log in the File name field. Specify another path if you need to process a different log. Click OK to close the properties window when you are finished.
  4. Open the Rules tab in the properties window for the Clickstream Parse - Global Rules transformation.
    Note that the following rules are enabled:
    • Filter graphics pages
    • Filter non-pages
    • Filter spiders by user agent
    To display the properties window for any of these rules, right-click the rule and click Properties in the pop-up menu. You can make any needed changes in the Rule Properties window. For example, to edit the types of graphics files that are filtered by the Filter graphics pages rule, open the properties window for the rule and click Search Options next to the Column field under the Column search radio button. Close the windows that you have opened before you return to the application.
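
As a conceptual illustration only, the effect of the three enabled global rules can be sketched in Python. This is not how the Clickstream Parse transformation is implemented. The LogRecord type, the function names, and the extension and user-agent lists are all hypothetical; the values that actually apply are whatever is configured in the Rule Properties window for each rule.

```python
from dataclasses import dataclass

# Hypothetical values for illustration; the real lists are set in the
# Rule Properties window of each rule.
GRAPHICS_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".ico")
NON_PAGE_EXTENSIONS = (".css", ".js")
SPIDER_MARKERS = ("bot", "crawler", "spider")


@dataclass
class LogRecord:
    """Hypothetical, minimal view of one clickstream log entry."""
    url: str
    user_agent: str


def request_path(url: str) -> str:
    """Return the lowercased URL with any query string removed."""
    return url.split("?", 1)[0].lower()


def keep_record(record: LogRecord) -> bool:
    """Return True if the record passes all three filters."""
    path = request_path(record.url)
    if path.endswith(GRAPHICS_EXTENSIONS):   # Filter graphics pages
        return False
    if path.endswith(NON_PAGE_EXTENSIONS):   # Filter non-pages
        return False
    agent = record.user_agent.lower()
    if any(marker in agent for marker in SPIDER_MARKERS):  # Filter spiders by user agent
        return False
    return True


def apply_global_rules(records):
    """Keep only the records that pass every filter."""
    return [record for record in records if keep_record(record)]
```

In this sketch, a request for an HTML page from an ordinary browser is kept, while requests for image files, style sheets, or scripts, and any request whose user agent contains a spider marker, are dropped before later processing.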

Run the Job and Examine the Output

Perform the following steps to run a subsite job and examine its output:
  1. Run the job.
    The following display shows a successfully completed sample job:
    Completed Subsite Job
  2. If the job completes without error, right-click the output table for one of the subsites and click Open in the pop-up menu.
    The View Data window for the table appears, as shown in the following display:
    Single Subsite Output
  3. Right-click the ALL_SUBSITES table and click Open in the pop-up menu.
    The following display shows the View Data window for the table:
    Output from All Subsites
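
As a rough mental model of the per-subsite output tables, the following Python sketch groups request URLs by subsite. It assumes that a subsite is identified by a URL path prefix, which is an assumption made for illustration and not a statement of how the template job defines subsites. The prefix mapping and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping of subsite name to URL path prefix (assumption).
SUBSITE_PREFIXES = {
    "support": "/support/",
    "store": "/store/",
}


def subsite_for(url: str):
    """Return the name of the subsite that the URL belongs to, or None."""
    for name, prefix in SUBSITE_PREFIXES.items():
        if url.startswith(prefix):
            return name
    return None


def split_by_subsite(urls):
    """Group URLs into one list per subsite, skipping URLs outside every subsite."""
    tables = defaultdict(list)
    for url in urls:
        name = subsite_for(url)
        if name is not None:
            tables[name].append(url)
    return dict(tables)
```

Each list in the result plays the role of one subsite output table. URLs that match no prefix appear in none of them, which is why the job also builds a separate table from the log as a whole.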