Problem Formulation

Consider the following scenario. As part of your business, you sell books and other publications from your website. You web log data contains information about the navigation patterns that visitors make within your website. You want to examine the files that are requested. In addition, you want to determine the most commonly occurring paths that visitors take from your home page to the check-out page. In other words, you want to determine what paths visitors take when they make a purchase.
The data set SAMPSIO.WEBPATH contains a set of hypothetical web log data. The data set contains the following information:
Variable Name
Model Role
Measurement Level
Description
REFERRER
Rejected
Nominal
Referring URL
REQUESTED_FILE
Target
Nominal
File that the visitor clicked
SESSION_IDENTIFIER
ID
Nominal
Unique ID code for the session
SESSION_SEQUENCE
Sequence
Interval
Order in which files were requested within a session