Constructing a SAS Scalable Performance Data Server Star Join

Problem

You want to construct SAS Scalable Performance Data (SPD) Server star joins.

Solution

You can use the SAS Data Integration Studio SQL Join transformation to construct SAS SPD Server star joins when you use SAS SPD Server version 4.3 or later.

Tasks

Construct an SPD Server Star Join

Star joins are useful when you query information from dimensional models that are constructed of two or more dimension tables that surround a centralized fact table, which is known as a star schema. SAS SPD Server star joins are queries that validate, optimize, and execute SQL queries in the SAS SPD Server database for performance. If the star join is not used, the SQL is processed in the SAS SPD Server by using pair-wise joins, which require one step for each table to complete the join. When the SAS SPD Server options are set, the star join is enabled.
You must meet the following requirements in order to enable a SAS SPD Server star join:
  • All dimension tables must surround a single fact table.
  • Dimension-to-fact table joins must be equal joins, and there should be one join per dimension table.
  • You must have two or more dimension tables in the join condition.
  • The fact table must have at least one subsetting condition placed on it.
  • All subsetting and join conditions must be specified in the WHERE clause.
  • Star join optimization must be enabled through the setting of options on the SAS SPD Server library.
In order to enable star join optimization, code that runs on the generated Pass SAS SPD Server system library must have the following options added to the library:
  • LIBGEN=YES*
  • IP=YES
Here is a commented example of a WHERE clause that enables a SAS SPD Server star join optimization:
where
/* dimension1 equi-joined on the fact */
   hh_&statesimple.geosur = hh_dim_geo_&statesimple.geosur
/* dimension2 equi-joined on the fact */
   and hh_&statesimple.utilsur = hh_dim_utility_&statesimple.utilsur
/* dimension3 equi-joined on the fact */
   and hh_dim_family_&statesimple.famsur =
hh_dim_family_&statesimple.famsur
/* subsetting condition on the fact */
   and hh_dim_family_&statesimple.PERSONS = 1
;
Note: The SAS SPD Server requires all subsetting to be implemented on the Where tab in the SQL Join transformation. For more information about SAS SPD Server support for star joins, see the SAS Scalable Performance Data Server: User's Guide. When the code is properly configured, the following output is generated in the log: SPDS_NOTE: STARJOIN optimization used in SQL execution.