Star joins
are useful when you query information from dimensional models that
are constructed of two or more dimension tables that surround a centralized
fact table, which is known as a star schema. SAS SPD Server star joins
are queries that validate, optimize, and execute SQL queries in the
SAS SPD Server database for performance. If the star join is not used,
the SQL is processed in the SAS SPD Server by using pair-wise joins,
which require one step for each table to complete the join. When the
SAS SPD Server options are set, the star join is enabled.
You must
meet the following requirements in order to enable a star join SAS
SPD Server:
-
All dimension tables must surround
a single fact table.
-
Dimension-to-fact table joins must
be equal joins, and there should be one join per dimension table.
-
You must have two or more dimension
tables in the join condition.
-
The fact table must have at least
one subsetting condition placed on it.
-
All subsetting and join conditions
must be specified in the WHERE clause.
-
Star join optimization must be
enabled through the setting of options on the SAS SPD Server library.
In order
to enable star join optimization, code that runs on the generated
Pass SAS SPD Server system library must have the following options
added to the library:
Here is
a commented example of a WHERE clause that enables a SAS SPD Server
star join optimization:
where
/* dimension1 equi-joined on the fact */
hh_&statesimple.geosur = hh_dim_geo_&statesimple.geosur
/* dimension2 equi-joined on the fact */
and hh_&statesimple.utilsur = hh_dim_utility_&statesimple.utilsur
/* dimension3 equi-joined on the fact */
and hh_dim_family_&statesimple.famsur =
hh_dim_family_&statesimple.famsur
/* subsetting condition on the fact */
and hh_dim_family_&statesimple.PERSONS = 1
;
Note: The SAS SPD
Server requires all subsetting to be implemented on the
Where tab in the SQL Join transformation. For more information
about SAS SPD Server support for star joins, see the
SAS
Scalable Performance Data Server 4.4: User's Guide. When
the code is properly configured, the following output is generated
in the log:
SPDS_NOTE: STARJOIN optimization used in
SQL execution