
Working with Spatial Data

SAS Data Sets

A SAS data set is a collection of data values and their associated descriptive information, arranged in a form that SAS can recognize and process. SAS data sets can be data files or views. A SAS data file contains both the data values and the descriptor information that describes them.

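A SAS view contains descriptor information and the instructions that are required to retrieve data values that are stored elsewhere; it does not contain the data values themselves.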

SAS data sets can be indexed by one or more variables, known as key variables. A SAS index contains the data values of the key variables that are paired with location identifiers for the observations that contain the variables. The value and identifier pairs are ordered in a B-tree structure that enables the engine to search by value. SAS indexes are classified as simple or composite, according to the number of key variables they contain.
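The idea of pairing key values with observation locations can be sketched in Python. This is only an illustration: a sorted list searched with `bisect` stands in for the B-tree, and the function names `build_index` and `lookup`, along with the sample rows, are hypothetical rather than part of SAS.

```python
from bisect import bisect_left, bisect_right

def build_index(rows, key_vars):
    """Return sorted (key_values, row_number) pairs for the given key variables.

    One key variable gives a simple index; several give a composite index.
    Row numbers are 1-based, like SAS observation numbers.
    """
    pairs = [(tuple(row[k] for k in key_vars), n)
             for n, row in enumerate(rows, start=1)]
    pairs.sort()
    return pairs

def lookup(index, key_values):
    """Return the row numbers of the observations whose key equals key_values."""
    key_values = tuple(key_values)
    keys = [k for k, _ in index]
    lo = bisect_left(keys, key_values)
    hi = bisect_right(keys, key_values)
    return [n for _, n in index[lo:hi]]

# Made-up sample observations.
rows = [
    {"STATE": 37, "COUNTY": 183},
    {"STATE": 45, "COUNTY": 19},
    {"STATE": 37, "COUNTY": 63},
    {"STATE": 37, "COUNTY": 183},
]
simple = build_index(rows, ["STATE"])               # simple index
composite = build_index(rows, ["STATE", "COUNTY"])  # composite index
```

Searching by value then avoids a full scan of the data: `lookup(simple, [37])` touches only the matching pairs in the ordered structure.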

For more information about SAS data sets, SAS files, SAS views, and SAS indexes, refer to SAS Language Reference: Concepts.


SAS/GIS Data Sets

As a component of SAS, SAS/GIS stores all of its data in SAS data sets. The SAS/GIS spatial database works as one logical entity, but is physically separated into six different categories of data sets: chains, nodes, details, polygonal index, label, and attribute data sets.

A given SAS/GIS map can reference only one each of the chains, nodes, details, and label data sets, but it can reference multiple polygonal index and attribute data sets. Multiple SAS/GIS maps can use a single set of chains, nodes, and details data sets.


Chains Data Set

The chains data set contains coordinates for the polylines that are used to form line and polygon features. A polyline consists of a series of connected line segments that are chains. A chain is a sequence of two or more points in the coordinate space. The end points, the first and last points of the chain, must be nodes. Each chain has a direction, from the first point toward the last point. The first point in the chain is the from-node, and the last point is the to-node. Relative to its direction, a chain has a left side and a right side. Points between the from-node and the to-node are detail points, which serve to trace the curvature of the feature that is represented by the chain. Detail points are not nodes.

The chains data set also lists the from-node and to-node row numbers in the nodes data set, as well as the number of detail points and the corresponding details data set row number. The left and right side attribute values (for example, ZIP codes and FIPS codes) are also stored in the chains data set.


Nodes Data Set

The nodes data set contains the coordinates of the end points for the chains in the chains data set and the linkage information that is necessary to attach chains to the correct nodes. A node is a point in the spatial data with connections to one or more chains. Nodes can be discrete points or the end points of chains. A node definition can span multiple records in the nodes data set, so only the starting record number for a node is a node feature ID.


Details Data Set

The details data set stores the curvature points of a chain between the two end nodes, which are also called the from-node and the to-node. That is, the details data set contains all the coordinates between the intersection points of the chain. The node coordinates are not duplicated in the details data set. The details data set also contains the chains data set row number of the associated chain.
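The linkage among the chains, nodes, and details data sets can be sketched as follows. The layouts are deliberately simplified and the column names (`from_node`, `details_row`, and so on) are hypothetical, not the actual SAS/GIS variable names. The sketch also assumes that each node occupies a single row and that the detail points of a chain are stored consecutively starting at the row that the chain records.

```python
# Hypothetical, simplified layouts; row numbers are 1-based.
nodes = [
    {"x": 0.0, "y": 0.0},   # row 1
    {"x": 4.0, "y": 0.0},   # row 2
]
details = [
    {"x": 1.0, "y": 0.5, "chain_row": 1},   # row 1
    {"x": 3.0, "y": 0.5, "chain_row": 1},   # row 2
]
chains = [
    {"from_node": 1, "to_node": 2, "n_details": 2, "details_row": 1,
     "zip_left": "27511", "zip_right": "27513"},   # row 1
]

def chain_points(chain_row):
    """Return the full coordinate sequence of a chain, from-node to to-node."""
    chain = chains[chain_row - 1]
    start = nodes[chain["from_node"] - 1]
    end = nodes[chain["to_node"] - 1]
    first = chain["details_row"] - 1
    middle = details[first:first + chain["n_details"]]
    # Node coordinates come from the nodes data set only; the details
    # data set supplies just the intermediate curvature points.
    return [(p["x"], p["y"]) for p in [start, *middle, end]]
```

Because the end points live only in the nodes data set, chains that meet at a node share one set of coordinates instead of each carrying a copy.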


Polygonal Index Data Sets

The polygonal index data set contains one observation for each polygon that was successfully closed during the index creation process. It is called a polygonal index because each observation is an index to a polygon in the chains data set. That is, it points to the starting chain in the chains data set for each of the polygons.

If polygon areas, perimeter distances, and centroid locations were computed, then that information is also stored in the polygonal index data set.
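For a closed polygon, these three quantities can be derived from its ordered boundary points. The sketch below uses the standard shoelace formulas on planar coordinates. It is an assumption for illustration only: the source does not say how SAS/GIS computes these values, and the function name `polygon_stats` is hypothetical.

```python
from math import hypot

def polygon_stats(points):
    """Return (area, perimeter, centroid) of a simple closed polygon.

    points: ordered (x, y) vertices, without repeating the first vertex.
    Assumes planar coordinates and a nonzero area.
    """
    n = len(points)
    twice_area = 0.0
    cx = cy = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]   # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += hypot(x1 - x0, y1 - y0)
    area = twice_area / 2.0            # signed: positive if counterclockwise
    centroid = (cx / (6.0 * area), cy / (6.0 * area))
    return abs(area), perimeter, centroid
```

For longitude and latitude values, the results are in squared and linear degrees, so real distances would need a projection or a spherical formula first.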


Label Data Set

The label data set defines the attributes of labels to be displayed on the map. The attributes include all of the information that is applicable for each label, such as location, color, size, source of the text for a text label, as well as other behavioral and graphical attributes.


Attribute Data Sets

Attribute data sets contain values related to the map features. The observations in attribute data sets must be associated with observations in the chains data set. Attribute data is used to display themes on the map and for spatially oriented reports, graphs, map actions, and so forth.
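One way to picture this association is as a key match between an attribute variable and the left- and right-side values that the chains carry. The sketch below is illustrative only: the function `link_attributes`, the sample values, and the `row` column are hypothetical, and the source does not describe how SAS/GIS performs the match internally.

```python
# Made-up chains with left- and right-side ZIP code values.
chains = [
    {"row": 1, "ZIPL": "27511", "ZIPR": "27513"},
    {"row": 2, "ZIPL": "27513", "ZIPR": "27513"},
    {"row": 3, "ZIPL": "27511", "ZIPR": "27606"},
]
# Made-up attribute observations keyed by ZIP code.
attributes = [
    {"ZIP": "27511", "POPULATION": 1200},
    {"ZIP": "27513", "POPULATION": 3400},
    {"ZIP": "99999", "POPULATION": 10},
]

def link_attributes(chains, attributes, key, left_var, right_var):
    """Map each attribute key value to the chain rows that border it.

    Returns (linked, unmatched): linked maps key value -> chain rows,
    unmatched lists key values that no chain carries on either side.
    """
    by_value = {}
    for chain in chains:
        # A set avoids listing a chain twice when both sides match.
        for side_value in {chain[left_var], chain[right_var]}:
            by_value.setdefault(side_value, []).append(chain["row"])
    linked, unmatched = {}, []
    for obs in attributes:
        rows = by_value.get(obs[key])
        if rows:
            linked[obs[key]] = rows
        else:
            unmatched.append(obs[key])
    return linked, unmatched

linked, unmatched = link_attributes(chains, attributes, "ZIP", "ZIPL", "ZIPR")
```

An observation that lands in `unmatched` has no feature on the map to attach to, which is why attribute observations need a counterpart in the chains data set.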


Managing Data Set Sizes

By their nature, spatial databases tend to be rather large. Users of spatial data want as much detail in the maps as they can get, which increases the demands on storage and processing capacity. Spatial data that is not carefully managed can become too large for easy use.

Here are five actions that you can take to manage the size of your spatial data sets. You need to perform most of these actions before importing your data into SAS/GIS.

Of the five actions, reducing the number of attributes is the easiest to perform. Use the Import window, which you access by selecting Modify Composites from the GIS Spatial Data Importing window, to remove and drop unneeded composite variables from your data set as it is imported.


Import Type Specific Variables

The following tables describe the composites and variables that are created for each of the import types. All of the variables are located in the chains data set except for the X and Y variables, which are in the nodes data set.

Partial Listing of Composites and Variables Specific to the ArcInfo Interchange Import Type
| Composite | Variable 1 | Variable 2 | Type (table note 1) | Description |
| --- | --- | --- | --- | --- |
| ARCID | ARCIDL | ARCIDR | A or C | ARCID from the ArcInfo coverage. Maps made from line and point coverages do not have left and right variables. |
| ARCNUM | | | C | ARCNUM from the coverage. |
| 'COVERAGE' | 'COVERAGE'_L | 'COVERAGE'_R | A or C | This variable is derived from the input filename. It is the last word preceding the file extension. For example, /local/gisdata/montana.e00 would have a 'COVERAGE' (table note 2) name of montana. The left variable would be montanal, the right variable would be montanar, and the composite type would be Area. Line and point coverages do not have left- and right-side variables, and the composite type would be Classification. |
| AREA | AREAL | AREAR | A | AREA from the coverage. |
| PERIMETER | PERIML | PERIMR | A | PERIMETER from the coverage. |
| 'ATTRIB' | 'ATTRIB'L | 'ATTRIB'R | | All variables in the polygon, line, or point attribute tables are saved as composite variables. In the case of the polygon coverages, an L or an R is added to the end of the first five characters of the actual variable name. |
| _COVER_ | _COVEL | _COVER | A or C | This variable contains the name stored in the 'COVERAGE' variable. |
| _SRC_ | _SRCL | _SRCR | C | Contains the string 'ARC'. |
| X | X | | X | X coordinate. |
| Y | Y | | Y | Y coordinate. |

TABLE NOTE 1: Values for Type are as follows:

A Area
C Classification
X X coordinate
Y Y coordinate

TABLE NOTE 2: Names in single quotation marks, such as 'COVERAGE' and 'ATTRIB', are GIS composite names.
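The naming rules described in the table can be sketched in Python. This is a rough illustration, not the importer's actual logic: the function names are hypothetical, the whole base name of the file is treated as the "last word preceding the file extension", and `POPULATION` is a made-up attribute name.

```python
from pathlib import PurePosixPath

def coverage_names(path, is_polygon):
    """Derive the 'COVERAGE' composite name and its side variables from a path."""
    name = PurePosixPath(path).stem   # filename without directory or extension
    if is_polygon:
        return {"composite": name, "left": name + "l",
                "right": name + "r", "type": "Area"}
    # Line and point coverages have no left- and right-side variables.
    return {"composite": name, "left": None,
            "right": None, "type": "Classification"}

def side_variables(attribute, prefix_len=5):
    """Append L and R to the first prefix_len characters of an attribute name."""
    stem = attribute[:prefix_len]
    return stem + "L", stem + "R"

montana = coverage_names("/local/gisdata/montana.e00", is_polygon=True)
```

The `prefix_len` parameter defaults to five characters, the length that the table gives for polygon coverages of this import type.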

Partial Listing of Composites and Variables Specific to the Digital Line Graph (DLG) Import Type
| Composite | Variable 1 | Variable 2 | Type (table note 1) | Description |
| --- | --- | --- | --- | --- |
| LMAJOR(n) | LMAJOR(n) | | C | Major line attribute code. |
| LMINOR(n) | LMINOR(n) | | C | Minor line attribute code. |
| NMAJOR(n) | NMAJOR(n) | | C | Major node attribute code. |
| NMINOR(n) | NMINOR(n) | | C | Minor node attribute code. |
| MAJOR(n) | AMAJORR(n) | AMAJORL(n) | A | Major area attribute code. |
| MINOR(n) | AMINORL(n) | AMINORR(n) | A | Minor area attribute code. |
| X | X | | X | X coordinate. |
| Y | Y | | Y | Y coordinate. |

TABLE NOTE 1: Values for Type are as follows:

A Area
C Classification
X X coordinate
Y Y coordinate

Partial Listing of Composites and Variables Specific to the Drawing Interchange File (DXF) Import Type
| Composite | Variable 1 | Variable 2 | Type (table note 1) | Description |
| --- | --- | --- | --- | --- |
| 'ATTRIB' | 'ATTRIB'L | 'ATTRIB'R | A or C | All polygon, line, or point attributes are saved as composite variables. In the case of polygon maps, an L or R is added to the end of the first seven characters of the actual variable name. |

TABLE NOTE 1: Values for Type are as follows:

A Area
C Classification

Partial Listing of Composites and Variables Specific to the Genline Import Type
| Composite | Variable 1 | Variable 2 | Type (table note 1) | Description |
| --- | --- | --- | --- | --- |
| ID | ID | | C | The ID variable from the data set. |
| 'ATTRIB' | 'ATTRIB' | 'ATTRIB' | C | Any other variable in the data set is saved as a classification composite. |
| X | X | | X | X coordinate. |
| Y | Y | | Y | Y coordinate. |

TABLE NOTE 1: Values for Type are as follows:

C Classification
X X coordinate
Y Y coordinate

Partial Listing of Composites and Variables Specific to the Genpoint Import Type
| Composite | Variable 1 | Variable 2 | Type (table note 1) | Description |
| --- | --- | --- | --- | --- |
| ID | ID | | C | The ID variable from the data set. |
| 'ATTRIB' | 'ATTRIB' | 'ATTRIB' | C | Any other variable in the data set is saved as a classification composite. |
| X | X | | X | X coordinate. |
| Y | Y | | Y | Y coordinate. |

TABLE NOTE 1: Values for Type are as follows:

C Classification
X X coordinate
Y Y coordinate

Partial Listing of Composites and Variables Specific to the MapInfo Import Type
| Composite | Variable 1 | Variable 2 | Type (table note 1) | Description |
| --- | --- | --- | --- | --- |
| 'ATTRIB' | 'ATTRIB'L | 'ATTRIB'R | A or C | All polygon, line, or point attributes are saved as composite variables. In the case of polygon maps, an L or R is added to the end of the first seven characters of the actual variable name. |
| LINELYR | | | C | This variable is derived from the input filename. It is the last word preceding the file extension. For example, /local/gisdata/montana.mif would have a LINELYR name of montana. |
| PTLYR | | | C | This variable is derived from the input filename. It is the last word preceding the file extension. For example, /local/gisdata/montana.mif would have a PTLYR name of montana. |
| POLYLYR | | | A | This variable is derived from the input filename. It is the last word preceding the file extension. For example, /local/gisdata/montana.mif would have a POLYLYR name of montana. |
| 'MAP' | 'MAP'L | 'MAP'R | A or C | This variable is derived from the input filename. It is the last word preceding the file extension. For example, /local/gisdata/usa.mif would have a 'MAP' name of usa. The left variable would be usal, the right variable would be usar, and, in this case, the composite type would be Area. Line and point maps do not have left- and right-side variables, and the composite type would be Classification. |

TABLE NOTE 1: Values for Type are as follows:

A Area
C Classification

Partial Listing of Composites and Variables Specific to the SAS/GRAPH and Genpoly Import Types
| Composite | Variable 1 | Variable 2 | Type (table note 1) | Description |
| --- | --- | --- | --- | --- |
| 'IDVAR'n | 'IDVAR'L | 'IDVAR'R | A | An area composite variable is created for each ID variable (IDVAR) selected by the user in the ID vars list box. In the case of polygon maps, an L or R is added to the end of the first seven characters of the actual variable name. |

TABLE NOTE 1: Values for Type are as follows:

A Area

Composites and Variables Specific to the TIGER and DYNAMAP Import Types
| Composite | Variable 1 | Variable 2 | Variable 3 | Variable 4 | Type (table note 1) | Description |
| --- | --- | --- | --- | --- | --- | --- |
| ADDR | FRADDL | FRADDR | TOADDL | TOADDR | ADDR | Address range. |
| BLOCK | BLOCKL | BLOCKR | | | A | Block number. |
| CFCC | CFCC | | | | C | Feature classification code. |
| COUNTY | COUNTYL | COUNTYR | | | A | County FIPS code. |
| DIRPRE | DIRPRE | | | | ADDRP | Feature direction prefix. |
| DIRSUF | DIRSUF | | | | ADDRS | Feature direction suffix. |
| FEANAME | FEANAME | | | | C | Feature name. |
| MCD | MCDL | MCDR | | | A | Minor civil division. |
| PLACE | PLACEL | PLACER | | | A | Incorporated place code. |
| RECTYPE | RECTYPE | | | | C | Record type. |
| STATE | STATEL | STATER | | | A | State FIPS code. |
| TRACT | TRACTL | TRACTR | | | A | Census tract. |
| ZIP | ZIPL | ZIPR | | | A | ZIP code. |
| BG | BGL | BGR | | | A | Block group. |
| LONGITUDE | X | | | | X | Longitude. |
| LATITUDE | Y | | | | Y | Latitude. |

TABLE NOTE 1: Values for Type are as follows:

A Area
ADDR Address
ADDRP Address Prefix
ADDRS Address Suffix
C Classification
X Longitude
Y Latitude
