space
Previous Page | Next Page

SYSTEM 2000 Software

SYSTEM 2000 Databases


Overview of Database Definition

A SYSTEM 2000 database is hierarchical because you can store and access data according to organized relationships in groups of associated data. When a SYSTEM 2000 database is created, a plan called a database definition is devised, in which

Usually, a database is organized according to the types of data and the way you want to use the data. To create descriptor files for the SAS/ACCESS interface, you must understand and be familiar with the contents of the database and its organization in order to retrieve and update information accurately and efficiently. Database Definition for the Database EMPLOYEE shows an excerpt of the database definition for EMPLOYEE, which is the output from a DESCRIBE statement in SYSTEM 2000. (For a complete listing of the database definition for EMPLOYEE, see Example Programs.) More information about a database definition is given in the sections that follow Output 2.1.

Database Definition for the Database EMPLOYEE

  SYSTEM RELEASE NUMBER  12.1
  DATA BASE NAME IS      EMPLOYEE
  DEFINITION NUMBER             2
  DATA BASE CYCLE NUMBER       25
       1*  EMPLOYEE NUMBER (INTEGER NUMBER 9999)
       2*  LAST NAME (CHAR X(10) WITH  FEW FUTURE OCCURRENCES )
       3*  FORENAME (NON-KEY CHAR X(20))
               .
               .
               .
      16*  ZIP CODE (CHAR X(5) WITH  FEW FUTURE OCCURRENCES )
     100*  POSITION WITHIN COMPANY (RECORD)
       101*  POSITION TITLE (NON-KEY CHAR X(10) IN 100)
       102*  DEPARTMENT (CHAR X(14) IN 100 WITH SOME FUTURE OCCURRENCES )
       103*  MANAGER (CHAR XXX IN 100 WITH  FEW FUTURE OCCURRENCES )
       104*  POSITION TYPE (CHAR X(12) IN 100 WITH SOME FUTURE OCCURRENCES )
       105*  START DATE (DATE IN 100)
       106*  END DATE (NON-KEY DATE IN 100)
       110*  SALARY WITHIN POSITION (RECORD IN 100)
         111*  PAY RATE (MONEY $9999.99 IN 110)
         112*  PAY SCHEDULE (CHAR X(7) IN 110)
         113*  EFFECTIVE DATE (DATE IN 110)
         114*  CURRENT DEDUCTION (NON-KEY MONEY $9999.99 IN 110)
                .
                .
                .
     400*  EDUCATIONAL BACKGROUND (RECORD)
       410*  EDUCATION (RECORD IN 400)
         411*  SCHOOL (CHAR X(15) IN 410)
         412*  DEGREE/CERTIFICATE (CHAR X(7) IN 410 WITH  FEW FUTURE OCCUR
               RENCES )
         413*  DATE COMPLETED (DATE IN 410)
         414*  MAJOR FIELD (NON-KEY CHAR X(16) IN 410)
         415*  MINOR FIELD (NON-KEY CHAR X(12) IN 410)

Database Names

The database name is a unique name, from 1 to 16 characters in length, that is assigned to a specific SYSTEM 2000 database definition. Each database also has one or more passwords associated with it.

To create descriptor files for the SAS/ACCESS interface, you must know the name and password for the SYSTEM 2000 database that you want to access.


Labeling Data

A database definition consists of schema items and schema records, which describe a blueprint for the type of data to be stored. For example, in the database definition for EMPLOYEE, EMPLOYEE NUMBER is a schema item and POSITION WITHIN COMPANY is a schema record.

A schema item names and defines the characteristics of a group of values. A schema item has a name, a type, and a picture (length). Each value stored in a SYSTEM 2000 database corresponds to a schema item. For example, the following schema item describes the numbers used as employee identification numbers. The four 9s indicate that each employee number can contain a maximum of four digits.

    1* EMPLOYEE NUMBER (INTEGER NUMBER 9999)

A schema record groups associated schema items. Schema records are explained in Grouping Schema Records.

Schema items and schema records are referred to as schema components, and each is identified by a component number and a component name, as shown in Figure 2.1. (A component number can also be referred to as a C-number, for example, C101.)

Schema Components

[Schema Components]

To access data stored in a SYSTEM 2000 database, you must specify either the component number or the component name. Both are unique in the database definition to avoid ambiguity. Each line in a database definition begins with the component number and the component name.

When you create descriptor files for SYSTEM 2000 databases, PROC ACCESS creates corresponding SAS variable names from the SYSTEM 2000 schema item names. You can then use the variable names in SAS procedures.


Grouping Schema Items

In a SYSTEM 2000 database, associated schema items are grouped by schema records. That is, different schema records store different groups of data, and a schema item belongs to only one schema record. Grouping associated schema items into schema records is similar to planning a form. A form is usually divided into sections with one section for each set of related data.

For example, look at the database definition for EMPLOYEE in Database Definition for the Database EMPLOYEE. Schema items C1 through C16 contain personal information about each employee. These items are grouped into one record, the ENTRY or C0 record. (The component number and name for the ENTRY record is not listed in a database definition unless it has been renamed.) The schema items C101 through C106, which contain information about an employee's position, are grouped in schema record C100, POSITION WITHIN COMPANY.


Grouping Schema Records

Schema record relationships are established by arranging the schema records into levels. Each schema record is placed at a specific level, which creates a hierarchical structure.

These levels are achieved by ranking schema items with values that occur only one time per employee (such as an employee's name and address) at a higher level than schema items with multiple values (such as an employee's job titles and salaries). That is, schema items having a one-to-many relationship with other schema items rank higher in the database hierarchy than the other schema items.

Look at the database definition for EMPLOYEE shown in Levels in a Database Definition. Notice that schema items are indented under their parent schema record, and schema records are indented farther to the right. This reflects the relationships among the records. For example:

Levels in a Database Definition

[Levels in Database Definition]

The next set of terms refers to the relationships between the levels, which are like relationships in a family.

Schema records are disjoint if their paths are different. When you create a view descriptor, you cannot include items that are from disjoint schema records. For example, items from C110 and items from C410 cannot be included in the same view descriptor.


Logical Entries

A database consists of groups of logically related data called logical entries. The database definition serves as a pattern to create logical entries for the database and to interpret them. A logical entry contains groups of related data called data records. A data record is an identifiable set of values that are treated as a unit and associated with a schema record.

For example, in the database EMPLOYEE, logical entries contain data about employees; all data records that pertain to one employee make up a single logical entry. Each logical entry has a data record for personal data (such as the employee's name, address, and birthday), and a data record that pertains to the position that the employee holds in the company (such as title, department, manager, and pertinent dates). If the employee held several positions in the company, there is a data record for each position.

Using the layout of the database definition, Logical Entry shows the schema items with values for one employee. David Reid held two positions: programmer and assistant programmer. In addition, he has three salary data records for his programmer position.

Logical Entry

[Logical Entry]

When you use a view descriptor in a SAS program to access a SYSTEM 2000 database, you must be familiar with the database structure in order to understand how the interface view engine maps a SYSTEM 2000 logical entry into multiple SAS observations and back again. This process is explained in the next section.


Mapping Data between SAS and SYSTEM 2000

When you create a view descriptor to access data stored in a SYSTEM 2000 database, you define one path in the database. For example, using the database EMPLOYEE, you can define a view descriptor that includes the items LAST NAME, FORENAME, POSITION TITLE, and PAY RATE. When you access the data using the view descriptor, the interface view engine maps the specified path for each logical entry into multiple observations. Logical Entry Mapped into SAS Observations shows the logical entry for David Reid mapped into SAS observations.

Logical Entry Mapped into SAS Observations

  
   LASTNAME   FORENAME   POSITION               PAYRATE

   REID       DAVID G.   ASSISTANT PROGRAMMER   $1,000.00
   REID       DAVID G.   PROGRAMMER             $1,100.00
   REID       DAVID G.   PROGRAMMER             $1,200.00
   REID       DAVID G.   PROGRAMMER             $1,300.00

When browsing SYSTEM 2000 data, such as with the FSVIEW procedure, the results would be similar to those shown in Logical Entry Mapped into SAS Observations. (See Browsing and Updating SYSTEM 2000 Data for more information.)


Null Data (Missing Values)

A logical entry does not need data at every level of the database definition. A logical entry can contain nulls, that is, missing values or empty records.

Note:   In SAS, nulls are referred to as missing values. SYSTEM 2000 and SAS handle nulls (missing values) differently. However, the interface view engine takes care of the differences in a predictable, useful way. See Missing Values (Nulls) for a discussion of the differences.  [cautionend]

space
Previous Page | Next Page | Top of Page