Glossary
- Apache Hadoop
-
a framework that allows for the distributed processing
of large data sets across clusters of computers using a simple programming
model.
- bar chart
-
a chart that consists of a grid and some vertical
or horizontal columns (bars). Each column represents quantitative
data.
- bar-line chart
-
a bar chart with an overlaid line graph.
- box plot
-
a graphical display of five statistics (the minimum,
lower quartile, median, upper quartile, and maximum) that summarize
the distribution of a set of data. The lower quartile (25th percentile)
is represented by the lower edge of the box, and the upper quartile
(75th percentile) is represented by the upper edge of the box. The
median (50th percentile) is represented by a central line that divides
the box into sections. The extreme values are represented by whiskers
that extend out from the edges of the box.
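As an illustration only, the five statistics behind a box plot can be computed with the Python standard library. This is a minimal sketch: the function name `five_number_summary` and the sample values are hypothetical, and the "inclusive" quartile convention is an assumption, because different tools compute quartiles slightly differently.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    # Assumption: the "inclusive" method; other quartile conventions can
    # place the box edges at slightly different values.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical sample data.
summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)
```

The second and fourth values give the lower and upper edges of the box, the third value gives the central line, and the first and last values give the ends of the whiskers.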
- calculated column
-
a column that does not exist in any of the tables
that are accessed, but which is created as a result of a column expression.
- capability
-
an application feature that is under role-based
management. Typically, a capability corresponds to a menu item or
button. For example, a Report Creation capability might correspond
to a New Report menu item in a reporting application. Capabilities
are assigned to roles.
- cell
-
a distinct rectangular subregion of a graph that
can contain plots, text, or legends.
- choropleth map
-
a two-dimensional map that uses color and fill
pattern combinations to represent different categories or levels of
magnitude.
- co-located data provider
-
a distributed data source, such as SAS Visual
Analytics Hadoop or a third-party vendor database, that has SAS High-Performance
Analytics software installed on the same machines. The SAS software
on each machine processes the data that is local to the machine or
that the data source makes available as the result of a query.
- crosstab
-
a two-dimensional table that shows frequency distributions
or other aggregate statistics for the intersections of two or more
category data items. In a crosstabulation table, categories are displayed
on both the columns and rows, and each cell value represents the data
result from the intersection of the categories on the specific row
and column.
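As an illustration only, a frequency crosstab of two category data items can be built by counting how often each pair of categories occurs. This is a minimal sketch in Python: the `crosstab` function, the region and product categories, and the sample rows are all hypothetical.

```python
from collections import Counter

# Hypothetical input: one (region, product) pair per sale.
rows = [
    ("East", "Widget"), ("East", "Gadget"), ("West", "Widget"),
    ("East", "Widget"), ("West", "Gadget"), ("West", "Widget"),
]

def crosstab(pairs):
    """Map each row category to a dict of column category -> frequency."""
    counts = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    # Each cell holds the count for one row/column intersection.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

table = crosstab(rows)
for region, cells in table.items():
    print(region, cells)
```

Each printed line corresponds to one row of the crosstab, and each inner value is the cell at the intersection of that row category and a column category.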
- data brushing
-
a feature that enables you to select data values
in a report object or visualization, and to readily see the corresponding
data values highlighted in other report objects or visualizations.
- data item
-
an item in a data source that is either a logical
view of a data field or a calculation. The author of a report decides
which data items to use in a particular section of a report. There
are three types of data items: hierarchies, categories, and measures.
- data source
-
a table, view, or file from which you will extract
information. Sources can be in any format that SAS can access, on
any supported hardware platform. The metadata for a source is typically
an input to a job.
- dependency
-
a trigger condition that must be met before a
job can run in a scheduled flow.
- deployed job
-
a job that has been saved in a deployment directory
and can be scheduled.
- deployment directory
-
the location for generated SAS DATA step programs
that will be executed by the batch server as part of a scheduled flow.
- file event
-
a file-related occurrence that is used as a trigger
in a scheduled flow. For example, a file event occurs when a scheduling
server determines that a specified file exists.
- filter
-
specified criteria that are applied to data in
order to identify the subset of data for a subsequent operation, such
as continued processing.
- flow
-
a set of jobs and associated dependencies that
is scheduled in the Schedule Manager plug-in in SAS Management Console.
- heat map
-
a graphical representation of data in which the values
taken by a variable in a two-dimensional map are represented as colors.
- job
-
a collection of SAS tasks that can create output.
- job event
-
a job-related occurrence that is used as a trigger
in a scheduled flow. For example, a job event occurs when the scheduling
server issues a command to determine whether a job ran successfully.
- job flow
-
a group of jobs and their dependencies, including
dependencies on other jobs, on files, or on specified dates and times.
- join condition
-
a combination of join keys and a comparison operator.
- list table
-
a two-dimensional representation of data, in which
the data values are arranged in rows and columns.
- local data
-
data that is accessible through the file systems
on a computer. This includes data on hard drives or available through
network file systems.
- locale
-
a setting that reflects the language, local conventions,
and culture for a geographic region. Local conventions can include
specific formatting rules for paper sizes, dates, times, and numbers,
and a currency symbol for the country or region. Some examples of
locale values are French_Canada, Portuguese_Brazil, and Chinese_Singapore.
- localization
-
the process of adapting software for a particular
geocultural region (locale). Translation of the user interface, system
messages, and documentation is a large part of the localization process.
- pie chart
-
a circular chart that is divided into slices by
radial lines. Each slice represents the relative contribution of one
part to the whole.
- query
-
a set of instructions that requests particular
information from one or more data sources.
- remote data
-
data that is not accessible through the file systems
available to a computer. To use remote data, you must direct a SAS
server to access the data that is available through file systems on
the remote machine.
- report
-
output that is generated by running custom SAS
code against the data in your project.
- role
-
a set of capabilities within an application that
are targeted to a particular group of users.
- SAS Management Console
-
a Java application that provides a single user
interface for performing SAS administrative tasks.
- SAS Stored Process
-
a SAS program that is stored on a server and defined
in metadata, and which can be executed by client applications. Short
form: stored process.
- scatter plot
-
a two- or three-dimensional plot that shows the
joint variation of two (or three) variables from a group of table
rows. The coordinates of each point in the plot correspond to the
data values for a single table row (observation).
- scatter plot matrix
-
a grid of scatter plots showing pairwise combinations
of multiple numeric variables.
- scheduling server
-
a server that runs deployed jobs in a scheduled
flow. Before running a job, the scheduling server determines whether
the schedule for the deployed job and all of the dependencies
for the job have been met.
- source
-
See data source
- subquery
-
a query-expression that is nested as part of another
query-expression. Depending on the clause that contains it, a subquery
can return a single value or multiple values.
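As an illustration only, the two cases can be shown with the SQLite engine that ships with Python. This is a minimal sketch: the `orders` table and its rows are hypothetical, and the SQL dialect is SQLite rather than SAS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ann", 10.0), (2, "bob", 30.0), (3, "ann", 50.0), (4, "cy", 20.0)],
)

# Subquery in a comparison: it returns a single value (the average amount).
above_average = conn.execute(
    "SELECT id FROM orders "
    "WHERE amount > (SELECT AVG(amount) FROM orders) ORDER BY id"
).fetchall()

# Subquery in an IN clause: it can return multiple values
# (every customer who has at least one order over 25).
orders_of_big_customers = conn.execute(
    "SELECT id FROM orders WHERE customer IN "
    "(SELECT customer FROM orders WHERE amount > 25) ORDER BY id"
).fetchall()

print(above_average)
print(orders_of_big_customers)
```

The first query compares each row against one value, so its subquery returns exactly one. The second query tests membership, so its subquery can return any number of values.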
- time series
-
an ordered sequence of values of a variable that
are observed at equally spaced time intervals.
- Unicode
-
a character encoding standard that is used across the industry
to support the interchange, processing, and display of characters
and symbols from most of the world's writing systems.
- user role
-
See role
- UTF-8
-
a variable-width method for encoding Unicode characters
as sequences of one to four 8-bit bytes. This format supports all of the world's
languages, including those that use non-Latin-1 characters.
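As an illustration only, the following Python sketch shows how many 8-bit bytes UTF-8 uses for a few characters. The four sample characters (a Latin letter, an accented letter, the euro sign, and an emoji) are arbitrary choices made for this example.

```python
# Sample characters written as escapes: "A", e-acute, euro sign, grinning face.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

lengths = {}
for ch in samples:
    encoded = ch.encode("utf-8")      # bytes object holding the UTF-8 form
    lengths[ch] = len(encoded)
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ')} ({len(encoded)} byte(s))")
```

Characters in the ASCII range occupy a single byte, and other characters occupy two, three, or four bytes.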
- visual exploration
-
a metadata object that contains visualizations
and data settings that are saved from a session of SAS Visual Analytics
Explorer.
- visualization
-
an interactive visual representation of data.
A visualization can be a table, a chart, or a geographic map.
- waterfall chart
-
a form of data visualization that is used to understand
or explain the cumulative effect on an initial value of sequentially
introduced positive or negative values. Usually, the initial and the
final values are represented by whole columns, and the intermediate
values are denoted by floating columns.
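As an illustration only, the column positions in a waterfall chart can be derived from an initial value and a sequence of changes. This is a minimal sketch in Python: the function name `waterfall_columns` and the sample numbers are hypothetical, and the drawing itself is left to a charting tool.

```python
from itertools import accumulate

def waterfall_columns(initial, changes):
    """Return (bottom, top) pairs: initial column, floating columns, final column."""
    # Running totals after each change, starting from the initial value.
    totals = list(accumulate(changes, initial=initial))
    columns = [(0, initial)]                      # whole column for the initial value
    for before, after in zip(totals, totals[1:]):
        # A floating column spans the running total before and after a change.
        columns.append((min(before, after), max(before, after)))
    columns.append((0, totals[-1]))               # whole column for the final value
    return columns

# Hypothetical data: start at 100, then +30, -50, +20.
cols = waterfall_columns(100, [30, -50, 20])
print(cols)
```

The first and last pairs start at zero, which matches the whole columns for the initial and final values. The pairs in between start at the previous running total, which is why those columns appear to float.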
Copyright © SAS Institute Inc. All rights reserved.