Glossary :: SAS(R) Visual Analytics 7.2: User's Guide

Apache Hadoop

a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model.

bar chart

a chart that consists of a grid and a set of vertical or horizontal bars. Each bar represents quantitative data.

bar-line chart

a bar chart with an overlaid line graph.

box plot

a graphical display of five statistics (the minimum, lower quartile, median, upper quartile, and maximum) that summarize the distribution of a set of data. The lower quartile (25th percentile) is represented by the lower edge of the box, and the upper quartile (75th percentile) is represented by the upper edge of the box. The median (50th percentile) is represented by a central line that divides the box into sections. The extreme values are represented by whiskers that extend out from the edges of the box.
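
As a rough illustration of the five statistics behind a box plot, the following Python sketch computes them for a small sample. The function name `five_number_summary` and the sample values are hypothetical, and the quartile method shown (`inclusive`) is only one of several conventions; plotting software may compute the box edges slightly differently.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    # method="inclusive" is one common quartile convention (an assumption here);
    # other conventions can give slightly different box edges.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)
```

The second and fourth values correspond to the lower and upper edges of the box, the third to the central line, and the first and last to the ends of the whiskers.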

calculated column

a column that does not exist in any of the tables that are accessed, but which is created as a result of a column expression.
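
A minimal sketch of the idea, assuming rows are held as Python dictionaries: the `profit` column below is not stored in the input table and exists only as the result of an expression. The table contents and the helper name `add_calculated_column` are hypothetical.

```python
rows = [
    {"product": "A", "revenue": 120.0, "cost": 80.0},
    {"product": "B", "revenue": 200.0, "cost": 150.0},
]

def add_calculated_column(table, name, expression):
    """Return new rows that carry an extra column computed from each row."""
    # The source rows are left unchanged; the new column lives only in the result.
    return [{**row, name: expression(row)} for row in table]

with_profit = add_calculated_column(
    rows, "profit", lambda r: r["revenue"] - r["cost"]
)
print(with_profit)
```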

capability

an application feature that is under role-based management. Typically, a capability corresponds to a menu item or button. For example, a Report Creation capability might correspond to a New Report menu item in a reporting application. Capabilities are assigned to roles.

cell

a distinct rectangular subregion of a graph that can contain plots, text, or legends.

choropleth map

a two-dimensional map that uses color and fill pattern combinations to represent different categories or levels of magnitude.

co-located data provider

a distributed data source, such as SAS Visual Analytics Hadoop or a third-party vendor database, that has SAS High-Performance Analytics software installed on the same machines. The SAS software on each machine processes the data that is local to the machine or that the data source makes available as the result of a query.

crosstab

a two-dimensional table that shows frequency distributions or other aggregate statistics for the intersections of two or more category data items. In a crosstabulation table, categories are displayed on both the columns and rows, and each cell value represents the data result from the intersection of the categories on the specific row and column.
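
The following Python sketch shows one way a frequency crosstab could be built from two category data items. The region and product values are made up for illustration, and the nested-dictionary layout is an assumption rather than how any particular product stores the result.

```python
from collections import Counter

# Each record pairs a row category (region) with a column category (product).
records = [
    ("East", "Chairs"), ("East", "Desks"), ("West", "Chairs"),
    ("East", "Chairs"), ("West", "Desks"), ("West", "Desks"),
]

def crosstab(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    # Each cell holds the frequency for one row/column intersection.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

table = crosstab(records)
print(table)
```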

data brushing

a feature that enables you to select data values in a report object or visualization, and to readily see the corresponding data values highlighted in other report objects or visualizations.

data item

an item in a data source that is either a logical view of a data field or a calculation. The author of a report decides which data items to use in a particular section of a report. There are three types of data items: hierarchies, categories, and measures.

data source

a table, view, or file from which you will extract information. Sources can be in any format that SAS can access, on any supported hardware platform. The metadata for a source is typically an input to a job.

dependency

a trigger condition that must be met before a job can run in a scheduled flow.

deployed job

a job that has been saved in a deployment directory and can be scheduled.

deployment directory

the location for generated SAS DATA step programs that will be executed by the batch server as part of a scheduled flow.

file event

a file-related occurrence that is used as a trigger in a scheduled flow. For example, a file event occurs when a scheduling server determines that a specified file exists.

filter

specified criteria that are applied to data in order to identify the subset of data for a subsequent operation, such as continued processing.
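
As a small sketch, a filter can be thought of as a test that each row either passes or fails, with only the passing rows handed to the next operation. The data and the helper name `apply_filter` are hypothetical.

```python
orders = [
    {"region": "East", "amount": 250},
    {"region": "West", "amount": 90},
    {"region": "East", "amount": 40},
]

def apply_filter(rows, criteria):
    """Keep only the rows for which the criteria function returns True."""
    return [row for row in rows if criteria(row)]

# Subset: orders from the East region with an amount of at least 100.
subset = apply_filter(orders, lambda r: r["region"] == "East" and r["amount"] >= 100)

# A subsequent operation then works on the subset only.
subset_total = sum(r["amount"] for r in subset)
print(subset, subset_total)
```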

flow

a set of jobs and associated dependencies that is scheduled in the Schedule Manager plug-in in SAS Management Console.

heat map

a graphical representation of data where the values taken by a variable in a two-dimensional map are represented as colors.

job

a collection of SAS tasks that can create output.

job event

a job-related occurrence that is used as a trigger in a scheduled flow. For example, a job event occurs when the scheduling server issues a command to determine whether a job ran successfully.

job flow

a group of jobs and their dependencies, including dependencies on other jobs, on files, or on specified dates and times.
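
To illustrate how dependencies between jobs constrain the order in which a flow can run, the sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The job names are hypothetical, and only job-to-job dependencies are modeled; dependencies on files or on dates and times are not shown.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs; each job maps to the set of jobs it depends on.
dependencies = {
    "extract": set(),
    "load": {"extract"},
    "summarize": {"load"},
    "report": {"summarize"},
}

# static_order() yields each job only after all of its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```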

join condition

a combination of join keys and a comparison operator.
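
A minimal sketch of a join condition, assuming a simple nested-loop join over lists of dictionaries: the join keys are `cust_id` and `order_cust`, and the comparison operator is equality. All table and column names are hypothetical.

```python
import operator

def join(left, right, left_key, right_key, compare=operator.eq):
    """Pair up rows whose join keys satisfy the comparison operator."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if compare(l[left_key], r[right_key])  # the join condition
    ]

customers = [{"cust_id": 1, "name": "Ada"}, {"cust_id": 2, "name": "Lin"}]
orders = [
    {"order_cust": 1, "total": 30},
    {"order_cust": 1, "total": 45},
    {"order_cust": 3, "total": 10},
]

matched = join(customers, orders, "cust_id", "order_cust")
print(matched)
```

Passing a different operator, such as `operator.lt`, would give a non-equality join condition over the same keys.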

list table

a two-dimensional representation of data, in which the data values are arranged in rows and columns.

local data

data that is accessible through the file systems on a computer. This includes data on hard drives or available through network file systems.

locale

a setting that reflects the language, local conventions, and culture for a geographic region. Local conventions can include specific formatting rules for paper sizes, dates, times, and numbers, and a currency symbol for the country or region. Some examples of locale values are French_Canada, Portuguese_Brazil, and Chinese_Singapore.

localization

the process of adapting software for a particular geocultural region (locale). Translation of the user interface, system messages, and documentation is a large part of the localization process.

pie chart

a circular chart that is divided into slices by radial lines. Each slice represents the relative contribution of one part to the whole.

query

a set of instructions that requests particular information from one or more data sources.

remote data

data that is not accessible through the file systems available to a computer. To use remote data, you must direct a SAS server to access the data that is available through file systems on the remote machine.

report

output that is generated by running custom SAS code against the data in your project.

role

a set of capabilities within an application that are targeted to a particular group of users.

SAS Management Console

a Java application that provides a single user interface for performing SAS administrative tasks.

SAS Stored Process

a SAS program that is stored on a server and defined in metadata, and which can be executed by client applications. Short form: stored process.

scatter plot

a two- or three-dimensional plot that shows the joint variation of two (or three) variables from a group of table rows. The coordinates of each point in the plot correspond to the data values for a single table row (observation).

scatter plot matrix

a grid of scatter plots showing pairwise combinations of multiple numeric variables.

scheduling server

a server that runs deployed jobs in a scheduled flow. Before running a job, the scheduling server determines whether the schedule for the deployed job and all of the dependencies for the job have been met.

source

See data source.

subquery

a query-expression that is nested as part of another query-expression. Depending on the clause that contains it, a subquery can return a single value or multiple values.
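
The difference between a subquery that returns a single value and one that returns multiple values can be sketched with Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python; the `sales` table and its contents are hypothetical, and SQL dialects differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('East', 100), ('East', 300), ('West', 200), ('West', 50);
""")

# Subquery in a comparison: it returns a single value (the overall average).
above_average = conn.execute(
    "SELECT region, amount FROM sales "
    "WHERE amount > (SELECT AVG(amount) FROM sales) "
    "ORDER BY amount"
).fetchall()

# Subquery in an IN clause: it can return multiple values (a set of regions).
rows_in_big_regions = conn.execute(
    "SELECT COUNT(*) FROM sales "
    "WHERE region IN (SELECT region FROM sales WHERE amount >= 300)"
).fetchone()[0]

conn.close()
print(above_average, rows_in_big_regions)
```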

time series

an ordered sequence of values of a variable that are observed at equally spaced time intervals.
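
As a simple sketch of the "equally spaced" requirement, the check below compares the gaps between consecutive observation dates. The dates and the function name `is_equally_spaced` are hypothetical; calendar intervals such as months vary in length in days and would need different handling.

```python
from datetime import date

def is_equally_spaced(timestamps):
    """Return True if every consecutive pair of timestamps has the same gap."""
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(gaps) <= 1

weekly = [date(2015, 1, 5), date(2015, 1, 12), date(2015, 1, 19)]
irregular = [date(2015, 1, 5), date(2015, 1, 12), date(2015, 1, 20)]

print(is_equally_spaced(weekly), is_equally_spaced(irregular))
```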

Unicode

a character encoding system that is the industry standard for supporting the interchange, processing, and display of characters and symbols from most of the world's writing systems.

user role

See role.

UTF-8

a variable-width encoding that represents each Unicode character as a sequence of one to four 8-bit bytes. This format supports all of the world's languages, including those that use non-Latin 1 characters.
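
As a rough illustration of how UTF-8 stores characters in 8-bit units, the Python sketch below encodes four sample characters and records how many bytes each one needs. The sample characters are chosen only for illustration.

```python
samples = [
    "A",           # Latin capital letter A
    "\u00e9",      # e with acute accent
    "\u20ac",      # euro sign
    "\U0001F600",  # an emoji outside the first 65,536 code points
]

# Number of 8-bit bytes that UTF-8 uses for each character.
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(byte_lengths)

# Decoding the bytes gives back the original character.
round_trip = [ch.encode("utf-8").decode("utf-8") for ch in samples]
```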

visual exploration

a metadata object that contains visualizations and data settings that are saved from a session of SAS Visual Analytics Explorer.

visualization

an interactive visual representation of data. A visualization can be a table, a chart, or a geographic map.

waterfall chart

a form of data visualization that is used to understand or explain the cumulative effect on an initial value of sequentially introduced positive or negative values. Usually, the initial and the final values are represented by whole columns, and the intermediate values are denoted by floating columns.
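
The cumulative logic behind a waterfall chart can be sketched as a running total. In the Python example below (which requires Python 3.8 or later), the function name `waterfall_segments` and the figures are hypothetical; each intermediate change becomes a floating column that spans from the previous running total to the new one.

```python
from itertools import accumulate

def waterfall_segments(initial, changes):
    """Return the (start, end) span of each floating column and the final value."""
    # Running totals: the initial value, then the total after each change.
    totals = list(accumulate(changes, initial=initial))
    segments = list(zip(totals, totals[1:]))
    return segments, totals[-1]

segments, final_value = waterfall_segments(100, [40, -25, 10])
print(segments, final_value)
```

The initial value (100) and the final value would be drawn as whole columns, and each pair in `segments` as a floating column between them.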