What Is SAS LASR Analytic Server? :: SAS(R) LASR(TM) Analytic Server 2.6: Reference Guide

The SAS LASR Analytic Server is an analytic platform that provides a secure, multi-user environment for concurrent access to data that is loaded into memory. The server can take advantage of a distributed computing environment by distributing data and the workload among multiple machines and performing massively parallel processing. The server can also be deployed on a single machine where the workload and data volumes do not demand a distributed computing environment.
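The idea of distributing data and workload can be illustrated with a minimal Python sketch. It is not the server's implementation: the threads stand in for machines, and the names `partition`, `local_sum_count`, and `distributed_mean` are hypothetical. Each worker computes a partial result on its own slice of the rows, and the partial results are then combined.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_workers):
    """Deal rows round-robin into n_workers slices (one per hypothetical machine)."""
    return [rows[i::n_workers] for i in range(n_workers)]


def local_sum_count(part):
    """Work that each worker performs independently on its own slice."""
    return sum(part), len(part)


def distributed_mean(rows, n_workers=4):
    """Compute a mean by combining per-partition partial results."""
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(local_sum_count, parts))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


mean = distributed_mean(list(range(1, 101)))
```

Setting `n_workers=1` in the same sketch corresponds loosely to the single-machine case: the computation is unchanged, only the partitioning disappears.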

The server handles both big data and smaller sets of data, and it is built on high-performance, multi-threaded analytic code. The server processes client requests quickly because its hardware and software are designed for rapid access to tables in memory. By loading tables into memory for analytic processing, the server enables business analysts to explore data and discover relationships in data at the speed of RAM.

The server can also perform text analysis on unstructured data. The unstructured data is loaded into memory in the form of a table, with one document in each row. The TEXTPARSE statement in the IMSTAT procedure can then provide analysis similar to what is available with the HPTMINE procedure.
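The one-document-per-row layout can be pictured with a small Python sketch. This is an illustration of the table shape only and does not reproduce the TEXTPARSE statement or its output. The column names `doc_id` and `text`, the sample rows, and the simple lowercase tokenizer are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical in-memory table: one document in each row.
documents = [
    {"doc_id": 1, "text": "The server loads tables into memory."},
    {"doc_id": 2, "text": "Tables in memory are fast to query."},
]


def parse_terms(text):
    """Very simple tokenizer: lowercase alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def term_by_document(rows):
    """Map each document ID to the counts of the terms it contains."""
    return {row["doc_id"]: Counter(parse_terms(row["text"])) for row in rows}


counts = term_by_document(documents)

# Document frequency: the number of documents in which each term appears.
doc_freq = Counter(term for c in counts.values() for term in c)
```

A real text-parsing step typically does far more than this (for example, stemming and stop lists), so treat the sketch as a picture of the input and output shapes rather than of the analysis itself.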

Another use for the analytic platform that the server provides is to create a recommender system. Creating recommender systems introduces the concept of an application in the server. The recommender system contains the application and might contain four or five tables. Each of the tables can be used in different ways, depending on the task and which method you apply. For example, making an item-based prediction for a nearest-neighbor method requires different data structures than a singular-value decomposition does. You can associate a particular method or a set of methods with the application. You can execute one method or an ensemble. The flexibility provided by the server enables you to add and drop methods from the application. As a modeler, you want to explore and evaluate different methods and different parameter configurations for the methods until you have optimized the system for your purposes. Then, you can deploy the recommender system in an online scoring environment.
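To see why the method determines the data structure, consider a minimal Python sketch of an item-based nearest-neighbor prediction. It is a generic textbook formulation, not the server's recommender implementation. The ratings, user names, and function names are hypothetical. The method needs ratings indexed by item so that items can be compared with each other, whereas a singular-value decomposition would instead start from a full user-by-item matrix.

```python
import math

# Hypothetical ratings table: user -> {item: rating}.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5},
    "cat": {"A": 1, "C": 5},
    "dan": {"B": 4, "C": 2},
}


def item_vectors(ratings):
    """Re-index the ratings by item: item -> {user: rating}."""
    items = {}
    for user, rated in ratings.items():
        for item, r in rated.items():
            items.setdefault(item, {})[user] = r
    return items


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(ratings, user, target):
    """Similarity-weighted average of the user's ratings of other items."""
    items = item_vectors(ratings)
    num = den = 0.0
    for item, r in ratings[user].items():
        if item == target:
            continue
        sim = cosine(items[target], items[item])
        num += sim * r
        den += abs(sim)
    return num / den if den else None


prediction = predict(ratings, "bob", "C")
```

An ensemble, in this picture, would combine the output of `predict` with predictions from other methods that are built on their own tables.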

The architecture for the server was originally designed for optimal performance in a distributed computing environment. A distributed server runs on multiple machines. A typical distributed configuration is to use a series of blades as a cluster. Each blade contains both local storage and large amounts of memory. Local storage is used to store large data sets in distributed form. Data is loaded into memory and made available so that clients can quickly access that data.

For distributed deployments, local storage on each machine is critical for storing large data sets in distributed form. The server supports the Hadoop Distributed File System (HDFS) as a co-located data provider. HDFS is used because the server can read from and write to HDFS in parallel. In addition, HDFS provides replication for data redundancy. HDFS stores data as blocks in distributed form on the blades, and the replication provides failover capabilities.
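The relationship between block replication and failover can be sketched in a few lines of Python. This is a toy model under stated assumptions: the round-robin placement below is not the placement policy that HDFS actually uses, and the blade names, block counts, and function names are hypothetical.

```python
def place_blocks(n_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (toy policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        b: [nodes[(b + r) % len(nodes)] for r in range(replication)]
        for b in range(n_blocks)
    }


def readable_blocks(placement, failed):
    """Blocks that still have at least one replica on a node that has not failed."""
    return {
        b for b, holders in placement.items()
        if any(node not in failed for node in holders)
    }


nodes = ["blade1", "blade2", "blade3", "blade4"]
placement = place_blocks(8, nodes, replication=2)

# With two replicas per block, losing any single blade loses no data.
survivors = readable_blocks(placement, failed={"blade2"})
```

In this toy model, two simultaneous failures on blades that hold both replicas of the same block do lose data, which is why the replication factor is chosen with the expected failure rate in mind.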

In a distributed deployment, the server also supports some third-party vendor databases as co-located data providers. Teradata Data Warehouse Appliance and Greenplum Data Computing Appliance are massively parallel processing database appliances. You can install the SAS LASR Analytic Server software on each of the machines in either appliance. The server can read in parallel from the local data on each machine.

Beginning with the SAS LASR Analytic Server 1.6 release (concurrent with the SAS Visual Analytics 6.1 release), the server supports a non-distributed deployment. A non-distributed server can perform the same in-memory analytic operations as a distributed server. However, a non-distributed deployment does not support parallel I/O from HDFS or third-party vendor appliances.