Load Data to LASR

Introduction

Use the Load Data to LASR directive to copy Hadoop tables to a grid of SAS LASR Analytic Servers. On the SAS LASR Analytic Servers, you can analyze tables using software such as SAS Visual Analytics.
Note: The Load Data to LASR directive is distinct and separate from the Load to LASR capability that is provided by the SAS LASR Analytic Server.

Prerequisites

To use this directive, you must connect to a grid of SAS LASR Analytic Servers. Ask your SAS LASR Analytic Server administrator to verify that the following prerequisites have been met:
  • A grid of SAS LASR Analytic Servers, release 2.5 or later, must be licensed, installed, and configured.
  • SAS Visual Analytics 6.4 or later must be installed and configured on the SAS LASR Analytic Servers.
  • The SAS LASR Analytic Servers must be registered on a SAS Metadata Server.
  • The SAS LASR Analytic Servers must be configured to start automatically.
  • The SAS LASR Analytic Servers must have memory and disk allocations that are large enough to accept Hadoop tables. The Load Data to LASR directive does not check the SAS LASR Analytic Servers for available memory or disk space.
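Because the directive does not check capacity, a rough estimate before loading can help. The following is a minimal sketch, not part of SAS Data Loader: the function name, the 20% headroom default, and the idea of estimating size as row count times average row width are all assumptions made for illustration. Obtain the actual figures from your Hadoop and SAS LASR Analytic Server administrators.

```python
def fits_in_memory(row_count: int, avg_row_bytes: int,
                   available_bytes: int, headroom: float = 0.2) -> bool:
    """Return True if the estimated table size fits in the available memory.

    Hypothetical helper: the estimate is simply rows * average row width,
    and `headroom` is the fraction of memory to keep free (an assumption).
    """
    if row_count < 0 or avg_row_bytes < 0 or available_bytes < 0:
        raise ValueError("sizes must be non-negative")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")

    estimated_bytes = row_count * avg_row_bytes
    budget_bytes = available_bytes * (1 - headroom)
    return estimated_bytes <= budget_bytes
```

For example, `fits_in_memory(1_000_000, 200, 1_000_000_000)` compares an estimated 200 MB table against an 800 MB budget. The estimate ignores compression and server overhead, so treat the result as a first check only.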
After verifying the prerequisites above, ask your Hadoop administrator if your Hadoop cluster is secured with Kerberos. If so, you are ready to specify a connection to the SAS LASR Analytic Server grid. Follow the steps that are described in Connect to a SAS LASR Analytic Server Grid.
If your Hadoop cluster is not secured with Kerberos, ask the SAS LASR Analytic Server administrator to configure Secure Shell (SSH) keys for SAS Data Loader on your SAS LASR Analytic Server grid. Direct your server administrator to the steps that are described in Configure SSH Keys on a SAS LASR Analytic Server Grid.

Example

Follow these steps to create and run the Load Data to LASR directive:
  1. Open SAS Data Loader for Hadoop, as described in Get Started.
  2. In the Directives page, click Load Data to LASR.
  3. In the Source Table page, click the schema that contains the source table that you want to load. Clicking the schema displays the tables in that schema. Click the table that you want to load onto your grid of SAS LASR Analytic Servers, and then click Next.
  4. In the Target Table page, click the SAS LASR Analytic Server that you want to receive the target table. Clicking the server displays the target table configuration fields and controls.
  5. As needed, change the name in the Target table name field. The field defines the name of the table on the grid of SAS LASR Analytic Servers.
  6. Select options as needed to replace any existing table of the same name or to compress the target table on the grid of SAS LASR Analytic Servers.
  7. Click the Locations link to view or change the default storage options for the target table on the grid of SAS LASR Analytic Servers.
  8. In the Locations window, you can change the SAS folder, the library name, and the required tag that accompanies the table name.
  9. In the Target Table page, click Next.
  10. In the Result page, click Start loading data. SAS generates code for the directive and displays the Code icon. Click the icon to open or save the text of the SAS code that makes up the directive.
  11. During the execution of the directive, the Result page displays the Log icon. Click the icon to open or save the SAS log file that is generated during the execution of the directive.
  12. At the conclusion of the directive, the Result banner receives a status icon that indicates the success or failure of the directive. To view the target table on the SAS LASR Analytic Server, click the View Results icon.

Connect to a SAS LASR Analytic Server Grid

Your SAS LASR Analytic Server administrator can provide the information that you need to configure connections to a SAS LASR Analytic Server grid. Follow these steps to configure a connection:
  1. Open SAS Data Loader for Hadoop, as described in Get Started.
  2. In the SAS Data Loader panel, click the More icon and select Configuration.
  3. Click SAS LASR Analytic Servers.
  4. To configure a new SAS LASR Analytic Server, click the Add icon. If you are changing an existing server connection, click that connection in the list, and then click the Edit icon. To delete a server connection, select it and click the Delete icon.
  5. In the LASR Server Configuration window, enter or change your choice of server name and description in the Name and Description fields.
    LASR Server Configuration Window
  6. In the Host field, add or change the full network name of the host of the SAS LASR Analytic Server. A typical name is similar to lasr03.us.ourco.com.
  7. In the Port field, add or change the number of the port that the SAS LASR Analytic Server uses to listen for connections from SAS Data Loader. The default port number is 10010.
  8. In the LASR authorization service location field, add or change the HTTP address of the authorization service.
  9. Under Connection Profile, in the lower of the two Host fields, add or change the network name of the SAS Metadata Server that is accessed by the SAS LASR Analytic Server.
  10. In the lower of the two Port fields, add or change the number of the port that the SAS Metadata Server uses to listen for client connections. The default value, 8561, is frequently left unchanged.
  11. In the User ID and Password fields, add or change the credentials that SAS Data Loader will use to connect to the SAS Metadata Server. These values are stored in encrypted form.
  12. In the Repository field, specify the name of the repository on the SAS LASR Analytic Server that will receive the downloads from Hadoop. The default value Foundation might suffice.
  13. In the SAS folder for tables field, specify the path inside the repository that will contain the downloads from Hadoop. The default value /SharedData might suffice.
  14. In the Library location field, add or change the name of the SAS library that will be referenced by the Load Data to LASR directive.
  15. In the LASR server tag field, add or change the name of the tag that will be associated with each table that is downloaded from Hadoop. The tag is required. It is used along with the table name to uniquely identify tables that are downloaded from Hadoop.
  16. Review your entries and click OK to return to the Configuration window.
    At this point, you can define or edit a connection to another SAS LASR Analytic Server.
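It can help to collect the connection values from your administrator before you open the LASR Server Configuration window. The sketch below models those values as a Python record and checks them for obvious mistakes. It is only an illustration: the class name, attribute names, and validation rules are hypothetical and are not a SAS Data Loader API. The defaults (10010, 8561, Foundation, /SharedData) come from the steps above. The User ID and Password fields are deliberately left out so that credentials are not kept in plain text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LasrConnection:
    """Hypothetical record of the LASR Server Configuration fields."""
    name: str
    host: str                  # full network name of the LASR host
    authorization_url: str     # LASR authorization service location
    metadata_host: str         # lower Host field (SAS Metadata Server)
    library: str               # Library location
    lasr_tag: str              # LASR server tag (required)
    port: int = 10010          # default LASR port from the steps above
    metadata_port: int = 8561  # default SAS Metadata Server port
    repository: str = "Foundation"
    sas_folder: str = "/SharedData"
    description: str = ""

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty means none found."""
        issues = []
        # The tag is documented as required; treating Host as required
        # is an assumption of this sketch.
        for label, value in (("Host", self.host),
                             ("LASR server tag", self.lasr_tag)):
            if not value.strip():
                issues.append(f"{label} is required")
        for label, port in (("Port", self.port),
                            ("Metadata port", self.metadata_port)):
            if not 1 <= port <= 65535:
                issues.append(f"{label} must be between 1 and 65535")
        if not self.authorization_url.startswith(("http://", "https://")):
            issues.append(
                "LASR authorization service location must be an HTTP address")
        return issues
```

A record built with a non-empty host and tag, an HTTP address, and the default ports returns an empty list from `problems()`. Any message in the list points to a field to correct before you click OK.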

Configure SSH Keys on a SAS LASR Analytic Server Grid

If your Hadoop cluster is not secured with Kerberos, ask your SAS LASR Analytic Server administrator to configure Secure Shell (SSH) keys for SAS Data Loader on your SAS LASR Analytic Server grid. After that, you can configure a connection to the SAS LASR Analytic Server grid as described above.
The server administrator will perform these steps:
  1. On the SAS LASR Analytic Server grid, the administrator must create the user sasdldr1, as described in the SAS LASR Analytic Server: Reference Guide.
  2. The administrator must generate a public key and a private key for sasdldr1 and install those keys, as described in the SAS LASR Analytic Server: Reference Guide.
  3. The administrator must copy the public key file from SAS Data Loader at vApp-install-path\vApp-instance\shared-folder\Configuration\sasdemo.pub. A typical path is C:\Program Files\SASDataLoader\dataloader-3p.22on94.1-devel-vmware.vmware (1)\dataloader-3p.22on94.1-devel-vmware\SASWorkspace\Configuration.
    Append the SAS Data Loader public key to the file ~sasdldr1/.ssh/authorized_keys on the head node of the grid.
    CAUTION:
    To maintain access to the SAS LASR Analytic Servers, you must repeat step 3 each time you replace your installation of SAS Data Loader for Hadoop.
    It is not necessary to repeat this step if you update your vApp by clicking the Update button in the SAS Data Loader Information Center.
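The append in step 3 is commonly done with a text editor or a shell command. As a minimal sketch, assuming the public key file has already been copied to the head node, the following Python function appends the key only if it is not already present, so repeating the step after a reinstallation does not create duplicate entries. The function name and the permission settings are assumptions of this sketch, not instructions from the SAS LASR Analytic Server: Reference Guide.

```python
from pathlib import Path


def append_public_key(pub_key_file: Path, authorized_keys: Path) -> bool:
    """Append the key in pub_key_file to authorized_keys if it is missing.

    Returns True if authorized_keys was changed, False if the key was
    already listed. Hypothetical helper for illustration only.
    """
    key = pub_key_file.read_text().strip()
    if not key:
        raise ValueError(f"{pub_key_file} is empty")

    existing = authorized_keys.read_text() if authorized_keys.exists() else ""
    if key in (line.strip() for line in existing.splitlines()):
        return False  # key already installed; nothing to do

    # Create ~/.ssh if needed; SSH usually expects restrictive permissions.
    authorized_keys.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    with authorized_keys.open("a") as handle:
        if existing and not existing.endswith("\n"):
            handle.write("\n")  # keep the new key on its own line
        handle.write(key + "\n")
    authorized_keys.chmod(0o600)
    return True
```

Run as sasdldr1 on the head node, a call might look like `append_public_key(Path("sasdemo.pub"), Path.home() / ".ssh" / "authorized_keys")`, where the location of the copied sasdemo.pub file is whatever the administrator chose.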

Usage Notes

The Load Data to LASR directive moves entire tables. To improve performance, reduce the size of the table before you load it to the SAS LASR Analytic Server grid. You can filter the rows and manage the columns with the Transform Data in Hadoop directive or the Query Data in Hadoop directive.