Using the sFTP Transformation to Securely Access and Transfer Data

Overview

The first step in any data management process is to access and transfer the data in a secure manner. You can use the sFTP transformation to securely access servers, navigate to the correct directories, and transfer data to another machine.
The sFTP transformation uses the Secure File Transfer Protocol (SFTP) access method in the FILENAME statement to provide a secure connection for file transfers between two hosts (client and server) over a network. Both commands and data are encrypted. SFTP supports only OpenSSH on UNIX and PUTTY on Windows. For more information, see the SAS Statements: Reference topic FILENAME Statement, SFTP Access Method.
Note: If you connect to Amazon Web Services (AWS) for your Cloud Computing Service, they provide a Privacy Enhanced Mail Security Certificate also known as a PEM file. If you are connecting to a Linux instance, you must convert this file to a PuTTY Private Key (*.ppk) file type using a program called PuttyGen. For more information, see Connecting to Your Linux Instance from Windows Using PuTTY.

Problem

You are working on multiple projects that require you to download data from and upload reports to a customer’s server on a regular basis. You are looking for ways to automate those steps, reduce the risk of errors, and make it less time consuming.

Solution

Use the sFTP transformation to access, log on, navigate folders, copy data, upload or download data to a customer’s server, and reformat the data files.

Task: Use the sFTP Transformation to Transfer Data from a Windows Host to a Remote Linux Host

In this example, the sFTP job converts the input SAS data set to a CSV file and then uploads the CSV file to a remote server. Perform the following steps to create and populate the job:
  1. Create an empty SAS Data Integration Studio job.
  2. Select and drop the source table onto the Diagram tab of the Job Editor window. Source tables must be registered in SAS Data Integration Studio.
    Note: Data in the source table is expected to be in its final form before uploading. If any modifications are required, use another transformation, such as Extract, between the source table and the sFTP transformation to modify the data prior to the transfer.
  3. Select and drop the sFTP transformation from the Access folder in the Transformations tree onto the Diagram tab.
  4. Add an input port to the sFTP transformation by right-clicking on the transformation and selecting Ports in the pop-up menu. Then, click Add Input Port in the sub-menu.
  5. Drag the cursor from the source table to the input port of the sFTP transformation.
  6. Open the sFTP Properties and select the Options tab.
  7. On the General pane, set the following options:
    Note: Make sure you specify the path options in the transformation as designated in the Help.
    Option
    Example Value
    Path to psftp.exe(local computer)
    C:\psftp\psftp.exe
    Note: Avoid using double-byte character sets (DBCS). Also note that the psftp.exe file should be on the computer running the workspace server.
    Remote computer(ip address)
    10.120.6.87
    Tip:You can also use the DNS name.
    User name to log in to remote computer – sftp server
    scncga
    Options such as password can be passed in here
    -i C:\puttyPrivateKey\scncgal.ppk
    Note: Avoid using double-byte character sets (DBCS). For Windows, use double quotation marks around passwords, but do not use quotation marks around other values. For Linux, do not use quotation marks around values.
    File on local computer
    leave blank
    Convert data file to CSV before SFTP
    true
    Path and File name on remote computer
    leave blank
    Note: When a file is transferred to a Windows machine, this field is ignored. Instead, the file is transferred to the default folder specified on the sFTP server on the remote computer.
    Additional option on local computer filename statement
    leave blank
    Options on filename statement for remote computer
    recfm=v
    Download file from remote computer(sftp from remote to local)
    false
    Note: If you are uploading a file, set this value to false. If you are downloading a file, set this value to true.
  8. Save and run the job.
  9. Verify that the file is uploaded to the remote server correctly. It should be in the remote directory with the <sourcefile>.csv name.

sFTP Usage Notes

Using PuTTY

If you are using PuTTY on Windows, you should confirm that "psftp.exe" was installed with PuTTY and that the directory where psftp.exe is located is also included in your system PATH.

Understanding Error Messages

When an error occurs, see the log for more information. Note that the message in red is simply what can be retrieved at that point. Once the error is corrected, rerun the job. Additional errors might occur. Look at the information in the entire log to obtain additional information about the cause of the problem.

Avoiding Paths That Include Double-byte Character Sets (DBCS)

Double-byte character sets (DBCS) are not supported in Data Integration Studio. A DBCS typically supports languages such as Chinese, Japanese, and Korean. One example of a DBCS is Filename xyz sftp path=”C:\DIS\测试 ......”. A path in this format not allowed.
If you include a DBCS in your SAS Data Integration Studio job, the job runs successfully, but the file is not uploaded to the remote server correctly in the remote working directory. You might see an error message in the log similar to the following messages:
  • "ERROR: Connection refused, check key authentication and sshd server status."
  • "ERROR: Public Key Authentication is required, Password validation is not supported.”
Tip
To correct the error, replace the DBCS.
Last updated: January 16, 2018