The SASEQUAN Interface Engine

The LIBNAME libref SASEQUAN Statement

  • LIBNAME libref SASEQUAN 'physical-name' options;

The LIBNAME statement assigns a SAS library reference (libref) to the physical path of the directory of Quandl data files in which the downloaded Quandl XML data are stored. The required physical-name argument specifies the location of the folder where your Quandl XML data reside. It should end with a backslash if you are in a Windows environment and a forward slash if you are in a UNIX environment.

You can specify the following options in the LIBNAME libref SASEQUAN statement.

APIKEY='quan_apikey'

specifies the Quandl authentication token or access key that enables you to access the data that the Quandl website provides. The Quandl access key is a 20-character mixed-case alphanumeric string, and it is required. It must be enclosed on single quotation marks. You can request your quan_apikey by visiting the website at the following URL:

https://www.quandl.com/users/sign_up

AUTOMAP=REPLACE | REUSE

specifies whether or not to overwrite the existing XML map file.

REPLACE

specifies that the XML map file be overwritten, and ensures that the most current XML map that is generated by the SASEQUAN engine and named by the XMLMAP= option is used.

REUSE

specifies that the XML map file not be overwritten, and ensures that a pre-existing XML map file that is named by the XMLMAP= option is used.

By default, AUTOMAP=REPLACE.

MAPREF=quan_xmlmapref

specifies the fileref to use for the map assignment. For an example of the SASEQUAN engine that uses the MAPREF= and XMLMAP= options in the FILENAME statement in order to assign a filename, as in the following, see the section Reading All Financial Ratios Data:

   FILENAME MyMap "U:\quan950\test\gstart.map";

You can use the MAPREF= and XMLMAP= options to control where the map resides, what you name the map, and how you refer to it with a fileref. You can use the OUTXML= option to name your XML data file. It is placed in the folder that is designated by physical-name, and you can reference it by using the myLib libref in your SASEQUAN LIBNAME statement. This is shown in the section Getting Started: SASEQUAN Interface Engine, inside the DATA step in the SET statement. The SET statement reads observations from the input data set myLib.GSTART and stores them in a SAS data set named All_Fin.

OUTXML=quan_xmlfile

specifies the name of the file where the XML data that are returned from the Quandl website are stored. Each Quandl code that is listed in the IDLIST= option is given a positional numeral: 1 for the first code in the IDLIST, 2 for the second code in the IDLIST, and so on. The engine appends this numeral to the filename of the XML of each data set that the website returns. When all the XML files are retrieved, the data are merged into a SAS data set. When only one Quandl code is used in the IDLIST= option, the filename has the numeral 1 appended to the OUTXML filename.

XMLMAP=quan_xmlmapfile

specifies the fully qualified name of the location where the XML map file is automatically stored.

COLLAPSE=DAILY | WEEKLY | MONTHLY | QUARTERLY | ANNUAL | NONE

specifies the frequency to which you want to collapse the reporting frequency.

DAILY

collapses the report to a daily frequency.

WEEKLY

collapses the report to a weekly frequency.

MONTHLY

collapses the report to a monthly frequency.

QUARTERLY

collapses the report to a quarterly frequency.

ANNUAL

collapses the report to an annual frequency.

NONE

does not collapse the report.

COLLAPSE= is optional. By default, COLLAPSE=NONE when IDLIST=option specifies one Quandl code, but when the IDLIST= option specifies more than one Quandl code, the default for the collapse frequency is set to the same frequency that is specified in the FREQ= option.

The Quandl frequency-collapsing feature reports the native (higher-frequency) time series at a lower-frequency (the collapse frequency). When you collapse the frequency of a data set, Quandl returns the last observation for the given period. So if you collapse a daily data set to monthly, you get a sample of the original data set in which the observation for each month is the last data point available for that month. When you specify more than one Quandl code in the IDLIST= option, it is important to check that the from date and to date of every selected series use the same fiscal year, so that the reporting interval of the merged date values from all the data sets aligns to the same date for the first observation in the range. For example, if multiple Quandl codes are listed in the IDLIST= option, some annual time series have from dates that start in January, and some annual time series have from dates that start in June, then the merged data set will have observation dates reported for both January and June (if COLLAPSE=NONE), resulting in a semiannual interval instead of an annual interval in the merged data. To preserve the annual frequency, specify COLLAPSE=ANNUAL so that each annual time series aligns to the appropriate annual date in the merged data set. The COLLAPSE= option is applied to each Quandl data set that is specified in the IDLIST= option, so that when the data sets are merged, the reporting frequency is equal to the COLLAPSE= frequency. The resulting merged SAS data set contains the same data as a Quandl superset that is created from the same Quandl codes in the IDLIST= option. Quandl supersets are being deprecated by QUANDL but already existing supersets are still supported. Supersets are mentioned at the following URL: https://www.quandl.com/help/supersets .

Note: COLLAPSE=MONTHLY reports the daily, weekly, and monthly native frequencies of the time series at a monthly frequency (collapse frequency). If an annual native frequency time series is specified in the IDLIST= option, then it will not be selected when COLLAPSE=MONTHLY is specified. Only the time series that have native frequencies higher than the reporting frequency specified in the COLLAPSE= option are selected.

Note: It is highly recommended that you use the COLLAPSE= option when more than one Quandl code is specified in the IDLIST= option.

Caution: If the COLLAPSE= NONE is specified then undesirable time intervals can occur when more than one Quandl code is specified in the IDLIST= option.

FORMAT=XML

specifies the format of the file to be received from the Quandl website. Although Quandl can report data in many formats, the SASEQUAN engine for SAS 9.4M2 (SAS/ETS 13.2) supports only the XML format.

FREQ=DAILY | WEEKLY | MONTHLY | QUARTERLY | ANNUAL

specifies a lower frequency to aggregate values to. The FREQ= option also selects only those time series that aggregate to the specified frequency. In Quandl data, the highest frequency is daily, and the lowest frequency is annual.

DAILY

selects time series that aggregate to a daily frequency.

WEEKLY

selects time series that aggregate to a weekly frequency.

MONTHLY

selects time series that aggregate to a monthly frequency.

QUARTERLY

selects time series that aggregate to a quarterly frequency.

ANNUAL

selects time series that aggregate to an annual frequency.

The FREQ= option is not required, and the default value is the native frequency of the Quandl data set.

Note: An error is returned if you specify a frequency higher than the native frequency of the selected series. For example, if a series has the native frequency "Annual," it is not possible to aggregate the series to the higher "Monthly" frequency. To find the native frequency of an economic time series, enter the following URL in your web browser, and click on the Quandl code. The output gives you the list of available time series and their native frequencies and descriptions, including the "Frequency" field, which shows the native frequency of that time series.

https://www.quandl.com/resources/data-sources

You can use the data-sources URL to obtain the necessary information to request time series data for any of the listed data sources that you want to query.

Note: When you specify a single Quandl code in the IDLIST= option and the FREQ= option is not specified or is an empty string, the native frequency of the time series in that data set is used as the reporting frequency unless you specify the reporting frequency in the COLLAPSE= option. When you specify multiple data sets (and time series) in the IDLIST= option, the "Annual" frequency is used as the default frequency unless you specify the reporting frequency in the COLLAPSE= option. If any time series in the IDLIST= option have a lower native frequency than the specified frequency, then those time series are dropped from the list and excluded from the output.

IDLIST=‘quan_idlist’

specifies the list of Quandl codes for the data sets that contain the time series to be included in the output SAS data set. There is a limit of nine Quandl codes in the IDLIST= option. This list is comma-delimited and must be enclosed in single quotation marks.

START=‘quan_startdate’

specifies the start date for the time series in the format YYYY-MM-DD. START= is optional, and the default is 1776-07-04 (earliest available). The date must be enclosed in single quotation marks.

END=‘quan_enddate’

specifies the end date for the time series in the format YYYY-MM-DD. END= is optional, and the default is 9999-12-31 (latest available). The date must be enclosed in single quotation marks.

PROXY="quan_proxyserver"

specifies the proxy server to use. PROXY= is optional. The specified proxy server is used only when a connection-refused error or a connection-timed-out error occurs. For quan_proxyserver, specify the server’s HTTP address followed by a colon and the port number, and enclose that string with double quotation marks. For example, PROXY="http://inetgw.unx.sas.com:8118".

ROWS=quan_rows

specifies the maximum number of rows (time series observations) to return, which is an integer between 1 and 100,000. ROWS= is optional, and the default is ROWS=100000.

SORT=ASC | DESC

specifies the order in which to sort the date of time series observations.

ASC

sorts time series observations in ascending date order.

DESC

sorts time series observations in descending date order.

SORT= is optional, and the default is SORT=ASC.

TRANS=CUMUL | DIFF | NORMALIZE | RDIFF | NONE
TRANSFORMATION=CUMUL | DIFF | NORMALIZE | RDIFF | NONE

specifies the data value transformation.

CUMUL

performs the cumulative function.

DIFF

performs the difference function.

NORMALIZE

performs the normalize function.

RDIFF

performs the ratio difference function.

NONE

does no transformation on the data.

TRANS= is optional, and the default is TRANS=NONE. The details of the arguments and the corresponding function formulas are presented in Table 50.2.

Table 50.2: Quandl Transformation Codes

Trans Code

Description

Formula

cumul

Cumulative

$x_ t + x_{t-1} + \cdots + x_{t-N}$

diff

Difference

$x_ t - x_{t-1}$

normalize

Normalize

$(\frac{x_ t}{x_{t-N}}) \times 100$

rdiff

Ratio difference

$(\frac{x_ t - x_{t-1}}{x_{t-1}})$

$x_ t$ is the value of series x at time period t. N is the number of observations per year, which differs by frequency: Daily $(N=260)$, Annual $(N=1)$, Monthly $(N=12)$, Quarterly $(N=4)$, and Weekly $(N=52)$.