DataFlux Data Management
Studio provides connection definitions that you can use to connect
to a wide variety of data sources.
The following connection
types are supported:
- ODBC connections
- domain-enabled ODBC connections
- ODBC connections for Excel
- ODBC connections for Hadoop
- SAS data set connections
- Federation Server connections
- custom SAP connections
- custom SQLite connections
- connections for XML data
- Java Message Service (JMS) connections
- web service connections
- documents from third-party applications
such as Adobe Acrobat, Microsoft PowerPoint, and Microsoft Visio
Most types of data connections
in DataFlux Data Management Studio are set in the ODBC
Data Source Administrator window. You can scroll through the
list of data types to select the type of connection that you need.
Then, you can use the tabs in the window to set options and parameters
for the connection. You can see the current list of the available
databases in the “Supported Databases for Data Storage”
section of the “Working with Databases” topic. This
topic is located in the “Data Riser Bar” chapter of the
DataFlux Data Management Studio: User’s Guide for
your version of DataFlux Data Management Studio.
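After a DSN has been defined in the ODBC Data Source Administrator window, any ODBC client can reference it by name. The following sketch illustrates the shape of a DSN-based connection string; the DSN name and credentials are hypothetical, and pyodbc is just one example of a client library that could consume the string.

```python
def build_dsn_connection_string(dsn, uid=None, pwd=None):
    """Build an ODBC connection string that references a named DSN.

    The DSN itself (driver, server, database, and other options) is
    defined in the ODBC Data Source Administrator; the client only
    needs the DSN name, plus credentials if the driver requires them.
    """
    parts = ["DSN=" + dsn]
    if uid is not None:
        parts.append("UID=" + uid)
    if pwd is not None:
        parts.append("PWD=" + pwd)
    return ";".join(parts)

# Hypothetical DSN name -- use whatever name you assigned in the
# ODBC Data Source Administrator window.
conn_str = build_dsn_connection_string("MyOracleDSN", uid="dfuser", pwd="secret")
# An ODBC client library such as pyodbc would then use:
#   import pyodbc
#   conn = pyodbc.connect(conn_str)
print(conn_str)  # DSN=MyOracleDSN;UID=dfuser;PWD=secret
```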
You can create domain-enabled
ODBC connections that reference an ODBC connection and an appropriate
authentication server domain, which saves users from having
to repeatedly authenticate to the connection.
Domain-enabled ODBC
connections have the following prerequisites:
- a standard ODBC DSN for the data source that you want to access
- an authentication domain, user, and login for this ODBC DSN
Note: Domain-enabled connections
cannot use shared logins.
Specialized ODBC connections
for Excel and Hadoop simplify the process of accessing these data
types. The Excel process enables you to select the appropriate driver
for your version of Excel and create an ODBC DSN to read named ranges
in an Excel spreadsheet. The Hadoop process enables you to select
the appropriate DataFlux Apache Hive Wire Protocol driver or the DataFlux
Impala Wire Protocol driver for your site. Then, you can click System
DSN to create a connection to a data source that all
users on the machine can access.
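For Hadoop, the resulting ODBC definition ultimately resolves to a driver-based connection string. The sketch below shows the general DSN-less form using the DataFlux Apache Hive Wire Protocol driver named above; the host name is hypothetical, and the exact keyword names (HostName, PortNumber) vary by driver, so treat them as placeholders.

```python
def build_hive_connection_string(host, port=10000,
                                 driver="DataFlux Apache Hive Wire Protocol"):
    """Sketch of a DSN-less, driver-based ODBC connection string.

    Instead of referencing a saved DSN, a DSN-less string names the
    installed driver directly. The driver name must match what is
    installed on the machine; keyword names differ between drivers.
    """
    return "DRIVER={%s};HostName=%s;PortNumber=%d" % (driver, host, port)

print(build_hive_connection_string("hive.example.com"))
# DRIVER={DataFlux Apache Hive Wire Protocol};HostName=hive.example.com;PortNumber=10000
```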
SAS data sets can be
accessed through the SAS Data Set Connection window,
which connects to a folder on the file system that contains one or
more SAS data sets. The data is accessed directly on disk, without
mediation by a SAS Application Server.
The host that executes
the connection must be able to access the folder that contains the
SAS data. For example, suppose that the DataFlux Data Management Studio
host is a Windows host. If the SAS data sets are on a UNIX host, you
need a networking protocol such as SAMBA (SMB/CIFS), or a network
file system (NFS) that exposes the UNIX file system as a Windows
directory.
These SAS data set connections
can be configured in the SAS Data Set Connection window.
For example, you can specify an access level and specify whether the
data should be compressed. You can also specify options for features
such as table locking and encryption. Finally, you can review the connection
string to verify that the appropriate options and encoding have been selected
for a given connection.
If a SAS Federation
Server is available on your site, you can use the Data riser to connect
to that server and access the DSN connections that are managed by
that server. The Federation Server Connection window
enables you to specify a server and port for the connection. It also
supports compression and credentials settings. You can also test the
connection to the server.
You can add a user-defined
connection to an SAP system. This connection could be used as the
data source for the SAP Remote Function Call node,
a data job node. SAP libraries (DLLs) must be installed on all computers
where this custom connection is used.
You can also add a user-defined
connection for an SQLite database file. For example, an SQLite connection
can be used in the definition for an Address Update repository.
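SQLite needs no separate database server; a connection simply opens a database file on disk. As a minimal sketch of that idea using Python's built-in sqlite3 module (the table and file names are hypothetical, and an in-memory database stands in for a file):

```python
import sqlite3

# Open (or create) a SQLite database. A DataFlux SQLite connection
# likewise just points at a single file on disk; in practice you
# would pass a file path such as "addresses.db" instead of ":memory:".
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE addresses (id INTEGER PRIMARY KEY, street TEXT)")
cur.execute("INSERT INTO addresses (street) VALUES (?)", ("100 Main St",))
conn.commit()
rows = cur.execute("SELECT street FROM addresses").fetchall()
print(rows)  # [('100 Main St',)]
conn.close()
```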
DataFlux Data Management
Studio contains a set of data job nodes that enable you to use XML
data and XML column data as inputs and outputs in data jobs. Similarly,
it supports the Java Message Service (JMS), a Java API that enables
applications to create, send, receive, and read messages, through
data job and process job reader and writer nodes. Web services are
supported with the Web Service and HTTP
Request data job nodes. You can use the Document
Extraction data job node to extract information that
is not always found in traditional databases. For example, you might
need to take data from a Microsoft Word file or an HTML file and then
convert it into a format that you can process in a DataFlux Data
Management Studio job.
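The core task an XML input node performs is mapping repeated elements to rows and child elements to columns. As a rough illustration with Python's standard library (the element names here are invented), that parse-and-flatten step looks something like:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload; an XML input node maps each repeated
# <customer> element to a row and its child elements to columns.
doc = """
<customers>
  <customer><id>1</id><name>Acme</name></customer>
  <customer><id>2</id><name>Globex</name></customer>
</customers>
"""

root = ET.fromstring(doc)
rows = [
    {"id": c.findtext("id"), "name": c.findtext("name")}
    for c in root.findall("customer")
]
print(rows)  # [{'id': '1', 'name': 'Acme'}, {'id': '2', 'name': 'Globex'}]
```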