DataFlux Data Management Studio 2.7: User Guide

Understanding Repository Definitions

Default Repository Definitions for Client and Server

A default repository definition file (.RCF file) is installed with DataFlux Data Management Studio. Here is an example location for this file on a Windows host: 

STUDIO_HOME\etc\repositories\DataFluxSample.rcf

When the client starts, it reads this file if it is specified to be loaded on start-up. It also reads other repository definition files, if they are specified to be loaded on start-up. You can work with multiple, active repositories in DataFlux Data Management Studio.

A default repository definition file is also installed with DataFlux Data Management Server. Here is an example location for this file on a Windows host: 

SERVER_HOME\etc\repositories\server.rcf

When the server is started, it reads this file or another repository definition that is specified by the BASE/REPOS_SYS_PATH option in the server.cfg file. The server will not start unless it can access its repository. A server can access only one repository at a time.
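
As a rough illustration of how such an option could be looked up, here is a minimal Python sketch. It assumes the configuration file holds one KEY = VALUE pair per line, that lines starting with # are comments, and that option names are matched without regard to case; none of this is confirmed by this topic. The helper name read_cfg_option and the sample path are hypothetical.

```python
from typing import Optional


def read_cfg_option(cfg_text: str, option: str) -> Optional[str]:
    # Assumed format: one "KEY = VALUE" pair per line, "#" starts a comment line.
    for raw_line in cfg_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        # Assumption: option names are compared case-insensitively.
        if sep and key.strip().casefold() == option.casefold():
            return value.strip()
    return None


# Hypothetical excerpt, not a real configuration file.
sample = """\
# repository definition used by the server
BASE/REPOS_SYS_PATH = C:\\repos\\custom
"""

repos_path = read_cfg_option(sample, "BASE/REPOS_SYS_PATH")
```

If the option is absent, the helper returns None, which corresponds to the case where the server falls back to its default repository definition.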

The default repositories use an SQLite file for the repository database. The server and client have separate repositories by default. If you want to add a new repository, you can use the Repository Definition dialog to add a new definition file.

Repository Definition Dialog

The Repository Definition dialog in DataFlux Data Management Studio is used to add or update repository definition files.

When you enter values for a new definition and click OK, a new repository definition file (.RCF file) is created. It is saved to a location determined by the selection or deselection of the Private checkbox. A new repository database is created in the location that is specified in the Data storage section of the definition. Later, when jobs and related objects are created in DataFlux Data Management Studio, they will be stored in the folder specified in the File storage section of the definition.

Here is a description of the fields and controls on this dialog.

Name - Specifies the name of the repository as it appears in DataFlux Data Management Studio.

Data Storage

The Data storage section of the dialog enables you to specify a storage method for the repository database. This database stores metadata for data explorations, profiles, and all objects in the Business Rule Manager (rules, tasks, custom metrics, sources, and fields). The Data storage section has two sets of controls: database file controls and database connection controls. You must use one or the other to specify a metadata storage method.

Database File Controls

Database file - When selected, enables you to specify an SQLite database file for a new repository or an existing repository. The tables for the repository are stored in the SQLite file that is specified in the Location field.

Location - Specifies the physical path for an SQLite database file. Include the .RPS file name at the end of the path, such as Repository2.rps. If this file does not exist, it is created when you click OK in this dialog.

The path to the SQLite database file can be almost any physical path that is accessible by DataFlux Data Management Studio. However, do not store the database file for a file-based repository in the File storage location that is described below.
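
The restriction above can be checked before a definition is saved. The following Python sketch is one possible check, not part of the product; the function name and the sample paths are hypothetical. The comparison is purely lexical: it does not touch the file system and does not resolve links or relative segments.

```python
from pathlib import PureWindowsPath


def database_inside_file_storage(database_file: str, file_storage_folder: str) -> bool:
    # PureWindowsPath compares path components case-insensitively,
    # which matches how Windows treats folder names.
    db = PureWindowsPath(database_file)
    folder = PureWindowsPath(file_storage_folder)
    # True when the storage folder is one of the ancestors of the database file.
    return folder in db.parents


# Hypothetical paths for illustration only.
conflict = database_inside_file_storage(
    r"C:\Repos\files\Repository2.rps", r"C:\Repos\files"
)
no_conflict = database_inside_file_storage(
    r"C:\Repos\data\Repository2.rps", r"C:\Repos\files"
)
```

Here conflict is True because the .RPS file sits inside the file storage folder, and no_conflict is False because the two locations are siblings.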

It is possible to specify a location that is remote from the DataFlux Data Management Studio host. However, the client might perform best when the SQLite database file is local to DataFlux Data Management Studio. Use a UNC path, not a path with a mapped drive letter, to specify a location that is remote from the DataFlux Data Management Studio host.

Example: \\HostName.MyDomain.com\path.
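
To make the distinction between a UNC path and a drive-letter path concrete, here is a small Python sketch. The function names are hypothetical, and the check looks only at the text of the path. It cannot tell whether a drive letter is a local disk or a mapped network drive.

```python
import re

# A path such as Z:\data starts with a single letter followed by a colon.
_DRIVE_LETTER = re.compile(r"^[A-Za-z]:")


def is_unc_path(path: str) -> bool:
    # UNC paths begin with two backslashes, for example \\HostName.MyDomain.com\path
    return path.startswith("\\\\")


def uses_drive_letter(path: str) -> bool:
    # True for paths such as Z:\data; the drive may be local or mapped.
    return bool(_DRIVE_LETTER.match(path))


remote_ok = is_unc_path(r"\\HostName.MyDomain.com\path")
mapped = uses_drive_letter(r"Z:\path")
```

A remote location that passes is_unc_path follows the recommendation above. A remote location that passes uses_drive_letter should be rewritten as a UNC path.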

Database Connection Controls

Database connection - When selected, enables you to specify a DBMS connection for a new repository or an existing repository. The tables for the repository are stored in the database format that is specified in the Name field.

Name - Enables you to select an ODBC DSN connection or a SAS Federation Server connection for a new repository or an existing repository. This connection must be created before you can select it here. For information about data connections, see Maintaining Data Connections. Supported databases are listed in Database Storage for Repositories.

Table prefix (optional) - Specifies a table prefix. A unique table prefix enables you to distinguish the tables that are associated with a particular repository from other tables that might be stored in the same database connection. If you enter a prefix that includes a period, the portion to the left of the period is treated as the name of an existing schema. For example, if you specify WIP.REPOS as a prefix, the repository creation fails if there is no WIP schema in the database.
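
The schema rule for prefixes can be sketched in a few lines of Python. This is an illustration of the rule as described here, not the product's own parsing logic. The function name is hypothetical, and splitting on the first period is an assumption for prefixes that contain more than one.

```python
from typing import Optional, Tuple


def split_table_prefix(prefix: str) -> Tuple[Optional[str], str]:
    # Text to the left of the period is treated as a schema name.
    # Assumption: if several periods are present, split on the first one.
    schema, sep, rest = prefix.partition(".")
    if sep:
        return schema, rest
    # No period: the whole value is a plain table prefix with no schema.
    return None, prefix


with_schema = split_table_prefix("WIP.REPOS")
without_schema = split_table_prefix("REPOS_")
```

For WIP.REPOS the result is the schema WIP and the prefix REPOS, so the WIP schema must already exist in the database. For REPOS_ no schema is implied.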

Test Connection - Enables you to test the database connection, when applicable.

Save Repository DDL (optional) - Active for repositories that are stored in a database management system. This button enables you to save a Data Definition Language (DDL) file for the current repository. The file includes a series of statements that create the repository, such as DROP TABLE, DROP INDEX, CREATE TABLE, CREATE INDEX, and INSERT VALUES. Use the Browse button to specify a path to the DDL file. You can use the DDL file to understand what permissions are required in order for DataFlux Data Management Studio to create a repository in the target database. Also, in some cases you might not have the appropriate privileges to create a repository in the target database. In that case, you can save a DDL file and give it to a database administrator. The administrator can use the DDL file as a reference for creating the repository in the target database.

File Storage

The File storage section of the dialog enables you to specify a folder for DataFlux Data Management Studio objects that are stored as files, such as data jobs, process jobs, queries, SAS code files (.SAS files), and Entity Resolution Output files (.SRI files).

Note: DataFlux Data Management Server ignores the File storage section of a repository definition. It puts jobs in the default location for the server or in a folder specified in the DMSERVER/JOBS_ROOT_PATH option in dmserver.cfg.

Folder - Specifies the physical path where DataFlux Data Management Studio jobs and other file-based objects are stored. The file storage location can be any physical path that is accessible by DataFlux Data Management Studio.

It is possible to specify a location that is remote from the DataFlux Data Management Studio host. However, the client might perform best when the file storage location is local to DataFlux Data Management Studio. Use a UNC path, not a path with a mapped drive letter, to specify a location that is remote from the DataFlux Data Management Studio host. For more information about this storage method, see File Storage for Repositories.

Connect to repository at startup - When selected, specifies that a connection is established to the repository when DataFlux Data Management Studio is started.

Private - Specifies whether the repository definition file is saved to a private location for the current user or to a public location that might be accessible to other users. The default is private, which is best in most cases. For more information, see Private and Public Repository Definitions.

OK - When you click OK for a new definition, a new repository definition file (.RCF file) is created. It is saved to the location specified by the selection or deselection of the Private checkbox. A new repository database is created in the location that is specified in the Data storage area of the definition. You are connected to the new repository.

Private and Public Repository Definitions

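At startup, DataFlux Data Management Studio looks for repository definition files (.RCF files) in two locations: a user-specific folder and a subfolder of the DataFlux Data Management Studio installation directory.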

The Private checkbox in the Repository Definition dialog determines where the repository definition file is saved. If Private is selected, the repository definition file is saved in a user-specific folder. Other users cannot see a repository whose definition file is stored in a user-specific folder.

If Private is not selected, the repository definition file is saved to a subfolder in the DataFlux Data Management Studio installation directory. All users who can access DataFlux Data Management Studio on that computer can see a repository whose definition file is stored in the installation directory.

Note: If you attempt to create a public repository, and you get an error that says that you cannot access the path that is specified for Data storage or File storage, then change the path to one that is accessible to those who share the public repository.
