Exporting Lineage Data to Web Studio

DataFlux Data Management Studio 2.5: User Guide

Overview

You can export the lineage data in a selected repository in DataFlux Data Management Studio or a Data Management Server to DataFlux Web Studio. (Usually, you will export the lineage from your Data Management Server repository.) Then you can associate the data with terms in Business Data Network and view the relationships in a Relationships diagram. Perform the following tasks:

Copy the Export Lineage Job
Perform a Test Run of the pjExportLineage Job
Schedule and Run the Job
Avoid References to Outdated Lineage

Copy the Export Lineage Job

When Data Management Server is installed, the job that supports lineage export is installed in a backup folder. Before you can export lineage, you must copy this job from its backup folder to the folder where Data Management Server expects it.

Copy FROM dmserver_home\share\web\batch jobs\Lineage TO dmserver_home\var\batch jobs\lineage

A script is provided to move these jobs to the correct location. Under Windows, click the shortcut dmserver_home\Update Web Studio Jobs. Under UNIX, run dmserver_home/bin/update_jobs.
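If you prefer to perform the copy yourself rather than use the provided script, the copy step above can be sketched in Python. This is a minimal sketch, not the provided script: the function name copy_lineage_jobs is hypothetical, and the folder names are taken from the FROM and TO paths above.

```python
import shutil
from pathlib import Path


def copy_lineage_jobs(dmserver_home):
    """Copy the lineage jobs from the backup folder to the folder the server reads."""
    home = Path(dmserver_home)
    # Folder names follow the FROM and TO paths given in this topic.
    source = home / "share" / "web" / "batch jobs" / "Lineage"
    target = home / "var" / "batch jobs" / "lineage"
    # dirs_exist_ok=True (Python 3.8 or later) allows copying over an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

Call copy_lineage_jobs with the path of your own dmserver_home directory. The function returns the target folder so that you can confirm where the jobs were copied.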

Perform a Test Run of the pjExportLineage Job

The pjExportLineage job is installed with the Data Management Server. This job calculates the lineage for the current repository and uploads the results to the configured Business Data Network import service. Note that you should never need to edit or even open this job.

You should perform a test run of the job before you schedule and run it to ensure that the lineage data is properly exported and the data is imported by DataFlux Web Studio. Perform the following steps:

Open the Data Management Servers riser.
Open the folder for the Data Management Server whose repository contains the lineage that you need to export.
Log on to the server, if prompted.
Navigate to the Batch Jobs folder. Then, open the lineage folder.
Right-click pjExportLineage and click Run in the pop-up menu to access the Test Real-Time Process Service window.

Enter the variables listed in the following table into the appropriate rows in the table in the Variable Inputs and Outputs tab:

Variable	Description
DEBUG_FILE (optional)	The location of the file in which to store debugging information. If this value is not set, debugging information is not written.
TMPDIR (optional)	The storage location of the lineage XML export.
EXPORT_USERNAME (required)	The Base64 encoded representation of the user ID for the DataFlux Web Studio Business Data Network user.
EXPORT_PASSWORD (required)	The Base64 encoded representation of the password for the DataFlux Web Studio Business Data Network user.
EXPORT_URI (required)	The URL of the import service, such as http://servname:21079/webstudio/lineage/. The URL must end with webstudio/lineage/, including the trailing slash.
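Because a missing /lineage/ segment or trailing slash is easy to overlook, you might check the EXPORT_URI value before you run the job. The following is a minimal sketch. The function name check_export_uri is hypothetical, and accepting https in addition to http is an assumption, because this topic shows only an http example.

```python
from urllib.parse import urlparse


def check_export_uri(uri):
    """Return a list of problems found in an EXPORT_URI value (empty if none)."""
    problems = []
    parts = urlparse(uri)
    # Assumption: https is accepted as well as http.
    if parts.scheme not in ("http", "https"):
        problems.append("URI should start with http:// or https://")
    if not parts.netloc:
        problems.append("URI has no host name")
    # The path must end with webstudio/lineage/, including the trailing slash.
    if not parts.path.endswith("/webstudio/lineage/"):
        problems.append("path should end with /webstudio/lineage/")
    return problems
```

An empty list means that the value has the expected shape. It does not confirm that the import service is reachable.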

Note that the pjExportLineage job expects all user names and passwords to be Base64 encoded so that plain text credentials are not saved and exposed. You can encode your credentials with the following Perl command:

perl -e "use MIME::Base64; print encode_base64('your user name or password here');"
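If Perl is not available, the same encoding can be sketched in Python with the standard base64 module. The function name encode_credential is hypothetical, and the sketch assumes that the credential text should be treated as UTF-8 before it is encoded.

```python
import base64


def encode_credential(text):
    """Return the Base64 text for a user name or password."""
    # Assumption: the credential is encoded as UTF-8 bytes before Base64 encoding.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


print(encode_credential("your user name or password here"))
```

Base64 is reversible, so handle the encoded values with the same care as the original credentials.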

You can also find Internet websites that will accept a text string and return a Base64 encoded string. The following display shows a complete set of variables:

Click Run. This test run actually exports the lineage data from the Data Management Server to DataFlux Web Studio.
Verify that the lineage data is imported into DataFlux Web Studio. For instructions, see the "Import Relationship Data" topic in the Business Data Network section of the DataFlux Web Studio Help.

If the job fails to run, you might need to increase the connectionTimeout variable for the Web Service node. For more information, see What Can I Do About Time-Out Errors in Data Jobs with the Web Service Node or the HTTP Request Node?

Schedule and Run the Job

You can schedule and run the pjExportLineage job in the scheduling application of your choice. The run command uses the following pattern:

path_to/dmpexec -j path_to/pjExportLineage.djf -i DEBUG_FILE=... -i TMPDIR=... -i EXPORT_USERNAME=... -i EXPORT_PASSWORD=... -i EXPORT_URI=... -o BASE/REPOS_FILE_ROOT=...

The variables that you enter are the same as those listed in the table in the Perform a Test Run of the pjExportLineage Job section (except -o BASE/REPOS_FILE_ROOT=). For information about using the dmpexec command to run the job, see Running Jobs from the Command Line.
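When a scheduler calls a wrapper script rather than dmpexec directly, the command pattern above can be assembled as an argument list. The following is a minimal sketch. The function names build_dmpexec_command and run_export are hypothetical, and the sketch uses only the -j, -i, and -o options that appear in the pattern above.

```python
import subprocess


def build_dmpexec_command(dmpexec, job_file, inputs, repos_file_root):
    """Build the dmpexec argument list for the pjExportLineage job."""
    command = [dmpexec, "-j", job_file]
    for name, value in inputs.items():
        # Each job variable is passed as its own -i NAME=VALUE pair.
        command += ["-i", f"{name}={value}"]
    command += ["-o", f"BASE/REPOS_FILE_ROOT={repos_file_root}"]
    return command


def run_export(command):
    """Run the assembled command and return its exit code."""
    return subprocess.run(command).returncode
```

Passing the arguments as a list avoids shell quoting problems when a value contains spaces. Supply your own paths and Base64 encoded credentials in the inputs dictionary, using the variable names from the table in the test run section.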

Note that you can also set variables in the macros file to correspond to these variables. Then you can simply add the macros file to the dmpexec command with the -c filename option. For general information about using macro variables, see Using Macro Variables. The macro variables that correspond to the pjExportLineage job variables are listed in the following table.

Variable	Description
LINEAGE/DEBUG_FILE	This macro variable corresponds to the DEBUG_FILE variable in the pjExportLineage job.
LINEAGE/TMPDIR	This macro variable corresponds to the TMPDIR variable in the pjExportLineage job.
LINEAGE/EXPORT_USERNAME	This macro variable corresponds to the EXPORT_USERNAME variable in the pjExportLineage job.
LINEAGE/EXPORT_PASSWORD	This macro variable corresponds to the EXPORT_PASSWORD variable in the pjExportLineage job.
LINEAGE/EXPORT_URI	This macro variable corresponds to the EXPORT_URI variable in the pjExportLineage job.
BASE/REPOS_FILE_ROOT (required)	This macro variable specifies the root for the repository on the Data Management Server.
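The mapping in the table above can be sketched as a small helper that turns job variables into macro entries. This is an illustration only. The function name macro_lines is hypothetical, and the NAME = VALUE line layout is an assumption about the macros file syntax. See Using Macro Variables for the authoritative format.

```python
def macro_lines(job_variables, repos_file_root):
    """Return macro entries for the pjExportLineage job variables."""
    # Each job variable NAME maps to the macro variable LINEAGE/NAME.
    # Assumption: the macros file uses one NAME = VALUE entry per line.
    lines = [f"LINEAGE/{name} = {value}" for name, value in job_variables.items()]
    lines.append(f"BASE/REPOS_FILE_ROOT = {repos_file_root}")
    return lines
```

For example, passing a dictionary that contains EXPORT_URI produces an entry for LINEAGE/EXPORT_URI, followed by the BASE/REPOS_FILE_ROOT entry.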

The job will export the data to the DataFlux Web Studio implementation specified in the EXPORT_URI variable or the LINEAGE/EXPORT_URI macro variable. See the "Importing Relationship Data and Reviewing Relationships" topic in the "Business Data Network" section in the DataFlux Web Studio Help for information about reviewing and using the lineage data.

Avoid References to Outdated Lineage

Objects in the DataFlux Web Studio lineage database are not automatically removed when an object is deleted, renamed, or has its relationships severed in its native application. This problem can also occur when you update your Web Studio installation to a newer version. When lineage that has changed since the first import is imported again, some of the outdated objects and relationships can remain in the lineage database.

For example, suppose that you export a Data Management Server repository that contains job A. Job A, in turn, contains an external file named old.text. Later, you rename the external file to new.text and export that same repository a second time. Because objects in the lineage database are not automatically deleted, both old.text and new.text are contained in the lineage database. Therefore, the relationships of both the current external file new.text and the deleted external file old.text to job A are displayed in the relationship diagram, even though old.text no longer exists in the Data Management Server repository.
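The behavior in this example can be modeled with a toy store that only adds relationships on import. This is a conceptual sketch, not the actual lineage database: the LineageStore class and its methods are hypothetical and exist only to show why a stale relationship survives a second export and why clearing first removes it.

```python
class LineageStore:
    """Toy model of a lineage database that never removes relationships on import."""

    def __init__(self):
        self.relationships = set()

    def import_export(self, exported):
        # An import adds or refreshes relationships; nothing is deleted.
        self.relationships |= set(exported)

    def clear(self):
        # Models clearing all existing lineage before a new export.
        self.relationships.clear()


store = LineageStore()
store.import_export({("job A", "old.text")})   # first export
store.import_export({("job A", "new.text")})   # export after the rename
stale = sorted(store.relationships)            # both files are still related to job A

store.clear()
store.import_export({("job A", "new.text")})   # re-export after clearing
fresh = sorted(store.relationships)            # only new.text remains
```

In this model, stale contains the relationships for both old.text and new.text, while fresh contains only the relationship for new.text.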

To avoid this problem, perform the following steps to return to a normalized state between each scheduled run of the pjExportLineage job:

Right-click and run the L_ClearAllLineageData batch job to clear the existing Business Data Network lineage from DataFlux Web Studio. This job is located under the Web Studio Server directory in the Data Management Servers riser in DataFlux Data Management Studio. Navigate to the Lineage directory under the Batch Jobs directory.
Right-click and run the BDG_UpdateAllLineage batch job to restore the cleared BDN lineage. This job is located under the Web Studio Server directory in the Data Management Servers riser in DataFlux Data Management Studio. Navigate to the BusinessData directory under the Batch Jobs directory.
Re-export the latest data from the originating application (DataFlux Data Management Studio, SAS Data Integration Studio, or a supported third-party application). Note that these steps clear all of the lineage data in DataFlux Web Studio from all sources. You must re-export from all sources to rebuild your lineage data completely.
