
DataFlux Data Management Studio

Frequently Asked Questions (FAQ)

The following questions and answers are designed to assist you when working with Data Management Studio. If you do not find your answer, please contact DataFlux Technical Support.

General

I cannot save global options for DataFlux Data Management Studio under Microsoft Vista or Windows Server 2008. Why?

DataFlux Data Management Studio saves global options to a user.config file that is hidden by default under Microsoft Vista and Windows Server 2008. You must un-hide this file in order to save global options. The physical path to the file is as follows:

C:\Documents and Settings\<UserID>\Local Settings\Application Data\DataFlux\ProDesigner.vshost.exe_Url_<some_hash_code>

Are there any options that I need to set for the DataFlux Data Management Server for SAS? 

If you are using the DataFlux Data Management Server for SAS, then select the following global option: 

  1. From the main Data Management Studio window, select Tools > Data Management Studio Options. The global options dialog displays.
  2. Select the Data Management Server section.
  3. Select the checkbox: Connect to Data Management Server for SAS.
  4. Click OK to save your changes.
Why does my screen not repaint when I am prompted to log into a table with ODBC?

If you access a table with ODBC, you might be prompted to log in to the database. If your screen does not repaint properly, try setting the Show window contents while dragging option for Microsoft Windows. Consult the documentation for your version of Windows for details about setting this option.

Can Data Management Studio distinguish between file names that differ only in case?

Not in the current release. Data Management Studio will not permit you to add or update file names that differ in case only. This means, for example, that you cannot give names to file-based objects – such as jobs and profiles – that differ in case only, such as Job1 and JOB1.

How can I obtain a copy of the SAS XML Mapper?

You can download a free copy of the SAS XML Mapper, along with its documentation, from http://support.sas.com/demosdownloads/setupcat.jsp?cat=Base+SAS+Software. The XML Mapper works with the XML input, XML output, XML column input, and XML column output nodes. The SAS 9.2 XML Mapper creates version 1.9 XML Maps, and the SAS 9.3 XML Mapper creates version 2.1 XML Maps.

What version of Data Management Server must I use with Data Management Studio version 2.3 and later?

If you plan to access a Data Management Server with Data Management Studio 2.3, then the server must be version 2.3 or later. If you must access a Data Management Server that is version 2.2 or earlier, then you must use a 2.2 or earlier version of Data Management Studio.

Data Connections

See Data Connection Usage Notes.

Jobs, Profiles, Data Explorations

What Can I Do About Time-Out Errors in Data Jobs with the Web Service Node or the HTTP Request Node?

If you run a data job that includes either the Web Service node or the HTTP Request node, and you get time-out errors for either of these nodes, then you can do the following to increase the time-out value:

To change the default time-out value for a specific job that contains the Web Service node or the HTTP Request node, use the WSCP_CONFIG advanced property for these nodes to specify the path to a custom time-out configuration file.

To change the default time-out value for all jobs that use a Web Service node or an HTTP Request node, use the WEBSERVICE/CONFIG_FILE option in a system configuration file to specify the path to a custom time-out configuration file. See also Format for a Custom Time-out Configuration File.
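For the system-wide approach, the entry in a system configuration file might look like the following sketch (the file path is a hypothetical example):

```
# Point all Web Service and HTTP Request nodes at a custom
# time-out configuration file (hypothetical path).
WEBSERVICE/CONFIG_FILE = C:\DataFlux\etc\webservice_timeout.cfg
```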

Where do Data Management Studio jobs write temporary files?

Any temporary files that are generated by the job, such as sort files, are saved to the location that is specified in the BASE/TEMP option in a DataFlux configuration file, if this option has been set. If the BASE/TEMP option has not been set, temporary files are written to the temporary file location that is specified for the operating system. To see whether the BASE/TEMP option has been set, select Tools > Data Management Studio Options > Job > Advanced.
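If you want to control the location yourself, a minimal sketch of the BASE/TEMP entry in a DataFlux configuration file follows (the path is a hypothetical example):

```
# Write temporary job files (such as sort files) to a dedicated
# location instead of the operating system default.
BASE/TEMP = D:\dftemp
```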

What is the maximum length for character variables (such as column names) in DataFlux Data Management Studio?

In data jobs, Data Input nodes and Data Output nodes support very long character fields for SAS data. They work successfully with 32K (32,767-byte) fields, which is the maximum length for character fields in SAS data sets. QKB-related nodes process only the first 256 characters and ignore the rest. Expression node string functions, including the mid() and len() functions, should work with long fields. The 256-character limitation applies to regular expressions and QKB-related functions.
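As an illustration, an Expression node can truncate a long field before it reaches a QKB-related step. This is a sketch in the DataFlux Expression Engine Language; the field names are hypothetical:

```
// Copy only the first 256 characters of a long input field,
// since QKB-related processing ignores anything beyond that.
string(256) short_text
short_text = mid(long_text, 1, 256)

// len() still reports the full length of the original field.
integer full_length
full_length = len(long_text)
```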

In profiles, report metrics such as Data Type, Data Length, Unique Count, and Frequency Distribution are correct for strings up to 32K in length. Pattern Frequency Distribution only uses the first 254 characters instead of 256.

How can I specify Quality Knowledge Base options for profiles and data explorations?

Configuration options for the QKB are set in the Quality Knowledge Base Engine section of the app.cfg file. For example, the QKB/PATH option enables you to specify the path to the QKB. The QKB/ON_DEMAND option determines whether the QKB is loaded on demand or all at once. By default, this option is set to YES. The QKB/ALLOW_INCOMPAT option specifies how newer QKB definitions are handled. By default, this option is set to NO. You might want to change this option to YES if a profile or data exploration fails due to an incompatible (newer) QKB definition. The QKB/COMPATVER option enables you to specify the QKB version. Finally, the QKB/SURFACEALL option determines whether all parse definitions are surfaced.
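As a sketch, the relevant section of app.cfg might look like the following (the path and version values are hypothetical examples):

```
# Quality Knowledge Base Engine section of app.cfg (example values).
QKB/PATH = C:\ProgramData\DataFlux\QKB\CI\2011A
QKB/ON_DEMAND = YES
QKB/ALLOW_INCOMPAT = NO
QKB/COMPATVER = dfpower82
QKB/SURFACEALL = NO
```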

You can use Data Management Studio to change the QKB/ALLOW_INCOMPAT option. Click Tools in the main menu and select Options to display the Data Management Studio Options dialog. Click the General section of the dialog and update the checkbox for Allow use of incompatible Quality Knowledge Base definitions.

To change other QKB options, you must edit the app.cfg file. See the "Configuration" section of the Data Management Studio Installation and Configuration Guide.

How are SAS Data Types Converted When DataFlux Software Reads or Writes SAS Data?

DataFlux software and SAS data sets support different data types. Accordingly, automatic data-type conversions take place when Data Management Studio software reads or writes SAS data sets. Also, nulls and missing values are converted to other values. These changes can impact features that depend on particular data types. For example, when a profile reads a SAS data set, SAS fields with a format that applies to datetime values are reported as datetime, fields with a format that applies to time values are reported as time, and fields with a format that applies to date values are reported as date. As a result, the profile does not calculate some metrics, such as Blank Count or Maximum Length, for those fields.

The following data-type conversions are made automatically when DataFlux software, such as a data job or a profile, reads SAS data.

Nulls and missing values will be converted to other values, as follows.

The following data-type conversions are made automatically when a Data Management Studio data job writes SAS data.

DataFlux Input Data Type    SAS Output Data Type
                            Type    Length    Format
Boolean                     num     8
date                        num     8         datetime19.2
integer                     num     8
real                        num     8
string                      char    255
Why are my fields not propagating automatically even though I set Tools > Options > Job > Output Fields to "All"?

Some data job nodes require you to manually add all of the output fields before they can be propagated. These nodes include Data Sorting, Data Union, and Cluster Analysis, among others. Manually add all available fields by selecting the double-right arrow in the Output fields section of the node's Properties window. Once you manually add the fields, they propagate correctly.

I get an error that says that the “ASE has run out of LOCKS.” What does this mean and how can I fix it?

In some cases, when you run a profile that is stored in a Sybase repository that uses a 32-bit Sybase Wire Protocol DSN, you might get an error that says "DataFlux [name of the ODBC Sybase Wire Protocol driver] [name of the SQL Server] ASE has run out of LOCKS." If you get this error, one possible remedy is to raise the maximum lock count in Sybase. Another possible remedy is to raise the commit interval in Data Management Studio. To do that, open the affected profile and select Tools > Profile Options. On the General tab, select the Commit every N rows checkbox and specify a number, such as 10.

I have a job that calls a fork, which calls 5 data jobs, and then the process job calls a parallel iterator that tries to use 9 threads. Is there a maximum number of available threads?

You can set the WFE/ENGINE_THREAD_LIMIT configuration option to limit the number of engine threads within the application. Otherwise, the default is 0, which means unbounded, and the application defers to the system for the thread pool limits.
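For example, a system configuration file entry that caps the engine thread pool might look like this sketch (the limit shown is an arbitrary example):

```
# Limit the number of engine threads; the default of 0 means unbounded,
# deferring to the system for the thread pool limits.
WFE/ENGINE_THREAD_LIMIT = 8
```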

When I run a profile job, it fails when writing long floating point numbers to Teradata float columns.

The reason for this behavior is that the float field is being bound as a string, which is typical behavior when the type of parameter can't be determined. The workaround is to turn on "Enable Extended Statement Information" in the Options tab of the Teradata driver. This will enable the driver to determine the parameter type and thus bind as float. Restart Data Management Studio for the changes to take effect.

How can I make the SAS XML Mapper work with the XML data job nodes?

You need to disable the auto-generation feature, which is enabled by default. You also need to make sure that you generate version 1.9 or version 2.1 maps. The default map version is 1.2.

Do Data Management Studio jobs support Extract, Load, and Transform (ELT)?

Yes, Extract, Load, and Transform (ELT) is supported through nodes in which you write database-specific SQL. This SQL is then pushed to the database.

In a data job, you can use the SQL Execute and Parameterized SQL Query nodes. In a process job, you can use the Create Table, Insert Rows, SQL Execute, and Update Rows nodes. You could also consider the SQL Execute node in process jobs to be a general-purpose node that enables you to build any SQL construct. The other SQL-based process nodes provide templates for more specific ELT tasks.
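To illustrate the ELT pattern, the SQL that you enter in an SQL Execute node is passed to the database and runs there, so it can use database-specific, set-based statements. The table and column names below are hypothetical:

```sql
-- Executed by the database, not by the DataFlux engine:
-- copy cleansed rows into a target table in one set-based step.
INSERT INTO target_customers (id, name, city)
SELECT id, UPPER(name), city
FROM   staging_customers
WHERE  city IS NOT NULL;
```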

Do the SQL nodes in Data Management Studio support multiple database formats in one query? One query would access tables in Oracle, DB2, and SAS, for example?

No. If you want to access tables in multiple database formats, use a DataFlux Federation Server data connection.

Can I use multiple Address Verification (World) nodes in a data job?

Yes, but it is better to use a single Address Verification (World 2) node instead. If you use multiple Address Verification (World) nodes in a job, all of the nodes use the same country code as the first Address Verification (World) node in the flow.

What is the maximum number of lines that can be entered into the Query Text Window in the SQL Execute node?

When you click the Add button in the properties window for the SQL Execute node, the Query Text window displays. This window has a limit of approximately 635 lines of SQL code.

When running an address verification job on Data Management Server on a UNIX platform, I received a memory error. How can I fix this?

If you are running an address verification job on Data Management Server on a UNIX platform, you might have run out of memory and received the following error: 'Job terminated due to error in one or more nodes: Verification Plugin Out of memory.' The VERIFY/PRELOAD option in the application configuration (app.cfg) file requires a large amount of memory when it is active. If the VERIFY/PRELOAD=ALL option is active and a substantial amount of memory has not been allocated for this configuration, allocate more memory to prevent memory errors on UNIX Data Management Servers. Note that this applies only to US and Canadian Address Verification nodes.
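For reference, the relevant app.cfg entry is sketched below; when the option is set to ALL, the verification data is preloaded into memory, which is why sufficient memory must be allocated on the server:

```
# Preloading all address verification data requires a large amount
# of memory on the server.
VERIFY/PRELOAD = ALL
```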

How can I specify the size of the frequency distribution memory cache for a profile?

The properties window for a profile enables you to specify the size of the frequency distribution memory cache for the profile. To do so, open a profile, and then select Tools > Profile Options. Specify the desired cache size. Note that in the current release, the value in the Frequency Distribution Memory Cache field is not applied to the profile until you click OK in the profile options window. Even if the field already contains the correct custom value, you must click OK to apply that value to the profile.

Repositories

See Repository Usage Notes.

Quality Knowledge Bases (QKBs)

See Quality Knowledge Base Usage Notes.

Documentation Feedback: yourturn@sas.com
Note: Always include the Doc ID when providing documentation feedback.

Doc ID: dfU_FAQ.html