Data Masking

Overview

Data masking is a method of hiding sensitive information within data sources. The purpose of data masking is to protect the original data by using a functional substitute in situations where the audience is not privileged to access the original data. The primary focus of data masking is to protect data that is classified as sensitive, or as personally identifiable information (PII).
Data masking with SAS Federation Server is implemented using a set of rules (rule types) and arguments that are run with a system function, SYSCAT.DM.MASK. Data masking rules consist of individual rule types that define the specific masking action or algorithm to apply to the data. The rules in effect are as follows:
  • ENCRYPT and DECRYPT
  • HASH
These rule types are valid for use in FedSQL queries and are applied over literal values or individual columns to hide personally identifiable information.

About the SYSCAT.DM.MASK Function

The SYSCAT.DM.MASK function accepts defaults configured as package options in addition to the various arguments associated with each rule type. For example, the KEY argument for the ENCRYPT rule type defaults to the value configured as the ENCRYPT_KEY package option. To configure default masking parameters as package options, use the ALTER SERVER DDL command. The following example sets a default encryption key used by the ENCRYPT and HASH rule types:
ALTER SERVER {options PACKAGE(name 'DM',
     ENCRYPT_KEY ’212e8ba6b7f84796a87a985d54277f2f’)}
Configured parameters revert to a static default value if they are dropped.
Note: Column data encrypted with the static default key cannot be reversed with the DECRYPT rule type.
Use the SYSCAT.DM.MASK function with the specified rule types and arguments to mask a value containing PII. The rule type argument must be a string constant. Argument names are not case sensitive. Here is the syntax:
SYSCAT.DM.MASK( 'rule-type', value [, rule-arguments])
Here is the syntax for the rule arguments:
[, 'rule-arg-name1', 'rule-arg-value1',
[, 'rule-arg-name2', 'rule-arg-value2', ...]] )
Data Masking Arguments and Descriptions
Argument
Description
rule-type
Name of the rule type — ENCRYPT, DECRYPT, or HASH.
value
Data value that requires masking (for example, a column reference or other value expression).
rule-arg-name1
Name of the first masking rule argument.
rule-arg-value1
Value of the first masking rule argument.
rule-arg-name2, ...
Name of the second and subsequent position masking rule arguments.
rule-arg-value2
Value of the second and subsequent position masking rule arguments.
Here is an example of data masking that uses the HASH rule-type. This rule masks the ‘LastName’ column in the HR.EMPLOYEES table using an HMAC-MD5 hash and aliases the result as LN:
select SYSCAT.DM.MASK('HASH', "LastName", 
                      'alg', 'MD5', 
                      'key', 'abc123!' ) as "LN"
 from HR.EMPLOYEES

Data Masking Rule Types and Arguments

ENCRYPT / DECRYPT

Here is the syntax for ENCRYPT and DECRYPT:
SYSCAT.DM.MASK( 'ENCRYPT', value [, rule-arguments ] ),
SYSCAT.DM.MASK( 'DECRYPT', value [, rule-rguments ] )
The ENCRYPT rule encrypts a single value using symmetric key encryption. Encrypted values cannot be decrypted if no KEY argument is specified and the ENCRYPT_KEY package configuration is not configured.
The DECRYPT rule decrypts a previously encrypted value using symmetric key encryption. The DECRYPT rule returns NULL if a KEY argument is not specified and the ENCRYPT_KEY package configuration is not configured.
Use HASH if uniqueness or reversibility (decryption) is not a requirement.
Unique Results are unique if the input values are unique.
Deterministic Results are deterministic if DETERMINISTIC YES is specified.
Reversible Possible if key is poor quality or divulged.

Arguments

ALG

Argument type: Required. Case-insensitive string constant.

Specifies the algorithm name which is one of the following:
AES/FIPS* AES/FIPS: RSA licensed FIPS compliant AES encryption.
AES* RSA-licensed AES encryption.
SAS002 SAS Proprietary.
BASE64 Base 64 encoding.
SAS004 Alias for AES/FIPS.
SAS003 Alias for AES.
SAS001 Alias for BASE64.
Note: *AES/FIPS and AES require installation of SAS/SECURE software.

KEY

Argument type: Case-insensitive string constant.

Symmetric encryption/decryption key. The key defaults to the ENCRYPT_KEY package configuration parameter or, for ENCRYPT only, the static default when none is configured.
Exception: Not valid with the SAS002 algorithm, which does not accept an encryption key.

DETERMINISTIC

Argument type: Case-insensitive Boolean string constant or 1 or 0.

Use with ENCRYPT only.
Boolean string constant values must contain one of {YES, TRUE, ON, 1, NO, FALSE, OFF, 0 }.
Use this option when deterministic output is required. Controls whether encrypted value is DETERMINISTIC. The default value is FALSE.
Exception: Not valid with the SAS002 algorithm, which does not support a custom key.

EXPAND_PREC

Argument type: Case-insensitive Boolean string constant or 1 or 0.

Use with ENCRYPT only.
Boolean string constant values must contain one of {YES, TRUE, ON, 1, NO, FALSE, OFF, 0 }.
This option causes the precision of the output value to accommodate the possibility of data bloat when using the specified encryption or encoding method. EXPAND_PREC is active by default. If the encrypted value does not fit in the column, this option returns an empty string for VARCHARor NVARCHAR, or all-blank for CHAR or NCHAR.
  • an empty string for VARCHARor NVARCHAR.
  • a completely padded blank for CHAR or NCHAR.

CASE

Argument type: Case-insensitive string constant.

Use with ENCRYPT only.
Controls case in output. Does not apply to the BASE64 algorithm, which produces upper and lower cased characters.
U[PPER] Use upper case hexits A-F.
L[OWER] Use lower case hexits a-f.

STRIP

Argument type: Case-insensitive string constant.

Use with ENCRYPT only.
Specifies whether to strip trailing whitespace from the input value prior to encryption. By default, values are not stripped.
Valid values are BLANKS, UNICODESP, UNICODESPACE, ANY, ALL, WS. BLANKS specifies to strip ASCII blank (0x20) characters only. UNICODESP and UNICODESPACE specify to strip Unicode whitespace. ANY, ALL, and WS specify to strip Unicode whitespace as well as certain format control characters.

HASH

Syntax
SYSCAT.DM.MASK( 'HASH', value [, rule-arguments ] ),
SYSCAT.DM.MASK( 'DECRYPT', value [, rule-rguments ] )
The HASH rule hashes a single value into a fixed-length hash digest or HMAC string and is not reversible.
Unique Results may be unique if the input values are unique.
Deterministic Yes
Reversible No
Algorithm
Value Type
Return Type
MD5
CHAR or VARCHAR
CHAR(32)
WCHAR or WVARCHAR
WCHAR(32)
other
BINARY(16)
SHA256
CHAR or VARCHAR
CHAR(64)
WCHAR or WVARCHAR
WCHAR(64)
other
BINARY(32)

Arguments

ALG

Argument type: Case-insensitive string constant.

Required. Specifies the algorithm name:
MD5 Robert Rivest’s 128–bit algorithm (1991).
SHA256* NSA’S 256–bit algorithm (2001). SHA256 is the default.
Note: *SHA256 requires installation of SAS/SECURE software.

CASE

Argument type: Case-insensitive string constant.

Controls case in output. Does not apply to the BASE64 algorithm, which produces upper and lower cased characters.
U[PPER] Use upper case hexits A-F.
L[OWER] Use lower case hexits a-f.

KEY

Argument type: Case-insensitive string constant.

By specifying a KEY argument, or defaulting to the ENCRYPT_KEY parameter configured in the package, a ‘hash message authentication code’ (HMAC) that complies with RFC 2104 (http://tools.ietf.org/pdf/rfc2104) is computed. Otherwise, the specified raw digest is computed directly.

STRIP

Argument type: Case-insensitive string constant.

Use with ENCRYPT only.
Specifies whether to strip trailing whitespace from the input value prior to encryption. By default, values are not stripped.
Valid values are BLANKS, UNICODESP, UNICODESPACE, ANY, ALL, WS. BLANKS specifies to strip ASCII blank (0x20) characters only. UNICODESP and UNICODESPACE specify to strip Unicode whitespace. ANY, ALL, and WS specify to strip Unicode whitespace as well as certain format control characters.