Reading Missing Values from a Data Set

Goal

You want your application to test for missing values.
Note: The OLE DB specification supports only one type of missing value for each data type. Consequently, the SAS providers support only missing numeric values and missing character values.
This recipe applies to the local, SAS/SHARE, IOM, and Base SAS providers. This recipe includes sample code for ADO and OLE DB.

ADO Implementation

How Missing Values Are Represented

ADO represents missing numeric values as a Variant data type with a value of Null, which means that the IsNull function can be used to test for missing values. If a numeric field value in a record is missing, then IsNull returns a true value for that field.
To test for missing character values, you compare trimmed character values in the data set against the empty string " ".
Note: Schema rowsets return missing character values as Null values in order to provide greater interoperability with third-party clients such as Microsoft Excel, The IsNull function also returns true for missing character values in the rowsets. Always check the type of the column to determine whether there is a missing character value.
Representation of missing character values depends on whether blank padding is preserved (see Padding Character Data with Blanks).
  • If blank padding is preserved, then missing character values are represented as a string of spaces with a length equal to the width of the particular column.
  • If blank padding is not preserved, then missing character values are represented as an empty (zero length) string.
Note: If you are not sure about the state of the "SAS Preserve Trailing Blanks" property, you can use the trim function and compare the value against the empty string " ".

Sample ADO Code That Tests for Missing Values

The following Visual Basic code tests every character and numeric value in a data set to determine whether a value is missing or not.
' obConnection is an open Connection object.
Dim obRecordset As New ADODB.Recordset
Dim field As Field

obRecordset.Open "sasuser.shoes", obConnection, adOpenDynamic, adLockPessimistic, adCmdTableDirect
obRecordset.MoveFirst
While Not obRecordset.EOF
 For Each field In obRecordset.Fields
 If (field.Type = adVarChar Or field.Type = adVarWChar) And _
 (IsNull(field.Value) Or Trim(field.Value) = "") Then
 Debug.Print "missing character value found in column " & field.Name
 ElseIf IsNull(field.Value) Then
 Debug.Print "missing numeric value found in column " & field.Name
 Else
 Debug.Print "no missing value found"
 End If
 Next field
 obRecordset.MoveNext
Wend
obRecordset.Close

Sample Code That Traps Errors Caused by Missing Numeric Values (ADO)

Because missing numeric values are represented as a Variant data type with a value of Null instead of as a numeric data type, you cannot use missing numeric values in operations that require numeric data types. If you do, your application will produce a run-time error on the first missing numeric value that it encounters.
To trap the error and display a message to the user, use the Visual Basic On Error statement. The following Visual Basic code shows how this task is done:
Function TotalStores() as Integer
	Dim Total as Integer
	Total = 0
	On Error GoTo Missing
	While Not obRecordset.EOF
 		Total = Total + obRecordset!Stores
 		obRecordset.MoveNext
	Wend
	On Error GoTo 0
	TotalStores = Total
	Exit Function

Missing:
	If Err.Number = 94 Then
 		MsgBox "Missing value encountered"
	Else
 		MsgBox "Run-time error: " & Err.Number & vbNewLine & Err.description
	End If
	Resume Next
End Function

OLE DB Implementation

How Missing Values Are Represented

When you write to the OLE DB interface, use the IRowset::GetData method to test for missing numeric values in a data set. If a numeric field value in a column in the client's row data buffer is missing, then the dwStatus value is set to DBSTATUS_S_ISNULL. If the client wishes to receive the appropriate SAS missing value indicator, then the client should set the DBPROP_SAS_MISSING_VALUES property to True. The possible missing value indicator values are as follows:
  • NumericMissingIgnore(0)
  • NumericMissingUnderscore(16)
  • NumericMissingDot(32)
  • NumericMissingA(33)
  • NumericMissingZ(58)
  • NotMissing(127)
If a character field value in a column in the client's row data buffer is missing, then the dwLength value is set to zero (0) or the bData value contains only spaces.
Note: Schema rowsets will return missing character values as NULL values in order to provide greater interoperability with third-party clients such as Microsoft Excel. The dwStatus flag will also be set to DBSTATUS_S_ISNULL for missing character values in the rowsets. Always check the column type to determine whether the value is a missing character or a missing numeric.
The representation of missing character values depends on whether blank padding is preserved. In the case of OLE DB, the DBPROP_SAS_BLANKPADDING property controls this behavior.
  • If blank padding is not preserved (DBPROP_SAS_BLANKPADDING is set to VARIANT_FALSE), then the dwLength member of the column in the client's row data buffer will be set to zero (which represents the width of the empty column).
  • If blank padding is preserved (DBPROP_SAS_BLANKPADDING is set to VARIANT_TRUE), then the bData member of the column in the client's row data buffer will be a string of spaces.

Sample Code That Tests for Missing Values (OLE DB)

The following C++ code shows how to test for missing values in a single observation that is returned from IRowset::GetData. The code assumes that you have completed the following tasks:
  1. called IColumnsInfo::GetColumnInfo to get a DBCOLUMNINFO array and to determine the number of columns in the data set
  2. used the DBCOLUMNINFO array to create a DBBINDING array
  3. called IAccessor::CreateAccessor to request an accessor
  4. called IAccessor::GetBindings to recover the bindings for the accessor that you created
  5. used the bindings to determine how much space to allocate for the row data
  6. called IRowsetLocate::GetRowsAt or IRowset::GetNextRows to get an HROW
  7. used the HROW and the accessor in the call to GetData
Note: The exact steps that show how to get a row of data from an OLE DB provider can be found in the OLE DB Programmer’s Reference and Data Access SDK.
hr = pIRowset->GetData(hRow, hAccessor, pRowData)
if (FAILED(hr)){ /* an error occurred */ }
for (DBORDINAL i = 0; i < cNumColumns; i++)
{
	pCol = (COLUMNDATA*)(pRowData + pBindings[i].obLength);
	if (pBindings[i].wType == DBTYPE_R8)
	{
		if (pCol->dwStatus == DBSTATUS_S_ISNULL)
		{
			//This is a missing numeric value
		}
		else
		{
			//This is a non-missing numeric value
		}
	}
	else if (pBindings[i].wType == DBTYPE_STR || pBindings[i].wType == DBTYPE_WSTR)
	{
		DWORD j;
		// Loop through each character in the data member to
		// see if it contains a string of blanks or if it has a length of zero 
		for (j = 0;
			j < pCol->dwLength &&
			(char*)pCol->bData[j] &&
			(char*)pCol->bData[j] == ' ';
			j++);
		if (j == pCol->dwLength)
		{
			//This is a missing character value
		}
		else 
		{
			//This is a non-missing character value
		}
	}
	else
	{
		// An unexpected type was returned
	}
}