Understanding Data Federation and Best Practices

Overview

Successful data federation projects require careful preparation and attention in two areas:
  • data model
  • data security
Proper understanding of your organization's data access needs is key to a successful data federation deployment. The data model controls the type of underlying work required at run time to satisfy requests for data. A poorly constructed data model can result in inefficient performance and incorrect results if the data is not well understood. Data security design should consider how your users access the back-end data. Will users access the data under their individual authorizations? Will you establish one or more data owners who act as data access proxies for the end users? There are several options to consider in the design phase so that SAS Federation Server can be properly configured.

Data Model

A good starting place is to identify your data sources, understand the relationships between different sets of data, and derive a data model that meets the needs of your business. This is the same type of background work that typically goes into a data warehousing project, where the end result is often a set of tables or views that are loaded in a data mart. However, unlike data warehousing, with SAS Federation Server, the data does not need to be copied from the source into a data mart. Instead, the data can be fetched from their source locations and processed in real time during the data requests. These operations should be as efficient as possible and require a well-planned data model.
Data caching should be considered when designing the data model. It can provide a vital role in optimizing query performance by pre-executing and storing intermediate results. This can be particularly useful for back-end data sources that have low availability, slow access speeds, expensive access fees, or contain data where real-time values are not required. A data model can consist of a set of user-visible FedSQL views that are dependent on other restricted FedSQL views, which can be dependent on back-end data, such as tables and database views. FedSQL views are very similar to relational database views, except that the data can come from a heterogeneous set of data sources.
Any combination of FedSQL views can be cached, although the views must be definer’s rights views. The cached views can be refreshed on a periodic basis through the scheduler, or refreshed at any time through direct administration SQL commands. If certain data sets will be used frequently or involve complicated SQL operations to return the results, you should consider data caching for those results. Cached results can be joined with uncached data to provide quick responses to user queries.
Another consideration is to use the Memory Data Store (MDS) to store frequently used or temporary data. MDS allows tables to be stored in the memory of the server process. This allows for extremely fast data access performance. These tables must be managed manually and are automatically deleted when the server is shut down. FedSQL can join an MDS table with tables from any other data source, whether they are cached or uncached, including other MDS tables.

Data Security

Overview

In a federated system, data can be acquired from a large variety of data sources, and users might not always have direct access to those systems. Even if they do, administering a large number of database credentials and setting up database privileges so that users can access data with personal credentials can be burdensome. The security capabilities of SAS Federation Server provide some alternatives that can help ease security administration.

Invoker’s Rights Views

If users already have direct access to the back-end systems and you want them to access the data under their individual or shared logins, then you can configure SAS Federation Server to access the data under the rights of the invoking user. This is done by creating invoker’s rights FedSQL views. When these views execute, any connections to back-end data require the invoking user to have proper credentials to access the data. In addition, you should ensure that SAS Federation Server security settings allow invokers to access the required data. For example, if a FedSQL view is defined to select data from an Oracle data source and a DB2 data source, the administrator needs to ensure that proper privileges are assigned to each invoker of the view on the data source, as well as the FedSQL view itself. The privileges are granted to a group, or set of groups for which the invokers are members, instead of to each individual invoker.
Because users must be able to directly access all of their data, you should secure each object for each user. Also, invoker’s rights views need to be sensitive to these security settings on underlying objects, particularly column level security. Consider a view that selects columns C1 and C2 from table T, as shown in the following example:
SELECT C1, C2 FROM T
If table T has been secured through the Federation Server such that User1 does not have SELECT privilege on the table, then when User1 selects from the view, User1 receives an error when attempting to access table T. In this case, you might consider denying SELECT privilege on the view to User1. Furthermore, if User2 has SELECT privilege on table T1 and column C1 but not on column C2, then User2 receives an error when selecting from the view. Assigning column-level security on the view that denies SELECT privilege on column C2 to User2 does not help, because the underlying view definition specifically requires access to column C2 in table T. When creating views that require data from other objects with column-level security, you might consider selecting all columns using an asterisk (*). Here is an example:
SELECT * FROM T
When the Federation Server is configured with SelectStarExpansion=VISIBLE, the * expands all columns for which the invoking user has SELECT privilege. This enables you to create a single view that can be used by all invokers, yet each invoker sees only the columns for which the invoker has SELECT privilege.

Definer’s Rights Views

If your users do not have direct access to the back-end systems, or you want to read objects under the authorizations of a single user, then you can configure SAS Federation Server to access the data under the rights of the defining user. This is accomplished by creating definer’s rights FedSQL views. When these views are executed by a user, connections to back-end data are made under the definer of the view. The definer is actually the owner of the schema that contains the view.
In this security model, you typically deny access to the back-end data to all users except the view definer. Any column-level, table-level, or row-level security will be set on the view itself. For example, if your view is defined as SELECT C1, C2 FROM T, you should ensure that the view definer has full access to columns C1 and C2 in table T.
All other users do not require access to back-end data. And in many cases, if your users are interfacing only with exposed top-level objects of the data model, then you can deny privileges to those users on back-end objects. Individual security settings can then be consolidated on the exposed objects of the data model, which are usually FedSQL views. Then users access the views only, and no other credentials are required. This greatly simplifies the security settings in the Federation Server and reduces administration on all the back-end data sources as well.

Mixed Models

Note that you can use a combination of approaches with both invoker’s and definer’s rights views intermingled. FedSQL views can be nested, and view types are honored. For example, you might create view V1 that is owned by Owner1 and data is accessed through the credentials of Owner1. However, if view V1 selects from view V2, which is another definer’s rights view but owned by Owner2, then all data within view V2 is accessed under the authorizations of Owner2. If view V2 selects from view V3, which is an invoker’s rights view, then any data within view V3 continues to be accessed by Owner2, who is the invoker of the view.

Dropping Objects

SAS Federation Server persists metadata for security settings in its set of system tables. The server attempts to synchronize this metadata with the actual objects that it represents. For example, if the administrator has granted SELECT privilege on table T1 to User1, and subsequently table T1 is dropped, the server will likewise drop its security metadata onto T1. If T1 is subsequently re-created, it will have no security set on the object itself. All security definitions for T1 come from its inheritance objects, including the schema, catalog, data service, and server.
If you prefer that the security metadata is retained on the server, there are a couple suggested possibilities. First, it might be that a job is dropping the table only to re-create it in a subsequent step. This is often the case when refreshing the contents of a table. Rather than dropping and re-creating the table, which might also affect indexes on the table, consider issuing a DELETE FROM TABLE command to delete all the rows from the table. In doing this, the underlying table remains intact, as well as the security metadata stored in the system tables.
A second approach is to use definer’s rights views to assist with individual object security. Often, data in back-end data sources is being manipulated (created and dropped) by a few power users only. These users require CREATE, DROP TABLE, and VIEW privileges. The CREATE privilege set applies only to container objects (schema, catalog, data service, and server) and not individual tables or views. The DROP privilege can be applied at the object level. However, if designed correctly, the DROP privilege can be used from the container object.
Business users are typically the ones requiring table-level and column-level privileges to access the data. If you choose the security model using definer's rights views, then you can place your individual privileges on the views, which do not go away when a back-end table is dropped and re-created. Back-end data can be secured by denying privileges on the container object to business users, but granting SELECT to view owners. An approach similar to this can be used to eliminate the need for table-level, column-level, and row-level security on back-end data source objects.
Last updated: March 6, 2018