SAS Data Curation Professional consists of four courses. First, you receive an introduction to data curation and an overview of the prerequisite knowledge. Next, you discover how to access your data from a variety of sources, create processes to manage and transform data, and ensure the reliability and consistency of your data. Subsequent materials discuss the Hadoop environment, Apache Hive, and Apache Pig, as well as various SAS methods for interacting with Hadoop. Finally, you learn to use additional SAS data management technologies to access, manage, and govern your data.
- Introduce the field of data science and define data curation.
- Identify the components of computing environments.
- Explore the field of data science and the role of data scientists.
- Introduce the roadmap to data curation with SAS.
- Read and write data with SAS/ACCESS technologies.
- Perform extract, transform, and load (ETL) tasks using SAS Data Integration Studio.
- Discover the capabilities of the SAS Quality Knowledge Base.
- Use DataFlux Data Management Studio to understand and improve your data.
- Understand the structure and functionality of the SAS Quality Knowledge Base.
- Access the components of the SAS Quality Knowledge Base programmatically from SAS code.
- Process and prepare structured and unstructured big data for analysis.
- Organize data into a variety of storage formats for the Hadoop Distributed File System (HDFS).
- Use Hive and Pig to query and process data in Hadoop.
- Write SAS code to integrate with Hive and Pig.
- Leverage the SAS DS2 procedure to process data in Hadoop.
- Work with Hadoop data using the point-and-click interface of SAS Data Loader for Hadoop.
- Maintain, configure, and monitor data access from a single point of administration with SAS Federation Server.
- Create a secured virtualized data layer that unifies disparate data sources into FedSQL views.
- Develop SAS Event Stream Processing applications to ingest, process, and analyze streaming data in real time.
- Govern data using SAS Business Data Network.
- View relationships in SAS Lineage.