Data Preparation and Data Quality for Analytics
Business Knowledge Series course
Presented by Gerhard Svolba, Ph.D., a product manager and pre-sales consultant at SAS Institute Inc. in Austria, where he specializes in analytics and customer intelligence. His project experience ranges from conceptual considerations (both business and technical) to data preparation and analytic modeling. He is responsible for SAS Analytic Solutions (including Enterprise Miner, Forecast Server, Model Manager, STAT, ETS, OR, and IML) and selected Business Solutions (Demand Forecasting, Fraud Detection). He has been involved in analytic projects across industries, on topics including customer behaviour and analytical CRM, loyalty cards and sales analysis, credit scoring and validation of scoring models, demand forecasting, supply network optimisation, and transport optimisation.
This two-day masterclass teaches you how to build powerful data marts for analytical modeling and data science efficiently. You learn about the ecosystem for analytic data preparation and the role of the data scientist in this environment. The course discusses the most commonly used analytic data structures and their suitability for particular analytic business questions. You receive guidelines for creating important derived variables that increase the predictive power of your models. Data quality is discussed from an analytical viewpoint: the course covers the data quality criteria relevant for analytics and shows how analytical methods can be used to profile and improve the quality status of the data. Because not all data quality problems can be corrected, the course also presents results of simulation studies that quantify the consequences of poor data quality. These results support a better-informed decision on whether to proceed with inferior data.
Participants should have a basic understanding of statistical analysis and, where possible, some experience in data mining, statistics, or forecasting. A basic understanding of data tables with rows and columns is expected. Programming skills in SAS or another statistical programming language are helpful but not mandatory.
The Data Preparation for Analytics - Ecosystem