Plan

The Plan phase gives you the chance to define the people, processes, and technologies that are used for your data management project. It also gives you time to discover and categorize your data assets. Groceryrama and GreenVillage staff include data scientists, data architects, business users, system administrators, and managers. They need to meet, discuss their data needs, and discover solutions to their problems. These meetings can begin addressing a series of questions that help define the parameters of the project. These questions include the following:
  • People: Who is involved? And for what purpose?
  • Roadmap: Where are we now? Where do we want to go? What obstacles are in our way?
  • Source systems: What data do we need? Where is that data coming from?
  • Business processes: Which business processes are affected? How can better data enhance how the organization operates?
  • Business rules and data definitions: How do we define “customer”? How do we want to optimize procurement and spend?
The answers to these questions guide the collection, organization, enhancement, monitoring, and retirement of your data assets throughout the process. While you do not need all the answers at the beginning, you need a solid plan for how to proceed and what the ultimate success indicators will be. These discussions help identify the business rules and data definitions that guide the data management project. For example, you need clear guidance on the reason for the project (such as to cut costs, mitigate risks, or enhance revenue).
During the discovery portion of planning, Groceryrama and GreenVillage business analysts and data architects might run the following processes:
  • Data exploration: This diagnostic phase is concerned with documenting the data in your organization and the characteristics of that data.
  • Data profiling and auditing: Data profiling alerts you to data that does not match the characteristics defined in the metadata compiled during data exploration.
  • Data cataloging and business vocabulary: You need a development environment where data sources can be combined and rationalized.
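As a rough illustration of the profiling idea described above, the following Python sketch compares records against expected column characteristics and reports the values that do not match. It is not how DataFlux Data Management Studio implements profiling. The ColumnSpec class, the profile function, the column names, and the sample records are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnSpec:
    """Expected characteristics of one column (hypothetical metadata format)."""
    dtype: type
    nullable: bool = True
    max_length: Optional[int] = None


def profile(rows, specs):
    """Return (row_index, column, problem) tuples for values that break a spec."""
    issues = []
    for i, row in enumerate(rows):
        for column, spec in specs.items():
            value = row.get(column)
            if value is None:
                # A missing value is a problem only if the column is required.
                if not spec.nullable:
                    issues.append((i, column, "missing value"))
                continue
            if not isinstance(value, spec.dtype):
                issues.append((i, column, f"expected {spec.dtype.__name__}"))
                continue
            if spec.max_length is not None and len(str(value)) > spec.max_length:
                issues.append((i, column, f"longer than {spec.max_length}"))
    return issues


# Illustrative metadata and records; not real Groceryrama or GreenVillage data.
CUSTOMER_SPECS = {
    "customer_id": ColumnSpec(int, nullable=False),
    "name": ColumnSpec(str, nullable=False, max_length=20),
    "postal_code": ColumnSpec(str, max_length=10),
}

SAMPLE_ROWS = [
    {"customer_id": 1, "name": "Ana Ruiz", "postal_code": "27513"},
    {"customer_id": "2", "name": "Lee Park", "postal_code": None},
    {"customer_id": 3, "name": None, "postal_code": "275130000000"},
]

for issue in profile(SAMPLE_ROWS, CUSTOMER_SPECS):
    print(issue)
```

In this sketch, the specifications stand in for the metadata compiled during data exploration, and each reported tuple stands in for a profiling alert. A real profiling tool checks many more characteristics, such as patterns, value frequencies, and uniqueness.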
Most of the processes in the planning stage are supported by tools in DataFlux Data Management Studio. Data explorations enable you and your organization to identify data redundancies and to extract and organize metadata from multiple sources. Then, the Groceryrama and GreenVillage team can use the profiling tools to dig deeper, identify data management issues, and plan and scope data quality processes appropriately. Profiling is one of the SAS applications that use the SAS Quality Knowledge Base (QKB). The QKB provides a set of files that contain rules, expressions, and reference data that are combined to analyze and transform text data in various SAS products. Finally, data collections are provided as a means to select data fields from different tables across different data connections. A collection provides a convenient way for you to build up a data set using those fields.
Data cataloging lays the groundwork for all data management tasks to follow. Data catalogs must be augmented with business definitions and vocabularies, allowing the business user to comfortably navigate the landscape. SAS Business Data Network enables you to manage these business terms. You can set up workflows and establish relationships between terms and processes. These tasks promote a common understanding of the key concepts and practices used in an enterprise.
Planning comes first. However, you might want to return to your plan as you move forward and adjust it as you learn more about the data needs of Groceryrama and GreenVillage.