The Plan phase gives
you the chance to define the people, processes, and technologies
that are used for your data management project. It also gives you
time to discover and categorize your data assets. Groceryrama and
GreenVillage staff include data scientists, data architects, business
users, system administrators, and managers. They need to meet, discuss
their data needs, and discover solutions to their problems. These
meetings can begin addressing a series of questions that help define
the parameters of the project. These questions include the following:
- People: Who is involved, and for what purpose?
- Roadmap: Where are we now? Where do we want to go? What obstacles are in our way?
- Source systems: What data do we need? Where is that data coming from?
- Business processes: Which business processes are affected? How can better data enhance how the organization operates?
- Business rules and data definitions: How do we define "customer"? How do we want to optimize procurement and spend?
The answers to these
questions guide the collection, organization, enhancement, monitoring,
and retirement of your data assets throughout the process. While you
do not need all the answers at the beginning, you need a solid plan
for how to proceed and what the ultimate success indicators will
be. These discussions help identify the business rules and data definitions
that guide the data management project. For example, you need clear
guidance on the reason for the project (such as cutting costs, mitigating
risks, or enhancing revenue).
During the discovery
portion of planning, Groceryrama and GreenVillage business analysts
and data architects might run the following processes:
- Data exploration: This diagnostic phase is concerned with documenting the data in your organization and the characteristics of that data.
- Data profiling and auditing: Data profiling alerts you to data that does not match the characteristics defined in the metadata compiled during data exploration.
- Data cataloging and business vocabulary: You need a development environment where data sources can be combined and rationalized.
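
The profiling idea described above, checking data against the characteristics recorded during exploration, can be illustrated with a minimal Python sketch. This is not how DataFlux Data Management Studio implements profiling; the field names, the `EXPECTED` metadata, and the `profile` function are hypothetical and exist only for illustration.

```python
import re

# Hypothetical metadata compiled during data exploration (assumed for this sketch):
# each field has an expected type, a nullability flag, and an optional pattern.
EXPECTED = {
    "customer_id": {"type": int, "nullable": False},
    "postal_code": {"type": str, "nullable": False, "pattern": r"\d{5}"},
    "email": {"type": str, "nullable": True, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
}


def profile(rows, expected=EXPECTED):
    """Return (row_index, field, problem) for each value that breaks the metadata."""
    issues = []
    for i, row in enumerate(rows):
        for field, rule in expected.items():
            value = row.get(field)
            if value is None:
                # Missing values are only a problem for non-nullable fields.
                if not rule["nullable"]:
                    issues.append((i, field, "missing"))
                continue
            if not isinstance(value, rule["type"]):
                issues.append((i, field, "wrong type"))
                continue
            pattern = rule.get("pattern")
            if pattern and not re.fullmatch(pattern, value):
                issues.append((i, field, "bad format"))
    return issues


# Made-up sample records, not real Groceryrama or GreenVillage data.
rows = [
    {"customer_id": 1, "postal_code": "27513", "email": "a@example.com"},
    {"customer_id": "2", "postal_code": "2751", "email": None},
    {"customer_id": 3, "postal_code": None, "email": "not-an-email"},
]

issues = profile(rows)
for issue in issues:
    print(issue)
```

Each reported tuple points to a record and field that would need attention before data quality processes are scoped.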
Most of the processes
in the planning stage are supported by tools in DataFlux Data Management
Studio. Data explorations enable you and your organization to identify
data redundancies and extract and organize metadata from multiple
sources. Then, the Groceryrama and GreenVillage team can use the profiling
tools to dig deeper, identify data management issues, and plan and
scope data quality processes appropriately. Profiling is one of the
SAS applications that uses the SAS Quality Knowledge Base (QKB). The
QKB provides a set of files that contains rules, expressions, and
reference data that are combined to analyze and transform text data
in various SAS products. Finally, data collections let you select
data fields from different tables across different data
connections. A collection provides a convenient way for you to build
up a data set using those fields.
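
As a conceptual illustration of a collection, the following Python sketch treats a collection as a list of references to fields in different tables of different connections, and then builds a small data set from them. It does not use any DataFlux API; `FieldRef`, `SOURCES`, `build_data_set`, and the connection, table, and field names are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRef:
    """Hypothetical reference to one field in one table of one data connection."""
    connection: str
    table: str
    field: str


# In-memory stand-in for two data connections (made-up sample data).
SOURCES = {
    ("groceryrama_db", "customers"): [
        {"name": "Ann Lee", "zip": "27513"},
        {"name": "Raj Patel", "zip": "27601"},
    ],
    ("greenvillage_db", "members"): [
        {"full_name": "Bo Chen", "postal": "10001"},
    ],
}

# A collection: fields that play the same role in different tables.
customer_names = [
    FieldRef("groceryrama_db", "customers", "name"),
    FieldRef("greenvillage_db", "members", "full_name"),
]


def build_data_set(collection, sources):
    """Gather the values of every field in the collection into one list of records."""
    data_set = []
    for ref in collection:
        for row in sources[(ref.connection, ref.table)]:
            data_set.append({
                "source": f"{ref.connection}.{ref.table}",
                "field": ref.field,
                "value": row.get(ref.field),
            })
    return data_set


data_set = build_data_set(customer_names, SOURCES)
for record in data_set:
    print(record)
```

Keeping the source and field name with each value makes it easy to trace a record back to the connection it came from.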
Data cataloging lays
the groundwork for all data management tasks to follow. Data catalogs
must be augmented with business definitions and vocabularies so that
business users can comfortably navigate the data landscape. SAS Business
Data Network enables you to manage these business terms. You can set
up workflows and establish relationships between terms and processes.
These tasks promote a common understanding of the key concepts and
practices used in an enterprise.
Planning comes first.
However, you might want to refer back to your plan as you move forward
and adjust it as you learn more about the data needs of Groceryrama
and GreenVillage.