Data profiling jobs help you assess the composition, organization,
and quality of Hadoop tables. They help you recognize patterns,
identify sparseness in the data, and calculate frequency
distributions and basic statistics. Data profiling can also help
identify redundant data across tables and dependencies between
columns. All of these tasks are critical to optimal planning and
monitoring.
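As a rough illustration of the measurements described above, the
following Python sketch computes missing-value counts, distinct
counts, value frequencies, character patterns, and basic statistics
for each column of a small in-memory sample. It is not how the
profile directives are implemented. The function names
(value_pattern, profile_column) and the sample rows are hypothetical
and chosen only for this example.

```python
from collections import Counter
from statistics import mean


def value_pattern(value):
    """Map letters to 'A' and digits to '9', keeping other characters."""
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values):
    """Return simple profile metrics for one column (None means missing)."""
    present = [v for v in values if v is not None]
    numeric = [v for v in present if isinstance(v, (int, float))]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(present),
        "distinct_count": len(set(present)),
        "frequencies": Counter(present).most_common(3),
        "patterns": Counter(value_pattern(v) for v in present).most_common(3),
        "min": min(numeric) if numeric else None,
        "max": max(numeric) if numeric else None,
        "mean": mean(numeric) if numeric else None,
    }


# Hypothetical sample rows standing in for a Hadoop table.
rows = [
    {"zip": "27513", "amount": 10.0},
    {"zip": "27511", "amount": 30.0},
    {"zip": None, "amount": 20.0},
    {"zip": "27513", "amount": None},
]

# Build one profile per column, keyed by column name.
report = {col: profile_column([r[col] for r in rows]) for col in rows[0]}
```

In this sketch, a high null_count relative to row_count indicates a
sparsely populated column, and the patterns entry shows whether the
values in a column share a common format.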
The profile directives enable you to generate and view reports for
one or more Hadoop tables. The reports display sample data, column
information, and measurements of data quality. To create profile
reports, use the Profile Data directive. To access and manage
existing profile reports, use the Saved Profile Reports directive.
Here is an example of a profile report: