Viewing Block Information

View Block Details

To view block details, select a file in HDFS and click Block details.
Figure: Block Details (sample Block Details display)
The following information about each block is provided:
Block Details Information
Field | Description
Host Name | Specifies the machine in the cluster that stores the block of data.
Block Name | Specifies the filename for the block.
Path | Specifies the directory that contains the block.
Record Length | Specifies the sum of the column lengths for the variables in the data.
Records | Specifies the number of rows stored in the block. Because redundant blocks are listed in the table, the sum of the records listed does not equal the number of rows in the data.
Owner | Specifies the user account that added the data to HDFS.
Group | Specifies the primary UNIX group for the user account that stored the data.
Permissions | Specifies the Read, Write, and Execute access permissions for owner, group, and other.
You can sort by the column headings to identify anomalies. It is normal for several blocks to be stored on the same machine. However, it is not normal for the values of Record Length, Owner, Group, or Permissions to differ from row to row.
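The consistency check described above can also be sketched in code. The following Python example is a minimal illustration, not a feature of the product. It assumes that the rows of the Block Details table have been copied into dictionaries keyed by the field names. The host names, owners, and other values are hypothetical.

```python
from collections import defaultdict

# Fields that are expected to hold the same value on every row.
UNIFORM_FIELDS = ("Record Length", "Owner", "Group", "Permissions")


def find_inconsistent_fields(rows):
    """Return {field: sorted distinct values} for fields that vary across rows."""
    seen = defaultdict(set)
    for row in rows:
        for field in UNIFORM_FIELDS:
            seen[field].add(row[field])
    return {field: sorted(values) for field, values in seen.items() if len(values) > 1}


# Hypothetical rows, as if copied by hand from the Block Details table.
rows = [
    {"Host Name": "node1", "Record Length": 120, "Owner": "sasdemo",
     "Group": "sas", "Permissions": "rwxr-xr-x"},
    {"Host Name": "node2", "Record Length": 120, "Owner": "sasdemo",
     "Group": "sas", "Permissions": "rwxr-xr-x"},
    {"Host Name": "node2", "Record Length": 120, "Owner": "hdfs",
     "Group": "sas", "Permissions": "rwxr-xr-x"},
]

print(find_inconsistent_fields(rows))
```

An empty result means that the four fields are uniform. Any field that appears in the result is worth investigating. Host Name is deliberately excluded because several blocks on the same machine are normal.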

View Block Distribution

The files added to HDFS are stored as blocks. For each block, one copy is the preferred block, and additional copies provide data redundancy. The Block Distribution dialog box offers two ways to view this information. The Block Detail View tab enables you to select a block number and view the host names that store the original or redundant blocks. The Node Detail View tab enables you to select a host name and view the block numbers that are stored on that machine.
To view the block distribution, select a table in HDFS and then click Block distribution.
Figure: Block Distribution (sample Block Distribution display)
The following information about the block distribution is provided:
Block Distribution Information
Field | Description
File size | Specifies the size of the file in bytes.
Block size | Specifies the block size for the file.
Blocks | Specifies the number of blocks used to store the original copy of the data.
Copies | Specifies the number of redundant block copies of the data.
Machines used | Specifies the number of machines in the cluster that have original or redundant blocks for the file.
On the Block Detail View tab, you can select a block number to view how many copies of the block exist and the host names for the machines that store the blocks. The value in the Total Copies column equals the number of redundant copies of the block plus the original block. You can select the column heading to sort the rows. In an ideal distribution, the number of total copies is equal for all blocks.
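The Total Copies arithmetic and the "ideal distribution" check can be illustrated with a short Python sketch. It assumes a hypothetical mapping from block number to the list of host names that store a copy of that block, transcribed from the Block Detail View tab. The function names are illustrative only.

```python
from collections import Counter


def total_copies(block_hosts):
    """Map each block number to its total copies (original plus redundant)."""
    return {block: len(hosts) for block, hosts in block_hosts.items()}


def uneven_blocks(block_hosts):
    """Return blocks whose total copies differ from the most common count."""
    counts = total_copies(block_hosts)
    if not counts:
        return {}
    typical = Counter(counts.values()).most_common(1)[0][0]
    return {block: n for block, n in counts.items() if n != typical}


# Hypothetical data: block number -> hosts that store a copy of the block.
block_hosts = {
    0: ["node1", "node2"],
    1: ["node2", "node3"],
    2: ["node1"],
}

print(total_copies(block_hosts))
print(uneven_blocks(block_hosts))
```

In this sample, block 2 has only one copy while the other blocks have two, so it is the only block reported as uneven. The number of redundant copies for a block is its total copies minus one.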
On the Node Detail View tab, you can expand a host name node and then view the block numbers that are stored on that machine. When you select the block number, this host name and any additional machines with copies of the block are identified in the host name list. The following display shows an example:
Figure: Block Distribution with Node Detail View Tab (sample Block Distribution display showing the Node Detail View tab)
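The two tabs present the same relationship from opposite directions. As a rough sketch, the Node Detail View can be thought of as the inverse of the block-to-hosts mapping that the Block Detail View shows. The Python below assumes the same kind of hypothetical block-to-hosts dictionary. It does not describe how the product itself computes the view.

```python
from collections import defaultdict


def blocks_by_host(block_hosts):
    """Invert block -> hosts into host -> sorted list of block numbers."""
    by_host = defaultdict(list)
    for block, hosts in sorted(block_hosts.items()):
        for host in hosts:
            by_host[host].append(block)
    return dict(by_host)


def hosts_sharing_block(block_hosts, block):
    """Return every host that stores a copy of the given block."""
    return sorted(block_hosts[block])


# Hypothetical data: block number -> hosts that store a copy of the block.
block_hosts = {
    0: ["node1", "node2"],
    1: ["node2", "node3"],
    2: ["node1"],
}

print(blocks_by_host(block_hosts))
print(hosts_sharing_block(block_hosts, 0))
```

Expanding node1 in this sample would list blocks 0 and 2. Selecting block 0 would then identify both node1 and node2 as hosts that hold a copy.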