Viewing Block Information

View Block Details

To view block details, select a file in HDFS and click

The following information about each block is provided:

Block Details Information

Field	Description
Host Name	Specifies the machine in the cluster that stores the block of data.
Block Name	Specifies the filename for the block.
Path	Specifies the directory that contains the block.
Record Length	Specifies the sum of the column lengths for the variables in the data.
Records	Specifies the number of rows stored in the block. Because redundant blocks are listed in the table, the sum of the records listed does not equal the number of rows in the data.
Owner	Specifies the user account that added the data to HDFS.
Group	Specifies the primary UNIX group for the user account that stored the data.
Permissions	Specifies the Read, Write, and Execute access permissions for owner, group, and other.

You can sort by the column headings to identify anomalies. It is normal for several blocks to be stored on the same machine. However, it is not normal for the values of Record Length, Owner, Group, or Permissions to be different from row to row.
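The consistency check described above can also be expressed programmatically. The following is a minimal sketch, assuming the block details have been copied out of the table into a list of dictionaries keyed by the field names. The function name `find_anomalies` and the sample rows are hypothetical and are not part of the product.

```python
from collections import Counter

# Fields that should hold the same value on every row of the table.
UNIFORM_FIELDS = ("Record Length", "Owner", "Group", "Permissions")


def find_anomalies(rows):
    """Return {field: [row indexes]} for rows that differ from the most common value."""
    anomalies = {}
    for field in UNIFORM_FIELDS:
        counts = Counter(row[field] for row in rows)
        if len(counts) <= 1:
            continue  # every row agrees (or there are no rows)
        expected, _ = counts.most_common(1)[0]
        anomalies[field] = [i for i, row in enumerate(rows) if row[field] != expected]
    return anomalies


# Hypothetical sample data: the third row has a different owner.
rows = [
    {"Host Name": "node1", "Record Length": 64, "Owner": "sasdemo",
     "Group": "sas", "Permissions": "rwxr-xr-x"},
    {"Host Name": "node2", "Record Length": 64, "Owner": "sasdemo",
     "Group": "sas", "Permissions": "rwxr-xr-x"},
    {"Host Name": "node1", "Record Length": 64, "Owner": "hdfs",
     "Group": "sas", "Permissions": "rwxr-xr-x"},
]

print(find_anomalies(rows))  # {'Owner': [2]}
```

Host Name is deliberately left out of the check, because several blocks on the same machine is normal.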

View Block Distribution

The files added to HDFS are stored as blocks. One block is the preferred block, and additional copies of the block are used to provide data redundancy. The Block Distribution dialog box offers two ways to view this information. The Block Detail View tab enables you to select a block number and view the host names that store the original or redundant blocks. The Node Detail View tab enables you to select a host name and view the block numbers that are stored on the machine.

To view the block distribution, select a table in HDFS and then click

The following information about the block distribution is provided:

Block Distribution Information

Field	Description
File size	Specifies the size of the file in bytes.
Block size	Specifies the block size for the file.
Blocks	Specifies the number of blocks used to store the original copy of the data.
Copies	Specifies the number of redundant block copies of the data.
Machines used	Specifies the number of machines in the cluster that have original or redundant blocks for the file.

On the Block Detail View tab, you can select a block number to view how many copies of the block exist and the host names for the machines that store the blocks. The value in the Total Copies column equals the number of redundant copies of the block plus the original block. You can select the column heading to sort the rows. In an ideal distribution, the number of total copies is equal for all blocks.
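The relationship between total copies and an ideal distribution can be illustrated with a short sketch. It assumes that the Block Detail View information has been transcribed into a mapping from block number to the host names that store the block. The names `total_copies` and `uneven_blocks` and the sample data are hypothetical, and treating the highest copy count as the expected value is an illustrative choice rather than documented product behavior.

```python
def total_copies(block_hosts):
    """Map each block number to its total copies (original plus redundant)."""
    return {block: len(set(hosts)) for block, hosts in block_hosts.items()}


def uneven_blocks(block_hosts):
    """Return the blocks whose total copies differ from the highest count seen."""
    totals = total_copies(block_hosts)
    if not totals:
        return {}
    expected = max(totals.values())
    return {block: n for block, n in totals.items() if n != expected}


# Hypothetical sample data: block 2 has no redundant copy.
block_hosts = {
    0: ["node1", "node2"],
    1: ["node2", "node3"],
    2: ["node3"],
}

print(total_copies(block_hosts))   # {0: 2, 1: 2, 2: 1}
print(uneven_blocks(block_hosts))  # {2: 1}
```

An empty result from `uneven_blocks` corresponds to the ideal case in which every block has the same number of total copies.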

On the Node Detail View tab, you can expand a host name node and then view the block numbers that are stored on that machine. When you select the block number, this host name and any additional machines with copies of the block are identified in the host name list. The following display shows an example:

Sample Block Distribution display showing the Node Detail View tab
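The two tabs present the same information from opposite directions. As a rough sketch of that relationship, the following code inverts a block-to-hosts mapping into a host-to-blocks mapping and looks up the machines that hold a selected block. The names `node_view` and `hosts_for_block` and the sample data are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def node_view(block_hosts):
    """Invert {block: [hosts]} into {host: [blocks]}, as on the Node Detail View tab."""
    by_host = defaultdict(list)
    for block, hosts in sorted(block_hosts.items()):
        for host in hosts:
            by_host[host].append(block)
    return dict(by_host)


def hosts_for_block(block_hosts, block):
    """Return every host that stores a copy of the selected block."""
    return sorted(block_hosts.get(block, []))


# Hypothetical sample data.
block_hosts = {
    0: ["node1", "node2"],
    1: ["node2", "node3"],
    2: ["node3"],
}

print(node_view(block_hosts))
# {'node1': [0], 'node2': [0, 1], 'node3': [1, 2]}
print(hosts_for_block(block_hosts, 1))  # ['node2', 'node3']
```

Expanding node2 in this sample lists blocks 0 and 1. Selecting block 1 then identifies node2 and node3 as the machines that hold a copy.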