A good understanding of your Hadoop environment is critical to a successful
connection between SPD Server and Hadoop. It is recommended that you verify
your Hadoop environment by becoming familiar with the following items:
-
Gain working knowledge of the Hadoop distribution that you are using (for
example, Cloudera). You will also need working knowledge of HDFS and services
for MapReduce 1, MapReduce 2, and YARN. For more information, see the Apache
website or the vendor’s website.
-
Ensure that the HDFS, MapReduce, and YARN services are running on the Hadoop
cluster.
-
Know the location of the MapReduce home.
-
Know the host name of the NameNode.
-
Determine where the HDFS cluster is running.
-
Understand and verify your security setup. It is recommended that you enable
Kerberos for data security.
-
Verify that you can connect to the Hadoop cluster from your client machine outside
of SPD Server with your defined security protocol.
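
As a first step toward the last item, it can help to confirm that the NameNode host is reachable over the network from the client machine before any SPD Server configuration is involved. The following is a minimal sketch in Python, not part of SPD Server or Hadoop. The function name `can_reach` is hypothetical, and the host name and port in the usage note are placeholders to replace with the values for your cluster. The sketch tests only TCP reachability. It does not exercise Kerberos authentication or confirm that HDFS, MapReduce, or YARN are healthy.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. A True result means the port
    accepted a connection; it says nothing about authentication or about
    whether the Hadoop service behind the port is working correctly.
    """
    try:
        # create_connection resolves the host name and attempts the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_reach("namenode.example.com", 8020)` would test a placeholder NameNode host on port 8020, which is a commonly used NameNode RPC port but is an assumption here; check your cluster configuration for the actual value. A result of `False` suggests a host name, firewall, or service problem to resolve before continuing.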