-
Create a new user for the Hadoop master:
useradd -m its
-
Give the new user root (sudo) access by adding this line with visudo:
its ALL=(ALL:ALL) ALL
-
Change the hostname with
sudo hostname <hadoop-master>
-
Download OpenJDK 8 or 11 and extract the tarball.
-
Move the extracted folder to
/usr/local/
or
/opt/
so that every user can access Java.
-
Add the Java environment variable (typically JAVA_HOME) to .bashrc and add Java's bin directory to PATH.
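A minimal sketch of the environment setup, assuming the JDK ended up in /opt/jdk-11 (a made-up folder name; use the real one). The PROFILE variable is only a helper for this sketch: it defaults to a scratch file so the lines can be tried safely, and can be pointed at ~/.bashrc to apply them for real.

```shell
# Hypothetical install location; adjust to wherever the JDK was moved.
JAVA_DIR="${JAVA_DIR:-/opt/jdk-11}"
# Target file; defaults to a scratch file. Set PROFILE="$HOME/.bashrc" to apply for real.
PROFILE="${PROFILE:-$(mktemp)}"

# Append the two export lines ($JAVA_DIR is expanded now, \$ stays literal).
cat >> "$PROFILE" <<EOF
export JAVA_HOME=$JAVA_DIR
export PATH=\$JAVA_HOME/bin:\$PATH
EOF

# Load the new variables into the current shell and show the result.
. "$PROFILE"
echo "JAVA_HOME=$JAVA_HOME"
```

After editing the real .bashrc, opening a new shell and running java -version is a quick way to confirm the PATH entry works.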
-
Map the nodes: add the IP address and hostname of every node to /etc/hosts.
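A sketch of what the node mapping could look like. The IP addresses and the slave hostnames are made up for illustration, and HOSTS_FILE is a helper variable for this sketch that defaults to a scratch file; set it to /etc/hosts (as root) on each node to apply the entries for real.

```shell
# Made-up addresses and hostnames; replace with the cluster's real ones.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# The same block should be appended on the master and on every slave.
cat >> "$HOSTS_FILE" <<'EOF'
192.168.1.10 hadoop-master
192.168.1.11 hadoop-slave-01
192.168.1.12 hadoop-slave-02
EOF

# Show the entries that were added.
grep 'hadoop-' "$HOSTS_FILE"
```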
-
Configure SSH key-based login from the master to all slave nodes.
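One possible way to do the key setup, sketched with ssh-keygen and ssh-copy-id. The slave hostnames are placeholders, the user is the one created earlier in these notes, and the run helper with its DRY_RUN switch is an addition for this sketch: by default it only prints the commands, and DRY_RUN=0 executes them.

```shell
# Placeholder hostnames and user; replace with your own.
SLAVES="${SLAVES:-hadoop-slave-01 hadoop-slave-02}"
SSH_USER="${SSH_USER:-its}"
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

distribute_key() {
    # Create a passphrase-less key pair only if the master has none yet.
    [ -f "$HOME/.ssh/id_rsa" ] || run ssh-keygen -t rsa -b 4096 -N "" -f "$HOME/.ssh/id_rsa"
    # Copy the public key to each slave so the master can log in without a password.
    for host in $SLAVES; do
        run ssh-copy-id "$SSH_USER@$host"
    done
}

distribute_key
```

Once the keys are copied, ssh its@hadoop-slave-01 from the master should log in without asking for a password.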
-
Download and install Hadoop; untar the archive using
tar -xzf <.tar.gz file>
-
Hadoop mirror link: https://dlcdn.apache.org/hadoop/common/
-
Configure Hadoop
- The Hadoop .xml configuration files can be found in
hadoop/etc/hadoop/
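As an illustration of one of those files, here is a sketch that writes a minimal core-site.xml. It assumes the master is reachable as hadoop-master and that the NameNode uses port 9000, as mentioned elsewhere in these notes; fs.defaultFS is the standard Hadoop property for the default file system URI. CONF_DIR is a helper variable for this sketch and defaults to a scratch directory; point it at hadoop/etc/hadoop/ to write the real file.

```shell
# Scratch directory by default; set CONF_DIR to the real hadoop/etc/hadoop/ path to apply.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
# Assumed master hostname; replace with your own.
MASTER="${MASTER:-hadoop-master}"

# Write a minimal core-site.xml that points every node at the NameNode.
cat > "$CONF_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://$MASTER:9000</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/core-site.xml"
```

The same file would normally be copied to every node so that all of them agree on the NameNode address.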
-
Permission problems and copying Hadoop to another node
- Before running rsync, mkdir the hadoop directory on the slave node and set its ownership and permissions with
chown -R its hadoop
sudo chmod -R 777 opt
rsync -avzhP /opt/hadoop/hadoop-3.3.3 hadoop-slave-01@host:/opt/hadoop
IMPORTANT
-
Don’t forget to set up a uniform
/etc/hosts
on the master and all nodes.
-
To format or restart Hadoop, make sure you use
bin/hdfs namenode -format
-
On every restart, make sure to remove the
dfs
directory.
-
The
hadoop/etc/hadoop/workers
file on all nodes should list hostnames; don’t use localhost.
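A sketch of writing that file, which is named workers in Hadoop 3.x. The slave hostnames are placeholders, and CONF_DIR is a helper variable for this sketch that defaults to a scratch directory; set it to the real hadoop/etc/hadoop/ path to apply.

```shell
# Scratch directory by default; set CONF_DIR to the real hadoop/etc/hadoop/ path to apply.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
# Placeholder hostnames; replace with your own.
WORKERS="${WORKERS:-hadoop-slave-01 hadoop-slave-02}"

# One hostname per line (word splitting of $WORKERS is intentional).
printf '%s\n' $WORKERS > "$CONF_DIR/workers"

# Flag a leftover localhost entry (the file's default content).
if grep -qx 'localhost' "$CONF_DIR/workers"; then
    echo "workers file still contains localhost" >&2
fi

cat "$CONF_DIR/workers"
```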
-
Check directory ownership and permissions, fixing them with chown or chmod as needed.
-
Without Avro: 16 mins → 6.5 MB
-
Check the Hadoop (HDFS) root directory:
bin/hdfs dfs -ls /
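Besides listing the HDFS root, it can help to confirm that the NameNode is listening on port 9000, the port these notes use. The helper below is a sketch added for illustration (has_listener is not a Hadoop or system command): it reads netstat-style output on stdin and succeeds if a LISTEN line for the given port is present.

```shell
# Succeed if netstat-style output on stdin shows a listening socket on the given port.
has_listener() {
    port="$1"
    grep -E "[:.]${port}[[:space:]].*LISTEN" >/dev/null
}

# Example with a canned line, so the sketch runs without a cluster:
sample="tcp        0      0 192.168.1.10:9000       0.0.0.0:*               LISTEN"
if echo "$sample" | has_listener 9000; then
    echo "port 9000 is listening"
fi

# Typical use on the master (netstat comes from the net-tools package):
#   netstat -tln | has_listener 9000 || echo "port 9000 is not open; check the dfs directory"
```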
-
If hadoop:9000 does not appear in the
netstat
output, check the dfs directory. hadoop:9000 only works if the dfs directory is available (this usually happens when you remove the dfs directory after the
bin/hdfs namenode -format
command is executed).
-
If one server/datanode is down, run
hdfs --daemon start datanode
on that node.
-
Ensure pyarrow is installed.
-
If the error below occurs, try another version of the JDK.
-
When a datanode is not detected or is not shown in the web UI, remove the dfs directory on that datanode, then stop, format, and start again.
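The stop, remove, format, start sequence can be sketched as below. HADOOP_HOME follows the path used in the rsync command earlier in these notes; DFS_DIR is a guess, since the notes do not say where the dfs directory lives. The run helper and its DRY_RUN switch are additions for this sketch: by default the commands are only printed, and DRY_RUN=0 executes them. Formatting recreates the NameNode metadata, so existing HDFS contents are lost.

```shell
# Assumed locations; DFS_DIR in particular is hypothetical.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop/hadoop-3.3.3}"
DFS_DIR="${DFS_DIR:-$HADOOP_HOME/dfs}"
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

reset_hdfs() {
    run "$HADOOP_HOME/sbin/stop-dfs.sh"           # stop the NameNode and DataNodes
    run rm -rf "$DFS_DIR"                         # remove the old dfs directory before formatting
    run "$HADOOP_HOME/bin/hdfs" namenode -format  # recreate the NameNode metadata
    run "$HADOOP_HOME/sbin/start-dfs.sh"          # start HDFS again
}

reset_hdfs
```

The rm step also has to be done on the affected datanode itself, since each node keeps its own dfs directory. Removing dfs before the format (not after) matches the note above about hadoop:9000 disappearing.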
-
If the error below occurs → on the Hadoop master node, create a directory with
bin/hdfs dfs -mkdir /raw
and
bin/hdfs dfs -chmod -R 777 /raw
-
If the error below occurs → on the client, set the environment variable
export HADOOP_USER_NAME=<username at master>
References
https://www.tutorialspoint.com/hadoop/hadoop_multi_node_cluster.htm
https://dlcdn.apache.org/hadoop/common/