Posts

Commissioning Node

To add new nodes to the cluster, add the network addresses of the new nodes to the include file.

hdfs-site.xml

<property>
  <name>dfs.hosts</name>
  <value>/<hadoop-home>/conf/include</value>
</property>

mapred-site.xml

<property>
  <name>mapred.hosts</name>
  <value>/<hadoop-home>/conf/include</value>
</property>

DataNodes that are permitted to connect to the NameNode are listed in a file whose name is given by the dfs.hosts property. The include file resides on the NameNode's local filesystem and contains a line for each DataNode, specified by network address (as reported by the DataNode; you can see what this is by looking at the NameNode's web UI). If you need to specify multiple network addresses for a DataNode, put them on one line, separated by whitespace, e.g.: slave01 slave02 slave03 …..

Similarly, ...
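As a rough illustration of the include-file step, the sketch below appends hosts to an include file, one line per DataNode. The `INCLUDE_FILE` variable is a hypothetical name and defaults to a temporary file so the snippet can be tried safely; on a real cluster it would point at whatever path `dfs.hosts` names. The host names are the example ones from the text.

```shell
#!/bin/sh
# Sketch only: INCLUDE_FILE is a hypothetical variable, not a Hadoop setting.
# It defaults to a scratch file; set it to the path named by dfs.hosts to use it for real.
INCLUDE_FILE="${INCLUDE_FILE:-$(mktemp)}"

# One line per DataNode. Extra addresses for the same node would go on
# that node's line, separated by whitespace.
for host in slave01 slave02 slave03; do
    # Append the host only if it is not already present as a whole line.
    grep -qxF "$host" "$INCLUDE_FILE" 2>/dev/null || echo "$host" >> "$INCLUDE_FILE"
done

# Show the resulting include file.
cat "$INCLUDE_FILE"
```

Checking for an existing entry before appending keeps the file free of duplicates if the script is run more than once.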

Decommissioning Nodes

To remove nodes from the cluster, add the network addresses of the nodes to be decommissioned to the exclude file. Do not update the include file at this point.

hdfs-site.xml

<property>
  <name>dfs.hosts.exclude</name>
  <value>/<hadoop-home>/conf/exclude</value>
</property>

mapred-site.xml

<property>
  <name>mapred.hosts.exclude</name>
  <value>/<hadoop-home>/conf/exclude</value>
</property>

The decommissioning process is controlled by an exclude file, which for HDFS is set by the dfs.hosts.exclude property and for MapReduce by the mapred.hosts.exclude property. These properties often refer to the same file. The exclude file lists the nodes that are not permitted to connect to the cluster.

Update the NameNode with the new set of permitted DataNodes, using this command:

% hadoop dfsadmin -refreshNodes

Update th...
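The first two decommissioning steps might be scripted roughly as follows. This is a sketch under assumptions: `EXCLUDE_FILE` is a hypothetical variable that defaults to a temporary file, `slave03` is just an example host, and the `hadoop dfsadmin -refreshNodes` call is skipped when no `hadoop` binary is on the PATH.

```shell
#!/bin/sh
# Sketch only: EXCLUDE_FILE is a hypothetical variable, not a Hadoop setting.
# It defaults to a scratch file; set it to the path named by dfs.hosts.exclude to use it for real.
EXCLUDE_FILE="${EXCLUDE_FILE:-$(mktemp)}"

# Example node to decommission (illustrative host name).
NODE="slave03"

# Add the node to the exclude file unless it is already listed.
grep -qxF "$NODE" "$EXCLUDE_FILE" 2>/dev/null || echo "$NODE" >> "$EXCLUDE_FILE"

# Ask the NameNode to re-read its include and exclude files.
if command -v hadoop >/dev/null 2>&1; then
    hadoop dfsadmin -refreshNodes
else
    echo "hadoop not found on PATH; skipping refreshNodes"
fi
```

The include file is deliberately left untouched here, matching the note above that it should not be updated at this point.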