How to contribute limited/specific storage to a Hadoop cluster
In this practical on the Hadoop cluster, we are going to discuss how to contribute a specific amount of storage from a datanode to the cluster. To perform this practical I am going to use RHEL8 on top of VirtualBox. Before jumping into the practical, let’s be aware of some terms.
What is Hadoop?
Hadoop is an open-source framework used to store and process large amounts of data efficiently. Nowadays, data is what helps a company build a good product, but to benefit from data, industries are collecting extremely large amounts of it. Storing and processing that much data is a tough task, and this is where Hadoop comes into play.
Instead of keeping all the data on a single computational unit, Hadoop breaks the data into parts and stores them across multiple computational units. This group of multiple computational units is known as a Hadoop cluster.
In a Hadoop cluster there is one unit called the namenode. It acts like a master node and keeps track of all the storage shared by the different computational units. A computational unit that shares its storage is known as a datanode. There is one more unit, known as the client node, which helps in uploading the data and getting it back.
In today’s practical we are going to use only a namenode and a datanode. Now we can jump into the practical.
In a Hadoop cluster, find out how to contribute limited/specific storage as a slave (datanode) to the cluster.
- A Hadoop cluster (at minimum one datanode and one namenode).
Steps to follow
- Add an additional volume to our datanode.
- Make Partition.
- Format and Mount.
How to add an additional volume
In my case, I have RHEL8 virtualized on VirtualBox. To add an extra hard disk, I open VirtualBox and select the virtual machine to which the hard disk should be added.
Click on Settings and then on Storage.
Simply click on the add-hard-disk icon and follow the process shown in these screenshots.
Once the hard disk is attached, come to the next step.
Make Partition
To use the attached hard disk, we first have to create a partition on it, so I boot up my virtual machine. To check whether the hard disk is connected successfully, use the command below.
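The original command was not preserved in the text; a common way to list the attached disks on RHEL8 is shown below (`fdisk -l` as root is an equally common alternative):

```shell
# List all block devices with their sizes; the newly attached disk
# shows up with no partitions under it (e.g. sdb, 1.0G or larger).
lsblk
```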
You will get a list of all the hard disks. Note down the name of the new hard disk; in my case it is /dev/sdb.
After running fdisk /dev/sdb you are inside the fdisk utility. To create a new partition, the command is n. Use p for a primary partition, and 1 as the partition number. You are then asked for the starting and ending sectors.
You can also provide the size using +; in my case I used +1G, which gives me a 1GB partition to use. At the end, press w to write the partition table and save it.
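The interactive keystrokes above (n, p, 1, default first sector, +1G, w) can also be fed to fdisk in one go; a sketch, assuming the new disk is /dev/sdb as in this practical:

```shell
# WARNING: this rewrites the partition table on /dev/sdb --
# run it only against the new, empty disk.
# Each line answers one fdisk prompt:
#   n     -> new partition
#   p     -> primary
#   1     -> partition number
#   (blank) -> default first sector
#   +1G   -> size of the partition
#   w     -> write the table and quit
printf 'n\np\n1\n\n+1G\nw\n' | fdisk /dev/sdb
```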
Format and Mount
To format the partition, I used the command below.
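The exact command did not survive in the text; a common choice on RHEL8 is ext4 (xfs works just as well, so treat the filesystem type as an assumption):

```shell
# Format the 1GB partition created above with an ext4 filesystem.
mkfs.ext4 /dev/sdb1
```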
In my case, I am sharing my /dn1 directory with the Hadoop cluster, and therefore at mount time I am going to use that directory as the mount point.
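For the datanode to actually offer /dn1 to the cluster, its hdfs-site.xml must already point at that directory; a minimal fragment is sketched below (the property name is the standard one on Hadoop 2/3 — on Hadoop 1.x it is dfs.data.dir — and the path is this article’s /dn1):

```xml
<configuration>
  <!-- Directory where the datanode stores HDFS blocks; mounting the
       1GB partition here caps what this node contributes to the cluster. -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/dn1</value>
  </property>
</configuration>
```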
mount /dev/sdb1 /dn1
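A sketch of the full mount step, including creating the mount point first and verifying the result (df will report roughly 1GB, minus a little filesystem overhead):

```shell
# Create the mount point if it does not exist yet, then mount the partition.
mkdir -p /dn1
mount /dev/sdb1 /dn1

# Verify: /dn1 should now show a size of about 1GB.
df -h /dn1
```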
Hurrah! We have done our practical part. To check whether it works or not, run
hadoop dfsadmin -report
Here you can clearly see that our Hadoop cluster now has around 1GB of configured capacity contributed by the datanode.
In case of any problem, comment down below, or you can connect with me on
For more such amazing articles stay tuned with BrighterBees.