How to contribute limited/specific storage in the Hadoop Cluster

Published by Anubhav Singh on


In this practical on the Hadoop cluster, we are going to discuss how we can contribute a specific amount of storage from a datanode to our Hadoop cluster. To perform this practical I am going to use RHEL8 on top of VirtualBox. Before jumping into the practical, let's get familiar with a few terms.

What is Hadoop?

Hadoop is an open-source framework used to store and process large amounts of data efficiently. Nowadays, data is what helps a company build a good product, but to benefit from it, industries collect extremely large amounts of data. Storing and processing that much data is a tough task, and this is where Hadoop comes into play.

Instead of keeping all the data on a single computational unit, Hadoop breaks the data into parts and stores them across multiple computational units. This group of computational units is known as a Hadoop cluster.

In a Hadoop cluster there is one unit called the namenode; it acts as the master node and keeps track of the storage shared by the different computational units. A computational unit that shares its storage is known as a datanode. There is one more unit, known as the client node, which is used to upload data to the cluster and retrieve it back.

In today's practical we are going to use only a namenode and a datanode. Now we can jump into the practical.

Task

In a Hadoop cluster, find how to contribute limited/specific storage as a slave (datanode) to the cluster.

Prerequisite

  • A Hadoop cluster (at minimum, one namenode and one datanode).

Steps to follow

  • Add an additional volume to our datanode.
  • Make Partition.
  • Format and Mount.

How to add an additional volume

If you are working on AWS, EBS is the service that lets you create an extra volume and attach it to your instance. For the CLI you can take help from this link.
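For example, with the AWS CLI the volume can be created and attached roughly like this (a sketch; the availability zone, size, volume ID, instance ID, and device name are placeholders you would replace with your own values):

# create a 1 GiB EBS volume in the same availability zone as the instance
aws ec2 create-volume --availability-zone ap-south-1a --size 1 --volume-type gp2

# attach it to the instance as an extra block device
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf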

In my case, I have RHEL8 running as a virtual machine in VirtualBox, so to add an extra hard disk I open VirtualBox and select the virtual machine I want to add the hard disk to.

Click on Settings and then on Storage.

Simply click on the Add Hard Disk icon and follow the process shown in the screenshots.
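If you prefer the command line over the GUI, VirtualBox's VBoxManage tool can do the same thing. This is only a sketch; the VM name datanode1, the controller name "SATA", and the 1024 MB size are assumptions you should adjust to your own setup.

# create a new 1 GiB virtual disk image
VBoxManage createmedium disk --filename datanode1-extra.vdi --size 1024

# attach it to the VM's SATA controller on a free port
VBoxManage storageattach datanode1 --storagectl "SATA" --port 1 --device 0 --type hdd --medium datanode1-extra.vdi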

Once the hard disk is attached, move on to the next step.

Make partition

Once a hard disk is attached, we have to create a partition on it before we can use it. So I boot up my virtual machine, and to check whether the hard disk is connected successfully, I use the command below:

fdisk -l

You will get a list of all the hard disks. Note down the name of the new disk; in my case it is /dev/sdb.

fdisk /dev/sdb

Now you are inside the fdisk prompt for the disk. Press n to create a new partition, p to make it a primary partition, and 1 for the partition number. fdisk then asks for the starting and ending sectors.

Instead of an ending sector you can also provide a size using +; in my case I used +1G, which gives me a 1 GB partition to use. At the end, press w to save the partition table.
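Put together, the keystrokes at the fdisk prompt look roughly like this (a sketch; the exact prompts and default sector values will differ on your disk):

fdisk /dev/sdb    # open the new disk in fdisk
n                 # create a new partition
p                 # make it a primary partition
1                 # partition number 1
<Enter>           # accept the default first sector
+1G               # end the partition 1 GB after the start
w                 # write the partition table and exit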

Format and Mount

To format the partition I used the command below:

mkfs.ext4 /dev/sdb1

In my case, I am contributing my /dn1 directory to the Hadoop cluster (create it with mkdir /dn1 if it does not already exist), so that is the directory I will mount the new partition on.
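For reference, /dn1 is the directory that the datanode's hdfs-site.xml points at. This is only a sketch of that configuration; older releases use the dfs.data.dir property while newer ones use dfs.datanode.data.dir, so match it to your Hadoop version.

<property>
    <name>dfs.data.dir</name>
    <value>/dn1</value>
</property>

With that in place, mounting the new partition on /dn1 is all that is left: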

mount /dev/sdb1 /dn1
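If you want this mount to persist across reboots, you can add an entry for it in /etc/fstab, and if the datanode daemon was already running, restart it so HDFS picks up the new storage. This is a sketch, assuming an older release where the daemon is managed with hadoop-daemon.sh (newer ones use hdfs --daemon):

# optional: mount /dev/sdb1 on /dn1 automatically at boot
echo '/dev/sdb1  /dn1  ext4  defaults  0 0' >> /etc/fstab

# restart the datanode so it re-reads its storage directory
hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode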

Hurrah! We are done with the practical part. To check whether it works, run:

hadoop dfsadmin -report

Here you can clearly see that our Hadoop cluster now has around 1 GB of configured capacity contributed by this datanode.
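You can also confirm the capacity with the filesystem df command (and note that on newer Hadoop releases the report above is run as hdfs dfsadmin -report):

# show total, used, and available HDFS space in human-readable units
hadoop fs -df -h /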

Final word

In case of any problem, comment down below or you can connect with me on

LinkedIn Twitter

For more such amazing articles, stay tuned with BrighterBees.

Categories: Education
