HDFS on Mesos Installation

A Mesos cluster optimizes resource usage and brings the whole data center onto one platform where all resources can be managed efficiently. Setting up a Mesos cluster with HDFS is easy if we follow the right steps.

The minimum requirement for a high-availability cluster is quite high. To set up a single-datanode cluster on Mesos we need at least 7 nodes with the configuration below.

1.  System Requirement:

 

RAM min 16GB on each node

HDD min 60GB on each node

CPU min 4 core on each node
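Before going further, it can help to confirm that each node actually meets these minimums. The sketch below is only an illustration: the function names are ours, the thresholds are the ones listed above, and it assumes a Linux node where /proc/meminfo, df and nproc are available.

```shell
# Preflight sketch: compare this node against the minimums listed above.
# Function names are illustrative; thresholds come from the list above.
MIN_RAM_GB=16
MIN_DISK_GB=60
MIN_CPUS=4

# meets_minimum ACTUAL REQUIRED: exit status 0 when ACTUAL >= REQUIRED
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# report LABEL ACTUAL REQUIRED: print one OK/LOW line for a resource
report() {
    if meets_minimum "$2" "$3"; then
        echo "OK   $1: $2 (minimum $3)"
    else
        echo "LOW  $1: $2 (minimum $3)"
    fi
}

# Probe the machine only when /proc is readable (i.e. on Linux).
if [ -r /proc/meminfo ]; then
    # MemTotal is in kB and slightly below the installed RAM, so round.
    ram_gb=$(awk '/^MemTotal:/ {printf "%.0f", $2 / 1024 / 1024}' /proc/meminfo)
    # Column 2 of "df -Pk" is the total size of the filesystem in 1K blocks.
    disk_gb=$(df -Pk / | awk 'NR == 2 {printf "%.0f", $2 / 1024 / 1024}')
    cpus=$(nproc)
    report "RAM (GB)" "$ram_gb" "$MIN_RAM_GB"
    report "Disk on / (GB)" "$disk_gb" "$MIN_DISK_GB"
    report "CPU cores" "$cpus" "$MIN_CPUS"
fi
```

The disk check looks only at the root filesystem; if the data directories live on another mount, that path would need to be checked instead.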

For our environment we are using:

 

3 machines as Mesos masters with ZooKeeper for high availability

3 machines for the JournalNode process, ZKFC, and NameNode1 & NameNode2

1 machine used as a datanode

The cluster can run on RHEL, Centos or Ubuntu; the steps below are for Centos/RHEL.

2.  Prerequisite Package Installation:

 

The minimum required packages on all machines are:

sudo yum -y groupinstall "Development Tools"

sudo yum -y install python-devel libcurl-devel cyrus-sasl-devel cyrus-sasl-md5 maven apr-devel subversion-devel curl wget ntp

sudo yum -y install java-1.7.0-openjdk

3.  Prerequisite Setup:

 

Log in as root and create a new group and user. Give the user a UID of 1000 or above and set a password for the mesos user

sudo su -

groupadd -g 1001 mesos

useradd -g mesos -u 1001 mesos

passwd mesos

 

  1. Edit the SELinux config file (as root) and reboot the system for the change to take effect

vi /etc/selinux/config

           SELINUX=disabled

:wq!
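When the same change has to be made on seven nodes, a non-interactive edit is easier to repeat than vi. The following is a minimal sketch, assuming GNU sed as shipped with RHEL/Centos; the function name is ours, and the demonstration runs on a scratch file rather than the live config.

```shell
# disable_selinux_in FILE: rewrite the SELINUX= line in FILE to "disabled".
# The pattern ends in "=", so the SELINUXTYPE= line is left untouched.
disable_selinux_in() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Demonstrate on a scratch copy instead of /etc/selinux/config.
scratch=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$scratch"
disable_selinux_in "$scratch"
cat "$scratch"
rm -f "$scratch"
```

On a real node the same sed expression would be run with sudo against /etc/selinux/config, and getenforce can be used after the reboot to confirm the mode.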

 

  1. Edit the hosts file as below

vi /etc/hosts

10.X.X.X   mesosmaster (or any name as per your requirement)

:wq!

      Note: In an EC2 environment we need to add the private domain name and internal IP address as below; the complete line should be added

                  192.X.X.X    ip-192.X.X.X.internal      ip-192.X.X.X
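Since every node needs the same entries, the hosts file edit can also be scripted so that running it twice does not add duplicates. This is a sketch under our own naming (add_host_entry is not a standard tool), shown against a scratch file with one of the master addresses used later in this post.

```shell
# add_host_entry FILE IP NAME...: append "IP NAME..." unless IP is already listed.
add_host_entry() {
    local file="$1" ip="$2"
    shift 2
    # Skip when the first field of any existing line already equals this IP.
    if awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$*" >> "$file"
}

# Demonstrate on a scratch file instead of /etc/hosts.
scratch=$(mktemp)
echo "127.0.0.1 localhost" > "$scratch"
add_host_entry "$scratch" 10.5.3.15 mesosmaster
add_host_entry "$scratch" 10.5.3.15 mesosmaster   # second call is a no-op
cat "$scratch"
rm -f "$scratch"
```

For the EC2 case, the extra names are simply passed as further arguments, e.g. the private domain name followed by the short name.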

  1. Edit network file and add the below line

vi /etc/sysconfig/network

HOSTNAME=mesosmaster       (same name as in the hosts file step)

NETWORKING_IPV6=no

IPV6INIT=no

:wq!

     Note: When creating the cluster in an EC2 environment, please use the complete private domain name for the host name

     E.g.: if the private domain name is ip-192.X.X.X.internal, assign ip-192.X.X.X to the HOSTNAME variable

  1. Set up key-based login; SSH key-based authentication should be enabled on the system for this (log in as the root user)

ssh-keygen -t rsa            (Press Enter for default)

Press enter again for default passphrase

cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

ssh localhost     (Enter yes)

exit

  1. Login as mesos

su - mesos

ssh-keygen -t rsa            (Press Enter for default)

Press enter again for default passphrase

cat /home/mesos/.ssh/id_rsa.pub >> /home/mesos/.ssh/authorized_keys

ssh localhost     (Enter yes)

exit        (connection should be closed)

  1. Login as root again

su - root

Password:

cd /home/mesos/.ssh

ls -l

chmod 640 authorized_keys       OR

chmod 400 authorized_keys

  1. Login as mesos

su - mesos

ssh localhost    (there should be no password prompt)

exit

  1. Login as root again and edit the sudoers file

su - root

vi /etc/sudoers

Search for "wheel" and add the line below

mesos    ALL=(ALL)    NOPASSWD:ALL

:wq!

  1. ulimit -n 32768
  2. ulimit -u 60000
  3. Edit the limits.conf file

 vi /etc/security/limits.conf

Append the lines below

mesos    hard    nofile    65536

mesos    soft    nofile    65536

mesos    hard    nproc     65536

mesos    soft    nproc     65536

 :wq!
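The four entries follow one pattern (user, hard/soft, resource, value), so they can be generated rather than typed. A small sketch, with a function name of our own choosing and the same values as above:

```shell
# limits_for USER NOFILE NPROC: print the four limits.conf lines for USER.
limits_for() {
    local user="$1" nofile="$2" nproc="$3" kind
    for kind in hard soft; do
        printf '%-8s %-5s nofile  %s\n' "$user" "$kind" "$nofile"
    done
    for kind in hard soft; do
        printf '%-8s %-5s nproc   %s\n' "$user" "$kind" "$nproc"
    done
}

limits_for mesos 65536 65536
```

The output could be piped to `sudo tee -a /etc/security/limits.conf`. After logging in again as mesos, `ulimit -n` and `ulimit -u` should report the new values.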

  1. Edit the /etc/sysctl.conf file and add the lines below at the end

 vi /etc/sysctl.conf

 kernel.pid_max = 4194303

 net.ipv4.ip_local_port_range = 1024 64000

 net.ipv6.conf.all.disable_ipv6 = 1

 :wq!

  1. Disable the firewall on all the machines (RHEL7/Centos7)

sudo systemctl stop firewalld && sudo systemctl disable firewalld

  1. Set the java home path as below on each node (this is required for hdfs)

 export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.111-2.6.7.2.el7_2.x86_64
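The exact directory name changes with every OpenJDK update, so hard-coding it is fragile. One possible approach, sketched below, is to derive it from the resolved path of the java binary. The function name is ours, and the sketch assumes the usual OpenJDK layout in which the binary sits under jre/bin or bin.

```shell
# java_home_from BINARY_PATH: strip the trailing /bin/java, then /jre if present.
java_home_from() {
    local path="${1%/bin/java}"
    echo "${path%/jre}"
}

# Worked example with the path layout used in this post.
java_home_from /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.111-2.6.7.2.el7_2.x86_64/jre/bin/java

# On a node with Java installed, resolve the real binary behind the symlinks.
if command -v java >/dev/null 2>&1; then
    java_home_from "$(readlink -f "$(command -v java)")"
fi
```

The first call prints the same directory as the export line above. To make the setting survive new logins, the export line can be placed in a profile script such as /etc/profile.d/java.sh (the file name is only a suggestion).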

  1. Before proceeding, start the ntp service

sudo service ntpd start

  1. Make sure iptables is off, or use the command below (RHEL6/Centos6)

chkconfig iptables off

 

4.  Master Node Setup:

 

RHEL6/Centos6

# Repository needs to be added on each master node

sudo rpm -Uvh http://repos.mesosphere.com/el/6/noarch/RPMS/mesosphere-el-repo-6-2.noarch.rpm

sudo yum -y install mesos marathon

 

sudo rpm -Uvh http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm

sudo yum -y install zookeeper

 

RHEL7/Centos7

# Repository needs to be added on each master node

sudo rpm -Uvh http://repos.mesosphere.com/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm

sudo yum -y install mesos marathon

sudo yum -y install mesosphere-zookeeper

 

Zookeeper Configuration:

  • Create the file /etc/zookeeper/conf/myid on each master node, containing a unique integer between 1 and 255.

 vi /etc/zookeeper/conf/myid

           1

 :wq!

  • Create the file /var/lib/zookeeper/myid on each master node, containing the same unique integer as above.

vi /var/lib/zookeeper/myid 

            1

:wq! 

 

Server Addresses:

We need to append the following values to /etc/zookeeper/conf/zoo.cfg on each node

server.1=10.5.3.15:2888:3888

server.2=10.5.3.133:2888:3888

server.3=10.5.3.129:2888:3888
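The number in each server.N line has to match the myid written on that same machine, which is easy to get wrong when editing by hand. The sketch below derives both from one ordered list of master addresses; the two function names are ours, and the ports are the ones used above.

```shell
# zk_server_lines IP...: print one "server.N=IP:2888:3888" line per address,
# numbering the addresses in the order given.
zk_server_lines() {
    local id=1 ip
    for ip in "$@"; do
        echo "server.${id}=${ip}:2888:3888"
        id=$((id + 1))
    done
}

# myid_for IP SERVER...: print the 1-based position of IP in the server list,
# i.e. the value that belongs in the myid file on that machine.
myid_for() {
    local me="$1" id=1 server
    shift
    for server in "$@"; do
        if [ "$server" = "$me" ]; then
            echo "$id"
            return 0
        fi
        id=$((id + 1))
    done
    return 1
}

zk_server_lines 10.5.3.15 10.5.3.133 10.5.3.129
myid_for 10.5.3.133 10.5.3.15 10.5.3.133 10.5.3.129
```

As long as every master is given the same list in the same order, the myid values and the zoo.cfg entries stay consistent.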

 

Mesos & Marathon Configuration:

  • On each node, replace the IP addresses below with each master's IP address in the file /etc/mesos/zk

vi /etc/mesos/zk 

zk://10.5.3.15:2181,10.5.3.133:2181,10.5.3.129:2181/mesos

:wq!

  • Create a file /etc/mesos-master/quorum on each master node containing a number greater than the number of masters divided by 2. For example, the optimal quorum size for a five-node master cluster is 3. In this case there are three masters, so the quorum size should be set to 2 on each node.
  • Create a file /etc/mesos-master/hostname on each master node containing the hostname of that master node (its IP address can be used here).
  • Create a file /etc/mesos-master/ip on each master node containing the IP address of that master node.
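The quorum rule above is just integer arithmetic: divide the number of masters by two, drop the remainder, and add one. A short sketch (the function name is ours):

```shell
# quorum_size MASTERS: smallest integer strictly greater than MASTERS / 2.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

for n in 1 3 5; do
    echo "$n masters -> quorum $(quorum_size "$n")"
done
```

For the three masters in this setup it prints 2, which is the value that goes into /etc/mesos-master/quorum.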

To configure Marathon, follow the steps below

sudo mkdir -p /etc/marathon/conf

sudo cp /etc/mesos-master/ip /etc/marathon/conf/hostname

sudo cp /etc/mesos/zk /etc/marathon/conf/master

echo zk://10.5.3.133:2181,10.5.3.15:2181,10.5.3.129:2181/marathon | sudo tee /etc/marathon/conf/zk

Once all the configuration is done, start the services on all the master nodes
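The Mesos and Marathon ZooKeeper URLs differ only in the trailing path, so both can be built from the same list of master addresses. A minimal sketch, with a helper name of our own and the client port 2181 used above:

```shell
# zk_url PATH IP...: join the addresses as zk://ip1:2181,ip2:2181,.../PATH
zk_url() {
    local path="$1" hosts="" ip
    shift
    for ip in "$@"; do
        # Add a comma only when something is already in the list.
        hosts="${hosts:+$hosts,}${ip}:2181"
    done
    echo "zk://${hosts}/${path}"
}

zk_url mesos 10.5.3.15 10.5.3.133 10.5.3.129
zk_url marathon 10.5.3.15 10.5.3.133 10.5.3.129
```

The first line of output is what /etc/mesos/zk should contain, and the second is the value for /etc/marathon/conf/zk.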

RHEL6/Centos6

sudo service mesos-slave stop; sudo chkconfig mesos-slave off;

sudo service zookeeper start; sudo service mesos-master start; sudo service marathon start;

RHEL7/Centos7

sudo systemctl stop mesos-slave.service ; sudo systemctl disable mesos-slave.service;

sudo systemctl start zookeeper; sudo systemctl start mesos-master; sudo systemctl start marathon;

5.  Slave Node Setup:

 

RHEL6/Centos6

# Repository needs to be added on each slave node

sudo rpm -Uvh http://repos.mesosphere.com/el/6/noarch/RPMS/mesosphere-el-repo-6-2.noarch.rpm

sudo yum -y install mesos

 

RHEL7/Centos7

# Repository needs to be added on each slave node

sudo rpm -Uvh http://repos.mesosphere.com/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm

sudo yum -y install mesos

 

Mesos Configuration:

  • On each node, replace the IP addresses below with each master’s IP address in the file /etc/mesos/zk

vi /etc/mesos/zk 

zk://10.5.3.15:2181,10.5.3.133:2181,10.5.3.129:2181/mesos

:wq!

  • Create a file /etc/mesos-slave/hostname on each slave node containing the hostname of that slave node (its IP address can be used here).
  • Create a file /etc/mesos-slave/ip on each slave node containing the IP address of that slave node.
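Both files hold a single line, so writing them can be wrapped in one small helper. The sketch below uses a function name of our own, writes into a scratch directory, and uses 10.5.3.200 as a made-up slave address purely for illustration.

```shell
# write_node_identity DIR ADDRESS: write ADDRESS into DIR/ip and DIR/hostname.
write_node_identity() {
    local dir="$1" address="$2"
    mkdir -p "$dir"
    echo "$address" > "$dir/ip"
    echo "$address" > "$dir/hostname"
}

# Demonstrate in a scratch directory instead of /etc/mesos-slave.
scratch=$(mktemp -d)
write_node_identity "$scratch/mesos-slave" 10.5.3.200   # hypothetical slave IP
cat "$scratch/mesos-slave/ip" "$scratch/mesos-slave/hostname"
rm -rf "$scratch"
```

On a real slave node the target directory is /etc/mesos-slave and the helper has to run as root.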

 

Once all the configuration is done, start the services on all the slave nodes

RHEL6/Centos6

sudo service mesos-master stop; sudo chkconfig mesos-master off;

sudo service mesos-slave start;

RHEL7/Centos7

sudo systemctl stop mesos-master.service ; sudo systemctl disable mesos-master.service;

sudo systemctl start mesos-slave;

6.  Verification:

 

If the packages were installed and configured correctly, we can access the Mesos console at http://<master-ip>:5050 and the Marathon console at http://<master-ip>:8080 (where <master-ip> is any of the master IP addresses), as in the snapshot of the Agents tab below.

[Screenshot: Agents tab of the Mesos console]

Note: If ZooKeeper is in high-availability mode and configured properly, the console automatically redirects to the active master node when the IP of another master is used.
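The same check can be done from a terminal with curl instead of a browser. The sketch below only uses the two console addresses mentioned above; the function names are ours, and check_master is left uncalled here because it needs a running cluster.

```shell
# console_urls MASTER_IP: print the Mesos and Marathon console addresses.
console_urls() {
    echo "http://$1:5050"
    echo "http://$1:8080"
}

# probe URL: exit status 0 when the URL answers without an HTTP error.
probe() {
    curl -fsS -o /dev/null --max-time 5 "$1"
}

# check_master MASTER_IP: report each console as UP or DOWN.
check_master() {
    local url
    for url in $(console_urls "$1"); do
        if probe "$url" 2>/dev/null; then
            echo "UP   $url"
        else
            echo "DOWN $url"
        fi
    done
}

console_urls 10.5.3.15
```

Running `check_master 10.5.3.15` on a machine that can reach the masters should report both consoles as UP once the services have started.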

 

7.  HDFS Installation:

 

Download the latest archive from the git location to any of the master nodes and follow the steps below to get HDFS running

wget https://github.com/mesosphere/hdfs/archive/0.1.6.tar.gz

tar -xvf 0.1.6.tar.gz

cd hdfs-0.1.6

./bin/build-hdfs

cd build/hdfs-mesos-0.1.6

./bin/hdfs-mesos

Initially some of the processes may fail, but after several retries the framework automatically configures ZKFC/journalnode/namenode/datanode on the slave nodes

[Screenshot: hdfs]

All the frameworks appear in the Frameworks tab as below

[Screenshot: Frameworks tab of the Mesos console]

8.  Some Helpful Commands To Check Service Status:

 

sudo service mesos-master status -l

sudo service zookeeper status -l

sudo service marathon status -l

sudo service mesos-slave status -l

sudo journalctl -f -u zookeeper -l

echo srvr | ncat 127.0.0.1 2181
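The srvr reply includes a Mode line that says whether a ZooKeeper node is currently the leader or a follower. The sketch below extracts just that value; the function name is ours and the sample input is a trimmed, made-up reply rather than real output from this cluster.

```shell
# zk_mode: read "srvr" output on stdin and print the value of the Mode line.
zk_mode() {
    awk -F': ' '$1 == "Mode" { print $2 }'
}

# Sample reply, trimmed to two fields for illustration.
printf 'Mode: follower\nNode count: 4\n' | zk_mode
```

Against the live ensemble, the pipeline becomes `echo srvr | ncat <master-ip> 2181 | zk_mode`, repeated for each master. In a healthy three-node ensemble one node should report leader and the other two follower.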

 
