Hadoop 2.6.3 Installation Process
OS: CentOS 6.9 x64
The OS was installed with the minimal-install option and networking is already configured. Below is the Hadoop 2.6.3 installation process across 5 machines, planned as follows:
IP | hostname | role |
---|---|---|
192.168.1.1 | sms1 | namenode |
192.168.1.2 | sms2 | datanode |
192.168.1.3 | sms3 | datanode |
192.168.1.4 | sms4 | datanode |
192.168.1.5 | sms5 | datanode |
Install system packages
```bash
yum install gcc make ntp vim rsync wget lrzsz -y
```
Sync the server time and add a cron job
```bash
ntpdate pool.ntp.org
echo "*/20 * * * * $(which ntpdate) pool.ntp.org > /dev/null 2>&1" >> /var/spool/cron/root
chmod 600 /var/spool/cron/root
```
Disable SELinux
```bash
setenforce 0
sed -i 's/^SELINUX=.*$/SELINUX=disabled/' /etc/selinux/config
```
Raise the open-file and process limits
```bash
sed -i '/^# End of file/,$d' /etc/security/limits.conf
cat >> /etc/security/limits.conf <<EOF
* soft nofile 1024000
* hard nofile 1024000
* soft nproc unlimited
* hard nproc unlimited
* soft core unlimited
* hard core unlimited
* soft memlock unlimited
* hard memlock unlimited
EOF
```
Tune kernel parameters
```bash
cat > /etc/sysctl.conf << EOF
fs.file-max = 102400
net.ipv4.tcp_max_tw_buckets = 60000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_syn_backlog = 65536
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65000
vm.swappiness = 0
vm.zone_reclaim_mode = 0
fs.nr_open = 20480000
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
EOF
sysctl -p
```
Install the JDK
The Server JRE is recommended: it is the JRE build intended for server environments, with some unnecessary plugins stripped out.
Download: http://www.oracle.com/technetwork/java/javase/downloads/server-jre8-downloads-2133154.html
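This guide assumes the JDK ends up in /home/jdk. A minimal sketch of unpacking it there; the tarball and directory names below are examples and depend on the exact build you download:

```bash
# hypothetical version: adjust the file name and the jdk1.8.0_xx directory
# to match the build actually downloaded
tar -xzf server-jre-8u112-linux-x64.tar.gz -C /home
mv /home/jdk1.8.0_112 /home/jdk
```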
```bash
cat >> /etc/profile << 'EOF'
JAVA_HOME=/home/jdk
JRE_HOME=/home/jdk/jre
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export JAVA_HOME JRE_HOME PATH CLASSPATH
EOF
source /etc/profile
```
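A quick sanity check once the profile is sourced:

```bash
java -version   # should report the JDK installed under /home/jdk
```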
Hadoop account, SSH trust, hosts file. The account and hosts entries below must exist on every node before the rsync steps; the SSH key is generated once on sms1 and then copied out.
```bash
useradd hadoop -d /data/hadoop
cat >> /etc/hosts << EOF
192.168.1.1 sms1
192.168.1.2 sms2
192.168.1.3 sms3
192.168.1.4 sms4
192.168.1.5 sms5
EOF
passwd hadoop    # pick a reasonably strong password
su - hadoop
cat > ~/.bash_profile << 'EOF'
# Hadoop variables
export JAVA_HOME=/home/jdk/
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/data/hadoop/hadoop
export HADOOP_INSTALL=/data/hadoop/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native-x64
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native-x64"
EOF
source ~/.bash_profile
ssh-keygen -t rsa    # accept the defaults (press Enter at every prompt)
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@sms1    # enter the password; trust the local host first, then copy to the other machines
rsync -a ~/.ssh sms2:/data/hadoop
rsync -a ~/.ssh sms3:/data/hadoop
rsync -a ~/.ssh sms4:/data/hadoop
rsync -a ~/.ssh sms5:/data/hadoop
ssh sms2    # confirm passwordless login works
```
Install Hadoop
Download hadoop-2.6.3.tar.gz, extract it under /data/hadoop/, and rename the directory to hadoop, for example:
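A sketch, assuming the tarball was downloaded into /data/hadoop as the hadoop user:

```bash
cd /data/hadoop
tar -xzf hadoop-2.6.3.tar.gz
mv hadoop-2.6.3 hadoop
```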
Make the following changes in the configuration directory /data/hadoop/hadoop/etc/hadoop:
In hadoop-env.sh, set export JAVA_HOME=/home/jdk
In yarn-env.sh, set export JAVA_HOME=/home/jdk
In slaves, add the nodes (note that sms1 is listed as well, so the NameNode host also runs a DataNode):
```
sms1
sms2
sms3
sms4
sms5
```
Add the following to core-site.xml:
```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://sms1:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/data/hadoop/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.spark.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.spark.groups</name>
    <value>*</value>
  </property>
</configuration>
```
Add the following to hdfs-site.xml:
```xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>sms1:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/hadoop/hadoop/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/hadoop/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>8192</value>
  </property>
</configuration>
```
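The local paths referenced in core-site.xml and hdfs-site.xml (tmp, name, dfs/data) can be created up front, as the hadoop user, to avoid permission surprises on first start:

```bash
mkdir -p /data/hadoop/hadoop/tmp /data/hadoop/hadoop/name /data/hadoop/hadoop/dfs/data
```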
Add the following to mapred-site.xml (a fresh 2.6.3 unpack ships only mapred-site.xml.template; copy it to mapred-site.xml first):
```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>sms1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>sms1:19888</value>
  </property>
</configuration>
```
Add the following to yarn-site.xml:
```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>sms1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>sms1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>sms1:8035</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>sms1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>sms1:8088</value>
  </property>
</configuration>
```
Copy the fully configured hadoop directory to the same path on every node, for example:
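A sketch using rsync over the SSH trust set up earlier, run on sms1 as the hadoop user:

```bash
# push the configured tree from sms1 to the other nodes
for node in sms2 sms3 sms4 sms5; do
  rsync -a /data/hadoop/hadoop ${node}:/data/hadoop/
done
```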
Format the NameNode on sms1:
```bash
/data/hadoop/hadoop/bin/hdfs namenode -format
```
Start HDFS:
```bash
/data/hadoop/hadoop/sbin/start-dfs.sh
```
Start YARN:
```bash
/data/hadoop/hadoop/sbin/start-yarn.sh
```
Check the cluster status:
```bash
/data/hadoop/hadoop/bin/hdfs dfsadmin -report
```
View HDFS and YARN in a browser:
HDFS: http://192.168.1.1:50070/
YARN: http://192.168.1.1:8088/
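To smoke-test the whole stack, you can run the bundled pi example; this sketch assumes the stock 2.6.3 jar layout:

```bash
/data/hadoop/hadoop/bin/hadoop jar \
  /data/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.3.jar pi 5 10
```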
Troubleshooting:
If running an hdfs command prints "Unable to load native-hadoop library for your platform", do the following:
```bash
wget http://dl.bintray.com/sequenceiq/sequenceiq-bin/hadoop-native-64-2.7.0.tar
```
Extract it under /data/hadoop/hadoop/lib, then restart Hadoop.
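Whether the native libraries are picked up can then be verified with Hadoop's standard checknative tool:

```bash
/data/hadoop/hadoop/bin/hadoop checknative -a
```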