Big Data | Hadoop Setup
This article is reposted from QingYingX's Blog: original post
Disable the Firewall & SELinux
Firewall
systemctl status firewalld    # check the firewall status
systemctl stop firewalld      # stop the firewall
systemctl disable firewalld   # permanently disable the firewall
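On recent systemd versions, the stop and disable steps can be combined into one command (a convenience, not a requirement):

```shell
# Stops firewalld immediately and also prevents it from starting at boot.
systemctl disable --now firewalld
```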
SELinux
Temporary (SELinux comes back after a reboot):
setenforce 0
Permanent:
vim /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled, then save and exit.
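The same permanent change can be made non-interactively with sed. The sketch below works on a temporary copy for illustration; on a real node, point CONF at /etc/selinux/config and run it as root.

```shell
# Illustrative copy of the config; on a real node use CONF=/etc/selinux/config.
CONF=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$CONF"

# Flip enforcing -> disabled in place.
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' "$CONF"

grep '^SELINUX=' "$CONF"   # SELINUX=disabled
```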
Configure Passwordless SSH Login
Preparation: edit the hosts file, adding each machine's IP address and hostname.
sudo vim /etc/hosts
The default contents of hosts are:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
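For a three-node cluster named as in the workers file later in this post, the appended entries might look like the following. The IP addresses here are placeholders; substitute your machines' real addresses. The sketch appends to a temporary copy for illustration; on a real node, append to /etc/hosts as root.

```shell
# Illustrative copy; on a real node use HOSTS=/etc/hosts and run as root.
HOSTS=$(mktemp)

# Placeholder IPs for a hypothetical three-node cluster.
cat >> "$HOSTS" <<'EOF'
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
EOF

grep master "$HOSTS"   # 192.168.1.10 master
```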
In the local terminal, run ssh-keygen and just press Enter through every prompt. Then copy the public key to each machine:
ssh-copy-id User@Hostname
If you need to remove a key:
ssh-keygen -R <Hostname>
(Note the capital -R, which removes the host's entry from known_hosts; lowercase -r prints DNS records instead.)
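With the hosts file in place, the key can be distributed to every node in one pass. A sketch, assuming the three hostnames from the workers file and a root account (adjust the user to your own):

```shell
# Copy the local public key to each cluster node in turn;
# each run will prompt for that node's password once.
for host in master slave1 slave2; do
  ssh-copy-id "root@$host"
done
```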
Configure the Java Environment
Run the following to extract the JDK:
tar -zxvf jdkx.x.x.tar.gz
Then open the configuration file:
vim /etc/profile
Press i to enter insert mode, press Shift+G to jump to the bottom, and append the following:
export JAVA_HOME=
export PATH=$JAVA_HOME/bin:$PATH
3. Run source /etc/profile.
4. Verify the setup with java -version; the output should look roughly like:
[root@master java]# java -version
java version "x.x.x_xxx"
Java(TM) SE Runtime Environment (build x.x.x_xxx-xxx)
Java HotSpot(TM) 64-Bit Server VM (build xx.xxx-xxx, mixed mode)

Hadoop Configuration Files
hadoop-env.sh
export JAVA_HOME=
export HADOOP_HOME=

workers
master
slave1
slave2

core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/dfs/tmp</value>
</property>
<property>
<name>hadoop.http.staticuser.user</name>
<value>root</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>

yarn-site.xml
(The two mapreduce.*.memory.mb settings below are MapReduce properties and are conventionally placed in mapred-site.xml instead.)
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>4096</value>
</property>
</configuration>

/etc/profile
Add the environment variables:
export HADOOP_HOME=
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Then run source /etc/profile.
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>*****</value>
</property>
</configuration>
Note: obtain the value for mapreduce.application.classpath by running hadoop classpath.
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>0.0.0.0:50070</value>
</property>
</configuration>

start-dfs.sh & stop-dfs.sh
When running as root, add the following at the top of both scripts:
HDFS_DATANODE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

start-yarn.sh & stop-yarn.sh
Add the following at the top of both scripts:
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root

Initialize the Cluster
hdfs namenode -format
Success indicator: a "successfully formatted" message in the output (scroll up a few lines to find it).
Start the Cluster
start-all.sh
PS: the historyserver needs to be started separately:
mapred --daemon start historyserver
View the Hadoop cluster report:
hdfs dfsadmin -report
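Besides the report, a quick sanity check is to run jps on the master and confirm that the daemon processes are up. The exact set depends on which daemons run on the master; a sketch of what to expect with the configuration above (master also appears in the workers file):

```shell
# Lists the running JVM processes; each Hadoop daemon shows up by name.
jps
# Expect entries such as NameNode, SecondaryNameNode, DataNode,
# ResourceManager, NodeManager, and (if started) JobHistoryServer.
```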
Thanks Again
Once again, many thanks to QingYingX for his contribution! He is a very reliable teammate!
If you have read through this post, be sure to go check out his blog -> Click Here!