Big Data in Practice: Configuring the Cluster
Founder
2024-02-16 05:05:44

Set up passwordless SSH login

1) Generate the public and private keys

[kfk@bigdata-pro01 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/kfk/.ssh/id_rsa):
/home/kfk/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/kfk/.ssh/id_rsa.
Your public key has been saved in /home/kfk/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:0LUJcD8XkH4jluGWhmFhJ9rw9uWe0ZM7+yP3hlcepcc kfk@bigdata-pro01.kfk.com
The key's randomart image is:
+---[RSA 2048]----+
(randomart image omitted)
+----[SHA256]-----+

2) Distribute the key

Use ssh-copy-id to copy the public key to each of the other nodes:
[kfk@bigdata-pro01 ~]$ ssh-copy-id bigdata-pro02
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/kfk/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
kfk@bigdata-pro02's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'bigdata-pro02'"
and check to make sure that only the key(s) you wanted were added.

[kfk@bigdata-pro01 ~]$ ssh-copy-id bigdata-pro03
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/kfk/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
kfk@bigdata-pro03's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'bigdata-pro03'"
and check to make sure that only the key(s) you wanted were added.

[kfk@bigdata-pro01 ~]$ ssh-copy-id bigdata-pro04
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/kfk/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
kfk@bigdata-pro04's password:
Permission denied, please try again.
kfk@bigdata-pro04's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'bigdata-pro04'"
and check to make sure that only the key(s) you wanted were added.

[kfk@bigdata-pro01 ~]$ ssh-copy-id bigdata-pro05
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/kfk/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
kfk@bigdata-pro05's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'bigdata-pro05'"
and check to make sure that only the key(s) you wanted were added.
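
The four ssh-copy-id runs differ only in the hostname, so they could be written as a loop. The following is a minimal sketch, assuming the same four hostnames as in the transcript. The function name copy_key_to_all and the DRY_RUN switch are illustrative additions, not part of the original steps. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Hostnames taken from the transcript above.
HOSTS="bigdata-pro02 bigdata-pro03 bigdata-pro04 bigdata-pro05"

# copy_key_to_all is a hypothetical helper name.
# With DRY_RUN=1 (default) it prints each command instead of running it.
copy_key_to_all() {
    for host in $HOSTS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh-copy-id $host"
        else
            ssh-copy-id "$host"
        fi
    done
}

copy_key_to_all
```

Running it with DRY_RUN=0 would execute the commands and still prompt once for each host's password.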

Synchronize the environment variables

/etc/bashrc
export JAVA_HOME=/opt/modules/jdk-18.0.2.1
export PATH=$PATH:$JAVA_HOME/bin:.
/home/kfk/.bash_profile

Create the /opt/modules directory on the other hosts

mkdir -p /opt/modules

Distribute the JDK

scp -r /opt/modules/jdk-18.0.2.1 root@bigdata-pro02:/opt/modules
scp -r /opt/modules/jdk-18.0.2.1 root@bigdata-pro03:/opt/modules
scp -r /opt/modules/jdk-18.0.2.1 root@bigdata-pro04:/opt/modules
scp -r /opt/modules/jdk-18.0.2.1 root@bigdata-pro05:/opt/modules

Distribute Hadoop

scp -r /home/kfk/hadoop-2.8.0 kfk@bigdata-pro02:/home/kfk/
scp -r /home/kfk/hadoop-2.8.0 kfk@bigdata-pro03:/home/kfk/
scp -r /home/kfk/hadoop-2.8.0 kfk@bigdata-pro04:/home/kfk/
scp -r /home/kfk/hadoop-2.8.0 kfk@bigdata-pro05:/home/kfk/
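
The directory creation and the two rounds of scp repeat the same commands for each host. As a rough sketch, they could be generated in one loop. The helper name print_distribution_commands is hypothetical, and running mkdir over ssh as root is an assumption: the original only says to create the directory on each host. The script prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hostnames taken from the commands above.
HOSTS="bigdata-pro02 bigdata-pro03 bigdata-pro04 bigdata-pro05"

# print_distribution_commands is a hypothetical helper: it only prints
# the commands, so they can be reviewed or piped to sh to run them.
print_distribution_commands() {
    for host in $HOSTS; do
        # Assumption: the directory is created remotely over ssh as root.
        echo "ssh root@$host mkdir -p /opt/modules"
        echo "scp -r /opt/modules/jdk-18.0.2.1 root@$host:/opt/modules"
        echo "scp -r /home/kfk/hadoop-2.8.0 kfk@$host:/home/kfk/"
    done
}

print_distribution_commands
```

The key distributed earlier belongs to the kfk user, so the root commands would still ask for a password unless a key is also installed for root.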

Configure the cluster

Core configuration file

Configure core-site.xml
[atguigu@hadoop102 hadoop]$ vi core-site.xml
Add the following configuration to the file:


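<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop102:9000</value>
</property>

<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>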

HDFS configuration files

Configure hadoop-env.sh
[atguigu@hadoop102 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
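Configure hdfs-site.xml
[atguigu@hadoop102 hadoop]$ vi hdfs-site.xml
Add the following configuration to the file:

<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>

<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop104:50090</value>
</property>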

YARN configuration files

Configure yarn-env.sh
[atguigu@hadoop102 hadoop]$ vi yarn-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
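Configure yarn-site.xml
[atguigu@hadoop102 hadoop]$ vi yarn-site.xml
Add the following configuration to the file:

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop103</value>
</property>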

MapReduce configuration files

Configure mapred-env.sh
[atguigu@hadoop102 hadoop]$ vi mapred-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
Configure mapred-site.xml
[atguigu@hadoop102 hadoop]$ cp mapred-site.xml.template mapred-site.xml

[atguigu@hadoop102 hadoop]$ vi mapred-site.xml
Add the following configuration to the file:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

Distribute the finished Hadoop configuration files across the cluster

Synchronize them to the other nodes with scp or an rsync script.
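
The original does not show the synchronization script. The following is a minimal sketch of what an rsync-based one might look like, assuming the configuration was edited on hadoop102 and lives under /opt/module/hadoop-2.7.2/etc/hadoop on every node. The helper name sync_conf and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Assumed targets: the other two nodes used in this section.
TARGETS="hadoop103 hadoop104"
CONF_DIR=/opt/module/hadoop-2.7.2/etc/hadoop

# sync_conf is a hypothetical helper.
# With DRY_RUN=1 (default) it prints each command instead of running it.
sync_conf() {
    for host in $TARGETS; do
        # -a preserves permissions and timestamps, -v lists transferred files.
        cmd="rsync -av $CONF_DIR/ $host:$CONF_DIR/"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

sync_conf
```

The trailing slash on the source path makes rsync copy the directory's contents rather than nesting a second hadoop directory inside the target.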

Starting the cluster node by node

(1) If this is the first time the cluster is started, format the NameNode
[atguigu@hadoop102 hadoop-2.7.2]$ hadoop namenode -format
(2) Start the NameNode on hadoop102
[atguigu@hadoop102 hadoop-2.7.2]$ hadoop-daemon.sh start namenode
[atguigu@hadoop102 hadoop-2.7.2]$ jps
3461 NameNode
(3) Start a DataNode on each of hadoop102, hadoop103, and hadoop104
[atguigu@hadoop102 hadoop-2.7.2]$ hadoop-daemon.sh start datanode
[atguigu@hadoop102 hadoop-2.7.2]$ jps
3461 NameNode
3608 Jps
3561 DataNode
[atguigu@hadoop103 hadoop-2.7.2]$ hadoop-daemon.sh start datanode
[atguigu@hadoop103 hadoop-2.7.2]$ jps
3190 DataNode
3279 Jps
[atguigu@hadoop104 hadoop-2.7.2]$ hadoop-daemon.sh start datanode
[atguigu@hadoop104 hadoop-2.7.2]$ jps
3237 Jps
3163 DataNode

Starting the whole cluster at once

Configure slaves

/opt/module/hadoop-2.7.2/etc/hadoop/slaves
[atguigu@hadoop102 hadoop]$ vi slaves
Add the following to the file:
hadoop102
hadoop103
hadoop104
Note: entries in this file must not have trailing spaces, and the file must not contain blank lines.
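
Trailing spaces and blank lines are easy to miss in an editor, so a quick check before starting the cluster may help. This is a small sketch using grep. The helper name check_slaves is hypothetical, and the example runs against a temporary file rather than the real slaves file.

```shell
#!/bin/sh
# check_slaves is a hypothetical helper: it prints offending lines (with
# line numbers) and returns 1 if the file has an empty line or a line
# ending in whitespace; otherwise it returns 0.
check_slaves() {
    if grep -n -e '^$' -e '[[:space:]]$' "$1"; then
        return 1
    fi
    return 0
}

# Example run against a temporary file with the three hostnames.
tmp=$(mktemp)
printf 'hadoop102\nhadoop103\nhadoop104\n' > "$tmp"
check_slaves "$tmp" && echo "slaves file looks clean"
rm -f "$tmp"
```

On the cluster it would be called as check_slaves /opt/module/hadoop-2.7.2/etc/hadoop/slaves.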

Start the cluster

(1) If this is the first time the cluster is started, format the NameNode
[atguigu@hadoop102 hadoop-2.7.2]$ bin/hdfs namenode -format
(2) Start HDFS
[atguigu@hadoop102 hadoop-2.7.2]$ sbin/start-dfs.sh
[atguigu@hadoop102 hadoop-2.7.2]$ jps
4166 NameNode
4482 Jps
4263 DataNode
[atguigu@hadoop103 hadoop-2.7.2]$ jps
3218 DataNode
3288 Jps

[atguigu@hadoop104 hadoop-2.7.2]$ jps
3221 DataNode
3283 SecondaryNameNode
3364 Jps
(3) Start YARN
[atguigu@hadoop103 hadoop-2.7.2]$ sbin/start-yarn.sh
Note: if the NameNode and the ResourceManager are not on the same machine, do not start YARN on the NameNode. Start it on the machine that hosts the ResourceManager.

Basic cluster tests

Upload files to the cluster

Upload a small file

[atguigu@hadoop102 hadoop-2.7.2]$ hadoop fs -mkdir -p /user/atguigu/input
[atguigu@hadoop102 hadoop-2.7.2]$ hadoop fs -put wcinput/wc.input /user/atguigu/input

Upload a large file

[atguigu@hadoop102 hadoop-2.7.2]$ bin/hadoop fs -put /opt/software/hadoop-2.7.2.tar.gz /user/atguigu/input

Find where the uploaded files are stored

Check the HDFS file storage path

[atguigu@hadoop102 subdir0]$ pwd
/opt/module/hadoop-2.7.2/data/tmp/dfs/data/current/BP-938951106-192.168.10.107-1495462844069/current/finalized/subdir0/subdir0

View the file content that HDFS stores on disk

[atguigu@hadoop102 subdir0]$ cat blk_1073741825
hadoop yarn
hadoop mapreduce
atguigu
atguigu

Concatenate the blocks

-rw-rw-r--. 1 atguigu atguigu 134217728 5月 23 16:01 blk_1073741836
-rw-rw-r--. 1 atguigu atguigu 1048583 5月 23 16:01 blk_1073741836_1012.meta
-rw-rw-r--. 1 atguigu atguigu 63439959 5月 23 16:01 blk_1073741837
-rw-rw-r--. 1 atguigu atguigu 495635 5月 23 16:01 blk_1073741837_1013.meta
[atguigu@hadoop102 subdir0]$ cat blk_1073741836>>tmp.file
[atguigu@hadoop102 subdir0]$ cat blk_1073741837>>tmp.file
[atguigu@hadoop102 subdir0]$ tar -zxvf tmp.file
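
The two block files are the pieces of the uploaded tarball. The first is exactly 134217728 bytes (128 MiB, the default dfs.blocksize in Hadoop 2.x) and the second holds the remainder, which is why concatenating them in order restores the archive. The arithmetic can be sketched as follows. The helper name split_into_blocks is hypothetical, and the total file size is derived by adding the two sizes in the listing.

```shell
#!/bin/sh
# split_into_blocks is a hypothetical helper: it prints the size of each
# block for a file of $1 bytes, given a block size of $2 bytes
# (default 134217728, i.e. 128 MiB).
split_into_blocks() {
    remaining=$1
    block=${2:-134217728}
    while [ "$remaining" -gt "$block" ]; do
        echo "$block"
        remaining=$((remaining - block))
    done
    if [ "$remaining" -gt 0 ]; then
        echo "$remaining"
    fi
}

# 197657687 = 134217728 + 63439959, the sum of the two block files listed above.
split_into_blocks 197657687
```

The two numbers it prints match the sizes of blk_1073741836 and blk_1073741837 in the listing.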

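Download

[atguigu@hadoop102 hadoop-2.7.2]$ bin/hadoop fs -get /user/atguigu/input/hadoop-2.7.2.tar.gz ./
Cluster time synchronization also needs to be configured.

Cluster time synchronization

How it works: pick one machine as the time server and have every other machine synchronize its clock with that server on a schedule, for example once every ten minutes.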
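
The original gives no commands for this step. One common way to get a ten-minute schedule is a cron entry on each client machine. The sketch below only prints such an entry. It assumes that ntpdate is installed at /usr/sbin/ntpdate, that hadoop102 is the machine chosen as time server, and that an NTP service is already running there. None of these are stated in the original, and the helper name cron_sync_entry is hypothetical.

```shell
#!/bin/sh
# cron_sync_entry is a hypothetical helper: it prints a crontab line that
# runs ntpdate against host $1 every $2 minutes (default 10).
# Assumption: ntpdate is installed at /usr/sbin/ntpdate on the clients.
cron_sync_entry() {
    echo "*/${2:-10} * * * * /usr/sbin/ntpdate $1"
}

# Assumption: hadoop102 acts as the time server.
cron_sync_entry hadoop102
```

The printed line could then be added with crontab -e on every machine other than the time server.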
