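Chapter 4 Hadoop Configuration File Parameters
Lab 1: Hadoop Fully Distributed Configuration
1.1 Lab Objectives
After completing this lab, you should be able to:
- Configure Hadoop in fully distributed mode
- Install Hadoop in fully distributed mode
- Explain what the parameters in the Hadoop configuration files mean
1.2 Lab Requirements
- Be familiar with installing Hadoop in fully distributed mode
- Understand the purpose of the Hadoop configuration files
1.3 Lab Procedure
1.3.1 Task 1: Install Hadoop on the master node
1.3.1.1 Step 1: Extract the JDK and Hadoop packages (jdk-8u152-linux-x64.tar.gz, hadoop-2.7.1.tar.gz) to /usr/local/src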
- [root@master ~]# tar zvxf jdk-8u152-linux-x64.tar.gz -C /usr/local/src/
- [root@master ~]# tar zvxf hadoop-2.7.1.tar.gz -C /usr/local/src/
1.3.1.2 Step 2: Rename the hadoop-2.7.1 directory to hadoop (and the JDK directory to jdk)
- [root@master ~]# cd /usr/local/src/
- [root@master src]# ls
- hadoop-2.7.1 jdk1.8.0_152
- [root@master src]# mv hadoop-2.7.1/ hadoop
- [root@master src]# mv jdk1.8.0_152/ jdk
- [root@master src]# ls
- hadoop jdk
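1.3.1.3 Step 3: Configure the Hadoop environment variables
- [root@master ~]# vi /etc/profile.d/hadoop.sh
Note: the environment variables were already configured when standalone Hadoop was installed in Chapter 2. Delete those earlier settings before adding the new ones.
- # Add the following lines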
- export JAVA_HOME=/usr/local/src/jdk
- export HADOOP_HOME=/usr/local/src/hadoop
- export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
1.3.1.4 Step 4: Apply the Hadoop environment variables
- [root@master ~]# source /etc/profile.d/hadoop.sh
- [root@master ~]# echo $PATH
- /usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
1.3.1.5 Step 5: Edit the hadoop-env.sh configuration file
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hadoop-env.sh
- # Add the following line
- export JAVA_HOME=/usr/local/src/jdk
1.3.2 Task 2: Configure the hdfs-site.xml parameters
Run the following command to edit the hdfs-site.xml configuration file.
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hdfs-site.xml
- # Add the following properties between the <configuration> and </configuration> tags
- <configuration>
- <property>
- <name>dfs.namenode.name.dir</name>
- <value>file:/usr/local/src/hadoop/dfs/name</value>
- </property>
- <property>
- <name>dfs.datanode.data.dir</name>
- <value>file:/usr/local/src/hadoop/dfs/data</value>
- </property>
- <property>
- <name>dfs.replication</name>
- <value>2</value>
- </property>
- </configuration>
- # Create the directories
- [root@master ~]# mkdir -p /usr/local/src/hadoop/dfs/{name,data}
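HDFS, Hadoop's distributed file system, normally stores data redundantly. The replication factor is usually 3, meaning that each piece of data is kept as three replicas. In this lab, dfs.replication is changed so that HDFS keeps 2 replicas of each file, which matches the two slave nodes in this cluster.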
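To confirm which value actually ended up in the file, the property can be read back out of the XML. The following is a minimal sketch, assuming each <name> and <value> element sits on its own line as in the listing above; the function name get_hadoop_prop and the temporary demo file are made up for illustration, and the awk script works on text only and is not a real XML parser.

```shell
#!/usr/bin/env bash
# Print the <value> that follows a given <name> in a Hadoop-style XML file.
# Hypothetical helper: assumes one element per line, as in the listing above.
get_hadoop_prop() {
  local file="$1" name="$2"
  awk -v want="$name" '
    index($0, "<name>" want "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$file"
}

# Demo against a throwaway copy of the fragment, not the real hdfs-site.xml.
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
EOF
get_hadoop_prop "$demo_conf" dfs.replication
```

On a machine where Hadoop is already on the PATH, `hdfs getconf -confKey dfs.replication` should report the effective value without any text parsing.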
1.3.3 Task 3: Configure the core-site.xml parameters
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/core-site.xml
- # Add the following properties between the <configuration> and </configuration> tags
- <configuration>
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://master:9000</value>
- </property>
- <property>
- <name>io.file.buffer.size</name>
- <value>131072</value>
- </property>
- <property>
- <name>hadoop.tmp.dir</name>
- <value>file:/usr/local/src/hadoop/tmp</value>
- </property>
- </configuration>
- # After saving the configuration above, create the directory
- [root@master ~]# mkdir -p /usr/local/src/hadoop/tmp
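If hadoop.tmp.dir is not set, the default temporary directory is /tmp/hadoop-${user.name}, which is /tmp/hadoop-hadoop when Hadoop runs as the hadoop user. That directory may be removed whenever Linux reboots, after which the Hadoop file system has to be formatted again or Hadoop will fail to run.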
1.3.4 Task 4: Configure mapred-site.xml
- [root@master ~]# cd /usr/local/src/hadoop/etc/hadoop/
- [root@master hadoop]# cp mapred-site.xml.template mapred-site.xml
- # Add the following properties between the <configuration> and </configuration> tags
- <configuration>
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.address</name>
- <value>master:10020</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.webapp.address</name>
- <value>master:19888</value>
- </property>
- </configuration>
1.3.5 Task 5: Configure yarn-site.xml
- [root@master hadoop]# vi /usr/local/src/hadoop/etc/hadoop/yarn-site.xml
- # Add the following properties between the <configuration> and </configuration> tags
- <configuration>
- <property>
- <name>yarn.resourcemanager.address</name>
- <value>master:8032</value>
- </property>
- <property>
- <name>yarn.resourcemanager.scheduler.address</name>
- <value>master:8030</value>
- </property>
- <property>
- <name>yarn.resourcemanager.webapp.address</name>
- <value>master:8088</value>
- </property>
- <property>
- <name>yarn.resourcemanager.resource-tracker.address</name>
- <value>master:8031</value>
- </property>
- <property>
- <name>yarn.resourcemanager.admin.address</name>
- <value>master:8033</value>
- </property>
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
- <property>
- <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
- <value>org.apache.hadoop.mapred.ShuffleHandler</value>
- </property>
- </configuration>
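A misspelled property name is generally not reported as an error by Hadoop; the setting is simply never used. As a rough sanity check, the property names in a file can be listed and compared against the prefix they are expected to share. This is a sketch only: list_unprefixed_props is a hypothetical helper, the demo file is a throwaway, and the name with the dropped leading letter is included on purpose to show what the check catches.

```shell
#!/usr/bin/env bash
# List every <name> entry in a config file that does not start with the
# expected prefix. Hypothetical helper; assumes one <name> element per line.
list_unprefixed_props() {
  local file="$1" prefix="$2"
  sed -n 's/.*<name>\(.*\)<\/name>.*/\1/p' "$file" |
    awk -v p="$prefix" 'index($0, p) != 1'
}

# Demo file with one deliberately misspelled name.
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
<property>
<name>arn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
EOF
list_unprefixed_props "$demo_conf" yarn.
```

An empty result means every name carries the prefix. It does not prove that the rest of each name is spelled correctly.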
1.3.6 Task 6: Other Hadoop configuration
1.3.6.1 Step 1: Configure the masters file
- # Edit the masters configuration file
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/masters
- # Add the following entry
- 10.10.10.128
1.3.6.2 Step 2: Configure the slaves file
- # Edit the slaves configuration file
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/slaves
- # Delete localhost and add the following entries
- 10.10.10.129
- 10.10.10.130
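The start and stop scripts read the slaves file to decide which hosts to contact over SSH, so stray blank lines or leftover entries matter. The sketch below shows one way to list the effective hosts in such a file. The helper name list_workers and the demo file are illustrative assumptions, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Print one host per line from a slaves-style file, dropping "#" comments,
# whitespace, and empty lines. Hypothetical helper for illustration.
list_workers() {
  sed -e 's/#.*//' -e 's/[[:space:]]//g' -e '/^$/d' "$1"
}

# Demo file standing in for /usr/local/src/hadoop/etc/hadoop/slaves.
demo_slaves="$(mktemp)"
cat > "$demo_slaves" <<'EOF'
# worker nodes
10.10.10.129

10.10.10.130
EOF

for host in $(list_workers "$demo_slaves"); do
  echo "worker: $host"
done
```

The same loop could wrap `ping -c 1 "$host"` or `ssh "$host" hostname` to check that every listed node is reachable before the cluster is started.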
1.3.6.3 Step 3: Create a user and change directory ownership
- # Create the user
- [root@master ~]# useradd hadoop
- [root@master ~]# echo 'hadoop' | passwd --stdin hadoop
- Changing password for user hadoop.
- passwd: all authentication tokens updated successfully.
- # Change the directory ownership
- [root@master ~]# chown -R hadoop.hadoop /usr/local/src/
- [root@master ~]# cd /usr/local/src/
- [root@master src]# ll
- total 0
- drwxr-xr-x 11 hadoop hadoop 171 Mar 27 01:51 hadoop
- drwxr-xr-x 8 hadoop hadoop 255 Sep 14 2017 jdk
1.3.6.4 Step 4: Configure passwordless SSH login from master to all slave nodes
- [root@master ~]# ssh-keygen -t rsa
- Generating public/private rsa key pair.
- Enter file in which to save the key (/root/.ssh/id_rsa):
- Created directory '/root/.ssh'.
- Enter passphrase (empty for no passphrase):
- Enter same passphrase again:
- Your identification has been saved in /root/.ssh/id_rsa.
- Your public key has been saved in /root/.ssh/id_rsa.pub.
- The key fingerprint is:
- SHA256:Ibeslip4Bo9erREJP37u7qhlwaEeMOCg8DlJGSComhk root@master
- The key's randomart image is:
- +---[RSA 2048]----+
- |B.oo |
- |Oo.o |
- |=o=. . o|
- |E.=.o + o |
- |.* BS|
- |* o = o |
- | * * o+ |
- |o O *o |
- |.=.+== |
- +----[SHA256]-----+
- [root@master ~]# ssh-copy-id root@slave1
- /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
- The authenticity of host 'slave1 (10.10.10.129)' can't be established.
- ECDSA key fingerprint is SHA256:Z643OMlGh0yMEc5i85oZ7c21NHdkzSZD9hY9K39xzP4.
- ECDSA key fingerprint is MD5:e0:ef:47:5f:ad:75:9a:44:08:bc:f2:10:8e:d6:53:4a.
- Are you sure you want to continue connecting (yes/no)? yes
- /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- root@slave1's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'root@slave1'"
- and check to make sure that only the key(s) you wanted were added.
- [root@master ~]# ssh-copy-id root@slave2
- /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
- The authenticity of host 'slave2 (10.10.10.130)' can't be established.
- ECDSA key fingerprint is SHA256:Z643OMlGh0yMEc5i85oZ7c21NHdkzSZD9hY9K39xzP4.
- ECDSA key fingerprint is MD5:e0:ef:47:5f:ad:75:9a:44:08:bc:f2:10:8e:d6:53:4a.
- Are you sure you want to continue connecting (yes/no)? yes
- /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- root@slave2's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'root@slave2'"
- and check to make sure that only the key(s) you wanted were added.
-
- [root@master ~]# ssh slave1
- Last login: Sun Mar 27 02:58:38 2022 from master
- [root@slave1 ~]# exit
- logout
- Connection to slave1 closed.
- [root@master ~]# ssh slave2
- Last login: Sun Mar 27 00:26:12 2022 from 10.10.10.1
- [root@slave2 ~]# exit
- logout
- Connection to slave2 closed.
1.3.6.5 Step 5: Copy everything under /usr/local/src/ to all slave nodes
- [root@master ~]# scp -r /usr/local/src/* root@slave1:/usr/local/src/
- [root@master ~]# scp -r /usr/local/src/* root@slave2:/usr/local/src/
- [root@master ~]# scp /etc/profile.d/hadoop.sh root@slave1:/etc/profile.d/
- hadoop.sh 100% 151 45.9KB/s 00:00
-
- [root@master ~]# scp /etc/profile.d/hadoop.sh root@slave2:/etc/profile.d/
- hadoop.sh 100% 151 93.9KB/s 00:00
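With more than a couple of slave nodes, repeating each scp by hand becomes error-prone. The following sketch loops over a host list instead. The function name sync_to_workers and the DRY_RUN switch are assumptions introduced here; by default the function only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Copy one path to the same destination on several hosts.
# Hypothetical helper: with DRY_RUN=1 (the default) it only prints the
# scp commands; set DRY_RUN=0 to actually run them.
sync_to_workers() {
  local src="$1" dest="$2"
  shift 2
  local host
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp -r $src root@$host:$dest"
    else
      scp -r "$src" "root@$host:$dest"
    fi
  done
}

# Dry run for the profile script copied in this step.
sync_to_workers /etc/profile.d/hadoop.sh /etc/profile.d/ slave1 slave2
```

Keeping the dry run as the default makes it easy to review the generated commands before anything is copied.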
1.3.6.6 Step 6: Run the following commands on all slave nodes
- (1) On slave1
- [root@slave1 ~]# useradd hadoop
- [root@slave1 ~]# echo 'hadoop' | passwd --stdin hadoop
- Changing password for user hadoop.
- passwd: all authentication tokens updated successfully.
- [root@slave1 ~]# chown -R hadoop.hadoop /usr/local/src/
- [root@slave1 ~]# ll /usr/local/src/
- total 0
- drwxr-xr-x 11 hadoop hadoop 171 Mar 27 03:07 hadoop
- drwxr-xr-x 8 hadoop hadoop 255 Mar 27 03:07 jdk
- [root@slave1 ~]# source /etc/profile.d/hadoop.sh
- [root@slave1 ~]# echo $PATH
- /usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
- (2) On slave2
- [root@slave2 ~]# useradd hadoop
- [root@slave2 ~]# echo 'hadoop' | passwd --stdin hadoop
- Changing password for user hadoop.
- passwd: all authentication tokens updated successfully.
- [root@slave2 ~]# chown -R hadoop.hadoop /usr/local/src/
- [root@slave2 ~]# ll /usr/local/src/
- total 0
- drwxr-xr-x 11 hadoop hadoop 171 Mar 27 03:09 hadoop
- drwxr-xr-x 8 hadoop hadoop 255 Mar 27 03:09 jdk
- [root@slave2 ~]# source /etc/profile.d/hadoop.sh
- [root@slave2 ~]# echo $PATH
- /usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
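Chapter 5 Running the Hadoop Cluster
Lab 1: Running the Hadoop Cluster
1.1 Lab Objectives
After completing this lab, you should be able to:
- Check the running state of Hadoop
- Format the Hadoop file system
- View the state of Hadoop's Java processes
- View the HDFS report
- View the status of the Hadoop nodes
- Stop the Hadoop processes
1.2 Lab Requirements
- Be familiar with how to check the running state of Hadoop
- Be familiar with how to stop the Hadoop processes
1.3 Lab Procedure
1.3.1 Task 1: Format the Hadoop file system
1.3.1.1 Step 1: Format the NameNode
Formatting clears the data on the NameNode. HDFS must be formatted before it is started for the first time and must not be formatted again on later starts; otherwise the DataNode processes will be missing, because a reformatted NameNode gets a new cluster ID that no longer matches the one stored on the DataNodes. In addition, once HDFS has run, Hadoop's working directory (set to /usr/local/src/hadoop/tmp in this book) contains data. If you do need to reformat, delete the data under the working directory first, or the format will run into problems.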
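Because an accidental second format causes trouble, it can help to check whether the NameNode directory has already been formatted. A formatted NameNode storage directory contains a current/VERSION file, so a small guard can test for it. The sketch below uses a temporary directory in place of the real dfs.namenode.name.dir, and namenode_is_formatted is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed if the given NameNode storage directory already holds a
# formatted image (detected by the presence of current/VERSION).
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Demo with a temporary directory instead of /usr/local/src/hadoop/dfs/name.
name_dir="$(mktemp -d)"
if namenode_is_formatted "$name_dir"; then
  echo "already formatted, skipping format"
else
  echo "not formatted yet"
fi

# Simulate what a format leaves behind, then check again.
mkdir -p "$name_dir/current"
touch "$name_dir/current/VERSION"
if namenode_is_formatted "$name_dir"; then
  echo "already formatted, skipping format"
fi
```

In a real script the guard would wrap `hdfs namenode -format`, so that the command runs only when the check fails.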
Run the following commands to format the NameNode.
- [root@master ~]# su - hadoop
- Last login: Fri Apr 1 23:34:46 CST 2022 on pts/1
- [hadoop@master ~]$ cd /usr/local/src/hadoop/
- [hadoop@master hadoop]$ ./bin/hdfs namenode -format
- 22/04/02 01:22:42 INFO namenode.NameNode: STARTUP_MSG:
- /************************************************************
1.3.1.2 Step 2: Start the NameNode
- [hadoop@master hadoop]$ hadoop-daemon.sh start namenode
- namenode running as process 11868. Stop it first.
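1.3.2 Task 2: View the Java processes
After startup completes, use the jps command to check whether it succeeded. jps is a tool shipped with the JDK that lists the PIDs of all current Java processes.
- [hadoop@master hadoop]$ jps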
- 12122 Jps
- 11868 NameNode
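Reading the jps listing by eye works for one node, but a script can check for a specific daemon. The following sketch tests whether a daemon name appears in the second column of jps-style output. The helper jps_has is hypothetical, and the sample text is copied from the listing above so that the demo does not need a running JVM.

```shell
#!/usr/bin/env bash
# Succeed if the named daemon appears as the second column of the
# jps-style output read from standard input. Hypothetical helper.
jps_has() {
  awk -v d="$1" '$2 == d { ok = 1 } END { exit !ok }'
}

# Sample output taken from the listing above.
sample_jps='12122 Jps
11868 NameNode'

for daemon in NameNode SecondaryNameNode; do
  if printf '%s\n' "$sample_jps" | jps_has "$daemon"; then
    echo "$daemon: running"
  else
    echo "$daemon: missing"
  fi
done
```

On a live node the same check would be written as `jps | jps_has NameNode`.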
1.3.2.1 Step 1: Switch to the hadoop user
- [hadoop@master ~]$ su - hadoop
- Password:
- Last login: Sat Apr 2 01:22:13 CST 2022 on pts/1
- Last failed login: Sat Apr 2 04:47:08 CST 2022 on pts/1
- There was 1 failed login attempt since the last successful login.
1.3.3 Task 3: View the HDFS report
- [hadoop@master ~]$ hdfs dfsadmin -report
- Configured Capacity: 0 (0 B)
- Present Capacity: 0 (0 B)
- DFS Remaining: 0 (0 B)
- DFS Used: 0 (0 B)
- DFS Used%: NaN%
- Under replicated blocks: 0
- Blocks with corrupt replicas: 0
- Missing blocks: 0
- Missing blocks (with replication factor 1): 0
- -------------------------------------------------
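A configured capacity of 0 in this report usually means that no DataNode has registered with the NameNode yet. The sketch below pulls the byte count out of the first "Configured Capacity" line so that a script can react to it. The helper name configured_capacity_bytes is an assumption, and the sample text mirrors the first lines of the report above.

```shell
#!/usr/bin/env bash
# Print the byte count from the first "Configured Capacity:" line of
# dfsadmin-report-style text read from standard input. Hypothetical helper.
configured_capacity_bytes() {
  awk -F': *' '/^Configured Capacity:/ { split($2, a, " "); print a[1]; exit }'
}

# Sample text mirroring the report above.
sample_report='Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)'

cap="$(printf '%s\n' "$sample_report" | configured_capacity_bytes)"
if [ "$cap" = "0" ]; then
  echo "no DataNode capacity reported"
else
  echo "configured capacity: $cap bytes"
fi
```

Against a live cluster the input would come from `hdfs dfsadmin -report` instead of the sample variable.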
1.3.3.1 Step 1: Generate the key pair
- [hadoop@master ~]$ ssh-keygen -t rsa
- Generating public/private rsa key pair.
- Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
- Created directory '/home/hadoop/.ssh'.
- Enter passphrase (empty for no passphrase):
- Enter same passphrase again:
- Your identification has been saved in /home/hadoop/.ssh/id_rsa.
- Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
- The key fingerprint is:
- SHA256:nW/cVxmRp5Ht9TKGT61OmGbhQtkBdpHyS5prGhx24pI hadoop@master.example.com
- The key's randomart image is:
- +---[RSA 2048]----+
- | o.oo +.|
- | ...o o.=|
- | = o *+|
- | .o.* * *|
- |S.+= O =.|
- | = ++oB.+ .|
- | E + =+o. .|
- | . .o. .. |
- |.o |
- +----[SHA256]-----+
- [hadoop@master ~]$ ssh-copy-id slave1
- /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
- The authenticity of host 'slave1 (10.10.10.129)' can't be established.
- ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
- ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
- Are you sure you want to continue connecting (yes/no)? yes
- /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- hadoop@slave1's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'slave1'"
- and check to make sure that only the key(s) you wanted were added.
- [hadoop@master ~]$ ssh-copy-id slave2
- /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
- The authenticity of host 'slave2 (10.10.10.130)' can't be established.
- ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
- ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
- Are you sure you want to continue connecting (yes/no)? yes
- /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- hadoop@slave2's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'slave2'"
- and check to make sure that only the key(s) you wanted were added.
- [hadoop@master ~]$ ssh-copy-id master
- /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
- The authenticity of host 'master (10.10.10.128)' can't be established.
- ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
- ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
- Are you sure you want to continue connecting (yes/no)? yes
- /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- hadoop@master's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'master'"
- and check to make sure that only the key(s) you wanted were added.
1.3.4 Task 4: Stop HDFS with stop-dfs.sh
- [hadoop@master ~]$ stop-dfs.sh
- Stopping namenodes on [master]
- master: stopping namenode
- 10.10.10.129: no datanode to stop
- 10.10.10.130: no datanode to stop
- Stopping secondary namenodes [0.0.0.0]
- The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
- ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
- ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
- Are you sure you want to continue connecting (yes/no)? yes
- 0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
- 0.0.0.0: no secondarynamenode to stop
1.3.4.1 Restart and verify
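Chapter 6 Hive Component Installation and Configuration
Lab 1: Hive Component Installation and Configuration
1.1. Lab Objectives
After completing this lab, you should be able to:
- Install and configure the Hive component
- Initialize and start the Hive component
1.2. Lab Requirements
- Be familiar with installing and configuring the Hive component
- Understand how the Hive component is initialized and started
1.3. Lab Procedure
1.3.1. Task 1: Download and extract the installation files
1.3.1.1. Step 1: Base environment and installation prerequisites
Hive is installed on top of Hadoop, so before installing Hive you need to make sure that Hadoop runs normally. This chapter builds on the fully distributed Hadoop system deployed earlier and installs Hive on the master node.
The deployment plan and package paths for Hive are as follows:
(1) The fully distributed Hadoop system is already installed in the current environment.
(2) MySQL is installed locally (account root, password Password123$); the packages are under /opt/software/mysql-5.7.18.
(3) The MySQL port is 3306.
(4) The MySQL JDBC driver is /opt/software/mysql-connector-java-5.1.47.jar; Hive's metadata store is switched to MySQL through it.
(5) The Hive package is /opt/software/apache-hive-2.0.0-bin.tar.gz.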
1.3.1.2. Step 2: Extract the installation files
(1) As the root user, extract the Hive package /opt/software/apache-hive-2.0.0-bin.tar.gz to /usr/local/src.
- [root@master ~]# tar -zxvf /opt/software/apache-hive-2.0.0-bin.tar.gz -C /usr/local/src/
(2) Rename the extracted apache-hive-2.0.0-bin directory to hive.
- [root@master ~]# mv /usr/local/src/apache-hive-2.0.0-bin/ /usr/local/src/hive/
(3) Change the owner and group of the hive directory to hadoop.
- [root@master ~]# chown -R hadoop:hadoop /usr/local/src/hive
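1.3.2. Task 2: Set up the Hive environment
1.3.2.1. Step 1: Uninstall the MariaDB database
Hive stores its metadata in a MySQL database, so before deploying Hive you first need to install MySQL on Linux, configure its character set, run the security initialization, and grant remote access. Log in as root and carry out the following steps:
(1) Stop the Linux firewall and disable it so that it does not start at boot.
- [root@master ~]# systemctl stop firewalld
- [root@master ~]# systemctl disable firewalld
(2) Uninstall the MariaDB that ships with Linux.
1) First check whether MariaDB is installed on the system.
- [root@master ~]# rpm -qa | grep mariadb
2) Uninstall the MariaDB packages.
MariaDB was not installed on this machine, so nothing had to be removed.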
1.3.2.2. Step 2: Install the MySQL database
(1) Install the MySQL common, libs, and client packages, in that order.
- [root@master ~]# cd /opt/software/mysql-5.7.18/
- [root@master mysql-5.7.18]# rpm -ivh mysql-community-common-5.7.18-1.el7.x86_64.rpm
- warning: mysql-community-common-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
- Preparing... ################################# [100%]
- package mysql-community-common-5.7.18-1.el7.x86_64 is already installed
- [root@master mysql-5.7.18]# rpm -ivh mysql-community-libs-5.7.18-1.el7.x86_64.rpm
- warning: mysql-community-libs-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
- Preparing... ################################# [100%]
- package mysql-community-libs-5.7.18-1.el7.x86_64 is already installed
- [root@master mysql-5.7.18]# rpm -ivh mysql-community-client-5.7.18-1.el7.x86_64.rpm
- warning: mysql-community-client-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
- Preparing... ################################# [100%]
- package mysql-community-client-5.7.18-1.el7.x86_64 is already installed
(2) Install the MySQL server package.
- [root@master mysql-5.7.18]# rpm -ivh mysql-community-server-5.7.18-1.el7.x86_64.rpm
- warning: mysql-community-server-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
- Preparing... ################################# [100%]
- package mysql-community-server-5.7.18-1.el7.x86_64 is already installed
(3) Modify the MySQL configuration by adding the settings listed in Table 6-1 to the /etc/my.cnf file.
Add the following settings to /etc/my.cnf, below the symbolic-links=0 line.
- default-storage-engine=innodb
- innodb_file_per_table
- collation-server=utf8_general_ci
- init-connect='SET NAMES utf8'
- character-set-server=utf8
(4) Start the MySQL database.
- [root@master ~]# systemctl start mysqld
(5) Check the MySQL status. If the mysqld state is active (running), MySQL is running normally.
If the mysqld state is failed, MySQL did not start correctly; in that case check the /etc/my.cnf file.
- [root@master ~]# systemctl status mysqld
- ● mysqld.service - MySQL Server
- Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
- Active: active (running) since Sun 2022-04-10 22:54:39 CST; 1h 0min ago
- Docs: man:mysqld(8)
- http://dev.mysql.com/doc/refman/en/using-systemd.html
- Main PID: 929 (mysqld)
- CGroup: /system.slice/mysqld.service
- └─929 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/my...
- Apr 10 22:54:35 master systemd[1]: Starting MySQL Server...
- Apr 10 22:54:39 master systemd[1]: Started MySQL Server.
(6) Look up the default MySQL password.
- [root@master ~]# cat /var/log/mysqld.log | grep password
- 2022-04-08T16:20:04.456271Z 1 [Note] A temporary password is generated for root@localhost: 0yf>>yWdMd8_
The default MySQL password is generated randomly after installation, so it is different for every installation.
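Since the password changes with every installation, copying it by hand from the log is easy to get wrong. The sketch below extracts the last field of the matching log line. The helper name extract_temp_password is an assumption, the sample line is the one shown above, and the approach assumes the generated password contains no spaces.

```shell
#!/usr/bin/env bash
# Print the temporary root password from mysqld-log-style text on standard
# input. Takes the last whitespace-separated field of the most recent
# matching line. Hypothetical helper; assumes the password has no spaces.
extract_temp_password() {
  awk '/temporary password/ { print $NF }' | tail -n 1
}

# Sample line copied from the log output above.
sample_log='2022-04-08T16:20:04.456271Z 1 [Note] A temporary password is generated for root@localhost: 0yf>>yWdMd8_'
printf '%s\n' "$sample_log" | extract_temp_password
```

On the server itself the input would be `/var/log/mysqld.log`, for example `extract_temp_password < /var/log/mysqld.log`.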
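(7) Initialize the MySQL database.
Run the mysql_secure_installation command to initialize MySQL. During initialization you must set the login password for the database root user. The password has to satisfy the security policy (uppercase and lowercase letters, digits, and special characters); Password123$ can be used.
The following prompts appear during MySQL initialization:
1) Change the password for root ? ((Press y|Y for Yes, any other key for No) asks whether to change the root password. Type y and press Enter.
2) Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No) asks whether to continue with the password you entered. Type y and press Enter.
3) Remove anonymous users? (Press y|Y for Yes, any other key for No) asks whether to remove anonymous users. Type y and press Enter.
4) Disallow root login remotely? (Press y|Y for Yes, any other key for No) asks whether to forbid remote root login. Type n and press Enter so that root is allowed to log in remotely.
5) Remove test database and access to it? (Press y|Y for Yes, any other key for No) asks whether to remove the test database. Type y and press Enter.
6) Reload privilege tables now? (Press y|Y for Yes, any other key for No) asks whether to reload the privilege tables. Type y and press Enter.
The mysql_secure_installation command runs as follows:
- [root@master ~]# mysql_secure_installation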
- Securing the MySQL server deployment.
- Enter password for user root:
- The 'validate_password' plugin is installed on the server.
- The subsequent steps will run with the existing configuration
- of the plugin.
- Using existing password for root.
- Estimated strength of the password: 100
- Change the password for root ? ((Press y|Y for Yes, any other key for No) : y
- New password:
- Re-enter new password:
- Estimated strength of the password: 100
- Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No) : y
- By default, a MySQL installation has an anonymous user,
- allowing anyone to log into MySQL without having to have
- a user account created for them. This is intended only for
- testing, and to make the installation go a bit smoother.
- You should remove them before moving into a production
- environment.
- Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
- Success.
- Normally, root should only be allowed to connect from
- 'localhost'. This ensures that someone cannot guess at
- the root password from the network.
- Disallow root login remotely? (Press y|Y for Yes, any other key for No) : n
- ... skipping.
- By default, MySQL comes with a database named 'test' that
- anyone can access. This is also intended only for testing,
- and should be removed before moving into a production
- environment.
- Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
- - Dropping test database...
- Success.
- - Removing privileges on test database...
- Success.
- Reloading the privilege tables will ensure that all changes
- made so far will take effect immediately.
- Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y
- Success.
- All done!
(8) Grant the root user access to the MySQL databases from both localhost and remote hosts.
- [root@master ~]# mysql -u root -p
- Enter password:
- Welcome to the MySQL monitor. Commands end with ; or \g.
- Your MySQL connection id is 9
- Server version: 5.7.18 MySQL Community Server (GPL)
- Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
- Oracle is a registered trademark of Oracle Corporation and/or its
- affiliates. Other names may be trademarks of their respective
- owners.
- Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
- mysql> grant all privileges on *.* to root@'localhost' identified by 'Password123$';
- Query OK, 0 rows affected, 1 warning (0.00 sec)
- mysql> grant all privileges on *.* to root@'%' identified by 'Password123$';
- Query OK, 0 rows affected, 1 warning (0.00 sec)
- mysql> flush privileges;
- Query OK, 0 rows affected (0.00 sec)
- mysql> select user,host from mysql.user where user='root';
- +------+-----------+
- | user | host |
- +------+-----------+
- | root | % |
- | root | localhost |
- +------+-----------+
- 2 rows in set (0.00 sec)
- mysql> exit;
- Bye
1.3.2.3. Step 3: Configure the Hive component
(1) Set the Hive environment variables and apply them.
- [root@master ~]# vim /etc/profile
- export HIVE_HOME=/usr/local/src/hive
- export PATH=$PATH:$HIVE_HOME/bin
- [root@master ~]# source /etc/profile
(2) Edit the Hive configuration files.
Switch to the hadoop user to carry out the following Hive configuration steps.
Copy the hive-default.xml.template file in /usr/local/src/hive/conf to hive-site.xml.
- [root@master ~]# su - hadoop
- Last login: Sun Apr 10 23:27:25 CS
- [hadoop@master ~]$ cp /usr/local/src/hive/conf/hive-default.xml.template /usr/local/src/hive/conf/hive-site.xml
(3) Use the vi editor to modify hive-site.xml so that Hive connects to the MySQL database, and set the path where Hive stores its temporary files.
- [hadoop@master ~]$ vi /usr/local/src/hive/conf/hive-site.xml
1) Set the MySQL database connection.
- <name>javax.jdo.option.ConnectionURL</name>
- <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
- <description>JDBC connect string for a JDBC metastore</description>
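Inside an XML file such as hive-site.xml, a literal & in a JDBC URL has to be written as &amp;amp; or the file will not parse. The following sketch converts a plain URL into the form that can be pasted into a <value> element. The helper name xml_escape_amp is made up for illustration, and it assumes the input has not been escaped already.

```shell
#!/usr/bin/env bash
# Replace every "&" in the argument with "&amp;" so that the string can be
# placed inside an XML <value> element. Hypothetical helper; running it
# on an already escaped string would escape it twice.
xml_escape_amp() {
  printf '%s' "$1" | sed 's/&/\&amp;/g'
}

# The plain JDBC URL for the metastore used in this lab.
url='jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&useSSL=false'
echo "<value>$(xml_escape_amp "$url")</value>"
```

The other XML special characters do not occur in this URL, so only the ampersand is handled here.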
2) Set the password of the MySQL root user.
- <property>
- <name>javax.jdo.option.ConnectionPassword</name>
- <value>Password123$</value>
- <description>password to use against metastore database</description>
- </property>
3) Configure verification of metastore schema version consistency. If the value is already false, leave it unchanged.
- <property>
- <name>hive.metastore.schema.verification</name>
- <value>false</value>
- <description>
- Enforce metastore schema version consistency.
- True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
- False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
- </description>
- </property>
4) Set the database driver.
- <property>
- <name>javax.jdo.option.ConnectionDriverName</name>
- <value>com.mysql.jdbc.Driver</value>
- <description>Driver class name for a JDBC metastore</description>
- </property>
5) Set the database user name javax.jdo.option.ConnectionUserName to root.
- <property>
- <name>javax.jdo.option.ConnectionUserName</name>
- <value>root</value>
- <description>Username to use against metastore database</description>
- </property>
6) In the settings below, replace ${system:java.io.tmpdir}/${system:user.name} with the /usr/local/src/hive/tmp directory or one of its subdirectories.
The following four settings need to be changed:
- <name>hive.querylog.location</name>
- <value>/usr/local/src/hive/tmp</value>
- <description>Location of Hive run time structured log file</description>
- <name>hive.exec.local.scratchdir</name>
- <value>/usr/local/src/hive/tmp</value>
- <name>hive.downloaded.resources.dir</name>
- <value>/usr/local/src/hive/tmp/resources</value>
- <name>hive.server2.logging.operation.log.location</name>
- <value>/usr/local/src/hive/tmp/operation_logs</value>
7) Create the temporary directory tmp inside the Hive installation directory.
- [hadoop@master ~]$ mkdir /usr/local/src/hive/tmp
At this point, the Hive component is installed and configured.
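Editing the placeholder in vi by hand is fine for a few lines, but the same substitution can be done with sed. The sketch below is a minimal example that operates on a throwaway file rather than the real hive-site.xml. The helper name relocate_hive_tmp is an assumption, and only values that contain exactly ${system:java.io.tmpdir}/${system:user.name} are rewritten, so any value using a different placeholder still has to be edited manually.

```shell
#!/usr/bin/env bash
# Replace the literal placeholder ${system:java.io.tmpdir}/${system:user.name}
# with a fixed directory, keeping a .bak copy of the original file.
# Hypothetical helper; the target path must not contain the "|" character.
relocate_hive_tmp() {
  local file="$1" target="$2"
  sed -i.bak -e 's|\${system:java\.io\.tmpdir}/\${system:user\.name}|'"$target"'|g' "$file"
}

# Demo on a temporary file with two placeholder-style values.
demo_site="$(mktemp)"
cat > "$demo_site" <<'EOF'
<value>${system:java.io.tmpdir}/${system:user.name}</value>
<value>${system:java.io.tmpdir}/${system:user.name}/operation_logs</value>
EOF
relocate_hive_tmp "$demo_site" /usr/local/src/hive/tmp
cat "$demo_site"
```

Running `grep -n 'system:' hive-site.xml` afterwards is a quick way to see whether any placeholder is still left in the file.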
1.3.2.4. Step 4: Initialize the Hive metadata
1) Copy the MySQL database driver (/opt/software/mysql-connector-java-5.1.46.jar) into the lib directory of the Hive installation.
- [hadoop@master ~]$ cp /opt/software/mysql-connector-java-5.1.46.jar /usr/local/src/hive/lib/
2) Restart Hadoop.
- [hadoop@master ~]$ stop-all.sh
- This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
- Stopping namenodes on [master]
- master: stopping namenode
- 10.10.10.129: stopping datanode
- 10.10.10.130: stopping datanode
- Stopping secondary namenodes [0.0.0.0]
- 0.0.0.0: stopping secondarynamenode
- stopping yarn daemons
- stopping resourcemanager
- 10.10.10.129: stopping nodemanager
- 10.10.10.130: stopping nodemanager
- no proxyserver to stop
- [hadoop@master ~]$ start-all.sh
- This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
- Starting namenodes on [master]
- master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
- 10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
- 10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
- Starting secondary namenodes [0.0.0.0]
- 0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
- starting yarn daemons
- starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
- 10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
- 10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
3) Initialize the database.
- [hadoop@master ~]$ schematool -initSchema -dbType mysql
- which: no hbase in (/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/src/hive/bin:/home/hadoop/.local/bin:/home/hadoop/bin)
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- Metastore connection URL:jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&us eSSL=false
- Metastore Connection Driver :com.mysql.jdbc.Driver
- Metastore connection User: root
- Mon Apr 11 00:46:32 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Starting metastore schema initialization to 2.0.0
- Initialization script hive-schema-2.0.0.mysql.sql
- Password123$
- Password123$
- No current connection
- org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
4) Start Hive.
- [hadoop@master hive]$ hive
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
- hive>
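Chapter 7 ZooKeeper Component Installation and Configuration
Lab 1: ZooKeeper Component Installation and Configuration
1.1. Lab Objectives
After completing this lab, you should be able to:
- Download and install ZooKeeper
- Configure the ZooKeeper options
- Start ZooKeeper
1.2. Lab Requirements
- Understand the ZooKeeper configuration options
- Be familiar with starting ZooKeeper
1.3. Lab Procedure
1.3.1 Task 1: Configure time synchronization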
- [root@master ~]# yum -y install chrony
- [root@master ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
-
- [root@master ~]# systemctl restart chronyd.service
- [root@master ~]# systemctl enable chronyd.service
- [root@master ~]# date
- Fri Apr 15 15:40:14 CST 2022
- [root@slave1 ~]# yum -y install chrony
- [root@slave1 ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
- [root@slave1 ~]# systemctl restart chronyd.service
- [root@slave1 ~]# systemctl enable chronyd.service
- [root@slave1 ~]# date
- Fri Apr 15 15:40:17 CST 2022
- [root@slave2 ~]# yum -y install chrony
- [root@slave2 ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
- [root@slave2 ~]# systemctl restart chronyd.service
- [root@slave2 ~]# systemctl enable chronyd.service
- [root@slave2 ~]# date
- Fri Apr 15 15:40:20 CST 2022
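1.3.2 Task 2: Download and install ZooKeeper
The latest ZooKeeper release is available from the official site at http://hadoop.apache.org/zookeeper/. The ZooKeeper version you install must be compatible with the Hadoop environment.
Note: the firewall on every node must be turned off; otherwise connection problems will occur.
1. The ZooKeeper package zookeeper-3.4.8.tar.gz has already been placed in the /opt/software directory on the Linux system.
2. Extract the package to the target directory by running the following commands on the Master node.
- [root@master ~]# tar xf /opt/software/zookeeper-3.4.8.tar.gz -C /usr/local/src/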
- [root@master ~]# cd /usr/local/src/
- [root@master src]# mv zookeeper-3.4.8/ zookeeper
1.3.3 Task 3: ZooKeeper configuration options
1.3.3.1 Step 1: Configure the Master node
(1) Create the data and logs folders under the ZooKeeper installation directory.
- [root@master src]# cd /usr/local/src/zookeeper/
- [root@master zookeeper]# mkdir data logs
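(2) Write each node's ID into that node's myid file. Every node gets a different ID: master writes 1, slave1 writes 2, and slave2 writes 3.
- [root@master zookeeper]# echo '1' > /usr/local/src/zookeeper/data/myid
(3) Create the configuration file zoo.cfg from the sample file.
- [root@master zookeeper]# cd /usr/local/src/zookeeper/conf/
- [root@master conf]# cp zoo_sample.cfg zoo.cfg
Change the dataDir parameter as follows:
- [root@master conf]# vi zoo.cfg
- dataDir=/usr/local/src/zookeeper/data
(4) Append the following parameters to the end of zoo.cfg. Each line names one of the three ZooKeeper nodes and its ports: 2888 is used by followers to connect to the leader, and 3888 is used for leader election.
- server.1=master:2888:3888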
- server.2=slave1:2888:3888
- server.3=slave2:2888:3888
(5) Change the owner of the ZooKeeper installation directory to the hadoop user.
- [root@master conf]# chown -R hadoop:hadoop /usr/local/src/
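The myid values and the server.N lines have to agree: the number after `server.` is the ID that the named host stores in its myid file. The following is a minimal sketch of deriving both from one host list so that they cannot drift apart. It writes into a scratch directory created with mktemp instead of the real /usr/local/src/zookeeper, and the per-host subdirectories are purely illustrative assumptions, not part of the lab.

```shell
# Sketch only: generate server.N entries and matching myid files from one list.
hosts="master slave1 slave2"   # order defines the IDs 1, 2, 3 used in the text
workdir=$(mktemp -d)           # scratch directory (assumption), not the real install path

printf 'dataDir=/usr/local/src/zookeeper/data\n' > "$workdir/zoo.cfg"

id=1
for h in $hosts; do
  # One server.N line per host, with the ports used in this lab.
  printf 'server.%d=%s:2888:3888\n' "$id" "$h" >> "$workdir/zoo.cfg"
  # The same number goes into that host's myid file.
  mkdir -p "$workdir/$h/data"
  printf '%d\n' "$id" > "$workdir/$h/data/myid"
  id=$((id + 1))
done

cat "$workdir/zoo.cfg"
```

On a real cluster each myid file would be written on its own node, as the echo commands in this task do; the sketch only shows how the two pieces of configuration correspond.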
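1.3.3.2 Step 2: Configure the Slave nodes
(1) Copy the ZooKeeper installation directory from the Master node to the two Slave nodes (node1 and node2 in the commands appear to be host names that resolve to slave1 and slave2 in this environment).
- [root@master ~]# scp -r /usr/local/src/zookeeper node1:/usr/local/src/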
- [root@master ~]# scp -r /usr/local/src/zookeeper node2:/usr/local/src/
(2) On slave1, change the owner of the zookeeper directory to the hadoop user.
- [root@slave1 ~]# chown -R hadoop:hadoop /usr/local/src/
- [root@slave1 ~]# ll /usr/local/src/
- total 4
- drwxr-xr-x. 12 hadoop hadoop 183 Apr 2 18:11 hadoop
- drwxr-xr-x 9 hadoop hadoop 183 Apr 15 16:37 hbase
- drwxr-xr-x. 8 hadoop hadoop 255 Apr 2 18:06 jdk
- drwxr-xr-x 12 hadoop hadoop 4096 Apr 22 15:31 zookeeper
(3) On slave1, set the node's myid to 2.
- [root@slave1 ~]# echo 2 > /usr/local/src/zookeeper/data/myid
(4) On slave2, change the owner of the zookeeper directory to the hadoop user.
- [root@slave2 ~]# chown -R hadoop:hadoop /usr/local/src/
(5) On slave2, set the node's myid to 3.
- [root@slave2 ~]# echo 3 > /usr/local/src/zookeeper/data/myid
1.3.3.3 Step 3: Configure system environment variables
Add the environment variable configuration on all three nodes: master, slave1, and slave2.
- [root@master conf]# vi /etc/profile.d/zookeeper.sh
- export ZOOKEEPER_HOME=/usr/local/src/zookeeper
- export PATH=${ZOOKEEPER_HOME}/bin:$PATH
- [root@master ~]# scp /etc/profile.d/zookeeper.sh node1:/etc/profile.d/
- zookeeper.sh 100% 87 42.3KB/s 00:00
- [root@master ~]# scp /etc/profile.d/zookeeper.sh node2:/etc/profile.d/
- zookeeper.sh 100% 87 50.8KB/s 00:00
1.3.4 Task 4: Start ZooKeeper
ZooKeeper must be started as the hadoop user.
(1) Run zkServer.sh start on each of the three nodes (master, slave1, slave2) to start ZooKeeper.
- [root@master ~]# su - hadoop
- Last login: Fri Apr 15 21:54:17 CST 2022 on pts/0
- [hadoop@master ~]$ jps
- 3922 Jps
- [hadoop@master ~]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- [hadoop@master ~]$ jps
- 3969 Jps
- 3950 QuorumPeerMain
- [root@slave1 ~]# su - hadoop
- Last login: Fri Apr 15 22:06:47 CST 2022 on pts/0
- [hadoop@slave1 ~]$ jps
- 1370 Jps
- [hadoop@slave1 ~]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- [hadoop@slave1 ~]$ jps
- 1395 QuorumPeerMain
- 1421 Jps
- [root@slave2 ~]# su - hadoop
- Last login: Fri Apr 15 16:25:52 CST 2022 on pts/1
- [hadoop@slave2 ~]$ jps
- 1336 Jps
- [hadoop@slave2 ~]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- [hadoop@slave2 ~]$ jps
- 1361 QuorumPeerMain
- 1387 Jps
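(2) Once all three nodes have started, check the ZooKeeper status on each of them. One node should report Mode: leader and the others Mode: follower.
- [hadoop@master conf]$ zkServer.sh status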
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Mode: follower
- [hadoop@slave1 ~]$ zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Mode: leader
- [hadoop@slave2 conf]$ zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Mode: follower
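The status output above shows one leader and two followers, which is what a healthy three-node ensemble looks like. ZooKeeper keeps serving requests only while a majority of the configured servers is running, so a three-node ensemble tolerates the loss of one node. The sketch below computes that majority and extracts the Mode line from status text; `quorum_size` and `parse_mode` are hypothetical helper names, and the sample text is copied from the output above.

```shell
# Sketch only: helper names are illustrative, not part of ZooKeeper.

# Smallest number of servers that forms a majority of an ensemble of size $1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Read "zkServer.sh status" output on stdin and print the value of the Mode line.
parse_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

status_output='ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: leader'

printf '%s\n' "$status_output" | parse_mode
quorum_size 3
```

On a live node the same filter could be applied as `zkServer.sh status 2>&1 | parse_mode`, assuming the script prints the Mode line in the form shown above.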
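Chapter 8 HBase Component Installation and Configuration
Lab 1: HBase Component Installation and Configuration
1.1 Lab Objectives
After completing this lab, you should be able to:
- Install and configure HBase
- Use common HBase shell commands
1.2 Lab Requirements
- Understand how HBase works
- Be familiar with common HBase shell commands
1.3 Lab Procedure
1.3.1 Task 1: Configure time synchronization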
- [root@master ~]# yum -y install chrony
- [root@master ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
-
- [root@master ~]# systemctl restart chronyd.service
- [root@master ~]# systemctl enable chronyd.service
- [root@master ~]# date
- Fri Apr 15 15:40:14 CST 2022
- [root@slave1 ~]# yum -y install chrony
- [root@slave1 ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
- [root@slave1 ~]# systemctl restart chronyd.service
- [root@slave1 ~]# systemctl enable chronyd.service
- [root@slave1 ~]# date
- Fri Apr 15 15:40:17 CST 2022
- [root@slave2 ~]# yum -y install chrony
- [root@slave2 ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
- [root@slave2 ~]# systemctl restart chronyd.service
- [root@slave2 ~]# systemctl enable chronyd.service
- [root@slave2 ~]# date
- Fri Apr 15 15:40:20 CST 2022
1.3.2 Task 2: Install and configure HBase
1.3.2.1 Step 1: Extract the HBase installation package
- [root@master ~]# tar -zxvf hbase-1.2.1-bin.tar.gz -C /usr/local/src/
1.3.2.2 Step 2: Rename the HBase installation folder
- [root@master ~]# cd /usr/local/src/
- [root@master src]# mv hbase-1.2.1 hbase
1.3.2.3 Step 3: Add the environment variables on all nodes
- [root@master ~]# cat /etc/profile
- # set hbase environment
- export HBASE_HOME=/usr/local/src/hbase
- export PATH=$HBASE_HOME/bin:$PATH
- [root@slave1 ~]# cat /etc/profile
- # set hbase environment
- export HBASE_HOME=/usr/local/src/hbase
- export PATH=$HBASE_HOME/bin:$PATH
- [root@slave2 ~]# cat /etc/profile
- # set hbase environment
- export HBASE_HOME=/usr/local/src/hbase
- export PATH=$HBASE_HOME/bin:$PATH
1.3.2.4 Step 4: Apply the environment variables on all nodes
- [root@master ~]# source /etc/profile
- [root@master ~]# echo $PATH
- /usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/src/hive/bin:/root/bin:/usr/local/src/hive/bin:/usr/local/src/hive/bin
- [root@slave1 ~]# source /etc/profile
- [root@slave1 ~]# echo $PATH
- /usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
- [root@slave2 ~]# source /etc/profile
- [root@slave2 ~]# echo $PATH
- /usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
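The PATH values above list the same directories several times, because /etc/profile is sourced again in a shell that already contains them. This is harmless, but it makes the value hard to read. A minimal sketch of a guard that prepends a directory only when it is not already present follows; `prepend_once` is a hypothetical helper, not something the lab defines.

```shell
# Sketch only: prepend a directory to a colon-separated list unless it is already there.
prepend_once() {
  # $1 = directory to add, $2 = existing colon-separated list
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;      # already present, return the list unchanged
    *)        printf '%s\n' "$1:$2" ;;   # absent, put it in front
  esac
}

demo=/usr/sbin:/usr/bin
demo=$(prepend_once /usr/local/src/hbase/bin "$demo")
demo=$(prepend_once /usr/local/src/hbase/bin "$demo")   # second call changes nothing
echo "$demo"
```

In a profile script this could be used as `export PATH=$(prepend_once "$HBASE_HOME/bin" "$PATH")`, so that sourcing the file repeatedly leaves PATH unchanged.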
1.3.2.5 Step 5: On the master node, change to the configuration directory
- [root@master ~]# cd /usr/local/src/hbase/conf/
1.3.2.6 Step 6: On the master node, configure hbase-env.sh
- [root@master conf]# cat hbase-env.sh
- export JAVA_HOME=/usr/local/src/jdk
- export HBASE_MANAGES_ZK=true
- export HBASE_CLASSPATH=/usr/local/src/hadoop/etc/hadoop/
1.3.2.7 Step 7: On the master node, configure hbase-site.xml
- [root@master conf]# cat hbase-site.xml
- <configuration>
- <property>
- <name>hbase.rootdir</name>
- <value>hdfs://master:9000/hbase</value>
- </property>
- <property>
- <name>hbase.master.info.port</name>
- <value>60010</value>
- </property>
- <property>
- <name>hbase.zookeeper.property.clientPort</name>
- <value>2181</value>
- </property>
- <property>
- <name>zookeeper.session.timeout</name>
- <value>120000</value>
- </property>
- <property>
- <name>hbase.zookeeper.quorum</name>
- <value>master,node1,node2</value>
- </property>
- <property>
- <name>hbase.tmp.dir</name>
- <value>/usr/local/src/hbase/tmp</value>
- </property>
- <property>
- <name>hbase.cluster.distributed</name>
- <value>true</value>
- </property>
- </configuration>
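Several later steps depend on values in this file, for example the web UI port and the ZooKeeper quorum host list. The sketch below reads a property back out of a file laid out like the one above, which is a quick way to confirm what was actually saved. It assumes each `<name>` and `<value>` element sits on its own line, as shown; it is not a general XML parser, and `get_prop` is a hypothetical helper name. The demo works on a temporary copy holding two of the properties.

```shell
# Sketch only: line-oriented lookup, valid for the one-element-per-line layout above.
get_prop() {
  # $1 = property name, $2 = path to the XML file
  awk -v key="$1" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character <value> and the 8-character </value>.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }' "$2"
}

conf=$(mktemp)   # temporary stand-in for hbase-site.xml
cat > "$conf" <<'EOF'
<configuration>
<property>
<name>hbase.master.info.port</name>
<value>60010</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,node1,node2</value>
</property>
</configuration>
EOF

get_prop hbase.master.info.port "$conf"
get_prop hbase.zookeeper.quorum "$conf"
```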
1.3.2.8 Step 8: On the master node, edit the regionservers file
- [root@master conf]# cat regionservers
- node1
- node2
1.3.2.9 Step 9: On the master node, create the hbase.tmp.dir directory
- [root@master ~]# mkdir /usr/local/src/hbase/tmp
1.3.2.10 Step 10: Copy the HBase installation from master to node1 and node2
- [root@master ~]# scp -r /usr/local/src/hbase/ root@node1:/usr/local/src/
- [root@master ~]# scp -r /usr/local/src/hbase/ root@node2:/usr/local/src/
1.3.2.11 Step 11: Change the ownership of the hbase directory on all nodes
- [root@master ~]# chown -R hadoop:hadoop /usr/local/src/hbase/
- [root@slave1 ~]# chown -R hadoop:hadoop /usr/local/src/hbase/
- [root@slave2 ~]# chown -R hadoop:hadoop /usr/local/src/hbase/
1.3.2.12 Step 12: Switch to the hadoop user on all nodes
- [root@master ~]# su - hadoop
- Last login: Mon Apr 11 00:42:46 CST 2022 on pts/0
- [root@slave1 ~]# su - hadoop
- Last login: Fri Apr 8 22:57:42 CST 2022 on pts/0
- [root@slave2 ~]# su - hadoop
- Last login: Fri Apr 8 22:57:54 CST 2022 on pts/0
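1.3.2.13 Step 13: Start HBase
Start Hadoop first, then ZooKeeper, and finally HBase. Because hbase-env.sh sets HBASE_MANAGES_ZK=true, start-hbase.sh launches ZooKeeper itself (the HQuorumPeer processes), so only Hadoop is started by hand here.
- [hadoop@master ~]$ start-all.sh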
- [hadoop@master ~]$ jps
- 2130 SecondaryNameNode
- 1927 NameNode
- 2554 Jps
- 2301 ResourceManager
- [hadoop@slave1 ~]$ jps
- 1845 NodeManager
- 1977 Jps
- 1725 DataNode
- [hadoop@slave2 ~]$ jps
- 2080 Jps
- 1829 DataNode
- 1948 NodeManager
1.3.2.14 Step 14: Start HBase on the master node
- [hadoop@master conf]$ start-hbase.sh
- [hadoop@master conf]$ jps
- 2130 SecondaryNameNode
- 3572 HQuorumPeer
- 1927 NameNode
- 5932 HMaster
- 2301 ResourceManager
- 6157 Jps
- [hadoop@slave1 ~]$ jps
- 2724 Jps
- 1845 NodeManager
- 1725 DataNode
- 2399 HQuorumPeer
- 2527 HRegionServer
- [root@slave2 ~]# jps
- 3795 Jps
- 1829 DataNode
- 3529 HRegionServer
- 1948 NodeManager
- 3388 HQuorumPeer
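Comparing jps listings by eye across three nodes is error-prone. The following sketch checks a jps listing against a list of expected daemon names and prints whichever are missing. `missing_daemons` is a hypothetical helper; the expected names are taken from the master output above.

```shell
# Sketch only: report expected daemons that do not appear in jps output.
missing_daemons() {
  # stdin: jps output, one "PID Name" pair per line; arguments: expected names
  seen=$(awk '{ print $2 }')
  for name in "$@"; do
    # grep -x matches whole lines, so HMaster does not match HMasterFoo.
    printf '%s\n' "$seen" | grep -qx "$name" || printf '%s\n' "$name"
  done
}

master_jps='2130 SecondaryNameNode
3572 HQuorumPeer
1927 NameNode
5932 HMaster
2301 ResourceManager
6157 Jps'

# Prints nothing when every expected daemon is present.
printf '%s\n' "$master_jps" |
  missing_daemons NameNode SecondaryNameNode ResourceManager HQuorumPeer HMaster
```

On a live node the listing would come from `jps | missing_daemons ...`; for the slave nodes the expected names in this lab would be DataNode, NodeManager, HQuorumPeer, and HRegionServer.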
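1.3.2.15 Step 15: Edit the hosts file on Windows
(C:\Windows\System32\drivers\etc\hosts)
Drag the hosts file to the desktop, edit it to add the mapping between the master host name and its IP address, and put it back in its original location. Then open http://master:60010 in a browser to reach the HBase web UI.
1.3.3 Task 3: Common HBase shell commands
1.3.3.1 Step 1: Enter the HBase shell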
- [hadoop@master ~]$ hbase shell
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- HBase Shell; enter 'help<RETURN>' for list of supported commands.
- Type "exit<RETURN>" to leave the HBase Shell
- Version 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016
- hbase(main):001:0>
1.3.3.2 Step 2: Create the table scores with two column families: grade and course
- hbase(main):001:0> create 'scores','grade','course'
- 0 row(s) in 1.4400 seconds
- => Hbase::Table - scores
1.3.3.3 Step 3: Check the database status
- hbase(main):002:0> status
- 1 active master, 0 backup masters, 2 servers, 0 dead, 1.5000 average load
1.3.3.4 Step 4: Check the database version
- hbase(main):003:0> version
- 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016
1.3.3.5 Step 5: List the tables
- hbase(main):004:0> list
- TABLE
- scores
- 1 row(s) in 0.0150 seconds
- => ["scores"]
1.3.3.6 Step 6: Insert record 1: jie, grade:, 146cloud
- hbase(main):005:0> put 'scores','jie','grade:','146cloud'
- 0 row(s) in 0.1060 seconds
1.3.3.7 Step 7: Insert record 2: jie, course:math, 86
- hbase(main):006:0> put 'scores','jie','course:math','86'
- 0 row(s) in 0.0120 seconds
1.3.3.8 Step 8: Insert record 3: jie, course:cloud, 92
- hbase(main):009:0> put 'scores','jie','course:cloud','92'
- 0 row(s) in 0.0070 seconds
1.3.3.9 Step 9: Insert record 4: shi, grade:, 133soft
- hbase(main):010:0> put 'scores','shi','grade:','133soft'
- 0 row(s) in 0.0120 seconds
1.3.3.10 Step 10: Insert record 5: shi, course:math, 87
- hbase(main):011:0> put 'scores','shi','course:math','87'
- 0 row(s) in 0.0090 seconds
1.3.3.11 Step 11: Insert record 6: shi, course:cloud, 96
- hbase(main):012:0> put 'scores','shi','course:cloud','96'
- 0 row(s) in 0.0100 seconds
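Each put above has the same shape: table, row key, column (family:qualifier), value. When there are many cells to load, the commands can be generated from a small data file instead of being typed one by one. The sketch below only produces the command text; `gen_puts` is a hypothetical helper, and the whitespace-separated input format is an assumption made for this example (it would not work for values that contain spaces).

```shell
# Sketch only: turn "rowkey column value" lines into HBase shell put commands.
gen_puts() {
  # $1 = table name; stdin = one "rowkey column value" triple per line
  while read -r row col val; do
    [ -n "$row" ] || continue   # skip blank lines
    printf "put '%s','%s','%s','%s'\n" "$1" "$row" "$col" "$val"
  done
}

gen_puts scores <<'EOF'
jie grade: 146cloud
jie course:math 86
jie course:cloud 92
shi grade: 133soft
shi course:math 87
shi course:cloud 96
EOF
```

The generated lines match the six put commands in Steps 6 to 11. They could be saved to a file and fed to `hbase shell` on standard input; how the shell handles non-interactive input varies between HBase versions, so treat that part as something to verify on your installation.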
1.3.3.12 Step 12: Read jie's record
- hbase(main):013:0> get 'scores','jie'
- COLUMN CELL
- course:cloud timestamp=1650015032132, value=92
- course:math timestamp=1650014925177, value=86
- grade: timestamp=1650014896056, value=146cloud
- 3 row(s) in 0.0250 seconds
1.3.3.13 Step 13: Read jie's class (the grade column family)
- hbase(main):014:0> get 'scores','jie','grade'
- COLUMN CELL
- grade: timestamp=1650014896056, value=146cloud
- 1 row(s) in 0.0110 seconds
1.3.3.14 Step 14: View all records in the table
- hbase(main):001:0> scan 'scores'
- ROW COLUMN+CELL
- jie column=course:cloud, timestamp=1650015032132, value=92
- jie column=course:math, timestamp=1650014925177, value=86
- jie column=grade:, timestamp=1650014896056, value=146cloud
- shi column=course:cloud, timestamp=1650015240873, value=96
- shi column=course:math, timestamp=1650015183521, value=87
- 2 row(s) in 0.1490 seconds
1.3.3.15 Step 15: View table records by column family
- hbase(main):002:0> scan 'scores',{COLUMNS=>'course'}
- ROW COLUMN+CELL
- jie column=course:cloud, timestamp=1650015032132, value=92
- jie column=course:math, timestamp=1650014925177, value=86
- shi column=course:cloud, timestamp=1650015240873, value=96
- shi column=course:math, timestamp=1650015183521, value=87
- 2 row(s) in 0.0160 seconds
1.3.3.16 Step 16: Delete a specified record
- hbase(main):003:0> delete 'scores','shi','grade'
- 0 row(s) in 0.0560 seconds
1.3.3.17 Step 17: Run scan again after the deletion
- hbase(main):004:0> scan 'scores'
- ROW COLUMN+CELL
- jie column=course:cloud, timestamp=1650015032132, value=92
- jie column=course:math, timestamp=1650014925177, value=86
- jie column=grade:, timestamp=1650014896056, value=146cloud
- shi column=course:cloud, timestamp=1650015240873, value=96
- shi column=course:math, timestamp=1650015183521, value=87
- 2 row(s) in 0.0130 seconds
1.3.3.18 Step 18: Add a new column family
- hbase(main):005:0> alter 'scores',NAME=>'age'
- Updating all regions with the new schema...
- 1/1 regions updated.
- Done.
- 0 row(s) in 2.0110 seconds
1.3.3.19 Step 19: View the table structure
- hbase(main):006:0> describe 'scores'
- Table scores is ENABLED
- scores
- COLUMN FAMILIES DESCRIPTION
- {NAME => 'age', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
- {NAME => 'course', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
- {NAME => 'grade', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
- 3 row(s) in 0.0230 seconds
1.3.3.20 Step 20: Delete the column family
- hbase(main):007:0> alter 'scores',NAME=>'age',METHOD=>'delete'
- Updating all regions with the new schema...
- 1/1 regions updated.
- Done.
- 0 row(s) in 2.1990 seconds
1.3.3.21 Step 21: Delete the table (a table must be disabled first; it can then be removed with drop 'scores')
- hbase(main):008:0> disable 'scores'
- 0 row(s) in 2.3190 seconds
1.3.3.22 Step 22: Exit the HBase shell (type exit)
1.3.3.23 Step 23: Stop HBase
- [hadoop@master ~]$ stop-hbase.sh
- stopping hbase.................
- master: stopping zookeeper.
- node2: stopping zookeeper.
- node1: stopping zookeeper.
Stop Hadoop on the master node.
- [hadoop@master ~]$ stop-all.sh
- This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
- Stopping namenodes on [master]
- master: stopping namenode
- 10.10.10.130: stopping datanode
- 10.10.10.129: stopping datanode
- Stopping secondary namenodes [0.0.0.0]
- 0.0.0.0: stopping secondarynamenode
- stopping yarn daemons
- stopping resourcemanager
- 10.10.10.129: stopping nodemanager
- 10.10.10.130: stopping nodemanager
- no proxyserver to stop
- [hadoop@master ~]$ jps
- 3820 Jps
- [hadoop@slave1 ~]$ jps
- 2220 Jps
- [root@slave2 ~]# jps
- 2082 Jps
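Chapter 9 Sqoop Component Installation and Configuration
Lab 1: Sqoop Component Installation and Configuration
1.1. Lab Objectives
After completing this lab, you should be able to:
- Download and extract Sqoop
- Configure the Sqoop environment
- Install Sqoop
- Use the Sqoop template commands
1.2. Lab Requirements
1.3. Lab Procedure
1.3.1. Task 1: Download and extract Sqoop
The Sqoop version you install must be compatible with the Hadoop environment. As the root user on the Master node, extract the archive /opt/software/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz into the /usr/local/src directory.
- [root@master ~]# tar xf /opt/software/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /usr/local/src/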
Rename the extracted sqoop-1.4.7.bin__hadoop-2.6.0 folder to sqoop.
- [root@master ~]# cd /usr/local/src/
- [root@master src]# mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop
1.3.2. Task 2: Configure the Sqoop environment
1.3.2.1. Step 1: Create the Sqoop configuration file sqoop-env.sh.
Copy the sqoop-env-template.sh template and name the copy sqoop-env.sh.
- [root@master src]# cd /usr/local/src/sqoop/conf/
- [root@master conf]# cp sqoop-env-template.sh sqoop-env.sh
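1.3.2.2. Step 2: Edit sqoop-env.sh and add the installation paths of Hadoop, HBase, Hive, and the other components.
Note: the installation paths below must match the actual installation paths in your environment.
- vim sqoop-env.sh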
- export HADOOP_COMMON_HOME=/usr/local/src/hadoop
- export HADOOP_MAPRED_HOME=/usr/local/src/hadoop
- export HBASE_HOME=/usr/local/src/hbase
- export HIVE_HOME=/usr/local/src/hive
1.3.2.3. Step 3: Configure the Linux system environment variables and add the Sqoop path.
- vim /etc/profile.d/sqoop.sh
- export SQOOP_HOME=/usr/local/src/sqoop
- export PATH=$SQOOP_HOME/bin:$PATH
- export CLASSPATH=$CLASSPATH:$SQOOP_HOME/lib
- [root@master conf]# source /etc/profile.d/sqoop.sh
- [root@master conf]# echo $PATH
- /usr/local/src/sqoop/bin:/usr/local/src/hbase/bin:/usr/local/src/zookeeper/bin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/src/hive/bin:/root/bin
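1.3.2.4. Step 4: Connect to the database
For Sqoop to connect to the MySQL database, the file /opt/software/mysql-connector-java-5.1.46.jar must be placed in Sqoop's lib directory. The version of this jar must match the version of the MySQL database; otherwise Sqoop reports errors when importing data (mysql-connector-java-5.1.46.jar corresponds to MySQL 5.7). If the jar is not in that directory, use the jar that was copied into the home directory in Chapter 6.
- [root@master conf]# cp /opt/software/mysql-connector-java-5.1.46.jar /usr/local/src/sqoop/lib/
1.3.3. Task 3: Start Sqoop
1.3.3.1. Step 1: Start the Hadoop cluster before running Sqoop.
On the master node, switch to the hadoop user and run start-all.sh to start the Hadoop cluster.
- [root@master conf]# su - hadoop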
- Last login: Fri Apr 22 16:21:25 CST 2022 on pts/0
- [hadoop@master ~]$ start-all.sh
- This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
- Starting namenodes on [master]
- master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
- 10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
- 10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
- Starting secondary namenodes [0.0.0.0]
- 0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
- starting yarn daemons
- starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
- 10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
- 10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
1.3.3.2. Step 2: Check the running status of the Hadoop cluster.
- [hadoop@master ~]$ jps
- 1653 SecondaryNameNode
- 2086 Jps
- 1450 NameNode
- 1822 ResourceManager
- [root@slave1 ~]# jps
- 1378 NodeManager
- 1268 DataNode
- 1519 Jps
- [root@slave2 ~]# jps
- 1541 Jps
- 1290 DataNode
- 1405 NodeManager
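1.3.3.3. Step 3: Test whether Sqoop can connect to the MySQL database.
Connect Sqoop to MySQL. The -P option (uppercase P) prompts for the password, which is Password123$ in this environment.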
- [hadoop@master ~]$ sqoop list-databases --connect jdbc:mysql://master:3306 --username root -P
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 15:25:49 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- Enter password:
- 22/04/29 15:25:58 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- Fri Apr 29 15:25:58 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- information_schema
- hive
- mysql
- performance_schema
- sys
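The SSL warning in the output above asks for an explicit useSSL setting, and a later command in this chapter silences it by appending `?useSSL=false` to the JDBC URL. The sketch below builds such a URL in one place so that every Sqoop command uses the same form. `jdbc_url` is a hypothetical helper; turning SSL off is only reasonable on an isolated lab network like this one.

```shell
# Sketch only: build a MySQL JDBC URL with SSL explicitly disabled.
jdbc_url() {
  # $1 = host, $2 = port, $3 = database name (may be empty)
  printf 'jdbc:mysql://%s:%s/%s?useSSL=false\n' "$1" "$2" "$3"
}

url=$(jdbc_url master 3306 sample)
echo "$url"

# Example use (not executed here). Quote the URL, because an unquoted "?"
# is a glob character in the shell:
#   sqoop list-tables --connect "$url" --username root -P
```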
1.3.3.4. Step 4: Connect to Hive
For Sqoop to connect to Hive, hive-common-2.0.0.jar from the Hive component's /usr/local/src/hive/lib directory must also be copied into the lib directory of the Sqoop installation.
- [hadoop@master ~]$ cp /usr/local/src/hive/lib/hive-common-2.0.0.jar /usr/local/src/sqoop/lib/
1.3.4. Task 4: Sqoop template commands
1.3.4.1. Step 1: Create the MySQL database and table.
Create the sample database, create the student table in it, and insert 3 rows into student.
- # Log in to the MySQL database
- [hadoop@master ~]$ mysql -uroot -pPassword123$
- mysql: [Warning] Using a password on the command line interface can be insecure.
- Welcome to the MySQL monitor. Commands end with ; or \g.
- Your MySQL connection id is 6
- Server version: 5.7.18 MySQL Community Server (GPL)
- Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
- Oracle is a registered trademark of Oracle Corporation and/or its
- affiliates. Other names may be trademarks of their respective
- owners.
- Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
- # Create the sample database
- mysql> create database sample;
- Query OK, 1 row affected (0.00 sec)
- # Switch to the sample database
- mysql> use sample;
- Database changed
- # Create the student table with two columns: number (student ID) and name
- mysql> create table student(number char(9) primary key, name varchar(10));
- Query OK, 0 rows affected (0.01 sec)
- # Insert a few rows into the student table
- mysql> insert into student values('01','zhangsan'),('02','lisi'),('03','wangwu');
- Query OK, 3 rows affected (0.01 sec)
- Records: 3 Duplicates: 0 Warnings: 0
- # Query the student table
- mysql> select * from student;
- +--------+----------+
- | number | name     |
- +--------+----------+
- | 01     | zhangsan |
- | 02     | lisi     |
- | 03     | wangwu   |
- +--------+----------+
- 3 rows in set (0.00 sec)
- mysql> quit
- Bye
1.3.4.2. Step 2: Create the sample database and the student table in Hive.
- hive>
- > create database sample;
- OK
- Time taken: 0.528 seconds
- hive> use sample;
- OK
- Time taken: 0.019 seconds
- hive> create table student(number STRING,name STRING);
- OK
- Time taken: 0.2 seconds
- hive> exit;
- [hadoop@master conf]$
1.3.4.3. Step 3: Export data from MySQL and import it into Hive.
- [hadoop@master ~]$ sqoop import --connect jdbc:mysql://master:3306/sample --username root --password Password123$ --table student --fields-terminated-by '|' --delete-target-dir --num-mappers 1 --hive-import --hive-database sample --hive-table student
- hive>
- > select * from sample.student;
- OK
- 01|zhangsan NULL
- 02|lisi NULL
- 03|wangwu NULL
- Time taken: 1.238 seconds, Fetched: 3 row(s)
- hive>
- > exit;
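In the query result above, each row shows the whole text `01|zhangsan` in the first column and NULL in the second. The most likely cause is a delimiter mismatch: the import writes fields separated by `|`, while the Hive table from Step 2 was created without a ROW FORMAT clause and therefore splits on Hive's default field delimiter, Ctrl-A (\001). The sketch below reproduces the effect with awk on one sample line; `count_fields` is a hypothetical helper.

```shell
# Sketch only: count how many fields one line yields under a given delimiter.
count_fields() {
  # $1 = one line of data, $2 = single-character field delimiter
  printf '%s\n' "$1" | awk -F "$2" '{ print NF }'
}

ctrl_a=$(printf '\001')   # Hive's default field delimiter
line='01|zhangsan'        # one row as written by --fields-terminated-by '|'

count_fields "$line" "$ctrl_a"   # 1 field: the whole line lands in the first column
count_fields "$line" '|'         # 2 fields: number and name are separated
```

Two ways to make the sides agree, both worth verifying on your own setup: drop `--fields-terminated-by '|'` from the import so that Sqoop uses the Hive default, or create the Hive table with `ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'`.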
1.3.4.4. Step 4: Common Sqoop commands
- # List all databases
- [hadoop@master ~]$ sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password Password123$
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 16:55:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 16:55:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 16:55:40 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- Fri Apr 29 16:55:40 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- information_schema
- hive
- mysql
- performance_schema
- sample
- sys
- # Connect to MySQL and list the tables in the sample database
- [hadoop@master ~]$ sqoop list-tables --connect "jdbc:mysql://master:3306/sample?useSSL=false" --username root --password Password123$
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 16:56:45 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 16:56:45 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 16:56:45 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- student
- # Copy a relational table's structure into Hive; only the structure is copied, not the table's contents
- [hadoop@master ~]$ sqoop create-hive-table --connect jdbc:mysql://master:3306/sample --table student --username root --password Password123$ --hive-table test
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 16:57:42 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 16:57:42 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 16:57:42 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
- 22/04/29 16:57:42 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
- 22/04/29 16:57:42 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- Fri Apr 29 16:57:42 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 16:57:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- 22/04/29 16:57:43 INFO hive.HiveImport: Loading uploaded data into Hive
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- 22/04/29 16:57:46 INFO hive.HiveImport:
- 22/04/29 16:57:46 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
- 22/04/29 16:57:47 INFO hive.HiveImport: Fri Apr 29 16:57:47 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:47 INFO hive.HiveImport: Fri Apr 29 16:57:47 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:47 INFO hive.HiveImport: Fri Apr 29 16:57:47 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:47 INFO hive.HiveImport: Fri Apr 29 16:57:47 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:48 INFO hive.HiveImport: Fri Apr 29 16:57:48 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:48 INFO hive.HiveImport: Fri Apr 29 16:57:48 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:48 INFO hive.HiveImport: Fri Apr 29 16:57:48 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:48 INFO hive.HiveImport: Fri Apr 29 16:57:48 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:50 INFO hive.HiveImport: OK
- 22/04/29 16:57:50 INFO hive.HiveImport: Time taken: 0.853 seconds
- 22/04/29 16:57:51 INFO hive.HiveImport: Hive import complete.
- # If the output of the command above ends with "hive.HiveImport: Hive import complete.", the import succeeded.
- [hadoop@master ~]$ sqoop import --connect jdbc:mysql://master:3306/sample --username root --password Password123$ --table student --fields-terminated-by '|' --delete-target-dir --num-mappers 1 --hive-import --hive-database default --hive-table test
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 17:00:06 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 17:00:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 17:00:06 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- 22/04/29 17:00:06 INFO tool.CodeGenTool: Beginning code generation
- Fri Apr 29 17:00:06 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:00:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:00:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/src/hadoop
- Note: /tmp/sqoop-hadoop/compile/556af862aa5bc04a542c14f0741f7dc6/student.java uses or overrides a deprecated API.
- Note: Recompile with -Xlint:deprecation for details.
- 22/04/29 17:00:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/556af862aa5bc04a542c14f0741f7dc6/student.jar
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- 22/04/29 17:00:07 INFO tool.ImportTool: Destination directory student is not present, hence not deleting.
- 22/04/29 17:00:07 WARN manager.MySQLManager: It looks like you are importing from mysql.
- 22/04/29 17:00:07 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
- 22/04/29 17:00:07 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
- 22/04/29 17:00:07 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
- 22/04/29 17:00:07 INFO mapreduce.ImportJobBase: Beginning import of student
- 22/04/29 17:00:07 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
- 22/04/29 17:00:07 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
- 22/04/29 17:00:07 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
- Fri Apr 29 17:00:09 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:09 INFO db.DBInputFormat: Using read commited transaction isolation
- 22/04/29 17:00:09 INFO mapreduce.JobSubmitter: number of splits:1
- 22/04/29 17:00:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1651221174197_0003
- 22/04/29 17:00:09 INFO impl.YarnClientImpl: Submitted application application_1651221174197_0003
- 22/04/29 17:00:09 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1651221174197_0003/
- 22/04/29 17:00:09 INFO mapreduce.Job: Running job: job_1651221174197_0003
- 22/04/29 17:00:13 INFO mapreduce.Job: Job job_1651221174197_0003 running in uber mode : false
- 22/04/29 17:00:13 INFO mapreduce.Job: map 0% reduce 0%
- 22/04/29 17:00:17 INFO mapreduce.Job: map 100% reduce 0%
- 22/04/29 17:00:17 INFO mapreduce.Job: Job job_1651221174197_0003 completed successfully
- 22/04/29 17:00:17 INFO mapreduce.Job: Counters: 30
- File System Counters
- FILE: Number of bytes read=0
- FILE: Number of bytes written=134261
- FILE: Number of read operations=0
- FILE: Number of large read operations=0
- FILE: Number of write operations=0
- HDFS: Number of bytes read=87
- HDFS: Number of bytes written=30
- HDFS: Number of read operations=4
- HDFS: Number of large read operations=0
- HDFS: Number of write operations=2
- Job Counters
- Launched map tasks=1
- Other local map tasks=1
- Total time spent by all maps in occupied slots (ms)=1731
- Total time spent by all reduces in occupied slots (ms)=0
- Total time spent by all map tasks (ms)=1731
- Total vcore-seconds taken by all map tasks=1731
- Total megabyte-seconds taken by all map tasks=1772544
- Map-Reduce Framework
- Map input records=3
- Map output records=3
- Input split bytes=87
- Spilled Records=0
- Failed Shuffles=0
- Merged Map outputs=0
- GC time elapsed (ms)=35
- CPU time spent (ms)=1010
- Physical memory (bytes) snapshot=179433472
- Virtual memory (bytes) snapshot=2137202688
- Total committed heap usage (bytes)=88604672
- File Input Format Counters
- Bytes Read=0
- File Output Format Counters
- Bytes Written=30
- 22/04/29 17:00:17 INFO mapreduce.ImportJobBase: Transferred 30 bytes in 9.8777 seconds (3.0371 bytes/sec)
- 22/04/29 17:00:17 INFO mapreduce.ImportJobBase: Retrieved 3 records.
- 22/04/29 17:00:17 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners for table student
- Fri Apr 29 17:00:17 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:00:17 INFO hive.HiveImport: Loading uploaded data into Hive
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- 22/04/29 17:00:20 INFO hive.HiveImport:
- 22/04/29 17:00:20 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
- 22/04/29 17:00:21 INFO hive.HiveImport: Fri Apr 29 17:00:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:21 INFO hive.HiveImport: Fri Apr 29 17:00:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:21 INFO hive.HiveImport: Fri Apr 29 17:00:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:21 INFO hive.HiveImport: Fri Apr 29 17:00:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:23 INFO hive.HiveImport: Fri Apr 29 17:00:23 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:23 INFO hive.HiveImport: Fri Apr 29 17:00:23 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:23 INFO hive.HiveImport: Fri Apr 29 17:00:23 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:23 INFO hive.HiveImport: Fri Apr 29 17:00:23 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:24 INFO hive.HiveImport: OK
- 22/04/29 17:00:24 INFO hive.HiveImport: Time taken: 0.713 seconds
- 22/04/29 17:00:24 INFO hive.HiveImport: Loading data to table default.test
- 22/04/29 17:00:25 INFO hive.HiveImport: OK
- 22/04/29 17:00:25 INFO hive.HiveImport: Time taken: 0.42 seconds
- 22/04/29 17:00:25 INFO hive.HiveImport: Hive import complete.
- 22/04/29 17:00:25 INFO hive.HiveImport: Export directory is contains the _SUCCESS file only, removing the directory.
- hive> show tables;
- OK
- test
- Time taken: 0.558 seconds, Fetched: 1 row(s)
- hive> exit;
- # Copy the table contents out of MySQL into files on HDFS (a Sqoop import)
- [hadoop@master ~]$ sqoop import --connect jdbc:mysql://master:3306/sample --username root --password Password123$ --table student --num-mappers 1 --target-dir /user/test
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 17:03:13 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 17:03:13 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 17:03:13 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- 22/04/29 17:03:13 INFO tool.CodeGenTool: Beginning code generation
- Fri Apr 29 17:03:14 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:03:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:03:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:03:14 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/src/hadoop
- Note: /tmp/sqoop-hadoop/compile/eab748b8f3fb956072f4877fdf4bf23a/student.java uses or overrides a deprecated API.
- Note: Recompile with -Xlint:deprecation for details.
- 22/04/29 17:03:15 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/eab748b8f3fb956072f4877fdf4bf23a/student.jar
- 22/04/29 17:03:15 WARN manager.MySQLManager: It looks like you are importing from mysql.
- 22/04/29 17:03:15 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
- 22/04/29 17:03:15 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
- 22/04/29 17:03:15 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
- 22/04/29 17:03:15 INFO mapreduce.ImportJobBase: Beginning import of student
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- 22/04/29 17:03:15 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
- 22/04/29 17:03:15 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
- 22/04/29 17:03:15 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
- Fri Apr 29 17:03:17 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:03:17 INFO db.DBInputFormat: Using read commited transaction isolation
- 22/04/29 17:03:17 INFO mapreduce.JobSubmitter: number of splits:1
- 22/04/29 17:03:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1651221174197_0004
- 22/04/29 17:03:17 INFO impl.YarnClientImpl: Submitted application application_1651221174197_0004
- 22/04/29 17:03:17 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1651221174197_0004/
- 22/04/29 17:03:17 INFO mapreduce.Job: Running job: job_1651221174197_0004
- 22/04/29 17:03:21 INFO mapreduce.Job: Job job_1651221174197_0004 running in uber mode : false
- 22/04/29 17:03:21 INFO mapreduce.Job: map 0% reduce 0%
- 22/04/29 17:03:25 INFO mapreduce.Job: map 100% reduce 0%
- 22/04/29 17:03:25 INFO mapreduce.Job: Job job_1651221174197_0004 completed successfully
- 22/04/29 17:03:25 INFO mapreduce.Job: Counters: 30
- File System Counters
- FILE: Number of bytes read=0
- FILE: Number of bytes written=134251
- FILE: Number of read operations=0
- FILE: Number of large read operations=0
- FILE: Number of write operations=0
- HDFS: Number of bytes read=87
- HDFS: Number of bytes written=30
- HDFS: Number of read operations=4
- HDFS: Number of large read operations=0
- HDFS: Number of write operations=2
- Job Counters
- Launched map tasks=1
- Other local map tasks=1
- Total time spent by all maps in occupied slots (ms)=1945
- Total time spent by all reduces in occupied slots (ms)=0
- Total time spent by all map tasks (ms)=1945
- Total vcore-seconds taken by all map tasks=1945
- Total megabyte-seconds taken by all map tasks=1991680
- Map-Reduce Framework
- Map input records=3
- Map output records=3
- Input split bytes=87
- Spilled Records=0
- Failed Shuffles=0
- Merged Map outputs=0
- GC time elapsed (ms)=69
- CPU time spent (ms)=1050
- Physical memory (bytes) snapshot=179068928
- Virtual memory (bytes) snapshot=2136522752
- Total committed heap usage (bytes)=88604672
- File Input Format Counters
- Bytes Read=0
- File Output Format Counters
- Bytes Written=30
- 22/04/29 17:03:25 INFO mapreduce.ImportJobBase: Transferred 30 bytes in 10.2361 seconds (2.9308 bytes/sec)
- 22/04/29 17:03:25 INFO mapreduce.ImportJobBase: Retrieved 3 records.
- # After running the command above, open master_ip:50070 in a browser and click "Browse the file system" under Utilities; if the user directory is listed, the import succeeded.
- [hadoop@master ~]$ hdfs dfs -ls /user/test
- Found 2 items
- -rw-r--r-- 2 hadoop supergroup 0 2022-04-29 17:03 /user/test/_SUCCESS
- -rw-r--r-- 2 hadoop supergroup 30 2022-04-29 17:03 /user/test/part-m-00000
- [hadoop@master ~]$ hdfs dfs -cat /user/test/part-m-00000
- 01,zhangsan
- 02,lisi
- 03,wangwu
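As a quick cross-check, the number of lines in the imported file should equal the record count that Sqoop reported ("Retrieved 3 records."). The sketch below assumes one record per line; records_match is a hypothetical helper name, not a Sqoop or HDFS command.

```shell
# Hypothetical helper: succeed when the number of lines on stdin
# equals the expected record count given as the first argument.
records_match() {
  expected="$1"
  actual="$(wc -l | tr -d ' ')"
  [ "$actual" -eq "$expected" ]
}

# Only runs where the hdfs client is installed; the path is the
# part file listed by "hdfs dfs -ls /user/test" above.
if command -v hdfs >/dev/null 2>&1; then
  if hdfs dfs -cat /user/test/part-m-00000 | records_match 3; then
    echo "record count matches"
  else
    echo "record count differs"
  fi
fi
```

Comparing line counts only confirms that no rows were dropped; it does not validate the field contents.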
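Chapter 10 Flume Component Installation and Configuration
Experiment 1: Flume Component Installation and Configuration
1.1. Objectives
After completing this experiment, you should be able to:
- Download and extract Flume
- Deploy the Flume component
- Send and receive messages with Flume
1.2. Requirements
- Understand the basics of Flume
- Be familiar with Flume's features and typical uses
- Be familiar with Flume component configuration
1.3. Procedure
1.3.1. Task 1: Download and extract Flume
As the root user, extract the Flume package to /usr/local/src and rename the extracted folder to flume.
- [root@master ~]# tar xf /opt/software/apache-flume-1.6.0-bin.tar.gz -C /usr/local/src/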
- [root@master ~]# cd /usr/local/src/
- [root@master src]# mv apache-flume-1.6.0-bin/ flume
- [root@master src]# chown -R hadoop.hadoop /usr/local/src/
1.3.2. Task 2: Deploy the Flume component
1.3.2.1. Step 1: As the root user, set the Flume environment variables so that they take effect for all users.
- [root@master src]# vim /etc/profile.d/flume.sh
- export FLUME_HOME=/usr/local/src/flume
- export PATH=${FLUME_HOME}/bin:$PATH
1.3.2.2. Step 2: Modify the relevant Flume configuration files.
First, switch to the hadoop user and change the working directory to Flume's configuration folder.
- [hadoop@master ~]$ echo $PATH
- /usr/local/src/hbase/bin:/usr/local/src/zookeeper/bin:/usr/local/src/sqoop/bin:/usr/local/src/hive/bin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/flume/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/src/hive/bin:/home/hadoop/.local/bin:/home/hadoop/bin
1.3.2.3. Step 3: Modify and configure the flume-env.sh file.
- [hadoop@master ~]$ vim /usr/local/src/hbase/conf/hbase-env.sh
- #export HBASE_CLASSPATH=/usr/local/src/hadoop/etc/hadoop/ # comment out this line
- export JAVA_HOME=/usr/local/src/jdk
- [hadoop@master conf]$ start-all.sh
- [hadoop@master ~]$ flume-ng version
- Flume 1.6.0
- Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
- Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
- Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
- From source with checksum b29e416802ce9ece3269d34233baf43f
1.3.3. Task 3: Send and receive messages with Flume
Use Flume to transfer data from the web server into HDFS.
1.3.3.1. Step 1: Create the simple-hdfs-flume.conf file in the Flume installation directory.
- [hadoop@master ~]$ cd /usr/local/src/flume/
- [hadoop@master flume]$ vi /usr/local/src/flume/simple-hdfs-flume.conf
- a1.sources=r1
- a1.sinks=k1
- a1.channels=c1
- a1.sources.r1.type=spooldir
- a1.sources.r1.spoolDir=/usr/local/src/hadoop/logs/
- a1.sources.r1.fileHeader=true
- a1.sinks.k1.type=hdfs
- a1.sinks.k1.hdfs.path=hdfs://master:9000/tmp/flume
- a1.sinks.k1.hdfs.rollSize=1048760
- a1.sinks.k1.hdfs.rollCount=0
- a1.sinks.k1.hdfs.rollInterval=900
- a1.sinks.k1.hdfs.useLocalTimeStamp=true
- a1.channels.c1.type=file
- a1.channels.c1.capacity=1000
- a1.channels.c1.transactionCapacity=100
- a1.sources.r1.channels = c1
- a1.sinks.k1.channel = c1
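The configuration above binds source r1 and sink k1 to channel c1. Before starting the agent, a rough text check can confirm that every channel named by a source or sink is also declared in the agent's channels list. This is a minimal sketch: check_flume_channels is a hypothetical helper, not a Flume tool, and it only looks at the channel bindings, not at any other property.

```shell
# Hypothetical helper: check_flume_channels <conf-file> <agent-name>
# Exits non-zero if a source or sink refers to a channel that is not
# listed in "<agent>.channels".
check_flume_channels() {
  awk -v agent="$2" '
    /^[ \t]*(#|$)/ { next }                      # skip comments and blank lines
    {
      eq = index($0, "=")                        # split at the first "="
      if (eq == 0) next
      key = substr($0, 1, eq - 1)
      val = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t\r]+$/, "", key)         # trim both sides
      gsub(/^[ \t]+|[ \t\r]+$/, "", val)
      is_decl = (key == agent ".channels")
      is_src  = (index(key, agent ".sources.") == 1 && key ~ /\.channels$/)
      is_sink = (index(key, agent ".sinks.") == 1 && key ~ /\.channel$/)
      if (!is_decl && !is_src && !is_sink) next
      n = split(val, names, /[ \t]+/)            # lists are space-separated
      for (i = 1; i <= n; i++) {
        if (is_decl) declared[names[i]] = 1
        else used[names[i]] = key
      }
    }
    END {
      bad = 0
      for (c in used) {
        if (!(c in declared)) { print "undeclared channel " c " (" used[c] ")"; bad = 1 }
      }
      exit bad
    }
  ' "$1"
}

# Demo on a temporary copy of the binding lines from the configuration.
conf_file="$(mktemp)"
cat > "$conf_file" <<'EOF'
a1.sources=r1
a1.sinks=k1
a1.channels=c1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
check_flume_channels "$conf_file" a1 && echo "channel bindings look consistent"
rm -f "$conf_file"
```

On the cluster, the same function could be pointed at /usr/local/src/flume/simple-hdfs-flume.conf with agent name a1.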
1.3.3.2. Step 2: Use the flume-ng agent command to load the simple-hdfs-flume.conf configuration and start Flume to transfer data.
- [hadoop@master flume]$ flume-ng agent --conf-file simple-hdfs-flume.conf --name a1
Press Ctrl+C to stop the Flume transfer.
1.3.3.3. Step 3: Check the files that Flume transferred to HDFS. If data files appear under /tmp/flume on HDFS, the transfer succeeded.
- [hadoop@master ~]$ hdfs dfs -ls /
- Found 5 items
- drwxr-xr-x - hadoop supergroup 0 2022-04-15 22:04 /hbase
- drwxr-xr-x - hadoop supergroup 0 2022-04-02 18:24 /input
- drwxr-xr-x - hadoop supergroup 0 2022-04-02 18:26 /output
- drwxr-xr-x - hadoop supergroup 0 2022-05-06 17:24 /tmp
- drwxr-xr-x - hadoop supergroup 0 2022-04-29 17:03 /user
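Chapter 13 Big Data Platform Monitoring Commands
Experiment 1: Monitoring the Running State of the Big Data Platform with Commands
1.1. Objectives
After completing this experiment, you should be able to:
- Understand the running state of the big data platform
- Use the commands that show the running state of the big data platform
1.2. Requirements
- Be familiar with the ways to view the platform's running state
- Understand the commands for viewing the platform's running state
1.3. Procedure
1.3.1. Task 1: View the platform state with commands
1.3.1.1. Step 1: View Linux system information (uname -a)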
- [root@master ~]# uname -a
- Linux master 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
1.3.1.2. Step 2: View disk information
(1) List all partitions (fdisk -l)
- [root@master ~]# fdisk -l
- Disk /dev/sda: 21.5 GB, 21474836480 bytes, 41943040 sectors
- Units = sectors of 1 * 512 = 512 bytes
- Sector size (logical/physical): 512 bytes / 512 bytes
- I/O size (minimum/optimal): 512 bytes / 512 bytes
- Disk label type: dos
- Disk identifier: 0x00096169
- Device Boot Start End Blocks Id System
- /dev/sda1 * 2048 2099199 1048576 83 Linux
- /dev/sda2 2099200 41943039 19921920 8e Linux LVM
- Disk /dev/mapper/centos-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
- Units = sectors of 1 * 512 = 512 bytes
- Sector size (logical/physical): 512 bytes / 512 bytes
- I/O size (minimum/optimal): 512 bytes / 512 bytes
- Disk /dev/mapper/centos-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
- Units = sectors of 1 * 512 = 512 bytes
- Sector size (logical/physical): 512 bytes / 512 bytes
- I/O size (minimum/optimal): 512 bytes / 512 bytes
(2) List all swap areas (swapon -s)
- [root@master ~]# swapon -s
- Filename Type Size Used Priority
- /dev/dm-1 partition 2097148 0 -
(3) View filesystem usage (df -h)
- [root@master ~]# df -h
- Filesystem Size Used Avail Use% Mounted on
- /dev/mapper/centos-root 17G 4.8G 13G 28% /
- devtmpfs 980M 0 980M 0% /dev
- tmpfs 992M 0 992M 0% /dev/shm
- tmpfs 992M 9.5M 982M 1% /run
- tmpfs 992M 0 992M 0% /sys/fs/cgroup
- /dev/sda1 1014M 130M 885M 13% /boot
- tmpfs 199M 0 199M 0% /run/user/0
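For routine monitoring it can help to print only the filesystems that are filling up instead of reading the whole table. The sketch below filters df output by the Use% column; over_threshold is a hypothetical helper name and the 80 percent limit is an arbitrary assumption, not a value from this manual. It also assumes mount points contain no spaces.

```shell
# Hypothetical helper: read df-style output on stdin and print the
# mount point and usage of every filesystem at or above the limit.
over_threshold() {
  awk -v limit="$1" '
    NR > 1 {                                   # skip the header row
      use = $5
      sub(/%$/, "", use)                       # "28%" -> "28"
      if (use + 0 >= limit + 0) print $6 " " use "%"
    }'
}

# -P selects the POSIX output format: one line per filesystem,
# usage in column 5 and mount point in column 6.
df -P | over_threshold 80
```

With the df -h output shown above, a limit of 20 would report only the root filesystem at 28%.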
1.3.1.3. Step 3: View the network IP address (ifconfig)
- [root@master ~]# ifconfig
- ens32: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
- inet 10.10.10.128 netmask 255.255.255.0 broadcast 10.10.10.255
- inet6 fe80::af34:1702:3972:2b64 prefixlen 64 scopeid 0x20<link>
- ether 00:0c:29:2e:33:83 txqueuelen 1000 (Ethernet)
- RX packets 342 bytes 29820 (29.1 KiB)
- RX errors 0 dropped 0 overruns 0 frame 0
- TX packets 257 bytes 26394 (25.7 KiB)
- TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
- lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
- inet 127.0.0.1 netmask 255.0.0.0
- inet6 ::1 prefixlen 128 scopeid 0x10<host>
- loop txqueuelen 1000 (Local Loopback)
- RX packets 4 bytes 360 (360.0 B)
- RX errors 0 dropped 0 overruns 0 frame 0
- TX packets 4 bytes 360 (360.0 B)
- TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
1.3.1.4. Step 4: View all listening ports (netstat -lntp)
- [root@master ~]# netstat -lntp
- Active Internet connections (only servers)
- Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
- tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 933/sshd
- tcp6 0 0 :::3306 :::* LISTEN 1021/mysqld
- tcp6 0 0 :::22 :::* LISTEN 933/sshd
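The listing above can also be checked from a script, for example to confirm that MySQL is accepting connections on port 3306. This is a minimal sketch that parses netstat-style text; port_is_listening is a hypothetical helper name, and it assumes the column layout shown above (local address in column 4, state in column 6).

```shell
# Hypothetical helper: succeed if the given TCP port appears in LISTEN
# state in netstat -lnt style output read from stdin.
port_is_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" {
      n = split($4, parts, ":")      # handles 0.0.0.0:22 and :::22 alike
      if (parts[n] == port) found = 1
    }
    END { exit found ? 0 : 1 }'
}

# Only runs where netstat (net-tools) is installed.
if command -v netstat >/dev/null 2>&1; then
  if netstat -lnt | port_is_listening 3306; then
    echo "port 3306 is listening"
  else
    echo "port 3306 is not listening"
  fi
fi
```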
1.3.1.5. Step 5: View all established connections (netstat -antp)
- [root@master ~]# netstat -antp
- Active Internet connections (servers and established)
- Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
- tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 933/sshd
- tcp 0 52 10.10.10.128:22 10.10.10.1:59963 ESTABLISHED 1249/sshd: root@pts
- tcp6 0 0 :::3306 :::* LISTEN 1021/mysqld
- tcp6 0 0 :::22 :::* LISTEN 933/sshd
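When the connection list grows long, a per-state summary is easier to read than the raw table. The sketch below counts TCP entries by the State column; count_by_state is a hypothetical helper name, and the column position (state in column 6) is taken from the output above.

```shell
# Hypothetical helper: count TCP sockets per state from netstat -ant
# style output on stdin, printed as "STATE COUNT", sorted by state.
count_by_state() {
  awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print s, n[s] }' | sort
}

# Only runs where netstat (net-tools) is installed.
if command -v netstat >/dev/null 2>&1; then
  netstat -ant | count_by_state
fi
```

For the four entries listed above this yields one ESTABLISHED and three LISTEN sockets.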
1.3.1.6. Step 6: Display process status in real time (top). This command shows each process's share of CPU and memory, among other details.
- [root@master ~]# top
- top - 16:09:46 up 47 min, 2 users, load average: 0.00, 0.01, 0.05
- Tasks: 115 total, 1 running, 114 sleeping, 0 stopped, 0 zombie
- %Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- KiB Mem : 2030172 total, 1575444 free, 281296 used, 173432 buff/cache
- KiB Swap: 2097148 total, 2097148 free, 0 used. 1571928 avail Mem
- PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
- 1021 mysql 20 0 1258940 191544 6840 S 0.3 9.4 0:01.71 mysqld
- 1 root 20 0 125456 3896 2560 S 0.0 0.2 0:00.96 systemd
- 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
- 3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
- 5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
- 7 root rt 0 0 0 0 S 0.0 0.0 0:00.02 migration/0
- 8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
- 9 root 20 0 0 0 0 S 0.0 0.0 0:00.15 rcu_sched
- 10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain
- 11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
- 12 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
- 13 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
- 14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/1
- 16 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
- 17 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/2
- 18 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/2
- 19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/2
1.3.1.7. Step 7: View CPU information (cat /proc/cpuinfo)
1.3.1.8. Step 8: View memory information (cat /proc/meminfo). This command shows total memory, free memory, and related figures.
- [root@master ~]# cat /proc/meminfo
- MemTotal: 2030172 kB
- MemFree: 1575448 kB
- MemAvailable: 1571932 kB
- Buffers: 2112 kB
- Cached: 126676 kB
- SwapCached: 0 kB
- Active: 251708 kB
- Inactive: 100540 kB
- Active(anon): 223876 kB
- Inactive(anon): 9252 kB
- Active(file): 27832 kB
- Inactive(file): 91288 kB
- Unevictable: 0 kB
- Mlocked: 0 kB
- SwapTotal: 2097148 kB
- SwapFree: 2097148 kB
- Dirty: 0 kB
- Writeback: 0 kB
- AnonPages: 223648 kB
- Mapped: 28876 kB
- Shmem: 9668 kB
- Slab: 44644 kB
- SReclaimable: 18208 kB
- SUnreclaim: 26436 kB
- KernelStack: 4512 kB
- PageTables: 4056 kB
- NFS_Unstable: 0 kB
- Bounce: 0 kB
- WritebackTmp: 0 kB
- CommitLimit: 3112232 kB
- Committed_AS: 782724 kB
- VmallocTotal: 34359738367 kB
- VmallocUsed: 180220 kB
- VmallocChunk: 34359310332 kB
- HardwareCorrupted: 0 kB
- AnonHugePages: 178176 kB
- CmaTotal: 0 kB
- CmaFree: 0 kB
- HugePages_Total: 0
- HugePages_Free: 0
- HugePages_Rsvd: 0
- HugePages_Surp: 0
- Hugepagesize: 2048 kB
- DirectMap4k: 63360 kB
- DirectMap2M: 2033664 kB
- DirectMap1G: 0 kB
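To turn the raw counters above into a single figure, the share of memory in use can be derived from MemTotal and MemAvailable. This is an illustrative sketch only; the function name mem_used_percent is hypothetical, and it assumes a kernel that reports MemAvailable, as the output above does.

```shell
# mem_used_percent [FILE]
# Prints the percentage of memory that is not available, computed as
# (MemTotal - MemAvailable) / MemTotal, from FILE (default: /proc/meminfo).
# Hypothetical helper; assumes the MemAvailable field is present.
mem_used_percent() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%.1f\n", (total - avail) * 100 / total
            else exit 1
        }
    ' "${1:-/proc/meminfo}"
}
```

With the values listed above (MemTotal 2030172 kB, MemAvailable 1571932 kB) the result is about 22.6 percent.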
1.3.2. Task 2: Check Hadoop status via commands
1.3.2.1. Step 1: Switch to the hadoop user
If the current user is root, switch to the hadoop user before continuing.
- [root@master ~]# su - hadoop
- Last login: Tue May 10 14:33:03 CST 2022 on pts/0
- [hadoop@master ~]$
1.3.2.2. Step 2: Change to the Hadoop installation directory
- [hadoop@master ~]$ cd /usr/local/src/hadoop/
- [hadoop@master hadoop]$
1.3.2.3. Step 3: Start Hadoop
- [hadoop@master hadoop]$ start-all.sh
- This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
- Starting namenodes on [master]
- master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
- 10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
- 10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
- Starting secondary namenodes [0.0.0.0]
- 0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
- starting yarn daemons
- starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
- 10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
- 10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
- [hadoop@master hadoop]$ jps
- 1697 SecondaryNameNode
- 2115 Jps
- 1865 ResourceManager
- 1498 NameNode
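Reading the jps listing by eye works for one node, but the same check can be scripted. The sketch below is an illustration rather than part of the lab; missing_daemons is a hypothetical helper name, and the list of expected daemons is whatever the caller passes in.

```shell
# missing_daemons NAME...
# Reads `jps` output on stdin and prints each expected daemon name that
# does not appear in it, one per line. Prints nothing if all are present.
# Hypothetical helper for illustration.
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # Compare the second column exactly, so that "NameNode" is not
        # satisfied by "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit ok ? 0 : 1 }'
        then
            printf '%s\n' "$daemon"
        fi
    done
}
```

On the Master node in this lab, jps | missing_daemons NameNode SecondaryNameNode ResourceManager should print nothing once start-all.sh has finished.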
1.3.2.4. Step 4: Stop Hadoop
- [hadoop@master hadoop]$ stop-all.sh
- This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
- Stopping namenodes on [master]
- master: stopping namenode
- 10.10.10.130: stopping datanode
- 10.10.10.129: stopping datanode
- Stopping secondary namenodes [0.0.0.0]
- 0.0.0.0: stopping secondarynamenode
- stopping yarn daemons
- stopping resourcemanager
- 10.10.10.129: stopping nodemanager
- 10.10.10.130: stopping nodemanager
- no proxyserver to stop
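Lab 2: Monitoring big data platform resource status via commands
2.1. Lab objectives
After completing this lab, you should be able to:
- Understand the running state of big data platform resources
- Use the commands that report the running state of big data platform resources
2.2. Lab requirements
- Be familiar with the ways to check the running state of big data platform resources
- Understand the commands for checking the running state of big data platform resources
2.3. Lab procedure
2.3.1. Task 1: Check YARN status via commands
2.3.1.1. Step 1: Change to the directory /usr/local/src/hadoop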
- [hadoop@master ~]$ cd /usr/local/src/hadoop/
- [hadoop@master hadoop]$
2.3.1.2. Step 2: Return to the Master host and run start-all.sh
- [hadoop@master ~]$ start-all.sh
- This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
- Starting namenodes on [master]
- master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
- 10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
- 10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
- Starting secondary namenodes [0.0.0.0]
- 0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
- starting yarn daemons
- starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
- 10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
- 10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
- [hadoop@master ~]$
- # Start ZooKeeper on the master node
- [hadoop@master hadoop]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- # Start ZooKeeper on the slave1 node
- [hadoop@slave1 hadoop]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- # Start ZooKeeper on the slave2 node
- [hadoop@slave2 hadoop]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
2.3.1.3. Step 3: Run the jps command. If the NodeManager and ResourceManager processes appear on Master, YARN has started.
- [hadoop@master hadoop]$ jps
- 2817 NameNode
- 3681 ResourceManager
- 3477 NodeManager
- 3909 Jps
- 2990 SecondaryNameNode
2.3.2. Task 2: Check HDFS status via commands
2.3.2.1. Step 1: Directory operations
Change to the hadoop directory by running cd /usr/local/src/hadoop:
- [hadoop@master ~]$ cd /usr/local/src/hadoop
- [hadoop@master hadoop]$
List the HDFS root directory:
- [hadoop@master hadoop]$ ./bin/hdfs dfs -ls /
2.3.2.2. Step 2: View the HDFS report by running bin/hdfs dfsadmin -report
- [hadoop@master hadoop]$ bin/hdfs dfsadmin -report
- Configured Capacity: 36477861888 (33.97 GB)
- Present Capacity: 31767752704 (29.59 GB)
- DFS Remaining: 31767146496 (29.59 GB)
- DFS Used: 606208 (592 KB)
- DFS Used%: 0.00%
- Under replicated blocks: 0
- Blocks with corrupt replicas: 0
- Missing blocks: 0
- Missing blocks (with replication factor 1): 0
- -------------------------------------------------
- Live datanodes (2):
- Name: 10.10.10.129:50010 (node1)
- Hostname: node1
- Decommission Status : Normal
- Configured Capacity: 18238930944 (16.99 GB)
- DFS Used: 303104 (296 KB)
- Non DFS Used: 2379792384 (2.22 GB)
- DFS Remaining: 15858835456 (14.77 GB)
- DFS Used%: 0.00%
- DFS Remaining%: 86.95%
- Configured Cache Capacity: 0 (0 B)
- Cache Used: 0 (0 B)
- Cache Remaining: 0 (0 B)
- Cache Used%: 100.00%
- Cache Remaining%: 0.00%
- Xceivers: 1
- Last contact: Fri May 20 18:31:48 CST 2022
- Name: 10.10.10.130:50010 (node2)
- Hostname: node2
- Decommission Status : Normal
- Configured Capacity: 18238930944 (16.99 GB)
- DFS Used: 303104 (296 KB)
- Non DFS Used: 2330316800 (2.17 GB)
- DFS Remaining: 15908311040 (14.82 GB)
- DFS Used%: 0.00%
- DFS Remaining%: 87.22%
- Configured Cache Capacity: 0 (0 B)
- Cache Used: 0 (0 B)
- Cache Remaining: 0 (0 B)
- Cache Used%: 100.00%
- Cache Remaining%: 0.00%
- Xceivers: 1
- Last contact: Fri May 20 18:31:48 CST 2022
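A quick health check is to compare the number of live DataNodes in the report against the number the cluster is supposed to have. The sketch below is illustrative; live_datanodes is a hypothetical helper name, and it assumes the "Live datanodes (N):" line format printed by the report above.

```shell
# live_datanodes
# Reads `hdfs dfsadmin -report` output on stdin and prints the number
# from the "Live datanodes (N):" line. Prints nothing if the line is absent.
# Hypothetical helper; assumes the report format shown in this lab.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}
```

In this lab, bin/hdfs dfsadmin -report | live_datanodes should print 2, one for each slave node.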
2.3.2.3. Step 3: Check HDFS space usage by running hdfs dfs -df
- [hadoop@master hadoop]$ hdfs dfs -df
- Filesystem Size Used Available Use%
- hdfs://master:9000 36477861888 606208 31767146496 0%
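The Use% column reflects only the space consumed by HDFS itself. Space taken by non-DFS files on the DataNode disks shows up as the gap between Size and Available, so it can be useful to compute the available share directly. A minimal sketch, assuming the four-column layout shown above; the function name dfs_available_percent is hypothetical.

```shell
# dfs_available_percent
# Reads `hdfs dfs -df` output on stdin (header line first) and prints,
# for each file system line, Available as a percentage of Size.
# Hypothetical helper; assumes the column order Filesystem Size Used Available.
dfs_available_percent() {
    awk 'NR > 1 && $2 > 0 { printf "%.1f\n", $4 * 100 / $2 }'
}
```

For the figures above this gives about 87.1 percent, even though Use% reads 0%; the difference is the non-DFS usage listed per DataNode in the dfsadmin report.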
2.3.3. Task 3: Check HBase status via commands
2.3.3.1. Step 1: Start and run HBase
Change to the HBase installation directory /usr/local/src/hbase with the following command:
- [hadoop@master hadoop]$ cd /usr/local/src/hbase
- [hadoop@master hbase]$ hbase version
- HBase 1.2.1
- Source code repository git://asf-dev/home/busbey/projects/hbase revision=8d8a7107dc4ccbf36a92f64675dc60392f85c015
- Compiled by busbey on Wed Mar 30 11:19:21 CDT 2016
- From source with checksum f4bb4a14bb4e0b72b46f729dae98a772
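The output shows HBase 1.2.1, which confirms that HBase is installed and that its version is 1.2.1. (The version command by itself does not show whether the HBase daemons are running.)
2.3.3.2. Step 2: View HBase version information
Run hbase shell to enter the interactive HBase shell.
- [hadoop@master hbase]$ hbase shell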
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- HBase Shell; enter 'help<RETURN>' for list of supported commands.
- Type "exit<RETURN>" to leave the HBase Shell
- Version 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016
Enter version to query the HBase version:
- hbase(main):001:0> version
- 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016
The result shows that the HBase version is 1.2.1.
2.3.3.3. Step 3: Query HBase status by running the status command in the HBase shell
- hbase(main):002:0> status
- 1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
For a "simple" status listing, run status 'simple':
- hbase(main):003:0> status 'simple'
- active master: master:16000 1589125905790
- 0 backup masters
- 3 live servers
- master:16020 1589125908065
- requestsPerSecond=0.0, numberOfOnlineRegions=1,
- usedHeapMB=28, maxHeapMB=1918, numberOfStores=1,
- numberOfStorefiles=1, storefileUncompressedSizeMB=0,
- storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0,
- readRequestsCount=5, writeRequestsCount=1, rootIndexSizeKB=0,
- totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0,
- totalCompactingKVs=0, currentCompactedKVs=0,
- compactionProgressPct=NaN, coprocessors=[MultiRowMutationEndpoint]
- slave1:16020 1589125915820
- requestsPerSecond=0.0, numberOfOnlineRegions=0,
- usedHeapMB=17, maxHeapMB=440, numberOfStores=0,
- numberOfStorefiles=0, storefileUncompressedSizeMB=0,
- storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0,
- readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0,
- totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0,
- totalCompactingKVs=0, currentCompactedKVs=0,
- compactionProgressPct=NaN, coprocessors=[]
- slave2:16020 1589125917741
- requestsPerSecond=0.0, numberOfOnlineRegions=1,
- usedHeapMB=15, maxHeapMB=440, numberOfStores=1,
- numberOfStorefiles=1, storefileUncompressedSizeMB=0,
- storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0,
- readRequestsCount=4, writeRequestsCount=0, rootIndexSizeKB=0,
- totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0,
- totalCompactingKVs=0, currentCompactedKVs=0,
- compactionProgressPct=NaN, coprocessors=[]
- 0 dead servers
- Aggregate load: 0, regions: 2
The output gives more detail about the Master, Slave1, and Slave2 hosts, such as service ports and request statistics.
To see the other ways of querying HBase status, run help 'status':
- hbase(main):004:0> help 'status'
- Show cluster status. Can be 'summary', 'simple', 'detailed', or 'replication'. The
- default is 'summary'. Examples:
- hbase> status
- hbase> status 'simple'
- hbase> status 'summary'
- hbase> status 'detailed'
- hbase> status 'replication'
- hbase> status 'replication', 'source'
- hbase> status 'replication', 'sink'
The output lists every form of the status command.
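The one-line summary printed by status is easy to check from a script, for example to alert when any region server is reported dead. The sketch below is illustrative only: dead_servers is a hypothetical helper name, it assumes the summary wording shown in this lab, and feeding commands to the shell with hbase shell -n (non-interactive mode) is an assumption about the installed HBase version.

```shell
# dead_servers
# Reads the HBase `status` summary line on stdin, for example
#   1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
# and prints the number of dead region servers.
# Hypothetical helper; assumes the summary format shown in this lab.
dead_servers() {
    sed -n 's/.*, \([0-9][0-9]*\) dead,.*/\1/p'
}

# Possible usage (assumes `hbase shell -n` is available in this version):
#   echo "status" | hbase shell -n 2>/dev/null | dead_servers
```

A result other than 0 would be a reason to look at status 'detailed' for the affected servers.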
2.3.3.4. Step 4: Stop the HBase service
To stop the HBase service, run stop-hbase.sh.
- [hadoop@master hbase]$ stop-hbase.sh
- stopping hbasecat.........
2.3.4. Task 4: Check Hive status via commands
2.3.4.1. Step 1: Start Hive
Change to the /usr/local/src/hive directory, type hive, and press Enter.
- [hadoop@master ~]$ cd /usr/local/src/hive/
- [hadoop@master hive]$ hive
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
- hive>
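When the hive> prompt appears, Hive has started successfully and you are in the Hive shell.
2.3.4.2. Step 2: Basic Hive commands
Note: every statement entered at the Hive command line must end with a semicolon.
(1) List the databases
- hive> show databases;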
- OK
- default
- sample
- Time taken: 0.596 seconds, Fetched: 2 row(s)
- hive>
The output lists the default database, default, along with the sample database created earlier.
(2) List all tables in the default database
- hive> use default;
- OK
- Time taken: 0.018 seconds
- hive> show tables;
- OK
- test
- Time taken: 0.036 seconds, Fetched: 1 row(s)
- hive>
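The output shows that the default database currently contains a single table, test.
(3) Create table stu, with an integer id column and a string name column
- hive> create table stu(id int,name string);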
- OK
- Time taken: 0.23 seconds
- hive>
(4) Insert one row into table stu, with id 1001 and name zhangsan
- hive> insert into stu values (1001,"zhangsan");
- WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
- Query ID = hadoop_20220520185326_7c18630d-0690-4b35-8de8-423c9b901677
- Total jobs = 3
- Launching Job 1 out of 3
- Number of reduce tasks is set to 0 since there's no reduce operator
- Starting Job = job_1653042072571_0001, Tracking URL = http://master:8088/proxy/application_1653042072571_0001/
- Kill Command = /usr/local/src/hadoop/bin/hadoop job -kill job_1653042072571_0001
- Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
- 2022-05-20 18:56:05,436 Stage-1 map = 0%, reduce = 0%
- 2022-05-20 18:56:11,699 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.47 sec
- MapReduce Total cumulative CPU time: 3 seconds 470 msec
- Ended Job = job_1653042072571_0001
- Stage-4 is selected by condition resolver.
- Stage-3 is filtered out by condition resolver.
- Stage-5 is filtered out by condition resolver.
- Moving data to: hdfs://master:9000/user/hive/warehouse/stu/.hive-staging_hive_2022-05-20_18-55-52_567_2370673334190980235-1/-ext-10000
- Loading data to table default.stu
- MapReduce Jobs Launched:
- Stage-Stage-1: Map: 1 Cumulative CPU: 3.47 sec HDFS Read: 4138 HDFS Write: 81 SUCCESS
- Total MapReduce CPU Time Spent: 3 seconds 470 msec
- OK
- Time taken: 20.438 seconds
Following the same procedure, insert two more rows, with id 1002 and name lisi, and id 1003 and name wangwu.
(5) List the tables after inserting data
- hive> show tables;
- OK
- stu
- test
- values__tmp__table__1
- Time taken: 0.017 seconds, Fetched: 3 row(s)
- hive>
(6) View the structure of table stu
- hive> desc stu;
- OK
- id int
- name string
- Time taken: 0.031 seconds, Fetched: 2 row(s)
- hive>
(7) View the contents of table stu
- hive> select * from stu;
- OK
- 1001 zhangsan
- Time taken: 0.077 seconds, Fetched: 1 row(s)
- hive>
2.3.4.3. Step 3: View the file systems and command history from the Hive command line
(1) List the local file system by running ! ls /usr/local/src;
- hive> ! ls /usr/local/src;
- apache-hive-2.0.0-bin
- flume
- hadoop
- hbase
- hive
- jdk
- sqoop
- zookeeper
(2) List the HDFS file system by running dfs -ls /;
- hive> dfs -ls /;
- Found 5 items
- drwxr-xr-x - hadoop supergroup 0 2022-04-15 22:04 /hbase
- drwxr-xr-x - hadoop supergroup 0 2022-04-02 18:24 /input
- drwxr-xr-x - hadoop supergroup 0 2022-04-02 18:26 /output
- drwxr-xr-x - hadoop supergroup 0 2022-05-20 18:55 /tmp
- drwxr-xr-x - hadoop supergroup 0 2022-04-29 17:03 /user
(3) View every command previously entered in Hive
Go to the hadoop user's home directory, /home/hadoop, and view the .hivehistory file.
- [hadoop@master ~]$ cd /home/hadoop
- [hadoop@master ~]$ cat .hivehistory
- create database sample;
- use sample;
- create table student(number STRING,name STRING);
- exit;
- select * from sample.student;
- exit;
- show tables;
- exit;
- show databases;
- use default;
- show tables;
- create table stu(id int,name string);
- insert into stu values (1001,"zhangsan");
- show tables;
- desc stu;
- select * from stu;
- ! ls /usr/local/src;
- dfs -ls /;
- exit
- ;
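The output shows every command previously run at the Hive command line, including mistyped ones, which helps with maintenance and troubleshooting.
Lab 3: Monitoring big data platform service status via commands
3.1. Lab objectives
After completing this lab, you should be able to:
- Understand the running state of big data platform services
- Use the commands that report the running state of big data platform services
3.2. Lab requirements
- Be familiar with the ways to check the running state of big data platform services
- Understand the commands for checking the running state of big data platform services
3.3. Lab procedure
3.3.1. Task 1: Check ZooKeeper status via commands
3.3.1.1. Step 1: Check the ZooKeeper status by running zkServer.sh status; the output is as follows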
- [hadoop@master ~]$ zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Mode: follower
In the output above, Mode: follower means that this node is a follower in the ZooKeeper ensemble.
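In a healthy ensemble exactly one node reports Mode: leader and the others report Mode: follower, so extracting the mode on each node gives a quick consistency check. A minimal sketch; the function name zk_mode is hypothetical and it only relies on the "Mode: ..." line shown above.

```shell
# zk_mode
# Reads `zkServer.sh status` output on stdin and prints the value of the
# "Mode:" line (for example "leader", "follower" or "standalone").
# Hypothetical helper for illustration.
zk_mode() {
    awk -F': *' '$1 == "Mode" { print $2 }'
}
```

Running zkServer.sh status 2>&1 | zk_mode on master, slave1, and slave2 in turn should yield one leader and two follower results.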
3.3.1.2. Step 2: View running processes
QuorumPeerMain is the entry-point class of a ZooKeeper ensemble node; it loads the configuration and starts the QuorumPeer thread.
Run jps to view the running processes.
- [hadoop@master ~]$ jps
- 5029 Jps
- 3494 SecondaryNameNode
- 3947 QuorumPeerMain
- 3292 NameNode
- 3660 ResourceManager
3.3.1.3. Step 3: Once the ZooKeeper service has started successfully, run zkCli.sh to connect to it.
- [hadoop@master ~]$ zkCli.sh
- Connecting to localhost:2181
- 2022-05-20 19:07:11,924 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT
- 2022-05-20 19:07:11,927 [myid:] - INFO [main:Environment@100] - Client environment:host.name=master
- 2022-05-20 19:07:11,927 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.8.0_152
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/local/src/jdk/jre
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/usr/local/src/zookeeper/bin/../build/classes:/usr/local/src/zookeeper/bin/../build/lib/*.jar:/usr/local/src/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/src/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/src/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/usr/local/src/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/local/src/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/src/zookeeper/bin/../zookeeper-3.4.8.jar:/usr/local/src/zookeeper/bin/../src/java/lib/*.jar:/usr/local/src/zookeeper/bin/../conf::/usr/local/src/sqoop/lib
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:os.version=3.10.0-862.el7.x86_64
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:user.name=hadoop
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/hadoop
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/home/hadoop
- 2022-05-20 19:07:11,930 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@69d0a921
- Welcome to ZooKeeper!
- 2022-05-20 19:07:11,946 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
- JLine support is enabled
- 2022-05-20 19:07:11,984 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
- 2022-05-20 19:07:11,991 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x180e0fed4990001, negotiated timeout = 30000
- WATCHER::
- WatchedEvent state:SyncConnected type:None path:null
- [zk: localhost:2181(CONNECTED) 0]
3.3.1.4. Step 4: Use a watch to monitor the /hbase node; whenever the content of /hbase changes, a notification is printed. To set the watch, run get /hbase 1.
- cZxid = 0x100000002
- ctime = Thu Apr 23 16:02:29 CST 2022
- mZxid = 0x100000002
- mtime = Thu Apr 23 16:02:29 CST 2022
- pZxid = 0x20000008d
- cversion = 26
- dataVersion = 0
- aclVersion = 0
- ephemeralOwner = 0x0
- dataLength = 0
- numChildren = 16
- [zk: localhost:2181(CONNECTED) 1] set /hbase value-update
- WATCHER::cZxid = 0x100000002
- WatchedEvent state:SyncConnected type:NodeDataChanged
- path:/hbase
- ctime = Thu Apr 23 16:02:29 CST 2022
- mZxid = 0x20000c6d3
- mtime = Fri May 15 15:03:41 CST 2022
- pZxid = 0x20000008d
- cversion = 26
- dataVersion = 1
- aclVersion = 0
- ephemeralOwner = 0x0
- dataLength = 12
- numChildren = 16
- [zk: localhost:2181(CONNECTED) 2] get /hbase
- value-update
- cZxid = 0x100000002
- ctime = Thu Apr 23 16:02:29 CST 2022
- mZxid = 0x20000c6d3
- mtime = Fri May 15 15:03:41 CST 2022
- pZxid = 0x20000008d
- cversion = 26
- dataVersion = 1
- aclVersion = 0
- ephemeralOwner = 0x0
- dataLength = 12
- numChildren = 16
- [zk: localhost:2181(CONNECTED) 3] quit
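The output shows that after set /hbase value-update runs, a NodeDataChanged event is reported and dataVersion changes from 0 to 1, which confirms that /hbase was being watched.
3.3.2. Task 2: Check Sqoop status via commands
3.3.2.1. Step 1: Query the Sqoop version to verify that Sqoop runs correctly.
First change to the /usr/local/src/sqoop directory, then run ./bin/sqoop-version:
- [hadoop@master ~]$ cd /usr/local/src/sqoop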
- [hadoop@master sqoop]$ ./bin/sqoop-version
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/05/20 19:10:55 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- Sqoop 1.4.7
- git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
- Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
The output shows Sqoop 1.4.7, so the Sqoop version is 1.4.7 and the tool runs correctly.
3.3.2.2. Step 2: Test whether Sqoop can connect to the database
From the Sqoop directory, run bin/sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password Password123$, where master:3306 is the database host name and port.
- [hadoop@master sqoop]$ bin/sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password Password123$
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/05/20 19:13:21 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/05/20 19:13:21 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/05/20 19:13:21 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- Fri May 20 19:13:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- information_schema
- hive
- mysql
- performance_schema
- sample
- sys
The output shows that Sqoop can connect to MySQL and list all databases on the Master host, such as information_schema, hive, mysql, performance_schema, sample, and sys.
3.3.2.3. Step 3: Run sqoop help. Output like the following indicates that Sqoop is working.
- [hadoop@master sqoop]$ sqoop help
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/05/20 19:14:48 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- usage: sqoop COMMAND [ARGS]
- Available commands:
- codegen Generate code to interact with database records
- create-hive-table Import a table definition into Hive
- eval Evaluate a SQL statement and display the results
- export Export an HDFS directory to a database table
- help List available commands
- import Import a table from a database to HDFS
- import-all-tables Import tables from a database to HDFS
- import-mainframe Import datasets from a mainframe server to HDFS
- job Work with saved jobs
- list-databases List available databases on a server
- list-tables List available tables in a database
- merge Merge results of incremental imports
- metastore Run a standalone Sqoop metastore
- version Display version information
- See 'sqoop help COMMAND' for information on a specific command.
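The output lists Sqoop's commonly used commands and what each one does.
3.3.3. Task 3: Check Flume status via commands
3.3.3.1. Step 1: Verify that Flume is installed correctly by running flume-ng version to view the Flume version.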
- [hadoop@master ~]$ cd /usr/local/src/flume
- [hadoop@master flume]$ flume-ng version
- Flume 1.6.0
- Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
- Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
- Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
- From source with checksum b29e416802ce9ece3269d34233baf43f
3.3.3.2. Step 2: Add example.conf to /usr/local/src/flume
- [hadoop@master flume]$ cat /usr/local/src/flume/example.conf
- a1.sources=r1
- a1.sinks=k1
- a1.channels=c1
- a1.sources.r1.type=spooldir
- a1.sources.r1.spoolDir=/usr/local/src/flume/
- a1.sources.r1.fileHeader=true
- a1.sinks.k1.type=hdfs
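Chapter 4: Hadoop Configuration File Parameters
Lab 1: Hadoop fully distributed configuration
1.1 Lab objectives
After completing this lab, you should be able to:
- Configure Hadoop in fully distributed mode
- Install Hadoop in fully distributed mode
- Understand the meaning of the parameters in the Hadoop configuration files
1.2 Lab requirements
- Be familiar with installing Hadoop in fully distributed mode
- Understand the purpose of the Hadoop configuration files
1.3 Lab procedure
1.3.1 Task 1: Install Hadoop on the Master node
1.3.1.1 Step 1: Extract the JDK and hadoop-2.7.1.tar.gz packages into /usr/local/src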
- [root@master ~]# tar zvxf jdk-8u152-linux-x64.tar.gz -C /usr/local/src/
- [root@master ~]# tar zvxf hadoop-2.7.1.tar.gz -C /usr/local/src/
1.3.1.2 Step 2: Rename the hadoop-2.7.1 directory to hadoop
- [root@master ~]# cd /usr/local/src/
- [root@master src]# ls
- hadoop-2.7.1 jdk1.8.0_152
- [root@master src]# mv hadoop-2.7.1/ hadoop
- [root@master src]# mv jdk1.8.0_152/ jdk
- [root@master src]# ls
- hadoop jdk
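1.3.1.3 Step 3: Configure the Hadoop environment variables
- [root@master ~]# vi /etc/profile.d/hadoop.sh
Note: environment variables were already configured in Chapter 2 when standalone Hadoop was installed; delete the earlier settings before adding the ones below.
- # Add the following lines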
- export JAVA_HOME=/usr/local/src/jdk
- export HADOOP_HOME=/usr/local/src/hadoop
- export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
1.3.1.4 Step 4: Apply the Hadoop environment variables
- [root@master ~]# source /etc/profile.d/hadoop.sh
- [root@master ~]# echo $PATH
- /usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
1.3.1.5 Step 5: Edit the hadoop-env.sh configuration file with the following command
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hadoop-env.sh
- # Add the following line
- export JAVA_HOME=/usr/local/src/jdk
1.3.2 Task 2: Configure the hdfs-site.xml parameters
Edit the hdfs-site.xml configuration file with the following command.
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hdfs-site.xml
- # Add the following properties between the <configuration> and </configuration> tags
- <configuration>
- <property>
- <name>dfs.namenode.name.dir</name>
- <value>file:/usr/local/src/hadoop/dfs/name</value>
- </property>
- <property>
- <name>dfs.datanode.data.dir</name>
- <value>file:/usr/local/src/hadoop/dfs/data</value>
- </property>
- <property>
- <name>dfs.replication</name>
- <value>2</value>
- </property>
- </configuration>
- #Create the directories
- [root@master ~]# mkdir -p /usr/local/src/hadoop/dfs/{name,data}
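HDFS, Hadoop's distributed file system, normally stores data redundantly. The replication factor is usually 3, meaning that three replicas of each piece of data are kept. This cluster has only two slave nodes (slave1 and slave2) to act as DataNodes, so dfs.replication is set to 2 here, which makes HDFS keep two replicas of each file.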
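To confirm which value a file actually contains after editing, a property can be read back from the XML. This is a rough sketch under the assumption that every <name> and <value> element sits on its own line, as in the listing above; the function name get_prop is hypothetical. On a working installation, hdfs getconf -confKey dfs.replication reports the effective value instead.

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Assumption: <name> and <value> each sit on one line (no multi-line values).
get_prop() {
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}

# Only report when the file from this chapter is present on the machine.
conf=/usr/local/src/hadoop/etc/hadoop/hdfs-site.xml
if [ -f "$conf" ]; then
    echo "dfs.replication = $(get_prop "$conf" dfs.replication)"
fi
```

The function prints nothing when the property is absent, which makes a missing setting easy to spot.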
1.3.3 Task 3: Configure the core-site.xml Parameters
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/core-site.xml
- #Add the following properties between the <configuration> and </configuration> tags
- <configuration>
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://master:9000</value>
- </property>
- <property>
- <name>io.file.buffer.size</name>
- <value>131072</value>
- </property>
- <property>
- <name>hadoop.tmp.dir</name>
- <value>file:/usr/local/src/hadoop/tmp</value>
- </property>
- </configuration>
- #After saving the configuration above, create the directory
- [root@master ~]# mkdir -p /usr/local/src/hadoop/tmp
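If hadoop.tmp.dir is not configured, the default temporary directory is /tmp/hadoop-hadoop (that is, /tmp/hadoop-<user name> for the hadoop user). That directory is deleted every time the Linux system restarts, so the Hadoop file system would have to be formatted again; otherwise Hadoop fails at run time.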
1.3.4 Task 4: Configure mapred-site.xml
- [root@master ~]# cd /usr/local/src/hadoop/etc/hadoop/
- [root@master hadoop]# cp mapred-site.xml.template mapred-site.xml
- [root@master hadoop]# vi mapred-site.xml
- #Add the following properties between the <configuration> and </configuration> tags
- <configuration>
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.address</name>
- <value>master:10020</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.webapp.address</name>
- <value>master:19888</value>
- </property>
- </configuration>
1.3.5 Task 5: Configure yarn-site.xml
- [root@master hadoop]# vi /usr/local/src/hadoop/etc/hadoop/yarn-site.xml
- #Add the following properties between the <configuration> and </configuration> tags
- <configuration>
- <property>
- <name>yarn.resourcemanager.address</name>
- <value>master:8032</value>
- </property>
- <property>
- <name>yarn.resourcemanager.scheduler.address</name>
- <value>master:8030</value>
- </property>
- <property>
- <name>yarn.resourcemanager.webapp.address</name>
- <value>master:8088</value>
- </property>
- <property>
- <name>yarn.resourcemanager.resource-tracker.address</name>
- <value>master:8031</value>
- </property>
- <property>
- <name>yarn.resourcemanager.admin.address</name>
- <value>master:8033</value>
- </property>
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
- <property>
- <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
- <value>org.apache.hadoop.mapred.ShuffleHandler</value>
- </property>
- </configuration>
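A misspelled property name is silently ignored by Hadoop, so it helps to list the names in a file and flag any that do not start with an expected prefix. The sketch below is illustrative only; list_props and odd_props are hypothetical helper names, and it assumes each <name> element is on one line.

```shell
#!/bin/sh
# list_props FILE: print every property name found in a Hadoop XML file.
list_props() {
    sed -n 's:.*<name>\(.*\)</name>.*:\1:p' "$1"
}

# odd_props FILE PREFIX...: print the names that start with none of the prefixes.
odd_props() {
    file="$1"; shift
    list_props "$file" | while IFS= read -r name; do
        ok=no
        for p in "$@"; do
            case $name in "$p"*) ok=yes ;; esac
        done
        [ "$ok" = yes ] || echo "$name"
    done
}

# Every name in yarn-site.xml from this chapter should begin with "yarn.".
conf=/usr/local/src/hadoop/etc/hadoop/yarn-site.xml
if [ -f "$conf" ]; then
    odd_props "$conf" yarn.
fi
```

An empty result means every name carries the expected prefix; it does not prove that the rest of each name is spelled correctly.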
1.3.6 Task 6: Other Hadoop Configuration
1.3.6.1 Step 1: Configure the masters file
- #Edit the masters configuration file
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/masters
- #Add the following
- 10.10.10.128
1.3.6.2 Step 2: Configure the slaves file
- #Edit the slaves configuration file
- [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/slaves
- #Delete localhost and add the following
- 10.10.10.129
- 10.10.10.130
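The slaves file is a plain list with one host per line. A short sketch of reading it the way a helper script might, skipping blank lines and # comments; the function name read_hosts is hypothetical and not part of Hadoop.

```shell
#!/bin/sh
# read_hosts FILE: print one host per line, dropping # comments,
# trailing whitespace, and blank lines.
read_hosts() {
    sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^[[:space:]]*$/d' "$1"
}

# Show the worker list from this chapter when the file is present.
slaves=/usr/local/src/hadoop/etc/hadoop/slaves
if [ -f "$slaves" ]; then
    read_hosts "$slaves"
fi
```

The output can feed a loop such as `read_hosts "$slaves" | while read -r h; do ...; done` for per-node checks.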
1.3.6.3 Step 3: Create the user and change the directory ownership
- #Create the user
- [root@master ~]# useradd hadoop
- [root@master ~]# echo 'hadoop' | passwd --stdin hadoop
- Changing password for user hadoop.
- passwd: all authentication tokens updated successfully.
- #Change the directory ownership
- [root@master ~]# chown -R hadoop.hadoop /usr/local/src/
- [root@master ~]# cd /usr/local/src/
- [root@master src]# ll
- total 0
- drwxr-xr-x 11 hadoop hadoop 171 Mar 27 01:51 hadoop
- drwxr-xr-x 8 hadoop hadoop 255 Sep 14 2017 jdk
1.3.6.4 Step 4: Enable passwordless SSH login from master to all slave nodes
- [root@master ~]# ssh-keygen -t rsa
- Generating public/private rsa key pair.
- Enter file in which to save the key (/root/.ssh/id_rsa):
- Created directory '/root/.ssh'.
- Enter passphrase (empty for no passphrase):
- Enter same passphrase again:
- Your identification has been saved in /root/.ssh/id_rsa.
- Your public key has been saved in /root/.ssh/id_rsa.pub.
- The key fingerprint is:
- SHA256:Ibeslip4Bo9erREJP37u7qhlwaEeMOCg8DlJGSComhk root@master
- The key's randomart image is:
- +---[RSA 2048]----+
- |B.oo |
- |Oo.o |
- |=o=. . o|
- |E.=.o + o |
- |.* BS|
- |* o = o |
- | * * o+ |
- |o O *o |
- |.=.+== |
- +----[SHA256]-----+
- [root@master ~]# ssh-copy-id root@slave1
- /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
- The authenticity of host 'slave1 (10.10.10.129)' can't be established.
- ECDSA key fingerprint is SHA256:Z643OMlGh0yMEc5i85oZ7c21NHdkzSZD9hY9K39xzP4.
- ECDSA key fingerprint is MD5:e0:ef:47:5f:ad:75:9a:44:08:bc:f2:10:8e:d6:53:4a.
- Are you sure you want to continue connecting (yes/no)? yes
- /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- root@slave1's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'root@slave1'"
- and check to make sure that only the key(s) you wanted were added.
- [root@master ~]# ssh-copy-id root@slave2
- /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
- The authenticity of host 'slave2 (10.10.10.130)' can't be established.
- ECDSA key fingerprint is SHA256:Z643OMlGh0yMEc5i85oZ7c21NHdkzSZD9hY9K39xzP4.
- ECDSA key fingerprint is MD5:e0:ef:47:5f:ad:75:9a:44:08:bc:f2:10:8e:d6:53:4a.
- Are you sure you want to continue connecting (yes/no)? yes
- /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- root@slave2's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'root@slave2'"
- and check to make sure that only the key(s) you wanted were added.
-
- [root@master ~]# ssh slave1
- Last login: Sun Mar 27 02:58:38 2022 from master
- [root@slave1 ~]# exit
- logout
- Connection to slave1 closed.
- [root@master ~]# ssh slave2
- Last login: Sun Mar 27 00:26:12 2022 from 10.10.10.1
- [root@slave2 ~]# exit
- logout
- Connection to slave2 closed.
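The manual logins above can be replaced by a non-interactive check. With BatchMode=yes, ssh never prompts for a password, so a failure means key-based login is not working. This is a sketch under assumptions: can_login and check_all are hypothetical names, and SSH_BIN is an override added here only so the function can be exercised without a network.

```shell
#!/bin/sh
# can_login HOST: succeed only if key-based login works without any prompt.
# SSH_BIN (hypothetical override) defaults to the real ssh client.
can_login() {
    "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# check_all HOST...: report each host and return non-zero if any login fails.
check_all() {
    rc=0
    for h in "$@"; do
        if can_login "$h"; then
            echo "ok: $h"
        else
            echo "needs password or unreachable: $h"
            rc=1
        fi
    done
    return $rc
}
```

On master, `check_all slave1 slave2` would be run as the user whose key was copied to the slaves.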
1.3.6.5 Step 5: Copy everything under /usr/local/src/ to all slave nodes
- [root@master ~]# scp -r /usr/local/src/* root@slave1:/usr/local/src/
- [root@master ~]# scp -r /usr/local/src/* root@slave2:/usr/local/src/
- [root@master ~]# scp /etc/profile.d/hadoop.sh root@slave1:/etc/profile.d/
- hadoop.sh 100% 151 45.9KB/s 00:00
-
- [root@master ~]# scp /etc/profile.d/hadoop.sh root@slave2:/etc/profile.d/
- hadoop.sh 100% 151 93.9KB/s 00:00
1.3.6.6 Step 6: Run the following commands on all slave nodes
- (1) On slave1
- [root@slave1 ~]# useradd hadoop
- [root@slave1 ~]# echo 'hadoop' | passwd --stdin hadoop
- Changing password for user hadoop.
- passwd: all authentication tokens updated successfully.
- [root@slave1 ~]# chown -R hadoop.hadoop /usr/local/src/
- [root@slave1 ~]# ll /usr/local/src/
- total 0
- drwxr-xr-x 11 hadoop hadoop 171 Mar 27 03:07 hadoop
- drwxr-xr-x 8 hadoop hadoop 255 Mar 27 03:07 jdk
- [root@slave1 ~]# source /etc/profile.d/hadoop.sh
- [root@slave1 ~]# echo $PATH
- /usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
- (2) On slave2
- [root@slave2 ~]# useradd hadoop
- [root@slave2 ~]# echo 'hadoop' | passwd --stdin hadoop
- Changing password for user hadoop.
- passwd: all authentication tokens updated successfully.
- [root@slave2 ~]# chown -R hadoop.hadoop /usr/local/src/
- [root@slave2 ~]# ll /usr/local/src/
- total 0
- drwxr-xr-x 11 hadoop hadoop 171 Mar 27 03:09 hadoop
- drwxr-xr-x 8 hadoop hadoop 255 Mar 27 03:09 jdk
- [root@slave2 ~]# source /etc/profile.d/hadoop.sh
- [root@slave2 ~]# echo $PATH
- /usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
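Chapter 5 Running the Hadoop Cluster
Lab 1: Running the Hadoop Cluster
1.1 Lab Objectives
After completing this lab, you should be able to:
- Check the running state of Hadoop
- Format the Hadoop file system
- Check the Java processes of Hadoop
- View the HDFS report
- Check the state of the Hadoop nodes
- Stop the Hadoop processes
1.2 Lab Requirements
- Be familiar with how to check the running state of Hadoop
- Be familiar with stopping the Hadoop processes
1.3 Lab Procedure
1.3.1 Task 1: Format the Hadoop File System
1.3.1.1 Step 1: Format the NameNode
Formatting wipes the data on the NameNode. HDFS must be formatted before it is started for the first time. Later starts need no formatting, and formatting again leaves the DataNode processes missing. In addition, once HDFS has run, the Hadoop working directory (set to /usr/local/src/hadoop/tmp in this book) contains data. If you do need to format again, delete the data in the working directory first; otherwise the format runs into problems.
Run the following commands to format the NameNode:
- [root@master ~]# su - hadoop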
- Last login: Fri Apr 1 23:34:46 CST 2022 on pts/1
- [hadoop@master ~]$ cd /usr/local/src/hadoop/
- [hadoop@master hadoop]$ ./bin/hdfs namenode -format
- 22/04/02 01:22:42 INFO namenode.NameNode: STARTUP_MSG:
- /************************************************************
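Because formatting a second time causes trouble, a script can test whether the NameNode directory already holds metadata before calling hdfs namenode -format. A formatted NameNode directory contains a current/VERSION file. The sketch below only reports the state and formats nothing; is_formatted is a hypothetical name, and the path is the dfs.namenode.name.dir value from Chapter 4.

```shell
#!/bin/sh
# is_formatted NAME_DIR: succeed if NAME_DIR already holds NameNode metadata.
is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# dfs.namenode.name.dir as configured in hdfs-site.xml (without the file: prefix).
name_dir=/usr/local/src/hadoop/dfs/name
if is_formatted "$name_dir"; then
    echo "already formatted: $name_dir (skip hdfs namenode -format)"
else
    echo "not formatted yet: $name_dir"
fi
```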
1.3.1.2 Step 2: Start the NameNode
- [hadoop@master hadoop]$ hadoop-daemon.sh start namenode
- namenode running as process 11868. Stop it first.
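1.3.2 Task 2: Check the Java Processes
After startup completes, use the jps command to check whether it succeeded. jps is a command provided with Java that lists the pid of every running Java process.
- [hadoop@master hadoop]$ jps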
- 12122 Jps
- 11868 NameNode
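The jps listing has the form "pid name", so checking for a daemon can be automated by matching the second column. A minimal sketch, assuming that two-column format; has_daemon is a hypothetical helper name.

```shell
#!/bin/sh
# has_daemon NAME: read `jps` output on stdin and succeed if a process
# whose name column equals NAME is listed.
has_daemon() {
    awk -v want="$1" '$2 == want { found = 1 } END { exit !found }'
}

# Example against the listing shown above (no JVM needed for the demo).
if printf '12122 Jps\n11868 NameNode\n' | has_daemon NameNode; then
    echo "NameNode is running"
fi
```

On a live node the input would come from the command itself, for example `jps | has_daemon NameNode`.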
1.3.2.1 Step 1: Switch to the hadoop user
- [hadoop@master ~]$ su - hadoop
- Password:
- Last login: Sat Apr 2 01:22:13 CST 2022 on pts/1
- Last failed login: Sat Apr 2 04:47:08 CST 2022 on pts/1
- There was 1 failed login attempt since the last successful login.
1.3.3 Task 3: View the HDFS Report
- [hadoop@master ~]$ hdfs dfsadmin -report
- Configured Capacity: 0 (0 B)
- Present Capacity: 0 (0 B)
- DFS Remaining: 0 (0 B)
- DFS Used: 0 (0 B)
- DFS Used%: NaN%
- Under replicated blocks: 0
- Blocks with corrupt replicas: 0
- Missing blocks: 0
- Missing blocks (with replication factor 1): 0
- -------------------------------------------------
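A Configured Capacity of 0 in this report means that no DataNode has registered with the NameNode yet, which matches the state at this point in the lab, where only the NameNode has been started. The sketch below pulls that number out of the report text; configured_capacity is a hypothetical name, and the parsing assumes the "Configured Capacity: <bytes> (<human size>)" layout shown above.

```shell
#!/bin/sh
# configured_capacity: read `hdfs dfsadmin -report` output on stdin and print
# the byte count from the first "Configured Capacity:" line.
configured_capacity() {
    awk -F': *' '/^Configured Capacity:/ { split($2, a, " "); print a[1]; exit }'
}

# Example against the first lines of the report shown above.
cap=$(printf 'Configured Capacity: 0 (0 B)\nPresent Capacity: 0 (0 B)\n' | configured_capacity)
if [ "$cap" = "0" ]; then
    echo "no DataNode capacity registered yet"
fi
```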
1.3.3.1 Step 1: Generate the SSH key
- [hadoop@master ~]$ ssh-keygen -t rsa
- Generating public/private rsa key pair.
- Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
- Created directory '/home/hadoop/.ssh'.
- Enter passphrase (empty for no passphrase):
- Enter same passphrase again:
- Your identification has been saved in /home/hadoop/.ssh/id_rsa.
- Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
- The key fingerprint is:
- SHA256:nW/cVxmRp5Ht9TKGT61OmGbhQtkBdpHyS5prGhx24pI hadoop@master.example.com
- The key's randomart image is:
- +---[RSA 2048]----+
- | o.oo +.|
- | ...o o.=|
- | = o *+|
- | .o.* * *|
- |S.+= O =.|
- | = ++oB.+ .|
- | E + =+o. .|
- | . .o. .. |
- |.o |
- +----[SHA256]-----+
- [hadoop@master ~]$ ssh-copy-id slave1
- /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
- The authenticity of host 'slave1 (10.10.10.129)' can't be established.
- ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
- ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
- Are you sure you want to continue connecting (yes/no)? yes
- /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- hadoop@slave1's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'slave1'"
- and check to make sure that only the key(s) you wanted were added.
- [hadoop@master ~]$ ssh-copy-id slave2
- /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
- The authenticity of host 'slave2 (10.10.10.130)' can't be established.
- ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
- ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
- Are you sure you want to continue connecting (yes/no)? yes
- /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- hadoop@slave2's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'slave2'"
- and check to make sure that only the key(s) you wanted were added.
- [hadoop@master ~]$ ssh-copy-id master
- /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
- The authenticity of host 'master (10.10.10.128)' can't be established.
- ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
- ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
- Are you sure you want to continue connecting (yes/no)? yes
- /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- hadoop@master's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'master'"
- and check to make sure that only the key(s) you wanted were added.
1.3.4 Task 4: Stop HDFS with stop-dfs.sh
- [hadoop@master ~]$ stop-dfs.sh
- Stopping namenodes on [master]
- master: stopping namenode
- 10.10.10.129: no datanode to stop
- 10.10.10.130: no datanode to stop
- Stopping secondary namenodes [0.0.0.0]
- The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
- ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
- ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
- Are you sure you want to continue connecting (yes/no)? yes
- 0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
- 0.0.0.0: no secondarynamenode to stop
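1.3.4.1 Restart and Verify
Chapter 6 Hive Component Installation and Configuration
Lab 1: Hive Component Installation and Configuration
1.1. Lab Objectives
After completing this lab, you should be able to:
- Install and configure the Hive component
- Initialize and start the Hive component
1.2. Lab Requirements
- Be familiar with installing and configuring the Hive component
- Understand how to initialize and start the Hive component
1.3. Lab Procedure
1.3.1. Task 1: Download and Extract the Installation Files
1.3.1.1. Step 1: Base environment and installation preparation
Hive is installed on top of a Hadoop system, so make sure Hadoop runs normally before installing Hive. This chapter builds on the fully distributed Hadoop system deployed earlier and installs Hive on the master node.
The deployment plan and package paths for Hive are as follows:
(1) A fully distributed Hadoop system is already installed in the current environment.
(2) MySQL is installed locally (account root, password Password123$); the packages are under /opt/software/mysql-5.7.18.
(3) The MySQL port is 3306.
(4) The MySQL JDBC driver is /opt/software/mysql-connector-java-5.1.47.jar; it is used to update the Hive metadata store.
(5) The Hive package is /opt/software/apache-hive-2.0.0-bin.tar.gz.
1.3.1.2. Step 2: Extract the installation files
(1) As root, extract the Hive package /opt/software/apache-hive-2.0.0-bin.tar.gz to /usr/local/src.
- [root@master ~]# tar -zxvf /opt/software/apache-hive-2.0.0-bin.tar.gz -C /usr/local/src/
(2) Rename the extracted apache-hive-2.0.0-bin directory to hive.
- [root@master ~]# mv /usr/local/src/apache-hive-2.0.0-bin/ /usr/local/src/hive/
(3) Change the owner and group of the hive directory to hadoop.
- [root@master ~]# chown -R hadoop:hadoop /usr/local/src/hive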
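1.3.2. Task 2: Set Up the Hive Environment
1.3.2.1. Step 1: Uninstall the MariaDB database
Hive stores its metadata in MySQL, so before deploying Hive you must first install MySQL on the Linux system and configure its character set, security initialization, and remote access privileges. Log in as root and perform the following steps:
(1) Stop the Linux firewall and disable it so that it does not start automatically at boot.
- [root@master ~]# systemctl stop firewalld
- [root@master ~]# systemctl disable firewalld
(2) Uninstall the MariaDB that ships with the Linux system.
1) First check whether MariaDB is installed.
- [root@master ~]# rpm -qa | grep mariadb
2) Uninstall the MariaDB packages.
Nothing is installed on this machine, so there is nothing to uninstall.
1.3.2.2. Step 2: Install the MySQL database
(1) Install the mysql common, mysql libs, and mysql client packages, in that order.
- [root@master ~]# cd /opt/software/mysql-5.7.18/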
- [root@master mysql-5.7.18]# rpm -ivh mysql-community-common-5.7.18-1.el7.x86_64.rpm
- warning: mysql-community-common-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
- Preparing... ################################# [100%]
- package mysql-community-common-5.7.18-1.el7.x86_64 is already installed
- [root@master mysql-5.7.18]# rpm -ivh mysql-community-libs-5.7.18-1.el7.x86_64.rpm
- warning: mysql-community-libs-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
- Preparing... ################################# [100%]
- package mysql-community-libs-5.7.18-1.el7.x86_64 is already installed
- [root@master mysql-5.7.18]# rpm -ivh mysql-community-client-5.7.18-1.el7.x86_64.rpm
- warning: mysql-community-client-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
- Preparing... ################################# [100%]
- package mysql-community-client-5.7.18-1.el7.x86_64 is already installed
(2) Install the mysql server package.
- [root@master mysql-5.7.18]# rpm -ivh mysql-community-server-5.7.18-1.el7.x86_64.rpm
- warning: mysql-community-server-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
- Preparing... ################################# [100%]
- package mysql-community-server-5.7.18-1.el7.x86_64 is already installed
(3) Modify the MySQL configuration by adding the options shown in Table 6-1 to the /etc/my.cnf file.
Add the following lines below the symbolic-links=0 line in /etc/my.cnf.
- default-storage-engine=innodb
- innodb_file_per_table
- collation-server=utf8_general_ci
- init-connect='SET NAMES utf8'
- character-set-server=utf8
(4) Start the MySQL database.
- [root@master ~]# systemctl start mysqld
(5) Check the MySQL status. If the mysqld process state is active (running), MySQL is running normally.
If the mysqld process state is failed, MySQL did not start correctly; in that case check the /etc/my.cnf file.
- [root@master ~]# systemctl status mysqld
- ● mysqld.service - MySQL Server
- Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
- Active: active (running) since Sun 2022-04-10 22:54:39 CST; 1h 0min ago
- Docs: man:mysqld(8)
- http://dev.mysql.com/doc/refman/en/using-systemd.html
- Main PID: 929 (mysqld)
- CGroup: /system.slice/mysqld.service
- └─929 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/my...
- Apr 10 22:54:35 master systemd[1]: Starting MySQL Server...
- Apr 10 22:54:39 master systemd[1]: Started MySQL Server.
(6) Look up the default MySQL password.
- [root@master ~]# cat /var/log/mysqld.log | grep password
- 2022-04-08T16:20:04.456271Z 1 [Note] A temporary password is generated for root@localhost: 0yf>>yWdMd8_
The default MySQL password is generated randomly at installation, so it differs from one installation to the next.
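Since the password is different on every install, copying it by hand invites typos. The sketch below extracts it from the log text instead. It assumes the line format shown above, with the password as the last whitespace-separated field; temp_password is a hypothetical helper name.

```shell
#!/bin/sh
# temp_password: read mysqld.log on stdin and print the most recent temporary
# root password (the last field of the last "temporary password" line).
temp_password() {
    awk '/temporary password/ { pw = $NF } END { if (pw != "") print pw }'
}

# Read the real log only when it exists and is readable.
log=/var/log/mysqld.log
if [ -r "$log" ]; then
    temp_password < "$log"
fi
```

The function prints nothing when the log holds no such line, for example after the password has been rotated out of the log.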
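(7) Initialize the MySQL database.
Run the mysql_secure_installation command to initialize MySQL. During initialization you must set the login password of the database root user. The password has to satisfy the security policy: upper- and lower-case letters, digits, and special characters. Password123$ can be used.
The following confirmation prompts appear during initialization:
1) Change the password for root ? ((Press y|Y for Yes, any other key for No) asks whether to change the root password. Type y and press Enter.
2) Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No) asks whether to continue with the password just set. Type y and press Enter.
3) Remove anonymous users? (Press y|Y for Yes, any other key for No) asks whether to remove anonymous users. Type y and press Enter.
4) Disallow root login remotely? (Press y|Y for Yes, any other key for No) asks whether to refuse remote logins by root. Type n and press Enter so that root is allowed to log in remotely.
5) Remove test database and access to it? (Press y|Y for Yes, any other key for No) asks whether to remove the test database. Type y and press Enter.
6) Reload privilege tables now? (Press y|Y for Yes, any other key for No) asks whether to reload the privilege tables. Type y and press Enter.
The mysql_secure_installation run looks like this:
- [root@master ~]# mysql_secure_installation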
- Securing the MySQL server deployment.
- Enter password for user root:
- The 'validate_password' plugin is installed on the server.
- The subsequent steps will run with the existing configuration
- of the plugin.
- Using existing password for root.
- Estimated strength of the password: 100
- Change the password for root ? ((Press y|Y for Yes, any other key for No) : y
- New password:
- Re-enter new password:
- Estimated strength of the password: 100
- Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No) : y
- By default, a MySQL installation has an anonymous user,
- allowing anyone to log into MySQL without having to have
- a user account created for them. This is intended only for
- testing, and to make the installation go a bit smoother.
- You should remove them before moving into a production
- environment.
- Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
- Success.
- Normally, root should only be allowed to connect from
- 'localhost'. This ensures that someone cannot guess at
- the root password from the network.
- Disallow root login remotely? (Press y|Y for Yes, any other key for No) : n
- ... skipping.
- By default, MySQL comes with a database named 'test' that
- anyone can access. This is also intended only for testing,
- and should be removed before moving into a production
- environment.
- Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
- - Dropping test database...
- Success.
- - Removing privileges on test database...
- Success.
- Reloading the privilege tables will ensure that all changes
- made so far will take effect immediately.
- Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y
- Success.
- All done!
(8) Grant the root user access to the MySQL databases from localhost and from remote hosts.
- [root@master ~]# mysql -u root -p
- Enter password:
- Welcome to the MySQL monitor. Commands end with ; or \g.
- Your MySQL connection id is 9
- Server version: 5.7.18 MySQL Community Server (GPL)
- Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
- Oracle is a registered trademark of Oracle Corporation and/or its
- affiliates. Other names may be trademarks of their respective
- owners.
- Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
- mysql> grant all privileges on *.* to root@'localhost' identified by 'Password123$';
- Query OK, 0 rows affected, 1 warning (0.00 sec)
- mysql> grant all privileges on *.* to root@'%' identified by 'Password123$';
- Query OK, 0 rows affected, 1 warning (0.00 sec)
- mysql> flush privileges;
- Query OK, 0 rows affected (0.00 sec)
- mysql> select user,host from mysql.user where user='root';
- +------+-----------+
- | user | host |
- +------+-----------+
- | root | % |
- | root | localhost |
- +------+-----------+
- 2 rows in set (0.00 sec)
- mysql> exit;
- Bye
1.3.2.3. Step 3: Configure the Hive component
(1) Set the Hive environment variables and apply them.
- [root@master ~]# vim /etc/profile
- export HIVE_HOME=/usr/local/src/hive
- export PATH=$PATH:$HIVE_HOME/bin
- [root@master ~]# source /etc/profile
(2) Edit the Hive configuration file.
Switch to the hadoop user for the following Hive configuration steps.
Copy the hive-default.xml.template file in /usr/local/src/hive/conf to hive-site.xml.
- [root@master ~]# su - hadoop
- Last login: Sun Apr 10 23:27:25 CS
- [hadoop@master ~]$ cp /usr/local/src/hive/conf/hive-default.xml.template /usr/local/src/hive/conf/hive-site.xml
(3) Edit hive-site.xml with vi so that Hive connects to MySQL, and set the path where Hive stores temporary files.
- [hadoop@master ~]$ vi /usr/local/src/hive/conf/hive-site.xml
1) Set the MySQL database connection.
- <name>javax.jdo.option.ConnectionURL</name>
- <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
- <description>JDBC connect string for a JDBC metastore</description>
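Inside an XML file an ampersand in a value has to be written as &amp;, so a JDBC URL with several query parameters cannot be pasted into hive-site.xml unchanged. The following sketch escapes a URL before it is pasted; xml_escape_amp is a hypothetical name, and the function assumes its input is not already escaped (running it twice would double-escape).

```shell
#!/bin/sh
# xml_escape_amp: read text on stdin and replace every & with &amp;.
# Assumption: the input contains only raw ampersands, none already escaped.
xml_escape_amp() {
    sed 's/&/\&amp;/g'
}

# Example with the connection URL used in this chapter.
url='jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&useSSL=false'
printf '%s\n' "$url" | xml_escape_amp
```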
2) Set the password of the MySQL root user.
- <property>
- <name>javax.jdo.option.ConnectionPassword</name>
- <value>Password123$</value>
- <description>password to use against metastore database</description>
- </property>
3) Verify metastore schema version consistency. If the value is already false by default, no change is needed.
- <property>
- <name>hive.metastore.schema.verification</name>
- <value>false</value>
- <description>
- Enforce metastore schema version consistency.
- True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
- False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
- </description>
- </property>
4) Configure the database driver.
- <property>
- <name>javax.jdo.option.ConnectionDriverName</name>
- <value>com.mysql.jdbc.Driver</value>
- <description>Driver class name for a JDBC metastore</description>
- </property>
5) Set the database user name javax.jdo.option.ConnectionUserName to root.
- <property>
- <name>javax.jdo.option.ConnectionUserName</name>
- <value>root</value>
- <description>Username to use against metastore database</description>
- </property>
6) Replace ${system:java.io.tmpdir}/${system:user.name} in the following places with the /usr/local/src/hive/tmp directory or one of its subdirectories.
The following 4 settings need to be changed:
- <name>hive.querylog.location</name>
- <value>/usr/local/src/hive/tmp</value>
- <description>Location of Hive run time structured log file</description>
- <name>hive.exec.local.scratchdir</name>
- <value>/usr/local/src/hive/tmp</value>
- <name>hive.downloaded.resources.dir</name>
- <value>/usr/local/src/hive/tmp/resources</value>
- <name>hive.server2.logging.operation.log.location</name>
- <value>/usr/local/src/hive/tmp/operation_logs</value>
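The same substitution can be done with sed instead of editing each occurrence by hand. This is a sketch only: fix_tmp_paths is a hypothetical name, and it rewrites just the literal placeholder ${system:java.io.tmpdir}/${system:user.name} named above, keeping a .bak copy of the file. Any setting whose default uses a different placeholder still needs a manual edit.

```shell
#!/bin/sh
# fix_tmp_paths FILE: replace the literal placeholder path with the Hive tmp
# directory used in this chapter, editing FILE in place and keeping FILE.bak.
fix_tmp_paths() {
    sed -i.bak 's|\${system:java\.io\.tmpdir}/\${system:user\.name}|/usr/local/src/hive/tmp|g' "$1"
}

# Apply only when the file from this chapter is present.
site=/usr/local/src/hive/conf/hive-site.xml
if [ -f "$site" ]; then
    fix_tmp_paths "$site"
fi
```

Afterwards, `grep -n 'system:java.io.tmpdir' "$site"` shows whether any placeholder is left.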
7) Create the temporary directory tmp in the Hive installation directory.
- [hadoop@master ~]$ mkdir /usr/local/src/hive/tmp
Hive is now installed and configured.
1.3.2.4. Step 4: Initialize the Hive metadata
1) Copy the MySQL JDBC driver (/opt/software/mysql-connector-java-5.1.46.jar) into the lib directory of the Hive installation.
- [hadoop@master ~]$ cp /opt/software/mysql-connector-java-5.1.46.jar /usr/local/src/hive/lib/
2) Restart Hadoop.
- [hadoop@master ~]$ stop-all.sh
- This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
- Stopping namenodes on [master]
- master: stopping namenode
- 10.10.10.129: stopping datanode
- 10.10.10.130: stopping datanode
- Stopping secondary namenodes [0.0.0.0]
- 0.0.0.0: stopping secondarynamenode
- stopping yarn daemons
- stopping resourcemanager
- 10.10.10.129: stopping nodemanager
- 10.10.10.130: stopping nodemanager
- no proxyserver to stop
- [hadoop@master ~]$ start-all.sh
- This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
- Starting namenodes on [master]
- master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
- 10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
- 10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
- Starting secondary namenodes [0.0.0.0]
- 0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
- starting yarn daemons
- starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
- 10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
- 10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
3) Initialize the database.
- [hadoop@master ~]$ schematool -initSchema -dbType mysql
- which: no hbase in (/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/src/hive/bin:/home/hadoop/.local/bin:/home/hadoop/bin)
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- Metastore connection URL:jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&useSSL=false
- Metastore Connection Driver :com.mysql.jdbc.Driver
- Metastore connection User: root
- Mon Apr 11 00:46:32 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Starting metastore schema initialization to 2.0.0
- Initialization script hive-schema-2.0.0.mysql.sql
- Password123$
- Password123$
- No current connection
- org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
4) Start Hive.
- [hadoop@master hive]$ hive
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
- hive>
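Chapter 7: ZooKeeper Component Installation and Configuration
Lab 1: ZooKeeper Component Installation and Configuration
1.1 Lab Objectives
After completing this lab, you should be able to:
- Download and install ZooKeeper
- Configure the ZooKeeper options
- Start ZooKeeper
1.2 Lab Requirements
- Understand the ZooKeeper configuration options
- Be familiar with starting ZooKeeper
1.3 Lab Procedure
1.3.1 Task 1: Configure time synchronization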
- [root@master ~]# yum -y install chrony
- [root@master ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
-
- [root@master ~]# systemctl restart chronyd.service
- [root@master ~]# systemctl enable chronyd.service
- [root@master ~]# date
- Fri Apr 15 15:40:14 CST 2022
- [root@slave1 ~]# yum -y install chrony
- [root@slave1 ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
- [root@slave1 ~]# systemctl restart chronyd.service
- [root@slave1 ~]# systemctl enable chronyd.service
- [root@slave1 ~]# date
- Fri Apr 15 15:40:17 CST 2022
- [root@slave2 ~]# yum -y install chrony
- [root@slave2 ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
- [root@slave2 ~]# systemctl restart chronyd.service
- [root@slave2 ~]# systemctl enable chronyd.service
- [root@slave2 ~]# date
- Fri Apr 15 15:40:20 CST 2022
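The three nodes above all end up with the same server line in chrony.conf. A minimal sketch of adding that line idempotently follows; the helper name ensure_chrony_server and the scratch file are assumptions made for illustration, and on a real node the target would be /etc/chrony.conf edited as root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "server <host> iburst" to a chrony config
# only if a server line for that host is not already present.
ensure_chrony_server() {
    local conf="$1" server="$2"
    if ! grep -Eq "^server[[:space:]]+${server}([[:space:]]|\$)" "$conf"; then
        printf 'server %s iburst\n' "$server" >> "$conf"
    fi
}

# Demo against a scratch file instead of /etc/chrony.conf.
demo_conf="$(mktemp)"
ensure_chrony_server "$demo_conf" time1.aliyun.com
ensure_chrony_server "$demo_conf" time1.aliyun.com   # second call is a no-op
cat "$demo_conf"
```

Running the helper twice leaves a single server line, so the same script can be reused on master, slave1, and slave2 without producing duplicates.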
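1.3.2 Task 2: Download and install ZooKeeper
The latest ZooKeeper release can be obtained from the official site at http://hadoop.apache.org/zookeeper/. The ZooKeeper version you install must be compatible with your Hadoop environment.
Note: the firewall on every node must be turned off, otherwise the nodes will run into connection problems.
1. The ZooKeeper package zookeeper-3.4.8.tar.gz has already been placed in the /opt/software directory on the Linux system.
2. Extract the package to the target directory by running the following commands on the Master node.
- [root@master ~]# tar xf /opt/software/zookeeper-3.4.8.tar.gz -C /usr/local/src/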
- [root@master ~]# cd /usr/local/src/
- [root@master src]# mv zookeeper-3.4.8/ zookeeper
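1.3.3 Task 3: ZooKeeper configuration options
1.3.3.1 Step 1: Configure the Master node
(1) Create the data and logs directories under the ZooKeeper installation directory.
- [root@master src]# cd /usr/local/src/zookeeper/
- [root@master zookeeper]# mkdir data logs
(2) Write each node's ID into its myid file. Every node must have a different ID: write 1 on master, 2 on slave1, and 3 on slave2.
- [root@master zookeeper]# echo '1' > /usr/local/src/zookeeper/data/myid
(3) Create the configuration file zoo.cfg from the sample file.
- [root@master zookeeper]# cd /usr/local/src/zookeeper/conf/
- [root@master conf]# cp zoo_sample.cfg zoo.cfg
Change the dataDir parameter as follows:
- [root@master conf]# vi zoo.cfg
- dataDir=/usr/local/src/zookeeper/data
(4) Append the following parameters to the end of zoo.cfg. Each line names one of the three ZooKeeper nodes; 2888 is the port followers use to connect to the leader, and 3888 is the port used for leader election.
- server.1=master:2888:3888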
- server.2=slave1:2888:3888
- server.3=slave2:2888:3888
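The server.N entries follow a fixed pattern, so they can be generated from a host list. The sketch below is only an illustration; the function name zk_server_lines is an assumption, and the ports are the ones used in this lab.

```shell
#!/usr/bin/env bash
# Print one "server.N=host:2888:3888" line per host, numbering from 1.
# 2888 (follower-to-leader) and 3888 (election) are the ports used in this lab.
zk_server_lines() {
    local id=1 host
    for host in "$@"; do
        printf 'server.%d=%s:2888:3888\n' "$id" "$host"
        id=$((id + 1))
    done
}

zk_server_lines master slave1 slave2
```

The order of the arguments decides the numbering, so it has to match the myid value written on each node.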
(5) Change the owner of the ZooKeeper installation directory to the hadoop user. The command below applies to everything under /usr/local/src/.
- [root@master conf]# chown -R hadoop:hadoop /usr/local/src/
1.3.3.2 Step 2: Configure the Slave nodes
(1) Copy the ZooKeeper installation directory from the Master node to the two Slave nodes.
- [root@master ~]# scp -r /usr/local/src/zookeeper node1:/usr/local/src/
- [root@master ~]# scp -r /usr/local/src/zookeeper node2:/usr/local/src/
(2) On slave1, change the owner of the zookeeper directory to the hadoop user.
- [root@slave1 ~]# chown -R hadoop:hadoop /usr/local/src/
- [root@slave1 ~]# ll /usr/local/src/
- total 4
- drwxr-xr-x. 12 hadoop hadoop 183 Apr 2 18:11 hadoop
- drwxr-xr-x 9 hadoop hadoop 183 Apr 15 16:37 hbase
- drwxr-xr-x. 8 hadoop hadoop 255 Apr 2 18:06 jdk
- drwxr-xr-x 12 hadoop hadoop 4096 Apr 22 15:31 zookeeper
(3) On slave1, set the node's myid to 2.
- [root@slave1 ~]# echo 2 > /usr/local/src/zookeeper/data/myid
(4) On slave2, change the owner of the zookeeper directory to the hadoop user.
- [root@slave2 ~]# chown -R hadoop:hadoop /usr/local/src/
(5) On slave2, set the node's myid to 3.
- [root@slave2 ~]# echo 3 > /usr/local/src/zookeeper/data/myid
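Because the zookeeper directory is copied from master, each slave starts with master's myid until it is overwritten. A small check like the one below can catch a node whose myid does not match its server.N entry. The function name myid_matches and the scratch files are assumptions for this sketch; on a real node you would pass conf/zoo.cfg and data/myid from the installation directory.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if the number in <myid_file> equals the N
# of the "server.N=<host>:..." line for <host> in <zoo_cfg>.
myid_matches() {
    local zoo_cfg="$1" myid_file="$2" host="$3" expected actual
    expected="$(sed -n "s/^server\.\([0-9][0-9]*\)=${host}:.*/\1/p" "$zoo_cfg")"
    actual="$(tr -d '[:space:]' < "$myid_file")"
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Demo with scratch copies of the lab's settings.
work="$(mktemp -d)"
printf 'server.1=master:2888:3888\nserver.2=slave1:2888:3888\nserver.3=slave2:2888:3888\n' > "$work/zoo.cfg"
echo 2 > "$work/myid"
if myid_matches "$work/zoo.cfg" "$work/myid" slave1; then
    echo "myid OK for slave1"
else
    echo "myid mismatch for slave1"
fi
```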
1.3.3.3 Step 3: Configure the system environment variables
Add the environment variable configuration on all three nodes: master, slave1, and slave2.
- [root@master conf]# vi /etc/profile.d/zookeeper.sh
- export ZOOKEEPER_HOME=/usr/local/src/zookeeper
- export PATH=${ZOOKEEPER_HOME}/bin:$PATH
- [root@master ~]# scp /etc/profile.d/zookeeper.sh node1:/etc/profile.d/
- zookeeper.sh 100% 8742.3KB/s 00:00
- [root@master ~]# scp /etc/profile.d/zookeeper.sh node2:/etc/profile.d/
- zookeeper.sh 100% 8750.8KB/s 00:00
1.3.4 Task 4: Start ZooKeeper
ZooKeeper must be started as the hadoop user.
(1) Run zkServer.sh start on each of the three nodes (master, slave1, and slave2) to start ZooKeeper.
- [root@master ~]# su - hadoop
- Last login: Fri Apr 15 21:54:17 CST 2022 on pts/0
- [hadoop@master ~]$ jps
- 3922 Jps
- [hadoop@master ~]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- [hadoop@master ~]$ jps
- 3969 Jps
- 3950 QuorumPeerMain
- [root@slave1 ~]# su - hadoop
- Last login: Fri Apr 15 22:06:47 CST 2022 on pts/0
- [hadoop@slave1 ~]$ jps
- 1370 Jps
- [hadoop@slave1 ~]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- [hadoop@slave1 ~]$ jps
- 1395 QuorumPeerMain
- 1421 Jps
- [root@slave2 ~]# su - hadoop
- Last login: Fri Apr 15 16:25:52 CST 2022 on pts/1
- [hadoop@slave2 ~]$ jps
- 1336 Jps
- [hadoop@slave2 ~]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- [hadoop@slave2 ~]$ jps
- 1361 QuorumPeerMain
- 1387 Jps
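(2) After all three nodes have started, check the ZooKeeper running status on each of them. One node should report Mode: leader and the other two Mode: follower.
- [hadoop@master conf]$ zkServer.sh status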
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Mode: follower
- [hadoop@slave1 ~]$ zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Mode: leader
- [hadoop@slave2 conf]$ zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Mode: follower
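When checking several nodes, it can help to pull just the role out of the status text. The following is a sketch rather than part of the lab; zk_mode is a hypothetical helper name, and the sample text is copied from the master output above.

```shell
#!/usr/bin/env bash
# Extract the role from the text printed by "zkServer.sh status" (read on stdin).
zk_mode() {
    sed -n 's/^Mode:[[:space:]]*//p'
}

# Sample text from the lab; on a live node you could instead run:
#   zkServer.sh status 2>&1 | zk_mode
sample='ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: follower'
printf '%s\n' "$sample" | zk_mode
```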
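Chapter 8: HBase Component Installation and Configuration
Lab 1: HBase Component Installation and Configuration
1.1 Lab Objectives
After completing this lab, you should be able to:
- Install and configure HBase
- Use the common HBase shell commands
1.2 Lab Requirements
- Understand how HBase works
- Be familiar with the common HBase shell commands
1.3 Lab Procedure
1.3.1 Task 1: Configure time synchronization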
- [root@master ~]# yum -y install chrony
- [root@master ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
-
- [root@master ~]# systemctl restart chronyd.service
- [root@master ~]# systemctl enable chronyd.service
- [root@master ~]# date
- Fri Apr 15 15:40:14 CST 2022
- [root@slave1 ~]# yum -y install chrony
- [root@slave1 ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
- [root@slave1 ~]# systemctl restart chronyd.service
- [root@slave1 ~]# systemctl enable chronyd.service
- [root@slave1 ~]# date
- Fri Apr 15 15:40:17 CST 2022
- [root@slave2 ~]# yum -y install chrony
- [root@slave2 ~]# cat /etc/chrony.conf
- # Use public servers from the pool.ntp.org project.
- # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- server time1.aliyun.com iburst
- [root@slave2 ~]# systemctl restart chronyd.service
- [root@slave2 ~]# systemctl enable chronyd.service
- [root@slave2 ~]# date
- Fri Apr 15 15:40:20 CST 2022
1.3.2 Task 2: Install and configure HBase
1.3.2.1 Step 1: Extract the HBase package
- [root@master ~]# tar -zxvf hbase-1.2.1-bin.tar.gz -C /usr/local/src/
1.3.2.2 Step 2: Rename the HBase installation directory
- [root@master ~]# cd /usr/local/src/
- [root@master src]# mv hbase-1.2.1 hbase
1.3.2.3 Step 3: Add the environment variables on all nodes
- [root@master ~]# cat /etc/profile
- # set hbase environment
- export HBASE_HOME=/usr/local/src/hbase
- export PATH=$HBASE_HOME/bin:$PATH
- [root@slave1 ~]# cat /etc/profile
- # set hbase environment
- export HBASE_HOME=/usr/local/src/hbase
- export PATH=$HBASE_HOME/bin:$PATH
- [root@slave2 ~]# cat /etc/profile
- # set hbase environment
- export HBASE_HOME=/usr/local/src/hbase
- export PATH=$HBASE_HOME/bin:$PATH
1.3.2.4 Step 4: Apply the environment variables on all nodes
- [root@master ~]# source /etc/profile
- [root@master ~]# echo $PATH
- /usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/src/hive/bin:/root/bin:/usr/local/src/hive/bin:/usr/local/src/hive/bin
- [root@slave1 ~]# source /etc/profile
- [root@slave1 ~]# echo $PATH
- /usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
- [root@slave2 ~]# source /etc/profile
- [root@slave2 ~]# echo $PATH
- /usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
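The PATH values above list the same directories several times, which is what happens when /etc/profile is sourced repeatedly with unconditional export PATH=... lines. A guarded prepend avoids the duplicates. This is a sketch only; path_prepend is a hypothetical helper, not something the lab defines.

```shell
#!/usr/bin/env bash
# Prepend a directory to PATH only when it is not already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
}

path_prepend /usr/local/src/hbase/bin
path_prepend /usr/local/src/hbase/bin   # no duplicate is added
echo "$PATH"
```

Duplicate entries are harmless for command lookup, so this is a tidiness measure rather than a required fix.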
1.3.2.5 Step 5: On the master node, change to the configuration directory
- [root@master ~]# cd /usr/local/src/hbase/conf/
1.3.2.6 Step 6: On the master node, configure hbase-env.sh
- [root@master conf]# cat hbase-env.sh
- export JAVA_HOME=/usr/local/src/jdk
- export HBASE_MANAGES_ZK=true
- export HBASE_CLASSPATH=/usr/local/src/hadoop/etc/hadoop/
1.3.2.7 Step 7: On the master node, configure hbase-site.xml
- [root@master conf]# cat hbase-site.xml
- <configuration>
- <property>
- <name>hbase.rootdir</name>
- <value>hdfs://master:9000/hbase</value>
- </property>
- <property>
- <name>hbase.master.info.port</name>
- <value>60010</value>
- </property>
- <property>
- <name>hbase.zookeeper.property.clientPort</name>
- <value>2181</value>
- </property>
- <property>
- <name>zookeeper.session.timeout</name>
- <value>120000</value>
- </property>
- <property>
- <name>hbase.zookeeper.quorum</name>
- <value>master,node1,node2</value>
- </property>
- <property>
- <name>hbase.tmp.dir</name>
- <value>/usr/local/src/hbase/tmp</value>
- </property>
- <property>
- <name>hbase.cluster.distributed</name>
- <value>true</value>
- </property>
- </configuration>
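To confirm what a given property is set to without opening the file, a quick lookup can be scripted. The sketch below assumes one tag per line, as in the listing above, and is not a general XML parser; hbase_prop is a hypothetical helper name and the scratch file holds only two of the lab's properties.

```shell
#!/usr/bin/env bash
# Print the <value> that follows a given <name> in a Hadoop-style XML file.
# Assumes each <name> and <value> tag sits on its own line.
hbase_prop() {
    local file="$1" name="$2"
    awk -v want="<name>${name}</name>" '
        index($0, want) { found = 1; next }
        found && /<value>/ {
            gsub(/.*<value>|<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}

# Scratch copy with two of the properties configured above.
site="$(mktemp)"
cat > "$site" <<'EOF'
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,node1,node2</value>
</property>
</configuration>
EOF
hbase_prop "$site" hbase.zookeeper.quorum
```

On the master node the same function could be pointed at /usr/local/src/hbase/conf/hbase-site.xml.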
1.3.2.8 Step 8: On the master node, edit the regionservers file
- [root@master conf]# cat regionservers
- node1
- node2
1.3.2.9 Step 9: On the master node, create the hbase.tmp.dir directory
- [root@master ~]# mkdir /usr/local/src/hbase/tmp
1.3.2.10 Step 10: Copy the hbase installation from master to node1 and node2
- [root@master ~]# scp -r /usr/local/src/hbase/ root@node1:/usr/local/src/
- [root@master ~]# scp -r /usr/local/src/hbase/ root@node2:/usr/local/src/
1.3.2.11 Step 11: Change the ownership of the hbase directory on all nodes
- [root@master ~]# chown -R hadoop:hadoop /usr/local/src/hbase/
- [root@slave1 ~]# chown -R hadoop:hadoop /usr/local/src/hbase/
- [root@slave2 ~]# chown -R hadoop:hadoop /usr/local/src/hbase/
1.3.2.12 Step 12: Switch to the hadoop user on all nodes
- [root@master ~]# su - hadoop
- Last login: Mon Apr 11 00:42:46 CST 2022 on pts/0
- [root@slave1 ~]# su - hadoop
- Last login: Fri Apr 8 22:57:42 CST 2022 on pts/0
- [root@slave2 ~]# su - hadoop
- Last login: Fri Apr 8 22:57:54 CST 2022 on pts/0
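1.3.2.13 Step 13: Start HBase
Start Hadoop first, then ZooKeeper, and finally HBase. Because hbase-env.sh sets HBASE_MANAGES_ZK=true, start-hbase.sh also starts HBase's own ZooKeeper processes, which appear as HQuorumPeer in the jps output of the next step.
- [hadoop@master ~]$ start-all.sh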
- [hadoop@master ~]$ jps
- 2130 SecondaryNameNode
- 1927 NameNode
- 2554 Jps
- 2301 ResourceManager
- [hadoop@slave1 ~]$ jps
- 1845 NodeManager
- 1977 Jps
- 1725 DataNode
- [hadoop@slave2 ~]$ jps
- 2080 Jps
- 1829 DataNode
- 1948 NodeManager
1.3.2.14 Step 14: Start HBase on the master node
- [hadoop@master conf]$ start-hbase.sh
- [hadoop@master conf]$ jps
- 2130 SecondaryNameNode
- 3572 HQuorumPeer
- 1927 NameNode
- 5932 HMaster
- 2301 ResourceManager
- 6157 Jps
- [hadoop@slave1 ~]$ jps
- 2724 Jps
- 1845 NodeManager
- 1725 DataNode
- 2399 HQuorumPeer
- 2527 HRegionServer
- [root@slave2 ~]# jps
- 3795 Jps
- 1829 DataNode
- 3529 HRegionServer
- 1948 NodeManager
- 3388 HQuorumPeer
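Comparing jps listings by eye across three nodes is error-prone. The sketch below reports which expected daemons are absent from a listing; missing_daemons is a hypothetical helper, and the sample text is the master listing from this step.

```shell
#!/usr/bin/env bash
# Report which of the expected daemon names are absent from jps-style text
# read on stdin. Prints nothing when every daemon is present.
missing_daemons() {
    local listing name
    listing="$(cat)"
    for name in "$@"; do
        printf '%s\n' "$listing" \
            | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' \
            || echo "$name"
    done
}

# Sample text copied from the master listing; on a live node use:
#   jps | missing_daemons NameNode HMaster HQuorumPeer
master_jps='2130 SecondaryNameNode
3572 HQuorumPeer
1927 NameNode
5932 HMaster
2301 ResourceManager
6157 Jps'
printf '%s\n' "$master_jps" | missing_daemons NameNode HMaster HQuorumPeer HRegionServer
```

In this demo HRegionServer is reported as missing, which is expected on master because the regionservers file lists only node1 and node2.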
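1.3.2.15 Step 15: Edit the hosts file on Windows
(C:\Windows\System32\drivers\etc\hosts)
Drag the hosts file to the desktop, edit it to add the mapping between the master hostname and its IP address, and then put it back in place. After that, open http://master:60010 in a browser to reach the HBase web UI.
1.3.3 Task 3: Common HBase shell commands
1.3.3.1 Step 1: Enter the HBase shell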
- [hadoop@master ~]$ hbase shell
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- HBase Shell; enter 'help<RETURN>' for list of supported commands.
- Type "exit<RETURN>" to leave the HBase Shell
- Version 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016
- hbase(main):001:0>
1.3.3.2 Step 2: Create the table scores with two column families, grade and course
- hbase(main):001:0> create 'scores','grade','course'
- 0 row(s) in 1.4400 seconds
- => Hbase::Table - scores
1.3.3.3 Step 3: Check the database status
- hbase(main):002:0> status
- 1 active master, 0 backup masters, 2 servers, 0 dead, 1.5000 average load
1.3.3.4 Step 4: Check the database version
- hbase(main):003:0> version
- 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016
1.3.3.5 Step 5: List the tables
- hbase(main):004:0> list
- TABLE
- scores
- 1 row(s) in 0.0150 seconds
- => ["scores"]
1.3.3.6 Step 6: Insert record 1: jie, grade:, 146cloud
- hbase(main):005:0> put 'scores','jie','grade:','146cloud'
- 0 row(s) in 0.1060 seconds
1.3.3.7 Step 7: Insert record 2: jie, course:math, 86
- hbase(main):006:0> put 'scores','jie','course:math','86'
- 0 row(s) in 0.0120 seconds
1.3.3.8 Step 8: Insert record 3: jie, course:cloud, 92
- hbase(main):009:0> put 'scores','jie','course:cloud','92'
- 0 row(s) in 0.0070 seconds
1.3.3.9 Step 9: Insert record 4: shi, grade:, 133soft
- hbase(main):010:0> put 'scores','shi','grade:','133soft'
- 0 row(s) in 0.0120 seconds
1.3.3.10 Step 10: Insert record 5: shi, course:math, 87
- hbase(main):011:0> put 'scores','shi','course:math','87'
- 0 row(s) in 0.0090 seconds
1.3.3.11 Step 11: Insert record 6: shi, course:cloud, 96
- hbase(main):012:0> put 'scores','shi','course:cloud','96'
- 0 row(s) in 0.0100 seconds
1.3.3.12 Step 12: Read jie's record
- hbase(main):013:0> get 'scores','jie'
- COLUMN CELL
- course:cloud timestamp=1650015032132, value=92
- course:math timestamp=1650014925177, value=86
- grade: timestamp=1650014896056, value=146cloud
- 3 row(s) in 0.0250 seconds
1.3.3.13 Step 13: Read jie's class (the grade column family)
- hbase(main):014:0> get 'scores','jie','grade'
- COLUMN CELL
- grade: timestamp=1650014896056, value=146cloud
- 1 row(s) in 0.0110 seconds
1.3.3.14 Step 14: View all records in the table
- hbase(main):001:0> scan 'scores'
- ROW COLUMN+CELL
- jie column=course:cloud, timestamp=1650015032132, value=92
- jie column=course:math, timestamp=1650014925177, value=86
- jie column=grade:, timestamp=1650014896056, value=146cloud
- shi column=course:cloud, timestamp=1650015240873, value=96
- shi column=course:math, timestamp=1650015183521, value=87
- 2 row(s) in 0.1490 seconds
1.3.3.15 Step 15: View the table records by column family
- hbase(main):002:0> scan 'scores',{COLUMNS=>'course'}
- ROW COLUMN+CELL
- jie column=course:cloud, timestamp=1650015032132, value=92
- jie column=course:math, timestamp=1650014925177, value=86
- shi column=course:cloud, timestamp=1650015240873, value=96
- shi column=course:math, timestamp=1650015183521, value=87
- 2 row(s) in 0.0160 seconds
1.3.3.16 Step 16: Delete a specific record
- hbase(main):003:0> delete 'scores','shi','grade'
- 0 row(s) in 0.0560 seconds
1.3.3.17 Step 17: Run scan after the delete
- hbase(main):004:0> scan 'scores'
- ROW COLUMN+CELL
- jie column=course:cloud, timestamp=1650015032132, value=92
- jie column=course:math, timestamp=1650014925177, value=86
- jie column=grade:, timestamp=1650014896056, value=146cloud
- shi column=course:cloud, timestamp=1650015240873, value=96
- shi column=course:math, timestamp=1650015183521, value=87
- 2 row(s) in 0.0130 seconds
1.3.3.18 Step 18: Add a new column family
- hbase(main):005:0> alter 'scores',NAME=>'age'
- Updating all regions with the new schema...
- 1/1 regions updated.
- Done.
- 0 row(s) in 2.0110 seconds
1.3.3.19 Step 19: View the table structure
- hbase(main):006:0> describe 'scores'
- Table scores is ENABLED
- scores
- COLUMN FAMILIES DESCRIPTION
- {NAME => 'age', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
- {NAME => 'course', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
- {NAME => 'grade', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
- 3 row(s) in 0.0230 seconds
1.3.3.20 Step 20: Delete the column family
- hbase(main):007:0> alter 'scores',NAME=>'age',METHOD=>'delete'
- Updating all regions with the new schema...
- 1/1 regions updated.
- Done.
- 0 row(s) in 2.1990 seconds
1.3.3.21 Step 21: Delete the table (disable it first; a disabled table can then be removed with drop 'scores')
- hbase(main):008:0> disable 'scores'
- 0 row(s) in 2.3190 seconds
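The interactive steps above can also be collected into a command file and passed to hbase shell as an argument. The sketch below is an illustration under assumptions: the table name scores_demo is hypothetical (chosen so the lab's scores table is not touched), and the script only prints the commands when the hbase launcher is not on PATH.

```shell
#!/usr/bin/env bash
# Write a few of the lab's table commands to a command file.
# The table name scores_demo is a placeholder for this sketch.
script="$(mktemp)"
cat > "$script" <<'EOF'
create 'scores_demo','grade','course'
put 'scores_demo','jie','grade:','146cloud'
put 'scores_demo','jie','course:math','86'
scan 'scores_demo'
disable 'scores_demo'
drop 'scores_demo'
exit
EOF

# Run it only where the hbase launcher is available (for example on master
# as the hadoop user); elsewhere just show what would be executed.
if command -v hbase >/dev/null 2>&1; then
    hbase shell "$script"
else
    cat "$script"
fi
```

The file ends with exit so that the shell returns to the prompt after the last command, and disable comes before drop because a table has to be disabled before it can be dropped.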
1.3.3.22 Step 22: Exit the HBase shell (type exit)
1.3.3.23 Step 23: Stop HBase
- [hadoop@master ~]$ stop-hbase.sh
- stopping hbase.................
- master: stopping zookeeper.
- node2: stopping zookeeper.
- node1: stopping zookeeper.
Stop Hadoop on the master node.
- [hadoop@master ~]$ stop-all.sh
- This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
- Stopping namenodes on [master]
- master: stopping namenode
- 10.10.10.130: stopping datanode
- 10.10.10.129: stopping datanode
- Stopping secondary namenodes [0.0.0.0]
- 0.0.0.0: stopping secondarynamenode
- stopping yarn daemons
- stopping resourcemanager
- 10.10.10.129: stopping nodemanager
- 10.10.10.130: stopping nodemanager
- no proxyserver to stop
- [hadoop@master ~]$ jps
- 3820 Jps
- [hadoop@slave1 ~]$ jps
- 2220 Jps
- [root@slave2 ~]# jps
- 2082 Jps
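Chapter 9: Sqoop Component Installation and Configuration
Lab 1: Sqoop Component Installation and Configuration
1.1 Lab Objectives
After completing this lab, you should be able to:
- Download and extract Sqoop
- Configure the Sqoop environment
- Install Sqoop
- Use the Sqoop template commands
1.2 Lab Requirements
1.3 Lab Procedure
1.3.1 Task 1: Download and extract Sqoop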
The Sqoop version you install must be compatible with your Hadoop environment. Deploy it on the Master node as the root user: extract the archive /opt/software/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz into the /usr/local/src directory.
- [root@master ~]# tar xf /opt/software/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /usr/local/src/
Rename the extracted sqoop-1.4.7.bin__hadoop-2.6.0 directory to sqoop.
- [root@master ~]# cd /usr/local/src/
- [root@master src]# mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop
1.3.2 Task 2: Configure the Sqoop environment
1.3.2.1 Step 1: Create the Sqoop configuration file sqoop-env.sh.
Copy the sqoop-env-template.sh template and name the copy sqoop-env.sh.
- [root@master src]# cd /usr/local/src/sqoop/conf/
- [root@master conf]# cp sqoop-env-template.sh sqoop-env.sh
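1.3.2.2 Step 2: Edit sqoop-env.sh and add the installation paths of Hadoop, HBase, Hive, and the other components.
Note: the installation paths below must match the paths in your actual environment.
- vim sqoop-env.sh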
- export HADOOP_COMMON_HOME=/usr/local/src/hadoop
- export HADOOP_MAPRED_HOME=/usr/local/src/hadoop
- export HBASE_HOME=/usr/local/src/hbase
- export HIVE_HOME=/usr/local/src/hive
1.3.2.3 Step 3: Configure the Linux system environment variables and add the Sqoop path.
- vim /etc/profile.d/sqoop.sh
- export SQOOP_HOME=/usr/local/src/sqoop
- export PATH=$SQOOP_HOME/bin:$PATH
- export CLASSPATH=$CLASSPATH:$SQOOP_HOME/lib
- [root@master conf]# source /etc/profile.d/sqoop.sh
- [root@master conf]# echo $PATH
- /usr/local/src/sqoop/bin:/usr/local/src/hbase/bin:/usr/local/src/zookeeper/bin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/src/hive/bin:/root/bin
1.3.2.4 Step 4: Connect to the database
For Sqoop to connect to the MySQL database, copy /opt/software/mysql-connector-java-5.1.46.jar into Sqoop's lib directory. The version of this jar must match the MySQL version, otherwise Sqoop reports errors when importing data (mysql-connector-java-5.1.46.jar corresponds to MySQL 5.7). If the jar is not in that directory, use the jar that was copied to the home directory in Chapter 6.
- [root@master conf]# cp /opt/software/mysql-connector-java-5.1.46.jar /usr/local/src/sqoop/lib/
1.3.3 Task 3: Start Sqoop
1.3.3.1 Step 1: Start the Hadoop cluster before running Sqoop.
On the master node, switch to the hadoop user and run start-all.sh to start the Hadoop cluster.
- [root@master conf]# su - hadoop
- Last login: Fri Apr 22 16:21:25 CST 2022 on pts/0
- [hadoop@master ~]$ start-all.sh
- This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
- Starting namenodes on [master]
- master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
- 10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
- 10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
- Starting secondary namenodes [0.0.0.0]
- 0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
- starting yarn daemons
- starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
- 10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
- 10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
1.3.3.2 Step 2: Check the running status of the Hadoop cluster.
- [hadoop@master ~]$ jps
- 1653 SecondaryNameNode
- 2086 Jps
- 1450 NameNode
- 1822 ResourceManager
- [root@slave1 ~]# jps
- 1378 NodeManager
- 1268 DataNode
- 1519 Jps
- [root@slave2 ~]# jps
- 1541 Jps
- 1290 DataNode
- 1405 NodeManager
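1.3.3.3 Step 3: Test whether Sqoop can connect to the MySQL database.
Connect Sqoop to the MySQL database. The -P option (uppercase P) prompts for the password, which is Password123$.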
- [hadoop@master ~]$ sqoop list-databases --connect jdbc:mysql://master:3306 --username root -P
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 15:25:49 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- Enter password:
- 22/04/29 15:25:58 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- Fri Apr 29 15:25:58 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- information_schema
- hive
- mysql
- performance_schema
- sys
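The SSL warning in the output above itself suggests setting useSSL=false explicitly, and a later command in this chapter does so in its connection string. A small sketch for building such a URL follows; mysql_jdbc_url is a hypothetical helper, and the final line only prints the sqoop command rather than running it.

```shell
#!/usr/bin/env bash
# Build a MySQL JDBC URL with useSSL=false, the option the warning suggests.
# The database name is optional; leave it out to list databases.
mysql_jdbc_url() {
    local host="$1" port="$2" db="${3:-}"
    printf 'jdbc:mysql://%s:%s/%s?useSSL=false\n' "$host" "$port" "$db"
}

url="$(mysql_jdbc_url master 3306)"
# Quote the URL when passing it to sqoop: "?" is a glob character in the shell.
echo "sqoop list-databases --connect \"$url\" --username root -P"
```

Whether disabling SSL is acceptable depends on the environment; on a closed lab network it mainly serves to keep the output readable.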
1.3.3.4 Step 4: Connect to Hive
For Sqoop to connect to Hive, also copy hive-common-2.0.0.jar from the Hive component's /usr/local/src/hive/lib directory into the lib directory of the Sqoop installation.
- [hadoop@master ~]$ cp /usr/local/src/hive/lib/hive-common-2.0.0.jar /usr/local/src/sqoop/lib/
1.3.4 Task 4: Sqoop template commands
1.3.4.1 Step 1: Create the MySQL database and table.
Create the sample database, create the student table in it, and insert 3 rows into student.
- # Log in to the MySQL database
- [hadoop@master ~]$ mysql -uroot -pPassword123$
- mysql: [Warning] Using a password on the command line interface can be insecure.
- Welcome to the MySQL monitor. Commands end with ; or \g.
- Your MySQL connection id is 6
- Server version: 5.7.18 MySQL Community Server (GPL)
- Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
- Oracle is a registered trademark of Oracle Corporation and/or its
- affiliates. Other names may be trademarks of their respective
- owners.
- Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
- # Create the sample database
- mysql> create database sample;
- Query OK, 1 row affected (0.00 sec)
- # Switch to the sample database
- mysql> use sample;
- Database changed
- # Create the student table with two columns: number (student ID) and name
- mysql> create table student(number char(9) primary key, name varchar(10));
- Query OK, 0 rows affected (0.01 sec)
- # Insert a few rows into the student table
- mysql> insert into student values('01','zhangsan'),('02','lisi'),('03','wangwu');
- Query OK, 3 rows affected (0.01 sec)
- Records: 3 Duplicates: 0 Warnings: 0
- # Query the student table
- mysql> select * from student;
- +--------+----------+
- | number | name |
- +--------+----------+
- | 01 | zhangsan |
- | 02 | lisi |
- | 03 | wangwu |
- +--------+----------+
- 3 rows in set (0.00 sec)
- mysql> quit
- Bye
1.3.4.2 Step 2: Create the sample database and the student table in Hive.
- hive>
- > create database sample;
- OK
- Time taken: 0.528 seconds
- hive> use sample;
- OK
- Time taken: 0.019 seconds
- hive> create table student(number STRING,name STRING);
- OK
- Time taken: 0.2 seconds
- hive> exit;
- [hadoop@master conf]$
1.3.4.3 Step 3: Export the data from MySQL and import it into Hive.
- [hadoop@master ~]$ sqoop import --connect jdbc:mysql://master:3306/sample --username root --password Password123$ --table student --fields-terminated-by '|' --delete-target-dir --num-mappers 1 --hive-import --hive-database sample --hive-table student
- hive>
- > select * from sample.student;
- OK
- 01|zhangsan NULL
- 02|lisi NULL
- 03|wangwu NULL
- Time taken: 1.238 seconds, Fetched: 3 row(s)
- hive>
- > exit;
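In the query result above, every row shows the whole 01|zhangsan text in the first column and NULL in the second. A likely explanation, stated here as an assumption: the Hive table in Step 2 was created without a ROW FORMAT clause, so Hive splits on its default field delimiter (Ctrl-A, \001), while the import wrote | between fields. The sketch below uses awk only to illustrate that effect on one line; count_fields is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Count the fields awk sees in one imported line under a given delimiter.
count_fields() {
    local delim="$1" line="$2"
    printf '%s\n' "$line" | awk -F "$delim" '{ print NF }'
}

line='01|zhangsan'
echo "split on Ctrl-A: $(count_fields "$(printf '\001')" "$line") field(s)"
echo "split on |:      $(count_fields '|' "$line") field(s)"
```

One way to make the columns line up would be to create the Hive table with ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' so that it matches the --fields-terminated-by '|' option of the import; another would be to drop that option and let both sides use the default delimiter.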
1.3.4.4 Step 4: Common sqoop commands
- # List all databases
- [hadoop@master ~]$ sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password Password123$
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 16:55:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 16:55:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 16:55:40 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- Fri Apr 29 16:55:40 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- information_schema
- hive
- mysql
- performance_schema
- sample
- sys
- # Connect to MySQL and list the tables in the sample database
- [hadoop@master ~]$ sqoop list-tables --connect "jdbc:mysql://master:3306/sample?useSSL=false" --username root --password Password123$
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 16:56:45 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 16:56:45 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 16:56:45 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- student
- # Copy the structure of a relational table into Hive; only the schema is copied, not the rows
- [hadoop@master ~]$ sqoop create-hive-table --connect jdbc:mysql://master:3306/sample --table student --username root --password Password123$ --hive-table test
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 16:57:42 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 16:57:42 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 16:57:42 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
- 22/04/29 16:57:42 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
- 22/04/29 16:57:42 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- Fri Apr 29 16:57:42 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 16:57:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- 22/04/29 16:57:43 INFO hive.HiveImport: Loading uploaded data into Hive
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- 22/04/29 16:57:46 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- 22/04/29 16:57:46 INFO hive.HiveImport:
- 22/04/29 16:57:46 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
- 22/04/29 16:57:47 INFO hive.HiveImport: Fri Apr 29 16:57:47 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:47 INFO hive.HiveImport: Fri Apr 29 16:57:47 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:47 INFO hive.HiveImport: Fri Apr 29 16:57:47 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:47 INFO hive.HiveImport: Fri Apr 29 16:57:47 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:48 INFO hive.HiveImport: Fri Apr 29 16:57:48 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:48 INFO hive.HiveImport: Fri Apr 29 16:57:48 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:48 INFO hive.HiveImport: Fri Apr 29 16:57:48 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:48 INFO hive.HiveImport: Fri Apr 29 16:57:48 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 16:57:50 INFO hive.HiveImport: OK
- 22/04/29 16:57:50 INFO hive.HiveImport: Time taken: 0.853 seconds
- 22/04/29 16:57:51 INFO hive.HiveImport: Hive import complete.
- # If the output above ends with "hive.HiveImport: Hive import complete.", the import succeeded.
- [hadoop@master ~]$ sqoop import --connect jdbc:mysql://master:3306/sample --username root --password Password123$ --table student --fields-terminated-by '|' --delete-target-dir --num-mappers 1 --hive-import --hive-database default --hive-table test
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 17:00:06 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 17:00:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 17:00:06 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- 22/04/29 17:00:06 INFO tool.CodeGenTool: Beginning code generation
- Fri Apr 29 17:00:06 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:00:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:00:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/src/hadoop
- Note: /tmp/sqoop-hadoop/compile/556af862aa5bc04a542c14f0741f7dc6/student.java uses or overrides a deprecated API.
- Note: Recompile with -Xlint:deprecation for details.
- 22/04/29 17:00:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/556af862aa5bc04a542c14f0741f7dc6/student.jar
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- 22/04/29 17:00:07 INFO tool.ImportTool: Destination directory student is not present, hence not deleting.
- 22/04/29 17:00:07 WARN manager.MySQLManager: It looks like you are importing from mysql.
- 22/04/29 17:00:07 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
- 22/04/29 17:00:07 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
- 22/04/29 17:00:07 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
- 22/04/29 17:00:07 INFO mapreduce.ImportJobBase: Beginning import of student
- 22/04/29 17:00:07 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
- 22/04/29 17:00:07 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
- 22/04/29 17:00:07 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
- Fri Apr 29 17:00:09 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:09 INFO db.DBInputFormat: Using read commited transaction isolation
- 22/04/29 17:00:09 INFO mapreduce.JobSubmitter: number of splits:1
- 22/04/29 17:00:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1651221174197_0003
- 22/04/29 17:00:09 INFO impl.YarnClientImpl: Submitted application application_1651221174197_0003
- 22/04/29 17:00:09 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1651221174197_0003/
- 22/04/29 17:00:09 INFO mapreduce.Job: Running job: job_1651221174197_0003
- 22/04/29 17:00:13 INFO mapreduce.Job: Job job_1651221174197_0003 running in uber mode : false
- 22/04/29 17:00:13 INFO mapreduce.Job: map 0% reduce 0%
- 22/04/29 17:00:17 INFO mapreduce.Job: map 100% reduce 0%
- 22/04/29 17:00:17 INFO mapreduce.Job: Job job_1651221174197_0003 completed successfully
- 22/04/29 17:00:17 INFO mapreduce.Job: Counters: 30
- File System Counters
- FILE: Number of bytes read=0
- FILE: Number of bytes written=134261
- FILE: Number of read operations=0
- FILE: Number of large read operations=0
- FILE: Number of write operations=0
- HDFS: Number of bytes read=87
- HDFS: Number of bytes written=30
- HDFS: Number of read operations=4
- HDFS: Number of large read operations=0
- HDFS: Number of write operations=2
- Job Counters
- Launched map tasks=1
- Other local map tasks=1
- Total time spent by all maps in occupied slots (ms)=1731
- Total time spent by all reduces in occupied slots (ms)=0
- Total time spent by all map tasks (ms)=1731
- Total vcore-seconds taken by all map tasks=1731
- Total megabyte-seconds taken by all map tasks=1772544
- Map-Reduce Framework
- Map input records=3
- Map output records=3
- Input split bytes=87
- Spilled Records=0
- Failed Shuffles=0
- Merged Map outputs=0
- GC time elapsed (ms)=35
- CPU time spent (ms)=1010
- Physical memory (bytes) snapshot=179433472
- Virtual memory (bytes) snapshot=2137202688
- Total committed heap usage (bytes)=88604672
- File Input Format Counters
- Bytes Read=0
- File Output Format Counters
- Bytes Written=30
- 22/04/29 17:00:17 INFO mapreduce.ImportJobBase: Transferred 30 bytes in 9.8777 seconds (3.0371 bytes/sec)
- 22/04/29 17:00:17 INFO mapreduce.ImportJobBase: Retrieved 3 records.
- 22/04/29 17:00:17 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners for table student
- Fri Apr 29 17:00:17 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:00:17 INFO hive.HiveImport: Loading uploaded data into Hive
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- 22/04/29 17:00:20 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- 22/04/29 17:00:20 INFO hive.HiveImport:
- 22/04/29 17:00:20 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
- 22/04/29 17:00:21 INFO hive.HiveImport: Fri Apr 29 17:00:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:21 INFO hive.HiveImport: Fri Apr 29 17:00:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:21 INFO hive.HiveImport: Fri Apr 29 17:00:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:21 INFO hive.HiveImport: Fri Apr 29 17:00:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:23 INFO hive.HiveImport: Fri Apr 29 17:00:23 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:23 INFO hive.HiveImport: Fri Apr 29 17:00:23 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:23 INFO hive.HiveImport: Fri Apr 29 17:00:23 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:23 INFO hive.HiveImport: Fri Apr 29 17:00:23 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:00:24 INFO hive.HiveImport: OK
- 22/04/29 17:00:24 INFO hive.HiveImport: Time taken: 0.713 seconds
- 22/04/29 17:00:24 INFO hive.HiveImport: Loading data to table default.test
- 22/04/29 17:00:25 INFO hive.HiveImport: OK
- 22/04/29 17:00:25 INFO hive.HiveImport: Time taken: 0.42 seconds
- 22/04/29 17:00:25 INFO hive.HiveImport: Hive import complete.
- 22/04/29 17:00:25 INFO hive.HiveImport: Export directory is contains the _SUCCESS file only, removing the directory.
- hive> show tables;
- OK
- test
- Time taken: 0.558 seconds, Fetched: 1 row(s)
- hive> exit;
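The two success markers in the Sqoop console output above ("Retrieved N records." and "Hive import complete.") can also be checked by a script instead of by eye. The following is a minimal sketch, assuming the console output has been captured into a string; the function name `summarize_sqoop_log` is hypothetical and not part of Sqoop.

```python
import re


def summarize_sqoop_log(log_text):
    """Return (records_retrieved, hive_import_complete) from captured Sqoop output.

    Hypothetical helper: it only searches for the two log messages shown in
    the transcript above and knows nothing else about Sqoop.
    """
    match = re.search(r"Retrieved (\d+) records\.", log_text)
    records = int(match.group(1)) if match else None
    complete = "Hive import complete." in log_text
    return records, complete


# Two lines copied from the transcript above.
sample_log = (
    "22/04/29 17:00:17 INFO mapreduce.ImportJobBase: Retrieved 3 records.\n"
    "22/04/29 17:00:25 INFO hive.HiveImport: Hive import complete.\n"
)
print(summarize_sqoop_log(sample_log))
```

A missing "Retrieved" line yields `None` for the record count, which makes a failed or interrupted import easy to tell apart from an import of zero rows.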
- # Import the contents of the MySQL table into files on HDFS
- [hadoop@master ~]$ sqoop import --connect jdbc:mysql://master:3306/sample --username root --password Password123$ --table student --num-mappers 1 --target-dir /user/test
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/04/29 17:03:13 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/04/29 17:03:13 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/04/29 17:03:13 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- 22/04/29 17:03:13 INFO tool.CodeGenTool: Beginning code generation
- Fri Apr 29 17:03:14 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:03:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:03:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
- 22/04/29 17:03:14 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/src/hadoop
- Note: /tmp/sqoop-hadoop/compile/eab748b8f3fb956072f4877fdf4bf23a/student.java uses or overrides a deprecated API.
- Note: Recompile with -Xlint:deprecation for details.
- 22/04/29 17:03:15 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/eab748b8f3fb956072f4877fdf4bf23a/student.jar
- 22/04/29 17:03:15 WARN manager.MySQLManager: It looks like you are importing from mysql.
- 22/04/29 17:03:15 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
- 22/04/29 17:03:15 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
- 22/04/29 17:03:15 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
- 22/04/29 17:03:15 INFO mapreduce.ImportJobBase: Beginning import of student
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- 22/04/29 17:03:15 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
- 22/04/29 17:03:15 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
- 22/04/29 17:03:15 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
- Fri Apr 29 17:03:17 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- 22/04/29 17:03:17 INFO db.DBInputFormat: Using read commited transaction isolation
- 22/04/29 17:03:17 INFO mapreduce.JobSubmitter: number of splits:1
- 22/04/29 17:03:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1651221174197_0004
- 22/04/29 17:03:17 INFO impl.YarnClientImpl: Submitted application application_1651221174197_0004
- 22/04/29 17:03:17 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1651221174197_0004/
- 22/04/29 17:03:17 INFO mapreduce.Job: Running job: job_1651221174197_0004
- 22/04/29 17:03:21 INFO mapreduce.Job: Job job_1651221174197_0004 running in uber mode : false
- 22/04/29 17:03:21 INFO mapreduce.Job: map 0% reduce 0%
- 22/04/29 17:03:25 INFO mapreduce.Job: map 100% reduce 0%
- 22/04/29 17:03:25 INFO mapreduce.Job: Job job_1651221174197_0004 completed successfully
- 22/04/29 17:03:25 INFO mapreduce.Job: Counters: 30
- File System Counters
- FILE: Number of bytes read=0
- FILE: Number of bytes written=134251
- FILE: Number of read operations=0
- FILE: Number of large read operations=0
- FILE: Number of write operations=0
- HDFS: Number of bytes read=87
- HDFS: Number of bytes written=30
- HDFS: Number of read operations=4
- HDFS: Number of large read operations=0
- HDFS: Number of write operations=2
- Job Counters
- Launched map tasks=1
- Other local map tasks=1
- Total time spent by all maps in occupied slots (ms)=1945
- Total time spent by all reduces in occupied slots (ms)=0
- Total time spent by all map tasks (ms)=1945
- Total vcore-seconds taken by all map tasks=1945
- Total megabyte-seconds taken by all map tasks=1991680
- Map-Reduce Framework
- Map input records=3
- Map output records=3
- Input split bytes=87
- Spilled Records=0
- Failed Shuffles=0
- Merged Map outputs=0
- GC time elapsed (ms)=69
- CPU time spent (ms)=1050
- Physical memory (bytes) snapshot=179068928
- Virtual memory (bytes) snapshot=2136522752
- Total committed heap usage (bytes)=88604672
- File Input Format Counters
- Bytes Read=0
- File Output Format Counters
- Bytes Written=30
- 22/04/29 17:03:25 INFO mapreduce.ImportJobBase: Transferred 30 bytes in 10.2361 seconds (2.9308 bytes/sec)
- 22/04/29 17:03:25 INFO mapreduce.ImportJobBase: Retrieved 3 records.
- # After running the command above, open master_ip:50070 in a browser and click "Browse the file system" under Utilities; if the user directory is listed, the import succeeded.
- [hadoop@master ~]$ hdfs dfs -ls /user/test
- Found 2 items
- -rw-r--r-- 2 hadoop supergroup 0 2022-04-29 17:03 /user/test/_SUCCESS
- -rw-r--r-- 2 hadoop supergroup 30 2022-04-29 17:03 /user/test/part-m-00000
- [hadoop@master ~]$ hdfs dfs -cat /user/test/part-m-00000
- 01,zhangsan
- 02,lisi
- 03,wangwu
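The imported file is plain comma-separated text with one row per line, which is the layout this Sqoop run produced when no `--fields-terminated-by` option was given. A minimal sketch of reading it back into records follows; the function name and the field names `student_id` and `name` are assumptions made for illustration, since the table schema is not shown here.

```python
def parse_student_lines(text, delimiter=","):
    """Split each non-empty line into a (student_id, name) tuple.

    The field names are illustrative assumptions; only the two-column
    layout is taken from the file contents shown above.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        student_id, name = line.split(delimiter, 1)
        records.append((student_id, name))
    return records


# Contents of /user/test/part-m-00000 as printed by `hdfs dfs -cat` above.
sample_rows = "01,zhangsan\n02,lisi\n03,wangwu\n"
print(parse_student_lines(sample_rows))
print(len(sample_rows.encode("utf-8")), "bytes")
```

The byte count of the sample equals the "Bytes Written=30" counter and the 30-byte size that `hdfs dfs -ls` reports for part-m-00000.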
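Chapter 10: Flume Component Installation and Configuration
Experiment 1: Flume Component Installation and Configuration
1.1. Objectives
After completing this experiment, you should be able to:
- Download and extract Flume
- Deploy the Flume component
- Send and receive data with Flume
1.2. Requirements
- Understand the basics of Flume
- Be familiar with how Flume is used
- Be familiar with Flume component configuration
1.3. Procedure
1.3.1. Task 1: Download and Extract Flume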
As the root user, extract the Flume package into /usr/local/src and rename the extracted folder to flume.
- [root@master ~]# tar xf /opt/software/apache-flume-1.6.0-bin.tar.gz -C /usr/local/src/
- [root@master ~]# cd /usr/local/src/
- [root@master src]# mv apache-flume-1.6.0-bin/ flume
- [root@master src]# chown -R hadoop:hadoop /usr/local/src/
1.3.2. Task 2: Deploy the Flume Component
1.3.2.1. Step 1: As the root user, set the Flume environment variables so that they take effect for all users.
- [root@master src]# vim /etc/profile.d/flume.sh
- export FLUME_HOME=/usr/local/src/flume
- export PATH=${FLUME_HOME}/bin:$PATH
1.3.2.2. Step 2: Modify the Flume configuration files.
First, switch to the hadoop user and change the working directory to the Flume configuration folder.
- [hadoop@master ~]$ echo $PATH
- /usr/local/src/hbase/bin:/usr/local/src/zookeeper/bin:/usr/local/src/sqoop/bin:/usr/local/src/hive/bin:/usr/local/src/hbase/bin:/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/flume/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/src/hive/bin:/home/hadoop/.local/bin:/home/hadoop/bin
1.3.2.3. Step 3: Modify and configure the flume-env.sh file.
- [hadoop@master ~]$ vim /usr/local/src/hbase/conf/hbase-env.sh
- #export HBASE_CLASSPATH=/usr/local/src/hadoop/etc/hadoop/ # comment out this line
- export JAVA_HOME=/usr/local/src/jdk
- [hadoop@master conf]$ start-all.sh
- [hadoop@master ~]$ flume-ng version
- Flume 1.6.0
- Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
- Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
- Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
- From source with checksum b29e416802ce9ece3269d34233baf43f
1.3.3. Task 3: Send and Receive Data with Flume
Use Flume to transfer data from the web server into HDFS.
1.3.3.1. Step 1: Create the simple-hdfs-flume.conf file in the Flume installation directory.
- [hadoop@master ~]$ cd /usr/local/src/flume/
- [hadoop@master flume]$ vi /usr/local/src/flume/simple-hdfs-flume.conf
- a1.sources=r1
- a1.sinks=k1
- a1.channels=c1
- a1.sources.r1.type=spooldir
- a1.sources.r1.spoolDir=/usr/local/src/hadoop/logs/
- a1.sources.r1.fileHeader=true
- a1.sinks.k1.type=hdfs
- a1.sinks.k1.hdfs.path=hdfs://master:9000/tmp/flume
- a1.sinks.k1.hdfs.rollSize=1048760
- a1.sinks.k1.hdfs.rollCount=0
- a1.sinks.k1.hdfs.rollInterval=900
- a1.sinks.k1.hdfs.useLocalTimeStamp=true
- a1.channels.c1.type=file
- a1.channels.c1.capacity=1000
- a1.channels.c1.transactionCapacity=100
- a1.sources.r1.channels = c1
- a1.sinks.k1.channel = c1
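A common mistake in a Flume agent file is binding a source or sink to a channel name that was never declared. The sketch below checks that wiring for the agent `a1` before the agent is started. It is a plain text check written for illustration, not a Flume API; the helper names `parse_properties` and `check_wiring` are hypothetical.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent="a1"):
    """Return a list of problems; an empty list means every binding resolves."""
    channels = set(props.get(f"{agent}.channels", "").split())
    problems = []
    for source in props.get(f"{agent}.sources", "").split():
        bound = set(props.get(f"{agent}.sources.{source}.channels", "").split())
        if not bound or not bound <= channels:
            problems.append(f"source {source}: channels {sorted(bound)} not declared")
    for sink in props.get(f"{agent}.sinks", "").split():
        bound = props.get(f"{agent}.sinks.{sink}.channel", "")
        if bound not in channels:
            problems.append(f"sink {sink}: channel {bound!r} not declared")
    return problems


# The wiring-related lines of simple-hdfs-flume.conf from the step above.
sample_conf = """
a1.sources=r1
a1.sinks=k1
a1.channels=c1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""
print(check_wiring(parse_properties(sample_conf)))
```

Note the asymmetry the check relies on: a source uses the plural key `channels` because it may feed several channels, while a sink uses the singular key `channel` because it drains exactly one.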
1.3.3.2. Step 2: Use the flume-ng agent command to load the simple-hdfs-flume.conf configuration and start Flume to transfer data.
- [hadoop@master flume]$ flume-ng agent --conf-file simple-hdfs-flume.conf --name a1
Press Ctrl+C to stop the Flume transfer.
1.3.3.3. Step 3: Check the files that Flume transferred to HDFS. If data files appear under /tmp/flume on HDFS, the transfer succeeded.
- [hadoop@master ~]$ hdfs dfs -ls /
- Found 5 items
- drwxr-xr-x - hadoop supergroup 0 2022-04-15 22:04 /hbase
- drwxr-xr-x - hadoop supergroup 0 2022-04-02 18:24 /input
- drwxr-xr-x - hadoop supergroup 0 2022-04-02 18:26 /output
- drwxr-xr-x - hadoop supergroup 0 2022-05-06 17:24 /tmp
- drwxr-xr-x - hadoop supergroup 0 2022-04-29 17:03 /user
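Chapter 13: Big Data Platform Monitoring Commands
Experiment 1: Monitoring the Running State of the Big Data Platform with Commands
1.1. Objectives
After completing this experiment, you should be able to:
- Understand the running state of the big data platform
- Use the commands for checking the running state of the big data platform
1.2. Requirements
- Be familiar with the ways to check the running state of the big data platform
- Understand the commands for checking the running state of the big data platform
1.3. Procedure
1.3.1. Task 1: Check the Platform State with Commands
1.3.1.1. Step 1: View Linux system information (uname -a)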
- [root@master ~]# uname -a
- Linux master 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
1.3.1.2. Step 2: View disk information
(1) List all partitions (fdisk -l)
- [root@master ~]# fdisk -l
- Disk /dev/sda: 21.5 GB, 21474836480 bytes, 41943040 sectors
- Units = sectors of 1 * 512 = 512 bytes
- Sector size (logical/physical): 512 bytes / 512 bytes
- I/O size (minimum/optimal): 512 bytes / 512 bytes
- Disk label type: dos
- Disk identifier: 0x00096169
- Device Boot Start End Blocks Id System
- /dev/sda1 * 2048 2099199 1048576 83 Linux
- /dev/sda2 2099200 41943039 19921920 8e Linux LVM
- Disk /dev/mapper/centos-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
- Units = sectors of 1 * 512 = 512 bytes
- Sector size (logical/physical): 512 bytes / 512 bytes
- I/O size (minimum/optimal): 512 bytes / 512 bytes
- Disk /dev/mapper/centos-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
- Units = sectors of 1 * 512 = 512 bytes
- Sector size (logical/physical): 512 bytes / 512 bytes
- I/O size (minimum/optimal): 512 bytes / 512 bytes
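The figures in the fdisk output are related by simple arithmetic: a disk's size in bytes is its sector count times the 512-byte sector size, and the Blocks column counts 1 KiB units between a partition's start and end sectors (inclusive). The short check below reproduces the numbers printed above; the variable and function names are chosen for illustration only.

```python
SECTOR_BYTES = 512  # "Units = sectors of 1 * 512 = 512 bytes" in the output above

# (sector count, byte size) pairs as reported by fdisk -l above.
disks = {
    "/dev/sda": (41943040, 21474836480),
    "/dev/mapper/centos-root": (35643392, 18249416704),
    "/dev/mapper/centos-swap": (4194304, 2147483648),
}


def partition_blocks(start_sector, end_sector):
    """Size of a partition in 1 KiB blocks; both sectors are inclusive."""
    sectors = end_sector - start_sector + 1
    return sectors * SECTOR_BYTES // 1024


for name, (sectors, reported_bytes) in disks.items():
    computed = sectors * SECTOR_BYTES
    print(name, computed == reported_bytes, f"{computed / 1e9:.1f} GB")

print("/dev/sda1 blocks:", partition_blocks(2048, 2099199))
print("/dev/sda2 blocks:", partition_blocks(2099200, 41943039))
```

The sizes are divided by 10^9 rather than 2^30, which is why the 20 GiB virtual disk is reported as 21.5 GB.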
(2) List all swap areas (swapon -s)
- [root@master ~]# swapon -s
- Filename Type Size Used Priority
- /dev/dm-1 partition 2097148 0 -
(3) View file system usage (df -h)
- [root@master ~]# df -h
- Filesystem Size Used Avail Use% Mounted on
- /dev/mapper/centos-root 17G 4.8G 13G 28% /
- devtmpfs 980M 0 980M 0% /dev
- tmpfs 992M 0 992M 0% /dev/shm
- tmpfs 992M 9.5M 982M 1% /run
- tmpfs 992M 0 992M 0% /sys/fs/cgroup
- /dev/sda1 1014M 130M 885M 13% /boot
- tmpfs 199M 0 199M 0% /run/user/0
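For routine monitoring, the Use% column is usually the one that matters. The sketch below, assuming the output of `df -h` has been captured as text, lists the mount points at or above a chosen usage threshold. The function name `filesystems_over` is hypothetical, and the parsing assumes mount points without spaces.

```python
def filesystems_over(df_output, threshold_percent):
    """Return (mount_point, use_percent) pairs at or above the threshold."""
    flagged = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        use_percent = int(fields[4].rstrip("%"))
        if use_percent >= threshold_percent:
            flagged.append((fields[5], use_percent))
    return flagged


# Output of `df -h` copied from the transcript above.
sample_df = """Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 17G 4.8G 13G 28% /
devtmpfs 980M 0 980M 0% /dev
tmpfs 992M 0 992M 0% /dev/shm
tmpfs 992M 9.5M 982M 1% /run
tmpfs 992M 0 992M 0% /sys/fs/cgroup
/dev/sda1 1014M 130M 885M 13% /boot
tmpfs 199M 0 199M 0% /run/user/0
"""
print(filesystems_over(sample_df, 20))
```

On a live node, the same text could be obtained with `subprocess.run(["df", "-h"], capture_output=True, text=True).stdout`.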
1.3.1.3. Step 3: View the network IP address (ifconfig)
- [root@master ~]# ifconfig
- ens32: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
- inet 10.10.10.128 netmask 255.255.255.0 broadcast 10.10.10.255
- inet6 fe80::af34:1702:3972:2b64 prefixlen 64 scopeid 0x20<link>
- ether 00:0c:29:2e:33:83 txqueuelen 1000 (Ethernet)
- RX packets 342 bytes 29820 (29.1 KiB)
- RX errors 0 dropped 0 overruns 0 frame 0
- TX packets 257 bytes 26394 (25.7 KiB)
- TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
- lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
- inet 127.0.0.1 netmask 255.0.0.0
- inet6 ::1 prefixlen 128 scopeid 0x10<host>
- loop txqueuelen 1000 (Local Loopback)
- RX packets 4 bytes 360 (360.0 B)
- RX errors 0 dropped 0 overruns 0 frame 0
- TX packets 4 bytes 360 (360.0 B)
- TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
1.3.1.4. Step 4: List all listening ports (netstat -lntp)
- [root@master ~]# netstat -lntp
- Active Internet connections (only servers)
- Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
- tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 933/sshd
- tcp6 0 0 :::3306 :::* LISTEN 1021/mysqld
- tcp6 0 0 :::22 :::* LISTEN 933/sshd
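The listening-port list is a quick way to see which services are up. The sketch below parses captured `netstat -lntp` output into a port-to-program mapping and reports which of a set of expected ports are absent. The ports 50070, 8088 and 9000 are the ones that appear elsewhere in this document (the NameNode web page, the YARN job tracking URL and the hdfs:// address); the function name `listening_ports` is hypothetical.

```python
def listening_ports(netstat_output):
    """Map each listening TCP port to the set of program names bound to it."""
    ports = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 7 or not fields[0].startswith("tcp") or fields[5] != "LISTEN":
            continue
        port = int(fields[3].rsplit(":", 1)[1])  # works for 0.0.0.0:22 and :::22
        program = fields[6].split("/", 1)[1] if "/" in fields[6] else fields[6]
        ports.setdefault(port, set()).add(program)
    return ports


# Output of `netstat -lntp` copied from the transcript above.
sample_netstat = """Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 933/sshd
tcp6 0 0 :::3306 :::* LISTEN 1021/mysqld
tcp6 0 0 :::22 :::* LISTEN 933/sshd
"""
found = listening_ports(sample_netstat)
expected_hadoop_ports = {50070, 8088, 9000}
print(found)
print("missing:", sorted(expected_hadoop_ports - set(found)))
```

In the sample, only sshd and mysqld are listening, so all three Hadoop ports are reported missing; this matches the transcript, which was captured before Hadoop is started later in this chapter.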
1.3.1.5. Step 5: List all TCP connections, including established ones (netstat -antp)
- [root@master ~]# netstat -antp
- Active Internet connections (servers and established)
- Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
- tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 933/sshd
- tcp 0 52 10.10.10.128:22 10.10.10.1:59963 ESTABLISHED 1249/sshd: root@pts
- tcp6 0 0 :::3306 :::* LISTEN 1021/mysqld
- tcp6 0 0 :::22 :::* LISTEN 933/sshd
1.3.1.6. Step 6: Display process status in real time (top). This command shows, among other things, each process's share of CPU and memory.
- [root@master ~]# top
- top - 16:09:46 up 47 min, 2 users, load average: 0.00, 0.01, 0.05
- Tasks: 115 total, 1 running, 114 sleeping, 0 stopped, 0 zombie
- %Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- KiB Mem : 2030172 total, 1575444 free, 281296 used, 173432 buff/cache
- KiB Swap: 2097148 total, 2097148 free, 0 used. 1571928 avail Mem
- PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
- 1021 mysql 20 0 1258940 191544 6840 S 0.3 9.4 0:01.71 mysqld
- 1 root 20 0 125456 3896 2560 S 0.0 0.2 0:00.96 systemd
- 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
- 3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
- 5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
- 7 root rt 0 0 0 0 S 0.0 0.0 0:00.02 migration/0
- 8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
- 9 root 20 0 0 0 0 S 0.0 0.0 0:00.15 rcu_sched
- 10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain
- 11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
- 12 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
- 13 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
- 14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/1
- 16 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
- 17 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/2
- 18 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/2
- 19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/2
1.3.1.7. Step 7: View CPU information (cat /proc/cpuinfo)
1.3.1.8. Step 8: View memory information (cat /proc/meminfo). This command shows total memory, free memory, and related figures.
- [root@master ~]# cat /proc/meminfo
- MemTotal: 2030172 kB
- MemFree: 1575448 kB
- MemAvailable: 1571932 kB
- Buffers: 2112 kB
- Cached: 126676 kB
- SwapCached: 0 kB
- Active: 251708 kB
- Inactive: 100540 kB
- Active(anon): 223876 kB
- Inactive(anon): 9252 kB
- Active(file): 27832 kB
- Inactive(file): 91288 kB
- Unevictable: 0 kB
- Mlocked: 0 kB
- SwapTotal: 2097148 kB
- SwapFree: 2097148 kB
- Dirty: 0 kB
- Writeback: 0 kB
- AnonPages: 223648 kB
- Mapped: 28876 kB
- Shmem: 9668 kB
- Slab: 44644 kB
- SReclaimable: 18208 kB
- SUnreclaim: 26436 kB
- KernelStack: 4512 kB
- PageTables: 4056 kB
- NFS_Unstable: 0 kB
- Bounce: 0 kB
- WritebackTmp: 0 kB
- CommitLimit: 3112232 kB
- Committed_AS: 782724 kB
- VmallocTotal: 34359738367 kB
- VmallocUsed: 180220 kB
- VmallocChunk: 34359310332 kB
- HardwareCorrupted: 0 kB
- AnonHugePages: 178176 kB
- CmaTotal: 0 kB
- CmaFree: 0 kB
- HugePages_Total: 0
- HugePages_Free: 0
- HugePages_Rsvd: 0
- HugePages_Surp: 0
- Hugepagesize: 2048 kB
- DirectMap4k: 63360 kB
- DirectMap2M: 2033664 kB
- DirectMap1G: 0 kB
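The raw /proc/meminfo listing is long, while a monitoring check usually needs a single figure. The following is a minimal sketch, not part of the original lab, that derives the used-memory percentage from MemTotal and MemAvailable. The function name mem_used_pct is made up for illustration, and the sketch assumes a kernel that reports MemAvailable, as the listing above does.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print the
# percentage of memory in use, computed as (MemTotal - MemAvailable) / MemTotal.
mem_used_pct() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END {
      if (total > 0) printf "%.1f\n", (total - avail) * 100 / total
      else exit 1   # no MemTotal line found
    }
  '
}

# Usage on a live node:
#   mem_used_pct < /proc/meminfo
```

With the values shown above (MemTotal 2030172 kB, MemAvailable 1571932 kB) the function prints 22.6.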
1.3.2. Task 2: Check Hadoop status from the command line
1.3.2.1. Step 1: Switch to the hadoop user
If the current user is root, switch to the hadoop user before continuing.
- [root@master ~]# su - hadoop
- Last login: Tue May 10 14:33:03 CST 2022 on pts/0
- [hadoop@master ~]$
1.3.2.2. Step 2: Change to the Hadoop installation directory
- [hadoop@master ~]$ cd /usr/local/src/hadoop/
- [hadoop@master hadoop]$
1.3.2.3. Step 3: Start Hadoop
- [hadoop@master hadoop]$ start-all.sh
- This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
- Starting namenodes on [master]
- master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
- 10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
- 10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
- Starting secondary namenodes [0.0.0.0]
- 0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
- starting yarn daemons
- starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
- 10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
- 10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
- [hadoop@master hadoop]$ jps
- 1697 SecondaryNameNode
- 2115 Jps
- 1865 ResourceManager
- 1498 NameNode
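Reading the jps listing by eye works for one node, but the same check can be scripted. The sketch below is an illustration rather than part of the original lab: check_daemons is a hypothetical helper that reads jps-style output on stdin and reports any expected daemon name that is missing.

```shell
# Hypothetical helper: verify that every daemon name given as an argument
# appears in the second column of jps-style output read from stdin.
# Prints "missing: <name>" for each absent daemon and returns non-zero if any.
check_daemons() {
  local listing missing=0 name
  listing=$(cat)
  for name in "$@"; do
    if ! printf '%s\n' "$listing" |
         awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the master node:
#   jps | check_daemons NameNode SecondaryNameNode ResourceManager
```

On the slave nodes the expected names would instead be DataNode and NodeManager, matching the daemons that start-all.sh reports starting there.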
1.3.2.4. Step 4: Stop Hadoop
- [hadoop@master hadoop]$ stop-all.sh
- This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
- Stopping namenodes on [master]
- master: stopping namenode
- 10.10.10.130: stopping datanode
- 10.10.10.129: stopping datanode
- Stopping secondary namenodes [0.0.0.0]
- 0.0.0.0: stopping secondarynamenode
- stopping yarn daemons
- stopping resourcemanager
- 10.10.10.129: stopping nodemanager
- 10.10.10.130: stopping nodemanager
- no proxyserver to stop
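Experiment 2: Monitoring big data platform resource status from the command line
2.1 Objectives
After completing this experiment, you should be able to:
- Understand the running state of big data platform resources
- Use the commands for checking the running state of big data platform resources
2.2. Requirements
- Be familiar with the ways to check the running state of big data platform resources
- Know the commands for checking the running state of big data platform resources
2.3. Procedure
2.3.1. Task 1: Check YARN status from the command line
2.3.1.1. Step 1: Make sure you are in the /usr/local/src/hadoop directory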
- [hadoop@master ~]$ cd /usr/local/src/hadoop/
- [hadoop@master hadoop]$
2.3.1.2. Step 2: Go back to the Master host and run start-all.sh
- [hadoop@master ~]$ start-all.sh
- This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
- Starting namenodes on [master]
- master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
- 10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
- 10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
- Starting secondary namenodes [0.0.0.0]
- 0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
- starting yarn daemons
- starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
- 10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
- 10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
- [hadoop@master ~]$
- # Start ZooKeeper on the master node
- [hadoop@master hadoop]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- # Start ZooKeeper on the slave1 node
- [hadoop@slave1 hadoop]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
- # Start ZooKeeper on the slave2 node
- [hadoop@slave2 hadoop]$ zkServer.sh start
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
2.3.1.3. Step 3: Run the jps command. If the NodeManager and ResourceManager processes are listed on Master, YARN has started successfully.
- 2817 NameNode
- 3681 ResourceManager
- 3477 NodeManager
- 3909 Jps
- 2990 SecondaryNameNode
2.3.2. Task 2: Check HDFS status from the command line
2.3.2.1. Step 1: Directory operations
Change to the hadoop directory by running cd /usr/local/src/hadoop.
- [hadoop@master ~]$ cd /usr/local/src/hadoop
- [hadoop@master hadoop]$
List the HDFS root directory.
- [hadoop@master hadoop]$ ./bin/hdfs dfs -ls /
2.3.2.2. Step 2: View the HDFS report by running bin/hdfs dfsadmin -report
- [hadoop@master hadoop]$ bin/hdfs dfsadmin -report
- Configured Capacity: 36477861888 (33.97 GB)
- Present Capacity: 31767752704 (29.59 GB)
- DFS Remaining: 31767146496 (29.59 GB)
- DFS Used: 606208 (592 KB)
- DFS Used%: 0.00%
- Under replicated blocks: 0
- Blocks with corrupt replicas: 0
- Missing blocks: 0
- Missing blocks (with replication factor 1): 0
- -------------------------------------------------
- Live datanodes (2):
- Name: 10.10.10.129:50010 (node1)
- Hostname: node1
- Decommission Status : Normal
- Configured Capacity: 18238930944 (16.99 GB)
- DFS Used: 303104 (296 KB)
- Non DFS Used: 2379792384 (2.22 GB)
- DFS Remaining: 15858835456 (14.77 GB)
- DFS Used%: 0.00%
- DFS Remaining%: 86.95%
- Configured Cache Capacity: 0 (0 B)
- Cache Used: 0 (0 B)
- Cache Remaining: 0 (0 B)
- Cache Used%: 100.00%
- Cache Remaining%: 0.00%
- Xceivers: 1
- Last contact: Fri May 20 18:31:48 CST 2022
- Name: 10.10.10.130:50010 (node2)
- Hostname: node2
- Decommission Status : Normal
- Configured Capacity: 18238930944 (16.99 GB)
- DFS Used: 303104 (296 KB)
- Non DFS Used: 2330316800 (2.17 GB)
- DFS Remaining: 15908311040 (14.82 GB)
- DFS Used%: 0.00%
- DFS Remaining%: 87.22%
- Configured Cache Capacity: 0 (0 B)
- Cache Used: 0 (0 B)
- Cache Remaining: 0 (0 B)
- Cache Used%: 100.00%
- Cache Remaining%: 0.00%
- Xceivers: 1
- Last contact: Fri May 20 18:31:48 CST 2022
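The report is verbose, and two figures usually matter most: how many DataNodes are live and how much space each has left. The sketch below is an illustration under stated assumptions, not part of the original lab: live_datanodes and datanode_remaining are hypothetical helper names, and the parsing assumes the report layout shown above (Hadoop 2.7.1).

```shell
# Hypothetical helper: print the live DataNode count from
# `hdfs dfsadmin -report` output read on stdin.
live_datanodes() {
  sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

# Hypothetical helper: print "<hostname> <DFS Remaining%>" for each DataNode
# section of the report read on stdin.
datanode_remaining() {
  awk -F': *' '
    $1 == "Hostname"       { host = $2 }
    $1 == "DFS Remaining%" { if (host != "") print host, $2 }
  '
}

# Usage from the Hadoop installation directory:
#   bin/hdfs dfsadmin -report | live_datanodes
#   bin/hdfs dfsadmin -report | datanode_remaining
```

For the report above, the first command prints 2 and the second prints one line per node: node1 86.95% and node2 87.22%.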
2.3.2.3. Step 3: Check HDFS space usage by running hdfs dfs -df
- [hadoop@master hadoop]$ hdfs dfs -df
- Filesystem Size Used Available Use%
- hdfs://master:9000 36477861888 606208 31767146496 0%
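The Use% column is rounded to a whole percent, so a nearly empty file system shows 0%. As a rough sketch (hdfs_used_pct is a hypothetical helper, not part of the original lab), the byte counts in the Size and Used columns can be used to compute a finer figure:

```shell
# Hypothetical helper: read `hdfs dfs -df` output on stdin and print
# "<filesystem> <used percent>" with four decimal places, skipping the header.
hdfs_used_pct() {
  awk 'NR > 1 && $2 > 0 { printf "%s %.4f%%\n", $1, $3 * 100 / $2 }'
}

# Usage:
#   hdfs dfs -df | hdfs_used_pct
```

For the output above this prints hdfs://master:9000 0.0017%. If only human-readable sizes are wanted, hdfs dfs -df -h prints the same table with units.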
2.3.3. Task 3: Check HBase status from the command line
2.3.3.1. Step 1: Start HBase
Change to the HBase installation directory, /usr/local/src/hbase, and check the version:
- [hadoop@master hadoop]$ cd /usr/local/src/hbase
- [hadoop@master hbase]$ hbase version
- HBase 1.2.1
- Source code repository git://asf-dev/home/busbey/projects/hbase revision=8d8a7107dc4ccbf36a92f64675dc60392f85c015
- Compiled by busbey on Wed Mar 30 11:19:21 CDT 2016
- From source with checksum f4bb4a14bb4e0b72b46f729dae98a772
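The output shows HBase 1.2.1, so the installed version is 1.2.1. Note that hbase version only reports the installed release; it does not by itself confirm that the HBase service is running.
2.3.3.2. Step 2: View HBase version information
Run hbase shell to enter the HBase interactive shell.
- [hadoop@master hbase]$ hbase shell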
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- HBase Shell; enter 'help<RETURN>' for list of supported commands.
- Type "exit<RETURN>" to leave the HBase Shell
- Version 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016
Enter version to query the HBase version.
- hbase(main):001:0> version
- 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016
The output shows that the HBase version is 1.2.1.
2.3.3.3. Step 3: Query HBase status by running the status command in the HBase shell
- 1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
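The summary line is easy to check from a script, for example to raise an alert when any RegionServer is dead. The following sketch is illustrative only: hbase_dead_servers is a hypothetical helper, and the parsing assumes the summary format shown above.

```shell
# Hypothetical helper: read the HBase shell `status` summary on stdin and
# print the number of dead RegionServers.
hbase_dead_servers() {
  sed -n 's/.* \([0-9][0-9]*\) dead.*/\1/p'
}

# Usage, feeding the command to the HBase shell on stdin:
#   echo "status" | hbase shell 2>/dev/null | hbase_dead_servers
```

For the summary above the function prints 0.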
You can also query HBase status in the "simple" format by running status 'simple'.
- active master: master:16000 1589125905790
- 0 backup masters
- 3 live servers
- master:16020 1589125908065
- requestsPerSecond=0.0, numberOfOnlineRegions=1,
- usedHeapMB=28, maxHeapMB=1918, numberOfStores=1,
- numberOfStorefiles=1, storefileUncompressedSizeMB=0,
- storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0,
- readRequestsCount=5, writeRequestsCount=1, rootIndexSizeKB=0,
- totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0,
- totalCompactingKVs=0, currentCompactedKVs=0,
- compactionProgressPct=NaN, coprocessors=[MultiRowMutationEndpoint]
- slave1:16020 1589125915820
- requestsPerSecond=0.0, numberOfOnlineRegions=0,
- usedHeapMB=17, maxHeapMB=440, numberOfStores=0,
- numberOfStorefiles=0, storefileUncompressedSizeMB=0,
- storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0,
- readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0,
- totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0,
- totalCompactingKVs=0, currentCompactedKVs=0,
- compactionProgressPct=NaN, coprocessors=[]
- slave2:16020 1589125917741
- requestsPerSecond=0.0, numberOfOnlineRegions=1,
- usedHeapMB=15, maxHeapMB=440, numberOfStores=1,
- numberOfStorefiles=1, storefileUncompressedSizeMB=0,
- storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0,
- readRequestsCount=4, writeRequestsCount=0, rootIndexSizeKB=0,
- totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0,
- totalCompactingKVs=0, currentCompactedKVs=0,
- compactionProgressPct=NaN, coprocessors=[]
- 0 dead servers
- Aggregate load: 0, regions: 2
This form shows more detail about the Master, Slave1, and Slave2 hosts, such as service ports and request statistics.
For more ways to query HBase status, run help 'status'.
- hbase(main):004:0> help 'status'
- Show cluster status. Can be 'summary', 'simple', 'detailed', or 'replication'. The
- default is 'summary'. Examples:
- hbase> status
- hbase> status 'simple'
- hbase> status 'summary'
- hbase> status 'detailed'
- hbase> status 'replication'
- hbase> status 'replication', 'source'
- hbase> status 'replication', 'sink'
The output lists all forms of the status command.
2.3.3.4. Step 4: Stop the HBase service
To stop the HBase service, run stop-hbase.sh.
- [hadoop@master hbase]$ stop-hbase.sh
- stopping hbasecat.........
2.3.4. Task 4: Check Hive status from the command line
2.3.4.1. Step 1: Start Hive
Change to the /usr/local/src/hive directory, type hive, and press Enter.
- [hadoop@master ~]$ cd /usr/local/src/hive/
- [hadoop@master hive]$ hive
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
- hive>
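When the hive> prompt appears, Hive has started successfully and you are in the Hive shell.
2.3.4.2. Step 2: Basic Hive commands
Note: every statement on the Hive command line must end with a semicolon.
(1) List the databases
- hive> show databases;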
- OK
- default
- sample
- Time taken: 0.596 seconds, Fetched: 2 row(s)
- hive>
The output lists the built-in default database along with sample.
(2) List all tables in the default database
- hive> use default;
- OK
- Time taken: 0.018 seconds
- hive> show tables;
- OK
- test
- Time taken: 0.036 seconds, Fetched: 1 row(s)
- hive>
The output shows that the default database currently contains a single table, test.
(3) Create the table stu, with id as an integer and name as a string
- hive> create table stu(id int,name string);
- OK
- Time taken: 0.23 seconds
- hive>
(4) Insert one row into stu, with id 1001 and name zhangsan
- hive> insert into stu values (1001,"zhangsan");
- WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
- Query ID = hadoop_20220520185326_7c18630d-0690-4b35-8de8-423c9b901677
- Total jobs = 3
- Launching Job 1 out of 3
- Number of reduce tasks is set to 0 since there's no reduce operator
- Starting Job = job_1653042072571_0001, Tracking URL = http://master:8088/proxy/application_1653042072571_0001/
- Kill Command = /usr/local/src/hadoop/bin/hadoop job -kill job_1653042072571_0001
- Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
- 2022-05-20 18:56:05,436 Stage-1 map = 0%, reduce = 0%
- 2022-05-20 18:56:11,699 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.47 sec
- MapReduce Total cumulative CPU time: 3 seconds 470 msec
- Ended Job = job_1653042072571_0001
- Stage-4 is selected by condition resolver.
- Stage-3 is filtered out by condition resolver.
- Stage-5 is filtered out by condition resolver.
- Moving data to: hdfs://master:9000/user/hive/warehouse/stu/.hive-staging_hive_2022-05-20_18-55-52_567_2370673334190980235-1/-ext-10000
- Loading data to table default.stu
- MapReduce Jobs Launched:
- Stage-Stage-1: Map: 1 Cumulative CPU: 3.47 sec HDFS Read: 4138 HDFS Write: 81 SUCCESS
- Total MapReduce CPU Time Spent: 3 seconds 470 msec
- OK
- Time taken: 20.438 seconds
Following the same procedure, insert two more rows with ids 1002 and 1003 and names lisi and wangwu, respectively.
(5) List the tables after inserting data
- hive> show tables;
- OK
- stu
- test
- values__tmp__table__1
- Time taken: 0.017 seconds, Fetched: 3 row(s)
- hive>
(6) View the structure of the stu table
- hive> desc stu;
- OK
- id int
- name string
- Time taken: 0.031 seconds, Fetched: 2 row(s)
- hive>
(7) View the contents of the stu table
- hive> select * from stu;
- OK
- 1001 zhangsan
- Time taken: 0.077 seconds, Fetched: 1 row(s)
- hive>
2.3.4.3. Step 3: View file systems and command history from the Hive command line
(1) To list the local file system, run ! ls /usr/local/src;
- hive> ! ls /usr/local/src;
- apache-hive-2.0.0-bin
- flume
- hadoop
- hbase
- hive
- jdk
- sqoop
- zookeeper
(2) To list the HDFS file system, run dfs -ls /;
- hive> dfs -ls /;
- Found 5 items
- drwxr-xr-x - hadoop supergroup 0 2022-04-15 22:04 /hbase
- drwxr-xr-x - hadoop supergroup 0 2022-04-02 18:24 /input
- drwxr-xr-x - hadoop supergroup 0 2022-04-02 18:26 /output
- drwxr-xr-x - hadoop supergroup 0 2022-05-20 18:55 /tmp
- drwxr-xr-x - hadoop supergroup 0 2022-04-29 17:03 /user
(3) View the full history of commands entered in Hive
Go to the home directory of the current user hadoop, /home/hadoop, and view the .hivehistory file.
- [hadoop@master ~]$ cd /home/hadoop
- [hadoop@master ~]$ cat .hivehistory
- create database sample;
- use sample;
- create table student(number STRING,name STRING);
- exit;
- select * from sample.student;
- exit;
- show tables;
- exit;
- show databases;
- use default;
- show tables;
- create table stu(id int,name string);
- insert into stu values (1001,"zhangsan");
- show tables;
- desc stu;
- select * from stu;
- ! ls /usr/local/src;
- dfs -ls /;
- exit
- ;
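The output lists every command previously run in the Hive command-line interface, including mistyped ones, which helps with maintenance and troubleshooting.
Experiment 3: Monitoring big data platform service status from the command line
3.1. Objectives
After completing this experiment, you should be able to:
- Understand the running state of big data platform services
- Use the commands for checking the running state of big data platform services
3.2. Requirements
- Be familiar with the ways to check the running state of big data platform services
- Know the commands for checking the running state of big data platform services
3.3. Procedure
3.3.1. Task 1: Check ZooKeeper status from the command line
3.3.1.1. Step 1: Check ZooKeeper status by running zkServer.sh status. The output looks like this: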
- [hadoop@master ~]$ zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
- Mode: follower
In the output above, Mode: follower means this node is a follower in the ZooKeeper ensemble.
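In a three-node ensemble, exactly one node should report leader and the others follower, so it is useful to extract just the Mode value. The sketch below is an illustration and not part of the original lab: zk_mode is a hypothetical helper, and the commented loop additionally assumes passwordless SSH between the nodes and zkServer.sh on the remote PATH.

```shell
# Hypothetical helper: read `zkServer.sh status` output on stdin and print
# the node's role (for example leader, follower, or standalone).
zk_mode() {
  awk -F': *' '$1 == "Mode" { print $2 }'
}

# Usage on one node:
#   zkServer.sh status 2>/dev/null | zk_mode
#
# Usage across the cluster (assumes passwordless SSH to each host):
#   for h in master slave1 slave2; do
#     printf '%s: ' "$h"; ssh "$h" zkServer.sh status 2>/dev/null | zk_mode
#   done
```

For the status output above the function prints follower.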
3.3.1.2. Step 2: View the running processes
QuorumPeerMain is the startup entry class of a ZooKeeper ensemble node; it loads the configuration and starts the QuorumPeer thread.
Run jps to view the processes.
- [hadoop@master ~]$ jps
- 5029 Jps
- 3494 SecondaryNameNode
- 3947 QuorumPeerMain
- 3292 NameNode
- 3660 ResourceManager
3.3.1.3. Step 3: Once the ZooKeeper service has started, run zkCli.sh to connect to it.
- [hadoop@master ~]$ zkCli.sh
- Connecting to localhost:2181
- 2022-05-20 19:07:11,924 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT
- 2022-05-20 19:07:11,927 [myid:] - INFO [main:Environment@100] - Client environment:host.name=master
- 2022-05-20 19:07:11,927 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.8.0_152
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/local/src/jdk/jre
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/usr/local/src/zookeeper/bin/../build/classes:/usr/local/src/zookeeper/bin/../build/lib/*.jar:/usr/local/src/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/src/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/src/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/usr/local/src/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/local/src/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/src/zookeeper/bin/../zookeeper-3.4.8.jar:/usr/local/src/zookeeper/bin/../src/java/lib/*.jar:/usr/local/src/zookeeper/bin/../conf::/usr/local/src/sqoop/lib
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:os.version=3.10.0-862.el7.x86_64
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:user.name=hadoop
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/hadoop
- 2022-05-20 19:07:11,929 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/home/hadoop
- 2022-05-20 19:07:11,930 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@69d0a921
- Welcome to ZooKeeper!
- 2022-05-20 19:07:11,946 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
- JLine support is enabled
- 2022-05-20 19:07:11,984 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
- 2022-05-20 19:07:11,991 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x180e0fed4990001, negotiated timeout = 30000
- WATCHER::
- WatchedEvent state:SyncConnected type:None path:null
- [zk: localhost:2181(CONNECTED) 0]
3.3.1.4. Step 4: Use a watch to monitor the /hbase node. Whenever the content of /hbase changes, a notification is printed. To set the watch, run get /hbase 1.
- cZxid = 0x100000002
- ctime = Thu Apr 23 16:02:29 CST 2022
- mZxid = 0x100000002
- mtime = Thu Apr 23 16:02:29 CST 2022
- pZxid = 0x20000008d
- cversion = 26
- dataVersion = 0
- aclVersion = 0
- ephemeralOwner = 0x0
- dataLength = 0
- numChildren = 16
- [zk: localhost:2181(CONNECTED) 1] set /hbase value-update
- WATCHER::cZxid = 0x100000002
- WatchedEvent state:SyncConnected type:NodeDataChanged path:/hbase
- ctime = Thu Apr 23 16:02:29 CST 2022
- mZxid = 0x20000c6d3
- mtime = Fri May 15 15:03:41 CST 2022
- pZxid = 0x20000008d
- cversion = 26
- dataVersion = 1
- aclVersion = 0
- ephemeralOwner = 0x0
- dataLength = 12
- numChildren = 16
- [zk: localhost:2181(CONNECTED) 2] get /hbase
- value-update
- cZxid = 0x100000002
- ctime = Thu Apr 23 16:02:29 CST 2022
- mZxid = 0x20000c6d3
- mtime = Fri May 15 15:03:41 CST 2022
- pZxid = 0x20000008d
- cversion = 26
- dataVersion = 1
- aclVersion = 0
- ephemeralOwner = 0x0
- dataLength = 12
- numChildren = 16
- [zk: localhost:2181(CONNECTED) 3] quit
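The output shows that after set /hbase value-update runs, a NodeDataChanged event is printed and dataVersion changes from 0 to 1, which confirms that /hbase was being watched.
3.3.2. Task 2: Check Sqoop status from the command line
3.3.2.1. Step 1: Query the Sqoop version to verify that Sqoop runs correctly.
First change to the /usr/local/src/sqoop directory, then run ./bin/sqoop-version.
- [hadoop@master ~]$ cd /usr/local/src/sqoop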
- [hadoop@master sqoop]$ ./bin/sqoop-version
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/05/20 19:10:55 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- Sqoop 1.4.7
- git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
- Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
The output shows Sqoop 1.4.7, so the Sqoop version is 1.4.7 and the tool runs correctly.
3.3.2.2. Step 2: Test whether Sqoop can connect to the database
From the Sqoop directory, run bin/sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password Password123$. In this command, master:3306 is the database host name and port.
- [hadoop@master sqoop]$ bin/sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password Password123$
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/05/20 19:13:21 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- 22/05/20 19:13:21 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 22/05/20 19:13:21 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- Fri May 20 19:13:21 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
- information_schema
- hive
- mysql
- performance_schema
- sample
- sys
The output shows that Sqoop can connect to MySQL and list all databases on the Master host, such as information_schema, hive, mysql, performance_schema, and sys.
3.3.2.3. Step 3: Run sqoop help. Output like the following means Sqoop is working.
- [hadoop@master sqoop]$ sqoop help
- Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
- Please set $ACCUMULO_HOME to the root of your Accumulo installation.
- 22/05/20 19:14:48 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
- usage: sqoop COMMAND [ARGS]
- Available commands:
- codegen Generate code to interact with database records
- create-hive-table Import a table definition into Hive
- eval Evaluate a SQL statement and display the results
- export Export an HDFS directory to a database table
- help List available commands
- import Import a table from a database to HDFS
- import-all-tables Import tables from a database to HDFS
- import-mainframe Import datasets from a mainframe server to HDFS
- job Work with saved jobs
- list-databases List available databases on a server
- list-tables List available tables in a database
- merge Merge results of incremental imports
- metastore Run a standalone Sqoop metastore
- version Display version information
- See 'sqoop help COMMAND' for information on a specific command.
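The output lists Sqoop's commonly used commands and what each one does.
3.3.3. Task 3: Check Flume status from the command line
3.3.3.1. Step 1: Verify that Flume is installed correctly by running flume-ng version to view the Flume version.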
- [hadoop@master ~]$ cd /usr/local/src/flume
- [hadoop@master flume]$ flume-ng version
- Flume 1.6.0
- Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
- Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
- Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
- From source with checksum b29e416802ce9ece3269d34233baf43f
3.3.3.2. Step 2: Add example.conf to /usr/local/src/flume
- [hadoop@master flume]$ cat /usr/local/src/flume/example.conf
- a1.sources=r1
- a1.sinks=k1
- a1.channels=c1
- a1.sources.r1.type=spooldir
- a1.sources.r1.spoolDir=/usr/local/src/flume/
- a1.sources.r1.fileHeader=true
- a1.sinks.k1.type=hdfs
- a1.sinks.k1.hdfs.path=hdfs://master:9000/flume
- a1.sinks.k1.hdfs.rollSize=1048760
- a1.sinks.k1.hdfs.rollCount=0
- a1.sinks.k1.hdfs.rollInterval=900
- a1.sinks.k1.hdfs.useLocalTimeStamp=true
- a1.channels.c1.type=file
- a1.channels.c1.capacity=1000
- a1.channels.c1.transactionCapacity=100
- a1.sources.r1.channels = c1
- a1.sinks.k1.channel = c1
3.3.3.3. Step 3: Start Flume agent a1 with logs printed to the console
- [hadoop@master flume]$ /usr/local/src/flume/bin/flume-ng agent --conf ./conf --conf-file ./example.conf --name a1 -Dflume.root.logger=INFO,console
3.3.3.4. Step 4: View the result
- [hadoop@master flume]$ hdfs dfs -lsr /flume
- drwxr-xr-x - hadoop supergroup 0 2022-05-20 15:16 /flume/20220520
- -rw-r--r-- 2 hadoop supergroup 11 2022-05-20 15:16 /flume/20220520/events-.