Hadoop 3.4.0 + HBase 2.5.8 + ZooKeeper 3.8.4 + Hive 4.0 + Sqoop distributed high-availability cluster deployment


Create the servers; for reference, see "Creating servers with virtual machines".

Node      IP             OS
master11  192.168.50.11  CentOS 8.5
slave12   192.168.50.12  CentOS 8.5
slave13   192.168.50.13  CentOS 8.5
1 Download the components
Hadoop: official site
HBase: official site
ZooKeeper: official download
Hive: official download
Sqoop: official download
For convenience, the packages have also been collected on a network drive: link
2 Upload the packages to the servers with Xftp and place them all under /data/soft/

3 Configure ZooKeeper

tar zxvf apache-zookeeper-3.8.4-bin.tar.gz
mv apache-zookeeper-3.8.4-bin/ /data/zookeeper
# edit the configuration file
cd /data/zookeeper/conf
cp zoo_sample.cfg zoo.cfg
# create the data and log directories
mkdir -p /data/zookeeper/zkdata
mkdir -p /data/zookeeper/logs
vim zoo.cfg
# change dataDir=/tmp/zookeeper to dataDir=/data/zookeeper/zkdata, then add:
dataLogDir=/data/zookeeper/logs
server.1=master11:2888:3888
server.2=slave12:2888:3888
server.3=slave13:2888:3888
# environment variables
vim /etc/profile
export ZooKeeper_HOME=/data/zookeeper
export PATH=$PATH:$ZooKeeper_HOME/bin
source /etc/profile
# create myid and write each node's id
[root@master11 zkdata]# cat myid
1
# set the matching ids on the other nodes:
# slave12 -> myid 2
# slave13 -> myid 3
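The myid values above must line up with the server.N entries in zoo.cfg. A minimal sketch (a hypothetical helper, not part of the original setup) that generates the quorum lines from one ordered host list, so the two stay consistent:

```shell
# Generate the server.N quorum lines for zoo.cfg from an ordered host list.
# The position in the list fixes each node's myid: master11 -> 1, slave12 -> 2, slave13 -> 3.
hosts="master11 slave12 slave13"
i=1
for h in $hosts; do
  echo "server.$i=$h:2888:3888"
  i=$((i+1))
done
```

On each node, its position in the same list is what goes into /data/zookeeper/zkdata/myid.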
4 Configure HBase
tar zxvf hbase-2.5.8-bin.tar.gz
mv hbase-2.5.8/ /data/hbase
mkdir -p /data/hbase/logs
# vim /etc/profile
export HBASE_LOG_DIR=/data/hbase/logs
export HBASE_MANAGES_ZK=false
export HBASE_HOME=/data/hbase
export PATH=$PATH:$HBASE_HOME/bin
# vim /data/hbase/conf/regionservers
slave12
slave13
# create backup-masters
vim /data/hbase/conf/backup-masters
slave12
# vim /data/hbase/conf/hbase-site.xml
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
<!-- HBase web UI port -->
<property>
    <name>hbase.master.info.port</name>
    <value>16010</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>master11,slave12,slave13</value>
</property>
<!-- point at the HA nameservice defined in hdfs-site.xml -->
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://mycluster/hbase</value>
</property>
<property>
    <name>hbase.wal.provider</name>
    <value>filesystem</value>
</property>
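Hand-edited Hadoop/HBase XML is easy to break, and an unbalanced tag only surfaces later as a parse error. A quick sanity sketch before distributing a file (the file path and contents here are illustrative, not from the article):

```shell
# Sanity-check a hadoop-style XML fragment for balanced <property> tags.
f=/tmp/hbase-site-check.xml            # hypothetical local copy
cat > "$f" <<'EOF'
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
EOF
open=$(grep -c '<property>' "$f")
close=$(grep -c '</property>' "$f")
if [ "$open" -eq "$close" ]; then echo balanced; else echo "MISMATCH: $open open vs $close close"; fi
```

This catches only missing open/close pairs; a full check would use an XML parser such as xmllint where available.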
5 Configure Hadoop
tar zxvf hadoop-3.4.0.tar.gz
mv hadoop-3.4.0/ /data/hadoop
# environment variables
vim /etc/profile
export HADOOP_HOME=/data/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
# check the version
[root@master11 soft]# hadoop version
Hadoop 3.4.0
Source code repository git@github.com:apache/hadoop.git -r bd8b77f398f626bb7791783192ee7a5dfaeec760
Compiled by root on 2024-03-04T06:35Z
Compiled on platform linux-x86_64
Compiled with protoc 3.21.12
From source with checksum f7fe694a3613358b38812ae9c31114e
This command was run using /data/hadoop/share/hadoop/common/hadoop-common-3.4.0.jar
6 Modify the Hadoop configuration files
#core-site.xml
vim /data/hadoop/etc/hadoop/core-site.xml
#add the following
<configuration>
<!-- default filesystem: the HA nameservice defined in hdfs-site.xml -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
</property>
<!-- local data directory, created automatically at format time -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/data/tmp</value>
</property>
<!-- user name used when browsing HDFS from the web UI -->
<property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
</property>
<property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>master11:2181,slave12:2181,slave13:2181</value>
</property>
<property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>10000</value>
</property>
</configuration>
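The ha.zookeeper.quorum string (and the identical one later in yarn-site.xml) must list every ensemble member with its client port. A small sketch that builds it from the host list, to avoid typos if the ensemble ever changes (a hypothetical helper):

```shell
# Build "host:2181,host:2181,..." from a space-separated host list.
hosts="master11 slave12 slave13"
quorum=$(printf '%s' "$hosts" | tr ' ' '\n' | sed 's/$/:2181/' | paste -sd, -)
echo "$quorum"
```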
#hdfs-site.xml
vim /data/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
    <!-- replication factor; 3 is the default and could be omitted -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- local storage paths -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/hadoop/data/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/hadoop/data/dfs/data</value>
    </property>
    <!-- HA configuration -->
    <!-- logical service name for the NameNode cluster -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <!-- the NameNodes in the cluster -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC addresses -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>master11:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>slave12:8020</value>
    </property>
    <!-- HTTP (web UI) addresses -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>master11:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>slave12:50070</value>
    </property>
    <!-- journalnode quorum -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://master11:8485;slave12:8485;slave13:8485/mycluster</value>
    </property>
    <!-- journalnode storage directory -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/data/hadoop/data/dfs/jn</value>
    </property>
    <!-- enable automatic NameNode failover (starts zkfc) -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- fencing: only one NameNode may serve at a time -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <!-- client-side failover handled by ConfiguredFailoverProxyProvider -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- private key used by sshfence -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- sshfence timeout: if fencing cannot connect within 30 s, the standby becomes active -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    <!-- disable HDFS permission checks so directories can be created freely -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <!-- register DataNodes by hostname rather than IP -->
    <property>
        <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>ipc.client.connect.max.retries</name>
        <value>100</value>
        <description>Indicates the number of retries a client will make to establish a server connection.</description>
    </property>
    <property>
        <name>ipc.client.connect.retry.interval</name>
        <value>10000</value>
        <description>Indicates the number of milliseconds a client will wait for before retrying to establish a server connection.</description>
    </property>
</configuration>
#yarn-site.xml
vi /data/hadoop/etc/hadoop/yarn-site.xml
<configuration>
<!-- resources YARN may use per node; defaults are 8 vcores / 8 GB -->
<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
</property>
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>
<property>
    <name>yarn.log.server.url</name>
    <value>http://master11:19888/jobhistory/logs</value>
</property>
<!-- MapReduce shuffle service -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- retain aggregated logs for one day (86400 s); use 604800 for 7 days -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
</property>
<!-- HA configuration -->
<!-- enable ResourceManager HA -->
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>my-yarn-cluster</value>
</property>
<!-- the two ResourceManagers -->
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>slave12</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>slave13</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>slave12:8088</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>slave13:8088</value>
</property>
<!-- zookeeper ensemble -->
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>master11:2181,slave12:2181,slave13:2181</value>
</property>
<!-- enable automatic recovery -->
<property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
</property>
<!-- keep ResourceManager state in the zookeeper ensemble -->
<property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
</property>
<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
</property>
<property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>slave12:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>slave12:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>slave12:8031</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>slave12:8033</value>
</property>
<property>
    <name>yarn.nodemanager.address.rm1</name>
    <value>slave12:8041</value>
</property>
<property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>slave13:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>slave13:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>slave13:8031</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>slave13:8033</value>
</property>
<property>
    <name>yarn.nodemanager.address.rm2</name>
    <value>slave13:8041</value>
</property>
<property>
    <name>yarn.nodemanager.localizer.address</name>
    <value>0.0.0.0:8040</value>
</property>
<property>
    <description>NM Webapp address.</description>
    <name>yarn.nodemanager.webapp.address</name>
    <value>0.0.0.0:8042</value>
</property>
<property>
    <name>yarn.nodemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8041</value>
</property>
<property>
    <name>yarn.application.classpath</name>
    <value>/data/hadoop/etc/hadoop:/data/hadoop/share/hadoop/common/lib/*:/data/hadoop/share/hadoop/common/*:/data/hadoop/share/hadoop/hdfs:/data/hadoop/share/hadoop/hdfs/lib/*:/data/hadoop/share/hadoop/hdfs/*:/data/hadoop/share/hadoop/mapreduce/lib/*:/data/hadoop/share/hadoop/mapreduce/*:/data/hadoop/share/hadoop/yarn:/data/hadoop/share/hadoop/yarn/lib/*:/data/hadoop/share/hadoop/yarn/*</value>
</property>
</configuration>
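yarn.log-aggregation.retain-seconds is expressed in seconds: the configured 86400 s is one day. If a 7-day retention is actually wanted, the conversion is:

```shell
# Convert days to the seconds value yarn.log-aggregation.retain-seconds expects.
one_day=$((24*3600))
seven_days=$((7*24*3600))
echo "$one_day $seven_days"   # 86400 604800
```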
#edit workers
vi /data/hadoop/etc/hadoop/workers
master11
slave12
slave13
7 Distribute the files and configuration
#on master11
cd /data/
scp -r hadoop/ slave12:/data
scp -r hadoop/ slave13:/data
scp -r hbase/ slave12:/data
scp -r hbase/ slave13:/data
scp -r zookeeper/ slave12:/data
scp -r zookeeper/ slave13:/data
#/etc/profile must be identical on all three servers
export JAVA_HOME=/usr/local/jdk
export PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export CLASSPATH
export HADOOP_HOME=/data/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export ZooKeeper_HOME=/data/zookeeper
export PATH=$PATH:$ZooKeeper_HOME/bin
#
export HBASE_LOG_DIR=/data/hbase/logs
export HBASE_MANAGES_ZK=false
export HBASE_HOME=/data/hbase
export PATH=$PATH:$HBASE_HOME/bin
export HIVE_HOME=/data/hive
export PATH=$PATH:$HIVE_HOME/bin
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HDFS_ZKFC_USER=root
export HDFS_DATANODE_SECURE_USER=root
export HDFS_JOURNALNODE_USER=root
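The six scp commands above can be generated in a loop; printing them first makes them easy to review before running. A sketch, assuming the same passwordless SSH between nodes that the rest of the article relies on:

```shell
# Print the distribution commands for review; pipe the output to `sh` to execute.
for h in slave12 slave13; do
  for d in hadoop hbase zookeeper; do
    echo "scp -r /data/$d $h:/data"
  done
done
```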
8 Start the cluster
#In HA mode, format only on the first start (or after wiping old data)
#on master11
start-dfs.sh       #brings up the JournalNodes needed before formatting
hdfs namenode -format
ll /data/hadoop/data/dfs/name/current/
total 16
-rw-r--r--. 1 root root 399 May 13 20:21 fsimage_0000000000000000000
-rw-r--r--. 1 root root  62 May 13 20:21 fsimage_0000000000000000000.md5
-rw-r--r--. 1 root root   2 May 13 20:21 seen_txid
-rw-r--r--. 1 root root 218 May 13 20:21 VERSION
#copy the NameNode metadata to slave12 (the other NameNode)
scp -r /data/hadoop/data/dfs/name/* slave12:/data/hadoop/data/dfs/name/

#Initialize the HA state in ZooKeeper on any NameNode
[root@master11 hadoop]# jps
2400 QuorumPeerMain
4897 Jps
3620 JournalNode
3383 DataNode
#
hdfs zkfc -formatZK
 

#Normal cluster startup order
#zookeeper: run on all three servers
zkServer.sh start
#check
[root@master11 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
[root@slave12 data]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
[root@slave13 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
#on master11, start the whole Hadoop cluster in one command
start-all.sh
#stop everything
stop-all.sh
#check with jps
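A healthy three-node ensemble shows exactly one leader across the three `zkServer.sh status` outputs. A sketch that checks this against captured status text (the sample mirrors the output above):

```shell
# Count 'Mode: leader' lines in collected status output; expect exactly 1.
status_out='Mode: follower
Mode: leader
Mode: follower'
leaders=$(printf '%s\n' "$status_out" | grep -c 'Mode: leader')
echo "leaders: $leaders"
```

In practice, $status_out would be filled by running the status command over SSH on each node.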
 



#Check the cluster state
#NameNode
[root@master11 ~]# hdfs haadmin -getServiceState nn1
active
[root@master11 ~]# hdfs haadmin -getServiceState nn2
standby
[root@master11 ~]# hdfs haadmin -ns mycluster -getAllServiceState
master11:8020                                      active
slave12:8020                                       standby
#yarn
[root@master11 ~]# yarn rmadmin -getServiceState rm1
standby
[root@master11 ~]# yarn rmadmin -getServiceState rm2
active
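When scripting against the cluster, the active NameNode can be picked out of the `-getAllServiceState` output with awk. A sketch using the captured output from above as sample input:

```shell
# Extract the host:port whose state column reads 'active'.
state_out='master11:8020                                      active
slave12:8020                                       standby'
active_nn=$(printf '%s\n' "$state_out" | awk '$2=="active"{print $1}')
echo "$active_nn"
```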
#HDFS web UI (screenshot omitted)

#YARN cluster web UI (screenshot omitted)

9 Test Hadoop
#create a directory
hdfs dfs -mkdir /testdata
#list
[root@master11 ~]# hdfs dfs -ls /
Found 2 items
drwxr-xr-x   - root supergroup          0 2024-05-14 17:00 /hbase
drwxr-xr-x   - root supergroup          0 2024-05-14 20:32 /testdata
#upload a file
hdfs dfs -put jdk-8u191-linux-x64.tar.gz /testdata
#list the file
[root@master11 soft]# hdfs dfs -ls /testdata/
Found 1 items
-rw-r--r--   3 root supergroup  191753373 2024-05-14 20:40 /testdata/jdk-8u191-linux-x64.tar.gz
10 Start HBase on the Hadoop active NameNode
[root@master11 ~]# hdfs haadmin -getServiceState nn1
active
#start
start-hbase.sh
#check
[root@master11 ~]# jps
16401 NodeManager
15491 NameNode
21543 HMaster
15848 JournalNode
1435 QuorumPeerMain
16029 DFSZKFailoverController
21902 Jps
15631 DataNode
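The jps listing above can also be checked automatically for the expected daemons (the daemon list below is an assumption based on this article's layout, with jps output captured into a variable):

```shell
# Verify expected daemons appear in a jps listing; report any missing ones.
jps_out='16401 NodeManager
15491 NameNode
21543 HMaster
15848 JournalNode
1435 QuorumPeerMain
16029 DFSZKFailoverController
15631 DataNode'
missing=""
for d in NameNode DataNode HMaster QuorumPeerMain DFSZKFailoverController; do
  printf '%s\n' "$jps_out" | grep -qw "$d" || missing="$missing $d"
done
if [ -z "$missing" ]; then echo "all daemons up"; else echo "missing:$missing"; fi
```

On a live node, replace the literal text with `jps_out=$(jps)`.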
11 Install Hive
#unpack and set environment variables
tar zxvf apache-hive-4.0.0-bin.tar.gz
mv apache-hive-4.0.0-bin/ /data/hive
#environment variables
vi /etc/profile
export HIVE_HOME=/data/hive
export PATH=$PATH:$HIVE_HOME/bin
source /etc/profile
#Install MySQL; for reference, see "Installing MySQL 8.3 from the binary distribution"
#MySQL JDBC driver
mv mysql-connector-java-8.0.29.jar /data/hive/lib/
schematool -dbType mysql -initSchema
#error
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/hive/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/hadoop/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '=' (code 61); expected a semi-colon after the reference for entity 'characterEncoding'
at [row,col,system-id]: [5,86,"file:/data/hive/conf/hive-site.xml"]
        at com.ctc.wstx.exc.WstxLazyException.throwLazily(WstxLazyException.java:40)
        at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:737)
        at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3745)
        at com.ctc.wstx.sr.BasicStreamReader.getTextCharacters(BasicStreamReader.java:914)
        at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3434)
        at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3213)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3106)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3072)
        at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2945)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2927)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1431)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1403)
        at org.apache.hadoop.hive.metastore.conf.MetastoreConf.newMetastoreConf(MetastoreConf.java:2120)
        at org.apache.hadoop.hive.metastore.conf.MetastoreConf.newMetastoreConf(MetastoreConf.java:2072)
        at org.apache.hive.beeline.schematool.HiveSchemaTool.main(HiveSchemaTool.java:144)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:330)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:245)
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '=' (code 61); expected a semi-colon after the reference for entity 'characterEncoding'
at [row,col,system-id]: [5,86,"file:/data/hive/conf/hive-site.xml"]
        at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:666)
        at com.ctc.wstx.sr.StreamScanner.parseEntityName(StreamScanner.java:2080)
        at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1538)
        at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4765)
        at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3789)
        at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3743)
        ... 18 more
#fix: vi /data/hive/conf/hive-site.xml
#escape the bare & characters in the connection URL as &amp;
#on success the tool prints: Initialization script completed
#(metastore tables created in MySQL; screenshot omitted)
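The fix above, escaping & in hive-site.xml, can be applied mechanically with sed. A sketch on an illustrative JDBC URL (the actual URL in hive-site.xml is not shown in the article):

```shell
# XML treats a bare '&' as the start of an entity reference, so it must be
# written as '&amp;'. In the sed replacement, '\&' is a literal ampersand.
url='jdbc:mysql://slave12:3306/hive?useSSL=false&characterEncoding=UTF-8'
escaped=$(printf '%s' "$url" | sed 's/&/\&amp;/g')
echo "$escaped"
```

To fix the file in place, the same substitution could be run as `sed -i` on the value line, taking care not to touch ampersands that are already escaped.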
 

#Start Hive; hive runs on master11, MySQL on slave12
cd /data/hive/
nohup hive --service metastore &    #start the Hive metastore service
nohup ./bin/hiveserver2 &           #start the JDBC (HiveServer2) service
#running `hive` directly opens Beeline without a connection, hence "No current connection"
[root@master11 hive]# hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/hive/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/hadoop/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 4.0.0 by Apache Hive
beeline> show databases;
No current connection
beeline>
#at the prompt, run !connect jdbc:hive2://master11:10000, then enter the user name and password
beeline> !connect jdbc:hive2://master11:10000
Connecting to jdbc:hive2://master11:10000
Enter username for jdbc:hive2://master11:10000: root
Enter password for jdbc:hive2://master11:10000: *********
Connected to: Apache Hive (version 4.0.0)
Driver: Hive JDBC (version 4.0.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://master11:10000> show databases;
INFO  : Compiling command(queryId=root_20240514222349_ac19af6a-3c43-49fd-bcd0-25fc0e5b76c6): show databases
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=root_20240514222349_ac19af6a-3c43-49fd-bcd0-25fc0e5b76c6); Time taken: 0.021 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=root_20240514222349_ac19af6a-3c43-49fd-bcd0-25fc0e5b76c6): show databases
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=root_20240514222349_ac19af6a-3c43-49fd-bcd0-25fc0e5b76c6); Time taken: 0.017 seconds
+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
1 row selected (0.124 seconds)
0: jdbc:hive2://master11:10000>
Deployment of the Hadoop 3.4.0 + HBase 2.5.8 + ZooKeeper 3.8.4 + Hive 4.0 + Sqoop distributed high-availability cluster is complete. Questions and discussion are welcome. Next up: a hands-on project.


Author: 勿忘初心做自己