
On the HBase error: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet



  

Preface



  • Linux version: CentOS 7.5
  • Hadoop version: Hadoop 3.1.3
  • ZooKeeper version: ZooKeeper 3.5.7
  • HBase version: HBase 2.0.5
  • Cluster: fully distributed (three nodes: hdp01, hdp02, hdp03)
Discovering the problem

Scenario

   I was simulating an import of a 1,000,000-row dataset into HBase. Because the disk on the third node, hdp03, ran out of space, the HBase process on hdp03 exited, the hdp03 node shut down, and the simulated import failed...
  Note: hdp03 was shut down because of the full disk. After restarting the node and bringing the relevant services back up, both the ZooKeeper and HBase processes were present. This is borne out by the "Testing the cause of the error" section below.
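  As a sanity check, a quick way to confirm that the daemons really came back on every node is to run jps over ssh. A minimal sketch (assuming passwordless ssh between the nodes):
  # On a healthy node you would expect to see HRegionServer and QuorumPeerMain
  # (plus HMaster on hdp01 and the HDFS/YARN daemons).
  for host in hdp01 hdp02 hdp03; do
      echo "===== $host ====="
      ssh "$host" jps
  done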
  



The program run from IDEA fails with the following error

  Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 20 actions: ConnectException: 20 times, servers with issues: hdp03,16020,1677226920150
          at org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:54)
          at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.getErrors(AsyncRequestFutureImpl.java:1226)
          at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:455)
          at org.apache.hadoop.hbase.client.HTable.put(HTable.java:553)
          at cn.whybigdata.dynamic_rule.datagen.UserProfileDataGen.main(UserProfileDataGen.java:59)
  Process finished with exit code 1
  This was caused by the disk running out of space.
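  To confirm the disk-space explanation, it helps to check both the local filesystems and what HDFS itself reports. A small sketch (the hostnames are this cluster's; adapt as needed):
  # Local disk usage on every node.
  for host in hdp01 hdp02 hdp03; do
      echo "===== $host ====="
      ssh "$host" df -h
  done

  # What the NameNode sees: per-DataNode capacity and remaining space.
  hdfs dfsadmin -report | grep -E 'Name:|DFS Remaining|DFS Used%'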
  Errors in the HBase shell

   Listing all tables in HBase
  –> fails with: ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
  hbase(main):001:0> list
  TABLE
  ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
          at org.apache.hadoop.hbase.master.HMaster.checkServiceStarted(HMaster.java:2932)
          at org.apache.hadoop.hbase.master.MasterRpcServices.isMasterRunning(MasterRpcServices.java:1084)
          at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
          at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
          at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
  List all user tables in hbase. Optional regular expression parameter could
  be used to filter the output. Examples:
    hbase> list
    hbase> list 'abc.*'
    hbase> list 'ns:abc.*'
    hbase> list 'ns:.*'
  Took 8.9350 seconds
Solution



  • Go to the bin directory under the Hadoop installation directory and run the following commands
  [whybigdata@hdp01 hbase-2.0.5]$ cd ../hadoop-3.1.3/bin/
  [whybigdata@hdp01 bin]$ pwd
  /opt/apps/hadoop-3.1.3/bin



  • Take HDFS out of safe mode
  [whybigdata@hdp01 bin]$ ./hadoop dfsadmin -safemode leave
  WARNING: Use of this script to execute dfsadmin is deprecated.
  WARNING: Attempting to execute replacement "hdfs dfsadmin" instead.
  Safe mode is OFF
  After leaving Hadoop's safe mode, the HBase tables can be listed again, but operations such as scanning (scan) or counting the rows of the table still fail.
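  For what it's worth, the warning above already points at the modern form of the command. You can also check the safe-mode state explicitly before and after leaving it; a minimal sketch:
  # Ask the NameNode whether it is currently in safe mode.
  hdfs dfsadmin -safemode get

  # Leave safe mode (the non-deprecated equivalent of "hadoop dfsadmin -safemode leave").
  hdfs dfsadmin -safemode leave

  # If the NameNode later drops back into safe mode, the disk-space problem
  # probably still exists -- free up space first, then leave safe mode again.
  hdfs dfsadmin -safemode get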
  

  • Other operations still fail
  1. hbase(main):002:0> count 'user_profile'
  2. ERROR: org.apache.hadoop.hbase.NotServingRegionException: user_profile,,1677228015439.69f3f6a477f90bdc138e31f08ee909d8. is not online on hdp03,16020,1677229361798
  3.         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3272)
  4.         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3249)
  5.         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
  6.         at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2948)
  7.         at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3285)
  8.         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42002)
  9.         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
  10.         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
  11.         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
  12.         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
  13. Count the number of rows in a table.  Return value is the number of rows.
  14. This operation may take a LONG time (Run '$HADOOP_HOME/bin/hadoop jar
  15. hbase.jar rowcount' to run a counting mapreduce job). Current count is shown
  16. every 1000 rows by default. Count interval may be optionally specified. Scan
  17. caching is enabled on count scans by default. Default cache size is 10 rows.
  18. If your rows are small in size, you may want to increase this
  19. parameter. Examples:
  20. hbase> count 'ns1:t1'
  21. hbase> count 't1'
  22. hbase> count 't1', INTERVAL => 100000
  23. hbase> count 't1', CACHE => 1000
  24. hbase> count 't1', INTERVAL => 10, CACHE => 1000
  25. hbase> count 't1', FILTER => "
  26.     (QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"
  27. hbase> count 't1', COLUMNS => ['c1', 'c2'], STARTROW => 'abc', STOPROW => 'xyz'
  28. The same commands also can be run on a table reference. Suppose you had a reference
  29. t to table 't1', the corresponding commands would be:
  30. hbase> t.count
  31. hbase> t.count INTERVAL => 100000
  32. hbase> t.count CACHE => 1000
  33. hbase> t.count INTERVAL => 10, CACHE => 1000
  34. hbase> t.count FILTER => "
  35.     (QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"
  36. hbase> t.count COLUMNS => ['c1', 'c2'], STARTROW => 'abc', STOPROW => 'xyz'
  37. Took 8.8512 seconds
  Scanning the table fails as well:
  ERROR: org.apache.hadoop.hbase.NotServingRegionException:
  1. hbase(main):003:0>  scan 'user_profile',{LIMIT => 10}
  2. ROW                               COLUMN+CELL
  3. ERROR: org.apache.hadoop.hbase.NotServingRegionException: user_profile,,1677228015439.69f3f6a477f90bdc138e31f08ee909d8. is not online on hdp03,16020,1677229361798
  4.         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3272)
  5.         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3249)
  6.         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
  7.         at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2948)
  8.         at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3285)
  9.         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42002)
  10.         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
  11.         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
  12.         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
  13.         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
  14. Scan a table; pass table name and optionally a dictionary of scanner
  15. specifications.  Scanner specifications may include one or more of:
  16. TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, ROWPREFIXFILTER, TIMESTAMP,
  17. MAXLENGTH, COLUMNS, CACHE, RAW, VERSIONS, ALL_METRICS, METRICS,
  18. REGION_REPLICA_ID, ISOLATION_LEVEL, READ_TYPE, ALLOW_PARTIAL_RESULTS,
  19. BATCH or MAX_RESULT_SIZE
  20. If no columns are specified, all columns will be scanned.
  21. To scan all members of a column family, leave the qualifier empty as in
  22. 'col_family'.
  23. The filter can be specified in two ways:
  24. 1. Using a filterString - more information on this is available in the
  25. Filter Language document attached to the HBASE-4176 JIRA
  26. 2. Using the entire package name of the filter.
  27. If you wish to see metrics regarding the execution of the scan, the
  28. ALL_METRICS boolean should be set to true. Alternatively, if you would
  29. prefer to see only a subset of the metrics, the METRICS array can be
  30. defined to include the names of only the metrics you care about.
  31. Some examples:
  32.   hbase> scan 'hbase:meta'
  33.   hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
  34.   hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  35.   hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  36.   hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804000, 1303668904000]}
  37.   hbase> scan 't1', {REVERSED => true}
  38.   hbase> scan 't1', {ALL_METRICS => true}
  39.   hbase> scan 't1', {METRICS => ['RPC_RETRIES', 'ROWS_FILTERED']}
  40.   hbase> scan 't1', {ROWPREFIXFILTER => 'row2', FILTER => "
  41.     (QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"}
  42.   hbase> scan 't1', {FILTER =>
  43.     org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
  44.   hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
  45.   hbase> scan 't1', {ISOLATION_LEVEL => 'READ_UNCOMMITTED'}
  46.   hbase> scan 't1', {MAX_RESULT_SIZE => 123456}
  47. For setting the Operation Attributes
  48.   hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
  49.   hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']}
  50. For experts, there is an additional option -- CACHE_BLOCKS -- which
  51. switches block caching for the scanner on (true) or off (false).  By
  52. default it is enabled.  Examples:
  53.   hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
  54. Also for experts, there is an advanced option -- RAW -- which instructs the
  55. scanner to return all cells (including delete markers and uncollected deleted
  56. cells). This option cannot be combined with requesting specific COLUMNS.
  57. Disabled by default.  Example:
  58.   hbase> scan 't1', {RAW => true, VERSIONS => 10}
  59. There is yet another option -- READ_TYPE -- which instructs the scanner to
  60. use a specific read type. Example:
  61.   hbase> scan 't1', {READ_TYPE => 'PREAD'}
  62. Besides the default 'toStringBinary' format, 'scan' supports custom formatting
  63. by column.  A user can define a FORMATTER by adding it to the column name in
  64. the scan specification.  The FORMATTER can be stipulated:
  65. 1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
  66. 2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
  67. Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
  68.   hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt',
  69.     'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
  70. Note that you can specify a FORMATTER by column only (cf:qualifier). You can set a
  71. formatter for all columns (including, all key parts) using the "FORMATTER"
  72. and "FORMATTER_CLASS" options. The default "FORMATTER_CLASS" is
  73. "org.apache.hadoop.hbase.util.Bytes".
  74.   hbase> scan 't1', {FORMATTER => 'toString'}
  75.   hbase> scan 't1', {FORMATTER_CLASS => 'org.apache.hadoop.hbase.util.Bytes', FORMATTER => 'toString'}
  76. Scan can also be used directly from a table, by first getting a reference to a
  77. table, like such:
  78.   hbase> t = get_table 't'
  79.   hbase> t.scan
  80. Note in the above situation, you can still provide all the filtering, columns,
  81. options, etc as described above.
  82. Took 8.2657 seconds
Testing the cause of the error

   Create a new table to check whether the problem lies with the user_profile table itself. The answer is fairly obvious by now, but it is worth checking anyway, just to be safe.
  

  • Create a new table, stu; it is created successfully, and inserting and querying data both work
  hbase(main):004:0> create 'stu', 'info'
  Created table stu
  Took 4.3756 seconds
  => Hbase::Table - stu
  hbase(main):005:0> list
  TABLE
  stu
  user_profile
  2 row(s)
  Took 0.0085 seconds
  => ["stu", "user_profile"]
  hbase(main):006:0> put 'stu', '1001', 'info:name', 'zhangsan'
  Took 0.1275 seconds
  hbase(main):007:0> scan 'stu'
  ROW                               COLUMN+CELL
  1001                              column=info:name, timestamp=1677230857034, value=zhangsan
  1 row(s)
  Took 0.0197 seconds
  hbase(main):008:0> count 'stu'
  1 row(s)
  Took 0.0277 seconds
  => 1
  This proves the problem is with the user_profile table itself.
    Only while finishing this write-up did it occur to me: since the problem is with user_profile itself, couldn't I simply drop the table, recreate one with the same name, and rerun the simulated import? Silly me! (A rough sketch of that shortcut follows below.)
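  Purely for illustration, that shortcut would look roughly like the non-interactive shell session below. The column family name 'f' is made up here; use whatever the real table was created with, and note that disable/drop may itself get stuck while the regions are offline.
  # DANGER: this throws the table and all of its data away.
  echo "disable 'user_profile'; drop 'user_profile'; create 'user_profile', 'f'" | hbase shell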
  Attempted fix

   Check the state of the user_profile table; the console prints the cluster status and then information about the table's state
  1. [whybigdata@hdp01 hbase-2.0.5]$ hbase hbck 'user_profile'
  2. 2023-02-24 17:34:49,265 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hbase Fsck connecting to ZooKeeper ensemble=hdp01:2181,hdp02:2181,hdp03:2181
  3. 2023-02-24 17:34:49,273 INFO  [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
  4. 2023-02-24 17:34:49,273 INFO  [main] zookeeper.ZooKeeper: Client environment:host.name=hdp01
  5. 2023-02-24 17:34:49,273 INFO  [main] zookeeper.ZooKeeper: Client environment:java.version=1.8.0_212
  6. 2023-02-24 17:34:49,273 INFO  [main] zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
  7. 2023-02-24 17:34:49,273 INFO  [main] zookeeper.ZooKeeper: Client environment:java.home=/opt/apps/jdk1.8.0_212/jre
  8. 2023-02-24 17:34:49,273 INFO  [main] zookeeper.ZooKeeper: /hadoop/yarn/hadoop-yarn-server-web-proxy-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-api-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-client-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-common-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-tests-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-services-api-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-services-core-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-common-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-nodemanager-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-router-3.1.3.jar
  9. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:java.library.path=/opt/apps/hadoop-3.1.3/lib/native
  10. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
  11. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
  12. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:os.name=Linux
  13. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
  14. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:os.version=3.10.0-862.el7.x86_64
  15. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:user.name=whybigdata
  16. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:user.home=/home/whybigdata
  17. 2023-02-24 17:34:49,274 INFO  [main] zookeeper.ZooKeeper: Client environment:user.dir=/opt/apps/hbase-2.0.5
  18. 2023-02-24 17:34:49,275 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=hdp01:2181,hdp02:2181,hdp03:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@7a362b6b
  19. 2023-02-24 17:34:49,290 INFO  [main-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Opening socket connection to server hdp02/192.168.10.12:2181. Will not attempt to authenticate using SASL (unknown error)
  20. Allow checking/fixes for table: user_profile
  21. HBaseFsck command line options: user_profile
  22. 2023-02-24 17:34:49,294 INFO  [main] util.HBaseFsck: Launching hbck
  23. 2023-02-24 17:34:49,295 INFO  [main-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Socket connection established to hdp02/192.168.10.12:2181, initiating session
  24. 2023-02-24 17:34:49,306 INFO  [main-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Session establishment complete on server hdp02/192.168.10.12:2181, sessionid = 0x3000001abaf0003, negotiated timeout = 40000
  25. 2023-02-24 17:34:49,353 INFO  [main] zookeeper.ReadOnlyZKClient: Connect 0x45fd9a4d to hdp01:2181,hdp02:2181,hdp03:2181 with session timeout=90000ms, retries 30, retry interval 1000ms, keepAlive=60000ms
  26. 2023-02-24 17:34:49,359 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d] zookeeper.ZooKeeper: Initiating client connection, connectString=hdp01:2181,hdp02:2181,hdp03:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$13/371452875@5321776b
  27. 2023-02-24 17:34:49,360 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Opening socket connection to server hdp02/192.168.10.12:2181. Will not attempt to authenticate using SASL (unknown error)
  28. 2023-02-24 17:34:49,361 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Socket connection established to hdp02/192.168.10.12:2181, initiating session
  29. 2023-02-24 17:34:49,365 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Session establishment complete on server hdp02/192.168.10.12:2181, sessionid = 0x3000001abaf0004, negotiated timeout = 40000
  30. Version: 2.0.5
  31. 2023-02-24 17:34:49,970 INFO  [main] util.HBaseFsck: Computing mapping of all store files
  32. 2023-02-24 17:34:50,240 INFO  [main] util.HBaseFsck: Validating mapping using HDFS state
  33. 2023-02-24 17:34:50,240 INFO  [main] util.HBaseFsck: Computing mapping of all link files
  34. .
  35. 2023-02-24 17:34:50,292 INFO  [main] util.HBaseFsck: Validating mapping using HDFS state
  36. Number of live region servers: 3
  37. Number of dead region servers: 0
  38. Master: hdp01,16000,1677229359514
  39. Number of backup masters: 0
  40. Average load: 1.0
  41. Number of requests: 76
  42. Number of regions: 3
  43. Number of regions in transition: 0
  44. 2023-02-24 17:34:50,406 INFO  [main] util.HBaseFsck: Loading regionsinfo from the hbase:meta table
  45. Number of empty REGIONINFO_QUALIFIER rows in hbase:meta: 0
  46. 2023-02-24 17:34:50,501 INFO  [main] util.HBaseFsck: getTableDescriptors == tableNames => [user_profile]
  47. 2023-02-24 17:34:50,502 INFO  [main] zookeeper.ReadOnlyZKClient: Connect 0x1ce93c18 to hdp01:2181,hdp02:2181,hdp03:2181 with session timeout=90000ms, retries 30, retry interval 1000ms, keepAlive=60000ms
  48. 2023-02-24 17:34:50,504 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18] zookeeper.ZooKeeper: Initiating client connection, connectString=hdp01:2181,hdp02:2181,hdp03:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$13/371452875@5321776b
  49. 2023-02-24 17:34:50,505 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18-SendThread(hdp01:2181)] zookeeper.ClientCnxn: Opening socket connection to server hdp01/192.168.10.11:2181. Will not attempt to authenticate using SASL (unknown error)
  50. 2023-02-24 17:34:50,506 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18-SendThread(hdp01:2181)] zookeeper.ClientCnxn: Socket connection established to hdp01/192.168.10.11:2181, initiating session
  51. 2023-02-24 17:34:50,521 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18-SendThread(hdp01:2181)] zookeeper.ClientCnxn: Session establishment complete on server hdp01/192.168.10.11:2181, sessionid = 0x20000018ff60005, negotiated timeout = 40000
  52. 2023-02-24 17:34:50,537 INFO  [main] client.ConnectionImplementation: Closing master protocol: MasterService
  53. 2023-02-24 17:34:50,537 INFO  [main] zookeeper.ReadOnlyZKClient: Close zookeeper connection 0x1ce93c18 to hdp01:2181,hdp02:2181,hdp03:2181
  54. Number of Tables: 1
  55. 2023-02-24 17:34:50,542 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18] zookeeper.ZooKeeper: Session: 0x20000018ff60005 closed
  56. 2023-02-24 17:34:50,542 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x20000018ff60005
  57. 2023-02-24 17:34:50,550 INFO  [main] util.HBaseFsck: Loading region directories from HDFS
  58. 2023-02-24 17:34:50,588 INFO  [main] util.HBaseFsck: Loading region information from HDFS
  59. 2023-02-24 17:34:50,638 INFO  [hbasefsck-pool1-t5] sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
  60. 2023-02-24 17:34:50,638 INFO  [hbasefsck-pool1-t9] sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
  61. 2023-02-24 17:34:50,638 INFO  [hbasefsck-pool1-t10] sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
  62. 2023-02-24 17:34:50,765 INFO  [main] util.HBaseFsck: Checking and fixing region consistency
  63. ERROR: Region { meta => user_profile,,1677228015439.69f3f6a477f90bdc138e31f08ee909d8., hdfs => hdfs://hdp01:8020/hbase/data/default/user_profile/69f3f6a477f90bdc138e31f08ee909d8, deployed => , replicaId => 0 } not deployed on any region server.
  64. ERROR: Region { meta => user_profile,003155,1677228015439.690658266a0b11c87aada6935c91a1f7., hdfs => hdfs://hdp01:8020/hbase/data/default/user_profile/690658266a0b11c87aada6935c91a1f7, deployed => , replicaId => 0 } not deployed on any region server.
  65. 2023-02-24 17:34:50,804 INFO  [main] util.HBaseFsck: Handling overlap merges in parallel. set hbasefsck.overlap.merge.parallel to false to run serially.
  66. ERROR: There is a hole in the region chain between  and .  You need to create a new .regioninfo and region dir in hdfs to plug the hole.
  67. ERROR: Found inconsistency in table user_profile
  68. Summary:
  69. Table user_profile is okay.
  70.     Number of regions: 0
  71.     Deployed on:
  72. Table hbase:meta is okay.
  73.     Number of regions: 1
  74.     Deployed on:  hdp02,16020,1677229361706
  75. 3 inconsistencies detected.
  76. Status: INCONSISTENT
  77. 2023-02-24 17:34:50,881 INFO  [main] zookeeper.ZooKeeper: Session: 0x3000001abaf0003 closed
  78. 2023-02-24 17:34:50,881 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x3000001abaf0003
  79. 2023-02-24 17:34:50,881 INFO  [main] client.ConnectionImplementation: Closing master protocol: MasterService
  80. 2023-02-24 17:34:50,882 INFO  [main] zookeeper.ReadOnlyZKClient: Close zookeeper connection 0x45fd9a4d to hdp01:2181,hdp02:2181,hdp03:2181
  81. 2023-02-24 17:34:50,887 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d] zookeeper.ZooKeeper: Session: 0x3000001abaf0004 closed
  82. 2023-02-24 17:34:50,888 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x3000001abaf0004

   From the output above, the final result is INCONSISTENT, and there is also a hole in the region chain.
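  Before doing anything destructive, the read-only options of hbck are still allowed in 2.x and give a per-region view of the problem; a small sketch using the -details and -metaonly flags from the usage text shown further below:
  # Read-only: full report of every region of the table, including where
  # (if anywhere) each region is deployed.
  hbase hbck -details 'user_profile'

  # Read-only: check only the state of hbase:meta.
  hbase hbck -metaonly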
  

  • For comparison:
   Checking a table that behaves normally, stu, gives the following output
  1. [whybigdata@hdp01 hbase-2.0.5]$ hbase hbck 'stu'
  2. 2023-02-24 17:42:33,198 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hbase Fsck connecting to ZooKeeper ensemble=hdp01:2181,hdp02:2181,hdp03:2181
  3. 2023-02-24 17:42:33,206 INFO  [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
  4. 2023-02-24 17:42:33,206 INFO  [main] zookeeper.ZooKeeper: Client environment:host.name=hdp01
  5. 2023-02-24 17:42:33,206 INFO  [main] zookeeper.ZooKeeper: Client environment:java.version=1.8.0_212
  6. 2023-02-24 17:42:33,206 INFO  [main] zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
  7. 2023-02-24 17:42:33,206 INFO  [main] zookeeper.ZooKeeper: Client environment:java.home=/opt/apps/jdk1.8.0_212/jre
  8. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: /hadoop/yarn/hadoop-yarn-server-web-proxy-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-api-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-client-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-common-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-tests-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-services-api-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-services-core-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-common-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-nodemanager-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-router-3.1.3.jar
  9. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:java.library.path=/opt/apps/hadoop-3.1.3/lib/native
  10. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
  11. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
  12. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:os.name=Linux
  13. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
  14. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:os.version=3.10.0-862.el7.x86_64
  15. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:user.name=whybigdata
  16. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:user.home=/home/whybigdata
  17. 2023-02-24 17:42:33,207 INFO  [main] zookeeper.ZooKeeper: Client environment:user.dir=/opt/apps/hbase-2.0.5
  18. 2023-02-24 17:42:33,208 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=hdp01:2181,hdp02:2181,hdp03:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@7a362b6b
  19. 2023-02-24 17:42:33,222 INFO  [main-SendThread(hdp03:2181)] zookeeper.ClientCnxn: Opening socket connection to server hdp03/192.168.10.13:2181. Will not attempt to authenticate using SASL (unknown error)
  20. Allow checking/fixes for table: stu
  21. HBaseFsck command line options: stu
  22. 2023-02-24 17:42:33,226 INFO  [main] util.HBaseFsck: Launching hbck
  23. 2023-02-24 17:42:33,227 INFO  [main-SendThread(hdp03:2181)] zookeeper.ClientCnxn: Socket connection established to hdp03/192.168.10.13:2181, initiating session
  24. 2023-02-24 17:42:33,235 INFO  [main-SendThread(hdp03:2181)] zookeeper.ClientCnxn: Session establishment complete on server hdp03/192.168.10.13:2181, sessionid = 0x40000034a3b0005, negotiated timeout = 40000
  25. 2023-02-24 17:42:33,281 INFO  [main] zookeeper.ReadOnlyZKClient: Connect 0x45fd9a4d to hdp01:2181,hdp02:2181,hdp03:2181 with session timeout=90000ms, retries 30, retry interval 1000ms, keepAlive=60000ms
  26. 2023-02-24 17:42:33,286 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d] zookeeper.ZooKeeper: Initiating client connection, connectString=hdp01:2181,hdp02:2181,hdp03:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$13/499703683@58ca425e
  27. 2023-02-24 17:42:33,288 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Opening socket connection to server hdp02/192.168.10.12:2181. Will not attempt to authenticate using SASL (unknown error)
  28. 2023-02-24 17:42:33,289 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Socket connection established to hdp02/192.168.10.12:2181, initiating session
  29. 2023-02-24 17:42:33,294 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Session establishment complete on server hdp02/192.168.10.12:2181, sessionid = 0x3000001abaf0006, negotiated timeout = 40000
  30. Version: 2.0.5
  31. 2023-02-24 17:42:33,788 INFO  [main] util.HBaseFsck: Computing mapping of all store files
  32. 2023-02-24 17:42:34,024 INFO  [main] util.HBaseFsck: Validating mapping using HDFS state
  33. 2023-02-24 17:42:34,025 INFO  [main] util.HBaseFsck: Computing mapping of all link files
  34. .
  35. 2023-02-24 17:42:34,075 INFO  [main] util.HBaseFsck: Validating mapping using HDFS state
  36. Number of live region servers: 3
  37. Number of dead region servers: 0
  38. Master: hdp01,16000,1677229359514
  39. Number of backup masters: 0
  40. Average load: 1.0
  41. Number of requests: 97
  42. Number of regions: 3
  43. Number of regions in transition: 0
  44. 2023-02-24 17:42:34,168 INFO  [main] util.HBaseFsck: Loading regionsinfo from the hbase:meta table
  45. Number of empty REGIONINFO_QUALIFIER rows in hbase:meta: 0
  46. 2023-02-24 17:42:34,245 INFO  [main] util.HBaseFsck: getTableDescriptors == tableNames => [stu]
  47. 2023-02-24 17:42:34,245 INFO  [main] zookeeper.ReadOnlyZKClient: Connect 0x1ce93c18 to hdp01:2181,hdp02:2181,hdp03:2181 with session timeout=90000ms, retries 30, retry interval 1000ms, keepAlive=60000ms
  48. 2023-02-24 17:42:34,246 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18] zookeeper.ZooKeeper: Initiating client connection, connectString=hdp01:2181,hdp02:2181,hdp03:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$13/499703683@58ca425e
  49. 2023-02-24 17:42:34,247 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Opening socket connection to server hdp02/192.168.10.12:2181. Will not attempt to authenticate using SASL (unknown error)
  50. 2023-02-24 17:42:34,248 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Socket connection established to hdp02/192.168.10.12:2181, initiating session
  51. 2023-02-24 17:42:34,252 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18-SendThread(hdp02:2181)] zookeeper.ClientCnxn: Session establishment complete on server hdp02/192.168.10.12:2181, sessionid = 0x3000001abaf0007, negotiated timeout = 40000
  52. 2023-02-24 17:42:34,264 INFO  [main] client.ConnectionImplementation: Closing master protocol: MasterService
  53. 2023-02-24 17:42:34,264 INFO  [main] zookeeper.ReadOnlyZKClient: Close zookeeper connection 0x1ce93c18 to hdp01:2181,hdp02:2181,hdp03:2181
  54. Number of Tables: 1
  55. 2023-02-24 17:42:34,269 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18] zookeeper.ZooKeeper: Session: 0x3000001abaf0007 closed
  56. 2023-02-24 17:42:34,270 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x1ce93c18-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x3000001abaf0007
  57. 2023-02-24 17:42:34,273 INFO  [main] util.HBaseFsck: Loading region directories from HDFS
  58. 2023-02-24 17:42:34,306 INFO  [main] util.HBaseFsck: Loading region information from HDFS
  59. 2023-02-24 17:42:34,343 INFO  [hbasefsck-pool1-t1] sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
  60. 2023-02-24 17:42:34,343 INFO  [hbasefsck-pool1-t16] sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
  61. 2023-02-24 17:42:34,444 INFO  [main] util.HBaseFsck: Checking and fixing region consistency
  62. 2023-02-24 17:42:34,471 INFO  [main] util.HBaseFsck: Handling overlap merges in parallel. set hbasefsck.overlap.merge.parallel to false to run serially.
  63. Summary:
  64. Table hbase:meta is okay.
  65.     Number of regions: 1
  66.     Deployed on:  hdp02,16020,1677229361706
  67. Table stu is okay.
  68.     Number of regions: 1
  69.     Deployed on:  hdp01,16020,1677229361937
  70. 0 inconsistencies detected.
  71. Status: OK
  72. 2023-02-24 17:42:34,541 INFO  [main] zookeeper.ZooKeeper: Session: 0x40000034a3b0005 closed
  73. 2023-02-24 17:42:34,542 INFO  [main] client.ConnectionImplementation: Closing master protocol: MasterService
  74. 2023-02-24 17:42:34,541 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x40000034a3b0005
  75. 2023-02-24 17:42:34,542 INFO  [main] zookeeper.ReadOnlyZKClient: Close zookeeper connection 0x45fd9a4d to hdp01:2181,hdp02:2181,hdp03:2181
  76. 2023-02-24 17:42:34,547 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d] zookeeper.ZooKeeper: Session: 0x3000001abaf0006 closed
  77. 2023-02-24 17:42:34,547 INFO  [ReadOnlyZKClient-hdp01:2181,hdp02:2181,hdp03:2181@0x45fd9a4d-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x3000001abaf0006
  As the output shows, for the healthy table the number of live region servers is normal and none are dead, and the table's status is reported as OK.
  



  • Try to repair the user_profile table
  1. [whybigdata@hdp01 hbase-2.0.5]$ hbase hbck -fix "user_profile"
  2. 2023-02-24 18:17:24,321 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hbase Fsck connecting to ZooKeeper ensemble=hdp01:2181,hdp02:2181,hdp03:2181
  3. 2023-02-24 18:17:24,328 INFO  [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
  4. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:host.name=hdp01
  5. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:java.version=1.8.0_212
  6. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
  7. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:java.home=/opt/apps/jdk1.8.0_212/jre
  8. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: /hadoop/yarn/hadoop-yarn-server-web-proxy-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-api-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-client-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-common-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-tests-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-services-api-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-services-core-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-common-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-nodemanager-3.1.3.jar:/opt/apps/hadoop-3.1.3/share/hadoop/yarn/hadoop-yarn-server-router-3.1.3.jar
  9. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:java.library.path=/opt/apps/hadoop-3.1.3/lib/native
  10. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
  11. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
  12. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:os.name=Linux
  13. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
  14. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:os.version=3.10.0-862.el7.x86_64
  15. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:user.name=whybigdata
  16. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:user.home=/home/whybigdata
  17. 2023-02-24 18:17:24,329 INFO  [main] zookeeper.ZooKeeper: Client environment:user.dir=/opt/apps/hbase-2.0.5
  18. 2023-02-24 18:17:24,330 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=hdp01:2181,hdp02:2181,hdp03:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@7a362b6b
  19. This option is deprecated, please use  -fixAssignments instead.
  20. 2023-02-24 18:17:24,344 INFO  [main-SendThread(hdp03:2181)] zookeeper.ClientCnxn: Opening socket connection to server hdp03/192.168.10.13:2181. Will not attempt to authenticate using SASL (unknown error)
  21. Allow checking/fixes for table: user_profile
  22. HBaseFsck command line options: -fix user_profile
  23. 2023-02-24 18:17:24,350 INFO  [main-SendThread(hdp03:2181)] zookeeper.ClientCnxn: Socket connection established to hdp03/192.168.10.13:2181, initiating session
  24. 2023-02-24 18:17:24,360 INFO  [main-SendThread(hdp03:2181)] zookeeper.ClientCnxn: Session establishment complete on server hdp03/192.168.10.13:2181, sessionid = 0x4000043ee360001, negotiated timeout = 40000
  25. 2023-02-24 18:17:24,462 INFO  [pool-6-thread-1] util.HBaseFsck: Failed to create lock file hbase-hbck.lock, try=1 of 5
  26. 2023-02-24 18:17:24,665 INFO  [pool-6-thread-1] util.HBaseFsck: Failed to create lock file hbase-hbck.lock, try=2 of 5
  27. 2023-02-24 18:17:25,069 INFO  [pool-6-thread-1] util.HBaseFsck: Failed to create lock file hbase-hbck.lock, try=3 of 5
  28. 2023-02-24 18:17:25,873 INFO  [pool-6-thread-1] util.HBaseFsck: Failed to create lock file hbase-hbck.lock, try=4 of 5
  29. 2023-02-24 18:17:27,476 INFO  [pool-6-thread-1] util.HBaseFsck: Failed to create lock file hbase-hbck.lock, try=5 of 5
  30. 2023-02-24 18:17:30,677 ERROR [main] util.HBaseFsck: Another instance of hbck is fixing HBase, exiting this instance. [If you are sure no other instance is running, delete the lock file hdfs://hdp01:8020/hbase/.tmp/hbase-hbck.lock and rerun the tool]
  31. Exception in thread "main" java.io.IOException: Duplicate hbck - Abort
  32.         at org.apache.hadoop.hbase.util.HBaseFsck.connect(HBaseFsck.java:555)
  33.         at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:5105)
  34.         at org.apache.hadoop.hbase.util.HBaseFsck$HBaseFsckTool.run(HBaseFsck.java:4928)
  35.         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
  36.         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
  37.         at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:4916)

   The error says another running hbck instance is already fixing HBase, but this was my first repair attempt, so in theory there is no other instance. Following the message's suggestion, I deleted the specified lock file, i.e. the hbase-hbck.lock file under the HDFS path /hbase/.tmp.
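  Removing the stale lock file is a one-liner against HDFS; the path comes straight from the error message above:
  # Delete the hbck lock file left behind by the previous run.
  hdfs dfs -rm /hbase/.tmp/hbase-hbck.lock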
  



  • Rerun the repair command above:
  1. [whybigdata@hdp01 hbase-2.0.5]$ hbase hbck -fix "user_profile"
  2. ......
  3. ERROR: option '-fix' is not supportted!
  4. -----------------------------------------------------------------------
  5. NOTE: As of HBase version 2.0, the hbck tool is significantly changed.
  6. In general, all Read-Only options are supported and can be be used
  7. safely. Most -fix/ -repair options are NOT supported. Please see usage
  8. below for details on which options are not supported.
  9. -----------------------------------------------------------------------
  10. Usage: fsck [opts] {only tables}
  11. where [opts] are:
  12.    -help Display help options (this)
  13.    -details Display full report of all regions.
  14.    -timelag <timeInSeconds>  Process only regions that  have not experienced any metadata updates in the last  <timeInSeconds> seconds.
  15.    -sleepBeforeRerun <timeInSeconds> Sleep this many seconds before checking if the fix worked if run with -fix
  16.    -summary Print only summary of the tables and status.
  17.    -metaonly Only check the state of the hbase:meta table.
  18.    -sidelineDir <hdfs://> HDFS path to backup existing meta.
  19.    -boundaries Verify that regions boundaries are the same between META and store files.
  20.    -exclusive Abort if another hbck is exclusive or fixing.
  21.   Datafile Repair options: (expert features, use with caution!)
  22.    -checkCorruptHFiles     Check all Hfiles by opening them to make sure they are valid
  23.    -sidelineCorruptHFiles  Quarantine corrupted HFiles.  implies -checkCorruptHFiles
  24. Replication options
  25.    -fixReplication   Deletes replication queues for removed peers
  26.   Metadata Repair options supported as of version 2.0: (expert features, use with caution!)
  27.    -fixVersionFile   Try to fix missing hbase.version file in hdfs.
  28.    -fixReferenceFiles  Try to offline lingering reference store files
  29.    -fixHFileLinks  Try to offline lingering HFileLinks
  30.    -noHdfsChecking   Don't load/check region info from HDFS. Assumes hbase:meta region info is good. Won't check/fix any HDFS issue, e.g. hole, orphan, or overlap
  31.    -ignorePreCheckPermission  ignore filesystem permission pre-check
  32. NOTE: Following options are NOT supported as of HBase version 2.0+.
  33.   UNSUPPORTED Metadata Repair options: (expert features, use with caution!)
  34.    -fix              Try to fix region assignments.  This is for backwards compatiblity
  35.    -fixAssignments   Try to fix region assignments.  Replaces the old -fix
  36.    -fixMeta          Try to fix meta problems.  This assumes HDFS region info is good.
  37.    -fixHdfsHoles     Try to fix region holes in hdfs.
  38.    -fixHdfsOrphans   Try to fix region dirs with no .regioninfo file in hdfs
  39.    -fixTableOrphans  Try to fix table dirs with no .tableinfo file in hdfs (online mode only)
  40.    -fixHdfsOverlaps  Try to fix region overlaps in hdfs.
  41.    -maxMerge <n>     When fixing region overlaps, allow at most <n> regions to merge. (n=5 by default)
  42.    -sidelineBigOverlaps  When fixing region overlaps, allow to sideline big overlaps
  43.    -maxOverlapsToSideline <n>  When fixing region overlaps, allow at most <n> regions to sideline per group. (n=2 by default)
  44.    -fixSplitParents  Try to force offline split parents to be online.
  45.    -removeParents    Try to offline and sideline lingering parents and keep daughter regions.
  46.    -fixEmptyMetaCells  Try to fix hbase:meta entries not referencing any region (empty REGIONINFO_QUALIFIER rows)
  47.   UNSUPPORTED Metadata Repair shortcuts
  48.    -repair           Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps -fixReferenceFiles-fixHFileLinks
  49.    -repairHoles      Shortcut for -fixAssignments -fixMeta -fixHdfsHoles
  50. ......
  As the usage output shows, the -fix option is no longer supported as of HBase 2.x.
  

   The options that HBase 2.x still supports are the ones listed in the usage output above.
  

   After looking into it, I found that to use repair functionality with HBase 2.x you need HBCK2. It is not shipped with HBase itself: you have to fetch the HBCK2 sources matching your installation, build them, and pick the features you need before you can use it against HBase. (A rough sketch of what that would look like follows.)
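  I did not go down that road in the end, but roughly it would look like the sketch below: HBCK2 lives in the separate hbase-operator-tools project, you build the jar with Maven and pass it to hbase hbck with -j. Treat the exact paths and commands as my untested assumptions; the encoded region names are the ones from the hbck report above.
  # Build HBCK2 from the hbase-operator-tools repository (it is not bundled with HBase).
  git clone https://github.com/apache/hbase-operator-tools.git
  cd hbase-operator-tools
  mvn -DskipTests package

  # Run HBCK2 through the hbase launcher; 'assigns' asks the active master to
  # assign the regions that hbck reported as "not deployed on any region server".
  hbase hbck -j hbase-hbck2/target/hbase-hbck2-*.jar \
      assigns 69f3f6a477f90bdc138e31f08ee909d8 690658266a0b11c87aada6935c91a1f7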
  Ugh...
  After all this back and forth, it still comes down to uninstalling and reinstalling!
  Given my limited skills, I decided to simply uninstall and reinstall HBase.
Completely removing HBase's data



  • Connect to ZooKeeper and open the zk client
  [whybigdata@hadoop103 zookeeper-3.5.7]$ bin/zkCli.sh


  • List what the root znode currently contains
  [zk: localhost:2181(CONNECTED) 4] ls -s /
  [admin, brokers, cluster, config, consumers, controller_epoch, hbase, isr_change_notification, latest_producer_id_block, log_dir_event_notification, zookeeper]cZxid = 0x0
  ctime = Thu Jan 01 08:00:00 CST 1970
  mZxid = 0x0
  mtime = Thu Jan 01 08:00:00 CST 1970
  pZxid = 0x600000002
  cversion = 19
  dataVersion = 0
  aclVersion = 0
  ephemeralOwner = 0x0
  dataLength = 0
  numChildren = 11


  • Delete the /hbase znode
  [zk: localhost:2181(CONNECTED) 5] rmr /hbase
  The command 'rmr' has been deprecated. Please use 'deleteall' instead.
  [zk: localhost:2181(CONNECTED) 6] ls -s /
  [admin, brokers, cluster, config, consumers, controller_epoch, isr_change_notification, latest_producer_id_block, log_dir_event_notification, zookeeper]cZxid = 0x0
  ctime = Thu Jan 01 08:00:00 CST 1970
  mZxid = 0x0
  mtime = Thu Jan 01 08:00:00 CST 1970
  pZxid = 0x800000396
  cversion = 20
  dataVersion = 0
  aclVersion = 0
  ephemeralOwner = 0x0
  dataLength = 0
  numChildren = 10


  • Restart ZooKeeper, restart Hadoop (HDFS and YARN), and restart HBase
   However, stopping the HBase cluster took a very long time: the HMaster process on hdp01 was still there and the stop script never finished.
  –> I killed the HMaster process by force (knowing full well this could cause problems), and sure enough, after restarting all the services the HMaster process no longer showed up. Ouch!
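  In hindsight, a cleaner sequence than killing the process would have been to stop the daemons with HBase's own scripts and only escalate if they truly hang; a sketch using the standard scripts under the HBase bin directory:
  # Graceful cluster-wide shutdown.
  bin/stop-hbase.sh

  # If a single daemon refuses to stop, stop it individually before reaching for kill.
  bin/hbase-daemon.sh stop master
  bin/hbase-daemon.sh stop regionserver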
Uninstalling HBase

   Before uninstalling HBase, carry out the "Completely removing HBase's data" steps described above:
  

  • Delete the /hbase directory on HDFS (this is the directory set by the hbase.rootdir property in hbase-site.xml)
  • Delete the /hbase znode in ZooKeeper
  • Restart the ZooKeeper and Hadoop clusters
  • Delete the HBase installation directory on all three nodes
  • Re-extract and reinstall HBase (see the sketch after this list)
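  Put together, the clean-up looks roughly like the sketch below. The install path /opt/apps/hbase-2.0.5 is the one used on this cluster; the tarball name and the zkCli invocation are assumptions on my part.
  # 1. Remove HBase's data directory on HDFS (the hbase.rootdir from hbase-site.xml).
  hdfs dfs -rm -r -skipTrash /hbase

  # 2. Remove the /hbase znode in ZooKeeper.
  zkCli.sh -server hdp01:2181 deleteall /hbase

  # 3. Restart ZooKeeper and Hadoop with your usual cluster scripts (not shown).

  # 4. Remove the HBase installation directory on all three nodes.
  for host in hdp01 hdp02 hdp03; do
      ssh "$host" rm -rf /opt/apps/hbase-2.0.5
  done

  # 5. Re-extract the distribution and redo the configuration.
  tar -zxvf hbase-2.0.5-bin.tar.gz -C /opt/apps/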
  References

https://www.playpi.org/2019101201.html
https://www.cnblogs.com/data-magnifier/p/15383318.html
