Table of Contents
1. Component environment
2. Installing DataX
2.1 Download and extract DataX
3. Installing DataX-Web
3.0 Download the DataX-Web source code and build it
3.1 Create the DataX-Web metadata database in MySQL
3.2 Install DataX-Web
3.2.1 Run install.sh to extract and deploy
3.2.2 Manually edit the datax-admin configuration file
3.2.3 Manually edit the datax-executor configuration file
3.2.4 Replace the Python launcher files under datax/bin
3.2.5 Replace the MySQL JDBC jar files
4. Creating the MySQL and Hive data
4.1 Create the MySQL database
4.2 Create the Hive database
5. Configuring DataX and DataX-Web
5.1 Start the DataX-Web services
5.2 Web UI access and configuration
5.2.1 Create a project
5.2.2 Add the MySQL and Hive data sources
5.2.3 Create a DataX task template
5.2.4 Build the task
5.2.5 View the created task
5.2.6 Insert data in Hive
6. Checking the result
1. Component environment

| Name | Version | Description | Download |
| --- | --- | --- | --- |
| hadoop | 3.4.0 | official binary release | |
| hive | 3.1.3 | built from source | |
| mysql | 8.0.31 | | |
| datax | 0.0.1-SNAPSHOT | | http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz |
| datax-web | datax-web-2.1.2 | built from source | github.com |
| centos | centos7 | x86 | |
| java | 1.8 | | |

2. Installing DataX
2.1 Download and extract DataX
http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
tar -zxvf datax.tar.gz
Extract it into the /cluster directory.
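A minimal sketch of the download and extraction steps (assuming wget is available and /cluster already exists):
# download the official DataX package and unpack it into /cluster
wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
tar -zxvf datax.tar.gz -C /cluster
cd /cluster/datax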
Run the bundled test job:
./bin/datax.py job/job.json
This fails with:
File "/cluster/datax/bin/./datax.py", line 114
print readerRef
^^^^^^^^^^^^^^^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(...)?
Explanation: both Python 2 and Python 3 are installed on this system and Python 3 is the default, but the scripts under datax/bin only support Python 2, so the interpreter has to be specified explicitly as python2. These three scripts will be backed up and replaced later; datax-web ships Python 3 compatible versions under its doc directory.
- Python (2.x) (to support Python 3, replace the three Python files under datax/bin; the replacement files are under doc/datax-web/datax-python3). Required; it is mainly used by the scheduler to launch the underlying DataX job. The default mode runs DataX as a Java child process; users may customize this to a Python-based launch.
- Reference: datax-web/doc/datax-web/datax-web-deploy.md at master · WeiYe-Jing/datax-web (github.com)
Running it explicitly with python2 works:
python2 bin/datax.py job/job.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2024-10-13 21:04:41.543 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2024-10-13 21:04:41.553 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.40-b25
jvmInfo: Linux amd64 5.8.13-1.el7.elrepo.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2024-10-13 21:04:41.575 [main] INFO Engine -
{
"content":[
{
"reader":{
"name":"streamreader",
"parameter":{
"column":[
{
"type":"string",
"value":"DataX"
},
{
"type":"long",
"value":19890604
},
{
"type":"date",
"value":"1989-06-04 00:00:00"
},
{
"type":"bool",
"value":true
},
{
"type":"bytes",
"value":"test"
}
],
"sliceRecordCount":100000
}
},
"writer":{
"name":"streamwriter",
"parameter":{
"encoding":"UTF-8",
"print":false
}
}
}
],
"setting":{
"errorLimit":{
"percentage":0.02,
"record":0
},
"speed":{
"byte":10485760
}
}
}
2024-10-13 21:04:41.599 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2024-10-13 21:04:41.601 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2024-10-13 21:04:41.601 [main] INFO JobContainer - DataX jobContainer starts job.
2024-10-13 21:04:41.604 [main] INFO JobContainer - Set jobId = 0
2024-10-13 21:04:41.623 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2024-10-13 21:04:41.624 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do prepare work .
2024-10-13 21:04:41.624 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do prepare work .
2024-10-13 21:04:41.624 [job-0] INFO JobContainer - jobContainer starts to do split ...
2024-10-13 21:04:41.625 [job-0] INFO JobContainer - Job set Max-Byte-Speed to 10485760 bytes.
2024-10-13 21:04:41.626 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.
2024-10-13 21:04:41.627 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.
2024-10-13 21:04:41.649 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2024-10-13 21:04:41.654 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2024-10-13 21:04:41.657 [job-0] INFO JobContainer - Running by standalone Mode.
2024-10-13 21:04:41.666 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2024-10-13 21:04:41.671 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2024-10-13 21:04:41.672 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2024-10-13 21:04:41.685 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2024-10-13 21:04:41.986 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[302]ms
2024-10-13 21:04:41.987 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2024-10-13 21:04:51.677 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.022s | All Task WaitReaderTime 0.040s | Percentage 100.00%
2024-10-13 21:04:51.677 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2024-10-13 21:04:51.678 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do post work.
2024-10-13 21:04:51.678 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do post work.
2024-10-13 21:04:51.678 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2024-10-13 21:04:51.680 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /cluster/datax/hook
2024-10-13 21:04:51.682 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s
PS Scavenge | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s
2024-10-13 21:04:51.682 [job-0] INFO JobContainer - PerfTrace not enable!
2024-10-13 21:04:51.683 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.022s | All Task WaitReaderTime 0.040s | Percentage 100.00%
2024-10-13 21:04:51.684 [job-0] INFO JobContainer -
任务启动时刻 : 2024-10-13 21:04:41
任务结束时刻 : 2024-10-13 21:04:51
任务总计耗时 : 10s
任务平均流量 : 253.91KB/s
记录写入速度 : 10000rec/s
读出记录总数 : 100000
读写失败总数 : 0
To print a blank job template for a given reader/writer pair, DataX provides the -r/-w options:
./bin/datax.py -r hdfsreader -w mysqlwriter
3. Installing DataX-Web
3.0 Download the DataX-Web source code and build it
git@github.com:WeiYe-Jing/datax-web.git
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
- After a successful build, the packaged DataX distribution is under {DataX_source_code_home}/target/datax/datax/ with the following layout:
- $ cd {DataX_source_code_home}
- $ ls ./target/datax/datax/
- bin conf job lib log log_perf plugin script tmp
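Putting the two commands above together, a minimal clone-and-build sketch (assuming git and Maven are installed; the exact output location of the datax-web package may vary by release):
# fetch the datax-web sources and build the distribution, skipping tests
git clone git@github.com:WeiYe-Jing/datax-web.git
cd datax-web
mvn -U clean package assembly:assembly -Dmaven.test.skip=true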
3.1 Create the DataX-Web metadata database in MySQL
mysql -u root -p
password:******
CREATE DATABASE dataxweb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
use dataxweb;
3.2 Install DataX-Web
3.2.1 Run install.sh to extract and deploy
In the /cluster/datax-web-2.1.2/bin directory, run ./install.sh and follow the prompts.
When it finishes, the module files are extracted and the metadata database is initialized.
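A minimal sketch of this step (the installer prompts for the MySQL connection created in section 3.1; treat the exact prompt wording as release-dependent):
cd /cluster/datax-web-2.1.2/bin
# answer the prompts with the host, port, user and password of the dataxweb database
./install.sh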
3.2.2 Manually edit the datax-admin configuration file
/cluster/datax-web-2.1.2/modules/datax-admin/bin/env.properties
Contents:
# environment variables
JAVA_HOME=/java/jdk
WEB_LOG_PATH=/cluster/datax-web-2.1.2/modules/datax-admin/logs
WEB_CONF_PATH=/cluster/datax-web-2.1.2/modules/datax-admin/conf
DATA_PATH=/cluster/datax-web-2.1.2/modules/datax-admin/data
SERVER_PORT=6895
PID_FILE_PATH=/cluster/datax-web-2.1.2/modules/datax-admin/dataxadmin.pid
# mail account
MAIL_USERNAME="example@qq.com"
MAIL_PASSWORD="*********************"
#debug
REMOTE_DEBUG_SWITCH=true
REMOTE_DEBUG_PORT=7223
3.2.3 Manually edit the datax-executor configuration file
/cluster/datax-web-2.1.2/modules/datax-executor/bin/env.properties
Contents (the key setting is PYTHON_PATH=/cluster/datax/bin/datax.py):
# environment variables
JAVA_HOME=/java/jdk
SERVICE_LOG_PATH=/cluster/datax-web-2.1.2/modules/datax-executor/logs
SERVICE_CONF_PATH=/cluster/datax-web-2.1.2/modules/datax-executor/conf
DATA_PATH=/cluster/datax-web-2.1.2/modules/datax-executor/data
## location where DataX job JSON files are stored
JSON_PATH=/cluster/datax-web-2.1.2/modules/datax-executor/json
## executor_port
EXECUTOR_PORT=9999
## keep this consistent with the datax-admin port
DATAX_ADMIN_PORT=6895
## path of the Python launcher script
PYTHON_PATH=/cluster/datax/bin/datax.py
## datax-web executor service port
SERVER_PORT=9504
PID_FILE_PATH=/cluster/datax-web-2.1.2/modules/datax-executor/service.pid
#debug remote debugging port
REMOTE_DEBUG_SWITCH=true
REMOTE_DEBUG_PORT=7224
3.2.4 Replace the Python launcher files under datax/bin
- Python (2.x) (to support Python 3, replace the three Python files under datax/bin; the replacement files are under doc/datax-web/datax-python3). Required; it is mainly used by the scheduler to launch the underlying DataX job. The default mode runs DataX as a Java child process; users may customize this to a Python-based launch.
- The replacement files can be taken from the datax-web source tree (doc/datax-web/datax-python3).
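A hedged sketch of the swap, assuming the three launcher scripts under datax/bin are datax.py, dxprof.py and perftrace.py (verify the names in your copy) and that the datax-web sources were cloned to ~/datax-web (hypothetical path; adjust to your checkout):
# back up the original Python 2 launchers
mkdir -p /cluster/datax/bin/python2-backup
cp /cluster/datax/bin/datax.py /cluster/datax/bin/dxprof.py /cluster/datax/bin/perftrace.py /cluster/datax/bin/python2-backup/
# copy in the Python 3 compatible versions shipped with datax-web
cp ~/datax-web/doc/datax-web/datax-python3/*.py /cluster/datax/bin/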
3.2.5 Replace the MySQL JDBC jar files
The MySQL JDBC jars bundled with DataX's mysqlreader and mysqlwriter plugins are too old for a MySQL 8 database, so they need to be replaced.
The target paths are:
/cluster/datax/plugin/writer/mysqlwriter/libs/mysql-connector-j-8.3.0.jar
and
/cluster/datax/plugin/reader/mysqlreader/libs/mysql-connector-j-8.3.0.jar
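A sketch of the jar swap, assuming the new driver was downloaded to /tmp and the bundled jar is an old mysql-connector-java 5.x build (check the actual file name before deleting):
# mysqlreader plugin
rm /cluster/datax/plugin/reader/mysqlreader/libs/mysql-connector-java-*.jar
cp /tmp/mysql-connector-j-8.3.0.jar /cluster/datax/plugin/reader/mysqlreader/libs/
# mysqlwriter plugin
rm /cluster/datax/plugin/writer/mysqlwriter/libs/mysql-connector-java-*.jar
cp /tmp/mysql-connector-j-8.3.0.jar /cluster/datax/plugin/writer/mysqlwriter/libs/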
4. Creating the MySQL and Hive data
4.1 Create the MySQL database
Create the MySQL table:
-- m31094.mm definition
CREATE TABLE `mm` (
`uuid` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
`name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`time` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`age` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`sex` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`job` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`address` text COLLATE utf8mb4_unicode_ci,
PRIMARY KEY (`uuid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Alternatively, the table can also be created without the uuid primary key.
4.2 Create the Hive database
Create the Hive table:
create database m31094;
drop table m31094.mm;
CREATE TABLE m31094.mm (
`uuid` STRING COMMENT '主键',
`name` STRING COMMENT '姓名',
`time` STRING COMMENT '时间',
`age` STRING COMMENT '年事',
`sex` STRING COMMENT '性别',
`job` STRING COMMENT '工作',
`address` STRING COMMENT '所在'
) COMMENT '美女表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Table creation details:
Query the table description in Hive:
DESCRIBE FORMATTED m31094.mm;
5. Configuring DataX and DataX-Web
5.1 Start the DataX-Web services
Startup commands:
/cluster/datax-web-2.1.2/modules/datax-admin/bin/datax-admin.sh restart
tail -f /cluster/datax-web-2.1.2/modules/datax-admin/bin/console.out
/cluster/datax-web-2.1.2/modules/datax-executor/bin/datax-executor.sh restart
tail -f /cluster/datax-web-2.1.2/modules/datax-executor/bin/console.out
Check the running processes with jps -l.
To kill the processes:
sudo kill -9 $(ps -ef|grep datax|gawk '$0 !~/grep/ {print $2}' |tr -s '\n' ' ')
datax-executor must start successfully; its automatic registration can then be seen in the web UI.
5.2 Web UI access and configuration
http://ip:6895/index.html
6895 is a custom port; adjust it to match your actual configuration.
Login username/password: admin/123456
5.2.1 Create a project
5.2.2 Add the MySQL and Hive data sources
MySQL
HIVE
5.2.3 Create a DataX task template
5.2.4 Build the task
Step 1: configure the input (reader)
Step 2: configure the output (writer)
Step 3: map the fields
Step 4: build, select the task template, copy the generated JSON, and proceed to the next step.
{
"job": {
"setting": {
"speed": {
"channel": 3,
"byte": 1048576
},
"errorLimit": {
"record": 0,
"percentage": 0.02
}
},
"content": [
{
"reader": {
"name": "hdfsreader",
"parameter": {
"path": "/cluster/hive/warehouse/m31094.db/mm",
"defaultFS": "hdfs://10.7.215.33:8020",
"fileType": "text",
"fieldDelimiter": ",",
"skipHeader": false,
"column": [
{
"index": "0",
"type": "string"
},
{
"index": "1",
"type": "string"
},
{
"index": "2",
"type": "string"
},
{
"index": "3",
"type": "string"
},
{
"index": "4",
"type": "string"
},
{
"index": "5",
"type": "string"
},
{
"index": "6",
"type": "string"
}
]
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"username": "yRjwDFuoPKlqya9h9H2Amg==",
"password": "XCYVpFosvZBBWobFzmLWvA==",
"column": [
"`uuid`",
"`name`",
"`time`",
"`age`",
"`sex`",
"`job`",
"`address`"
],
"connection": [
{
"table": [
"mm"
],
"jdbcUrl": "jdbc:mysql://10.7.215.33:3306/m31094"
}
]
}
}
}
]
}
}
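Before scheduling it, the generated JSON can be run once directly against DataX as a sanity check (a hedged example; the file name is hypothetical, and the username/password fields in UI-generated JSON are encrypted by datax-web, so replace them with plaintext credentials for a manual run):
python2 /cluster/datax/bin/datax.py /tmp/hive2mysql.json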
5.2.5 View the created task
5.2.6 Insert data in Hive
insert into m31094.mm values('9','hive数据使用datax同步到MySQL',from_unixtime(unix_timestamp()),'1000000000090101','北京','新疆','加油');
Console output:
6. Checking the result
Job execution log:
- 2024-10-13 21:30:00 [JobThread.run-130] <br>----------- datax-web job execute start -----------<br>----------- Param:
- 2024-10-13 21:30:00 [BuildCommand.buildDataXParam-100] ------------------Command parameters:
- 2024-10-13 21:30:00 [ExecutorJobHandler.execute-57] ------------------DataX process id: 95006
- 2024-10-13 21:30:00 [ProcessCallbackThread.callbackLog-186] <br>----------- datax-web job callback finish.
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.588 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.597 [main] INFO Engine - the machine info =>
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] osInfo: Oracle Corporation 1.8 25.40-b25
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] jvmInfo: Linux amd64 5.8.13-1.el7.elrepo.x86_64
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] cpu num: 8
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] totalPhysicalMemory: -0.00G
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] freePhysicalMemory: -0.00G
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] maxFileDescriptorCount: -1
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] currentOpenFileDescriptorCount: -1
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] GC Names [PS MarkSweep, PS Scavenge]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] MEMORY_NAME | allocation_size | init_size
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] PS Eden Space | 256.00MB | 256.00MB
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] Code Cache | 240.00MB | 2.44MB
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] Compressed Class Space | 1,024.00MB | 0.00MB
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] PS Survivor Space | 42.50MB | 42.50MB
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] PS Old Gen | 683.00MB | 683.00MB
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] Metaspace | -0.00MB | 0.00MB
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.622 [main] INFO Engine -
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "content":[
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "reader":{
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "name":"hdfsreader",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "parameter":{
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "column":[
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "index":"0",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "type":"string"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] },
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "index":"1",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "type":"string"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] },
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "index":"2",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "type":"string"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] },
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "index":"3",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "type":"string"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] },
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "index":"4",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "type":"string"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] },
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "index":"5",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "type":"string"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] },
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "index":"6",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "type":"string"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] ],
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "defaultFS":"hdfs://10.7.215.33:8020",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "fieldDelimiter":",",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "fileType":"text",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "path":"hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "skipHeader":false
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] },
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "writer":{
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "name":"mysqlwriter",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "parameter":{
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "column":[
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "`uuid`",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "`name`",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "`time`",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "`age`",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "`sex`",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "`job`",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "`address`"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] ],
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "connection":[
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] {
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "jdbcUrl":"jdbc:mysql://10.7.215.33:3306/m31094",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "table":[
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "mm"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] ]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] ],
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "password":"******",
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "username":"root"
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] ],
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "setting":{
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "errorLimit":{
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "percentage":0.02,
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "record":0
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] },
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "speed":{
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "byte":1048576,
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] "channel":3
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] }
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.645 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.647 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.648 [main] INFO JobContainer - DataX jobContainer starts job.
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.650 [main] INFO JobContainer - Set jobId = 0
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.667 [job-0] INFO HdfsReader$Job - init() begin...
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.993 [job-0] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":[]}
- 2024-10-13 21:30:00 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:00.994 [job-0] INFO HdfsReader$Job - init() ok and end...
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:01.580 [job-0] INFO OriginalConfPretreatmentUtil - table:[mm] all columns:[
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] uuid,name,time,age,sex,job,address
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] ].
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:01.613 [job-0] INFO OriginalConfPretreatmentUtil - Write data [
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] INSERT INTO %s (`uuid`,`name`,`time`,`age`,`sex`,`job`,`address`) VALUES(?,?,?,?,?,?,?)
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] ], which jdbcUrl like:[jdbc:mysql://10.7.215.33:3306/m31094?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true]
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:01.614 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:01.614 [job-0] INFO JobContainer - DataX Reader.Job [hdfsreader] do prepare work .
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:01.614 [job-0] INFO HdfsReader$Job - prepare(), start to getAllFiles...
- 2024-10-13 21:30:01 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:01.614 [job-0] INFO HdfsReader$Job - get HDFS all files in path = [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm]
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.699 [job-0] INFO HdfsReader$Job - [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0]是[text]类型的文件, 将该文件加入source files列表
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.709 [job-0] INFO HdfsReader$Job - [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_1]是[text]类型的文件, 将该文件加入source files列表
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.718 [job-0] INFO HdfsReader$Job - [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_2]是[text]类型的文件, 将该文件加入source files列表
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.728 [job-0] INFO HdfsReader$Job - [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_3]是[text]类型的文件, 将该文件加入source files列表
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.737 [job-0] INFO HdfsReader$Job - [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_4]是[text]类型的文件, 将该文件加入source files列表
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.759 [job-0] INFO HdfsReader$Job - [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_5]是[text]类型的文件, 将该文件加入source files列表
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.771 [job-0] INFO HdfsReader$Job - [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_6]是[text]类型的文件, 将该文件加入source files列表
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.771 [job-0] INFO HdfsReader$Job - 您即将读取的文件数为: [7], 列表为: [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_5,hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_4,hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_3,hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_2,hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_1,hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0,hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_6]
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.772 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.773 [job-0] INFO JobContainer - jobContainer starts to do split ...
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.774 [job-0] INFO JobContainer - Job set Max-Byte-Speed to 1048576 bytes.
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.775 [job-0] INFO HdfsReader$Job - split() begin...
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.777 [job-0] INFO JobContainer - DataX Reader.Job [hdfsreader] splits to [7] tasks.
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.779 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [7] tasks.
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.797 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.810 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.813 [job-0] INFO JobContainer - Running by standalone Mode.
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.825 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [7] tasks.
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.830 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.831 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.845 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[2] attemptCount[1] is started
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.883 [0-0-2-reader] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.885 [0-0-2-reader] INFO Reader$Task - read start
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.886 [0-0-2-reader] INFO Reader$Task - reading file : [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_3]
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.918 [0-0-2-reader] INFO UnstructuredStorageReaderUtil - CsvReader使用默认值[{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":",","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":""","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}],csvReaderConfig值为[null]
- 2024-10-13 21:30:02 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:02.925 [0-0-2-reader] INFO Reader$Task - end read source files...
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.247 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[2] is successed, used[403]ms
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.250 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[5] attemptCount[1] is started
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.286 [0-0-5-reader] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.287 [0-0-5-reader] INFO Reader$Task - read start
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.287 [0-0-5-reader] INFO Reader$Task - reading file : [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.290 [0-0-5-reader] INFO UnstructuredStorageReaderUtil - CsvReader使用默认值[{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":",","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":""","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}],csvReaderConfig值为[null]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.292 [0-0-5-reader] INFO Reader$Task - end read source files...
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.351 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[5] is successed, used[101]ms
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.354 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[4] attemptCount[1] is started
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.379 [0-0-4-reader] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.380 [0-0-4-reader] INFO Reader$Task - read start
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.380 [0-0-4-reader] INFO Reader$Task - reading file : [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_1]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.384 [0-0-4-reader] INFO UnstructuredStorageReaderUtil - CsvReader使用默认值[{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":",","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":""","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}],csvReaderConfig值为[null]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.386 [0-0-4-reader] INFO Reader$Task - end read source files...
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.454 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[4] is successed, used[101]ms
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.457 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.486 [0-0-0-reader] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.487 [0-0-0-reader] INFO Reader$Task - read start
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.487 [0-0-0-reader] INFO Reader$Task - reading file : [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_5]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.489 [0-0-0-reader] INFO UnstructuredStorageReaderUtil - CsvReader使用默认值[{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":",","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":""","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}],csvReaderConfig值为[null]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.491 [0-0-0-reader] INFO Reader$Task - end read source files...
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.558 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[101]ms
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.561 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[1] attemptCount[1] is started
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.587 [0-0-1-reader] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.588 [0-0-1-reader] INFO Reader$Task - read start
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.588 [0-0-1-reader] INFO Reader$Task - reading file : [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_4]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.592 [0-0-1-reader] INFO UnstructuredStorageReaderUtil - CsvReader使用默认值[{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":",","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":""","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}],csvReaderConfig值为[null]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.594 [0-0-1-reader] INFO Reader$Task - end read source files...
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.662 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[1] is successed, used[101]ms
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.664 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[3] attemptCount[1] is started
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.691 [0-0-3-reader] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.691 [0-0-3-reader] INFO Reader$Task - read start
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.691 [0-0-3-reader] INFO Reader$Task - reading file : [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_2]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.694 [0-0-3-reader] INFO UnstructuredStorageReaderUtil - CsvReader使用默认值[{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":",","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":""","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}],csvReaderConfig值为[null]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.696 [0-0-3-reader] INFO Reader$Task - end read source files...
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.765 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[3] is successed, used[101]ms
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.768 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[6] attemptCount[1] is started
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.791 [0-0-6-reader] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.791 [0-0-6-reader] INFO Reader$Task - read start
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.791 [0-0-6-reader] INFO Reader$Task - reading file : [hdfs://10.7.215.33:8020/cluster/hive/warehouse/m31094.db/mm/000000_0_copy_6]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.795 [0-0-6-reader] INFO UnstructuredStorageReaderUtil - CsvReader使用默认值[{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":",","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":""","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}],csvReaderConfig值为[null]
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.798 [0-0-6-reader] INFO Reader$Task - end read source files...
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.868 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[6] is successed, used[100]ms
- 2024-10-13 21:30:03 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:03.869 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.838 [job-0] INFO StandAloneJobContainerCommunicator - Total 7 records, 282 bytes | Speed 28B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.028s | Percentage 100.00%
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.838 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.838 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do post work.
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.839 [job-0] INFO JobContainer - DataX Reader.Job [hdfsreader] do post work.
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.839 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.840 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /cluster/datax/hook
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.841 [job-0] INFO JobContainer -
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] [total cpu info] =>
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] averageCpu | maxDeltaCpu | minDeltaCpu
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] -1.00% | -1.00% | -1.00%
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] [total gc info] =>
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] PS MarkSweep | 1 | 1 | 1 | 0.040s | 0.040s | 0.040s
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] PS Scavenge | 1 | 1 | 1 | 0.022s | 0.022s | 0.022s
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.841 [job-0] INFO JobContainer - PerfTrace not enable!
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.842 [job-0] INFO StandAloneJobContainerCommunicator - Total 7 records, 282 bytes | Speed 28B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.028s | Percentage 100.00%
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 2024-10-13 21:30:12.843 [job-0] INFO JobContainer -
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 任务启动时刻 : 2024-10-13 21:30:00
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 任务结束时刻 : 2024-10-13 21:30:12
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 任务总计耗时 : 12s
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 任务平均流量 : 28B/s
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 记录写入速度 : 0rec/s
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 读出记录总数 : 7
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 读写失败总数 : 0
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53]
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 十月 13, 2024 9:30:01 下午 org.apache.hadoop.util.NativeCodeLoader <clinit>
- 2024-10-13 21:30:12 [AnalysisStatistics.analysisStatisticsLog-53] 警告: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- 2024-10-13 21:30:12 [JobThread.run-165] <br>----------- datax-web job execute end(finish) -----------<br>----------- ReturnT:ReturnT [code=200, msg=LogStatistics{taskStartTime=2024-10-13 21:30:00, taskEndTime=2024-10-13 21:30:12, taskTotalTime=12s, taskAverageFlow=28B/s, taskRecordWritingSpeed=0rec/s, taskRecordReaderNum=7, taskRecordWriteFailNum=0}, content=null]
- 2024-10-13 21:30:12 [TriggerCallbackThread.callbackLog-186] <br>----------- datax-web job callback finish.
Query the MySQL database to verify the synced data:
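For example (a simple check; the row count depends on how many inserts were executed in Hive):
SELECT COUNT(*) FROM m31094.mm;
SELECT * FROM m31094.mm LIMIT 10;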