Table of Contents
Sqoop Installation
Configuring MySQL
Installing Sqoop
Sqoop Command Set
Using Sqoop
Creating a Hive Table with Sqoop
Sqoop Full Table Import
Sqoop Incremental Import
Sqoop Full Table Export
Sqoop Partitioned Table Import
Sqoop Partitioned Table Export
Lab Exercises
Sqoop Installation
Configuring MySQL
-- Create the database
create database test DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
show databases;
-- Create the account
Create user 'test'@'%' identified by 'test';
Grant all privileges on test.* to test@'%' identified by 'test' with grant option;
-- Reload the privilege tables
flush privileges;
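The import and export examples later in this guide read from and write to a student2 table in this test database. The original does not show its DDL; as a minimal sketch (the two-column layout and the sample rows are assumptions, not from the original), it could be created like this:

mysql -utest -ptest test -e "
CREATE TABLE IF NOT EXISTS student2 (
  id   INT PRIMARY KEY,  -- assumed key column
  name VARCHAR(50)       -- assumed name column
);
INSERT INTO student2 VALUES (1001, 'zhangsan'), (1002, 'lisi');  -- sample rows for testing
"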
Installing Sqoop

# 1. Copy the installation package and the MySQL JAR to the /root directory, then extract
tar -zvxf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz -C /opt/module/

# 2. Configure environment variables
vim /etc/profile.d/my_env.sh
# Add the Sqoop path
#SQOOP_HOME
export SQOOP_HOME=/opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha
export PATH=$PATH:$SQOOP_HOME/bin
# Reload the environment variables
source /etc/profile

# 3. Configure Sqoop
cd $SQOOP_HOME/conf
cp sqoop-env-template.sh sqoop-env.sh
vim sqoop-env.sh
# Configuration entries
export HADOOP_COMMON_HOME=/opt/module/hadoop-3.1.3
export HADOOP_MAPRED_HOME=/opt/module/hadoop-3.1.3
export HIVE_HOME=/opt/module/apache-hive-3.1.2-bin
#export ZOOKEEPER_HOME=/opt/module/zookeeper-3.4.10

# Copy the JDBC driver into Sqoop's lib directory
cp mysql-connector-java-5.1.27-bin.jar /opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib/

# Connectivity test
sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root --password root

Sqoop Command Set
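The command listing for this section did not survive formatting, but Sqoop can print it itself. These are standard Sqoop invocations:

# List all available Sqoop tools (import, export, create-hive-table, list-databases, ...)
sqoop help
# Show the options accepted by a specific tool, e.g. import
sqoop help import
# Confirm the installed version
sqoop version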
Using Sqoop
Creating a Hive Table with Sqoop
$ bin/sqoop create-hive-table \
--connect jdbc:mysql://hadoop102:3306/company \
--username test \
--password test \
--table test \
--hive-table test
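To confirm the table definition actually landed in Hive, a quick check is to describe it. The beeline URL below assumes HiveServer2 runs on hadoop102 with the default port and database, which the original does not state:

beeline -u jdbc:hive2://hadoop102:10000/default \
  -n root -p root -e 'DESCRIBE FORMATTED test;'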
Sqoop Full Table Import

#!/bin/bash
sqoop import \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
--table student2 \
--hive-import \
--delete-target-dir \
--hive-database db_hive \
--fields-terminated-by "\t" \
--target-dir "/user/hive/warehouse/db_hive.db/student2_sqoop" \
--hive-table student2_sqoop \
-m 1
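A simple sanity check after a full import is to compare row counts on both sides. This snippet is an addition, not part of the original, and reuses the connection settings from the script above:

# Row count in the source MySQL table
mysql -utest -ptest test -N -e 'select count(*) from student2;'
# Row count in the imported Hive table
beeline -u jdbc:hive2://hadoop100:10000/db_hive \
  -n root -p root --showHeader=false --outputformat=csv2 \
  -e 'select count(*) from student2_sqoop;'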
Sqoop Incremental Import

bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/test \
--username root \
--password root \
--table emp \
--check-column deptno \
--incremental lastmodified \
--last-value "10" \
--m 1 \
--append
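Note that lastmodified mode is intended for a timestamp check column; for a monotonically increasing numeric key, append mode is the usual fit. A sketch of the append variant (the empno column and the last value are assumptions, not from the original):

bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/test \
--username root \
--password root \
--table emp \
--check-column empno \
--incremental append \
--last-value 7788 \
--m 1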
Sqoop Full Table Export

#!/bin/bash
sqoop export \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
-m 1 \
--table student2 \
--input-fields-terminated-by ',' \
--export-dir '/user/hive/warehouse/db_hive.db/student2'
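sqoop export requires the target MySQL table to exist beforehand. One way to confirm the connection and check the result without leaving Sqoop is sqoop eval, which runs an ad-hoc query against the RDBMS; this check is an addition, not part of the original:

sqoop eval \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
--query "select count(*) from student2"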
Sqoop Partitioned Table Import
#!/bin/bash
sqoop_import(){
sqoop import \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
--query "select * from maket_p where cast(year(ord_date) as decimal)='$part' and \$CONDITIONS" \
--hive-import \
--create-hive-table \
--hive-overwrite \
--fields-terminated-by "\t" \
--hive-database db_hive \
--hive-table market_sqoop \
--target-dir "/user/hive/warehouse/db_hive.db/market_sqoop/type_p=$part/" \
--hive-partition-key type_p \
--hive-partition-value "$part" \
-m 1
}

for part in `mysql -uroot -proot --database=test -N -e \
"select distinct cast(year(ord_date) as decimal) from maket_p order by cast(year(ord_date) as decimal)"`
do
echo "Importing data for year $part ..."
sqoop_import
done

beeline -u jdbc:hive2://hadoop100:10000/db_hive \
-n root -p root --outputformat=csv2 --showHeader=false \
-e 'msck repair table market_sqoop;'
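After the loop and the msck repair finish, listing the partitions should show one entry per distinct year. This verification step is an addition, using the same connection settings as the script:

beeline -u jdbc:hive2://hadoop100:10000/db_hive \
  -n root -p root -e 'show partitions market_sqoop;'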
Sqoop Partitioned Table Export

#!/bin/bash
sqoop_maket(){
sqoop export \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
-m 1 \
--table maket_p \
--input-fields-terminated-by '\t' \
--export-dir "/user/hive/warehouse/db_hive.db/maket_p/$val/"
}

part=`beeline -u jdbc:hive2://hadoop100:10000/db_hive \
-n root -p root --outputformat=csv2 --showHeader=false \
-e 'show partitions maket_p;'`

for val in $part
do
echo $val
sqoop_maket
done
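The loop above works because show partitions prints one key=value line per partition, which matches the partition subdirectory names under the table's warehouse path. The directories can be confirmed directly (an added check, using the path from the script):

hdfs dfs -ls /user/hive/warehouse/db_hive.db/maket_p/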
Lab Exercises

1. Full table export (table of your choice): export one table from the Hive database into a MySQL table:

#!/bin/bash
sqoop export \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
--table student2 \
--export-dir /user/hive/warehouse/db_hive.db/student2 \
--fields-terminated-by "," \
-m 1
2. Full table import:

#!/bin/bash
sqoop import \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
--table student2 \
--delete-target-dir \
--create-hive-table \
--hive-import \
--hive-database db_hive \
--hive-overwrite \
--hive-table student2_sqoop \
--target-dir /user/hive/warehouse/db_hive.db/student2_sqoop \
--input-fields-terminated-by ',' \
-m 1
3. Partitioned table export:

#!/bin/bash
sqoop_comm(){
sqoop export \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
--table log_ \
--export-dir /user/hive/warehouse/db_hive.db/log/$part \
--fields-terminated-by '\t' \
-m 1
}
for part in \
`beeline -u "jdbc:hive2://hadoop100:10000/db_hive" \
--showHeader=False --outputformat=csv2 \
-n root -p 123456 -e 'show partitions log'`
do
echo "-----------$part-----------"
sqoop_comm
done
4. Partitioned table import:

#!/bin/bash
sqoop_comm(){
sqoop import \
--connect "jdbc:mysql://hadoop100:3306/test?useUnicode=true&characterEncoding=utf-8" \
--username test \
--password test \
--query "select * from log_ where date_format(cast(date_ as date),'%Y%m%d')=$part and \$CONDITIONS" \
--delete-target-dir \
--create-hive-table \
--hive-import \
--hive-overwrite \
--hive-table sqoop_log \
--hive-database db_hive \
--hive-partition-key load_date \
--hive-partition-value $part \
--target-dir /user/hive/warehouse/db_hive.db/sqoop_log/load_date=$part \
--input-fields-terminated-by '\t' \
-m 1
}
for part in \
`mysql -utest -ptest --database=test -Ne "select distinct DATE_FORMAT(cast(date_ as date),'%Y%m%d') from log_"`
do
echo "---------$part---------"
sqoop_comm
done
beeline -u "jdbc:hive2://hadoop100:10000/db_hive" \
--showHeader=False --outputformat=csv2 \
-n root -p 123456 -e 'msck repair table sqoop_log;'