不到断气不罢休 posted on 2024-03-09 21:23:00

Talking About the Paimon Streaming Data Lake (Part 5)

Starting from a demo, this post walks through the full process of setting up a Paimon/Flink project and records the pitfalls hit along the way.
Creating a Flink Project

Create a Flink project in IDEA. Since there is no Flink archetype, the project skeleton has to be set up by hand; following a quick-start guide on creating a Flink project in IDEA gets the project frame in place.
Note: the provided scope entries in the pom file must be commented out; otherwise the program fails at runtime with:
Error: A JNI error has occurred, please check your installation and try again
[screenshot of the error omitted]
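For reference, a minimal sketch of the change (the artifact shown follows the standard Flink quickstart pom; adjust to match your own file):

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>${flink.version}</version>
    <!-- commented out so the Flink classes are on the runtime
         classpath when launching the job from the IDE -->
    <!-- <scope>provided</scope> -->
</dependency>
```

Alternatively, recent IDEA versions offer a run-configuration checkbox to include dependencies with provided scope on the classpath, which avoids editing the pom at all.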
Setting Up a Flink Pseudo-Cluster

From the Flink download page, pick the matching version and download the archive.
[screenshot of the download page omitted]
After extraction, the directory contents look like this:
[screenshot of the directory layout omitted]
Run the start-cluster.bat script in the bin directory, then open localhost:8081 in a browser to view the Flink web UI.
[screenshot of the web UI omitted]
Note that newer Flink releases no longer ship the .bat scripts; see the existing write-ups on working around the missing bat startup files in recent Flink versions.
Adding the Missing Dependencies

With the Flink skeleton in place, write a simple Paimon program by following the article 新一代数据湖存储技术Apache Paimon入门Demo (an introductory Apache Paimon demo). The crucial step here is adding the missing POM dependencies: their absence causes no compile-time errors, but as soon as the program runs, it throws errors such as:
java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
Unable to create catalog xxx
Unsupported SQL query! executeSql()
The full set of required pom dependencies:
<!-- ${flink.version} should be 1.18.0 here, to match paimon-flink-1.18 -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-clients</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java-bridge</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.paimon</groupId>
    <artifactId>paimon-flink-1.18</artifactId>
    <version>0.6.0-incubating</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner-loader</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-runtime</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-base</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-slf4j-impl</artifactId>
    <version>${log4j.version}</version>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>${log4j.version}</version>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>${log4j.version}</version>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.2.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
    <version>3.2.3</version>
</dependency>