Deploying Flink on Hadoop: Detailed Steps for Integrating Apache Hudi
The component versions are as follows:
Flink 1.13.1
Hudi 0.10
Hive 2.1.1
CDH 6.3.0
Kafka 2.2.1
1.1 Downloading and Compiling the Hudi Code
Clone the code to your local machine:
steven@wangyuxiangdeMacBook-Pro ~ git clone https://github.com/apache/hudi.git
Cloning into 'hudi'...
remote: Enumerating objects: 122696, done.
remote: Counting objects: 100% (5537/5537), done.
remote: Compressing objects: 100% (674/674), done.
remote: Total 122696 (delta 4071), reused 4988 (delta 3811), pack-reused 117159
Receiving objects: 100% (122696/122696), 75.85 MiB | 5.32 MiB/s, done.
Resolving deltas: 100% (61608/61608), done.
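The versions listed above target Hudi 0.10, so after cloning, switch to the matching release before building. A minimal sketch, assuming the tag follows Hudi's usual release-x.y.z naming convention (verify with git tag):

```bash
cd hudi
# List the available 0.10 release tags, then check out the one to build.
git tag -l 'release-0.10*'
git checkout release-0.10.0   # assumed tag name; confirm against the list above
```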
Open the Hudi project in IntelliJ IDEA and edit packaging/hudi-flink-bundle/pom.xml: in the flink-bundle-shade-hive2 profile, change hive.version to the Hive version shipped with CDH 6.3.0, as sketched below.
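For reference, the relevant part of packaging/hudi-flink-bundle/pom.xml might look like the sketch below. The property names follow Hudi 0.10's bundle pom, and 2.1.1-cdh6.3.0 is assumed to be the Hive artifact version for CDH 6.3.0; the CDH-suffixed artifacts are resolved from Cloudera's Maven repository, which may also need to be declared in the pom:

```xml
<!-- packaging/hudi-flink-bundle/pom.xml: sketch of the profile, not the full file -->
<profile>
  <id>flink-bundle-shade-hive2</id>
  <properties>
    <!-- the default is a stock Hive 2.x version; replace it with the CDH 6.3.0 build -->
    <hive.version>2.1.1-cdh6.3.0</hive.version>
    <flink.bundle.hive.scope>compile</flink.bundle.hive.scope>
  </properties>
  <!-- dependency overrides omitted -->
</profile>
```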
Compile with the following command:
mvn clean install -DskipTests -DskipITs -Dcheckstyle.skip=true -Drat.skip=true -Dhadoop.version=3.0.0 -Pflink-bundle-shade-hive2
Note:
1. CDH 6.3.0 ships with Hadoop 3.0.0, so the Hadoop version must be specified explicitly.
2. Since Hive 2.1.1 is used, the Hive version must also be specified; otherwise class conflicts will occur when syncing to Hive.
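Two follow-ups, shown as a hedged sketch below: the Hive version can also be overridden on the Maven command line instead of editing the pom (command-line -D properties take precedence over pom properties), and after a successful build the Flink bundle jar needs to go into Flink's lib directory. The 2.1.1-cdh6.3.0 version string, the jar name pattern, and $FLINK_HOME are assumptions to adapt to your environment:

```bash
# Alternative to editing the pom: pass the Hive version as a -D property
# (assumption: 2.1.1-cdh6.3.0 is the Hive artifact shipped with CDH 6.3.0,
#  resolved from Cloudera's Maven repository)
mvn clean install -DskipTests -DskipITs \
    -Dcheckstyle.skip=true -Drat.skip=true \
    -Dhadoop.version=3.0.0 \
    -Dhive.version=2.1.1-cdh6.3.0 \
    -Pflink-bundle-shade-hive2

# The bundle jar is produced under packaging/hudi-flink-bundle/target/;
# copy it into Flink's lib directory so Flink jobs can load Hudi
# (the exact jar name depends on the Scala and Hudi versions built)
cp packaging/hudi-flink-bundle/target/hudi-flink-bundle_2.11-*.jar "$FLINK_HOME/lib/"
```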