hudi.git
5 hours ago[HUDI-4023] Decouple hudi-spark from hudi-utilities-slim-bundle (#5641) master
Sagar Sumit [Thu, 26 May 2022 05:58:49 +0000 (11:28 +0530)] 
[HUDI-4023] Decouple hudi-spark from hudi-utilities-slim-bundle (#5641)

6 hours ago[HUDI-4040] Bulk insert Support CustomColumnsSortPartitioner with Row (#5502)
RexAn [Thu, 26 May 2022 05:09:04 +0000 (13:09 +0800)] 
[HUDI-4040] Bulk insert Support CustomColumnsSortPartitioner with Row (#5502)

* Along the lines of RDDCustomColumnsSortPartitioner but for Row

7 hours ago[HUDI-4145] Archives the metadata file in HoodieInstant.State sequence (part2) (...
Danny Chan [Thu, 26 May 2022 03:21:39 +0000 (11:21 +0800)] 
[HUDI-4145] Archives the metadata file in HoodieInstant.State sequence (part2) (#5676)

21 hours ago[HUDI-3193] Decouple hudi-aws from hudi-client-common (#5666)
Sagar Sumit [Wed, 25 May 2022 14:08:56 +0000 (19:38 +0530)] 
[HUDI-3193] Decouple hudi-aws from hudi-client-common (#5666)

Move HoodieMetricsCloudWatchConfig to hudi-client-common

22 hours ago[HUDI-4146] Claim RFC-55 for Improve Hive/Meta sync class design and hierachies ...
冯健 [Wed, 25 May 2022 12:31:39 +0000 (20:31 +0800)] 
[HUDI-4146] Claim RFC-55 for Improve Hive/Meta sync class design and hierachies (#5682)

41 hours ago[MINOR] Fix a potential NPE and some finer points of hudi cli (#5656)
luoyajun [Tue, 24 May 2022 18:13:18 +0000 (02:13 +0800)] 
[MINOR] Fix a potential NPE and some finer points of hudi cli (#5656)

42 hours agoMerge pull request #3599 from yuzhaojing/HUDI-2207
Zhaojing Yu [Tue, 24 May 2022 16:47:28 +0000 (00:47 +0800)] 
Merge pull request #3599 from yuzhaojing/HUDI-2207

[HUDI-2207] Support independent flink hudi clustering function

47 hours ago[HUDI-4132] Fixing determining target table schema for delta sync with empty batch...
Sivabalan Narayanan [Tue, 24 May 2022 12:17:15 +0000 (08:17 -0400)] 
[HUDI-4132] Fixing determining target table schema for delta sync with empty batch (#5648)

47 hours ago[HUDI-2207] Support independent flink hudi clustering function 3599/head
喻兆靖 [Sat, 21 May 2022 13:25:15 +0000 (21:25 +0800)] 
[HUDI-2207] Support independent flink hudi clustering function

2 days ago[HUDI-4135] remove netty and netty-all (#5663)
liujinhui [Tue, 24 May 2022 10:56:28 +0000 (18:56 +0800)] 
[HUDI-4135] remove netty and netty-all (#5663)

2 days ago[HUDI-4145] Archives the metadata file in HoodieInstant.State sequence (#5669)
Danny Chan [Tue, 24 May 2022 09:33:30 +0000 (17:33 +0800)] 
[HUDI-4145] Archives the metadata file in HoodieInstant.State sequence (#5669)

2 days ago[HUDI-2473] Fixing compaction write operation in commit metadata (#5203)
Sivabalan Narayanan [Tue, 24 May 2022 07:33:21 +0000 (03:33 -0400)] 
[HUDI-2473] Fixing compaction write operation in commit metadata (#5203)

2 days ago[HUDI-4138] Fix the concurrency modification of hoodie table config for flink (#5660)
Danny Chan [Tue, 24 May 2022 05:07:55 +0000 (13:07 +0800)] 
[HUDI-4138] Fix the concurrency modification of hoodie table config for flink (#5660)

* Remove the metadata cleaning strategy for flink, which means the multi-modal index may be affected
* Improve the HoodieTable#clearMetadataTablePartitionsConfig to only update table config when necessary
* Remove the modification of read code path in HoodieTableConfig

2 days ago[HUDI-4084] Add support to test async table services with integ test suite framework...
Sivabalan Narayanan [Tue, 24 May 2022 03:05:56 +0000 (23:05 -0400)] 
[HUDI-4084] Add support to test async table services with integ test suite framework (#5557)

* Add support to test async table services with integ test suite framework

* Make await time for validation configurable

2 days ago[HUDI-4134] Fix Method naming consistency issues in FSUtils (#5655)
Heap [Mon, 23 May 2022 22:28:48 +0000 (06:28 +0800)] 
[HUDI-4134] Fix Method naming consistency issues in FSUtils (#5655)

2 days ago[MINOR] Removing redundant semicolons and line breaks (#5662)
felixYyu [Mon, 23 May 2022 22:26:36 +0000 (06:26 +0800)] 
[MINOR] Removing redundant semicolons and line breaks (#5662)

2 days ago[HUDI-3933] Add UT cases to cover different key gen (#5638)
Y Ethan Guo [Mon, 23 May 2022 13:48:09 +0000 (06:48 -0700)] 
[HUDI-3933] Add UT cases to cover different key gen (#5638)

2 days ago[HUDI-4142] Claim RFC-54 for new table APIs (#5665)
Sagar Sumit [Mon, 23 May 2022 12:40:07 +0000 (18:10 +0530)] 
[HUDI-4142] Claim RFC-54 for new table APIs (#5665)

3 days ago[HUDI-4129] Initializes a new fs view for WriteProfile#reload (#5640)
YuangZhang [Mon, 23 May 2022 01:57:34 +0000 (09:57 +0800)] 
[HUDI-4129] Initializes a new fs view for WriteProfile#reload (#5640)

Co-authored-by: zhangyuang <zhangyuang@corp.netease.com>

4 days ago[HUDI-4051] Allow nested field as primary key and preCombineField in spark sql (...
Raymond Xu [Sun, 22 May 2022 07:47:51 +0000 (00:47 -0700)] 
[HUDI-4051] Allow nested field as primary key and preCombineField in spark sql (#5517)

* [HUDI-4051] Allow nested field as preCombineField in spark sql

* relax validation for primary key

4 days ago[HUDI-3890] fix rat plugin issue with sql files (#5644)
uday08bce [Sat, 21 May 2022 16:22:55 +0000 (18:22 +0200)] 
[HUDI-3890] fix rat plugin issue with sql files (#5644)

4 days ago[HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition...
Jin Xing [Sat, 21 May 2022 14:41:18 +0000 (22:41 +0800)] 
[HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition (#5588)

4 days ago[HUDI-3858] Shade javax.servlet for Spark bundle jar (#5295)
YueZhang [Sat, 21 May 2022 13:16:14 +0000 (21:16 +0800)] 
[HUDI-3858] Shade javax.servlet for Spark bundle jar (#5295)

Co-authored-by: yuezhang <yuezhang@freewheel.tv>

4 days ago[MINOR] remove unused gson test dependency (#5652)
Raymond Xu [Sat, 21 May 2022 12:34:08 +0000 (05:34 -0700)] 
[MINOR] remove unused gson test dependency (#5652)

5 days ago[HUDI-4122] Fix NPE caused by adding kafka nodes (#5632)
wangxianghu [Sat, 21 May 2022 03:12:53 +0000 (07:12 +0400)] 
[HUDI-4122] Fix NPE caused by adding kafka nodes (#5632)

5 days ago[MINOR] Minor fixes to exception log and removing unwanted metrics flush in integ...
Sivabalan Narayanan [Fri, 20 May 2022 23:27:35 +0000 (19:27 -0400)] 
[MINOR] Minor fixes to exception log and removing unwanted metrics flush in integ test (#5646)

5 days ago[HUDI-3985] Refactor DLASyncTool to support read hoodie table as spark datasource...
huberylee [Fri, 20 May 2022 14:25:32 +0000 (22:25 +0800)] 
[HUDI-3985] Refactor DLASyncTool to support read hoodie table as spark datasource table (#5532)

5 days ago[HUDI-4130] Remove the upgrade/downgrade for flink #initTable (#5642)
Danny Chan [Fri, 20 May 2022 13:31:23 +0000 (21:31 +0800)] 
[HUDI-4130] Remove the upgrade/downgrade for flink #initTable (#5642)

6 days ago[HUDI-4119] the first read result is incorrect when Flink upsert- Kafka connector...
aliceyyan [Fri, 20 May 2022 10:10:24 +0000 (18:10 +0800)] 
[HUDI-4119] the first read result is incorrect when Flink upsert- Kafka connector is used in HUDi (#5626)

* HUDI-4119 the first read result is incorrect when Flink upsert- Kafka connector is used in HUDi

Co-authored-by: aliceyyan <aliceyyan@tencent.com>

7 days ago[HUDI-4114] Remove the unnecessary fs view sync for BaseWriteClient#initTable (#5617)
Danny Chan [Thu, 19 May 2022 02:59:05 +0000 (10:59 +0800)] 
[HUDI-4114] Remove the unnecessary fs view sync for BaseWriteClient#initTable (#5617)

No need to #sync actively: the table instance is freshly instantiated, so
its view manager has no fs view instances yet, and the fs view is synced lazily when
it is requested.

7 days ago[HUDI-4116] Unify clustering/compaction related procedures' output type (#5620)
huberylee [Thu, 19 May 2022 01:48:03 +0000 (09:48 +0800)] 
[HUDI-4116] Unify clustering/compaction related procedures' output type (#5620)

* Unify clustering/compaction related procedures' output type

* Address review comments

7 days agoRevert "[HUDI-3870] Add timeout rollback for flink online compaction (#5314)" (#5622)
Danny Chan [Wed, 18 May 2022 12:30:54 +0000 (20:30 +0800)] 
Revert "[HUDI-3870] Add timeout rollback for flink online compaction (#5314)" (#5622)

This reverts commit 6f9b02decb5bb2b83709b1b6ec04a97e4d102c11.

8 days ago[HUDI-4111] Bump ANTLR runtime version in Spark 3.x (#5606)
cxzl25 [Wed, 18 May 2022 11:18:52 +0000 (19:18 +0800)] 
[HUDI-4111] Bump ANTLR runtime version in Spark 3.x (#5606)

8 days ago[HUDI-3942] [RFC-50] Improve Timeline Server (#5392)
Zhaojing Yu [Wed, 18 May 2022 10:43:48 +0000 (18:43 +0800)] 
[HUDI-3942] [RFC-50] Improve Timeline Server (#5392)

8 days agoClean the marker files for flink compaction (#5611)
luokey [Wed, 18 May 2022 03:21:14 +0000 (11:21 +0800)] 
Clean the marker files for flink compaction (#5611)

Co-authored-by: 854194341@qq.com <loukey_7821>

8 days ago[HUDI-4109] Copy the old record directly when it is chosen for merging (#5603)
Danny Chan [Wed, 18 May 2022 02:17:00 +0000 (10:17 +0800)] 
[HUDI-4109] Copy the old record directly when it is chosen for merging (#5603)

8 days ago[minor] Some code refactoring for LogFileComparator and Instant instantiation (#5600)
Danny Chan [Wed, 18 May 2022 01:30:09 +0000 (09:30 +0800)] 
[minor] Some code refactoring for LogFileComparator and Instant instantiation (#5600)

8 days ago[MINOR] Fixing spark long running yaml for non-partitioned (#5607)
Sivabalan Narayanan [Tue, 17 May 2022 13:58:18 +0000 (09:58 -0400)] 
[MINOR] Fixing spark long running yaml for non-partitioned (#5607)

8 days ago[HUDI-4110] Clean the marker files for flink compaction (#5604)
BruceLin [Tue, 17 May 2022 13:09:27 +0000 (21:09 +0800)] 
[HUDI-4110] Clean the marker files for flink compaction (#5604)

9 days ago[HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand (#5564)
Jin Xing [Tue, 17 May 2022 06:12:50 +0000 (14:12 +0800)] 
[HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand (#5564)

* [HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand

* Set hoodie.query.as.ro.table in serde properties

9 days ago[HUDI-4101] BucketIndexPartitioner should take partition path for better dispersion...
Danny Chan [Tue, 17 May 2022 02:34:57 +0000 (10:34 +0800)] 
[HUDI-4101] BucketIndexPartitioner should take partition path for better dispersion (#5590)

9 days ago[HUDI-4104] DeltaWriteProfile includes the pending compaction file slice when decidin...
Danny Chan [Tue, 17 May 2022 02:34:15 +0000 (10:34 +0800)] 
[HUDI-4104] DeltaWriteProfile includes the pending compaction file slice when deciding small buckets (#5594)

9 days ago[HUDI-3654] Preparations for hudi metastore. (#5572)
Shawy Geng [Tue, 17 May 2022 01:47:10 +0000 (09:47 +0800)] 
[HUDI-3654] Preparations for hudi metastore. (#5572)

* [HUDI-3654] Preparations for hudi metastore.

Co-authored-by: gengxiaoyu <gengxiaoyu@bytedance.com>

9 days ago[HUDI-4103] [HUDI-4001] Filter the properties should not be used when create table...
董可伦 [Mon, 16 May 2022 15:26:23 +0000 (23:26 +0800)] 
[HUDI-4103] [HUDI-4001] Filter the properties should not be used when create table for Spark SQL

10 days ago[HUDI-4098] Metadata table heartbeat for instant has expired, last heartbeat 0 (...
Danny Chan [Mon, 16 May 2022 09:40:08 +0000 (17:40 +0800)] 
[HUDI-4098] Metadata table heartbeat for instant has expired, last heartbeat 0 (#5583)

10 days ago[HUDI-3123] consistent hashing index: basic write path (upsert/insert) (#4480)
Yuwei XIAO [Mon, 16 May 2022 03:07:01 +0000 (11:07 +0800)] 
[HUDI-3123] consistent hashing index: basic write path (upsert/insert) (#4480)

 1. basic write path(insert/upsert) implementation
 2. adapt simple bucket index

10 days agofix hive sync no partition table error (#5585)
陈浩 [Mon, 16 May 2022 01:51:24 +0000 (09:51 +0800)] 
fix hive sync no partition table error (#5585)

10 days ago[HUDI-4001] Filter the properties should not be used when create table for Spark...
董可伦 [Mon, 16 May 2022 01:50:29 +0000 (09:50 +0800)] 
[HUDI-4001] Filter the properties should not be used when create table for Spark SQL (#5495)

11 days ago[HUDI-3980] Suport kerberos hbase index (#5464)
xi chaomin [Sat, 14 May 2022 11:37:31 +0000 (19:37 +0800)] 
[HUDI-3980] Suport kerberos hbase index (#5464)

- Add configurations in HoodieHBaseIndexConfig.java to support kerberos hbase connection.

Co-authored-by: xicm <xicm@asiainfo.com>

12 days ago[HUDI-4097] add table info to jobStatus (#5529)
wqwl611 [Sat, 14 May 2022 01:01:15 +0000 (09:01 +0800)] 
[HUDI-4097] add table info to jobStatus (#5529)

Co-authored-by: wqwl611 <wqwl611@gmail.com>

12 days ago[HUDI-4072] Fix NULL schema for empty batches in deltastreamer (#5543)
Sivabalan Narayanan [Fri, 13 May 2022 12:26:47 +0000 (08:26 -0400)] 
[HUDI-4072] Fix NULL schema for empty batches in deltastreamer (#5543)

12 days ago[HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink (#5574)
Bo Cui [Fri, 13 May 2022 11:52:55 +0000 (19:52 +0800)] 
[HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink (#5574)

* [HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink

13 days ago[HUDI-4078][HUDI-FLINK]BootstrapOperator contains the pending compact… (#5545)
Bo Cui [Fri, 13 May 2022 06:32:48 +0000 (14:32 +0800)] 
[HUDI-4078][HUDI-FLINK]BootstrapOperator contains the pending compact… (#5545)

* [HUDI-4078][HUDI-FLINK]BootstrapOperator contains the pending compaction files

13 days ago[MINOR] Fix a NPE for Option (#5461)
Xingcan Cui [Fri, 13 May 2022 04:20:40 +0000 (00:20 -0400)] 
[MINOR] Fix a NPE for Option (#5461)

13 days ago[HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink (#5528)
Bo Cui [Fri, 13 May 2022 01:50:11 +0000 (09:50 +0800)] 
[HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink (#5528)

* [HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink

13 days ago[HUDI-4018][HUDI-4027] Adding integ test yamls for immutable use-cases. Added delete...
Sivabalan Narayanan [Fri, 13 May 2022 01:01:55 +0000 (21:01 -0400)] 
[HUDI-4018][HUDI-4027] Adding integ test yamls for immutable use-cases. Added delete partition support to integ tests (#5501)

- Added pure immutable test yamls to integ test framework. Added SparkBulkInsertNode as part of it.
- Added delete_partition support to integ test framework using spark-datasource.
- Added a single yaml to test all non core write operations (insert overwrite, insert overwrite table and delete partitions)
- Added tests for 4 concurrent spark datasource writers (multi-writer tests).
- Fixed readme w/ sample commands for multi-writer.

13 days ago[HUDI-3963][Claim RFC number 53] Use Lock-Free Message Queue Improving Hoodie Writing...
YueZhang [Thu, 12 May 2022 11:26:00 +0000 (19:26 +0800)] 
[HUDI-3963][Claim RFC number 53] Use Lock-Free Message Queue Improving Hoodie Writing Efficiency. (#5562)

Co-authored-by: yuezhang <yuezhang@freewheel.tv>

2 weeks ago[HUDI-4085] Fixing flakiness with parquet empty batch tests in TestHoodieDeltaStreame...
Sivabalan Narayanan [Wed, 11 May 2022 20:02:54 +0000 (16:02 -0400)] 
[HUDI-4085] Fixing flakiness with parquet empty batch tests in TestHoodieDeltaStreamer (#5559)

2 weeks ago[HUDI-4079] Supports showing table comment for hudi with spark3 (#5546)
Jin Xing [Wed, 11 May 2022 14:28:58 +0000 (22:28 +0800)] 
[HUDI-4079] Supports showing table comment for hudi with spark3 (#5546)

2 weeks ago[HUDI-4038] Avoid calling `getDataSize` after every record written (#5497)
Alexey Kudinkin [Wed, 11 May 2022 12:08:31 +0000 (05:08 -0700)] 
[HUDI-4038] Avoid calling `getDataSize` after every record written (#5497)

- getDataSize has non-trivial overhead in the current ParquetWriter impl, requiring traversal of already composed Column Groups in memory. Instead we can sample these calls to getDataSize to amortize its cost.

Co-authored-by: sivabalan <n.siva.b@gmail.com>

2 weeks ago[HUDI-4003] Try to read all the log file to parse schema (#5473)
Lanyuanxiaoyao [Tue, 10 May 2022 22:45:53 +0000 (06:45 +0800)] 
[HUDI-4003] Try to read all the log file to parse schema (#5473)

2 weeks ago[HUDI-4044] When reading data from flink-hudi to external storage, the … (#5516)
aliceyyan [Tue, 10 May 2022 02:25:13 +0000 (10:25 +0800)] 
[HUDI-4044] When reading data from flink-hudi to external storage, the … (#5516)

Co-authored-by: aliceyyan <aliceyyan@tencent.com>

2 weeks ago[HUDI-3995] Making perf optimizations for bulk insert row writer path (#5462)
Sivabalan Narayanan [Mon, 9 May 2022 16:40:22 +0000 (12:40 -0400)] 
[HUDI-3995] Making perf optimizations for bulk insert row writer path (#5462)

- Avoid using udf for key generator for SimpleKeyGen and NonPartitionedKeyGen.
- Fixed NonPartitioned Key generator to directly fetch record key from row rather than involving GenericRecord.
- Other minor fixes around using static values instead of looking up hashmap.

2 weeks ago[HUDI-4053] Flaky ITTestHoodieDataSource.testStreamWriteBatchReadOpti… (#5526)
xicm [Mon, 9 May 2022 08:35:50 +0000 (16:35 +0800)] 
[HUDI-4053] Flaky ITTestHoodieDataSource.testStreamWriteBatchReadOpti… (#5526)

* [HUDI-4053] Flaky ITTestHoodieDataSource.testStreamWriteBatchReadOptimized

Co-authored-by: xicm <xicm@asiainfo.com>

2 weeks ago[MINOR] Fixing close for HoodieCatalog's test (#5531)
ForwardXu [Mon, 9 May 2022 07:17:24 +0000 (15:17 +0800)] 
[MINOR] Fixing close for HoodieCatalog's test (#5531)

* [MINOR] Fixing close for HoodieCatalog's test

2 weeks ago[HUDI-4055]refactor ratelimiter to avoid stack overflow (#5530)
guanziyue [Mon, 9 May 2022 02:27:37 +0000 (10:27 +0800)] 
[HUDI-4055]refactor ratelimiter to avoid stack overflow (#5530)

2 weeks ago[MINOR] fixing flaky tests in deltastreamer tests (#5521)
Sivabalan Narayanan [Sat, 7 May 2022 19:37:20 +0000 (15:37 -0400)] 
[MINOR] fixing flaky tests in deltastreamer tests (#5521)

2 weeks ago[MINOR] Fixing class not found when using flink and enable metadata table (#5527)
BruceLin [Sat, 7 May 2022 12:03:18 +0000 (20:03 +0800)] 
[MINOR] Fixing class not found when using flink and enable metadata table (#5527)

2 weeks ago[HUDI-3849] AvroDeserializer supports AVRO_REBASE_MODE_IN_READ configuration (#5287)
cxzl25 [Sat, 7 May 2022 07:39:14 +0000 (15:39 +0800)] 
[HUDI-3849] AvroDeserializer supports AVRO_REBASE_MODE_IN_READ configuration (#5287)

2 weeks ago[HUDI-3675] Adding post write termination strategy to deltastreamer continuous mode...
Sivabalan Narayanan [Fri, 6 May 2022 13:27:29 +0000 (09:27 -0400)] 
[HUDI-3675] Adding post write termination strategy to deltastreamer continuous mode (#5073)

- Added a postWriteTerminationStrategy to deltastreamer continuous mode. One can enable it by setting the appropriate termination strategy using DeltastreamerConfig.postWriteTerminationStrategyClass. If it is not set, continuous mode is expected to run forever.
- Added one concrete impl for termination strategy as NoNewDataTerminationStrategy which shuts down deltastreamer if there is no new data to consume from source for N consecutive rounds.

2 weeks ago[HUDI-4017] Improve spark sql coverage in CI (#5512)
Raymond Xu [Fri, 6 May 2022 12:52:06 +0000 (05:52 -0700)] 
[HUDI-4017] Improve spark sql coverage in CI (#5512)

Add GitHub actions tasks to run spark sql UTs under spark 3.1 and 3.2.

2 weeks ago[HUDI-4042] Support truncate-partition for Spark-3.2 (#5506)
Jin Xing [Fri, 6 May 2022 07:29:47 +0000 (15:29 +0800)] 
[HUDI-4042] Support truncate-partition for Spark-3.2 (#5506)

2 weeks ago[HUDI-2875] Make HoodieParquetWriter Thread safe and memory executor exit gracefully...
guanziyue [Thu, 5 May 2022 20:49:34 +0000 (04:49 +0800)] 
[HUDI-2875] Make HoodieParquetWriter Thread safe and memory executor exit gracefully (#4264)

2 weeks ago[MINOR] Optimize code logic (#5499)
qianchutao [Thu, 5 May 2022 16:33:06 +0000 (00:33 +0800)] 
[MINOR] Optimize code logic (#5499)

3 weeks ago[HUDI-3667] Run unit tests of hudi-integ-tests in CI (#5078)
Y Ethan Guo [Thu, 5 May 2022 06:39:18 +0000 (23:39 -0700)] 
[HUDI-3667] Run unit tests of hudi-integ-tests in CI (#5078)

3 weeks ago[HUDI-4031] Avoid clustering update handling when no pending replacecommit (#5487)
Sagar Sumit [Wed, 4 May 2022 14:17:11 +0000 (19:47 +0530)] 
[HUDI-4031] Avoid clustering update handling when no pending replacecommit (#5487)

3 weeks ago[HUDI-4005] Update release scripts to help validation (#5479)
Raymond Xu [Wed, 4 May 2022 14:15:54 +0000 (07:15 -0700)] 
[HUDI-4005] Update release scripts to help validation (#5479)

3 weeks ago[MINOR] Update RFC status (#5486)
Sagar Sumit [Tue, 3 May 2022 15:57:18 +0000 (21:27 +0530)] 
[MINOR] Update RFC status (#5486)

3 weeks ago[HUDI-3211][RFC-44] Add RFC for Hudi Connector for Presto (#4563)
Todd Gao [Mon, 2 May 2022 16:35:23 +0000 (00:35 +0800)] 
[HUDI-3211][RFC-44] Add RFC for Hudi Connector for Presto (#4563)

* Add RFC doc

Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>

* Add note regarding catalog naming

Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>

3 weeks ago[MINOR] Update DOAP for release 0.11.0 (#5467)
Raymond Xu [Sat, 30 Apr 2022 17:51:16 +0000 (10:51 -0700)] 
[MINOR] Update DOAP for release 0.11.0 (#5467)

3 weeks ago[HUDI-3978] Fix use of partition path field as hive partition field in flink (#5434)
Wangyh [Sat, 30 Apr 2022 03:58:54 +0000 (11:58 +0800)] 
[HUDI-3978] Fix use of partition path field as hive partition field in flink (#5434)

* Fix partition path fields as hive sync partition fields error

3 weeks ago[HUDI-3862] Fix default configurations of HoodieHBaseIndexConfig (#5308)
xicm [Fri, 29 Apr 2022 23:21:52 +0000 (07:21 +0800)] 
[HUDI-3862] Fix default configurations of HoodieHBaseIndexConfig (#5308)

Co-authored-by: xicm <xicm@asiainfo.com>

3 weeks ago[MINOR] Fix CI by ignoring SparkContext error (#5468)
Y Ethan Guo [Fri, 29 Apr 2022 18:19:07 +0000 (11:19 -0700)] 
[MINOR] Fix CI by ignoring SparkContext error (#5468)

Sets spark.driver.allowMultipleContexts = true when constructing Spark conf in UtilHelpers

3 weeks ago[HUDI-3758] Fix duplicate fileId error in MOR table type with flink bucket hash Index...
吴祥平 [Fri, 29 Apr 2022 06:10:20 +0000 (14:10 +0800)] 
[HUDI-3758] Fix duplicate fileId error in MOR table type with flink bucket hash Index (#5185)

* fix duplicate fileId with bucket Index
* replace to load FileGroup from FileSystemView

3 weeks ago[MINOR] support different cleaning policy for flink (#5459)
Gary Li [Fri, 29 Apr 2022 01:48:44 +0000 (09:48 +0800)] 
[MINOR] support different cleaning policy for flink (#5459)

3 weeks ago[HUDI-3943] Some description fixes for 0.10.1 docs (#5447)
LiChuang [Thu, 28 Apr 2022 22:18:56 +0000 (06:18 +0800)] 
[HUDI-3943] Some description fixes for 0.10.1 docs (#5447)

4 weeks ago[HUDI-3815] Fix docs description of metadata.compaction.delta_commits default value...
Ibson [Wed, 27 Apr 2022 23:09:44 +0000 (07:09 +0800)] 
[HUDI-3815] Fix docs description of metadata.compaction.delta_commits default value error (#5368)

Co-authored-by: pusheng.li01 <pusheng.li01@liulishuo.com>

4 weeks ago[HUDI-3945] After the async compaction operation is complete, the task should exit...
watermelon12138 [Wed, 27 Apr 2022 13:16:09 +0000 (21:16 +0800)] 
[HUDI-3945] After the async compaction operation is complete, the task should exit. (#5391)

Co-authored-by: y00617041 <yangxuan42@huawei.com>

4 weeks agoClaim RFC 52 for Introduce Secondary Index to Improve HUDI Query Performance (#5441)
huberylee [Wed, 27 Apr 2022 06:07:29 +0000 (14:07 +0800)] 
Claim RFC 52 for Introduce Secondary Index to Improve HUDI Query Performance (#5441)

4 weeks ago[HUDI-3977] Flink hudi table with date type partition path throws HoodieNotSupportedE...
Danny Chan [Wed, 27 Apr 2022 05:19:55 +0000 (13:19 +0800)] 
[HUDI-3977] Flink hudi table with date type partition path throws HoodieNotSupportedException (#5432)

4 weeks ago[MINOR] Update alter rename command class type for pattern matching (#5381)
KnightChess [Wed, 27 Apr 2022 02:39:51 +0000 (10:39 +0800)] 
[MINOR] Update alter rename command class type for pattern matching (#5381)

4 weeks ago[HUDI-3478] Claim RFC 51 For CDC (#5437)
Yann Byron [Tue, 26 Apr 2022 15:26:47 +0000 (23:26 +0800)] 
[HUDI-3478] Claim RFC 51 For CDC (#5437)

4 weeks ago[HUDI-3972] Fixing hoodie.properties/tableConfig for no preCombine field with writes...
Sivabalan Narayanan [Tue, 26 Apr 2022 03:03:10 +0000 (23:03 -0400)] 
[HUDI-3972] Fixing hoodie.properties/tableConfig for no preCombine field with writes (#5424)

Fixed instantiation of a new table to set preCombine to null if it is not explicitly set by the user.

4 weeks ago[HUDI-3085] Improve bulk insert partitioner abstraction (#4441)
Yuwei XIAO [Mon, 25 Apr 2022 10:42:17 +0000 (18:42 +0800)] 
[HUDI-3085] Improve bulk insert partitioner abstraction (#4441)

4 weeks agoRevert "[HUDI-3951]support generan parameter 'sink.parallelism' for flink-hudi (...
ForwardXu [Mon, 25 Apr 2022 04:58:27 +0000 (12:58 +0800)] 
Revert "[HUDI-3951]support generan parameter 'sink.parallelism' for flink-hudi (#5405)" (#5421)

This reverts commit bda3db078e927421c10932cfcb3019cfddb125b6.

4 weeks ago[HUDI-3946] Validate option path in flink hudi sink (#5397)
Ruguo Yu [Mon, 25 Apr 2022 02:13:47 +0000 (10:13 +0800)] 
[HUDI-3946] Validate option path in flink hudi sink (#5397)

4 weeks agosupport generan parameter 'sink.parallelism' for flink-hudi (#5405)
hehuiyuan [Sun, 24 Apr 2022 11:09:39 +0000 (19:09 +0800)] 
support generan parameter 'sink.parallelism' for flink-hudi (#5405)

Co-authored-by: hehuiyuan1 <hehuiyuan@jd.com>

4 weeks ago[HUDI-3923] Fix cast exception while reading boolean type of partitioned field (...
miomiocat [Sat, 23 Apr 2022 12:12:54 +0000 (20:12 +0800)] 
[HUDI-3923] Fix cast exception while reading boolean type of partitioned field (#5373)

4 weeks ago[HUDI-3948] Fix presto bundle missing HBase classes (#5398)
Y Ethan Guo [Sat, 23 Apr 2022 08:33:55 +0000 (01:33 -0700)] 
[HUDI-3948] Fix presto bundle missing HBase classes (#5398)

4 weeks ago[HUDI-3950] add parquet-avro to gcp-bundle (#5399)
Raymond Xu [Sat, 23 Apr 2022 03:59:49 +0000 (20:59 -0700)] 
[HUDI-3950] add parquet-avro to gcp-bundle (#5399)