hudi.git
10 months ago[MINOR] Update release version to reflect published version 0.9.0 release-0.9.0 3611/head release-0.9.0
Udit Mehrotra [Tue, 24 Aug 2021 20:57:49 +0000 (13:57 -0700)] 
[MINOR] Update release version to reflect published version 0.9.0

10 months agoBumping release candidate number 2 release-0.9.0-rc2
Udit Mehrotra [Fri, 20 Aug 2021 20:45:17 +0000 (13:45 -0700)] 
Bumping release candidate number 2

10 months agoKeep non-conflicting names for common configs between DataSourceOptions and HoodieWri...
Udit Mehrotra [Fri, 20 Aug 2021 09:42:59 +0000 (02:42 -0700)] 
Keep non-conflicting names for common configs between DataSourceOptions and HoodieWriteConfig (#3511)

10 months agoRestore 0.8.0 config keys with deprecated annotation (#3506)
Udit Mehrotra [Thu, 19 Aug 2021 20:36:40 +0000 (13:36 -0700)] 
Restore 0.8.0 config keys with deprecated annotation (#3506)

Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
10 months ago[HUDI-2322] Use correct meta columns while preparing dataset for bulk insert (#3504)
Sagar Sumit [Thu, 19 Aug 2021 16:07:12 +0000 (21:37 +0530)] 
[HUDI-2322] Use correct meta columns while preparing dataset for bulk insert (#3504)

10 months ago[MINOR] Fixing release validation script (#3493)
Sivabalan Narayanan [Thu, 19 Aug 2021 11:54:56 +0000 (07:54 -0400)] 
[MINOR] Fixing release validation script (#3493)

10 months ago[HUDI-1363] Include _hoodie_operation meta column in removeMetadataFields (#3501)
Sagar Sumit [Thu, 19 Aug 2021 11:03:54 +0000 (16:33 +0530)] 
[HUDI-1363] Include _hoodie_operation meta column in removeMetadataFields (#3501)

10 months ago[HUDI-2167] HoodieCompactionConfig get HoodieCleaningPolicy NullPointerException
leiqiang [Tue, 13 Jul 2021 03:26:12 +0000 (11:26 +0800)] 
[HUDI-2167] HoodieCompactionConfig get HoodieCleaningPolicy NullPointerException

close apache/hudi#3402

10 months agoHUDI-1674 (#3488)
liujinhui [Wed, 18 Aug 2021 05:45:48 +0000 (13:45 +0800)] 
HUDI-1674 (#3488)

10 months ago[Hot Fix]Add apache license to spark_command.txt.template (#3479) release-0.9.0-rc1
Udit Mehrotra [Sun, 15 Aug 2021 21:04:03 +0000 (14:04 -0700)] 
[Hot Fix]Add apache license to spark_command.txt.template (#3479)

10 months agoCreate release branch for version 0.9.0.
Udit Mehrotra [Sun, 15 Aug 2021 01:52:18 +0000 (18:52 -0700)] 
Create release branch for version 0.9.0.

10 months ago[HUDI-2268] Add upgrade and downgrade to and from 0.9.0 (#3470)
Y Ethan Guo [Sun, 15 Aug 2021 00:20:23 +0000 (17:20 -0700)] 
[HUDI-2268] Add upgrade and downgrade to and from 0.9.0 (#3470)

- Added upgrade and downgrade steps to and from 0.9.0. Upgrade adds a few table properties. Downgrade recreates timeline-server-based marker files, if any.

10 months ago[MINOR] Adding back all old default val members to DataSourceOptions (#3474)
vinoth chandar [Sat, 14 Aug 2021 21:49:22 +0000 (14:49 -0700)] 
[MINOR] Adding back all old default val members to DataSourceOptions (#3474)

- Added @Deprecated
- Added @deprecated javadoc to keys and defaults, suggesting how to migrate
- Moved all deprecated members to the bottom to improve readability

10 months ago[HUDI-1897] Deltastreamer source for AWS S3 (#3433)
Sagar Sumit [Sat, 14 Aug 2021 12:25:10 +0000 (17:55 +0530)] 
[HUDI-1897] Deltastreamer source for AWS S3 (#3433)

- Added two sources for a two-stage pipeline: (a) S3EventsSource, which fetches events from SQS and ingests them into a meta hoodie table; (b) S3EventsHoodieIncrSource, which reads S3 events from this meta hoodie table, fetches the actual objects from S3, and ingests them into the sink hoodie table.
- Added selectors to assist in S3EventsSource.

Co-authored-by: Satish M <84978833+satishmittal1111@users.noreply.github.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
10 months ago[HUDI-2305] Add MARKERS.type and fix marker-based rollback (#3472)
Y Ethan Guo [Sat, 14 Aug 2021 12:18:49 +0000 (05:18 -0700)] 
[HUDI-2305] Add MARKERS.type and fix marker-based rollback (#3472)

- Rollback infers the directory structure and does rollback based on the strategy used while markers were written. "write markers type" in write config is used to determine marker strategy only for new writes.

10 months ago[HUDI-2307] When using delete_partition with ds should not rely on the primary key...
liujinhui [Sat, 14 Aug 2021 06:53:39 +0000 (14:53 +0800)] 
[HUDI-2307] When using delete_partition with ds should not rely on the primary key (#3469)

- Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com>

10 months ago[HUDI-2119] Ensure the rolled-back instance was previously synced to the Metadata...
Prashant Wason [Sat, 14 Aug 2021 04:23:34 +0000 (21:23 -0700)] 
[HUDI-2119] Ensure the rolled-back instance was previously synced to the Metadata Table when syncing a Rollback Instant. (#3210)

* [HUDI-2119] Ensure the rolled-back instance was previously synced to the Metadata Table when syncing a Rollback Instant.

If the rolled-back instant was synced to the Metadata Table, a corresponding deltacommit with the same timestamp should have been created on the Metadata Table timeline. To ensure we can always perform this check, the Metadata Table instants should not be archived until their corresponding instants are present in the dataset timeline. But ensuring this requires a large number of instants to be kept on the metadata table.

In this change, the metadata table will keep at least the number of instants that the main dataset is keeping. If the instant being rolled back was before the metadata table timeline, the code will throw an exception and the metadata table will have to be re-bootstrapped. This should be a very rare occurrence and should occur only when the dataset is being repaired by rolling back multiple commits or restoring to a much older time.

* Fixed checkstyle

* Improvements from review comments.

Fixed checkstyle
Replaced explicit null check with Option.ofNullable
Removed redundant function getSynedInstantTime

* Renamed getSyncedInstantTime and getSyncedInstantTimeForReader.

Sync is confusing so renamed to getUpdateTime() and getReaderTime().

* Removed getReaderTime which is only for testing as the same method can be accessed during testing differently without making it part of the public interface.

* Fix compilation error

* Reverting changes to HoodieMetadataFileSystemView

Co-authored-by: Vinoth Chandar <vinoth@apache.org>
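
The retention rule and the rollback check described in HUDI-2119 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class MetadataRetention and both method names are hypothetical, and instant times are assumed to be fixed-width timestamp strings that compare correctly in lexicographic order.

```java
// Hypothetical helper; names and signatures are illustrative only.
class MetadataRetention {

    // The metadata table keeps at least as many instants as the main dataset keeps.
    static int effectiveMinInstants(int metadataMinInstants, int datasetMinInstants) {
        return Math.max(metadataMinInstants, datasetMinInstants);
    }

    // Throws if the instant being rolled back is older than the earliest instant
    // still present on the metadata table timeline, in which case the check
    // cannot be performed and the metadata table would need a re-bootstrap.
    static void checkRollbackTarget(String rolledBackInstant, String earliestMetadataInstant) {
        if (rolledBackInstant.compareTo(earliestMetadataInstant) < 0) {
            throw new IllegalStateException(
                "Instant " + rolledBackInstant + " is before the metadata table timeline start "
                    + earliestMetadataInstant + "; the metadata table must be re-bootstrapped");
        }
    }
}
```
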
10 months ago[HUDI-2151] Flipping defaults (#3452)
Sivabalan Narayanan [Fri, 13 Aug 2021 23:29:22 +0000 (19:29 -0400)] 
[HUDI-2151] Flipping defaults (#3452)

10 months ago[HUDI-1363] Provide option to drop partition columns (#3465)
Sagar Sumit [Fri, 13 Aug 2021 17:01:26 +0000 (22:31 +0530)] 
[HUDI-1363] Provide option to drop partition columns (#3465)

- Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com>

10 months agoMINOR fix method use error (#3467)
liujinhui [Fri, 13 Aug 2021 11:59:51 +0000 (19:59 +0800)] 
MINOR fix method use error (#3467)

10 months ago[MINOR] Tweak change log more as FULL for flink streaming source (#3466)
Danny Chan [Fri, 13 Aug 2021 08:31:16 +0000 (16:31 +0800)] 
[MINOR] Tweak change log more as FULL for flink streaming source (#3466)

10 months ago[HUDI-2279]Support column name matching for insert * and update set * in merge into...
董可伦 [Fri, 13 Aug 2021 06:10:07 +0000 (14:10 +0800)] 
[HUDI-2279]Support column name matching for insert * and update set * in merge into (#3415)

10 months ago[MINOR] Deprecate older configs (#3464)
Sagar Sumit [Fri, 13 Aug 2021 03:31:04 +0000 (09:01 +0530)] 
[MINOR] Deprecate older configs (#3464)

Rename and deprecate props in HoodieWriteConfig

Rename and deprecate older props

10 months ago[HUDI-1292] Created a config to enable/disable syncing of metadata table. (#3427)
Prashant Wason [Thu, 12 Aug 2021 22:45:57 +0000 (15:45 -0700)] 
[HUDI-1292] Created a config to enable/disable syncing of metadata table. (#3427)

* [HUDI-1292] Created a config to enable/disable syncing of metadata table.

- Metadata Table should only be synced from a single pipeline to prevent conflicts.
- Skip syncing metadata table for clustering and compaction
- Renamed useFileListingMetadata

Co-authored-by: Vinoth Chandar <vinoth@apache.org>
10 months ago[HUDI-2294] Adding virtual keys support to deltastreamer (#3450)
Sivabalan Narayanan [Thu, 12 Aug 2021 12:02:39 +0000 (08:02 -0400)] 
[HUDI-2294] Adding virtual keys support to deltastreamer (#3450)

10 months agoMINOR (#3459)
liujinhui [Thu, 12 Aug 2021 10:19:05 +0000 (18:19 +0800)] 
MINOR (#3459)

Move Hoodie DeltaStreamer to hudi-utilities

10 months ago[MINOR] Correct TestKafkaSource class and comment (#3451)
vinoyang [Thu, 12 Aug 2021 01:11:00 +0000 (09:11 +0800)] 
[MINOR] Correct TestKafkaSource class and comment (#3451)

10 months ago[HUDI-2017] Add API to set a metric in the registry. (#3084)
Prashant Wason [Wed, 11 Aug 2021 23:47:16 +0000 (16:47 -0700)] 
[HUDI-2017] Add API to set a metric in the registry. (#3084)

The Registry.add() API adds the new value to the existing metric value. For some use cases we need an API to set/replace the existing value.

The Metadata Table is synced in the preWrite() and postWrite() functions of a commit. As part of the sync, the current sizes and basefile/logfile counts are published as metrics. If we use the Registry.add() API, the counts and sizes are incorrectly published as the sum of the two values. This is corrected by using the Registry.set() API instead.
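
The difference between accumulating and replacing a metric can be illustrated with a small stand-in class. This is only a sketch: SimpleRegistry is a hypothetical class, not Hudi's Registry interface, and the metric name in the usage note is made up.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a metrics registry; not Hudi's actual Registry.
class SimpleRegistry {
    private final Map<String, Long> values = new ConcurrentHashMap<>();

    // Accumulates: the new value is added to whatever is already stored.
    void add(String name, long value) {
        values.merge(name, value, Long::sum);
    }

    // Replaces: the stored value becomes exactly the new value.
    void set(String name, long value) {
        values.put(name, value);
    }

    // Returns the stored value, or 0 if the metric was never published.
    long get(String name) {
        return values.getOrDefault(name, 0L);
    }
}
```

If a gauge-like value such as a file count of 5 is published once before the write and once after it, add() leaves 10 in the registry while set() leaves 5, which is the discrepancy the commit message describes.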

10 months ago[HUDI-1518] Remove the logic that delete replaced file when archive (#3310)
zhangyue19921010 [Wed, 11 Aug 2021 17:54:44 +0000 (01:54 +0800)] 
[HUDI-1518] Remove the logic that delete replaced file when archive (#3310)

* remove delete replaced file when archive

* done

* remove unused import

* remove delete replaced files when archive related UT

* code reviewed

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
10 months ago[HUDI-1138] Add timeline-server-based marker file strategy for improving marker-relat...
Y Ethan Guo [Wed, 11 Aug 2021 15:48:13 +0000 (08:48 -0700)] 
[HUDI-1138] Add timeline-server-based marker file strategy for improving marker-related latency (#3233)

- Can be enabled for cloud stores like S3. Not supported for HDFS yet, due to partial write failures.

10 months ago[HUDI-2298] The HoodieMergedLogRecordScanner should set up the operation of the chose...
Danny Chan [Wed, 11 Aug 2021 14:55:43 +0000 (22:55 +0800)] 
[HUDI-2298] The HoodieMergedLogRecordScanner should set up the operation of the chosen record (#3456)

10 months ago[HUDI-2286] Handle the case of failed deltacommit on the metadata table. (#3428)
Prashant Wason [Wed, 11 Aug 2021 14:39:48 +0000 (07:39 -0700)] 
[HUDI-2286] Handle the case of failed deltacommit on the metadata table. (#3428)

A failed deltacommit on the metadata table will be automatically rolled back. Assuming the failed commit was "t10", the rollback will happen the next time at "t11". Post rollback, when we try to sync the dataset to the metadata table, we should look for all unsynced instants including t11. The current code ignores t11 since the latest commit timestamp on the metadata table is t11 (due to the rollback).
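
The t10/t11 scenario can be reproduced with a toy model. The sketch below is an assumption-laden illustration and not the fix from the commit: UnsyncedInstants and its methods are hypothetical, instants are plain strings, and "unsynced" is modeled as "not in the set of successfully synced instant times".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of the problem; not Hudi's timeline API.
class UnsyncedInstants {

    // Timestamp-based selection: keeps only instants strictly newer than the
    // latest timestamp on the metadata table. A rollback at t11 makes t11 the
    // latest timestamp, so the dataset instant t11 is skipped.
    static List<String> newerThan(List<String> datasetInstants, String latestMetadataTime) {
        List<String> out = new ArrayList<>();
        for (String t : datasetInstants) {
            if (t.compareTo(latestMetadataTime) > 0) {
                out.add(t);
            }
        }
        return out;
    }

    // Membership-based selection: keeps every dataset instant that has no
    // successfully synced counterpart, regardless of the rollback's timestamp.
    static List<String> notIn(List<String> datasetInstants, Set<String> syncedTimes) {
        List<String> out = new ArrayList<>();
        for (String t : datasetInstants) {
            if (!syncedTimes.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }
}
```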

10 months ago[HUDI-1774] Adding support for delete_partitions to spark data source (#3437)
Sivabalan Narayanan [Wed, 11 Aug 2021 05:03:01 +0000 (01:03 -0400)] 
[HUDI-1774] Adding support for delete_partitions to spark data source (#3437)

10 months ago[HUDI-2292] MOR should not predicate pushdown when reading with payload_combine type...
Shawy Geng [Wed, 11 Aug 2021 04:17:39 +0000 (12:17 +0800)] 
[HUDI-2292] MOR should not predicate pushdown when reading with payload_combine type (#3443)

10 months ago[HUDI-1939] remove joda time in hivesync module (#3430)
Raymond Xu [Wed, 11 Aug 2021 03:25:41 +0000 (20:25 -0700)] 
[HUDI-1939] remove joda time in hivesync module (#3430)

10 months ago[HUDI-2170] [HUDI-1763] Always choose the latest record for HoodieRecordPayload ...
swuferhong [Wed, 11 Aug 2021 02:20:55 +0000 (10:20 +0800)] 
[HUDI-2170] [HUDI-1763] Always choose the latest record for HoodieRecordPayload (#3401)

10 months ago[HUDI-2042] Compare the field object directly in OverwriteWithLatestAvroPayload ...
Shawy Geng [Tue, 10 Aug 2021 21:48:53 +0000 (05:48 +0800)] 
[HUDI-2042] Compare the field object directly in OverwriteWithLatestAvroPayload (#3108)

10 months ago[MINOR] Fix contribution link in PULL_REQUEST_TEMPLATE (#3425)
Damon P. Cortesi [Tue, 10 Aug 2021 20:01:45 +0000 (13:01 -0700)] 
[MINOR] Fix contribution link in PULL_REQUEST_TEMPLATE (#3425)

10 months ago[MINOR] Delete useless com.uber.hoodie.hadoop.hive.HoodieCombineHiveInputFormat ...
vinoyang [Tue, 10 Aug 2021 19:05:31 +0000 (03:05 +0800)] 
[MINOR] Delete useless com.uber.hoodie.hadoop.hive.HoodieCombineHiveInputFormat (#3298)

10 months ago[HUDI-1129] Improving schema evolution support in hudi (#2927)
Sivabalan Narayanan [Tue, 10 Aug 2021 16:15:37 +0000 (12:15 -0400)] 
[HUDI-1129] Improving schema evolution support in hudi (#2927)

* Adding support to ingest records with old schema after table's schema is evolved

* Rebasing against latest master

- Trimming test file to be < 800 lines
- Renaming config names

* Addressing feedback

Co-authored-by: Vinoth Chandar <vinoth@apache.org>
10 months ago[MINOR] Fix travis from errors (#3432)
zhangyue19921010 [Tue, 10 Aug 2021 15:25:49 +0000 (23:25 +0800)] 
[MINOR] Fix travis from errors (#3432)

10 months ago[HUDI-2288] Support storage on ks3 for hudi (#3434)
xuzifu666 [Tue, 10 Aug 2021 15:18:12 +0000 (23:18 +0800)] 
[HUDI-2288] Support storage on ks3 for hudi (#3434)

Co-authored-by: xuzifu <xuzifu.com>
10 months ago[HUDI-1771] Propagate CDC format for hoodie (#3285)
swuferhong [Tue, 10 Aug 2021 12:23:23 +0000 (20:23 +0800)] 
[HUDI-1771] Propagate CDC format for hoodie (#3285)

10 months ago[HUDI-2194] Skip the latest N partitions when choosing partitions to create Clusterin...
zhangyue19921010 [Mon, 9 Aug 2021 17:10:15 +0000 (01:10 +0800)] 
[HUDI-2194] Skip the latest N partitions when choosing partitions to create ClusteringPlan (#3300)

* skip from latest partitions based on hoodie.clustering.plan.strategy.daybased.skipfromlatest.partitions (default 0, which skips nothing)

* change config version

* add ut

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
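
Skipping the latest N partitions when planning clustering can be sketched as below. This is a minimal illustration under stated assumptions: PartitionSkipper and skipLatest are hypothetical names, and partition paths are assumed to be day-based strings (for example 2021/08/09) whose lexicographic order matches date order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; not the clustering plan strategy class from the commit.
class PartitionSkipper {

    // Sorts partitions newest-first and drops the first skipFromLatest entries.
    // A value of 0 keeps every partition.
    static List<String> skipLatest(List<String> partitions, int skipFromLatest) {
        List<String> sorted = new ArrayList<>(partitions);
        sorted.sort(Comparator.reverseOrder());
        int from = Math.min(Math.max(skipFromLatest, 0), sorted.size());
        return new ArrayList<>(sorted.subList(from, sorted.size()));
    }
}
```
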
10 months ago[HUDI-2208] Support Bulk Insert For Spark Sql (#3328)
pengzhiwei [Mon, 9 Aug 2021 04:18:31 +0000 (12:18 +0800)] 
[HUDI-2208] Support Bulk Insert For Spark Sql (#3328)

10 months ago[HUDI-2247] Filter file where length less than parquet MAGIC length (#3363)
yuzhaojing [Mon, 9 Aug 2021 01:15:42 +0000 (09:15 +0800)] 
[HUDI-2247] Filter file where length less than parquet MAGIC length (#3363)

Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
10 months ago[HUDI-2243] Support Time Travel Query For Hoodie Table (#3360)
pengzhiwei [Sat, 7 Aug 2021 23:07:22 +0000 (07:07 +0800)] 
[HUDI-2243] Support Time Travel Query For Hoodie Table (#3360)

10 months ago[HUDI-1842] Spark Sql Support For pre-existing Hoodie Table (#3393)
pengzhiwei [Sat, 7 Aug 2021 11:49:26 +0000 (19:49 +0800)] 
[HUDI-1842] Spark Sql Support For pre-existing Hoodie Table (#3393)

10 months ago[HUDI-1468] Support custom clustering strategies and preserve commit metadata as...
Sagar Sumit [Sat, 7 Aug 2021 02:53:08 +0000 (08:23 +0530)] 
[HUDI-1468] Support custom clustering strategies and preserve commit metadata as part of clustering (#3419)

Co-authored-by: Satish Kotha <satishkotha@uber.com>
10 months ago[MINOR] fix compile error in compaction command (#3421)
pengzhiwei [Fri, 6 Aug 2021 08:18:19 +0000 (16:18 +0800)] 
[MINOR] fix compile error in compaction command (#3421)

10 months ago[HUDI-2182] Support Compaction Command For Spark Sql (#3277)
pengzhiwei [Fri, 6 Aug 2021 07:12:10 +0000 (15:12 +0800)] 
[HUDI-2182] Support Compaction Command For Spark Sql (#3277)

10 months ago[HUDI-2278] Use INT64 timestamp with precision 3 for flink parquet writer (#3414)
Danny Chan [Fri, 6 Aug 2021 03:06:21 +0000 (11:06 +0800)] 
[HUDI-2278] Use INT64 timestamp with precision 3 for flink parquet writer (#3414)

10 months ago[HUDI-2274] Allows INSERT duplicates for Flink MOR table (#3403)
Danny Chan [Fri, 6 Aug 2021 02:30:52 +0000 (10:30 +0800)] 
[HUDI-2274] Allows INSERT duplicates for Flink MOR table (#3403)

10 months ago[HUDI-2233] Use HMS To Sync Hive Meta For Spark Sql (#3387)
pengzhiwei [Thu, 5 Aug 2021 13:57:22 +0000 (21:57 +0800)] 
[HUDI-2233] Use HMS To Sync Hive Meta For Spark Sql (#3387)

10 months ago[HUDI-2273] Migrating some long running tests to functional test profile (#3398)
Sivabalan Narayanan [Wed, 4 Aug 2021 23:08:50 +0000 (19:08 -0400)] 
[HUDI-2273] Migrating some long running tests to functional test profile (#3398)

10 months ago[HUDI-2232] [SQL] MERGE INTO fails with table having nested struct (#3379)
pengzhiwei [Wed, 4 Aug 2021 10:20:29 +0000 (18:20 +0800)] 
[HUDI-2232] [SQL] MERGE INTO fails with table having nested struct (#3379)

10 months ago[HUDI-2087] Support Append only in Flink stream (#3390)
yuzhaojing [Wed, 4 Aug 2021 09:53:20 +0000 (17:53 +0800)] 
[HUDI-2087] Support Append only in Flink stream (#3390)

Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
10 months ago[HUDI-2258] Metadata table for flink (#3381)
Danny Chan [Wed, 4 Aug 2021 02:54:55 +0000 (10:54 +0800)] 
[HUDI-2258] Metadata table for flink (#3381)

10 months ago[HUDI-2090] Ensure Disk Maps create a subfolder with appropriate prefixes and cleans...
rmahindra123 [Wed, 4 Aug 2021 00:51:25 +0000 (17:51 -0700)] 
[HUDI-2090] Ensure Disk Maps create a subfolder with appropriate prefixes and cleans them up on close (#3329)

* Add UUID to the folder name for External Spillable File System

* Fix to ensure that Disk maps folders do not interfere across users

* Fix test

* Fix test

* Rebase with latest master and address comments

* Add Shutdown Hooks for the Disk Map

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
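
The idea of a per-instance spill folder with a UUID in its name, cleaned up on close and also by a shutdown hook, can be sketched with standard java.nio calls. This is an illustration only: SpillDirectory is a hypothetical class and is not the disk map implementation touched by HUDI-2090.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

// Hypothetical sketch of a uniquely named spill folder with cleanup.
class SpillDirectory implements AutoCloseable {
    private final Path dir;

    // Creates basePath/<prefix>-<uuid> so concurrent users never share a folder,
    // and registers a shutdown hook as a fallback if close() is never called.
    SpillDirectory(Path basePath, String prefix) throws IOException {
        this.dir = Files.createDirectories(basePath.resolve(prefix + "-" + UUID.randomUUID()));
        Runtime.getRuntime().addShutdownHook(new Thread(this::deleteQuietly));
    }

    Path path() {
        return dir;
    }

    @Override
    public void close() {
        deleteQuietly();
    }

    // Best-effort recursive delete; children are removed before their parents.
    private void deleteQuietly() {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        } catch (IOException e) {
            // Ignored in this sketch: cleanup is best effort.
        }
    }
}
```
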
10 months ago[HUDI-2255] Refactor Datasource options (#3373)
wenningd [Wed, 4 Aug 2021 00:50:30 +0000 (17:50 -0700)] 
[HUDI-2255] Refactor Datasource options (#3373)

Co-authored-by: Wenning Ding <wenningd@amazon.com>
10 months ago[HUDI-1371] [HUDI-1893] Support metadata based listing for Spark DataSource and Spark...
Udit Mehrotra [Tue, 3 Aug 2021 21:47:40 +0000 (14:47 -0700)] 
[HUDI-1371] [HUDI-1893] Support metadata based listing for Spark DataSource and Spark SQL (#2893)

10 months ago[HUDI-2272] Pass base file format to sync clients (#3397)
rmahindra123 [Tue, 3 Aug 2021 21:46:02 +0000 (14:46 -0700)] 
[HUDI-2272] Pass base file format to sync clients (#3397)

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
10 months ago[HUDI-2072] Add pre-commit validator framework (#3153)
satishkotha [Tue, 3 Aug 2021 19:07:45 +0000 (12:07 -0700)] 
[HUDI-2072] Add pre-commit validator framework (#3153)

* [HUDI-2072] Add pre-commit validator framework

* trigger Travis rebuild

10 months ago[HUDI-2269] Release the disk map resource for flink streaming reader (#3384) 3378/head
Danny Chan [Tue, 3 Aug 2021 05:55:35 +0000 (13:55 +0800)] 
[HUDI-2269] Release the disk map resource for flink streaming reader (#3384)

10 months ago[HUDI-2225] Add a compaction job in hudi-examples (#3347)
Sagar Sumit [Tue, 3 Aug 2021 03:31:56 +0000 (09:01 +0530)] 
[HUDI-2225] Add a compaction job in hudi-examples (#3347)

10 months ago[MINOR] Improving runtime of TestStructuredStreaming by 2 mins (#3382)
vinoth chandar [Mon, 2 Aug 2021 20:42:46 +0000 (13:42 -0700)] 
[MINOR] Improving runtime of TestStructuredStreaming by 2 mins (#3382)

10 months ago[HUDI-2177][HUDI-2200] Adding virtual keys support for MOR table (#3315)
Sivabalan Narayanan [Mon, 2 Aug 2021 13:45:09 +0000 (09:45 -0400)] 
[HUDI-2177][HUDI-2200] Adding virtual keys support for MOR table (#3315)

10 months ago[HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieCl...
zhangyue19921010 [Mon, 2 Aug 2021 00:07:59 +0000 (08:07 +0800)] 
[HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering (#3259)

* add --mode schedule/execute/scheduleandexecute

* fix checkstyle

* add UT testHoodieAsyncClusteringJobWithScheduleAndExecute

* log changed

* try to make ut success

* try to fix ut

* modify ut

* review changed

* code review

* code review

* code review

* code review

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
10 months ago[HUDI-2218] Fix missing HoodieWriteStat in HoodieCreateHandle (#3341)
Gary Li [Fri, 30 Jul 2021 09:36:57 +0000 (17:36 +0800)] 
[HUDI-2218] Fix missing HoodieWriteStat in HoodieCreateHandle (#3341)

10 months ago[HUDI-2184] Support setting hive sync partition extractor class based on flink config...
swuferhong [Fri, 30 Jul 2021 09:24:00 +0000 (17:24 +0800)] 
[HUDI-2184] Support setting hive sync partition extractor class based on flink configuration (#3284)

10 months ago[HUDI-2254] Builtin sort operator for flink bulk insert (#3372)
Danny Chan [Fri, 30 Jul 2021 08:58:11 +0000 (16:58 +0800)] 
[HUDI-2254] Builtin sort operator for flink bulk insert (#3372)

10 months ago[HUDI-2252] Default consumes from the latest instant for flink streaming reader ...
swuferhong [Fri, 30 Jul 2021 06:25:05 +0000 (14:25 +0800)] 
[HUDI-2252] Default consumes from the latest instant for flink streaming reader (#3368)

10 months ago[HUDI-2253] Refactoring few tests to reduce runningtime. DeltaStreamer and MultiDelta...
Sivabalan Narayanan [Fri, 30 Jul 2021 05:22:26 +0000 (01:22 -0400)] 
[HUDI-2253] Refactoring few tests to reduce runningtime. DeltaStreamer and MultiDeltaStreamer tests. Bulk insert row writer tests (#3371)

Co-authored-by: Sivabalan Narayanan <nsb@Sivabalans-MBP.attlocal.net>
10 months ago[HUDI-2251] Fix Exception Cause By Table Name Case Sensitivity For Append Mode Write...
pengzhiwei [Thu, 29 Jul 2021 21:36:56 +0000 (05:36 +0800)] 
[HUDI-2251] Fix Exception Cause By Table Name Case Sensitivity For Append Mode Write (#3367)

10 months ago[HUDI-2117] Unpersist the input rdd after the commit is completed to … (#3207)
Shawy Geng [Thu, 29 Jul 2021 15:16:58 +0000 (23:16 +0800)] 
[HUDI-2117] Unpersist the input rdd after the commit is completed to … (#3207)

Co-authored-by: Vinoth Chandar <vinoth@apache.org>
10 months ago[MINOR] fix check style error (#3365)
pengzhiwei [Thu, 29 Jul 2021 09:29:10 +0000 (17:29 +0800)] 
[MINOR] fix check style error (#3365)

10 months ago[HUDI-1425] Performance loss with the additional hoodieRecords.isEmpty() in HoodieSpa...
pengzhiwei [Thu, 29 Jul 2021 04:30:18 +0000 (12:30 +0800)] 
[HUDI-1425] Performance loss with the additional hoodieRecords.isEmpty() in HoodieSparkSqlWriter#write (#2296)

10 months ago[HUDI-2241] Explicit parallelism for flink bulk insert (#3357)
Danny Chan [Thu, 29 Jul 2021 01:57:37 +0000 (09:57 +0800)] 
[HUDI-2241] Explicit parallelism for flink bulk insert (#3357)

10 months ago[HUDI-2228] Add option 'hive_sync.mode' for flink writer (#3352)
swuferhong [Wed, 28 Jul 2021 11:45:50 +0000 (19:45 +0800)] 
[HUDI-2228] Add option 'hive_sync.mode' for flink writer (#3352)

10 months ago[HUDI-2244] Fix database alreadyExists exception while hive sync (#3361)
swuferhong [Wed, 28 Jul 2021 11:40:16 +0000 (19:40 +0800)] 
[HUDI-2244] Fix database alreadyExists exception while hive sync (#3361)

10 months ago[HUDI-2245] BucketAssigner generates the fileId evenly to avoid data skew (#3362)
Danny Chan [Wed, 28 Jul 2021 11:26:37 +0000 (19:26 +0800)] 
[HUDI-2245] BucketAssigner generates the fileId evenly to avoid data skew (#3362)

10 months ago[HUDI-2230] Make codahale times transient to avoid serializable exceptions (#3345)
davehagman [Wed, 28 Jul 2021 06:45:09 +0000 (02:45 -0400)] 
[HUDI-2230] Make codahale times transient to avoid serializable exceptions (#3345)

10 months ago[HUDI-2044] Integrate consumers with rocksDB and compression within External Spillabl...
rmahindra123 [Wed, 28 Jul 2021 05:31:03 +0000 (22:31 -0700)] 
[HUDI-2044] Integrate consumers with rocksDB and compression within External Spillable Map (#3318)

10 months ago[HUDI-2215] Add rateLimiter when Flink writes to hudi. (#3338)
mincwang [Wed, 28 Jul 2021 00:23:23 +0000 (08:23 +0800)] 
[HUDI-2215] Add rateLimiter when Flink writes to hudi. (#3338)

Co-authored-by: wangminchao <wangminchao@asinking.com>
10 months ago[HUDI-2227] Only sync hive meta on successful commit for flink batch writer (#3351)
Danny Chan [Tue, 27 Jul 2021 12:10:08 +0000 (20:10 +0800)] 
[HUDI-2227] Only sync hive meta on successful commit for flink batch writer (#3351)

10 months ago[HUDI-2223] Fix Alter Partitioned Table Failed (#3350)
pengzhiwei [Tue, 27 Jul 2021 12:01:04 +0000 (20:01 +0800)] 
[HUDI-2223] Fix Alter Partitioned Table Failed (#3350)

10 months ago[HUDI-2217] Fix no value present in incremental query on MOR (#3340)
Gary Li [Tue, 27 Jul 2021 09:30:01 +0000 (17:30 +0800)] 
[HUDI-2217] Fix no value present in incremental query on MOR (#3340)

10 months ago[HUDI-2219] Fix NPE of HoodieConfig (#3342)
Danny Chan [Tue, 27 Jul 2021 07:18:05 +0000 (15:18 +0800)] 
[HUDI-2219] Fix NPE of HoodieConfig (#3342)

10 months ago[HUDI-2209] Bulk insert for flink writer (#3334)
Danny Chan [Tue, 27 Jul 2021 02:58:23 +0000 (10:58 +0800)] 
[HUDI-2209] Bulk insert for flink writer (#3334)

10 months ago[MINOR] Correct the words accroding in the comments to according (#3343)
xiang2102 [Tue, 27 Jul 2021 00:48:58 +0000 (08:48 +0800)] 
[MINOR] Correct the words accroding in the comments to according (#3343)

Correct the words 'accroding' in the comments to 'according'

11 months ago[HUDI-2176, 2178, 2179] Adding virtual key support to COW table (#3306)
Sivabalan Narayanan [Mon, 26 Jul 2021 21:21:04 +0000 (17:21 -0400)] 
[HUDI-2176, 2178, 2179] Adding virtual key support to COW table (#3306)

11 months ago[HUDI-2214]residual temporary files after clustering are not cleaned up (#3335)
xiarixiaoyao [Mon, 26 Jul 2021 17:26:20 +0000 (01:26 +0800)] 
[HUDI-2214]residual temporary files after clustering are not cleaned up (#3335)

11 months ago[MINOR] Close log scanner after compaction completed (#3294)
Gary Li [Mon, 26 Jul 2021 09:39:13 +0000 (17:39 +0800)] 
[MINOR] Close log scanner after compaction completed (#3294)

11 months ago[HUDI-2216] Correct the words fiels in the comments to fields (#3339)
董可伦 [Sun, 25 Jul 2021 04:15:57 +0000 (12:15 +0800)] 
[HUDI-2216] Correct the words fiels in the comments to fields (#3339)

11 months ago[HUDI-1241] Automate the generation of configs webpage as configs are added to Hudi...
rmahindra123 [Sat, 24 Jul 2021 04:33:34 +0000 (21:33 -0700)] 
[HUDI-1241] Automate the generation of configs webpage as configs are added to Hudi repo (#3302)

11 months ago[MINOR] Replace deprecated method isDir with isDirectory (#3319)
Xuedong Luan [Sat, 24 Jul 2021 02:02:24 +0000 (10:02 +0800)] 
[MINOR] Replace deprecated method isDir with isDirectory (#3319)

11 months ago[HUDI-1848] Adding support for HMS for running DDL queries in hive-sy… (#2879)
jsbali [Fri, 23 Jul 2021 16:03:15 +0000 (21:33 +0530)] 
[HUDI-1848] Adding support for HMS for running DDL queries in hive-sy… (#2879)

* [HUDI-1848] Adding support for HMS for running DDL queries in hive-sync-tool

* [HUDI-1848] Fixing test cases

* [HUDI-1848] CR changes

* [HUDI-1848] Fix checkstyle violations

* [HUDI-1848] Fixed a bug when metastore api fails for complex schemas with multiple levels.

* [HUDI-1848] Adding the complex schema and resolving merge conflicts

* [HUDI-1848] Adding some more javadocs

* [HUDI-1848] Added javadocs for DDLExecutor impls

* [HUDI-1848] Fixed style issue

11 months ago[HUDI-2213] Remove unnecessary parameter for HoodieMetrics constructor and fix NPE...
Xuedong Luan [Fri, 23 Jul 2021 11:57:35 +0000 (19:57 +0800)] 
[HUDI-2213] Remove unnecessary parameter for HoodieMetrics constructor and fix NPE in UT (#3333)

11 months ago[HUDI-2212] Missing PrimaryKey In Hoodie Properties For CTAS Table (#3332)
pengzhiwei [Fri, 23 Jul 2021 07:21:57 +0000 (15:21 +0800)] 
[HUDI-2212] Missing PrimaryKey In Hoodie Properties For CTAS Table (#3332)

11 months ago[HUDI-2211] Fix NullPointerException in TestHoodieConsoleMetrics (#3331)
Xuedong Luan [Fri, 23 Jul 2021 03:22:54 +0000 (11:22 +0800)] 
[HUDI-2211] Fix NullPointerException in TestHoodieConsoleMetrics (#3331)