iceberg.git
25 hours agoSpark: Update Antlr in Spark 3.2 extensions to 4.8 (#5208) 0.13.x
John Zhuge [Wed, 6 Jul 2022 16:52:59 +0000 (09:52 -0700)] 
Spark: Update Antlr in Spark 3.2 extensions to 4.8 (#5208)

This will match the Antlr version in Spark 3.2.

5 weeks agoAdd version.txt for release 0.13.2 apache-iceberg-0.13.2 apache-iceberg-0.13.2-rc1
Russell Spitzer [Wed, 1 Jun 2022 19:37:18 +0000 (14:37 -0500)] 
Add version.txt for release 0.13.2

5 weeks agoDev: Fix Source Release Script (#4932)
Russell Spitzer [Wed, 1 Jun 2022 18:48:42 +0000 (13:48 -0500)] 
Dev: Fix Source Release Script (#4932)

5 weeks agoSpark: Extend commit unknown exception handling to SparkPositionDeltaWrite (#4893)
Prashant Singh [Wed, 1 Jun 2022 14:37:52 +0000 (20:07 +0530)] 
Spark: Extend commit unknown exception handling to SparkPositionDeltaWrite (#4893)

Co-authored-by: Prashant Singh <psinghvk@amazon.com>
5 weeks agoCore: Fix query failure when using projection on top of partitions metadata table...
Szehon Ho [Sun, 29 May 2022 20:39:19 +0000 (13:39 -0700)] 
Core: Fix query failure when using projection on top of partitions metadata table (#4720) (#4890)

6 weeks agoSpark: Fix Alignment of Merge Commands with Mixed Case (#4848) (#4874)
Russell Spitzer [Wed, 25 May 2022 23:54:21 +0000 (16:54 -0700)] 
Spark: Fix Alignment of Merge Commands with Mixed Case (#4848) (#4874)

* Spark: Fix Alignment of Merge Commands with Mixed Case

Prior to this a mixed-case insert statement would fail to be marked
as aligned after our alignment rule was applied. This would then fail the
entire MERGE INTO command. The commands were correctly aligned but
our alignment check was always case sensitive.
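
The failure mode described above can be illustrated with a minimal Python sketch. The function name `is_aligned` and its `case_sensitive` parameter are hypothetical and are not Iceberg's actual rule; the sketch only shows why a purely case-sensitive comparison rejects a column list that is already in the right order.

```python
def is_aligned(insert_columns, table_columns, case_sensitive=False):
    """Return True when the insert columns line up with the table columns.

    Hypothetical helper: with case_sensitive=True it mimics a check that
    compares names exactly; otherwise names are compared after lowercasing.
    """
    if case_sensitive:
        return list(insert_columns) == list(table_columns)
    lowered_insert = [name.lower() for name in insert_columns]
    lowered_table = [name.lower() for name in table_columns]
    return lowered_insert == lowered_table


# A mixed-case column list that is positionally correct.
insert_cols = ["ID", "Data"]
table_cols = ["id", "data"]

strict_result = is_aligned(insert_cols, table_cols, case_sensitive=True)
relaxed_result = is_aligned(insert_cols, table_cols)
```

Here `strict_result` is False even though the columns are in the correct order, which corresponds to the spurious failure, while `relaxed_result` is True.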

6 weeks agoSpark: Backport CommitStateUnknownException handling for RewriteManifestSparkAction...
Eduard Tudenhöfner [Tue, 24 May 2022 18:50:24 +0000 (20:50 +0200)] 
Spark: Backport CommitStateUnknownException handling for RewriteManifestSparkAction (#4850) (#4854)

Co-authored-by: Prashant Singh <psinghvk@amazon.com>
6 weeks agoCore: Backport filter pushdown fix for metadata tables with evolved specs to 0.13...
Szehon Ho [Tue, 24 May 2022 14:58:39 +0000 (07:58 -0700)] 
Core: Backport filter pushdown fix for metadata tables with evolved specs to 0.13 (#4520) (#4569)

6 weeks agoSpark: Handle CommitStateUnknown exception in RewriteManifestSparkAction (#4836)...
Eduard Tudenhöfner [Tue, 24 May 2022 14:56:17 +0000 (16:56 +0200)] 
Spark: Handle CommitStateUnknown exception in RewriteManifestSparkAction (#4836) (#4852)

Co-authored-by: Prashant Singh <35593236+singhpk234@users.noreply.github.com>
Co-authored-by: Prashant Singh <psinghvk@amazon.com>
6 weeks agoNessie: Fix NPE while accessing refreshed table in nessie catalog (#4509) (#4840)
Eduard Tudenhöfner [Mon, 23 May 2022 23:55:19 +0000 (01:55 +0200)] 
Nessie: Fix NPE while accessing refreshed table in nessie catalog (#4509) (#4840)

Co-authored-by: Ajantha Bhat <ajanthabhat@gmail.com>
6 weeks agoFlink: Backport upsert delete file metadata fixes to 0.13 (#4786)
Eduard Tudenhöfner [Thu, 19 May 2022 20:20:47 +0000 (22:20 +0200)] 
Flink: Backport upsert delete file metadata fixes to 0.13 (#4786)

Co-authored-by: Kyle Bendickson <kjbendickson@gmail.com>
Co-authored-by: liliwei <hililiwei@gmail.com>
Co-authored-by: wangzeyu <1249369293@qq.com>
Co-authored-by: openinx <openinx@gmail.com>
7 weeks agoSpark: Update commit state unknown handling (backport to 0.13) (#4787)
Eduard Tudenhöfner [Tue, 17 May 2022 16:49:25 +0000 (18:49 +0200)] 
Spark: Update commit state unknown handling (backport to 0.13) (#4787)

Co-authored-by: Russell Spitzer <rspitzer@apple.com>
7 weeks agoCore: Fix transaction retry logic (#4464) (#4783)
Eduard Tudenhöfner [Tue, 17 May 2022 16:46:41 +0000 (18:46 +0200)] 
Core: Fix transaction retry logic (#4464) (#4783)

Co-authored-by: Ajantha Bhat <ajanthabhat@gmail.com>
7 weeks agoCore: Fix delete file handling in upgraded tables with rewritten manifests (#4514...
Eduard Tudenhöfner [Tue, 17 May 2022 16:45:38 +0000 (18:45 +0200)] 
Core: Fix delete file handling in upgraded tables with rewritten manifests (#4514) (#4782)

Co-authored-by: vanliu <vanliu@tencent.com>
7 weeks agoSpark: Fix NPEs in Spark value converter (#4663) (#4781)
Eduard Tudenhöfner [Tue, 17 May 2022 16:45:08 +0000 (18:45 +0200)] 
Spark: Fix NPEs in Spark value converter (#4663) (#4781)

Co-authored-by: Edgar Rodriguez <edgar.rodriguez@airbnb.com>
7 weeks agoCore: Fix table corruption from OOM during commit cleanup (#4673) (#4779)
Eduard Tudenhöfner [Tue, 17 May 2022 16:44:31 +0000 (18:44 +0200)] 
Core: Fix table corruption from OOM during commit cleanup (#4673) (#4779)

Co-authored-by: Ryan Blue <blue@apache.org>
8 weeks agoFlink 1.12: Log a warning message when upsert is enabled (#4754)
Kyle Bendickson [Thu, 12 May 2022 14:24:35 +0000 (07:24 -0700)] 
Flink 1.12: Log a warning message when upsert is enabled (#4754)

2 months agoCore: Fixes read metadata failed after dropped partition transform for V1 format...
Xianyang Liu [Mon, 18 Apr 2022 17:12:14 +0000 (01:12 +0800)] 
Core: Fixes read metadata failed after dropped partition transform for V1 format (#3411) (#4572)

4 months agoCore: Remove accidentally added method to TableMetadata (#4155)
Wing Yew Poon [Fri, 18 Feb 2022 23:36:10 +0000 (15:36 -0800)] 
Core: Remove accidentally added method to TableMetadata (#4155)

4 months agoCore: Fix history timestamp for rollbacks (#4135)
Rishi [Thu, 17 Feb 2022 18:03:43 +0000 (10:03 -0800)] 
Core: Fix history timestamp for rollbacks (#4135)

4 months agoAdd version.txt for release 0.13.1 apache-iceberg-0.13.1
Jack Ye [Thu, 10 Feb 2022 23:21:35 +0000 (15:21 -0800)] 
Add version.txt for release 0.13.1

4 months agoFlink: Ensure temp manifest names are unique across tasks (#3986)
Peidian li [Fri, 4 Feb 2022 17:38:39 +0000 (01:38 +0800)] 
Flink: Ensure temp manifest names are unique across tasks (#3986)

4 months agoSpark: Fix create table in Hadoop catalog root namespace (#4024)
Cheng Pan [Wed, 2 Feb 2022 22:06:16 +0000 (06:06 +0800)] 
Spark: Fix create table in Hadoop catalog root namespace (#4024)

4 months agoSpark 3.2: Fix predicate pushdown in row-level operations (#4023)
Anton Okolnychyi [Tue, 1 Feb 2022 20:46:48 +0000 (12:46 -0800)] 
Spark 3.2: Fix predicate pushdown in row-level operations (#4023)

5 months agoAdd version.txt for release 0.13.0 apache-iceberg-0.13.0
Jack Ye [Fri, 28 Jan 2022 08:56:28 +0000 (00:56 -0800)] 
Add version.txt for release 0.13.0

5 months agoSpark 3.2: Fix cardinality check for alternative join implementations (#3992) release-base-0.13.0
Anton Okolnychyi [Fri, 28 Jan 2022 01:30:04 +0000 (17:30 -0800)] 
Spark 3.2: Fix cardinality check for alternative join implementations (#3992)

5 months agoDocs: Add s3.checksum-enabled to AWS (#3996)
Ashish Singh [Fri, 28 Jan 2022 00:17:56 +0000 (16:17 -0800)] 
Docs: Add s3.checksum-enabled to AWS (#3996)

5 months agoDocs: Fix MapType example (#3993)
liliwei [Fri, 28 Jan 2022 00:17:01 +0000 (08:17 +0800)] 
Docs: Fix MapType example (#3993)

5 months agoDocs: Update release instructions (#3982)
Jack Ye [Fri, 28 Jan 2022 00:14:47 +0000 (16:14 -0800)] 
Docs: Update release instructions (#3982)

5 months agoAWS: Support checksum validation with S3 eTags (#3813)
Ashish Singh [Tue, 25 Jan 2022 20:50:19 +0000 (12:50 -0800)] 
AWS: Support checksum validation with S3 eTags (#3813)

* [S3FileIO] Add capability to perform checksum validations using S3 eTags.

* fix checkstyle error

* Update to move checksum checks to s3 server side

* Enable s3 checksum checks in aws integration tests

* Catch protocol error and log helpful error message

* Use digest bytes instead of MessageDigest and update tests

* Fix checkstyle failure

* Use DigestOutputStream

* Remove redundant spaces

* rename etag to checksum in leftover places

* address

* Remove unused import

* Config name change

* minor updates

5 months agoDocs: Add section to include instructions for Hive on Tez (#3944)
0xffmeta [Tue, 25 Jan 2022 10:54:36 +0000 (18:54 +0800)] 
Docs: Add section to include instructions for Hive on Tez (#3944)

5 months agoSpark 3.2: Revise distribution and ordering for merge-on-read DELETE (#3970)
Anton Okolnychyi [Tue, 25 Jan 2022 08:03:36 +0000 (00:03 -0800)] 
Spark 3.2: Revise distribution and ordering for merge-on-read DELETE (#3970)

5 months agoDocs: Add Amazon EMR announcement (#3976)
Rajarshi Sarkar [Tue, 25 Jan 2022 05:46:57 +0000 (11:16 +0530)] 
Docs: Add Amazon EMR announcement (#3976)

5 months agoAWS: fix Glue catalog for unknown commit status (#3967)
夏川和 [Tue, 25 Jan 2022 01:14:43 +0000 (17:14 -0800)] 
AWS: fix Glue catalog for unknown commit status (#3967)

5 months agoSpark 3.2: Add tests for copy-on-write MERGE distribution and ordering (#3964)
Anton Okolnychyi [Mon, 24 Jan 2022 22:48:45 +0000 (14:48 -0800)] 
Spark 3.2: Add tests for copy-on-write MERGE distribution and ordering (#3964)

5 months agoAWS: show old fields in Glue table (#3888)
夏川和 [Mon, 24 Jan 2022 21:29:48 +0000 (13:29 -0800)] 
AWS: show old fields in Glue table (#3888)

5 months agoCore: Add reserved UUID Table Property and Expose in HMS. (#3914)
Yufei Gu [Mon, 24 Jan 2022 20:39:47 +0000 (12:39 -0800)] 
Core: Add reserved UUID Table Property and Expose in HMS. (#3914)

Co-authored-by: Karuppayya Rajendran <karuppayya.rajendran@apple.com>
Co-authored-by: Yufei Gu <yufei_gu@apple.com>
5 months agoAWS: fix Iceberg to Glue schema conversion (#3887)
夏川和 [Mon, 24 Jan 2022 19:55:20 +0000 (11:55 -0800)] 
AWS: fix Iceberg to Glue schema conversion (#3887)

5 months agoCore: Fix delete file index with manifests of only existing files (#3943)
Peidian li [Mon, 24 Jan 2022 19:07:53 +0000 (03:07 +0800)] 
Core: Fix delete file index with manifests of only existing files (#3943)

5 months agoCore: Allow removing and adding the same partition field as a noop (#3954)
Nan Zhu [Mon, 24 Jan 2022 19:05:46 +0000 (11:05 -0800)] 
Core: Allow removing and adding the same partition field as a noop (#3954)

5 months agoHive: Make Iceberg table filter optional in HiveCatalog (#3908)
vanliu [Mon, 24 Jan 2022 19:01:37 +0000 (03:01 +0800)] 
Hive: Make Iceberg table filter optional in HiveCatalog (#3908)

This adds an option to return all Hive tables, not just Iceberg tables, to avoid loading metadata, which slows down the operation.

5 months agoParquet: NPE in Parquet Writer Metrics when data value max bound will overflow (...
Szehon Ho [Mon, 24 Jan 2022 18:38:16 +0000 (10:38 -0800)] 
Parquet: NPE in Parquet Writer Metrics when data value max bound will overflow (#3760)

Previously, writing metrics whose max bound would overflow when incremented resulted in a null value for the metrics and caused an NPE when that value was put in the Metrics array. Now null values returned from truncate are ignored instead.
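
As an illustration only, the overflow case can be sketched in Python. The names `truncate_binary_max` and `collect_upper_bounds` are hypothetical, and the rule used here (truncate to a prefix, then increment the last byte that is not 0xFF) is an assumption about how an upper bound is derived, not a copy of the Parquet writer code.

```python
def truncate_binary_max(value: bytes, length: int):
    """Return an upper bound for value that is at most `length` bytes long.

    Returns None when every byte of the truncated prefix is 0xFF, because
    the prefix cannot be incremented without overflowing.
    """
    if len(value) <= length:
        return value
    prefix = bytearray(value[:length])
    # Walk backwards to find a byte that can be incremented.
    for i in reversed(range(len(prefix))):
        if prefix[i] != 0xFF:
            prefix[i] += 1
            return bytes(prefix[: i + 1])
    return None  # overflow: no valid truncated upper bound exists


def collect_upper_bounds(column_maxes: dict, length: int) -> dict:
    """Build the upper-bound map, skipping columns whose bound overflowed."""
    bounds = {}
    for field_id, value in column_maxes.items():
        truncated = truncate_binary_max(value, length)
        if truncated is not None:  # ignore nulls instead of storing them
            bounds[field_id] = truncated
    return bounds


bounds = collect_upper_bounds({1: b"\xff\xff\x01", 2: b"ab"}, 2)
```

In this sketch field 1 overflows and is left out of `bounds`, so no null value is ever stored.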

5 months agoPython: Fix type for partition_type struct fields (#3939) (#3940)
cccs-eric [Mon, 24 Jan 2022 16:27:23 +0000 (11:27 -0500)] 
Python: Fix type for partition_type struct fields (#3939) (#3940)

Signed-off-by: cccs-eric <eric.ladouceur@cyber.gc.ca>
5 months agoData: Read metrics in parallel during TableMigration (#3876)
kingeasternsun [Mon, 24 Jan 2022 13:40:52 +0000 (21:40 +0800)] 
Data: Read metrics in parallel during TableMigration (#3876)

Adds a parameter for reading the metrics of files in parallel, rather than one at a time in TableMigrationUtils.

Co-authored-by: King <wangdongyang@deepexi.com>
5 months agoCore: Added no-arg constructor in ResolvingFileIO (#3923)
Rajarshi Sarkar [Mon, 24 Jan 2022 00:39:55 +0000 (06:09 +0530)] 
Core: Added no-arg constructor in ResolvingFileIO (#3923)

5 months agoPython: Add FileIO, InputFile, and OutputFile abstract base classes (#3691)
Samuel Redai [Mon, 24 Jan 2022 00:37:31 +0000 (16:37 -0800)] 
Python: Add FileIO, InputFile, and OutputFile abstract base classes (#3691)

5 months agoCore: Fix an error message in BinPackStrategy (#3919)
Zhangg7723 [Mon, 24 Jan 2022 00:31:57 +0000 (08:31 +0800)] 
Core: Fix an error message in BinPackStrategy (#3919)

5 months agoCore: Deprecate the MERGE cardinality check property (#3953)
Anton Okolnychyi [Sat, 22 Jan 2022 01:05:48 +0000 (17:05 -0800)] 
Core: Deprecate the MERGE cardinality check property (#3953)

5 months agoFix SparkCatalog time travel check. (#3942)
Ryan Blue [Sat, 22 Jan 2022 00:28:28 +0000 (16:28 -0800)] 
Fix SparkCatalog time travel check. (#3942)

5 months agoSpark 3.2: Revise distribution and ordering in copy-on-write UPDATE (#3949)
Anton Okolnychyi [Fri, 21 Jan 2022 23:46:24 +0000 (15:46 -0800)] 
Spark 3.2: Revise distribution and ordering in copy-on-write UPDATE (#3949)

5 months agoSpark: Backport Streaming Test Refactors (#3948)
Russell Spitzer [Fri, 21 Jan 2022 20:00:19 +0000 (14:00 -0600)] 
Spark: Backport Streaming Test Refactors (#3948)

Back-porting test refactor from #3775

5 months agoSpark 3.2: Revise distribution and ordering in copy-on-write DELETE (#3930)
Anton Okolnychyi [Fri, 21 Jan 2022 19:38:29 +0000 (11:38 -0800)] 
Spark 3.2: Revise distribution and ordering in copy-on-write DELETE (#3930)

5 months agoJMH: Parameterize spark project version for JMH Benchmarks (#3946)
Eduard Tudenhöfner [Fri, 21 Jan 2022 17:09:39 +0000 (18:09 +0100)] 
JMH: Parameterize spark project version for JMH Benchmarks (#3946)

Given that we have multiple Spark project versions in the codebase and
that users might want to run a particular Benchmark from a specific
Spark version, we should make the Spark project version a parameter of
the JMH Benchmark Action.

5 months agoHive: Do not skip IO config serialization for metadata queries (#3911) 3952/head
Marton Bod [Thu, 20 Jan 2022 10:22:50 +0000 (11:22 +0100)] 
Hive: Do not skip IO config serialization for metadata queries (#3911)

5 months ago[Spark] Backport rewrite data files are eliminated by deletes to Spark v3.0 and Spark...
xloya [Thu, 20 Jan 2022 05:47:02 +0000 (13:47 +0800)] 
[Spark] Backport rewrite data files are eliminated by deletes to Spark v3.0 and Spark v3.1 (#3935)

Backport of #3724 to Spark 3.0 and 3.1

Co-authored-by: Jiebao Xiao <xiaojiebao@xiaomi.com>
5 months agoFlink 1.14: Add tests to check whether should remove meta columns in source reader...
liliwei [Thu, 20 Jan 2022 02:34:43 +0000 (10:34 +0800)] 
Flink 1.14: Add tests to check whether should remove meta columns in source reader (#3893)

5 months agoFlink 1.12: Fix SerializableTable with Kryo (#3926)
openinx [Thu, 20 Jan 2022 01:52:10 +0000 (09:52 +0800)] 
Flink 1.12: Fix SerializableTable with Kryo (#3926)

5 months agoSpark: Add helper to register truncate UDF (#3708)
xiaotianzhang01 [Thu, 20 Jan 2022 00:42:26 +0000 (08:42 +0800)] 
Spark: Add helper to register truncate UDF (#3708)

Co-authored-by: zhangxiaotian13 <zhangxiaotian13@jd.com>
5 months agoDocs: Add compression codec options (#3892)
liliwei [Wed, 19 Jan 2022 23:23:59 +0000 (07:23 +0800)] 
Docs: Add compression codec options (#3892)

5 months agoPython: Fix incorrect single-value encoding for boolean (#3924) (#3927)
cccs-eric [Wed, 19 Jan 2022 23:22:39 +0000 (18:22 -0500)] 
Python: Fix incorrect single-value encoding for boolean (#3924) (#3927)

Signed-off-by: cccs-eric <eric.ladouceur@cyber.gc.ca>
5 months agoPython: Fix quote handling in expression parser (#3875)
Pucheng Yang [Wed, 19 Jan 2022 23:21:44 +0000 (15:21 -0800)] 
Python: Fix quote handling in expression parser (#3875)

5 months agoFlink 1.14: Add Kryo tests for SerializableTable (#3925)
openinx [Wed, 19 Jan 2022 21:53:17 +0000 (05:53 +0800)] 
Flink 1.14: Add Kryo tests for SerializableTable (#3925)

5 months agoFlink: Fix flaky tests that depend on row order (#3931)
Kyle Bendickson [Wed, 19 Jan 2022 19:05:21 +0000 (11:05 -0800)] 
Flink: Fix flaky tests that depend on row order (#3931)

5 months ago[Spark][Core]: Support RewriteDataFiles when Files are Completely Eliminated by Delet...
xloya [Wed, 19 Jan 2022 16:34:43 +0000 (00:34 +0800)] 
[Spark][Core]: Support RewriteDataFiles when Files are Completely Eliminated by Deletes (#3724)

Previously, RewriteDataFiles would fail if the outcome of a rewrite was the complete removal of all DataFiles. This is now a possibility given Merge on Read, so it is now allowed.

Co-authored-by: Jiebao Xiao <xiaojiebao@xiaomi.com>
5 months agoFlink: Fix classloader in Avro ManifestReader (#3906)
Yi Tang [Wed, 19 Jan 2022 00:07:21 +0000 (08:07 +0800)] 
Flink: Fix classloader in Avro ManifestReader (#3906)

5 months agoPython: Expand primitive types to individual classes (#3839)
Nick Ouellet [Wed, 19 Jan 2022 00:03:34 +0000 (19:03 -0500)] 
Python: Expand primitive types to individual classes (#3839)

Co-authored-by: Sam Redai <sam@tabular.io>
5 months agoBuild: Fix source-release script in actions, add git remote validation (#3915)
Kyle Bendickson [Wed, 19 Jan 2022 00:02:41 +0000 (16:02 -0800)] 
Build: Fix source-release script in actions, add git remote validation (#3915)

5 months agoSpark 3.2: Add tests for resolving star actions in MERGE by name (#3918)
Anton Okolnychyi [Tue, 18 Jan 2022 20:17:52 +0000 (12:17 -0800)] 
Spark 3.2: Add tests for resolving star actions in MERGE by name (#3918)

Co-authored-by: Kyle Bendickson <kjbendickson@gmail.com>
5 months agoSpark 3.2: Add tests for multiple NOT MATCHED clauses (#3917)
Anton Okolnychyi [Tue, 18 Jan 2022 17:31:59 +0000 (09:31 -0800)] 
Spark 3.2: Add tests for multiple NOT MATCHED clauses (#3917)

5 months agoDocs: Add LOCALLY ORDERED BY and DISTRIBUTED BY clauses (#3820)
xiaotianzhang01 [Tue, 18 Jan 2022 16:55:59 +0000 (00:55 +0800)] 
Docs: Add LOCALLY ORDERED BY and DISTRIBUTED BY clauses (#3820)

Co-authored-by: zhangxiaotian13 <zhangxiaotian13@jd.com>
5 months agoDocs: Link expire_snapshots to table expiration properties (#3878)
liliwei [Tue, 18 Jan 2022 16:24:10 +0000 (00:24 +0800)] 
Docs: Link expire_snapshots to table expiration properties (#3878)

5 months agoSpark 3.2: Implement merge-on-read DELETE (#3763)
Anton Okolnychyi [Tue, 18 Jan 2022 07:19:44 +0000 (23:19 -0800)] 
Spark 3.2: Implement merge-on-read DELETE (#3763)

5 months agoCore: Split FileScanTasks on Offsets (#460) (#3292)
Russell Spitzer [Sat, 15 Jan 2022 05:42:26 +0000 (23:42 -0600)] 
Core: Split FileScanTasks on Offsets (#460) (#3292)

Previously FileScanTasks would only be split if they exceeded the requested target split size. This prevented the combination of tasks which were smaller than the split size but could be combined to make a request closer to the requested split size. To fix this we split all files on their offsets when we are splitting, and then recombine them during the creation of scan tasks to try to hit the desired split sizes.
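
The split-then-recombine idea can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions: `split_on_offsets` and `combine` are hypothetical names, and a single greedy pass stands in for whatever packing strategy the scan planner actually uses.

```python
def split_on_offsets(file_length, offsets):
    """Cut one file into (start, length) pieces at each recorded offset."""
    starts = sorted(offsets)
    ends = starts[1:] + [file_length]
    return [(start, end - start) for start, end in zip(starts, ends)]


def combine(splits, target_size):
    """Greedily group consecutive pieces without exceeding target_size.

    A piece larger than target_size still forms a group on its own.
    """
    groups, current, current_size = [], [], 0
    for start, length in splits:
        if current and current_size + length > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append((start, length))
        current_size += length
    if current:
        groups.append(current)
    return groups


# A 100-byte file with offsets at 0, 40 and 80.
pieces = split_on_offsets(100, [0, 40, 80])
tasks = combine(pieces, target_size=64)
```

With a target of 64 bytes, the three pieces are regrouped into two tasks, the second of which merges the 40-byte and 20-byte pieces.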

5 months agoSpark 3.2: Implement copy-on-write MERGE (#3804)
Anton Okolnychyi [Fri, 14 Jan 2022 19:21:13 +0000 (11:21 -0800)] 
Spark 3.2: Implement copy-on-write MERGE (#3804)

5 months agoSpark: Support parallelism in RemoveOrphanFiles (#3872)
Hongyue/Steve Zhang [Fri, 14 Jan 2022 18:21:33 +0000 (10:21 -0800)] 
Spark: Support parallelism in RemoveOrphanFiles (#3872)

Co-authored-by: Steve Zhang <hongyue_zhang@apple.com>
5 months agoParquet: Lazily initialize the underlying writer in ParquetWriter (#3780)
Ryan Blue [Thu, 13 Jan 2022 20:17:55 +0000 (12:17 -0800)] 
Parquet: Lazily initialize the underlying writer in ParquetWriter (#3780)

Co-authored-by: Tim Steinbach <tim.steinbach@shopify.com>
5 months agoAPI: Register existing tables in Iceberg HiveCatalog (#3851)
Anurag Mantripragada [Thu, 13 Jan 2022 17:46:10 +0000 (09:46 -0800)] 
API: Register existing tables in Iceberg HiveCatalog (#3851)

Co-authored-by: Anton Okolnychyi <aokolnychyi@apple.com>
5 months agoBump Nessie from 0.17.0 to 0.18.0 (#3890)
Robert Stupp [Thu, 13 Jan 2022 12:28:08 +0000 (13:28 +0100)] 
Bump Nessie from 0.17.0 to 0.18.0 (#3890)

5 months agoTest: Make sure to delete temp folders (#3790)
Rui Li [Thu, 13 Jan 2022 06:00:12 +0000 (14:00 +0800)] 
Test: Make sure to delete temp folders (#3790)

5 months agoSpark: Reduce requests from SparkSessionCatalog.invalidateTable (#3861)
smallx [Thu, 13 Jan 2022 01:11:44 +0000 (09:11 +0800)] 
Spark: Reduce requests from SparkSessionCatalog.invalidateTable (#3861)

5 months agoSpark 3.2: Push down partition filter when importing file tables (#3745)
Huaxin Gao [Wed, 12 Jan 2022 23:38:54 +0000 (15:38 -0800)] 
Spark 3.2: Push down partition filter when importing file tables (#3745)

5 months agoSpark 3.1: Add Spark UI metrics for merge into DynamicFileFilterExec (#3882)
Chen Zhang [Wed, 12 Jan 2022 16:58:03 +0000 (00:58 +0800)] 
Spark 3.1: Add Spark UI metrics for merge into DynamicFileFilterExec (#3882)

Co-authored-by: zhangchen351 <zhangchen351@jd.com>
5 months agoSpark 3.1: Fix binary literals in pushdown filters (#3728)
xiaotianzhang01 [Wed, 12 Jan 2022 16:55:44 +0000 (00:55 +0800)] 
Spark 3.1: Fix binary literals in pushdown filters (#3728)

Co-authored-by: zhangxiaotian13 <zhangxiaotian13@jd.com>
5 months agoSpark 3.0: Add Spark UI metrics for merge into DynamicFileFilterExec (#3863)
Chen Zhang [Tue, 11 Jan 2022 17:49:34 +0000 (01:49 +0800)] 
Spark 3.0: Add Spark UI metrics for merge into DynamicFileFilterExec (#3863)

Co-authored-by: zhangchen351 <zhangchen351@jd.com>
5 months agoAllow using a custom NessieClientBuilder implementation (#3877)
Robert Stupp [Tue, 11 Jan 2022 12:55:31 +0000 (13:55 +0100)] 
Allow using a custom NessieClientBuilder implementation (#3877)

Nessie defaults to use the HttpClientBuilder, but certain use cases
require a custom client builder implementation. This change allows
this by having a new configuration option.

5 months agoBuild: Suppress warning about Flink nanosecond access (#3868)
zhang chaoming [Mon, 10 Jan 2022 23:16:38 +0000 (07:16 +0800)] 
Build: Suppress warning about Flink nanosecond access (#3868)

Co-authored-by: zhangchaoming <zhangchaoming@360.com>
5 months agoBuild: Only use scalastyle plugin with scala modules (#3869)
Eduard Tudenhöfner [Mon, 10 Jan 2022 21:34:45 +0000 (22:34 +0100)] 
Build: Only use scalastyle plugin with scala modules (#3869)

5 months agoFlink 1.14: Add FLIP-27 Iceberg source split (#3870)
Steven Zhen Wu [Mon, 10 Jan 2022 21:31:38 +0000 (13:31 -0800)] 
Flink 1.14: Add FLIP-27 Iceberg source split (#3870)

5 months agoRevert "Flink: Add FLIP-27 Iceberg source split (#3501)" (#3871)
Steven Zhen Wu [Mon, 10 Jan 2022 21:30:44 +0000 (13:30 -0800)] 
Revert "Flink: Add FLIP-27 Iceberg source split (#3501)" (#3871)

This reverts commit d2c26a02190a16539c8c0621c4d8aac2e9e3ec6c.

5 months agoDocs: Update copyright year in site mkdocs file to 2022 (#3873)
Kyle Bendickson [Mon, 10 Jan 2022 21:30:18 +0000 (13:30 -0800)] 
Docs: Update copyright year in site mkdocs file to 2022 (#3873)

5 months agoSpark: Fix table UUID exceptions with CachingCatalog (#3837)
smallx [Sun, 9 Jan 2022 22:49:39 +0000 (06:49 +0800)] 
Spark: Fix table UUID exceptions with CachingCatalog (#3837)

5 months agoCore: Replace set with bitmap for faster delete filtering (#3535)
Yufei Gu [Sun, 9 Jan 2022 22:48:39 +0000 (14:48 -0800)] 
Core: Replace set with bitmap for faster delete filtering (#3535)

5 months agoBuild: Update NOTICE to include copyright to 2022 (#3855)
Kyle Bendickson [Sun, 9 Jan 2022 22:35:43 +0000 (14:35 -0800)] 
Build: Update NOTICE to include copyright to 2022 (#3855)

5 months agoFlink 1.13: Fix SerializableTable with Kryo (#3857)
openinx [Sun, 9 Jan 2022 22:23:58 +0000 (06:23 +0800)] 
Flink 1.13: Fix SerializableTable with Kryo (#3857)

5 months agoFlink: Add FLIP-27 Iceberg source split (#3501)
Steven Zhen Wu [Sun, 9 Jan 2022 18:15:08 +0000 (10:15 -0800)] 
Flink: Add FLIP-27 Iceberg source split (#3501)

5 months agoBuild: Upgrade gradle to 7.3.3 (#3793)
Karl Manong [Fri, 7 Jan 2022 22:26:37 +0000 (06:26 +0800)] 
Build: Upgrade gradle to 7.3.3 (#3793)

5 months agoCore: Fix partitions metadata table with a column named partition (#3845)
Huaxin Gao [Fri, 7 Jan 2022 22:12:49 +0000 (14:12 -0800)] 
Core: Fix partitions metadata table with a column named partition (#3845)

* Partition Metadata table breaks with a partition column named 'partition'

* address comments

* fix style

* add checkConflicts

* remove TestTables.clearTables in the end of test

* address comments

* checkConflict => checkConflicts

5 months agoSpec: Initial OpenAPI template for a REST catalog (#3770)
Kyle Bendickson [Fri, 7 Jan 2022 21:57:04 +0000 (13:57 -0800)] 
Spec: Initial OpenAPI template for a REST catalog (#3770)

5 months agoDocs: Fix broken link of Dremio with Iceberg (#3856)
Ajantha Bhat [Fri, 7 Jan 2022 09:03:36 +0000 (14:33 +0530)] 
Docs: Fix broken link of Dremio with Iceberg (#3856)

6 months agoCore: Allow adding a dropped partition column name (#3632)
Nan Zhu [Tue, 4 Jan 2022 23:46:45 +0000 (15:46 -0800)] 
Core: Allow adding a dropped partition column name (#3632)