iceberg.git
4 months agoAdd version.txt for release 0.13.1 apache-iceberg-0.13.1
Jack Ye [Thu, 10 Feb 2022 23:21:35 +0000 (15:21 -0800)] 
Add version.txt for release 0.13.1

4 months agoFlink: Ensure temp manifest names are unique across tasks (#3986)
Peidian li [Fri, 4 Feb 2022 17:38:39 +0000 (01:38 +0800)] 
Flink: Ensure temp manifest names are unique across tasks (#3986)

4 months agoSpark: Fix create table in Hadoop catalog root namespace (#4024)
Cheng Pan [Wed, 2 Feb 2022 22:06:16 +0000 (06:06 +0800)] 
Spark: Fix create table in Hadoop catalog root namespace (#4024)

4 months agoSpark 3.2: Fix predicate pushdown in row-level operations (#4023)
Anton Okolnychyi [Tue, 1 Feb 2022 20:46:48 +0000 (12:46 -0800)] 
Spark 3.2: Fix predicate pushdown in row-level operations (#4023)

4 months agoAdd version.txt for release 0.13.0 apache-iceberg-0.13.0
Jack Ye [Fri, 28 Jan 2022 08:56:28 +0000 (00:56 -0800)] 
Add version.txt for release 0.13.0

4 months agoSpark 3.2: Fix cardinality check for alternative join implementations (#3992) release-base-0.13.0
Anton Okolnychyi [Fri, 28 Jan 2022 01:30:04 +0000 (17:30 -0800)] 
Spark 3.2: Fix cardinality check for alternative join implementations (#3992)

4 months agoDocs: Add s3.checksum-enabled to AWS (#3996)
Ashish Singh [Fri, 28 Jan 2022 00:17:56 +0000 (16:17 -0800)] 
Docs: Add s3.checksum-enabled to AWS (#3996)

4 months agoDocs: Fix MapType example (#3993)
liliwei [Fri, 28 Jan 2022 00:17:01 +0000 (08:17 +0800)] 
Docs: Fix MapType example (#3993)

4 months agoDocs: Update release instructions (#3982)
Jack Ye [Fri, 28 Jan 2022 00:14:47 +0000 (16:14 -0800)] 
Docs: Update release instructions (#3982)

4 months agoAWS: Support checksum validation with S3 eTags (#3813)
Ashish Singh [Tue, 25 Jan 2022 20:50:19 +0000 (12:50 -0800)] 
AWS: Support checksum validation with S3 eTags (#3813)

* [S3FileIO] Add capability to perform checksum validations using S3 eTags.

* fix checkstyle error

* Update to move checksum checks to s3 server side

* Enable s3 checksum checks in aws integration tests

* Catch protocol error and log helpful error message

* Use digest bytes instead of MessageDigest and update tests

* Fix checkstyle failure

* Use DigestOutputStream

* Remove redundant spaces

* rename etag to checksum in leftover places

* address

* Remove unused import

* Config name change

* minor updates

4 months agoDocs: Add section to include instructions for Hive on Tez (#3944)
0xffmeta [Tue, 25 Jan 2022 10:54:36 +0000 (18:54 +0800)] 
Docs: Add section to include instructions for Hive on Tez (#3944)

4 months agoSpark 3.2: Revise distribution and ordering for merge-on-read DELETE (#3970)
Anton Okolnychyi [Tue, 25 Jan 2022 08:03:36 +0000 (00:03 -0800)] 
Spark 3.2: Revise distribution and ordering for merge-on-read DELETE (#3970)

4 months agoDocs: Add Amazon EMR announcement (#3976)
Rajarshi Sarkar [Tue, 25 Jan 2022 05:46:57 +0000 (11:16 +0530)] 
Docs: Add Amazon EMR announcement (#3976)

4 months agoAWS: fix Glue catalog for unknown commit status (#3967)
夏川和 [Tue, 25 Jan 2022 01:14:43 +0000 (17:14 -0800)] 
AWS: fix Glue catalog for unknown commit status (#3967)

4 months agoSpark 3.2: Add tests for copy-on-write MERGE distribution and ordering (#3964)
Anton Okolnychyi [Mon, 24 Jan 2022 22:48:45 +0000 (14:48 -0800)] 
Spark 3.2: Add tests for copy-on-write MERGE distribution and ordering (#3964)

4 months agoAWS: show old fields in Glue table (#3888)
夏川和 [Mon, 24 Jan 2022 21:29:48 +0000 (13:29 -0800)] 
AWS: show old fields in Glue table (#3888)

4 months agoCore: Add reserved UUID Table Property and Expose in HMS. (#3914)
Yufei Gu [Mon, 24 Jan 2022 20:39:47 +0000 (12:39 -0800)] 
Core: Add reserved UUID Table Property and Expose in HMS. (#3914)

Co-authored-by: Karuppayya Rajendran <karuppayya.rajendran@apple.com>
Co-authored-by: Yufei Gu <yufei_gu@apple.com>

4 months agoAWS: fix Iceberg to Glue schema conversion (#3887)
夏川和 [Mon, 24 Jan 2022 19:55:20 +0000 (11:55 -0800)] 
AWS: fix Iceberg to Glue schema conversion (#3887)

4 months agoCore: Fix delete file index with manifests of only existing files (#3943)
Peidian li [Mon, 24 Jan 2022 19:07:53 +0000 (03:07 +0800)] 
Core: Fix delete file index with manifests of only existing files (#3943)

4 months agoCore: Allow removing and adding the same partition field as a noop (#3954)
Nan Zhu [Mon, 24 Jan 2022 19:05:46 +0000 (11:05 -0800)] 
Core: Allow removing and adding the same partition field as a noop (#3954)

4 months agoHive: Make Iceberg table filter optional in HiveCatalog (#3908)
vanliu [Mon, 24 Jan 2022 19:01:37 +0000 (03:01 +0800)] 
Hive: Make Iceberg table filter optional in HiveCatalog (#3908)

This adds an option to return all Hive tables rather than only Iceberg tables, which avoids loading table metadata and slowing down the operation.

4 months agoParquet: NPE in Parquet Writer Metrics when data value max bound will overflow (...
Szehon Ho [Mon, 24 Jan 2022 18:38:16 +0000 (10:38 -0800)] 
Parquet: NPE in Parquet Writer Metrics when data value max bound will overflow (#3760)

Previously, writing metrics whose max bound would overflow when incremented resulted in a null value for the metrics, which caused an NPE when it was put in the Metrics array. Now null values returned from truncate are ignored instead.
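
The behavior described above can be sketched as follows. This is a simplified illustration, not the code from the commit: `truncate_string_max` and `collect_upper_bounds` are hypothetical names, and the truncation rule (increment the last character that can be incremented, return `None` when none can) is an assumption that ignores details such as surrogate code points.

```python
from typing import Dict, Optional

MAX_CODE_POINT = 0x10FFFF  # largest Unicode code point


def truncate_string_max(value: str, length: int) -> Optional[str]:
    """Return a string of at most `length` characters that sorts at or above `value`.

    Returns None when every character of the truncated prefix is already the
    maximum code point, so no incremented prefix exists (the overflow case).
    """
    if len(value) <= length:
        return value
    prefix = list(value[:length])
    # Walk backwards looking for a character that can be incremented.
    for i in range(len(prefix) - 1, -1, -1):
        code = ord(prefix[i])
        if code < MAX_CODE_POINT:
            prefix[i] = chr(code + 1)
            return "".join(prefix[: i + 1])
    return None


def collect_upper_bounds(max_values: Dict[str, str], length: int) -> Dict[str, str]:
    """Build the upper-bound map, skipping columns whose truncation overflowed."""
    bounds = {}
    for column, value in max_values.items():
        truncated = truncate_string_max(value, length)
        if truncated is not None:  # ignore null results instead of storing them
            bounds[column] = truncated
    return bounds
```

A column whose maximum value cannot be truncated simply has no upper bound recorded, which readers can treat as "unknown" rather than failing.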

4 months agoPython: Fix type for partition_type struct fields (#3939) (#3940)
cccs-eric [Mon, 24 Jan 2022 16:27:23 +0000 (11:27 -0500)] 
Python: Fix type for partition_type struct fields (#3939) (#3940)

Signed-off-by: cccs-eric <eric.ladouceur@cyber.gc.ca>

4 months agoData: Read metrics in parallel during TableMigration (#3876)
kingeasternsun [Mon, 24 Jan 2022 13:40:52 +0000 (21:40 +0800)] 
Data: Read metrics in parallel during TableMigration (#3876)

Adds a parameter for reading the metrics of files in parallel, rather than one at a time in TableMigrationUtils.
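
As a rough sketch of the idea, assuming a hypothetical `read_metrics` helper and a caller-supplied `read_one` function (neither is taken from TableMigrationUtils), a parallelism parameter can switch between serial and thread-pool reads:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, TypeVar

M = TypeVar("M")


def read_metrics(
    paths: List[str],
    read_one: Callable[[str], M],
    parallelism: int = 1,
) -> Dict[str, M]:
    """Read per-file metrics, one at a time or with a pool of worker threads."""
    if parallelism <= 1:
        # Serial path: read each file's metrics in order.
        return {path: read_one(path) for path in paths}
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # Executor.map yields results in input order, so zip pairs them correctly.
        return dict(zip(paths, pool.map(read_one, paths)))
```

Both paths return the same mapping; only the wall-clock time differs when `read_one` is dominated by I/O.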

Co-authored-by: King <wangdongyang@deepexi.com>

4 months agoCore: Added no-arg constructor in ResolvingFileIO (#3923)
Rajarshi Sarkar [Mon, 24 Jan 2022 00:39:55 +0000 (06:09 +0530)] 
Core: Added no-arg constructor in ResolvingFileIO (#3923)

4 months agoPython: Add FileIO, InputFile, and OutputFile abstract base classes (#3691)
Samuel Redai [Mon, 24 Jan 2022 00:37:31 +0000 (16:37 -0800)] 
Python: Add FileIO, InputFile, and OutputFile abstract base classes (#3691)

4 months agoCore: Fix an error message in BinPackStrategy (#3919)
Zhangg7723 [Mon, 24 Jan 2022 00:31:57 +0000 (08:31 +0800)] 
Core: Fix an error message in BinPackStrategy (#3919)

5 months agoCore: Deprecate the MERGE cardinality check property (#3953)
Anton Okolnychyi [Sat, 22 Jan 2022 01:05:48 +0000 (17:05 -0800)] 
Core: Deprecate the MERGE cardinality check property (#3953)

5 months agoFix SparkCatalog time travel check. (#3942)
Ryan Blue [Sat, 22 Jan 2022 00:28:28 +0000 (16:28 -0800)] 
Fix SparkCatalog time travel check. (#3942)

5 months agoSpark 3.2: Revise distribution and ordering in copy-on-write UPDATE (#3949)
Anton Okolnychyi [Fri, 21 Jan 2022 23:46:24 +0000 (15:46 -0800)] 
Spark 3.2: Revise distribution and ordering in copy-on-write UPDATE (#3949)

5 months agoSpark: Backport Streaming Test Refactors (#3948)
Russell Spitzer [Fri, 21 Jan 2022 20:00:19 +0000 (14:00 -0600)] 
Spark: Backport Streaming Test Refactors (#3948)

Back-porting test refactor from #3775

5 months agoSpark 3.2: Revise distribution and ordering in copy-on-write DELETE (#3930)
Anton Okolnychyi [Fri, 21 Jan 2022 19:38:29 +0000 (11:38 -0800)] 
Spark 3.2: Revise distribution and ordering in copy-on-write DELETE (#3930)

5 months agoJMH: Parameterize spark project version for JMH Benchmarks (#3946)
Eduard Tudenhöfner [Fri, 21 Jan 2022 17:09:39 +0000 (18:09 +0100)] 
JMH: Parameterize spark project version for JMH Benchmarks (#3946)

Given that we have multiple Spark project versions in the codebase and
that users might want to run a particular Benchmark from a specific
Spark version, we should make the Spark project version a parameter of
the JMH Benchmark Action.

5 months agoHive: Do not skip IO config serialization for metadata queries (#3911) 3952/head
Marton Bod [Thu, 20 Jan 2022 10:22:50 +0000 (11:22 +0100)] 
Hive: Do not skip IO config serialization for metadata queries (#3911)

5 months ago[Spark] Backport rewrite data files are eliminated by deletes to Spark v3.0 and Spark...
xloya [Thu, 20 Jan 2022 05:47:02 +0000 (13:47 +0800)] 
[Spark] Backport rewrite data files are eliminated by deletes to Spark v3.0 and Spark v3.1 (#3935)

Backport of #3724 to Spark 3.0 and 3.1

Co-authored-by: Jiebao Xiao <xiaojiebao@xiaomi.com>

5 months agoFlink 1.14: Add tests to check whether should remove meta columns in source reader...
liliwei [Thu, 20 Jan 2022 02:34:43 +0000 (10:34 +0800)] 
Flink 1.14: Add tests to check whether should remove meta columns in source reader (#3893)

5 months agoFlink 1.12: Fix SerializableTable with Kryo (#3926)
openinx [Thu, 20 Jan 2022 01:52:10 +0000 (09:52 +0800)] 
Flink 1.12: Fix SerializableTable with Kryo (#3926)

5 months agoSpark: Add helper to register truncate UDF (#3708)
xiaotianzhang01 [Thu, 20 Jan 2022 00:42:26 +0000 (08:42 +0800)] 
Spark: Add helper to register truncate UDF (#3708)

Co-authored-by: zhangxiaotian13 <zhangxiaotian13@jd.com>

5 months agoDocs: Add compression codec options (#3892)
liliwei [Wed, 19 Jan 2022 23:23:59 +0000 (07:23 +0800)] 
Docs: Add compression codec options (#3892)

5 months agoPython: Fix incorrect single-value encoding for boolean (#3924) (#3927)
cccs-eric [Wed, 19 Jan 2022 23:22:39 +0000 (18:22 -0500)] 
Python: Fix incorrect single-value encoding for boolean (#3924) (#3927)

Signed-off-by: cccs-eric <eric.ladouceur@cyber.gc.ca>

5 months agoPython: Fix quote handling in expression parser (#3875)
Pucheng Yang [Wed, 19 Jan 2022 23:21:44 +0000 (15:21 -0800)] 
Python: Fix quote handling in expression parser (#3875)

5 months agoFlink 1.14: Add Kryo tests for SerializableTable (#3925)
openinx [Wed, 19 Jan 2022 21:53:17 +0000 (05:53 +0800)] 
Flink 1.14: Add Kryo tests for SerializableTable (#3925)

5 months agoFlink: Fix flaky tests that depend on row order (#3931)
Kyle Bendickson [Wed, 19 Jan 2022 19:05:21 +0000 (11:05 -0800)] 
Flink: Fix flaky tests that depend on row order (#3931)

5 months ago[Spark][Core]: Support RewriteDataFiles when Files are Completely Eliminated by Delet...
xloya [Wed, 19 Jan 2022 16:34:43 +0000 (00:34 +0800)] 
[Spark][Core]: Support RewriteDataFiles when Files are Completely Eliminated by Deletes (#3724)

Previously, RewriteDataFiles would fail if the outcome of a rewrite was the complete removal of all DataFiles. With merge-on-read this outcome is now possible, so it is allowed.

Co-authored-by: Jiebao Xiao <xiaojiebao@xiaomi.com>

5 months agoFlink: Fix classloader in Avro ManifestReader (#3906)
Yi Tang [Wed, 19 Jan 2022 00:07:21 +0000 (08:07 +0800)] 
Flink: Fix classloader in Avro ManifestReader (#3906)

5 months agoPython: Expand primitive types to individual classes (#3839)
Nick Ouellet [Wed, 19 Jan 2022 00:03:34 +0000 (19:03 -0500)] 
Python: Expand primitive types to individual classes (#3839)

Co-authored-by: Sam Redai <sam@tabular.io>

5 months agoBuild: Fix source-release script in actions, add git remote validation (#3915)
Kyle Bendickson [Wed, 19 Jan 2022 00:02:41 +0000 (16:02 -0800)] 
Build: Fix source-release script in actions, add git remote validation (#3915)

5 months agoSpark 3.2: Add tests for resolving star actions in MERGE by name (#3918)
Anton Okolnychyi [Tue, 18 Jan 2022 20:17:52 +0000 (12:17 -0800)] 
Spark 3.2: Add tests for resolving star actions in MERGE by name (#3918)

Co-authored-by: Kyle Bendickson <kjbendickson@gmail.com>

5 months agoSpark 3.2: Add tests for multiple NOT MATCHED clauses (#3917)
Anton Okolnychyi [Tue, 18 Jan 2022 17:31:59 +0000 (09:31 -0800)] 
Spark 3.2: Add tests for multiple NOT MATCHED clauses (#3917)

5 months agoDocs: Add LOCALLY ORDERED BY and DISTRIBUTED BY clauses (#3820)
xiaotianzhang01 [Tue, 18 Jan 2022 16:55:59 +0000 (00:55 +0800)] 
Docs: Add LOCALLY ORDERED BY and DISTRIBUTED BY clauses (#3820)

Co-authored-by: zhangxiaotian13 <zhangxiaotian13@jd.com>

5 months agoDocs: Link expire_snapshots to table expiration properties (#3878)
liliwei [Tue, 18 Jan 2022 16:24:10 +0000 (00:24 +0800)] 
Docs: Link expire_snapshots to table expiration properties (#3878)

5 months agoSpark 3.2: Implement merge-on-read DELETE (#3763)
Anton Okolnychyi [Tue, 18 Jan 2022 07:19:44 +0000 (23:19 -0800)] 
Spark 3.2: Implement merge-on-read DELETE (#3763)

5 months agoCore: Split FileScanTasks on Offsets (#460) (#3292)
Russell Spitzer [Sat, 15 Jan 2022 05:42:26 +0000 (23:42 -0600)] 
Core: Split FileScanTasks on Offsets (#460) (#3292)

Previously, FileScanTasks would only be split if they exceeded the requested target split size. This prevented the combination of tasks that were smaller than the split size but could be combined to make a request closer to the requested split size. To fix this, we split all files on their offsets when splitting, and then recombine them during the creation of scan tasks to try to hit the desired split size.
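
The split-then-recombine approach can be illustrated with a small sketch. The names `split_on_offsets` and `combine` are hypothetical, the offsets are assumed to be sorted start positions (for example, row group starts) with nothing of interest before the first one, and the greedy grouping below is a stand-in for whatever packing the real scan planning uses.

```python
from typing import List, Tuple

Task = Tuple[str, int, int]  # (file path, start offset, length in bytes)


def split_on_offsets(path: str, file_length: int, offsets: List[int]) -> List[Task]:
    """Split one file into tasks at every known offset, regardless of file size."""
    starts = sorted(offsets)
    tasks = []
    for i, start in enumerate(starts):
        # Each task runs to the next offset, or to the end of the file.
        end = starts[i + 1] if i + 1 < len(starts) else file_length
        tasks.append((path, start, end - start))
    return tasks


def combine(tasks: List[Task], target_size: int) -> List[List[Task]]:
    """Greedily group consecutive tasks so each group stays within target_size."""
    groups: List[List[Task]] = []
    current: List[Task] = []
    current_size = 0
    for task in tasks:
        # Close the current group when adding this task would exceed the target.
        if current and current_size + task[2] > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(task)
        current_size += task[2]
    if current:
        groups.append(current)
    return groups
```

Because every file is split first, a small tail piece of one file can share a group with its neighbor instead of forcing an undersized or oversized scan task.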

5 months agoSpark 3.2: Implement copy-on-write MERGE (#3804)
Anton Okolnychyi [Fri, 14 Jan 2022 19:21:13 +0000 (11:21 -0800)] 
Spark 3.2: Implement copy-on-write MERGE (#3804)

5 months agoSpark: Support parallelism in RemoveOrphanFiles (#3872)
Hongyue/Steve Zhang [Fri, 14 Jan 2022 18:21:33 +0000 (10:21 -0800)] 
Spark: Support parallelism in RemoveOrphanFiles (#3872)

Co-authored-by: Steve Zhang <hongyue_zhang@apple.com>

5 months agoParquet: Lazily initialize the underlying writer in ParquetWriter (#3780)
Ryan Blue [Thu, 13 Jan 2022 20:17:55 +0000 (12:17 -0800)] 
Parquet: Lazily initialize the underlying writer in ParquetWriter (#3780)

Co-authored-by: Tim Steinbach <tim.steinbach@shopify.com>

5 months agoAPI: Register existing tables in Iceberg HiveCatalog (#3851)
Anurag Mantripragada [Thu, 13 Jan 2022 17:46:10 +0000 (09:46 -0800)] 
API: Register existing tables in Iceberg HiveCatalog (#3851)

Co-authored-by: Anton Okolnychyi <aokolnychyi@apple.com>

5 months agoBump Nessie from 0.17.0 to 0.18.0 (#3890)
Robert Stupp [Thu, 13 Jan 2022 12:28:08 +0000 (13:28 +0100)] 
Bump Nessie from 0.17.0 to 0.18.0 (#3890)

5 months agoTest: Make sure to delete temp folders (#3790)
Rui Li [Thu, 13 Jan 2022 06:00:12 +0000 (14:00 +0800)] 
Test: Make sure to delete temp folders (#3790)

5 months agoSpark: Reduce requests from SparkSessionCatalog.invalidateTable (#3861)
smallx [Thu, 13 Jan 2022 01:11:44 +0000 (09:11 +0800)] 
Spark: Reduce requests from SparkSessionCatalog.invalidateTable (#3861)

5 months agoSpark 3.2: Push down partition filter when importing file tables (#3745)
Huaxin Gao [Wed, 12 Jan 2022 23:38:54 +0000 (15:38 -0800)] 
Spark 3.2: Push down partition filter when importing file tables (#3745)

5 months agoSpark 3.1: Add Spark UI metrics for merge into DynamicFileFilterExec (#3882)
Chen Zhang [Wed, 12 Jan 2022 16:58:03 +0000 (00:58 +0800)] 
Spark 3.1: Add Spark UI metrics for merge into DynamicFileFilterExec (#3882)

Co-authored-by: zhangchen351 <zhangchen351@jd.com>

5 months agoSpark 3.1: Fix binary literals in pushdown filters (#3728)
xiaotianzhang01 [Wed, 12 Jan 2022 16:55:44 +0000 (00:55 +0800)] 
Spark 3.1: Fix binary literals in pushdown filters (#3728)

Co-authored-by: zhangxiaotian13 <zhangxiaotian13@jd.com>

5 months agoSpark 3.0: Add Spark UI metrics for merge into DynamicFileFilterExec (#3863)
Chen Zhang [Tue, 11 Jan 2022 17:49:34 +0000 (01:49 +0800)] 
Spark 3.0: Add Spark UI metrics for merge into DynamicFileFilterExec (#3863)

Co-authored-by: zhangchen351 <zhangchen351@jd.com>

5 months agoAllow using a custom NessieClientBuilder implementation (#3877)
Robert Stupp [Tue, 11 Jan 2022 12:55:31 +0000 (13:55 +0100)] 
Allow using a custom NessieClientBuilder implementation (#3877)

Nessie defaults to use the HttpClientBuilder, but certain use cases
require a custom client builder implementation. This change allows
this by having a new configuration option.

5 months agoBuild: Suppress warning about Flink nanosecond access (#3868)
zhang chaoming [Mon, 10 Jan 2022 23:16:38 +0000 (07:16 +0800)] 
Build: Suppress warning about Flink nanosecond access (#3868)

Co-authored-by: zhangchaoming <zhangchaoming@360.com>

5 months agoBuild: Only use scalastyle plugin with scala modules (#3869)
Eduard Tudenhöfner [Mon, 10 Jan 2022 21:34:45 +0000 (22:34 +0100)] 
Build: Only use scalastyle plugin with scala modules (#3869)

5 months agoFlink 1.14: Add FLIP-27 Iceberg source split (#3870)
Steven Zhen Wu [Mon, 10 Jan 2022 21:31:38 +0000 (13:31 -0800)] 
Flink 1.14: Add FLIP-27 Iceberg source split (#3870)

5 months agoRevert "Flink: Add FLIP-27 Iceberg source split (#3501)" (#3871)
Steven Zhen Wu [Mon, 10 Jan 2022 21:30:44 +0000 (13:30 -0800)] 
Revert "Flink: Add FLIP-27 Iceberg source split (#3501)" (#3871)

This reverts commit d2c26a02190a16539c8c0621c4d8aac2e9e3ec6c.

5 months agoDocs: Update copyright year in site mkdocs file to 2022 (#3873)
Kyle Bendickson [Mon, 10 Jan 2022 21:30:18 +0000 (13:30 -0800)] 
Docs: Update copyright year in site mkdocs file to 2022 (#3873)

5 months agoSpark: Fix table UUID exceptions with CachingCatalog (#3837)
smallx [Sun, 9 Jan 2022 22:49:39 +0000 (06:49 +0800)] 
Spark: Fix table UUID exceptions with CachingCatalog (#3837)

5 months agoCore: Replace set with bitmap for faster delete filtering (#3535)
Yufei Gu [Sun, 9 Jan 2022 22:48:39 +0000 (14:48 -0800)] 
Core: Replace set with bitmap for faster delete filtering (#3535)

5 months agoBuild: Update NOTICE to include copyright to 2022 (#3855)
Kyle Bendickson [Sun, 9 Jan 2022 22:35:43 +0000 (14:35 -0800)] 
Build: Update NOTICE to include copyright to 2022 (#3855)

5 months agoFlink 1.13: Fix SerializableTable with Kryo (#3857)
openinx [Sun, 9 Jan 2022 22:23:58 +0000 (06:23 +0800)] 
Flink 1.13: Fix SerializableTable with Kryo (#3857)

5 months agoFlink: Add FLIP-27 Iceberg source split (#3501)
Steven Zhen Wu [Sun, 9 Jan 2022 18:15:08 +0000 (10:15 -0800)] 
Flink: Add FLIP-27 Iceberg source split (#3501)

5 months agoBuild: Upgrade gradle to 7.3.3 (#3793)
Karl Manong [Fri, 7 Jan 2022 22:26:37 +0000 (06:26 +0800)] 
Build: Upgrade gradle to 7.3.3 (#3793)

5 months agoCore: Fix partitions metadata table with a column named partition (#3845)
Huaxin Gao [Fri, 7 Jan 2022 22:12:49 +0000 (14:12 -0800)] 
Core: Fix partitions metadata table with a column named partition (#3845)

* Partition Metadata table breaks with a partition column named 'partition'

* address comments

* fix style

* add checkConflicts

* remove TestTables.clearTables in the end of test

* address comments

* checkConflict => checkConflicts

5 months agoSpec: Initial OpenAPI template for a REST catalog (#3770)
Kyle Bendickson [Fri, 7 Jan 2022 21:57:04 +0000 (13:57 -0800)] 
Spec: Initial OpenAPI template for a REST catalog (#3770)

5 months agoDocs: Fix broken link of Dremio with Iceberg (#3856)
Ajantha Bhat [Fri, 7 Jan 2022 09:03:36 +0000 (14:33 +0530)] 
Docs: Fix broken link of Dremio with Iceberg (#3856)

5 months agoCore: Allow adding a dropped partition column name (#3632)
Nan Zhu [Tue, 4 Jan 2022 23:46:45 +0000 (15:46 -0800)] 
Core: Allow adding a dropped partition column name (#3632)

5 months agoCore: Fix lost sequence number when rewriting with manifest merge (#3842)
xloya [Tue, 4 Jan 2022 20:17:02 +0000 (04:17 +0800)] 
Core: Fix lost sequence number when rewriting with manifest merge (#3842)

Co-authored-by: Jiebao Xiao <xiaojiebao@xiaomi.com>

5 months agoSpark 3.0: Enable test that delete at a snapshot fails (#3840)
Chen Zhang [Tue, 4 Jan 2022 20:10:22 +0000 (04:10 +0800)] 
Spark 3.0: Enable test that delete at a snapshot fails (#3840)

Co-authored-by: zhangchen <zhangchen351@jd.com>

5 months agoDocs: Update Spark for incremental scans (#3796)
Ajantha Bhat [Tue, 4 Jan 2022 17:53:39 +0000 (23:23 +0530)] 
Docs: Update Spark for incremental scans (#3796)

5 months agoDocs: Add stream-from-timestamp to Spark read options (#3732)
Rajarshi Sarkar [Tue, 4 Jan 2022 16:24:57 +0000 (21:54 +0530)] 
Docs: Add stream-from-timestamp to Spark read options (#3732)

5 months agoSpark: Document stream reads with SparkMicroBatchStream (#3749)
liliwei [Mon, 3 Jan 2022 19:18:28 +0000 (03:18 +0800)] 
Spark: Document stream reads with SparkMicroBatchStream (#3749)

5 months agoDocs: Add format-version table property (#3809)
Ajantha Bhat [Mon, 3 Jan 2022 17:59:04 +0000 (23:29 +0530)] 
Docs: Add format-version table property (#3809)

5 months agoSpark-3.0: Fix UnresolvedException for some filters in rewrite_data_files (#3794)
Ajantha Bhat [Thu, 30 Dec 2021 22:46:54 +0000 (04:16 +0530)] 
Spark-3.0: Fix UnresolvedException for some filters in rewrite_data_files (#3794)

5 months agoSpark 3.1: Fix UnresolvedException for some filters in rewrite_data_files (#3795)
Ajantha Bhat [Thu, 30 Dec 2021 22:46:18 +0000 (04:16 +0530)] 
Spark 3.1: Fix UnresolvedException for some filters in rewrite_data_files (#3795)

5 months agoBuild: Fix the -g option in source-release.sh (#3824)
openinx [Thu, 30 Dec 2021 22:26:36 +0000 (06:26 +0800)] 
Build: Fix the -g option in source-release.sh (#3824)

5 months agoCore: Add LockManager to HadoopTableOperations (#3663)
Nan Zhu [Thu, 30 Dec 2021 21:55:33 +0000 (13:55 -0800)] 
Core: Add LockManager to HadoopTableOperations (#3663)

5 months agoDocs: Update compatibility table (#3761)
liliwei [Wed, 29 Dec 2021 22:07:52 +0000 (06:07 +0800)] 
Docs: Update compatibility table (#3761)

5 months agoHive: Add config to disable FileIO conf serialization (#3752)
Marton Bod [Wed, 29 Dec 2021 21:45:11 +0000 (22:45 +0100)] 
Hive: Add config to disable FileIO conf serialization (#3752)

5 months agoAPI: Fix startsWith NullPointerException (#3645)
hbg [Wed, 29 Dec 2021 20:43:03 +0000 (04:43 +0800)] 
API: Fix startsWith NullPointerException (#3645)

Co-authored-by: bghuang <bghuang@tencent.com>

5 months agoCore: Fix deadlock in CachingCatalog (#3801)
Rafael Acevedo [Wed, 29 Dec 2021 20:19:35 +0000 (17:19 -0300)] 
Core: Fix deadlock in CachingCatalog (#3801)

Uses Caffeine's `RemovalListener` to expire metadata tables, which avoids modifying cache entries during `compute` HashMap functions (which causes deadlocks).

Also changes Caffeine's executor to make `RemovalListener` run synchronously.
For more details, check #3791

Fixes #3791

Co-authored-by: Kyle Bendickson <kjbendickson@gmail.com>

5 months agoCore: Fix metadata table scans when current snapshot is null (#3812)
Bryan Keller [Wed, 29 Dec 2021 20:13:39 +0000 (12:13 -0800)] 
Core: Fix metadata table scans when current snapshot is null (#3812)

5 months agoBuild: Remove extra dependencies on assertj (#3802)
Kyle Bendickson [Wed, 29 Dec 2021 20:09:56 +0000 (12:09 -0800)] 
Build: Remove extra dependencies on assertj (#3802)

5 months agoset the worker pool size to be at least 2 threads (#3811)
Bryan Keller [Tue, 28 Dec 2021 23:23:20 +0000 (15:23 -0800)] 
set the worker pool size to be at least 2 threads (#3811)

5 months agoParquet: Fix miss override in ApplyNameMapping (#3807)
felixYyu [Mon, 27 Dec 2021 19:39:42 +0000 (03:39 +0800)] 
Parquet: Fix miss override in ApplyNameMapping (#3807)

6 months agoFlink: Fix integer overflow in Avro time writer, #3738 (#3740)
chenzihao [Thu, 23 Dec 2021 20:38:43 +0000 (04:38 +0800)] 
Flink: Fix integer overflow in Avro time writer, #3738 (#3740)

Co-authored-by: chenzihao5 <chenzihao5@xiaomi.com>

6 months agoDocs: Added config naming (#3800)
sanyu daver [Thu, 23 Dec 2021 19:58:15 +0000 (01:28 +0530)] 
Docs: Added config naming (#3800)

Closes #3678