arrow-rs.git
2 months ago Prepare for 12.0.0 release: Update version and CHANGELOG (#1569) 12.0.0
Andrew Lamb [Fri, 15 Apr 2022 22:46:09 +0000 (18:46 -0400)] 
Prepare for 12.0.0 release: Update version and CHANGELOG  (#1569)

* Update version to 12.0.0

* Update changelog script

* Update changelog for 12.0.0

2 months ago Split out ListArrayReader into separate module (#1483) (#1563)
Raphael Taylor-Davies [Fri, 15 Apr 2022 14:09:25 +0000 (15:09 +0100)] 
Split out ListArrayReader into separate module (#1483) (#1563)

* Split out ListArrayReader into separate module (#1483)

* Fix merge conflict

2 months ago fix infinite loop in not fully packed bit-packed runs (#1555)
Raphael Taylor-Davies [Fri, 15 Apr 2022 14:09:12 +0000 (15:09 +0100)] 
fix infinite loop in not fully packed bit-packed runs (#1555)

* fix infinite loop in not fully packed bit-packed runs

* Add test and also fix get_batch_with_dict

Co-authored-by: Andrei Liakhovich <anliakho@microsoft.com>
2 months ago Add test for creating FixedSizeBinaryArray::try_from_sparse_iter failed when given...
Andrew Lamb [Fri, 15 Apr 2022 13:46:50 +0000 (09:46 -0400)] 
Add test for creating FixedSizeBinaryArray::try_from_sparse_iter failed when given all Nones (#1551)

* Add test for creating FixedSizeBinaryArray::try_from_sparse_iter failed when given all Nones

* fix test

2 months ago Read/write nested dictionary in ipc stream reader/writer (#1566)
Liang-Chi Hsieh [Fri, 15 Apr 2022 13:05:23 +0000 (06:05 -0700)] 
Read/write nested dictionary in ipc stream reader/writer (#1566)

* Read dictionary inside dictionary

* Fix clippy

2 months ago initial commit (#1564)
Chao Sun [Fri, 15 Apr 2022 12:52:03 +0000 (05:52 -0700)] 
initial commit (#1564)

2 months ago Split out MapArray into separate module (#1483) (#1562)
Raphael Taylor-Davies [Fri, 15 Apr 2022 12:50:28 +0000 (13:50 +0100)] 
Split out MapArray into separate module (#1483) (#1562)

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 months ago Support empty projection in ParquetRecordBatchReader (#1560)
Raphael Taylor-Davies [Fri, 15 Apr 2022 12:43:42 +0000 (13:43 +0100)] 
Support empty projection in ParquetRecordBatchReader (#1560)

* Support empty projection in ParquetRecordBatchReader

* Fix async reader

* Fix RAT

2 months ago Fix incorrect `into_buffers` for UnionArray (#1567)
Liang-Chi Hsieh [Thu, 14 Apr 2022 18:42:29 +0000 (11:42 -0700)] 
Fix incorrect `into_buffers` for UnionArray (#1567)

* Fix incorrect buffers for UnionArray

* Add test

* Re-enable test_filter_union_array_sparse

2 months ago Add CI check for full validation mode (#1546)
Andrew Lamb [Thu, 14 Apr 2022 17:30:16 +0000 (13:30 -0400)] 
Add CI check for full validation mode (#1546)

* Add force_validate feature

* Disable some redundant checks

* Add issue link

* Add test with force_validate feature flag

* fix up message

* disable due to https://github.com/apache/arrow-rs/issues/1547

* disable ipc test failure

* fix clippy

* Fix doctest to pass with force_validate enabled

2 months ago Add option to skip decoding arrow metadata from parquet (#1459) (#1558)
Raphael Taylor-Davies [Thu, 14 Apr 2022 14:45:15 +0000 (15:45 +0100)] 
Add option to skip decoding arrow metadata from parquet (#1459) (#1558)

* Add option to skip decoding arrow metadata from parquet (#1459)

Fix inference from null logical type (#1557)

Replace some `&Option<T>` with `Option<&T>` (#1556)

* Update parquet/src/arrow/arrow_reader.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Fmt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
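The `&Option<T>` to `Option<&T>` change named in this commit is a general Rust idiom rather than anything Parquet-specific. A minimal sketch of the idea, using a hypothetical `Metadata` struct and `key_len` function that are illustrative only and not taken from arrow-rs:

```rust
/// Hypothetical stand-in type for illustration; not an arrow-rs type.
#[derive(Debug, PartialEq)]
struct Metadata {
    key: String,
}

/// Taking `Option<&T>` instead of `&Option<T>` means the caller does not
/// need to own (or construct) an `Option<T>` just to make the call.
fn key_len(meta: Option<&Metadata>) -> usize {
    meta.map(|m| m.key.len()).unwrap_or(0)
}
```

A caller that holds an `Option<Metadata>` can pass `value.as_ref()`, while a caller that only holds a `&Metadata` can pass `Some(reference)` without cloning.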
2 months ago Improve JSON reader documentation (#1559)
Andrew Lamb [Wed, 13 Apr 2022 16:46:53 +0000 (12:46 -0400)] 
Improve JSON reader documentation (#1559)

* Improve JSON reader documentation

* Apply suggestions from code review

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2 months ago Consolidate JSON reader options (#1539)
Andrew Lamb [Wed, 13 Apr 2022 12:51:15 +0000 (08:51 -0400)]
Consolidate JSON reader options (#1539)

2 months ago Fix reading dictionaries from nested structs in ipc `StreamReader` (#1550)
Thomas Peiselt [Wed, 13 Apr 2022 11:28:32 +0000 (13:28 +0200)] 
Fix reading dictionaries from nested structs in ipc `StreamReader` (#1550)

* Fix reading dictionaries from nested structs in ipc `StreamReader`

* Fix clippy error

* Apply review comment about field naming in test

2 months ago Create RecordBatch With Non-Zero Row Count But No Columns (#1536) (#1552)
Raphael Taylor-Davies [Tue, 12 Apr 2022 23:44:22 +0000 (00:44 +0100)] 
Create RecordBatch With Non-Zero Row Count But No Columns (#1536) (#1552)

* Support empty RecordBatch (#1536)

* Placate clippy

* Review feedback

* Fix doc

* Fix create_record_batch_slice_empty_batch test

2 months ago Allow json reader/decoder to work with format_strings for each field (#1451)
Sumit [Tue, 12 Apr 2022 13:39:05 +0000 (15:39 +0200)] 
Allow json reader/decoder to work with format_strings for each field  (#1451)

* implement parser for remaining types used by json decoder

* added format strings (hashmap) to json reader

the format_string map's key is column name.
The value will be used to parse the date64/date32 types from json
if the read value is of string type

add tests for formatted parser for date{32,64}type for json readers

all-parsers start

fixup! added format strings (hashmap) to json reader

* add DecoderOptions struct for holding options for decoder

that way later extensions to the decoder can be added to this struct
without breaking API.

* Fixup some comments

* added test for string parsing json reader for time{32,64} types

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 months ago minor: enable Date32/64 to String/LargeString cast (#1534)
Yijie Shen [Tue, 12 Apr 2022 13:34:18 +0000 (21:34 +0800)] 
minor: enable Date32/64 to String/LargeString cast (#1534)

2 months ago update the doc of `substring` (#1529)
Remzi Yang [Sun, 10 Apr 2022 11:15:27 +0000 (19:15 +0800)] 
update the doc of `substring` (#1529)

* update doc

Signed-off-by: remzi <13716567376yh@gmail.com>
* update doc

Signed-off-by: remzi <13716567376yh@gmail.com>
* Update arrow/src/compute/kernels/substring.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 months ago fix clippy errors in 1.60 (#1527)
Andrew Lamb [Fri, 8 Apr 2022 10:19:56 +0000 (06:19 -0400)] 
fix clippy errors in 1.60 (#1527)

2 months ago Add `new_from_strings` to create `MapArrays` (#1507)
Liang-Chi Hsieh [Thu, 7 Apr 2022 21:07:45 +0000 (14:07 -0700)] 
Add `new_from_strings` to create `MapArrays` (#1507)

* Add new_from_strings

* Fix clippy

* Update arrow/src/array/array_map.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Fix typo too

* For review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 months ago Fix reading nested lists from parquet files (#1517)
Liang-Chi Hsieh [Thu, 7 Apr 2022 21:06:39 +0000 (14:06 -0700)] 
Fix reading nested lists from parquet files  (#1517)

* Fix

* Add test

2 months ago Decouple buffer deallocation from ffi and allow creating buffers from rust vec (...
Jörn Horstmann [Thu, 7 Apr 2022 21:01:21 +0000 (23:01 +0200)] 
Decouple buffer deallocation from ffi and allow creating buffers from rust vec (#1494)

* Decouple buffer deallocation from ffi and allow zero-copy buffer creation from rust vectors or strings

* Move allocation owner to alloc module

* Rename and comment Deallocation variants

* Fix doc link

* Explicitly assert that Buffer is UnwindSafe

* fix: doc comment

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 months ago Speed up the `substring` kernel by about 2x (#1512)
Remzi Yang [Thu, 7 Apr 2022 20:31:30 +0000 (04:31 +0800)] 
Speed up the `substring` kernel by about 2x (#1512)

* speed up substring

Signed-off-by: remzi <13716567376yh@gmail.com>
* add comments

Signed-off-by: remzi <13716567376yh@gmail.com>
* reformat code

Signed-off-by: remzi <13716567376yh@gmail.com>
* use trait object to simplify the code

Signed-off-by: remzi <13716567376yh@gmail.com>
* reformat code

Signed-off-by: remzi <13716567376yh@gmail.com>
* fmt code

Signed-off-by: remzi <13716567376yh@gmail.com>
2 months ago chore: Update `prost`, `prost-derive` and `prost-types` to 0.10, `tonic`, and `tonic...
Andrew Lamb [Thu, 7 Apr 2022 19:30:46 +0000 (15:30 -0400)] 
chore: Update `prost`, `prost-derive` and `prost-types` to 0.10, `tonic`, and `tonic-build` to `0.7` (#1510)

* chore: Update prost, prost-derive and prost-types to 0.10

* Update tonic requirement from 0.6 to 0.7

Updates the requirements on [tonic](https://github.com/hyperium/tonic) to permit the latest version.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.6.0...v0.7.0)

---
updated-dependencies:
- dependency-name: tonic
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
* Update tonic-build requirement from 0.6 to 0.7

Updates the requirements on [tonic-build](https://github.com/hyperium/tonic) to permit the latest version.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.6.0...v0.7.0)

---
updated-dependencies:
- dependency-name: tonic-build
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
* Update generated code

* Try installing cmake dependencies for flight

* install cmake and protobuf

* Use --experimental_allow_proto3_optional flag

* fix apt-install

* try to install just protobuf compiler

* Add action to configure workspace

* Use prost enabled toolchain

* fixes

* fixups

* fix clippy

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 months ago Fix for missing documentation of `GenericListBuilder` (#1525)
Sven Cattell [Wed, 6 Apr 2022 17:39:50 +0000 (13:39 -0400)] 
Fix for missing documentation of `GenericListBuilder` (#1525)

* Should fix  issue #1518

* Lint change.

2 months ago Add a diagram to `take` kernel documentation (#1524)
Andrew Lamb [Tue, 5 Apr 2022 19:55:56 +0000 (15:55 -0400)] 
Add a diagram to `take` kernel documentation (#1524)

2 months ago Mark remove-old-releases.sh executable (#1522)
Andrew Lamb [Tue, 5 Apr 2022 13:39:56 +0000 (09:39 -0400)] 
Mark remove-old-releases.sh executable (#1522)

2 months ago Delete duplicate code in the `sort` kernel (#1519)
Remzi Yang [Tue, 5 Apr 2022 13:32:52 +0000 (21:32 +0800)] 
Delete duplicate code in the `sort` kernel (#1519)

* remove repeated code

Signed-off-by: remzi <13716567376yh@gmail.com>
* delete weird file

Signed-off-by: remzi <13716567376yh@gmail.com>
2 months ago Prepare for release version `11.1.0` (#1514) 11.1.0
Andrew Lamb [Fri, 1 Apr 2022 15:23:38 +0000 (11:23 -0400)] 
Prepare for release version `11.1.0` (#1514)

* Update release version to 11.1.0

* draft: changelog

* more

* update

* Fixup

2 months ago Implement ArrayEqual for UnionArray (#1469)
Liang-Chi Hsieh [Thu, 31 Mar 2022 18:20:28 +0000 (11:20 -0700)] 
 Implement ArrayEqual for UnionArray (#1469)

* init

* more

* Remove dense/sparse case

* Fix clippy

* For review

* For review

2 months ago Add FFI for Arrow C Stream Interface (#1384)
Liang-Chi Hsieh [Thu, 31 Mar 2022 01:07:06 +0000 (18:07 -0700)] 
Add FFI for Arrow C Stream Interface (#1384)

* Add FFI for Arrow C Stream Interface

* Add ArrowArrayStreamReader

* Add test

* Fix clippy

* fix format

* define error code

* Regenerate ffi binding using bindgen

* Rewrite test

* Remove CStreamInterface

* Fix clippy error

* Fix more clippy errors

* For review comment.

* Fix clippy error

* Fix clippy error

* not run example code in comment

* ignore doctest

* For review

* Fix clippy

* For review comment

* For review

* Add export_reader_into_raw

* For review

3 months ago Clarify docs that SlicesIterator ignores null values (#1504)
Andrew Lamb [Wed, 30 Mar 2022 17:49:12 +0000 (13:49 -0400)] 
Clarify docs that SlicesIterator ignores null values (#1504)

* Clarify docs that SlicesIterator ignores null values

* Update arrow/src/compute/kernels/filter.rs

Co-authored-by: Yijie Shen <henry.yijieshen@gmail.com>
3 months ago Update release scripts to automatically clean up old release versions (#1467)
Andrew Lamb [Wed, 30 Mar 2022 17:48:33 +0000 (13:48 -0400)] 
Update release scripts to automatically clean up old release versions (#1467)

* Automatically clean up old release versions

* Update dev/release/release-tarball.sh

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
* Add message to delete command

* fix submodules

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
3 months ago Support calculating number of chars for `StringArray` (#1503)
Remzi Yang [Wed, 30 Mar 2022 17:47:29 +0000 (01:47 +0800)] 
Support calculating number of chars for `StringArray` (#1503)

* add functions, no tests yet

Signed-off-by: remzi <13716567376yh@gmail.com>
* add tests
delete unchecked fn
update doc

Signed-off-by: remzi <13716567376yh@gmail.com>
* use lib method
update doc and test

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago Implement `size_hint` and `ExactSizedIterator` for `DecimalArray` (#1506)
Andrew Lamb [Wed, 30 Mar 2022 17:46:20 +0000 (13:46 -0400)] 
Implement `size_hint` and `ExactSizedIterator` for `DecimalArray` (#1506)

* Implement size_hint and ExactSizedIterator for DecimalArray

* clippy

* Apply suggestions from code review

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
* Move DecimalIter to iterator.rs

* Bring back doc fixes

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
3 months ago update doc (#1491)
Remzi Yang [Tue, 29 Mar 2022 18:59:40 +0000 (02:59 +0800)] 
update doc (#1491)

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago Use Arrow take kernel within ListArrayReader (#1490)
Liang-Chi Hsieh [Tue, 29 Mar 2022 18:59:00 +0000 (11:59 -0700)] 
Use Arrow take kernel within ListArrayReader (#1490)

* Remove remove_indices

* For review

3 months ago Fix miri error in try_from_trusted_len_iter (#1497)
Jörn Horstmann [Tue, 29 Mar 2022 16:48:37 +0000 (18:48 +0200)] 
Fix miri error in try_from_trusted_len_iter (#1497)

3 months ago Add `length` kernel support for List Array (#1488)
Remzi Yang [Mon, 28 Mar 2022 20:47:45 +0000 (04:47 +0800)] 
Add `length` kernel support for List Array (#1488)

* add fn for list length
code format

Signed-off-by: remzi <13716567376yh@gmail.com>
* add list support into length function

Signed-off-by: remzi <13716567376yh@gmail.com>
* add tests

Signed-off-by: remzi <13716567376yh@gmail.com>
* update doc

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago Support sort for decimal data type (#1487)
Yijie Shen [Mon, 28 Mar 2022 20:38:58 +0000 (04:38 +0800)] 
Support sort for decimal data type (#1487)

3 months ago Fix generate_non_canonical_map_case, fix `MapArray` equality (#1476)
Liang-Chi Hsieh [Sun, 27 Mar 2022 10:46:56 +0000 (03:46 -0700)] 
Fix generate_non_canonical_map_case, fix `MapArray` equality  (#1476)

* Revamp list_equal for map type

* Canonicalize schema

* Add nullability and metadata

3 months ago Fix reading/writing nested null arrays (#1480) (#1036) (#1399) (#1481)
Raphael Taylor-Davies [Fri, 25 Mar 2022 16:43:52 +0000 (16:43 +0000)] 
Fix reading/writing nested null arrays (#1480) (#1036) (#1399) (#1481)

3 months ago Split ArrayReaderBuilder into its own module (#1483) (#1485)
Raphael Taylor-Davies [Fri, 25 Mar 2022 12:55:36 +0000 (12:55 +0000)] 
Split ArrayReaderBuilder into its own module (#1483) (#1485)

* Split ArrayReaderBuilder into its own module (#1483)

* Add license header

3 months ago Support the `length` kernel on Binary Array (#1465)
Remzi Yang [Thu, 24 Mar 2022 18:32:34 +0000 (02:32 +0800)] 
Support the `length` kernel on Binary Array (#1465)

* support length on binary array (not test)
rewrite unary_offset using macro

Signed-off-by: remzi <13716567376yh@gmail.com>
* add tests

Signed-off-by: remzi <13716567376yh@gmail.com>
* add non-utf8 test cases

Signed-off-by: remzi <13716567376yh@gmail.com>
* fix some doc

Signed-off-by: remzi <13716567376yh@gmail.com>
* update doc
simplify the way to get offsets. No performance penalty

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago fix doc (#1471)
Remzi Yang [Wed, 23 Mar 2022 20:53:16 +0000 (04:53 +0800)] 
fix doc (#1471)

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago Fix doc (#1463)
Liang-Chi Hsieh [Tue, 22 Mar 2022 11:11:10 +0000 (04:11 -0700)] 
Fix doc (#1463)

3 months ago Fix generate_map_case (#1457)
Liang-Chi Hsieh [Tue, 22 Mar 2022 11:07:08 +0000 (04:07 -0700)] 
Fix generate_map_case (#1457)

3 months ago Improve performance of DictionaryArray::try_new() (#1435)
jakevin [Tue, 22 Mar 2022 11:06:25 +0000 (19:06 +0800)] 
Improve performance of DictionaryArray::try_new()  (#1435)

* improve `DictionaryArray::try_new()` #1313

* *: fix typo

* *: add cheap validate and unit test

* *: polish the error

* Add safety note

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
3 months ago Fix Parquet reader for null list (#1448)
Liang-Chi Hsieh [Tue, 22 Mar 2022 10:09:08 +0000 (03:09 -0700)] 
Fix Parquet reader for null list (#1448)

* Fix Parquet reader for null list

* Test on forked parquet-testing

* For review comments

* Fix clippy

3 months ago Remove Clone and copy source structs internally (#1449)
Liang-Chi Hsieh [Sat, 19 Mar 2022 06:26:50 +0000 (23:26 -0700)] 
Remove Clone and copy source structs internally (#1449)

* Remove Clone and copy source structs internally

* Remove drop_in_place and add more comment

* Add export_into_raw

* Fix format

* Fix clippy

* Move to export_array_into_raw

* Fix clippy

* Fix doc

* Use write_unaligned

3 months ago Prepare for 11.0.0 release (#1461) 11.0.0
Andrew Lamb [Fri, 18 Mar 2022 07:46:56 +0000 (03:46 -0400)] 
Prepare for 11.0.0 release (#1461)

* Update version to 11.0.0

* Update changelog

* update changelog

* fixup

* tweak

3 months ago Fix generate_interval_case in integration test (#1446)
Liang-Chi Hsieh [Thu, 17 Mar 2022 12:45:56 +0000 (05:45 -0700)] 
Fix generate_interval_case in integration test (#1446)

* Fix generate_interval_case

* Fix

3 months ago rewrite doc (#1450)
Remzi Yang [Thu, 17 Mar 2022 12:43:10 +0000 (20:43 +0800)] 
rewrite doc (#1450)

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago enhancement: remove redundant if/clamp_min/abs (#1428)
jakevin [Thu, 17 Mar 2022 07:35:28 +0000 (15:35 +0800)] 
enhancement: remove redundant if/clamp_min/abs (#1428)

3 months ago Set `default-features = false` for `zstd` in the parquet crate to support `wasm32...
Kyle Barron [Wed, 16 Mar 2022 12:23:01 +0000 (06:23 -0600)] 
Set `default-features = false` for `zstd` in the parquet crate to support `wasm32-unknown-unknown` (#1414)

* Update zstd version for wasm support

* Bump to 0.11.1

3 months ago `filter` kernel should work with FixedSizeListArrays (#1434)
Liang-Chi Hsieh [Wed, 16 Mar 2022 12:22:49 +0000 (05:22 -0700)] 
`filter` kernel should work with FixedSizeListArrays (#1434)

* filter kernel should work with FixedSizeListArrays

* Fix clippy

* Fix clippy

3 months ago filter kernel should work with UnionArray (#1412)
Liang-Chi Hsieh [Wed, 16 Mar 2022 07:40:44 +0000 (00:40 -0700)] 
filter kernel should work with UnionArray (#1412)

3 months ago Rewrite doc example of ListArray and LargeListArray (#1447)
Remzi Yang [Wed, 16 Mar 2022 06:45:33 +0000 (14:45 +0800)] 
Rewrite doc example of ListArray and LargeListArray (#1447)

3 months ago Fix generate_decimal128_case (#1440)
Liang-Chi Hsieh [Mon, 14 Mar 2022 12:12:14 +0000 (05:12 -0700)] 
Fix generate_decimal128_case (#1440)

3 months ago Fix integration doc (#1438)
Liang-Chi Hsieh [Mon, 14 Mar 2022 07:11:02 +0000 (00:11 -0700)] 
Fix integration doc (#1438)

3 months ago Fix DeltaBitPack MiniBlock Bit Width Padding (#1418)
Raphael Taylor-Davies [Mon, 14 Mar 2022 07:05:54 +0000 (07:05 +0000)] 
Fix DeltaBitPack MiniBlock Bit Width Padding (#1418)

* Consistent DeltaBitPackEncoder bit width padding (#1416)

Ignore non-zero padded bit widths in DeltaBitPackDecoder (#1417)

* chore: review feedback

* Add test of DeltaBitPackDecoder padding

* Revert formatting

3 months ago Add doc example for creating `FixedSizeListArray` (#1426)
Remzi Yang [Sat, 12 Mar 2022 17:51:34 +0000 (01:51 +0800)] 
Add doc example for creating `FixedSizeListArray` (#1426)

3 months ago Support nullable keys in DictionaryArray::try_new (#1430)
Jörn Horstmann [Fri, 11 Mar 2022 20:02:05 +0000 (21:02 +0100)] 
Support nullable keys in DictionaryArray::try_new (#1430)

* Support nullable keys in DictionaryArray::try_new

* Set null count so it does not have to be recalculated

3 months ago Fix possibly unaligned writes in MutableBuffer (#1421)
Jörn Horstmann [Fri, 11 Mar 2022 19:22:18 +0000 (20:22 +0100)] 
Fix possibly unaligned writes in MutableBuffer (#1421)

* Fix possibly unaligned writes in MutableBuffer

* Remove debug output and make from_trusted_len_iter follow the same pattern

* Add comment in extend_from_slice
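The log does not show the patch itself, so the following is only a sketch of the general technique for writing a multi-byte value at an arbitrary byte offset, using the standard library's `std::ptr::write_unaligned` and `std::ptr::read_unaligned`. The helper names `write_u32_at` and `read_u32_at` are hypothetical and are not `MutableBuffer` methods:

```rust
use std::mem::size_of;
use std::ptr;

/// Writes `value` into `buf` starting at byte `offset`, which need not be
/// a multiple of `align_of::<u32>()`. Panics if the value does not fit.
fn write_u32_at(buf: &mut [u8], offset: usize, value: u32) {
    assert!(offset <= buf.len() && buf.len() - offset >= size_of::<u32>());
    // SAFETY: the check above guarantees four writable bytes at `offset`,
    // and `write_unaligned` places no alignment requirement on `dst`.
    unsafe {
        let dst = buf.as_mut_ptr().add(offset).cast::<u32>();
        ptr::write_unaligned(dst, value);
    }
}

/// Reads the value back with the matching unaligned read.
fn read_u32_at(buf: &[u8], offset: usize) -> u32 {
    assert!(offset <= buf.len() && buf.len() - offset >= size_of::<u32>());
    // SAFETY: same bounds reasoning as in `write_u32_at`.
    unsafe { ptr::read_unaligned(buf.as_ptr().add(offset).cast::<u32>()) }
}
```

Dereferencing the cast pointer directly (`*dst = value`) would be undefined behaviour whenever `offset` is not suitably aligned, which is the class of bug a fix like this one addresses.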

3 months ago Add value_unchecked() for FixedSizeBinaryArray (#1420)
jakevin [Fri, 11 Mar 2022 18:31:08 +0000 (02:31 +0800)] 
Add value_unchecked() for FixedSizeBinaryArray (#1420)

3 months ago Update zstd requirement from 0.10 to 0.11 (#1415)
dependabot[bot] [Fri, 11 Mar 2022 12:19:41 +0000 (07:19 -0500)] 
Update zstd requirement from 0.10 to 0.11 (#1415)

Updates the requirements on [zstd](https://github.com/gyscos/zstd-rs) to permit the latest version.
- [Release notes](https://github.com/gyscos/zstd-rs/releases)
- [Commits](https://github.com/gyscos/zstd-rs/compare/v0.10.0...v0.11.0)

---
updated-dependencies:
- dependency-name: zstd
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
3 months ago Implement basic FlightSQL Server (#1386)
Wang Fenjin [Fri, 11 Mar 2022 11:19:18 +0000 (19:19 +0800)] 
Implement basic FlightSQL Server (#1386)

* init impl flight sql server mod

Change-Id: I108b2468b078470bb8b6f95c031035cc09227986

* update according to comments

Change-Id: Ibb381e105041b38e6402850a2338403f802568ec

* fix ci error

Change-Id: I9485e510f1a960b6e094e559c3679434f8474ec1

* format code

Change-Id: I7ef4ade3acc81ccf5df088c866d41b538cf6f4f2

* fix clippy issue

Change-Id: I35d108ef43f2c2245444cfd5ea82da00b4f694f9

* add more test

Change-Id: Ic159cea2c76b017e183d2946e2d24e6fd1f9b4c1

* improve error handling

Change-Id: I709c16613092fd42ccff827eed3e3ad3f28368e2

* delete unnecessary Sync

Change-Id: I03ed0f69ddb1203ecd75982815fa72eca4d81160

* add flight_sql_server example

Change-Id: Ia35d697aaac3c72feba9c3aaf380ee3930484c48

* get rid of type annotation in unpack

Change-Id: I6006702d424ac6595f58c66057df267c4fd24476

* fix comments

Change-Id: I740d3d4e5aabbb56219291381e6a6db6506eca28

* add feature flight-sql

Change-Id: I223cf76be10ff379fcc9000c730d99c9773c7c3d

* delete all-features flag as packed_simd_2 is not supported

Change-Id: I50915b85b2f806bac5cd3207623e3f4e0e1974a1

* add feature flag for example

Change-Id: I562efcfa89a606b8061d2715ca1b6775e2a952a9

* fix do_put and do_action API

Change-Id: I80bef8c2b0a713a87c43487708ae721f5f8f9da9

* format code

Change-Id: Ie664a5fca965759dbba59ad9e34fc6e33150ddbf

* rename feature

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* rename to flight-sql-experimental

Change-Id: I4de4fe3768b0316e69ba6798406310632933d25d

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
3 months ago Remove duplicate bound check in the function shift (#1409)
Remzi Yang [Fri, 11 Mar 2022 02:40:15 +0000 (10:40 +0800)] 
Remove duplicate bound check in the function shift (#1409)

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago Directly write to MutableBuffer in substring (#1423)
Liang-Chi Hsieh [Fri, 11 Mar 2022 02:37:06 +0000 (18:37 -0800)] 
Directly write to MutableBuffer in substring (#1423)

3 months ago Implement projection for arrow file / streams (#1339)
Daniël Heres [Wed, 9 Mar 2022 17:40:53 +0000 (18:40 +0100)] 
Implement projection for arrow file / streams (#1339)

* Implement projection for arrow file / streams

* Tests

* Fix

* Fix

* Add test

* Add test

* Add link

* Undo change to existing test

* Update arrow/src/ipc/reader.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Use project

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
3 months ago Add dictionary support for C data interface (#1407)
Chao Sun [Wed, 9 Mar 2022 07:27:43 +0000 (23:27 -0800)] 
Add dictionary support for C data interface (#1407)

* initial commit

* add integration tests for python

* address comments

3 months ago fix (#1406)
Remzi Yang [Tue, 8 Mar 2022 15:28:20 +0000 (23:28 +0800)] 
fix (#1406)

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago add unit test to check all none (#1405)
jakevin [Mon, 7 Mar 2022 21:29:59 +0000 (05:29 +0800)] 
add unit test to check all none (#1405)

3 months ago Improve integration testing docs (#1403)
Andrew Lamb [Sun, 6 Mar 2022 18:41:11 +0000 (13:41 -0500)] 
Improve integration testing docs (#1403)

3 months ago Move csv Parser trait and its implementations to utils module (#1385)
Sumit [Sun, 6 Mar 2022 13:36:36 +0000 (14:36 +0100)] 
Move csv Parser trait and its implementations to utils module (#1385)

* move Parser trait to utils

this allows the parser capabilities to be re-used for the json module

* implement parse_formatted for date32

* remove redundant checks

* Update arrow/src/util/mod.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* make Parser trait pub(crate) only and not pub

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
3 months ago Introduce `ReadOptions` with builder API, for parquet filter row groups that satisfy...
Yijie Shen [Sun, 6 Mar 2022 11:44:54 +0000 (19:44 +0800)] 
Introduce `ReadOptions` with builder API, for parquet filter row groups that satisfy all filters, and enable filter row groups by range. (#1389)

* Filter row groups by comparing midpoint with offset range

* lint

* ReadOptions with builder API

* fix comments

* precise range doc

* tab to space
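The "comparing midpoint with offset range" step can be sketched roughly as follows. This is an illustration under stated assumptions, not the parquet crate's code: the `RowGroupExtent` struct, its field names, and the half-open `[lo, hi)` convention are all hypothetical.

```rust
/// Hypothetical, simplified description of a row group's byte extent.
/// The field names are illustrative and are not the parquet crate's API.
struct RowGroupExtent {
    /// Byte offset at which the row group's data starts.
    start: u64,
    /// Total size of the row group's data in bytes.
    length: u64,
}

/// Keeps a row group if its midpoint falls in the half-open range `[lo, hi)`.
fn in_range(rg: &RowGroupExtent, lo: u64, hi: u64) -> bool {
    let mid = rg.start + rg.length / 2;
    mid >= lo && mid < hi
}

/// Returns the indices of the row groups selected for `[lo, hi)`.
fn select_row_groups(groups: &[RowGroupExtent], lo: u64, hi: u64) -> Vec<usize> {
    groups
        .iter()
        .enumerate()
        .filter(|(_, rg)| in_range(rg, lo, hi))
        .map(|(i, _)| i)
        .collect()
}
```

Because each row group has exactly one midpoint, a set of disjoint ranges that together cover the file assigns every row group to exactly one range, even when a row group straddles a range boundary.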

3 months ago Add note in contributing guideline about types of contributions (#1396)
Andrew Lamb [Sun, 6 Mar 2022 11:41:58 +0000 (06:41 -0500)] 
Add note in contributing guideline about types of contributions (#1396)

* Add note in contributing guideline about types of contributions

* prettier

3 months ago fix: Fix grpc schema hack in flight integration test (#1402)
Andrew Lamb [Sat, 5 Mar 2022 13:54:34 +0000 (08:54 -0500)] 
fix: Fix grpc schema hack in flight integration test (#1402)

3 months ago Prepare for the 10.0.0 release (#1395) 10.0.0
Andrew Lamb [Sat, 5 Mar 2022 11:38:10 +0000 (06:38 -0500)] 
Prepare for the 10.0.0 release (#1395)

* Update version to 10.0.0

* Initial 10.0.0 CHANGELOG

* Cleanup CHANGELOG

* Update for last change

3 months ago Add extract month and day in temporal.rs (#1388)
Yang Jiang [Fri, 4 Mar 2022 11:42:22 +0000 (19:42 +0800)] 
Add extract month and day in temporal.rs (#1388)

* Add extract month in temporal.rs

* fix clippy

* implement day

* add ut

* fix clippy

3 months ago Clarify release instructions about when to merge CHANGELOG update (#1370)
Andrew Lamb [Thu, 3 Mar 2022 18:26:06 +0000 (13:26 -0500)] 
Clarify release instructions about when to merge CHANGELOG update (#1370)

3 months ago feat: support maps in MutableArrayData (#1379)
Helgi Kristvin Sigurbjarnarson [Thu, 3 Mar 2022 18:15:46 +0000 (10:15 -0800)] 
feat: support maps in MutableArrayData (#1379)

Additionally, this allows the use of `filter` on record batches and
arrays containing maps.

3 months ago Add write method to Json Writer (#1383)
Matthew Turner [Thu, 3 Mar 2022 17:49:18 +0000 (12:49 -0500)] 
Add write method to Json Writer (#1383)

* Add write method

* Add docs

3 months ago Speed up the function `min_max_string` (#1374)
Remzi Yang [Thu, 3 Mar 2022 17:47:35 +0000 (01:47 +0800)] 
Speed up the function `min_max_string` (#1374)

* clean up the code

Signed-off-by: remzi <13716567376yh@gmail.com>
* bring back the optimization when null count is zero

Signed-off-by: remzi <13716567376yh@gmail.com>
* pretty the trait bound and update comment

Signed-off-by: remzi <13716567376yh@gmail.com>
* use value_unchecked to replace array.value
10% extra speed up

Signed-off-by: remzi <13716567376yh@gmail.com>
* update the performance data

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago Allow primitive array creation from iterators of PrimitiveTypes (as well as `Option...
Liang-Chi Hsieh [Thu, 3 Mar 2022 14:04:43 +0000 (06:04 -0800)] 
Allow primitive array creation from iterators of PrimitiveTypes (as well as `Option`) (#1367)

* More idiomatic primitive array creation

* Use From instead for clippy

* Rename to NativeAdapter and add document

3 months ago Improve performance of dictionary kernels, add benchmark and add `take_iter_unchecked...
Liang-Chi Hsieh [Thu, 3 Mar 2022 11:34:00 +0000 (03:34 -0800)]
Improve performance of dictionary kernels, add benchmark and add `take_iter_unchecked` (#1372)

* Add benchmark and take_iter_unchecked.

* Add Safety section for clippy

* Update arrow/src/compute/kernels/comparison.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
3 months ago Support extract `week` in temporal.rs (#1376)
Yang Jiang [Wed, 2 Mar 2022 16:30:16 +0000 (00:30 +0800)] 
Support extract `week` in temporal.rs (#1376)

* Add extract week in temporal.rs

* add more test conditions

* add comments

3 months ago Add Clone to IpcWriteOptions (#1382)
Matthew Turner [Wed, 2 Mar 2022 16:15:57 +0000 (11:15 -0500)] 
Add Clone to IpcWriteOptions (#1382)

3 months ago Refactor `RecordBatch::validate_new_batch` (#1361)
Remzi Yang [Wed, 2 Mar 2022 15:26:11 +0000 (23:26 +0800)] 
Refactor `RecordBatch::validate_new_batch` (#1361)

* refactor checking same row count

Signed-off-by: remzi <13716567376yh@gmail.com>
* refactor matching schema

Signed-off-by: remzi <13716567376yh@gmail.com>
* add more comments
simplify the iterator

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago refactor (#1346)
Shani Solomon [Wed, 2 Mar 2022 15:25:25 +0000 (17:25 +0200)] 
refactor (#1346)

3 months ago Implement DictionaryArray support in neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn...
Liang-Chi Hsieh [Tue, 1 Mar 2022 11:37:48 +0000 (03:37 -0800)] 
Implement DictionaryArray support in neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn (#1326)

* Implement DictionaryArray support in neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn

* Fix clippy

* Fix format

* Add test

* For review comment and suggestion

* Allow reasonable boolean comparisons

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
3 months ago Update pyo3 requirement from 0.15 to 0.16 (#1369)
dependabot[bot] [Tue, 1 Mar 2022 04:10:49 +0000 (12:10 +0800)] 
Update pyo3 requirement from 0.15 to 0.16 (#1369)

3 months ago support as_decimal_array api (#1356)
Kun Liu [Mon, 28 Feb 2022 21:02:06 +0000 (05:02 +0800)] 
support as_decimal_array api (#1356)

3 months ago Use DictionaryArray's iterator (#1330)
Liang-Chi Hsieh [Mon, 28 Feb 2022 20:55:21 +0000 (12:55 -0800)] 
Use DictionaryArray's iterator (#1330)

3 months ago Update contributing guide (#1368)
Remzi Yang [Mon, 28 Feb 2022 19:17:53 +0000 (03:17 +0800)] 
Update contributing guide (#1368)

* add build environment

Signed-off-by: remzi <13716567376yh@gmail.com>
* update the format

Signed-off-by: remzi <13716567376yh@gmail.com>
3 months ago Update flatbuffers requirement from =2.1.0 to =2.1.1 (#1364)
dependabot[bot] [Mon, 28 Feb 2022 19:17:32 +0000 (14:17 -0500)] 
Update flatbuffers requirement from =2.1.0 to =2.1.1 (#1364)

Updates the requirements on [flatbuffers](https://github.com/google/flatbuffers) to permit the latest version.
- [Release notes](https://github.com/google/flatbuffers/releases)
- [Commits](https://github.com/google/flatbuffers/commits)

---
updated-dependencies:
- dependency-name: flatbuffers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
4 months ago Update flatbuffers requirement from =2.0.0 to =2.1.0 (#1359)
dependabot[bot] [Fri, 25 Feb 2022 12:20:52 +0000 (20:20 +0800)] 
Update flatbuffers requirement from =2.0.0 to =2.1.0 (#1359)

Updates the requirements on [flatbuffers](https://github.com/google/flatbuffers) to permit the latest version.
- [Release notes](https://github.com/google/flatbuffers/releases)
- [Commits](https://github.com/google/flatbuffers/commits)

---
updated-dependencies:
- dependency-name: flatbuffers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
4 months ago Fix clippy lints (#1363)
Remzi Yang [Fri, 25 Feb 2022 12:18:13 +0000 (20:18 +0800)] 
Fix clippy lints (#1363)

* fix some

Signed-off-by: remzi <13716567376yh@gmail.com>
* fix some warnings

Signed-off-by: remzi <13716567376yh@gmail.com>
* fix some warning

Signed-off-by: remzi <13716567376yh@gmail.com>
* fix all clippy lints

Signed-off-by: remzi <13716567376yh@gmail.com>
4 months ago Remove delimiter from csv Writer (#1342)
Sergey Glushchenko [Thu, 24 Feb 2022 07:50:29 +0000 (08:50 +0100)] 
Remove delimiter from csv Writer (#1342)

4 months ago Publicly export arrow::array::MapBuilder (#1355)
tjwilson90 [Thu, 24 Feb 2022 07:48:33 +0000 (23:48 -0800)] 
Publicly export arrow::array::MapBuilder (#1355)