arrow-rs.git
5 months agoUpdate readme to clarify versioning (#1142) active_release 7.0.0
Andrew Lamb [Sat, 8 Jan 2022 10:28:32 +0000 (05:28 -0500)] 
Update readme to clarify versioning (#1142)

5 months agoUpdate version to 7.0.0 and update CHANGELOG (#1141)
Andrew Lamb [Sat, 8 Jan 2022 10:19:08 +0000 (05:19 -0500)] 
Update version to 7.0.0 and update CHANGELOG (#1141)

* Update changelog generator

* Bring changelog from 6.5.0

* Update changelog

* Update version to 7.0.0

5 months agofeat(ipc): support for reading union arrays through IPC (#1140)
Helgi Kristvin Sigurbjarnarson [Thu, 6 Jan 2022 22:46:16 +0000 (14:46 -0800)] 
feat(ipc): support for reading union arrays through IPC (#1140)

5 months agoDyn comparison of interval arrays (#1106) (#1107)
Raphael Taylor-Davies [Thu, 6 Jan 2022 22:22:35 +0000 (22:22 +0000)] 
Dyn comparison of interval arrays (#1106) (#1107)

* Dyn comparison of interval arrays (#1106)

* fix fmt

* Skip test when simd is enabled

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
5 months agofeat: union schema serialization/deserialization for ipc (#1135)
Helgi Kristvin Sigurbjarnarson [Thu, 6 Jan 2022 22:12:40 +0000 (14:12 -0800)] 
feat: union schema serialization/deserialization for ipc (#1135)

5 months ago*_dyn_scalar kernels: Support Float32Array and Float64Array, (#1127)
Andrew Lamb [Thu, 6 Jan 2022 22:12:23 +0000 (17:12 -0500)] 
*_dyn_scalar kernels: Support Float32Array and Float64Array,  (#1127)

* *_dyn_scalar kernels: Support Float32Array and Float64Array, use ToPrimitive rather than `Into<i128>`, and take `&dyn Array` rather than `ArrayRef`

* Update APIs for *_dyn_bool_scalar kernels

5 months agoAdd more information on SIMD (#1138)
Benson Muite [Thu, 6 Jan 2022 22:11:01 +0000 (01:11 +0300)] 
Add more information on SIMD (#1138)

5 months agoAdd dyn boolean kernels (#1131)
Matthew Turner [Wed, 5 Jan 2022 21:53:32 +0000 (16:53 -0500)] 
Add dyn boolean kernels (#1131)

* Add dyn bool kernels

* Add tests

* Update error messages

* Update test

* Fix test

* Update doc strings

5 months agoFix reading of dictionary encoded pages with null values (#1111) (#1130)
Yordan Pavlov [Wed, 5 Jan 2022 19:57:19 +0000 (19:57 +0000)] 
Fix reading of dictionary encoded pages with null values (#1111) (#1130)

* fix reading of dictionary encoded pages with null values

* fix linting issues

5 months agoMake arrow::array_reader private (#1032) (#1133)
Raphael Taylor-Davies [Wed, 5 Jan 2022 16:27:27 +0000 (16:27 +0000)] 
Make arrow::array_reader private (#1032) (#1133)

5 months agoImplement Array for ArrayRef, Improve as_* kernels to take `&dyn Array` (#1129)
Andrew Lamb [Wed, 5 Jan 2022 13:45:27 +0000 (08:45 -0500)] 
Implement Array for ArrayRef, Improve as_* kernels to take `&dyn Array` (#1129)

* Implement Array for ArrayRef

* Improve as_* kernels to take &dyn Array

* remove unneeded pyarrow binding

5 months agoAdd Schema::with_metadata and Field::with_metadata (#1092)
Andrew Lamb [Wed, 5 Jan 2022 12:29:25 +0000 (07:29 -0500)] 
Add Schema::with_metadata and Field::with_metadata (#1092)

5 months agoallow using custom datetime format for inference and parsing csv file (#1112)
Sumit [Sun, 2 Jan 2022 16:42:43 +0000 (17:42 +0100)] 
allow using custom datetime format for inference and parsing csv file (#1112)

* allow using custom datetime format for inference and parsing csv file

The patch extends the current implementation to allow passing a custom
datetime_re and datetime_format to the ReaderBuilder.

datetime_re is used to infer the schema of the csv, and datetime_format is
then used to parse the actual string to a Date64.
Of course, passing incompatible datetime_re and datetime_format values
will make parsing or inference fail; this is an expected but
hard-to-detect failure.

* Incorporate some clippy recommendations to limit the number of call args

The patch adds a new struct to collect all these options together and
then passes the struct around. Ideally the struct could be embedded into
the reader but that can be done as a separate exercise.

* Detect presence of timezone in format while parsing csv for date64

The patch chooses between NaiveDateTime and DateTime from the chrono lib
based on the presence of timezone components in the format.

chrono expects a timezone to be present if DateTime is used, and errors
otherwise, whereas NaiveDateTime ignores the timezone even if it is
explicitly provided.
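
The choice between NaiveDateTime and DateTime described above can be sketched as a scan of the format string for timezone specifiers. This is a minimal illustration rather than the patch's actual code: `DateParser`, `format_has_timezone`, and `choose_parser` are hypothetical names, and the sketch assumes that only the `%z` family (`%z`, `%:z`, `%#z`) and `%Z` signal a timezone.

```rust
/// Which chrono type a format string calls for (hypothetical helper enum).
#[derive(Debug, PartialEq)]
enum DateParser {
    /// Parse with NaiveDateTime: the format has no timezone component.
    Naive,
    /// Parse with DateTime: the format carries a timezone component.
    WithTimezone,
}

/// Returns true if `format` contains a timezone specifier.
/// Assumption: only `%z`, its `%:z`-style and `%#z` variants, and `%Z` count.
fn format_has_timezone(format: &str) -> bool {
    let mut chars = format.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '%' {
            continue;
        }
        match chars.next() {
            // `%%` is a literal percent sign, not a specifier.
            Some('%') => {}
            Some('z') | Some('Z') => return true,
            // `%:z`, `%::z`, and `%#z` are offset variants of `%z`.
            Some(':') | Some('#') => {
                while chars.peek() == Some(&':') {
                    chars.next();
                }
                if chars.peek() == Some(&'z') {
                    return true;
                }
            }
            _ => {}
        }
    }
    false
}

/// Picks the parser kind for a given format string.
fn choose_parser(format: &str) -> DateParser {
    if format_has_timezone(format) {
        DateParser::WithTimezone
    } else {
        DateParser::Naive
    }
}
```

With this sketch, a format such as `%Y-%m-%d %H:%M:%S` selects the naive parser, while appending `%z` selects the timezone-aware one.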

5 months agoUpdate Union Array to add `UnionMode`, match latest Arrow Spec, and rename `new...
Andrew Lamb [Sun, 2 Jan 2022 14:37:45 +0000 (09:37 -0500)] 
Update Union Array to add `UnionMode`,  match latest Arrow Spec, and rename `new` -> `unsafe new_unchecked()` (#885)

* Update union array to new null handling

* Update arrow/src/array/array_union.rs

* correct comment

5 months agoAdd kernel and tests (#1125)
Matthew Turner [Sun, 2 Jan 2022 14:24:50 +0000 (09:24 -0500)] 
Add kernel and tests (#1125)

5 months agoAdd kernel and tests (#1123)
Matthew Turner [Sun, 2 Jan 2022 14:24:18 +0000 (09:24 -0500)] 
Add kernel and tests (#1123)

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
5 months agoAdd kernel and tests (#1122)
Matthew Turner [Sun, 2 Jan 2022 13:45:08 +0000 (08:45 -0500)] 
Add kernel and tests (#1122)

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
5 months agoAdd neq dyn scalar kernel (#1118)
Matthew Turner [Sun, 2 Jan 2022 13:02:22 +0000 (08:02 -0500)] 
Add neq dyn scalar kernel (#1118)

* Add lt_dyn_scalar and tests

* Add lt_eq_dyn_scalar kernel

* Add gt_dyn_scalar kernel

* Add gt_eq_dyn_scalar kernel

* Add neq_dyn_scalar kernel

* Add kernel to err message

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
5 months agoAdd gt eq dyn scalar kernel (#1117)
Matthew Turner [Sun, 2 Jan 2022 12:08:24 +0000 (07:08 -0500)] 
Add gt eq dyn scalar kernel (#1117)

* Add lt_dyn_scalar and tests

* Add lt_eq_dyn_scalar kernel

* Add gt_dyn_scalar kernel

* Add gt_eq_dyn_scalar kernel

* Add kernel to err message

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
5 months agoAdd gt dyn scalar kernel (#1116)
Matthew Turner [Sun, 2 Jan 2022 11:47:28 +0000 (06:47 -0500)] 
Add gt dyn scalar kernel (#1116)

* Add gt_dyn_scalar kernel

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
5 months agoAdd lt eq dyn scalar kernel (#1115)
Matthew Turner [Sun, 2 Jan 2022 11:33:51 +0000 (06:33 -0500)] 
Add lt eq dyn scalar kernel (#1115)

* Add lt_dyn_scalar and tests

* Add lt_eq_dyn_scalar kernel

* Add kernel to error message

* fix merge problem

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
5 months agoAdd kernel and tests (#1121)
Matthew Turner [Sun, 2 Jan 2022 11:25:00 +0000 (06:25 -0500)] 
Add kernel and tests (#1121)

5 months agoAdd kernel and tests (#1124)
Matthew Turner [Sun, 2 Jan 2022 11:11:00 +0000 (06:11 -0500)] 
Add kernel and tests (#1124)

5 months agoAdd lt dyn scalar kernel (#1114)
Matthew Turner [Sun, 2 Jan 2022 11:08:36 +0000 (06:08 -0500)] 
Add lt dyn scalar kernel (#1114)

* Add lt_dyn_scalar and tests

* Add kernel to error message

5 months agofix bug: error type for BufferBuilder (#1104)
Kun Liu [Sun, 2 Jan 2022 11:06:40 +0000 (19:06 +0800)] 
fix bug: error type for BufferBuilder (#1104)

* fix bug: error type for BufferBuilder

* fix clippy

5 months agoDefine eq_dyn_scalar API (#1074)
Matthew Turner [Sat, 1 Jan 2022 12:06:08 +0000 (07:06 -0500)] 
Define eq_dyn_scalar API (#1074)

* Squash

* Cleanup error messages

5 months agoMutableArrayData support extend decimal data type (#1100)
Kun Liu [Wed, 29 Dec 2021 19:51:05 +0000 (03:51 +0800)] 
MutableArrayData support extend decimal data type (#1100)

* support extend decimal data type

* add more test

5 months agoPrint the 'FixedSizeBinaryArray' like a normal 'BinaryArray' (#1097)
Francis Le Roy [Wed, 29 Dec 2021 19:50:40 +0000 (20:50 +0100)] 
Print the 'FixedSizeBinaryArray' like a normal 'BinaryArray' (#1097)

* Print the 'FixedBinaryArray' like a normal 'BinaryArray'

* apply cargo fmt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
5 months agoimplement eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn for timestamp types...
Liang-Chi Hsieh [Wed, 29 Dec 2021 13:20:27 +0000 (05:20 -0800)] 
implement eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn for timestamp types (#1095)

* implement eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn for timestamp types

* Simplify test code

5 months agoAllow proc-macro2 dependency to be flexible (#1102)
Andrew Lamb [Wed, 29 Dec 2021 12:01:22 +0000 (07:01 -0500)] 
Allow proc-macro2 dependency to be flexible (#1102)

6 months agosupport cast decimal to decimal (#1084)
Kun Liu [Thu, 23 Dec 2021 13:53:00 +0000 (21:53 +0800)] 
support cast decimal to decimal (#1084)

* support cast decimal to decimal

* add test case

* remove meaningless code

6 months agoFix like regex escaping (#1085)
Daniël Heres [Wed, 22 Dec 2021 17:54:30 +0000 (18:54 +0100)] 
Fix like regex escaping (#1085)

* Fix like regex escaping

* Fix like regex escaping

* Fix doctest

* Simplify

6 months agosupport cast decimal to signed numeric (#1073)
Kun Liu [Wed, 22 Dec 2021 16:43:44 +0000 (00:43 +0800)] 
support cast decimal to signed numeric (#1073)

* add cast test macro function; refactor other type to decimal type; add decimal to signed numeric type
support decimal to unsigned numeric

* address the comments and fix the clippy

6 months agoUpdate pyo3 to 0.15 (#1076)
dbr/Ben [Wed, 22 Dec 2021 16:36:06 +0000 (03:36 +1100)] 
Update pyo3 to 0.15 (#1076)

* Update pyo3 to 0.15

* Update pyo3 in integration tests also

6 months agoparquet: Use constant for RLE decoder buffer size (#1070)
Andrew Lamb [Tue, 21 Dec 2021 11:52:56 +0000 (06:52 -0500)] 
parquet: Use constant for RLE decoder buffer size (#1070)

6 months agoAdd Schema::project and RecordBatch::project functions (#1033)
Stephen Carman [Mon, 20 Dec 2021 16:48:43 +0000 (11:48 -0500)] 
Add Schema::project and RecordBatch::project functions  (#1033)

* Allow Schema and RecordBatch to project schemas on specific columns returning a new schema with those columns only

* Addressing PR updates and adding a test for out of range projection

* switch to &[usize]

* fix: clippy and fmt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
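
The projection behaviour described in this commit (select columns by index from a `&[usize]`, fail on an out-of-range index) can be sketched without the arrow crate. The `Field` struct and the free function `project` below are simplified stand-ins introduced for illustration; the error type and message are assumptions, not the crate's actual ones.

```rust
/// Simplified stand-in for a schema field (hypothetical; name only).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
}

/// Returns the fields at `indices`, in the order given.
/// Any index past the end of `fields` yields an error instead of a panic.
fn project(fields: &[Field], indices: &[usize]) -> Result<Vec<Field>, String> {
    indices
        .iter()
        .map(|&i| {
            fields.get(i).cloned().ok_or_else(|| {
                format!(
                    "project index {} out of bounds, schema has {} fields",
                    i,
                    fields.len()
                )
            })
        })
        .collect()
}

/// Builds a three-column example schema: a, b, c.
fn example_fields() -> Vec<Field> {
    ["a", "b", "c"]
        .iter()
        .map(|n| Field { name: n.to_string() })
        .collect()
}
```

Collecting an iterator of `Result` values stops at the first error, so a single bad index rejects the whole projection.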
6 months agoBox RleDecoder index buffer (#1061) (#1062)
Raphael Taylor-Davies [Mon, 20 Dec 2021 16:46:23 +0000 (16:46 +0000)] 
Box RleDecoder index buffer (#1061) (#1062)

* Box RleDecoder index buffer (#1061)

* Format

6 months agosupport cast signed numeric to decimal (#1044)
Kun Liu [Mon, 20 Dec 2021 16:36:53 +0000 (00:36 +0800)] 
support cast signed numeric to decimal (#1044)

* support cast signed numeric to decimal

* add test for i8,i16,i32,i64,f32,f64 cast to decimal

* change format of float64

* add none test; merge integer test together

6 months agofix(compute): LIKE escape parenthesis (#1042)
Dmitry Patsura [Mon, 20 Dec 2021 16:31:52 +0000 (19:31 +0300)] 
fix(compute): LIKE escape parenthesis (#1042)

Signed-off-by: Dmitry Patsura <talk@dmtry.me>
6 months agoAdd MONTH_DAY_NANO interval type, impl `ArrowNativeType` for `i128` (#779)
baishen [Mon, 20 Dec 2021 14:41:00 +0000 (22:41 +0800)] 
Add MONTH_DAY_NANO interval type, impl `ArrowNativeType` for `i128` (#779)

* support interval MonthDayNano

* fix

* fix

* fix

* fix test

* add IPC integration test

* fix rat

* update patch

* fix

* fmt

* fix

* fix

* fix

* fix

* fix

* fix

* remove integration-testing/unskip.patch

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
6 months agoBooleanBufferBuilder correct buffer length (#1051) (#1052)
Raphael Taylor-Davies [Mon, 20 Dec 2021 13:28:17 +0000 (13:28 +0000)] 
BooleanBufferBuilder correct buffer length (#1051) (#1052)

6 months agoAddress benchmarks that aren't compiling (#1001)
Carol (Nichols || Goulding) [Fri, 17 Dec 2021 19:23:33 +0000 (14:23 -0500)] 
Address benchmarks that aren't compiling (#1001)

* Add a CI job that checks benchmarks (but doesn't run them)

* The feature test_common must be turned on to build parquet benchmarks

* Align cache keys

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
6 months agoRemove outdated safety example from doc (#1050)
Andrew Lamb [Fri, 17 Dec 2021 15:38:07 +0000 (10:38 -0500)] 
Remove outdated safety example from doc (#1050)

6 months agoUse existing array type in `take` kernel (#1046)
Max Burke [Fri, 17 Dec 2021 11:20:49 +0000 (03:20 -0800)] 
Use existing array type in `take` kernel (#1046)

* Need to use type from data so that we do not lose, for example, timezone information

* add test for take preserving timezone

6 months agoAvoid allocating vector of indices in lexicographical_partition_ranges (#998)
Jörn Horstmann [Wed, 15 Dec 2021 19:56:45 +0000 (20:56 +0100)] 
Avoid allocating vector of indices in lexicographical_partition_ranges (#998)

* Avoid allocating vector of indices in lexicographical_partition_ranges

* Adjust comments

* Improve comments and remove one unneeded parameter

6 months agoMark `MutableBuffer::typed_data_mut` unsafe (#1029)
Andrew Lamb [Wed, 15 Dec 2021 19:56:16 +0000 (14:56 -0500)] 
Mark `MutableBuffer::typed_data_mut` unsafe (#1029)

* Mark `MutableBuffer::typed_data_mut` unsafe

* fmt

* Mark use of `typed_data_mut` as unsafe in simd kernels

6 months agoExtract method to drive PageIterator -> RecordReader (#1031)
Raphael Taylor-Davies [Tue, 14 Dec 2021 19:41:18 +0000 (19:41 +0000)] 
Extract method to drive PageIterator -> RecordReader (#1031)

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
6 months agoSimplify parquet arrow `RecordReader` (#1021)
Raphael Taylor-Davies [Mon, 13 Dec 2021 21:44:47 +0000 (21:44 +0000)] 
Simplify parquet arrow `RecordReader` (#1021)

6 months agoClarify governance of arrow crate (#1030)
Andrew Lamb [Sun, 12 Dec 2021 21:04:33 +0000 (16:04 -0500)] 
Clarify governance of arrow crate (#1030)

6 months agoForce new cargo and target caching to fix CI (#1023)
Andrew Lamb [Fri, 10 Dec 2021 15:11:45 +0000 (10:11 -0500)] 
Force new cargo and target caching to fix CI (#1023)

6 months agoFix: fixes a broken link and some missing styling in the main arrow crate docs (...
Adam Gutglick [Thu, 9 Dec 2021 18:18:49 +0000 (20:18 +0200)] 
Fix: fixes a broken link and some missing styling in the main arrow crate docs (#1013)

6 months agoRemove out of date comment (#1008)
Andrew Lamb [Mon, 6 Dec 2021 20:39:52 +0000 (15:39 -0500)] 
Remove out of date comment (#1008)

6 months agoMinimize features of indexmap and chrono (#1000)
Carol (Nichols || Goulding) [Sat, 4 Dec 2021 15:28:30 +0000 (10:28 -0500)] 
Minimize features of indexmap and chrono (#1000)

* Disable default features of chrono; only enable features needed

Chrono's default features contain "oldtime", which is deprecated.
According to [the docs](https://docs.rs/chrono/0.4.19/chrono/#duration),

> new code should disable the oldtime feature and use the
> chrono::Duration type instead. The oldtime feature is enabled by
> default for backwards compatibility, but future versions of Chrono
> are likely to remove the feature entirely.

so follow that recommendation by setting default-features to false. And
actually, only Arrow needs the "clock" feature, so all the other
features can stay off too to minimize the feature set that projects
depending on arrow or parquet are forced to enable.

* Explicitly enable indexmap's "std" feature

The indexmap crate uses the autocfg crate to do target detection to
determine whether `std` is available. Arrow isn't targeting `no_std`
environments, so the target detection isn't necessary. This might save
some build time.

https://github.com/bluss/indexmap/pull/145

6 months agoDocstrings for Timestamp*Array. (#988)
Navin [Sat, 4 Dec 2021 15:21:47 +0000 (02:21 +1100)] 
Docstrings for Timestamp*Array. (#988)

* Docstrings for TimestampSecondArray.

* fixup! Docstrings for TimestampSecondArray.

6 months agoUpdate rust version to 1.57 (#1003)
Carlos [Sat, 4 Dec 2021 15:13:19 +0000 (23:13 +0800)] 
Update rust version to 1.57 (#1003)

6 months agoAdd full data validation for ArrayData::try_new() (#921)
Andrew Lamb [Sat, 4 Dec 2021 11:43:45 +0000 (06:43 -0500)] 
Add full data validation for ArrayData::try_new() (#921)

* Add full data validation for ArrayData::try_new()

* Only look at offset+len indexes

Co-authored-by: Jörn Horstmann <git@jhorstmann.net>
* fix test

* fmt

* test for array indexes

Co-authored-by: Jörn Horstmann <git@jhorstmann.net>
6 months agoRemove unneeded `rc` feature of serde (#990)
Carol (Nichols || Goulding) [Fri, 3 Dec 2021 21:58:04 +0000 (16:58 -0500)] 
Remove unneeded `rc` feature of serde (#990)

Fixes #989.

This feature opts into impls for `Rc` and `Arc`, but none of the data
structures that use Serialize/Deserialize actually contain `Rc` or
`Arc`s.

See:

- [Serde docs](https://serde.rs/feature-flags.html#-features-rc)
- [PR adding this](https://github.com/apache/arrow/pull/3016)

6 months agoFix warnings introduced by Rust/Clippy 1.57.0 (#992)
Carol (Nichols || Goulding) [Fri, 3 Dec 2021 15:14:32 +0000 (10:14 -0500)] 
Fix warnings introduced by Rust/Clippy 1.57.0 (#992)

* Remove needless borrows identified by clippy

https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

* Remove muts that are no longer needed

* Derive Default instead of using an equivalent manual impl

Identified by clippy.

https://rust-lang.github.io/rust-clippy/master/index.html#derivable_impls

* Remove redundant closures

Identified by clippy.

https://rust-lang.github.io/rust-clippy/master/index.html#redundant_closure

* Allow dead code on a field Rust now identifies as never read

6 months agoChange `pretty_format_batches` to return `Result<impl Display>` (#975)
Matthew Turner [Wed, 1 Dec 2021 14:33:14 +0000 (09:33 -0500)] 
Change `pretty_format_batches` to return `Result<impl Display>` (#975)

* Create write_batches function

* Update pretty_format_batches and pretty_format_columns to return Display

* fix import warning

* Fix compile error

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
6 months agofix some typos in code and comments (#985)
Jiayu Liu [Mon, 29 Nov 2021 21:10:45 +0000 (05:10 +0800)] 
fix some typos in code and comments (#985)

6 months agoUpgrading parquet-format to 4.0.0 (#979)
Kesav Kolla [Mon, 29 Nov 2021 21:09:46 +0000 (02:39 +0530)] 
Upgrading parquet-format to 4.0.0 (#979)

* Upgrading parquet-format to 4.0.0
Fixes issue #978

* Update basic.rs

* Update parquet/src/basic.rs

fix clippy

Co-authored-by: Kesav Kumar Kolla <kesav@314ecorp.com>
Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
6 months agoadd support for f16 (#888)
Jiayu Liu [Mon, 29 Nov 2021 13:08:13 +0000 (21:08 +0800)] 
add support for f16 (#888)

6 months agoAdd boolean comparison to scalar kernels for less than, greater than (#977)
Carlos [Sat, 27 Nov 2021 11:21:29 +0000 (19:21 +0800)] 
Add boolean comparison to scalar kernels for less than, greater than (#977)

7 months agoFix CI for latest nightly (#970)
Andrew Lamb [Tue, 23 Nov 2021 03:37:57 +0000 (22:37 -0500)] 
Fix CI for latest nightly (#970)

* Fix arrow doc examples

* more cleanup

7 months agoAdding Pretty Print Support For Fixed Size List (#958)
Brian Rackle [Mon, 22 Nov 2021 12:01:47 +0000 (04:01 -0800)] 
Adding Pretty Print Support For Fixed Size List (#958)

* Inferring 2. as Float64 for issue #929

* Adding pretty print support for fixed size list array

* fixing linting errors

* adding null row to test

7 months agoFix bug in temporal utilities due to DST being ignored. (#955)
Navin [Sat, 20 Nov 2021 12:04:58 +0000 (23:04 +1100)] 
Fix bug in temporal utilities due to DST being ignored. (#955)

* Check behaviour of temporal utilities for DST.

* Fix temporal util bug ignoring dst.

* Refactor macro for efficiency.

7 months agoFix primitive sort when input contains more nulls than the given sort limit (#954)
Jörn Horstmann [Thu, 18 Nov 2021 22:26:24 +0000 (23:26 +0100)] 
Fix primitive sort when input contains more nulls than the given sort limit (#954)

7 months agoUpdate comfy-table to 5.0 (#957)
Carol (Nichols || Goulding) [Thu, 18 Nov 2021 18:10:19 +0000 (13:10 -0500)] 
Update comfy-table to 5.0 (#957)

7 months agoadd more error test case and change the code style (#952)
Kun Liu [Wed, 17 Nov 2021 12:15:18 +0000 (20:15 +0800)] 
add more error test case and change the code style (#952)

7 months agoSupport read decimal data from csv reader if user provide the schema with decimal...
Kun Liu [Tue, 16 Nov 2021 16:43:29 +0000 (00:43 +0800)] 
Support read decimal data from csv reader if user provide the schema with decimal data type (#941)

* support decimal data type for csv reader

* format code and fix lint check

* fix the clippy error

* enhance csv-to-decimal parsing and add more tests

7 months agoFix csv writing of timestamps to show timezone. (#849)
Navin [Tue, 16 Nov 2021 11:11:19 +0000 (22:11 +1100)] 
Fix csv writing of timestamps to show timezone. (#849)

* Write timestamps (in csvs) with timezone.

* More tests and more verbose naming.

* Please linter.

* Please clippy.

* Cleanup based on review feedback.

7 months agoInferring 2. as Float64 for issue #929 (#950)
Brian Rackle [Sat, 13 Nov 2021 13:00:47 +0000 (05:00 -0800)] 
Inferring 2. as Float64 for issue #929 (#950)

7 months agoFix validation for offsets of StructArrays (#942)
Andrew Lamb [Fri, 12 Nov 2021 11:16:35 +0000 (06:16 -0500)] 
Fix validation for offsets of StructArrays (#942)

* reproduce validation error

* Fix validation bug

Co-authored-by: Ben Chambers <bjchambers@gmail.com>
7 months agoimplement take kernel for null arrays (#939)
Ben Chambers [Thu, 11 Nov 2021 11:47:33 +0000 (03:47 -0800)] 
implement take kernel for null arrays (#939)

7 months agoadd checker for appending i128 to decimal builder (#928)
Kun Liu [Wed, 10 Nov 2021 11:40:29 +0000 (19:40 +0800)] 
add checker for appending i128 to decimal builder (#928)

* add check for appending i128 to decimal builder

* remove the ArrowError(DecimalError)

7 months agoAdding ability to parse float from number with leading decimal (#831)
Brian Rackle [Tue, 9 Nov 2021 22:59:37 +0000 (14:59 -0800)] 
Adding ability to parse float from number with leading decimal (#831)

* Adding ability to parse float from number with leading decimal

* Fixing deprecated std::usize::MAX constant per https://doc.rust-lang.org/core/usize/constant.MAX.html and making consistent with other usages

* Add test case for 2. and issue link

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
7 months agoadd ilike comparator (#874)
Jordan Deitch [Tue, 9 Nov 2021 13:56:18 +0000 (08:56 -0500)] 
add ilike comparator (#874)

* add ilike comparator

* add ilike comparator

Co-authored-by: Jordan Deitch <jdeitch@digitalocean.com>
7 months agofeat(ipc): add support for deserializing messages with nested dictionary fields ...
Helgi Kristvin Sigurbjarnarson [Mon, 8 Nov 2021 21:32:33 +0000 (13:32 -0800)] 
feat(ipc): add support for deserializing messages with nested dictionary fields (#923)

* feat(ipc): read a message containing nested dictionary fields

* Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* address lints

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
7 months agoValidate arguments to ArrayData::new and null bit buffer and buffers (#810)
Andrew Lamb [Mon, 8 Nov 2021 19:11:10 +0000 (14:11 -0500)] 
Validate arguments to ArrayData::new and null bit buffer and buffers (#810)

* Validate arguments to ArrayData::new: null bit buffer and buffers

* Rename is_int_type to is_dictionary_key_type()

* Correctly handle self.offset in offsets buffer

* Consolidate checks

* Fix test output

7 months agoAutomatically retry failed MIRI runs to work around intermittent failures (#922)
Andrew Lamb [Sat, 6 Nov 2021 09:55:53 +0000 (05:55 -0400)] 
Automatically retry failed MIRI runs to work around intermittent failures (#922)

* Move MIRI checks into a shell script

* add retry loop

7 months agoMark boolean kernels public (#913)
Andrew Lamb [Thu, 4 Nov 2021 20:56:54 +0000 (16:56 -0400)] 
Mark boolean kernels public (#913)

7 months agodoc example mistype (#904)
kingeasternsun [Wed, 3 Nov 2021 19:54:29 +0000 (03:54 +0800)] 
doc example  mistype (#904)

7 months agoUpdate mod.rs (#909)
kingeasternsun [Wed, 3 Nov 2021 19:53:48 +0000 (03:53 +0800)] 
Update mod.rs (#909)

7 months agoBump deps (#864)
Chojan Shang [Tue, 2 Nov 2021 20:33:06 +0000 (04:33 +0800)] 
Bump deps (#864)

* Bump deps

Signed-off-by: Chojan Shang <psiace@outlook.com>
* Setup latest cargo-tarpaulin

Signed-off-by: Chojan Shang <psiace@outlook.com>
* Try to use the latest cargo

Signed-off-by: Chojan Shang <psiace@outlook.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
7 months agofix some clippy warnings (#896)
Jiayu Liu [Tue, 2 Nov 2021 13:10:25 +0000 (21:10 +0800)] 
fix some clippy warnings (#896)

7 months agoFix clippy (#900)
Andrew Lamb [Mon, 1 Nov 2021 14:43:32 +0000 (10:43 -0400)] 
Fix clippy (#900)

7 months agoDisable cargo / build caching for MIRI runs (#899)
Andrew Lamb [Mon, 1 Nov 2021 11:22:11 +0000 (07:22 -0400)] 
Disable cargo / build caching for MIRI runs (#899)

7 months agoFix instances of UB that cause tests to not pass under miri (#878)
Ben Kimock [Mon, 1 Nov 2021 11:09:57 +0000 (07:09 -0400)] 
Fix instances of UB that cause tests to not pass under miri (#878)

* Fix unaligned access in bit-packing

* Fix creation of unaligned reference in murmur_hash2_64a

* Remove now-unnecessary unsafe

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
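
The unaligned-access fixes listed above follow a general pattern that can be sketched as follows. This is not the code from the commit: `read_u64_unaligned` and `read_u64_safe` are hypothetical helpers showing two well-defined ways to read a `u64` from a byte slice at an arbitrary offset, instead of dereferencing a `*const u64` that may not be 8-byte aligned (which is undefined behaviour and is flagged by miri).

```rust
/// Reads a native-endian u64 starting at `offset`, with no alignment requirement.
/// Uses `std::ptr::read_unaligned`, which is defined for unaligned pointers.
fn read_u64_unaligned(bytes: &[u8], offset: usize) -> u64 {
    let in_bounds = offset
        .checked_add(8)
        .map_or(false, |end| end <= bytes.len());
    assert!(in_bounds, "read of 8 bytes at {} is out of bounds", offset);
    // SAFETY: the 8 bytes starting at `offset` are inside `bytes` (checked
    // above), and `read_unaligned` does not require the pointer to be aligned.
    unsafe { std::ptr::read_unaligned(bytes.as_ptr().add(offset) as *const u64) }
}

/// Same result without any `unsafe`: copy the bytes into an aligned array first.
fn read_u64_safe(bytes: &[u8], offset: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[offset..offset + 8]);
    u64::from_ne_bytes(buf)
}
```

The safe variant is usually preferable; the optimizer typically compiles both to the same load.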
7 months agotest moving out (#895)
Jiayu Liu [Mon, 1 Nov 2021 10:44:44 +0000 (18:44 +0800)] 
test moving out (#895)

7 months agocasting kernel can combine multi-match patterns (#883)
Jiayu Liu [Mon, 1 Nov 2021 03:25:56 +0000 (11:25 +0800)] 
casting kernel can combine multi-match patterns (#883)

7 months agofix some warning about unused variables in panic tests (#894)
Jiayu Liu [Mon, 1 Nov 2021 03:24:56 +0000 (11:24 +0800)] 
fix some warning about unused variables in panic tests (#894)

7 months agofix ffi warning on failed to drop (#893)
Jiayu Liu [Mon, 1 Nov 2021 03:24:39 +0000 (11:24 +0800)] 
fix ffi warning on failed to drop (#893)

7 months agoportable check for shasums (#887)
Benson Muite [Mon, 1 Nov 2021 01:14:57 +0000 (04:14 +0300)] 
portable check for shasums (#887)

similar to https://github.com/apache/arrow/pull/11531

7 months agoallow null array to be cast to all other types (#884)
Jiayu Liu [Mon, 1 Nov 2021 01:14:18 +0000 (09:14 +0800)] 
allow null array to be cast to all other types (#884)

7 months ago2018 -> 2021 (#591)
Jiayu Liu [Mon, 1 Nov 2021 01:13:45 +0000 (09:13 +0800)] 
2018 -> 2021 (#591)

7 months agoUse different caching for MIRI runs (#892)
Andrew Lamb [Sun, 31 Oct 2021 12:35:16 +0000 (08:35 -0400)] 
Use different caching for MIRI runs (#892)

* Use different caching for MIRI runs

* Also cache cargo

7 months agoRemove unpassable cargo publish check from verify-release-candidate.sh (#882)
Andrew Lamb [Fri, 29 Oct 2021 13:58:43 +0000 (09:58 -0400)] 
Remove unpassable cargo publish check from verify-release-candidate.sh (#882)

7 months agofeat(ipc): Support writing dictionaries nested in structs and unions (#870)
Helgi Kristvin Sigurbjarnarson [Fri, 29 Oct 2021 13:11:40 +0000 (06:11 -0700)] 
feat(ipc): Support writing dictionaries nested in structs and unions (#870)

* feat(ipc): Support for writing dictionaries nested in structs and unions

Dictionaries are lost when serializing a RecordBatch for IPC, producing
invalid arrow data. This PR changes encoded_batch to recursively find
all dictionary fields within the schema (currently only in structs and
unions) so nested dictionaries are properly serialized.

* address lint and clippy
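
The recursive search for dictionary fields described in this commit can be sketched with a toy type tree. The `DataType` and `Field` definitions below are simplified stand-ins (assumptions for illustration, not the arrow crate's types), and `collect_dictionary_fields` is a hypothetical name; as in the commit text, recursion only descends into structs and unions.

```rust
/// Simplified stand-in for a data type (hypothetical subset).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    /// Dictionary(key type, value type)
    Dictionary(Box<DataType>, Box<DataType>),
    Struct(Vec<Field>),
    Union(Vec<Field>),
}

/// Simplified stand-in for a schema field.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

/// Appends `field` to `out` if it is dictionary-typed; otherwise descends
/// into struct and union children and does the same for each of them.
fn collect_dictionary_fields<'a>(field: &'a Field, out: &mut Vec<&'a Field>) {
    match &field.data_type {
        DataType::Dictionary(_, _) => out.push(field),
        DataType::Struct(children) | DataType::Union(children) => {
            for child in children {
                collect_dictionary_fields(child, out);
            }
        }
        _ => {}
    }
}

/// Names of all dictionary fields in a schema, in depth-first order.
fn dictionary_field_names(fields: &[Field]) -> Vec<String> {
    let mut out = Vec::new();
    for f in fields {
        collect_dictionary_fields(f, &mut out);
    }
    out.iter().map(|f| f.name.clone()).collect()
}

fn field(name: &str, data_type: DataType) -> Field {
    Field { name: name.to_string(), data_type }
}

/// Example schema with one top-level and two nested dictionary fields.
fn example_schema() -> Vec<Field> {
    let dict = || DataType::Dictionary(Box::new(DataType::Int32), Box::new(DataType::Utf8));
    vec![
        field("id", DataType::Int32),
        field("tag", dict()),
        field(
            "point",
            DataType::Struct(vec![field("x", DataType::Int32), field("inner_tag", dict())]),
        ),
        field("either", DataType::Union(vec![field("u_tag", dict())])),
    ]
}
```

A non-recursive scan of the top-level fields would find only `tag` here, which is the kind of loss the commit message describes.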

7 months agobump .github/workflows/miri.yaml (#875)
Jiayu Liu [Wed, 27 Oct 2021 15:00:55 +0000 (23:00 +0800)] 
bump .github/workflows/miri.yaml (#875)

7 months agofix: fix a bug in offset calculation for unions (#863)
Helgi Kristvin Sigurbjarnarson [Tue, 26 Oct 2021 20:29:27 +0000 (13:29 -0700)] 
fix: fix a bug in offset calculation for unions (#863)

The `value_offset` function only read the least significant byte in the
offset array, causing issues with unions with more than 255 rows of any
given variant. Fix the issue by reading the entire i32 offset and add a
unit test.
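
The bug described above can be reproduced in a small sketch. This is not the code from `array_union.rs`: it assumes, for illustration only, that the offsets are stored as little-endian `i32` values in a raw byte buffer, and the function names are hypothetical.

```rust
/// Encodes i32 offsets as a little-endian byte buffer (illustrative layout).
fn offsets_to_bytes(offsets: &[i32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(offsets.len() * 4);
    for offset in offsets {
        bytes.extend_from_slice(&offset.to_le_bytes());
    }
    bytes
}

/// Buggy variant: looks only at the least significant byte of each offset,
/// so any offset above 255 is silently truncated.
fn value_offset_lsb_only(buffer: &[u8], index: usize) -> i32 {
    buffer[index * 4] as i32
}

/// Fixed variant: reads all four bytes of the i32 offset.
fn value_offset(buffer: &[u8], index: usize) -> i32 {
    let s = index * 4;
    i32::from_le_bytes([buffer[s], buffer[s + 1], buffer[s + 2], buffer[s + 3]])
}
```

For offsets up to 255 the two variants agree, which is why the problem only shows up once a variant has more rows than that; an offset of 256 reads back as 0 in the buggy variant.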