Andrew Lamb [Fri, 21 Jan 2022 12:06:57 +0000 (07:06 -0500)]
Prepare for 8.0.0 release: Update CHANGELOG and versions (#1212)
* Update version to 8.0.0
* Update Changelog for 8.0.0
* restore RAT
* Remove items that were released in 7.0.0
Andrew Lamb [Fri, 21 Jan 2022 11:58:45 +0000 (06:58 -0500)]
Improve changelog generator script settings (#1210)
* Update changelog script
* fix typo
Yang [Wed, 19 Jan 2022 21:28:52 +0000 (05:28 +0800)]
Return error from JSON writer rather than panic (#1205)
* Return error from JSON writer rather than panic
* fix comment
Helgi Kristvin Sigurbjarnarson [Wed, 19 Jan 2022 21:28:31 +0000 (13:28 -0800)]
fix a bug in variable sized equality (#1209)
A missing validity buffer was being treated as all values being null,
rather than all values being valid, causing equality to fail on some
equivalent string and binary arrays.
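
For illustration, the rule described above (no validity buffer means every value is valid) can be sketched in plain Rust. This is a minimal sketch, not the arrow-rs implementation: `is_valid` and `columns_equal` are hypothetical helpers, and the validity bitmap is assumed to be least-significant-bit first.

```rust
/// Returns true if the value at `index` is valid (non-null).
/// ASSUMPTION: `validity` is an LSB-first bitmap; `None` means "no nulls".
fn is_valid(validity: Option<&[u8]>, index: usize) -> bool {
    match validity {
        // A missing validity buffer means every value is valid, not null.
        None => true,
        Some(bits) => bits[index / 8] & (1 << (index % 8)) != 0,
    }
}

/// Hypothetical logical equality over two nullable string columns.
fn columns_equal(
    lhs: &[&str],
    lhs_validity: Option<&[u8]>,
    rhs: &[&str],
    rhs_validity: Option<&[u8]>,
) -> bool {
    if lhs.len() != rhs.len() {
        return false;
    }
    (0..lhs.len()).all(|i| {
        match (is_valid(lhs_validity, i), is_valid(rhs_validity, i)) {
            // Both valid: compare the actual values.
            (true, true) => lhs[i] == rhs[i],
            // Both null: equal regardless of the underlying bytes.
            (false, false) => true,
            // One null, one valid: not equal.
            _ => false,
        }
    })
}
```

With this rule, a column without a validity buffer compares equal to the same values carrying an explicit all-ones bitmap.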
Andrew Lamb [Wed, 19 Jan 2022 18:17:48 +0000 (13:17 -0500)]
Update parquet crate readme (#1192)
* Update parquet crate readme
* prettier
Remzi Yang [Wed, 19 Jan 2022 18:06:47 +0000 (02:06 +0800)]
Add comparison support for fully qualified BinaryArray (#1195)
* add eq_dyn for BinaryArray
Signed-off-by: remzi <13716567376yh@gmail.com>
* correct the code formatting
Signed-off-by: remzi <13716567376yh@gmail.com>
* add comparison support for fully qualified binary array
delete dyn comparison which will be added in successive PRs
Signed-off-by: remzi <13716567376yh@gmail.com>
* add tests for comparison of fully qualified BinaryArray
Signed-off-by: remzi <13716567376yh@gmail.com>
* add 2 missed tests
Signed-off-by: remzi <13716567376yh@gmail.com>
* move 2 functions
Signed-off-by: remzi <13716567376yh@gmail.com>
* fix reference error
Signed-off-by: remzi <13716567376yh@gmail.com>
Edd Robinson [Wed, 19 Jan 2022 12:19:29 +0000 (12:19 +0000)]
feat: add support for casting Duration/Interval to Int64Array (#1196)
* feat: add support for casting Duration to Int64Array
* feat: cast from Interval to Int64
Andrew Lamb [Wed, 19 Jan 2022 02:35:43 +0000 (21:35 -0500)]
Pin WASM / packed SIMD tests to nightly-2022-01-17 (#1204)
Helgi Kristvin Sigurbjarnarson [Tue, 18 Jan 2022 21:59:27 +0000 (13:59 -0800)]
bugfix in display of float16 array (#1194)
Due to a typo the float16 array was being cast to a float32 array,
causing a crash when pretty printing a record batch containing float16.
Raphael Taylor-Davies [Tue, 18 Jan 2022 12:13:21 +0000 (12:13 +0000)]
parquet: Optimized ByteArrayReader, Add UTF-8 Validation (#1040) (#1082)
* Optimized ByteArrayReader (#1040)
UTF-8 Validation (#786)
* Fix arrow_array_reader benchmark
* Allow running subset of arrow_array_reader benchmarks
* Faster UTF-8 validation
* Tweak null handling
* Add license
* Refine `ValuesBuffer::pad_nulls`
* Tweak error handling
* Use page null count if available
* Doc comments
* Test DELTA_BYTE_ARRAY encoding
* Support legacy Encoding::PLAIN_DICTIONARY
* Add OffsetBuffer unit tests
Review feedback
* More tests
* Fix lint
* Review feedback
Helgi Kristvin Sigurbjarnarson [Tue, 18 Jan 2022 12:09:12 +0000 (04:09 -0800)]
feat(parquet): support for reading structs nested within lists (#1187)
* feat(parquet): support for reading structs nested within lists
* fix: logical conflict
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Jiayu Liu [Tue, 18 Jan 2022 01:52:01 +0000 (09:52 +0800)]
update nightly version for miri (#1189)
Raphael Taylor-Davies [Mon, 17 Jan 2022 15:51:21 +0000 (15:51 +0000)]
Truncate bitmask on split (#1183)
* Truncate bitmask on split
* Fix BooleanBufferBuilder::resize
* Format
Helgi Kristvin Sigurbjarnarson [Mon, 17 Jan 2022 15:49:14 +0000 (07:49 -0800)]
fix: Fix a bug in how filter indices are calculated (#1185)
* fix: Fix a bug in how filter indices are calculated
Using the definition level and the nullability of the column only
produces the correct indices if max_definition - 1 is the list level.
For deeper nesting (struct in a list) this produces incorrect indices,
silently causing incorrect data to be written.
This fix uses the array offsets to compute the indices instead.
* add assertions
Kun Liu [Mon, 17 Jan 2022 15:42:45 +0000 (23:42 +0800)]
Support DecimalType in sort and take kernels (#1172)
Jiayu Liu [Mon, 17 Jan 2022 15:32:56 +0000 (23:32 +0800)]
add from_iter_values for binary array (#1188)
Raphael Taylor-Davies [Sun, 16 Jan 2022 12:08:35 +0000 (12:08 +0000)]
Use tempfile for parquet tests (#1165)
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Raphael Taylor-Davies [Sat, 15 Jan 2022 18:49:09 +0000 (18:49 +0000)]
Serialize i128 as JSON string (#1175)
Andrew Lamb [Sat, 15 Jan 2022 18:48:15 +0000 (13:48 -0500)]
Add ticket reference for false positive (#1181)
Raphael Taylor-Davies [Sat, 15 Jan 2022 11:34:44 +0000 (11:34 +0000)]
Fix record formatting in 1.58 (#1178)
Helgi Kristvin Sigurbjarnarson [Fri, 14 Jan 2022 18:09:51 +0000 (10:09 -0800)]
Bugfix in parquet writing empty lists of structs (#1166)
Fix a bug in the definition level calculation for fields nested within a
struct and a list. When a list is empty or null in parquet the nested
field gets a null value. However, in arrow, the value is simply missing.
When serializing an immediate child of the list, the list offsets are
used to calculate the correct definition level for its children, but it
is not carried further to fields nested deeper (e.g., fields on a struct
within a list). This (somewhat hacky) fix treats a struct within a list
as if it were a list.
Jörn Horstmann [Fri, 14 Jan 2022 16:17:44 +0000 (17:17 +0100)]
Fix compilation error with simd feature (#1169)
Andrew Lamb [Fri, 14 Jan 2022 16:17:36 +0000 (11:17 -0500)]
Fix new clippy lints introduced in Rust 1.58 (#1170)
Jörn Horstmann [Thu, 13 Jan 2022 18:27:52 +0000 (19:27 +0100)]
Simplify and reduce code duplication in arithmetic kernels (#1161)
* Simplify and reduce code duplication in arithmetic kernels
* Update comments
Andrew Lamb [Thu, 13 Jan 2022 18:14:39 +0000 (13:14 -0500)]
Update dev/release/README for master releases, remove supporting scripts (#1143)
* Update dev/release/README for master releases
* remove cherry pick script
Raphael Taylor-Davies [Thu, 13 Jan 2022 15:18:55 +0000 (15:18 +0000)]
Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037) (#1054)
* Preserve bitmask (#1037)
* Remove now unnecessary box (#1061)
* Fix handling of empty bitmasks
* More docs
* Add nested nullability test case
* Add packed decoder test
Andrew Lamb [Thu, 13 Jan 2022 06:33:24 +0000 (01:33 -0500)]
Remove left over readme file from arrow/arrow-rs split (#1162)
Raphael Taylor-Davies [Wed, 12 Jan 2022 14:44:07 +0000 (14:44 +0000)]
Fuzz test different parquet encodings (#1156)
Liang-Chi Hsieh [Wed, 12 Jan 2022 12:22:42 +0000 (04:22 -0800)]
Add subtract_scalar kernel (#1152)
* Add subtract_scalar
* Rebase
Liang-Chi Hsieh [Wed, 12 Jan 2022 12:21:37 +0000 (04:21 -0800)]
Add multiply_scalar (#1159)
Helgi Kristvin Sigurbjarnarson [Tue, 11 Jan 2022 19:19:38 +0000 (11:19 -0800)]
feat(json): support for map arrays in json writer (#1149)
Liang-Chi Hsieh [Tue, 11 Jan 2022 19:18:52 +0000 (11:18 -0800)]
Add add_scalar kernel (#1151)
* Add add_scalar
* move simd_float_unary_math_op to simd_unary_math_op
Andrew Lamb [Tue, 11 Jan 2022 19:10:01 +0000 (14:10 -0500)]
Document safety justification of some uses of `from_trusted_len_iter` (#1148)
Liang-Chi Hsieh [Tue, 11 Jan 2022 19:08:52 +0000 (11:08 -0800)]
Move simd right out of for_each loop (#1150)
Raphael Taylor-Davies [Tue, 11 Jan 2022 18:01:15 +0000 (18:01 +0000)]
Generify ColumnReaderImpl and RecordReader (#1040) (#1041)
* Simplify record reader
* Generify ColumnReaderImpl and RecordReader (#1040)
* Tweak count_records predicate
* Pre-allocate bitmask
* fix: TypedBuffer::split update len
* Simplify GenericRecordReader
* Move column decoders into module
* Remove `RecordBuffer::create` method
* Remove `TypedBuffer<i16>::count_records`
* Pass null count to `ColumnValueDecoder::read`
* Pull null padding out of column reader
* Review feedback
* Format
* License headers
* Further doc tweaks
* Further docs
* Restrict ScalarBuffer types
Andrew Lamb [Tue, 11 Jan 2022 17:58:02 +0000 (12:58 -0500)]
Remove `GenericStringArray::from_vec` and `GenericStringArray::from_opt_vec` (#1147)
Raphael Taylor-Davies [Tue, 11 Jan 2022 15:12:30 +0000 (15:12 +0000)]
BooleanBufferBuilder::append_packed (#1038) (#1039)
* BooleanBufferBuilder::append_packed (#1038)
* Update docstring
* Add packed_append_range
* Fix capacity
* Use set_bits from transform::util
* Add license
* Format
Raphael Taylor-Davies [Tue, 11 Jan 2022 14:05:03 +0000 (14:05 +0000)]
Improve parquet performance: Skip levels computation for required struct arrays in parquet (#1035)
* Skip levels computation for required struct arrays (#1034)
* Review feedback
Raphael Taylor-Davies [Tue, 11 Jan 2022 14:02:30 +0000 (14:02 +0000)]
Restrict RecordReader and friends to scalar types (#1132) (#1155)
Raphael Taylor-Davies [Tue, 11 Jan 2022 14:00:50 +0000 (14:00 +0000)]
Extend parquet fuzz tests to also test nulls, dictionaries and row groups with multiple pages (#1053) (#1110)
* Parquet fuzz tests (#1053)
* Test multiple WriterVersions
* Revert array_reader change
Raphael Taylor-Davies [Mon, 10 Jan 2022 21:50:31 +0000 (21:50 +0000)]
Move more parquet functionality behind experimental feature flag (#1032) (#1134)
* Move more parquet functionality behind experimental feature flag (#1032)
* Fix logical conflicts
Jörn Horstmann [Mon, 10 Jan 2022 21:49:46 +0000 (22:49 +0100)]
Implement SIMD comparison operations for types with less than 4 lanes (i128) (#1146)
* Implement simd mask creation for 128 bit types
* Adjust comparison kernels to always append 64 bit chunks
* Only append minimal number of bytes
* Add benchmark for MonthDayNano comparison
* Fix typo in comment
Co-authored-by: Paddy Horan <5733408+paddyhoran@users.noreply.github.com>
Andrew Lamb [Mon, 10 Jan 2022 21:47:06 +0000 (16:47 -0500)]
Fix undefined behavior in GenericStringArray::from_iter_values (#1145)
* Fix undefined behavior in GenericStringArray::from_iter_values
* Cleanup code and tests
* clippy
* Fix test
Andrew Lamb [Sat, 8 Jan 2022 10:28:32 +0000 (05:28 -0500)]
Update readme to clarify versioning (#1142)
Andrew Lamb [Sat, 8 Jan 2022 10:19:08 +0000 (05:19 -0500)]
Update version to 7.0.0 and update CHANGELOG (#1141)
* Update changelog generator
* Bring changelog from 6.5.0
* Update changelog
* Update version to 7.0.0
Helgi Kristvin Sigurbjarnarson [Thu, 6 Jan 2022 22:46:16 +0000 (14:46 -0800)]
feat(ipc): support for reading union arrays through IPC (#1140)
Raphael Taylor-Davies [Thu, 6 Jan 2022 22:22:35 +0000 (22:22 +0000)]
Dyn comparison of interval arrays (#1106) (#1107)
* Dyn comparison of interval arrays (#1106)
* fix fmt
* Skip test when simd is enabled
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Helgi Kristvin Sigurbjarnarson [Thu, 6 Jan 2022 22:12:40 +0000 (14:12 -0800)]
feat: union schema serialization/deserialization for ipc (#1135)
Andrew Lamb [Thu, 6 Jan 2022 22:12:23 +0000 (17:12 -0500)]
*_dyn_scalar kernels: Support Float32Array and Float64Array (#1127)
* *_dyn_scalar kernels: Support Float32Array and Float64Array, use ToPrimitive rather than `Into<i128>`, and take `&dyn Array` rather than `ArrayRef`
* Update APIs for *_dyn_bool_scalar kernels
Benson Muite [Thu, 6 Jan 2022 22:11:01 +0000 (01:11 +0300)]
Add more information on SIMD (#1138)
Matthew Turner [Wed, 5 Jan 2022 21:53:32 +0000 (16:53 -0500)]
Add dyn boolean kernels (#1131)
* Add dyn bool kernels
* Add tests
* Update error messages
* Update test
* Fix test
* Update doc strings
Yordan Pavlov [Wed, 5 Jan 2022 19:57:19 +0000 (19:57 +0000)]
Fix reading of dictionary encoded pages with null values (#1111) (#1130)
* fix reading of dictionary encoded pages with null values
* fix linting issues
Raphael Taylor-Davies [Wed, 5 Jan 2022 16:27:27 +0000 (16:27 +0000)]
Make arrow::array_reader private (#1032) (#1133)
Andrew Lamb [Wed, 5 Jan 2022 13:45:27 +0000 (08:45 -0500)]
Implement Array for ArrayRef, Improve as_* kernels to take `&dyn Array` (#1129)
* Implement Array for ArrayRef
* Improve as_* kernels to take &dyn Array
* remove unneeded pyarrow binding
Andrew Lamb [Wed, 5 Jan 2022 12:29:25 +0000 (07:29 -0500)]
Add Schema::with_metadata and Field::with_metadata (#1092)
Sumit [Sun, 2 Jan 2022 16:42:43 +0000 (17:42 +0100)]
allow using custom datetime format for inference and parsing csv file (#1112)
* allow using custom datetime format for inference and parsing csv file
The patch extends the current implementation to allow passing a custom
datetime_re and datetime_format to the ReaderBuilder.
datetime_re is used to infer the schema of the csv, and then datetime_format is
used to parse the actual string to a Date64.
Of course, passing incompatible datetime_re and datetime_format values
will cause parsing or inference to fail; this failure is expected but
hard to detect.
* Incorporate some clippy recommendations for limit count of call args
The patch adds a new struct to collect all these options together and
then passes the struct around. Ideally the struct could be embedded into
the reader but that can be done as separate exercise.
* Detect presence of timezone in format while parsing csv for date64
The patch chooses between NaiveDateTime and DateTime from the chrono lib
based on the presence of timezone components in the format.
chrono expects a timezone to be present if DateTime is used and errors
otherwise, whereas NaiveDateTime ignores the timezone even if it is explicitly
provided.
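
A minimal sketch of how such timezone detection could look, using only the standard library. `format_has_timezone` is a hypothetical helper, not the function added by the patch, and the set of specifiers it checks (`%z`, `%:z`, `%#z`, `%Z`, `%+`) is an assumption based on chrono's strftime-style syntax.

```rust
/// Returns true if a strftime-style format string contains a timezone
/// specifier. ASSUMPTION: `%z`, `%:z`, `%#z`, `%Z` and `%+` are treated as
/// the timezone-carrying specifiers; `%%` is a literal percent sign.
fn format_has_timezone(format: &str) -> bool {
    let mut chars = format.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '%' {
            continue;
        }
        // Skip the optional ':' / '#' modifiers that may precede 'z'.
        while matches!(chars.peek(), Some(&':') | Some(&'#')) {
            chars.next();
        }
        // Consume the specifier character itself (this also swallows the
        // second '%' of an escaped "%%").
        match chars.next() {
            Some('z') | Some('Z') | Some('+') => return true,
            _ => {}
        }
    }
    false
}
```

A caller could then parse with a timezone-aware type when this returns true and with a naive type otherwise.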
Andrew Lamb [Sun, 2 Jan 2022 14:37:45 +0000 (09:37 -0500)]
Update Union Array to add `UnionMode`, match latest Arrow Spec, and rename `new` -> `unsafe new_unchecked()` (#885)
* Update union array to new null handling
* Update arrow/src/array/array_union.rs
* correct comment
Matthew Turner [Sun, 2 Jan 2022 14:24:50 +0000 (09:24 -0500)]
Add kernel and tests (#1125)
Matthew Turner [Sun, 2 Jan 2022 14:24:18 +0000 (09:24 -0500)]
Add kernel and tests (#1123)
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Matthew Turner [Sun, 2 Jan 2022 13:45:08 +0000 (08:45 -0500)]
Add kernel and tests (#1122)
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Matthew Turner [Sun, 2 Jan 2022 13:02:22 +0000 (08:02 -0500)]
Add neq dyn scalar kernel (#1118)
* Add lt_dyn_scalar and tests
* Add lt_eq_dyn_scalar kernel
* Add gt_dyn_scalar kernel
* Add gt_eq_dyn_scalar kernel
* Add neq_dyn_scalar kernel
* Add kernel to err message
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Matthew Turner [Sun, 2 Jan 2022 12:08:24 +0000 (07:08 -0500)]
Add gt eq dyn scalar kernel (#1117)
* Add lt_dyn_scalar and tests
* Add lt_eq_dyn_scalar kernel
* Add gt_dyn_scalar kernel
* Add gt_eq_dyn_scalar kernel
* Add kernel to err message
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Matthew Turner [Sun, 2 Jan 2022 11:47:28 +0000 (06:47 -0500)]
Add gt dyn scalar kernel (#1116)
* Add gt_dyn_scalar kernel
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Matthew Turner [Sun, 2 Jan 2022 11:33:51 +0000 (06:33 -0500)]
Add lt eq dyn scalar kernel (#1115)
* Add lt_dyn_scalar and tests
* Add lt_eq_dyn_scalar kernel
* Add kernel to error message
* fix merge problem
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Matthew Turner [Sun, 2 Jan 2022 11:25:00 +0000 (06:25 -0500)]
Add kernel and tests (#1121)
Matthew Turner [Sun, 2 Jan 2022 11:11:00 +0000 (06:11 -0500)]
Add kernel and tests (#1124)
Matthew Turner [Sun, 2 Jan 2022 11:08:36 +0000 (06:08 -0500)]
Add lt dyn scalar kernel (#1114)
* Add lt_dyn_scalar and tests
* Add kernel to error message
Kun Liu [Sun, 2 Jan 2022 11:06:40 +0000 (19:06 +0800)]
fix bug: error type for BufferBuilder (#1104)
* fix bug: error type for BufferBuilder
* fix clippy
Matthew Turner [Sat, 1 Jan 2022 12:06:08 +0000 (07:06 -0500)]
Define eq_dyn_scalar API (#1074)
* Squash
* Cleanup error messages
Kun Liu [Wed, 29 Dec 2021 19:51:05 +0000 (03:51 +0800)]
MutableArrayData support extend decimal data type (#1100)
* support extend decimal data type
* add more test
Francis Le Roy [Wed, 29 Dec 2021 19:50:40 +0000 (20:50 +0100)]
Print the 'FixedSizeBinaryArray' like a normal 'BinaryArray' (#1097)
* Print the 'FixedBinaryArray' like a normal 'BinaryArray'
* apply cargo fmt
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Liang-Chi Hsieh [Wed, 29 Dec 2021 13:20:27 +0000 (05:20 -0800)]
implement eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn for timestamp types (#1095)
* implement eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn for timestamp types
* Simplify test code
Andrew Lamb [Wed, 29 Dec 2021 12:01:22 +0000 (07:01 -0500)]
Allow proc-macro2 dependency to be flexible (#1102)
Kun Liu [Thu, 23 Dec 2021 13:53:00 +0000 (21:53 +0800)]
support cast decimal to decimal (#1084)
* support cast decimal to decimal
* add test case
* remove meaningless code
Daniël Heres [Wed, 22 Dec 2021 17:54:30 +0000 (18:54 +0100)]
Fix like regex escaping (#1085)
* Fix like regex escaping
* Fix like regex escaping
* Fix doctest
* Simplify
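
As an illustration of the escaping problem, a LIKE pattern can be translated to a regular expression by mapping `%` and `_` to wildcards and escaping every other regex metacharacter. This is a minimal sketch under stated assumptions: `like_to_regex` is a hypothetical helper, it is not the code from the kernel, and it does not handle a LIKE escape character.

```rust
/// Translates a SQL LIKE pattern into an anchored regex string.
/// ASSUMPTION: no LIKE escape character is supported in this sketch.
fn like_to_regex(pattern: &str) -> String {
    let mut regex = String::with_capacity(pattern.len() + 2);
    regex.push('^');
    for c in pattern.chars() {
        match c {
            // LIKE wildcards become their regex equivalents.
            '%' => regex.push_str(".*"),
            '_' => regex.push('.'),
            // Regex metacharacters must match literally, so escape them.
            '\\' | '.' | '+' | '*' | '?' | '(' | ')' | '[' | ']' | '{' | '}'
            | '|' | '^' | '$' => {
                regex.push('\\');
                regex.push(c);
            }
            _ => regex.push(c),
        }
    }
    regex.push('$');
    regex
}
```

Without the escaping arm, a pattern such as `a.b%` would also match `axb`, because `.` would act as a regex wildcard.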
Kun Liu [Wed, 22 Dec 2021 16:43:44 +0000 (00:43 +0800)]
support cast decimal to signed numeric (#1073)
* add cast test macro function; refactor other type to decimal type; add decimal to signed numeric type
support decimal to unsigned numeric
* address the comments and fix the clippy
dbr/Ben [Wed, 22 Dec 2021 16:36:06 +0000 (03:36 +1100)]
Update pyo3 to 0.15 (#1076)
* Update pyo3 to 0.15
* Update pyo3 in integration tests also
Andrew Lamb [Tue, 21 Dec 2021 11:52:56 +0000 (06:52 -0500)]
parquet: Use constant for RLE decoder buffer size (#1070)
Stephen Carman [Mon, 20 Dec 2021 16:48:43 +0000 (11:48 -0500)]
Add Schema::project and RecordBatch::project functions (#1033)
* Allow Schema and RecordBatch to project schemas on specific columns returning a new schema with those columns only
* Addressing PR updates and adding a test for out of range projection
* switch to &[usize]
* fix: clippy and fmt
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
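
The projection behaviour described above (select columns by index through a `&[usize]`, and error on an out-of-range index) can be sketched over a plain list of field names. This is an illustrative sketch only: `project` here is a hypothetical free function over `&str` names, not the `Schema::project` or `RecordBatch::project` API, and the error message is made up.

```rust
/// Returns the fields at `indices`, in the order requested.
/// ASSUMPTION: fields are modelled as plain names; the real API works on
/// schema fields and record batch columns.
fn project<'a>(fields: &[&'a str], indices: &[usize]) -> Result<Vec<&'a str>, String> {
    indices
        .iter()
        .map(|&i| {
            // `get` returns None for an out-of-range index, which becomes an
            // error instead of a panic.
            fields.get(i).copied().ok_or_else(|| {
                format!(
                    "projection index {} out of bounds for {} fields",
                    i,
                    fields.len()
                )
            })
        })
        .collect()
}
```

Collecting into a `Result` stops at the first invalid index, so a projection either succeeds in full or returns a single error.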
Raphael Taylor-Davies [Mon, 20 Dec 2021 16:46:23 +0000 (16:46 +0000)]
Box RleDecoder index buffer (#1061) (#1062)
* Box RleDecoder index buffer (#1061)
* Format
Kun Liu [Mon, 20 Dec 2021 16:36:53 +0000 (00:36 +0800)]
support cast signed numeric to decimal (#1044)
* support cast signed numeric to decimal
* add test for i8,i16,i32,i64,f32,f64 casted to decimal
* change format of float64
* add none test; merge integer test together
Dmitry Patsura [Mon, 20 Dec 2021 16:31:52 +0000 (19:31 +0300)]
fix(compute): LIKE escape parenthesis (#1042)
Signed-off-by: Dmitry Patsura <talk@dmtry.me>
baishen [Mon, 20 Dec 2021 14:41:00 +0000 (22:41 +0800)]
Add MONTH_DAY_NANO interval type, impl `ArrowNativeType` for `i128` (#779)
* support interval MonthDayNano
* fix
* fix
* fix
* fix test
* add IPC integration test
* fix rat
* update patch
* fix
* fmt
* fix
* fix
* fix
* fix
* fix
* fix
* remove integration-testing/unskip.patch
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Raphael Taylor-Davies [Mon, 20 Dec 2021 13:28:17 +0000 (13:28 +0000)]
BooleanBufferBuilder correct buffer length (#1051) (#1052)
Carol (Nichols || Goulding) [Fri, 17 Dec 2021 19:23:33 +0000 (14:23 -0500)]
Address benchmarks that aren't compiling (#1001)
* Add a CI job that checks benchmarks (but doesn't run them)
* The feature test_common must be turned on to build parquet benchmarks
* Align cache keys
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Andrew Lamb [Fri, 17 Dec 2021 15:38:07 +0000 (10:38 -0500)]
Remove outdated safety example from doc (#1050)
Max Burke [Fri, 17 Dec 2021 11:20:49 +0000 (03:20 -0800)]
Use existing array type in `take` kernel (#1046)
* Need to use type from data so that we do not lose, for example, timezone information
* add test for take preserving timezone
Jörn Horstmann [Wed, 15 Dec 2021 19:56:45 +0000 (20:56 +0100)]
Avoid allocating vector of indices in lexicographical_partition_ranges (#998)
* Avoid allocating vector of indices in lexicographical_partition_ranges
* Adjust comments
* Improve comments and remove one unneeded parameter
Andrew Lamb [Wed, 15 Dec 2021 19:56:16 +0000 (14:56 -0500)]
Mark `MutableBuffer::typed_data_mut` unsafe (#1029)
* Mark `MutableBuffer::typed_data_mut` unsafe
* fmt
* Mark use of `typed_data_mut` as unsafe in simd kernels
Raphael Taylor-Davies [Tue, 14 Dec 2021 19:41:18 +0000 (19:41 +0000)]
Extract method to drive PageIterator -> RecordReader (#1031)
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Raphael Taylor-Davies [Mon, 13 Dec 2021 21:44:47 +0000 (21:44 +0000)]
Simplify parquet arrow `RecordReader` (#1021)
Andrew Lamb [Sun, 12 Dec 2021 21:04:33 +0000 (16:04 -0500)]
Clarify governance of arrow crate (#1030)
Andrew Lamb [Fri, 10 Dec 2021 15:11:45 +0000 (10:11 -0500)]
Force new cargo and target caching to fix CI (#1023)
Adam Gutglick [Thu, 9 Dec 2021 18:18:49 +0000 (20:18 +0200)]
Fix: fixes a broken link and some missing styling in the main arrow crate docs (#1013)
Andrew Lamb [Mon, 6 Dec 2021 20:39:52 +0000 (15:39 -0500)]
Remove out of date comment (#1008)
Carol (Nichols || Goulding) [Sat, 4 Dec 2021 15:28:30 +0000 (10:28 -0500)]
Minimize features of indexmap and chrono (#1000)
* Disable default features of chrono; only enable features needed
Chrono's default features contain "oldtime", which is deprecated.
According to [the docs](https://docs.rs/chrono/0.4.19/chrono/#duration),
> new code should disable the oldtime feature and use the
> chrono::Duration type instead. The oldtime feature is enabled by
> default for backwards compatibility, but future versions of Chrono
> are likely to remove the feature entirely.
so follow that recommendation by setting default-features to false. And
actually, only Arrow needs the "clock" feature, so all the other
features can stay off too to minimize the feature set that projects
depending on arrow or parquet are forced to enable.
* Explicitly enable indexmap's "std" feature
The indexmap crate uses the autocfg crate to do target detection to
determine whether `std` is available. Arrow isn't targeting `no_std`
environments, so the target detection isn't necessary. This might save
some build time.
https://github.com/bluss/indexmap/pull/145
Navin [Sat, 4 Dec 2021 15:21:47 +0000 (02:21 +1100)]
Docstrings for Timestamp*Array. (#988)
* Docstrings for TimestampSecondArray.
* fixup! Docstrings for TimestampSecondArray.
Carlos [Sat, 4 Dec 2021 15:13:19 +0000 (23:13 +0800)]
Update rust version to 1.57 (#1003)
Andrew Lamb [Sat, 4 Dec 2021 11:43:45 +0000 (06:43 -0500)]
Add full data validation for ArrayData::try_new() (#921)
* Add full data validation for ArrayData::try_new()
* Only look at offset+len indexes
Co-authored-by: Jörn Horstmann <git@jhorstmann.net>
* fix test
* fmt
* test for array indexes
Co-authored-by: Jörn Horstmann <git@jhorstmann.net>
Carol (Nichols || Goulding) [Fri, 3 Dec 2021 21:58:04 +0000 (16:58 -0500)]
Remove unneeded `rc` feature of serde (#990)
Fixes #989.
This feature opts into impls for `Rc` and `Arc`, but none of the data
structures that use Serialize/Deserialize actually contain `Rc` or
`Arc`s.
See:
- [Serde docs](https://serde.rs/feature-flags.html#-features-rc)
- [PR adding this](https://github.com/apache/arrow/pull/3016)