Andrew Lamb [Fri, 12 Nov 2021 11:56:10 +0000 (06:56 -0500)]
Prepare for 6.2.0 release (#947)
* Update version to 6.2.0
* Add CHANGELOG for 6.2.0
Andrew Lamb [Fri, 12 Nov 2021 11:49:08 +0000 (06:49 -0500)]
Fix validation for offsets of StructArrays (#942) (#946)
* reproduce validation error
* Fix validation bug
Co-authored-by: Ben Chambers <bjchambers@gmail.com>
Andrew Lamb [Fri, 12 Nov 2021 11:18:19 +0000 (06:18 -0500)]
implement take kernel for null arrays (#939) (#944)
Co-authored-by: Ben Chambers <35960+bjchambers@users.noreply.github.com>
Andrew Lamb [Fri, 12 Nov 2021 11:18:09 +0000 (06:18 -0500)]
add checker for appending i128 to decimal builder (#928) (#943)
* add check for appending i128 to decimal builder
* remove the ArrowError(DecimalError)
Co-authored-by: Kun Liu <liukun@apache.org>
Andrew Lamb [Tue, 9 Nov 2021 13:58:02 +0000 (08:58 -0500)]
Validate arguments to ArrayData::new and null bit buffer and buffers (#810) (#936)
* Validate arguments to ArrayData::new: null bit buffer and buffers
* Rename is_int_type to is_dictionary_key_type()
* Correctly handle self.offset in offsets buffer
* Consolidate checks
* Fix test output
Andrew Lamb [Tue, 9 Nov 2021 12:25:05 +0000 (07:25 -0500)]
fix some warning about unused variables in panic tests (#894) (#933)
Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
Andrew Lamb [Tue, 9 Nov 2021 12:24:34 +0000 (07:24 -0500)]
fix some clippy warnings (#896) (#930)
Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
Andrew Lamb [Tue, 9 Nov 2021 12:24:20 +0000 (07:24 -0500)]
feat(ipc): add support for deserializing messages with nested dictionary fields (#923) (#931)
* feat(ipc): read a message containing nested dictionary fields
* Apply suggestions from code review
* address lints
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Helgi Kristvin Sigurbjarnarson <helgi@lacework.net>
Andrew Lamb [Tue, 9 Nov 2021 12:24:09 +0000 (07:24 -0500)]
test moving out (#895) (#932)
Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
Andrew Lamb [Tue, 9 Nov 2021 12:23:42 +0000 (07:23 -0500)]
Cherry pick Automatically retry failed MIRI runs to work around intermittent failures (#934)
* Automatically retry failed MIRI runs to work around intermittent failures (#922)
* Move MIRI checks into a shell script
* add retry loop
* Do not use cache for miri
Andrew Lamb [Fri, 5 Nov 2021 17:46:37 +0000 (13:46 -0400)]
Update mod.rs (#909) (#919)
Co-authored-by: kingeasternsun <kingeasternsun@gmail.com>
Andrew Lamb [Fri, 5 Nov 2021 17:46:29 +0000 (13:46 -0400)]
Mark boolean kernels public (#913) (#920)
Andrew Lamb [Fri, 5 Nov 2021 10:52:06 +0000 (06:52 -0400)]
doc example mistype (#904) (#918)
Co-authored-by: kingeasternsun <kingeasternsun@gmail.com>
Andrew Lamb [Fri, 5 Nov 2021 10:51:42 +0000 (06:51 -0400)]
allow null array to be cast to all other types (#884) (#917)
Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
Andrew Lamb [Fri, 5 Nov 2021 10:51:30 +0000 (06:51 -0400)]
Fix instances of UB that cause tests to not pass under miri (#878) (#916)
* Fix unaligned access in bit-packing
* Fix creation of unaligned reference in murmur_hash2_64a
* Remove now-unnecessary unsafe
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Ben Kimock <kimockb@gmail.com>
Andrew Lamb [Fri, 5 Nov 2021 10:51:22 +0000 (06:51 -0400)]
feat(ipc): Support writing dictionaries nested in structs and unions (#870) (#915)
* feat(ipc): Support for writing dictionaries nested in structs and unions
Dictionaries are lost when serializing a RecordBatch for IPC, producing
invalid arrow data. This PR changes encoded_batch to recursively find
all dictionary fields within the schema (currently only in structs and
unions) so nested dictionaries are properly serialized.
* address lint and clippy
Co-authored-by: Helgi Kristvin Sigurbjarnarson <helgikrs@gmail.com>
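The recursive search described in this commit could look roughly like the sketch below. The `DataType` and `Field` types are simplified, hypothetical stand-ins rather than the arrow crate's own definitions, and `collect_dictionary_fields` is an illustrative name, not the function the PR changes.

```rust
// Simplified, hypothetical stand-ins for a schema; not the arrow crate's types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    // Key type and value type of a dictionary-encoded field.
    Dictionary(Box<DataType>, Box<DataType>),
    Struct(Vec<Field>),
    Union(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

// Walk a field and record the name of every dictionary-typed field found,
// descending only into structs and unions, as the commit message describes.
fn collect_dictionary_fields(field: &Field, found: &mut Vec<String>) {
    match &field.data_type {
        DataType::Dictionary(_, _) => found.push(field.name.clone()),
        DataType::Struct(children) | DataType::Union(children) => {
            for child in children {
                collect_dictionary_fields(child, found);
            }
        }
        // Other types cannot contain a dictionary in this simplified model.
        _ => {}
    }
}
```

Calling the function on a top-level field fills `found` with the dictionary fields at any struct or union depth, which is the set a writer would need to emit dictionary batches for.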
Andrew Lamb [Tue, 2 Nov 2021 12:59:16 +0000 (08:59 -0400)]
Fix references to changelog (#905)
Andrew Lamb [Fri, 29 Oct 2021 13:27:02 +0000 (09:27 -0400)]
Release 6.1.0 (#880)
* Update changelog for 6.1 release
* Update version to 6.1.0
Andrew Lamb [Wed, 27 Oct 2021 12:44:25 +0000 (08:44 -0400)]
implement eq_dyn and neq_dyn (#858) (#867)
Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
Andrew Lamb [Wed, 27 Oct 2021 11:29:42 +0000 (07:29 -0400)]
fix: fix a bug in offset calculation for unions (#863) (#871)
The `value_offset` function only read the least significant byte in the
offset array, causing issues with unions with more than 255 rows of any
given variant. Fix the issue by reading the entire i32 offset and add a
unit test.
Co-authored-by: Helgi Kristvin Sigurbjarnarson <helgikrs@gmail.com>
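The class of bug fixed here can be illustrated with a small sketch. It assumes the offsets are stored as little-endian i32 values in a raw byte buffer; both function names are hypothetical and this is not the code from the PR.

```rust
use std::convert::TryInto;

// Buggy variant: reads only the least significant byte of the
// little-endian i32 stored at position `index`, so any offset above 255
// is silently truncated.
fn value_offset_low_byte(offsets: &[u8], index: usize) -> i32 {
    offsets[index * 4] as i32
}

// Fixed variant: reads all four bytes of the i32.
fn value_offset_full(offsets: &[u8], index: usize) -> i32 {
    let start = index * 4;
    let bytes: [u8; 4] = offsets[start..start + 4]
        .try_into()
        .expect("offset buffer too short");
    i32::from_le_bytes(bytes)
}
```

For an offset of 300 (0x012C) the first function returns 44 (0x2C) while the second returns 300, which is why the problem only appeared once a variant had more than 255 rows.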
Andrew Lamb [Wed, 27 Oct 2021 11:29:33 +0000 (07:29 -0400)]
add lt_bool, lt_eq_bool, gt_bool, gt_eq_bool (#860) (#868)
Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
Andrew Lamb [Wed, 27 Oct 2021 10:46:17 +0000 (06:46 -0400)]
Test out new tarpaulin version (#852) (#866)
Andrew Lamb [Wed, 27 Oct 2021 10:46:11 +0000 (06:46 -0400)]
fix(ipc): Support serializing structs containing dictionaries (#848) (#865)
* fix(ipc): Support serializing structs containing dictionaries
Dictionary fields nested in structs were not properly marked as
dictionary fields when serializing to fb.
* style: cargo fmt
Co-authored-by: Helgi Kristvin Sigurbjarnarson <helgikrs@gmail.com>
Andrew Lamb [Mon, 25 Oct 2021 10:51:41 +0000 (06:51 -0400)]
Implement boolean equality kernels (#844) (#857)
* Implement boolean equality kernels
* Respect offset
* Simplify
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Andrew Lamb [Mon, 25 Oct 2021 10:51:22 +0000 (06:51 -0400)]
Cherry pick fix parquet_derive with default features (and fix cargo publish) (#856)
* fix parquet_derive with default features (and fix `cargo publish`) (#837)
* Run all tests and do dry runs of cargo publish
* Add test for building parquet derive with default features
* fix feature flags in parquet crate
* fixup rat
* fix default feature test
* Update parquet_derive/test/dependency/default-features/Cargo.toml
* Remove merge issue
Andrew Lamb [Sun, 24 Oct 2021 11:09:43 +0000 (07:09 -0400)]
Use kernel utility for parsing timestamps in csv reader. (#832) (#853)
* Use kernel utility for parsing timestamps in csvs.
* Remove cruft.
* Cleanup.
* Lint.
* Remove erroneous stringify.
Co-authored-by: Navin <navin@novemberkilo.com>
Andrew Lamb [Sun, 24 Oct 2021 11:09:26 +0000 (07:09 -0400)]
Update README.md (#834) (#854)
fix readme with invalid markdown syntax
Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
Andrew Lamb [Sun, 24 Oct 2021 11:08:51 +0000 (07:08 -0400)]
[MINOR] Delete temp file from docs (#836) (#855)
* Delete temp file from docs
* fix
* Use gitignore instead
Andrew Lamb [Sat, 23 Oct 2021 21:18:57 +0000 (17:18 -0400)]
Force fresh cargo cache key in CI (#839) (#851)
Andrew Lamb [Sat, 23 Oct 2021 12:34:02 +0000 (08:34 -0400)]
[Minor] Fix clippy errors with new rust version (1.56) and float formatting with nightly (#845) (#850)
* Clippy fixes
* Test formatting fixes
* Fixup
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Andrew Lamb [Wed, 13 Oct 2021 19:14:49 +0000 (15:14 -0400)]
Update version to 6.0.0 (#828)
Andrew Lamb [Wed, 13 Oct 2021 19:05:26 +0000 (15:05 -0400)]
Add Changelog for 6.0.0 (#827)
* Add Changelog
* Cleanup Changelog
Andrew Lamb [Wed, 13 Oct 2021 17:17:32 +0000 (13:17 -0400)]
Replace `ArrayData::new()` with `ArrayData::try_new()` and `unsafe ArrayData::new_unchecked` (#822)
* Replace `ArrayData::new()` with `ArrayData::try_new()` and `unsafe ArrayData::new_unchecked`
* Fix compile for simd
* remove unsafe in benches
Wakahisa [Wed, 13 Oct 2021 13:46:07 +0000 (15:46 +0200)]
JSON reader - empty nested list should not create child value (#826)
* JSON reader - empty nested list should not create child value
* PR review
Sumit [Wed, 13 Oct 2021 12:59:33 +0000 (14:59 +0200)]
Add support for parsing timezone using chrono-tz (#824)
- add chrono-tz as an optional dependency
- try parse using chrono for the numeric format
- if not then try using chrono-tz if present
- return error if neither result in FixedOffset
Sumit [Mon, 11 Oct 2021 20:16:28 +0000 (22:16 +0200)]
handle tz while extracting second/minute/hour from temporal arrays (#771)
The patch rewrites the behaviour using macros to indicate the
repetitive nature of operations
Wakahisa [Mon, 11 Oct 2021 19:59:11 +0000 (21:59 +0200)]
Fewer ByteArray allocations when writing binary columns (#820)
* split benchmarks of primitive arrays
* add list benches
* Allocate one ByteArray per row group write
* enumerate
Jiayu Liu [Fri, 8 Oct 2021 17:52:49 +0000 (01:52 +0800)]
[nit] update readme.md and reformat (#821)
* update readme.md and reformat
* update arrow crate
Wakahisa [Thu, 7 Oct 2021 10:47:38 +0000 (12:47 +0200)]
Separate parquet writer benchmarks (#818)
* split benchmarks of primitive arrays
* add list benches
Andrew Lamb [Thu, 7 Oct 2021 00:29:04 +0000 (20:29 -0400)]
Fix null count when casting ListArray (#816)
Matthew Turner [Thu, 30 Sep 2021 19:10:13 +0000 (15:10 -0400)]
Add Parquet writer example to docs (#797)
* First example parquet writer
* Add WriterProp examples
* Add missing imports
* Remove options and run doctest
* One more section to run
* no_run on read example
* Make reader run test
* Fix get_schema_by_cols
Ben Chambers [Thu, 30 Sep 2021 19:09:46 +0000 (12:09 -0700)]
expose buffer ops (#809)
Kornelijus Survila [Thu, 30 Sep 2021 10:43:57 +0000 (04:43 -0600)]
parquet: Avoid NaN check for non-floats (#798)
It was especially expensive for `ByteArray` columns, potentially taking as
long as the rest of encoding.
Andrew Lamb [Sun, 26 Sep 2021 11:10:58 +0000 (07:10 -0400)]
Remove extra quote in release instructions (#804)
Navin [Sun, 26 Sep 2021 11:04:57 +0000 (21:04 +1000)]
Doctests for DictionaryArrays. (#805)
msalib [Fri, 24 Sep 2021 15:38:13 +0000 (11:38 -0400)]
Make parquet's optional arrow dependency skip the default features (#801)
* Make parquet only depend on minimal arrow features
parquet depends on arrow but arrow by default has a large number of features. That means that users who depend on parquet get the full arrow feature set, even if they don't need it. But parquet itself only needs the ipc feature.
* ipc is not even needed
Mike Seddon [Tue, 21 Sep 2021 16:16:46 +0000 (02:16 +1000)]
add wasm32 to hash, fix wasm32 build (#787)
* add wasm32 to hash
* cargo fmt
Navin [Tue, 21 Sep 2021 16:15:40 +0000 (02:15 +1000)]
Doctests for arrays - via collect method. (#785)
Boaz [Sun, 19 Sep 2021 15:05:39 +0000 (18:05 +0300)]
Make BooleanBufferBuilder get_bit not require mutable reference (#784)
Ilya Biryukov [Fri, 17 Sep 2021 16:06:35 +0000 (19:06 +0300)]
fix: nanosecond timestamp scaling during string conversion (#780) (#781)
Some datetime formats passed to `string_to_timestamp_nanos` were parsing
milliseconds as nanoseconds.
E.g. `1970-01-01 00:00:00.123` would parse as `123` nanoseconds instead
of milliseconds.
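The scaling involved can be shown with a small standalone sketch. It is not the code from `string_to_timestamp_nanos`; `fraction_to_nanos` is a hypothetical helper that only demonstrates how the digits after the decimal point map to nanoseconds.

```rust
// Convert the digits after the decimal point of a seconds value into
// nanoseconds. "123" means 0.123 s, i.e. 123_000_000 ns, not 123 ns.
// Returns None for empty, non-digit, or longer-than-9-digit input.
fn fraction_to_nanos(digits: &str) -> Option<u32> {
    if digits.is_empty()
        || digits.len() > 9
        || !digits.bytes().all(|b| b.is_ascii_digit())
    {
        return None;
    }
    let value: u32 = digits.parse().ok()?;
    // Scale by the number of missing digits to reach 9 decimal places.
    Some(value * 10u32.pow(9 - digits.len() as u32))
}
```

With this scaling, the fractional part of `1970-01-01 00:00:00.123` yields 123,000,000 nanoseconds, and only a full nine-digit fraction such as `000000123` yields 123.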
Felix Yan [Thu, 16 Sep 2021 21:34:38 +0000 (05:34 +0800)]
Add support for riscv64 (#769)
* Fix riscv64 target_arch
This should be defined for riscv64 instead, as `riscv` doesn't match it.
I have no idea for riscv32 though.
* parquet: Use murmur_hash2_64a for riscv64
Markus Westerlind [Mon, 13 Sep 2021 16:55:56 +0000 (18:55 +0200)]
chore: Reduce the amount of code generated by monomorphization (#715)
* chore: Reduce the number of instantiations of take* (-3%)
Many types have the same native type, so simplifying these functions to
work directly with native types reduces the number of instantiations.
Reduces the number of llvm lines generated by ~3%
* chore: Shrink try_from_trusted_len_iter (-0.5%)
* chore: Only compile sort_primitive per native type (-8.5%)
* chore: Make the inner take_ functions less generic (-3.5%)
* chore: Don't duplicate sort_list (-13%)
* chore: Extract the "valid" sorting (-7%)
* chore: Extract the array sorter (-1%)
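The general technique behind these changes can be sketched as below: a thin wrapper that is generic over the logical type delegates to an inner function that is generic only over the native type. All names here (`LogicalType`, `Int64Type`, `TimestampSecondType`, `sort_native`, `sort_logical`) are hypothetical and simplified; they are not the crate's actual definitions.

```rust
// Hypothetical marker types that share the same native representation (i64).
trait LogicalType {
    type Native: Copy + Ord;
}

struct Int64Type;
struct TimestampSecondType;

impl LogicalType for Int64Type {
    type Native = i64;
}

impl LogicalType for TimestampSecondType {
    type Native = i64;
}

// Inner function: generic over the native type only, so the compiler
// instantiates it once per native type rather than once per logical type.
fn sort_native<N: Copy + Ord>(values: &mut [N]) {
    values.sort_unstable();
}

// Thin wrapper: generic over the logical type, but it only forwards to
// the shared native instantiation, so it adds very little code.
fn sort_logical<T: LogicalType>(values: &mut [T::Native]) {
    sort_native::<T::Native>(values);
}
```

Here both `sort_logical::<Int64Type>` and `sort_logical::<TimestampSecondType>` end up calling the single `sort_native::<i64>` instantiation, which is the kind of sharing the percentages in the commit message refer to.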
Ben Chambers [Sun, 12 Sep 2021 11:02:34 +0000 (04:02 -0700)]
fix: Support length on slices with null (#745)
* fix: Support length on slices with null
* actually test length
Matthew Turner [Sat, 11 Sep 2021 16:52:23 +0000 (12:52 -0400)]
Added PartialEq to RecordBatch (#750)
* Added PartialEq to RecordBatch
* derive PartialEq and add tests
Richard [Sat, 11 Sep 2021 07:53:16 +0000 (15:53 +0800)]
Export `RowColumnIter` to fix doc (#763)
* Export RowColumnIter to fix doc
* Add documentation for RowColumnIter
* Improve documentation for RowColumnIter
Jorge Leitao [Fri, 10 Sep 2021 17:16:55 +0000 (18:16 +0100)]
Use latest nightly in CI to Fix CI for SIMD (#767)
* Fixed CI for SIMD
* Updated nightly for wasm
Matthew Turner [Thu, 9 Sep 2021 21:34:01 +0000 (17:34 -0400)]
Update Bitmap::len to return bits (#749)
mathiaspeters-sig [Thu, 9 Sep 2021 21:31:59 +0000 (23:31 +0200)]
Optimize array::transform::utils::set_bits (#716)
* Added tests
* Updated tests and improved implementation
* Cleanup
* Stopped collecting bytes before writing to write_data
* Added tests
* Cleanup and comments
* Fixed clippy warning
* Fixed an endianness issue
* Fixed comments and naming
* Made tests less prone to off-by-n errors
Ben Chambers [Thu, 9 Sep 2021 20:25:46 +0000 (13:25 -0700)]
fix: Scalar math operations on slices (#743)
* fix: Scalar math operations on slices
* remove conditional
Ben Chambers [Thu, 9 Sep 2021 20:02:10 +0000 (13:02 -0700)]
fix: new_null_array for structs (#736)
Markus Westerlind [Thu, 9 Sep 2021 20:00:05 +0000 (22:00 +0200)]
fix: Allow parquet to be compiled without arrow (fix --no-default-features) (#731)
* fix: Allow parquet to be compiled without arrow
`--no-default-features` is currently broken in the parquet crate due to
arrow being required. With some small tweaks it can be made entirely
optional.
Added some extra steps to catch when `--no-default-features` does not
work on CI as well.
* Fix CI
* Fix path on CI
* --features test_common is needed for clippy
Ben Chambers [Thu, 9 Sep 2021 19:58:39 +0000 (12:58 -0700)]
Add `append_nulls` and `append_trusted_len_iter` to `PrimitiveBuilder` (#728)
* stub out impl
* mark unsafe
* add tests
Daniël Heres [Sun, 5 Sep 2021 10:21:16 +0000 (12:21 +0200)]
Upgrade lexical-core to 0.8 (#748)
* Upgrade lexical-core
* Use num instead
Ben Chambers [Fri, 3 Sep 2021 00:15:09 +0000 (17:15 -0700)]
fix: Comparisons against scalar slices (#741)
Ben Chambers [Fri, 3 Sep 2021 00:12:47 +0000 (17:12 -0700)]
fix: Handle slices in unary kernel (#739)
Krisztián Szűcs [Thu, 2 Sep 2021 19:54:50 +0000 (21:54 +0200)]
Remove optional prettytable-rs dependency (#737)
Krisztián Szűcs [Wed, 1 Sep 2021 10:37:35 +0000 (12:37 +0200)]
PyO3 bridge for pyarrow interoperability (#691)
* PyO3 bridge for pyarrow interoperability
* Fix clippy warnings
* Simplify error handling
* Fix clippy warnings
* Fix integration test workflow
* Address review comments
* Virtualenv
* Fix integration test
Sergii Mikhtoniuk [Tue, 31 Aug 2021 11:36:58 +0000 (04:36 -0700)]
Fix decimal repr in schema (#721)
Fixes #713
Sergii Mikhtoniuk [Sun, 29 Aug 2021 10:22:16 +0000 (03:22 -0700)]
Fix decimal value_as_string (#722)
Fixes #710
Andrew Lamb [Sat, 28 Aug 2021 16:22:24 +0000 (12:22 -0400)]
Add a note on rust compiler testing and compatibility (#726)
* Add a note on rust compiler testing and compatibility
* prettier
Xavier Lange [Sat, 28 Aug 2021 11:17:39 +0000 (07:17 -0400)]
Parquet Derive: remove obscure feature flags, make chrono time emit converted type (#712)
* remove feature flags, make timestamp emit converted types
* remove tracking numbers
* NaiveDateTime emits converted type
* formatting
Ilya Biryukov [Thu, 26 Aug 2021 11:54:44 +0000 (14:54 +0300)]
Support arrow readers for strings with DELTA_BYTE_ARRAY encoding (#709)
* Support arrow readers for strings with DELTA_BYTE_ARRAY encoding
* Review fixes
1. move slice init out of the loop,
2. add tests for nulls,
3. use `debug_assert` for programming error assertion.
Jiayu Liu [Thu, 26 Aug 2021 11:50:59 +0000 (19:50 +0800)]
fix edition 2021 (#714)
baishen [Thu, 26 Aug 2021 11:50:04 +0000 (06:50 -0500)]
Implement `regexp_matches_utf8` (#706)
* impl regexp_matches_utf8
* fix clippy
* add bench
* optimize
Yuan Zhou [Sat, 21 Aug 2021 10:33:11 +0000 (18:33 +0800)]
Support binary data type in `build_struct_array`. (#702)
* Support binary data type in `build_struct_array`.
* Modify test case.
* cargo fmt
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Navin [Thu, 19 Aug 2021 20:45:54 +0000 (06:45 +1000)]
Doctest for PrimitiveArray using from_iter_values. (#694)
* Doctest for PrimitiveArray using from_iter_values.
* Better example for building a PrimitiveArray.
Andrew Lamb [Wed, 18 Aug 2021 16:33:47 +0000 (12:33 -0400)]
Update dev README with fancier regular expression for maintenance release notes (#687)
* Update dev README with fancier regular expression
I am trying to incrementally improve the release notes
* Add bullet
* prettier
* tweak
Chojan Shang [Mon, 16 Aug 2021 21:11:05 +0000 (05:11 +0800)]
Change to comfy-table from prettytable-rs (#656)
* Change to comfy-table
Signed-off-by: Chojan Shang <psiace@outlook.com>
* Apply review
Signed-off-by: Chojan Shang <psiace@outlook.com>
Sumit [Mon, 16 Aug 2021 21:09:46 +0000 (23:09 +0200)]
allow casting from Timestamp based arrays to utf8 (#664)
the change uses the existing `PrimitiveArray::value_as_datetime` to
support casting from `Timestamp(_,_)` to `[Large]Utf8`.
Boaz [Sun, 15 Aug 2021 18:58:14 +0000 (21:58 +0300)]
Add get_bit to BooleanBufferBuilder (#693)
* Add get_bit to BooleanBufferBuilder
* fix clippy
Pete Koomen [Thu, 12 Aug 2021 15:42:28 +0000 (08:42 -0700)]
Allow creation of String arrays from &Option<&str> iterators (#680)
* Allow creation of String arrays from &Option<&str> iterators
* Add links in doc comments
Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Andrew Lamb [Tue, 10 Aug 2021 00:58:03 +0000 (20:58 -0400)]
Write FixedLenByteArray stats for FixedLenByteArray columns (not ByteArray stats) (#662)
Roee Shlomo [Mon, 9 Aug 2021 11:31:07 +0000 (14:31 +0300)]
Make rand an optional dependency (#674)
Closes #671
Signed-off-by: roee88 <roee88@gmail.com>
Andrew Lamb [Sun, 8 Aug 2021 12:32:47 +0000 (08:32 -0400)]
Write boolean stats for boolean columns (not i32 stats) (#661)
Navin [Sun, 8 Aug 2021 10:40:42 +0000 (20:40 +1000)]
Doctests for DictionaryArray::from_iter, PrimitiveDictionaryBuilder and DecimalBuilder. (#673)
* Doctest for PrimitiveDictionaryBuilder.
* Doctests for DictionaryArray::from_iter.
* Documentation for DecimalBuilder.
Andrew Lamb [Sun, 8 Aug 2021 10:36:24 +0000 (06:36 -0400)]
Add some doc comments to parquet bit_util (#663)
Ben Chambers [Sun, 8 Aug 2021 07:57:17 +0000 (00:57 -0700)]
allocate enough bytes when writing booleans (#658)
* allocate enough bytes when writing booleans
* round up to nearest multiple of 256
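The rounding mentioned above can be written as plain integer arithmetic. The sketch below assumes the size being rounded is a byte count derived from the number of boolean values at one bit each; the commit message does not say so explicitly, and both function names are hypothetical.

```rust
// Round `n` up to the nearest multiple of 256 (0 stays 0).
fn round_up_to_multiple_of_256(n: usize) -> usize {
    (n + 255) / 256 * 256
}

// Bytes to allocate for `num_values` booleans packed one bit each,
// padded up to a multiple of 256 bytes (an assumption of this sketch).
fn boolean_buffer_bytes(num_values: usize) -> usize {
    let bytes = (num_values + 7) / 8;
    round_up_to_multiple_of_256(bytes)
}
```

Under these assumptions, 8 booleans need 1 byte and are padded to 256, while 2049 booleans need 257 bytes and are padded to 512.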
Andrew Lamb [Sun, 8 Aug 2021 07:46:14 +0000 (03:46 -0400)]
Fix parquet string statistics generation (#643)
* Fix string statistics generation, add tests
* fix Int96 stats test
* Add notes for additional tickets
Andrew Lamb [Sat, 7 Aug 2021 00:38:01 +0000 (20:38 -0400)]
Add a note about arrow crate security / safety (#628)
* Add note about safety to arrow README.md
* Prettier
* Remove note about making modules private
Andrew Lamb [Fri, 6 Aug 2021 11:43:39 +0000 (07:43 -0400)]
Tiny tweaks to release readme (#670)
Daniël Heres [Tue, 3 Aug 2021 07:11:24 +0000 (09:11 +0200)]
Remove undefined behavior in `value` method of boolean and primitive arrays (#644)
* Remove UB in `value`
* Add safety note
Navin [Mon, 2 Aug 2021 20:47:16 +0000 (06:47 +1000)]
Doctests for from_iter for BooleanArray & for BooleanBuilder. (#647)
Ruihang Xia [Mon, 2 Aug 2021 19:43:46 +0000 (03:43 +0800)]
draft question template (#649)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Ruihang Xia [Mon, 2 Aug 2021 17:32:41 +0000 (01:32 +0800)]
update documentation (#648)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Christian Williams [Mon, 2 Aug 2021 17:29:09 +0000 (13:29 -0400)]
Fix data corruption in json decoder f64-to-i64 cast (#652)
* Add failing test for JSON writer i64 bug
* Add special handling for i64/u64 to json decoder array builder
* Fix linter error - linter wants .flatten on a new line
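The underlying hazard is that an f64 cannot represent every i64 exactly, so routing a large integer through f64 can change its value. The sketch below illustrates that general problem with hypothetical helper names; it is not the decoder code from this PR.

```rust
// Going through f64 silently loses precision for integers whose
// magnitude exceeds 2^53, because f64 has a 53-bit significand.
fn via_f64(value: i64) -> i64 {
    (value as f64) as i64
}

// Parsing the integer text directly keeps every digit.
fn parse_integer(text: &str) -> Option<i64> {
    text.parse::<i64>().ok()
}
```

For example, 9007199254740993 (2^53 + 1) does not survive the round trip through `via_f64`, while `parse_integer("9007199254740993")` returns the exact value.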
Andrew Lamb [Sat, 31 Jul 2021 15:02:20 +0000 (11:02 -0400)]
Add human readable Format for parquet ByteArray (#642)
Wakahisa [Sat, 31 Jul 2021 05:20:56 +0000 (07:20 +0200)]
Minimal MapArray support (#491)
* add DataType::Map to datatypes
* barebones MapArray and MapBuilder
This commit adds the MapArray and MapBuilder.
The interfaces are however incomplete at this stage.
* minimal IPC read and write
* barebones MapArray (missed)
* add equality for map, relying on list
A map is a list with some specific rules, so for equality it is the same as a list
* json reader for MapArray
* add schema roundtrip
* read and write maps from/to arrow map
* clippy
* Calculate map levels separately
Avoids the generic case of list > struct > [key, value], which adds overhead
* Fix map reader context and path
* Map array tests
* add doc comments and clean up code
* wip: review feedback
* add test for map
* fix clippy 1.54 lints
Daniël Heres [Fri, 30 Jul 2021 19:30:33 +0000 (21:30 +0200)]
Speed up filter_record_batch with one array (#637)
* Speed up filter_record_batch with one array
* Don't into()
Andrew Lamb [Fri, 30 Jul 2021 11:57:37 +0000 (07:57 -0400)]
Add note about changelog generation to README (#639)
* Add note about changelog generation to README
* make it prettier
Andrew Lamb [Thu, 29 Jul 2021 20:03:10 +0000 (16:03 -0400)]
Fix clippy lints for Rust 1.54 (#631)