arrow-experimental-rs-parquet2.git
10 months agoUpdate .asf.yaml (#2) master
Andrew Lamb [Thu, 8 Jul 2021 15:58:24 +0000 (11:58 -0400)] 
Update .asf.yaml (#2)

11 months agoRemoved all.
Jorge C. Leitao [Sat, 19 Jun 2021 05:40:27 +0000 (05:40 +0000)] 
Removed all.

This allows a project to start from scratch without losing the
contribution stats and similar history.

11 months agoRemoved tooling related to arrow crate.
Jorge C. Leitao [Sat, 19 Jun 2021 05:35:54 +0000 (05:35 +0000)] 
Removed tooling related to arrow crate.

11 months agoKept parquet
Jorge C. Leitao [Sat, 19 Jun 2021 05:22:12 +0000 (05:22 +0000)] 
Kept parquet

11 months agoparquet: improve BOOLEAN writing logic and report error on encoding fail (#443)
Gary Pennington [Wed, 16 Jun 2021 16:37:02 +0000 (17:37 +0100)] 
parquet: improve BOOLEAN writing logic and report error on encoding fail (#443)

* improve BOOLEAN writing logic and report error on encoding fail

When writing BOOLEAN data, writing more than 2048 rows of data will
overflow the hard-coded 256-byte buffer set for the bit-writer in the
PlainEncoder. Once this occurs, further attempts to write to the encoder
fail, because capacity is exceeded, but the errors are silently ignored.

This fix improves the error detection and reporting at the point of
encoding and modifies the logic for bit_writing (BOOLEANS). The
bit_writer is initially allocated 256 bytes (as at present), then each
time the capacity is exceeded the capacity is incremented by another
256 bytes.

This certainly resolves the current problem, but it's not exactly a
great fix because the capacity of the bit_writer could now grow
substantially.

Other data types seem to have a more sophisticated mechanism for writing
data which doesn't involve growing or having a fixed size buffer. It
would be desirable to make the BOOLEAN type use this same mechanism if
possible, but that level of change is more intrusive and probably
requires greater knowledge of the implementation than I possess.

resolves: #349

* only manipulate the bit_writer for BOOLEAN data

Tacky, but I can't think of a better way to do this without
specialization.

* better isolation of changes

Remove the byte tracking from the PlainEncoder and use the existing
bytes_written() method in BitWriter.

This is neater.

* add test for boolean writer

The test ensures that we can write > 2048 rows to a parquet file and
that when we read the data back, it finishes without hanging (defined as
taking < 5 seconds).

If we don't want that extra complexity, we could remove the
thread/channel stuff and just try to read the file and let the test
runner terminate hanging tests.

* fix capacity calculation error in bool encoding

The values.len() reports the number of values to be encoded and so must
be divided by 8 (bits in a byte) to determine the effect on the byte
capacity of the bit_writer.
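
The arithmetic behind this fix can be sketched in a few lines. The following is
a minimal, self-contained illustration and not the parquet crate's code:
`GrowableBitWriter`, `GROWTH_STEP_BYTES`, and `bytes_for` are hypothetical
names. It shows why 2048 booleans exactly fill a 256-byte buffer, why value
counts must be converted to bytes, and how capacity grows in 256-byte steps.

```rust
/// Hypothetical growth step, mirroring the 256-byte figure in the commit message.
const GROWTH_STEP_BYTES: usize = 256;

/// Illustrative bit writer; not the parquet crate's BitWriter.
struct GrowableBitWriter {
    buffer: Vec<u8>,
    bits_written: usize,
}

impl GrowableBitWriter {
    fn new() -> Self {
        Self { buffer: vec![0u8; GROWTH_STEP_BYTES], bits_written: 0 }
    }

    /// Bytes needed to hold `num_values` booleans at one bit each, rounded up.
    fn bytes_for(num_values: usize) -> usize {
        (num_values + 7) / 8
    }

    fn capacity(&self) -> usize {
        self.buffer.len()
    }

    fn bytes_written(&self) -> usize {
        Self::bytes_for(self.bits_written)
    }

    /// Appends the values, first growing the buffer in 256-byte steps if needed.
    fn put(&mut self, values: &[bool]) {
        let needed = Self::bytes_for(self.bits_written + values.len());
        while self.buffer.len() < needed {
            let new_len = self.buffer.len() + GROWTH_STEP_BYTES;
            self.buffer.resize(new_len, 0);
        }
        for &value in values {
            if value {
                self.buffer[self.bits_written / 8] |= 1 << (self.bits_written % 8);
            }
            self.bits_written += 1;
        }
    }
}

fn main() {
    let mut writer = GrowableBitWriter::new();
    // 2048 values * 1 bit = 256 bytes: exactly the initial capacity.
    writer.put(&[true; 2048]);
    assert_eq!(writer.capacity(), 256);
    // One more value needs a 257th byte, so capacity grows by one step.
    writer.put(&[true]);
    assert_eq!(writer.capacity(), 512);
    assert_eq!(writer.bytes_written(), 257);
}
```

As the commit message itself notes, growing in fixed steps means the buffer can
become large for long columns; the sketch shares that limitation.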

11 months agoUnvendor Archery (#459)
Krisztián Szűcs [Wed, 16 Jun 2021 04:39:19 +0000 (06:39 +0200)] 
Unvendor Archery (#459)

11 months agoDoctests for DecimalArray. (#414)
Navin [Mon, 14 Jun 2021 13:40:39 +0000 (23:40 +1000)] 
Doctests for DecimalArray. (#414)

* Doctests for DecimalArray.

* fixup! Doctests for DecimalArray.

* fixup! fixup! Doctests for DecimalArray.

11 months agouse iterator for partition kernel implementation (#438)
Jiayu Liu [Sun, 13 Jun 2021 10:30:51 +0000 (18:30 +0800)] 
use iterator for partition kernel implementation (#438)

11 months agoUpdate docs + email template (#450)
Andrew Lamb [Sun, 13 Jun 2021 10:24:14 +0000 (06:24 -0400)] 
Update docs + email template (#450)

11 months agoImplement the Iterator trait for the json Reader. (#451)
Laurent Mazare [Sun, 13 Jun 2021 00:22:38 +0000 (08:22 +0800)] 
Implement the Iterator trait for the json Reader. (#451)

* Implement the Iterator trait for the json Reader.

* Use transpose.

11 months agoAdd Decimal to CsvWriter and improve debug display (#406)
Ádám Lippai [Sun, 13 Jun 2021 00:20:08 +0000 (02:20 +0200)] 
Add Decimal to CsvWriter and improve debug display (#406)

* Add Decimal to CsvWriter and improve debug display

* Measure CSV writer instead of file and data creation

* Re-use decimal formatting

11 months agoremove unnecessary wraps in sortk (#445)
Jiayu Liu [Sun, 13 Jun 2021 00:00:35 +0000 (08:00 +0800)] 
remove unnecessary wraps in sortk (#445)

11 months agoremove clippy unnecessary wraps (#449)
Jiayu Liu [Sat, 12 Jun 2021 12:59:35 +0000 (20:59 +0800)] 
remove clippy unnecessary wraps (#449)

11 months agoRemove DictionaryArray::keys_array method and replace usages by the keys method ...
Jörn Horstmann [Sat, 12 Jun 2021 12:46:27 +0000 (14:46 +0200)] 
Remove DictionaryArray::keys_array method and replace usages by the keys method (#419)

11 months agoImplement faster arrow array reader (#384)
Yordan Pavlov [Thu, 10 Jun 2021 22:10:53 +0000 (23:10 +0100)] 
Implement faster arrow array reader (#384)

* implement ArrowArrayReader

* change StringArrayConverter to use push_unchecked for offsets

* add ASF license header to new files

* fix clippy issues

* cleanup arrow_array_reader benches

* cleanup arrow_array_reader

* change util module to limit public exports from test_common sub-module

* fix rustfmt issues

11 months agorefactor lexico sort (#424)
Jiayu Liu [Wed, 9 Jun 2021 18:16:42 +0000 (02:16 +0800)] 
refactor lexico sort (#424)

11 months agoUpdate release readme.md (#436)
Andrew Lamb [Wed, 9 Jun 2021 18:12:17 +0000 (14:12 -0400)] 
Update release readme.md (#436)

Don't start search on page 2, make link nicer looking

11 months agoReenable MIRI check (#421)
Andrew Lamb [Wed, 9 Jun 2021 18:11:38 +0000 (14:11 -0400)] 
Reenable MIRI check (#421)

11 months agowindow::shift to work for all array types (#388)
Jiayu Liu [Tue, 8 Jun 2021 21:54:46 +0000 (05:54 +0800)] 
window::shift to work for all array types (#388)

* add more doc test for window::shift

* use Ok(make_array(array.data_ref().clone()))

* shift array for not only primitive cases

* include more test cases

* add back copied

* fix renaming

11 months agomake sure that only concat preallocates buffers (#382)
Ritchie Vink [Tue, 8 Jun 2021 21:16:18 +0000 (23:16 +0200)] 
make sure that only concat preallocates buffers (#382)

* MutableArrayData::with_capacities

* better pattern matching

* add binary capacities

* add list child data

* add struct capacities

* add panic for dictionary type

* change dictionary capacity enum variant

11 months agorefactor lexico sort (#423)
Jiayu Liu [Tue, 8 Jun 2021 21:09:44 +0000 (05:09 +0800)] 
refactor lexico sort (#423)

11 months agoSort by float lists (#420)
Michael Edwards [Tue, 8 Jun 2021 20:55:17 +0000 (22:55 +0200)] 
Sort by float lists (#420)

11 months agoFix bug with null buffer offset in boolean not kernel (#418)
Jörn Horstmann [Tue, 8 Jun 2021 20:51:01 +0000 (22:51 +0200)] 
Fix bug with null buffer offset in boolean not kernel (#418)

11 months agoDerive Eq and PartialEq for SortOptions (#425)
Raphael Taylor-Davies [Tue, 8 Jun 2021 17:27:30 +0000 (18:27 +0100)] 
Derive Eq and PartialEq for SortOptions (#425)

11 months agoFix out of bounds read in bit chunk iterator (#416)
Jörn Horstmann [Tue, 8 Jun 2021 15:34:39 +0000 (17:34 +0200)] 
Fix out of bounds read in bit chunk iterator (#416)

* Fix out of bounds read in bit chunk iterator

* Add comment why reading one additional byte is enough

11 months agoAdd set_bit to BooleanBufferBuilder to allow mutating bit in index (#383)
Boaz [Tue, 8 Jun 2021 07:13:39 +0000 (10:13 +0300)] 
Add set_bit to BooleanBufferBuilder to allow mutating bit in index (#383)

* Add set_bit to BooleanBufferBuilder to allow mutating bits in the builder

* Fix tests

* Update builder.rs

* Update builder.rs

* Fix clippy failures

Co-authored-by: Boaz Berman <boaz@codota.com>
11 months agoAdd labels to cherry pick scripts + writeup (#409)
Andrew Lamb [Sat, 5 Jun 2021 13:14:20 +0000 (09:14 -0400)] 
Add labels to cherry pick scripts + writeup (#409)

* Add labels when cherry picking with script

* fixup

* document tags

* add note

* prettier

11 months agouse prettiery to auto format md files (#398)
Jiayu Liu [Sat, 5 Jun 2021 05:01:58 +0000 (13:01 +0800)] 
use prettiery to auto format md files (#398)

11 months agoMINOR: update install instruction (#400)
Ádám Lippai [Sat, 5 Jun 2021 04:54:32 +0000 (06:54 +0200)] 
MINOR: update install instruction (#400)

We release frequently and honor semver, so the minor and patch version pinning was removed.

11 months agoAdd (simd) modulus op (#317)
Gang Liao [Fri, 4 Jun 2021 15:17:07 +0000 (08:17 -0700)] 
Add (simd) modulus op (#317)

* Add (simd) modulus op

* fix typo

* fix feature = "simd"

* revert ModulusByZero

11 months agoadd more tests for window::shift and handle boundary cases (#386)
Jiayu Liu [Thu, 3 Jun 2021 21:54:00 +0000 (05:54 +0800)] 
add more tests for window::shift and handle boundary cases (#386)

* add more doc test for window::shift

* handle i64::MIN first

* use Ok(make_array(array.data_ref().clone()))

11 months agoAutomatic cherry-pick script (#339)
Andrew Lamb [Thu, 3 Jun 2021 21:50:43 +0000 (17:50 -0400)] 
Automatic cherry-pick script (#339)

* Automatic cherry-pick script

* switch from alamb to apache

* autopep8

* flake8

* add rat

* tweaks

* Add some docs to the README

11 months agoRespect max rowgroup size in Arrow writer (#381)
Wakahisa [Wed, 2 Jun 2021 19:31:50 +0000 (21:31 +0200)] 
Respect max rowgroup size in Arrow writer (#381)

* Respect max rowgroup size in Arrow writer

* simplify while loop

* address review feedback

11 months agoFix typo in release script, update release location (#380)
Andrew Lamb [Sun, 30 May 2021 06:25:18 +0000 (02:25 -0400)] 
Fix typo in release script, update release location (#380)

* Fix typo in release script

* release to `arrow-rs-{version}` directory

11 months agoAdd doctest for ArrayBuilder (#367)
Ádám Lippai [Sat, 29 May 2021 11:02:21 +0000 (13:02 +0200)] 
Add doctest for ArrayBuilder (#367)

11 months agoDoctests for FixedSizeBinaryArray (#378)
Navin [Sat, 29 May 2021 10:54:13 +0000 (20:54 +1000)] 
Doctests for FixedSizeBinaryArray (#378)

* Doctests for BooleanArray.

* Update arrow/src/array/array_boolean.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Doctests for FixedSizeBinaryArray.

* Fix formatting.

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
11 months agoReduce memory usage of concat (large)utf8 (#348)
Ritchie Vink [Sat, 29 May 2021 10:45:55 +0000 (12:45 +0200)] 
Reduce memory usage of concat (large)utf8 (#348)

* reduce memory needed for concat

* reuse code for str allocation buffer

11 months agoSimplify window using null array (#370)
Daniël Heres [Sat, 29 May 2021 07:42:49 +0000 (09:42 +0200)] 
Simplify window using null array (#370)

Co-authored-by: Daniel Heres <danielheres@MBP-van-Olaf.home>
11 months agoFix version in readme (#365)
Dominik Moritz [Thu, 27 May 2021 21:34:15 +0000 (14:34 -0700)] 
Fix version in readme (#365)

Closes #364

12 months agoRemove superfluous space (#363)
Dominik Moritz [Wed, 26 May 2021 20:23:30 +0000 (13:23 -0700)] 
Remove superfluous space (#363)

12 months agoOnly register Flight.proto with cargo if it exists (#351)
Raphael Taylor-Davies [Wed, 26 May 2021 20:22:50 +0000 (21:22 +0100)] 
Only register Flight.proto with cargo if it exists (#351)

12 months agoAdd crate badges (#362)
Dominik Moritz [Wed, 26 May 2021 20:20:04 +0000 (13:20 -0700)] 
Add crate badges (#362)

* Add crate badges

* Format markdown

12 months agoFix filter UB and add fast path (#341)
Ritchie Vink [Wed, 26 May 2021 20:07:19 +0000 (22:07 +0200)] 
Fix filter UB and add fast path (#341)

* fix ub in filter record_batch

* filter fast path

* add all false fast path

* use new_empty_array

* rename filter kernel argument

rename argument: 'filter' to 'predicate'
to reduce name collisions.

12 months agoallow `SliceableCursor` to be constructed from an `Arc` directly (#369)
Marco Neumann [Wed, 26 May 2021 20:00:25 +0000 (22:00 +0200)] 
allow `SliceableCursor` to be constructed from an `Arc` directly (#369)

This is backwards-compatible since we change the argument from `Vec<u8>`
to `impl Into<Arc<Vec<u8>>>` and the following implementations exist in
std:

- `impl<T, U> Into<U> for T where U: From<T>` (reverse direction)
- `impl<T> From<T> for Arc<T>` (create `Arc` from any type)

Furthermore `Arc<Vec<u8>>` can be passed directly now because the following
implementations exist:

- `impl<T> From<T> for T` (identity)

Closes #368.
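
The backwards-compatibility argument above can be checked with a small
stand-alone sketch. `SharedBytes` is a hypothetical stand-in, not the real
`SliceableCursor`; only the constructor signature pattern is taken from the
commit message.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a cursor over shared bytes.
struct SharedBytes {
    inner: Arc<Vec<u8>>,
}

impl SharedBytes {
    /// Accepts anything convertible into `Arc<Vec<u8>>`.
    fn new(content: impl Into<Arc<Vec<u8>>>) -> Self {
        Self { inner: content.into() }
    }

    fn len(&self) -> usize {
        self.inner.len()
    }
}

fn main() {
    // Old call style still compiles: `Vec<u8>` converts via `From<T> for Arc<T>`.
    let from_vec = SharedBytes::new(vec![1u8, 2, 3]);

    // New call style: an existing `Arc` passes through the identity conversion,
    // so the bytes are shared rather than copied.
    let shared = Arc::new(vec![4u8, 5]);
    let from_arc = SharedBytes::new(Arc::clone(&shared));

    assert_eq!(from_vec.len(), 3);
    assert_eq!(from_arc.len(), 2);
    assert_eq!(Arc::strong_count(&shared), 2);
}
```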

12 months agoDisable MIRI check until it runs cleanly on CI (#360)
Andrew Lamb [Wed, 26 May 2021 19:59:32 +0000 (15:59 -0400)] 
Disable MIRI check until it runs cleanly on CI (#360)

12 months agoensure null-counts are written for all-null columns (#307)
Marco Neumann [Tue, 25 May 2021 22:03:01 +0000 (00:03 +0200)] 
ensure null-counts are written for all-null columns (#307)

Fixes #306.

12 months agoallow to read non-standard CSV (#326)
kazuhiko kikuchi [Tue, 25 May 2021 21:44:03 +0000 (06:44 +0900)] 
allow to read non-standard CSV (#326)

* refactor Reader::from_reader

split into build_csv_reader, from_csv_reader
add escape, quote, terminator arg to build_csv_reader

* add escape,quote,terminator field to ReaderBuilder

schema inference support for non-standard CSV

  add fn infer_file_schema_with_csv_options
  add fn infer_reader_schema_with_csv_options

ReaderBuilder support for non-standard CSV

add escape, quote, terminator field
add fn with_escape, with_quote, with_terminator
change ReaderBuilder::build for non-standard CSV

* minimize API change

* add tests

add #[test] fn test_non_std_quote
add #[test] fn test_non_std_escape
add #[test] fn test_non_std_terminator

* apply cargo fmt

12 months agoDoctests for BooleanArray. (#338)
Navin [Mon, 24 May 2021 21:43:01 +0000 (07:43 +1000)] 
Doctests for BooleanArray. (#338)

* Doctests for BooleanArray.

* Update arrow/src/array/array_boolean.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
12 months agoDocument and automate new release process (#299)
Andrew Lamb [Mon, 24 May 2021 18:02:34 +0000 (14:02 -0400)] 
Document and automate new release process (#299)

* Add Release README and Scripts to create and release tarballs

* Suggestions from Andy Grove

12 months agorespect offset in utf8 and list casts (#335)
Ritchie Vink [Mon, 24 May 2021 12:44:27 +0000 (14:44 +0200)] 
respect offset in utf8 and list casts (#335)

12 months agofeature gate ipc reader/writer (#336)
Ritchie Vink [Mon, 24 May 2021 12:43:07 +0000 (14:43 +0200)] 
feature gate ipc reader/writer (#336)

12 months agoparquet: Speed up `BitReader`/`DeltaBitPackDecoder` (#325)
Kornelijus Survila [Mon, 24 May 2021 01:00:42 +0000 (19:00 -0600)] 
parquet: Speed up `BitReader`/`DeltaBitPackDecoder` (#325)

* parquet: Avoid temporary `BufferPtr`s in `BitReader`

From a quick test, this speeds up reading delta-packed int columns by
over 30%.

* parquet: Avoid some allocations in `DeltaBitPackDecoder`

From a quick test, it seems to decode around 10% faster overall.

12 months agoEnable wasm32 as a target architecture for the SIMD feature (#324)
Roee Shlomo [Sun, 23 May 2021 11:11:22 +0000 (14:11 +0300)] 
Enable wasm32 as a target architecture for the SIMD feature  (#324)

* Add wasm32 as target_arch for simd

Signed-off-by: roee88 <roee88@gmail.com>
* Allow wasm32 as a target arch for SIMD

Signed-off-by: roee88 <roee88@gmail.com>
12 months agofix comparison of dictionaries with different values arrays (#332) (#333)
Raphael Taylor-Davies [Sun, 23 May 2021 09:48:03 +0000 (10:48 +0100)] 
fix comparison of dictionaries with different values arrays (#332) (#333)

12 months agoFix undefined behavior in FFI (#323)
Roee Shlomo [Sun, 23 May 2021 09:46:33 +0000 (12:46 +0300)] 
Fix undefined behavior in FFI (#323)

- Fix UB due to aliasing
- Enable MIRI in CI for most tests in arrow crate

Signed-off-by: roee88 <roee88@gmail.com>
12 months agoAdd ported Rust release verification script (#331)
Wes McKinney [Sat, 22 May 2021 11:06:25 +0000 (06:06 -0500)] 
Add ported Rust release verification script (#331)

* Add ported Rust release verification script

* Minor simplifications. (#1)

Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
12 months agoreturn reference from DictionaryArray::values() (#313) (#314)
Raphael Taylor-Davies [Fri, 21 May 2021 18:30:23 +0000 (19:30 +0100)] 
return reference from DictionaryArray::values() (#313) (#314)

Signed-off-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
12 months agofeature gate csv functionality (#312)
Ritchie Vink [Fri, 21 May 2021 18:30:07 +0000 (20:30 +0200)] 
feature gate csv functionality (#312)

* feature gate csv functionality

* mock read_csv example

* clippy

* mock read_csv_infer_schema example

* add tests of --no-default-features to CI

12 months agofix invalid null handling in filter (#296)
Ritchie Vink [Fri, 21 May 2021 12:31:17 +0000 (14:31 +0200)] 
fix invalid null handling in filter (#296)

* fix invalid null handling in filter

* take offset into account

* remove incorrect UB warning

12 months agoDoctests for StringArray and LargeStringArray. (#330)
Navin [Thu, 20 May 2021 15:40:05 +0000 (01:40 +1000)] 
Doctests for StringArray and LargeStringArray. (#330)

12 months agoinline PrimitiveArray::value (#329)
Ritchie Vink [Thu, 20 May 2021 15:30:03 +0000 (17:30 +0200)] 
inline PrimitiveArray::value (#329)

12 months agoMutablebuffer::shrink_to_fit (#318)
Ritchie Vink [Wed, 19 May 2021 13:37:57 +0000 (15:37 +0200)] 
Mutablebuffer::shrink_to_fit (#318)

* Mutablebuffer::shrink_to_fit

* add shrink_to_fit explicit test

12 months agoFix FFI and add support for Struct type (#287)
Roee Shlomo [Mon, 17 May 2021 11:08:24 +0000 (14:08 +0300)] 
Fix FFI and add support for Struct type (#287)

* fix: support nested types in FFI

Ported from https://github.com/jorgecarleitao/arrow2

Fix #20
Fix #251

Signed-off-by: roee88 <roee88@gmail.com>
* Removed Clone from FFI_ArrowArray

Signed-off-by: roee88 <roee88@gmail.com>
* Add nesting to FFI struct test

Signed-off-by: roee88 <roee88@gmail.com>
12 months agoAdded changelog generator script and configuration. (#289)
Jorge Leitao [Mon, 17 May 2021 11:07:27 +0000 (13:07 +0200)] 
Added changelog generator script and configuration. (#289)

12 months agoAdd Send to the ArrayBuilder trait (#291)
Max Meldrum [Mon, 17 May 2021 10:45:11 +0000 (12:45 +0200)] 
Add Send to the ArrayBuilder trait (#291)

12 months agoVersion upgrades (#304)
Daniël Heres [Mon, 17 May 2021 06:09:38 +0000 (08:09 +0200)] 
Version upgrades (#304)

12 months agoRemove old release scripts (#293)
Andrew Lamb [Sun, 16 May 2021 11:36:39 +0000 (07:36 -0400)] 
Remove old release scripts (#293)

* Remove old release scripts

* Add rat files back in

12 months agomanually bump development version (#288)
Wakahisa [Sat, 15 May 2021 07:21:39 +0000 (09:21 +0200)] 
manually bump development version (#288)

12 months agoAdded Decimal support to pretty-print display utility (#230) (#273)
Manish Gill [Fri, 14 May 2021 18:42:47 +0000 (20:42 +0200)] 
Added Decimal support to pretty-print display utility (#230) (#273)

* Added Decimal support to pretty-print display utility (#230)

* Applied cargo fmt to fix linting errors

* Added proper printing for decimals based on scale, moved tests to pretty.rs

* Applied cargo fmt on pretty test

Co-authored-by: Manish Gill <manish.gill@tomtom.com>
12 months agoFix subtraction underflow when sorting string arrays with many nulls (#285)
Michael Edwards [Thu, 13 May 2021 11:28:46 +0000 (13:28 +0200)] 
Fix subtraction underflow when sorting string arrays with many nulls (#285)

12 months agoFix null struct and list roundtrip (#270)
Wakahisa [Tue, 11 May 2021 05:42:41 +0000 (07:42 +0200)] 
Fix null struct and list roundtrip (#270)

* fix null struct and list inconsistencies in writer

* fix list reader null and empty slot calculation

* remove stray TODOs

12 months agoSpeed up bound checking in `take` (#281)
Daniël Heres [Tue, 11 May 2021 05:35:05 +0000 (07:35 +0200)] 
Speed up bound checking in `take` (#281)

* WIP improve take performance

* WIP

* Bound checking speed

* Simplify

* fmt

* Improve formatting

12 months agoUpdate PR template by commenting out instructions (#278)
Wakahisa [Mon, 10 May 2021 22:26:36 +0000 (00:26 +0200)] 
Update PR template by commenting out instructions (#278)

Some contributors don't remove the guidelines when creating PRs, so it might be more convenient if we hide them behind comments.
The comments are still visible when editing, but are not displayed when the markdown is rendered.

12 months agosupport full u32 and u64 roundtrip through parquet (#258)
Marco Neumann [Mon, 10 May 2021 16:44:58 +0000 (18:44 +0200)] 
support full u32 and u64 roundtrip through parquet (#258)

* re-export arity kernels in `arrow::compute`

Seems logical since all other kernels are re-exported as well under this
flat hierarchy.

* return file from `parquet::arrow::arrow_writer::tests::[one_column]_roundtrip`

* support full arrow u64 through parquet

- updates arrow to parquet type mapping to use reinterpret/overflow cast
  for u64<->i64 similar to what the C++ stack does
- changes statistics calculation to account for the fact that u64 should
  be compared unsigned (as per spec)

Fixes #254.

* avoid copying array when reading u64 from parquet

* support full arrow u32 through parquet

This is identical to the solution we now have for u64.
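
To illustrate the reinterpret cast and the unsigned statistics comparison
described above, here is a minimal sketch using plain Rust integer casts. The
helper names `to_storage`, `from_storage`, and `unsigned_min_max` are
hypothetical and are not part of the arrow or parquet crates.

```rust
/// Reinterprets the bits of a u64 as an i64 (values above i64::MAX wrap negative).
fn to_storage(value: u64) -> i64 {
    value as i64
}

/// Reverses `to_storage`, recovering the original unsigned value.
fn from_storage(stored: i64) -> u64 {
    stored as u64
}

/// Computes min/max over stored values by comparing them as unsigned integers.
fn unsigned_min_max(stored: &[i64]) -> Option<(u64, u64)> {
    let mut values = stored.iter().map(|&v| from_storage(v));
    let first = values.next()?;
    Some(values.fold((first, first), |(lo, hi), v| (lo.min(v), hi.max(v))))
}

fn main() {
    // The round trip is lossless even for the largest u64.
    assert_eq!(to_storage(u64::MAX), -1);
    assert_eq!(from_storage(to_storage(u64::MAX)), u64::MAX);

    // A signed comparison would order u64::MAX (stored as -1) below 1,
    // which is why the statistics must be compared unsigned.
    assert!(to_storage(u64::MAX) < to_storage(1));

    let stored = [to_storage(1), to_storage(u64::MAX), to_storage(7)];
    assert_eq!(unsigned_min_max(&stored), Some((1, u64::MAX)));
}
```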

12 months ago1.52 clippy fixes (#267)
Wakahisa [Fri, 7 May 2021 18:46:24 +0000 (20:46 +0200)] 
1.52 clippy fixes (#267)

12 months agoFix typo in csv/reader.rs (#265)
Dominik Moritz [Fri, 7 May 2021 05:36:56 +0000 (22:36 -0700)] 
Fix typo in csv/reader.rs (#265)

12 months agoFix empty Schema::metadata deserialization error (#260)
hulunbier [Fri, 7 May 2021 05:32:32 +0000 (13:32 +0800)] 
Fix empty Schema::metadata deserialization error (#260)

* Fix empty Schema::metadata deserialization error

Hope this fixes issue #241

* Rename UT name to `test_ser_de_metadata`

Co-authored-by: hulunbier <hulunbier>
12 months agoupdate datafusion and ballista links (#259)
Jiayu Liu [Thu, 6 May 2021 07:43:09 +0000 (15:43 +0800)] 
update datafusion and ballista links (#259)

12 months agoAdded env to run rust in integration. (#253)
Jorge Leitao [Wed, 5 May 2021 04:47:26 +0000 (06:47 +0200)] 
Added env to run rust in integration. (#253)

12 months agofix NaN handling in parquet statistics (#256)
Marco Neumann [Wed, 5 May 2021 04:46:24 +0000 (06:46 +0200)] 
fix NaN handling in parquet statistics (#256)

Closes #255.

12 months agofix parquet max_definition for non-null structs (#246)
Wakahisa [Tue, 4 May 2021 04:43:57 +0000 (06:43 +0200)] 
fix parquet max_definition for non-null structs (#246)

* fix parquet max_definition for non-null structs

* clippy: needless reference

* Update parquet/src/arrow/levels.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
12 months agoMade integration tests always run. (#248)
Jorge Leitao [Mon, 3 May 2021 14:42:45 +0000 (16:42 +0200)] 
Made integration tests always run. (#248)

12 months agoImprove docs for NullArray, new_null_array and new_empty_array (#240)
Andrew Lamb [Mon, 3 May 2021 14:41:30 +0000 (10:41 -0400)] 
Improve docs for NullArray, new_null_array and new_empty_array (#240)

* Update docs in null_array.rs so they are discoverable

* Add doc examples for new_null_array and new_empty_array

12 months agosort_primitive result is capped to the min of limit or values.len (#236)
Michael Edwards [Sat, 1 May 2021 10:19:22 +0000 (12:19 +0200)] 
sort_primitive result is capped to the min of limit or values.len (#236)

* sort_primitive result is capped to the min of limit or values.len

fixes #235

* Fixed length calculation of nulls to include

* Add more sort_primitive tests for sorts /w limit
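
The capping rule in this commit can be shown with a simplified sketch. It is
not the arrow `sort_primitive` kernel: `sort_with_limit` is a hypothetical
helper over a plain slice and ignores null handling, which the real kernel has
to account for.

```rust
/// Sorts ascending and returns at most `limit` values.
/// The output length is min(limit, values.len()), so a limit larger than
/// the input never produces a longer result.
fn sort_with_limit(values: &[i32], limit: Option<usize>) -> Vec<i32> {
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    let len = limit.map_or(sorted.len(), |l| l.min(sorted.len()));
    sorted.truncate(len);
    sorted
}

fn main() {
    // A limit beyond the input length is capped to the input length.
    assert_eq!(sort_with_limit(&[3, 1, 2], Some(10)), vec![1, 2, 3]);
    // A smaller limit keeps only the first sorted values.
    assert_eq!(sort_with_limit(&[3, 1, 2], Some(2)), vec![1, 2]);
    // No limit returns everything.
    assert_eq!(sort_with_limit(&[3, 1, 2], None), vec![1, 2, 3]);
}
```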

12 months agoDisabled rebase needed until demonstrate working. (#243)
Jorge Leitao [Fri, 30 Apr 2021 11:26:41 +0000 (13:26 +0200)] 
Disabled rebase needed until demonstrate working. (#243)

12 months agopin flatbuffers to 0.8.4 (#239)
Ritchie Vink [Thu, 29 Apr 2021 17:42:08 +0000 (19:42 +0200)] 
pin flatbuffers to 0.8.4 (#239)

* pin flatbuffer to 0.8.4

* =0.8.2

12 months ago[Parquet] Read list field correctly (#234)
Wakahisa [Thu, 29 Apr 2021 16:26:40 +0000 (18:26 +0200)] 
[Parquet] Read list field correctly (#234)

12 months agoFix code examples for RecordBatch::try_from_iter (#231)
Andrew Lamb [Tue, 27 Apr 2021 12:48:46 +0000 (08:48 -0400)] 
Fix code examples for RecordBatch::try_from_iter (#231)

12 months agoSupport string dictionaries in csv reader (#228) (#229)
Raphael Taylor-Davies [Tue, 27 Apr 2021 11:25:18 +0000 (12:25 +0100)] 
Support string dictionaries in csv reader (#228) (#229)

12 months agoARROW-12411: [Rust] Create RecordBatches from Iterators (#7)
Andrew Lamb [Tue, 27 Apr 2021 10:44:22 +0000 (06:44 -0400)] 
ARROW-12411: [Rust] Create RecordBatches from Iterators (#7)

12 months agosupport LargeUtf8 in sort kernel (#26)
Ritchie Vink [Mon, 26 Apr 2021 21:49:52 +0000 (23:49 +0200)] 
support LargeUtf8 in sort kernel (#26)

12 months agoRemoved unused files (#22)
Jorge Leitao [Mon, 26 Apr 2021 21:48:56 +0000 (23:48 +0200)] 
Removed unused files (#22)

* Removed unused files.

* Removed un-used files.

13 months agoSupport auto-vectorization for min/max using multiversion (#9)
Daniël Heres [Fri, 23 Apr 2021 07:59:34 +0000 (09:59 +0200)] 
Support auto-vectorization for min/max using multiversion (#9)

13 months agoAdd GitHub templates (#17)
Andy Grove [Thu, 22 Apr 2021 16:08:23 +0000 (10:08 -0600)] 
Add GitHub templates (#17)

13 months agoAdded rebase-needed bot (#13)
Jorge Leitao [Thu, 22 Apr 2021 13:57:26 +0000 (15:57 +0200)] 
Added rebase-needed bot (#13)

13 months agoBuffer::from_slice_ref set correct capacity (#18)
Raphael Taylor-Davies [Thu, 22 Apr 2021 13:09:59 +0000 (14:09 +0100)] 
Buffer::from_slice_ref set correct capacity (#18)

Fixed ARROW-12504

13 months agoARROW-12493: Add support for writing dictionary arrays to CSV and JSON (#16)
Raphael Taylor-Davies [Thu, 22 Apr 2021 12:49:14 +0000 (13:49 +0100)] 
ARROW-12493: Add support for writing dictionary arrays to CSV and JSON (#16)

13 months agoARROW-12426: [Rust] Fix concatentation of arrow dictionaries (#15)
Raphael Taylor-Davies [Thu, 22 Apr 2021 12:44:34 +0000 (13:44 +0100)] 
ARROW-12426: [Rust] Fix concatentation of arrow dictionaries (#15)

13 months agoAdded Integration tests against arrow (#10)
Jorge Leitao [Wed, 21 Apr 2021 22:08:43 +0000 (00:08 +0200)] 
Added Integration tests against arrow (#10)

* fix indent

* Made CI run on any change. (#5)

* Removed bot comment about title and JIRA. (#4)

* Allow creating issues. (#6)

* Trying running integration tests.

Co-authored-by: Andy Grove <andygrove73@gmail.com>
Co-authored-by: Andy Grove <andygrove@users.noreply.github.com>
13 months agoUpdate URLs (#14)
Daniël Heres [Wed, 21 Apr 2021 18:19:58 +0000 (20:19 +0200)] 
Update URLs (#14)