arrow-cookbook.git
3 months ago Add highlighting to testcode nodes ARROW-15713 164/head
Alessandro Molina [Wed, 16 Mar 2022 15:16:13 +0000 (16:16 +0100)] 
Add highlighting to testcode nodes

3 months ago [CI] Run apt update before apt install (#163)
David Li [Tue, 15 Mar 2022 17:12:39 +0000 (13:12 -0400)] 
[CI] Run apt update before apt install (#163)

3 months ago [Java] Add Java recipes for Arrow Flight (#137)
david dali susanibar arce [Mon, 14 Mar 2022 19:48:17 +0000 (14:48 -0500)] 
[Java] Add Java recipes for Arrow Flight (#137)

* Adding java cookbook for creating arrow flight

* Adding java cookbook for creating arrow flight

* Solving pr comments

* Change text to flightscope

* Solving PR comments

* Solving PR comments

* Solving PR comments

* Solving PR comments

* Solving PR comments

* Solving PR comments

* Solving PR comments

* Solving PR comments

3 months ago [Python] Add a Python Cookbook recipe on group_by + sort (#155)
Vibhatha Lakmal Abeykoon [Thu, 10 Mar 2022 18:43:06 +0000 (00:13 +0530)] 
[Python] Add a Python Cookbook recipe on group_by + sort (#155)

4 months ago [C++] gRPC settings and custom endpoint examples (#153)
Will Jones [Fri, 25 Feb 2022 00:03:36 +0000 (16:03 -0800)] 
[C++] gRPC settings and custom endpoint examples (#153)

* Draft options example

* Get flight examples to compile

* Start creating client for gRPC custom service

* Fix test

* Cleanup

* PR feedback

* Further cleanup

* Remove references to other ports

4 months ago Make the sphinx docs wide enough for code examples (#151)
Will Jones [Thu, 24 Feb 2022 14:01:59 +0000 (06:01 -0800)] 
Make the sphinx docs wide enough for code examples (#151)

4 months ago [Python] Add a recipe on how to replace an existing column in Table (#154)
Vibhatha Lakmal Abeykoon [Thu, 24 Feb 2022 10:04:05 +0000 (15:34 +0530)] 
[Python] Add a recipe on how to replace an existing column in Table (#154)

4 months ago [C++] Small fixes for c++ (#150)
Will Jones [Wed, 23 Feb 2022 03:24:23 +0000 (19:24 -0800)] 
[C++] Small fixes for c++ (#150)

* Small fixes for c++

* Edits to contributing

* Upgrade to conda-lock

* Specify python in environment

* Fix instructions

4 months ago [Java] Java cookbook for create arrow jni dataset (#138)
david dali susanibar arce [Tue, 22 Feb 2022 13:01:56 +0000 (08:01 -0500)] 
[Java] Java cookbook for create arrow jni dataset (#138)

* Adding java cookbook for creating arrow jni

* JNI library dependencies

* Adding java cookbook for creating arrow jni

* Adding java cookbook for creating arrow jni

* Adding java cookbook for creating arrow jni

* Testing problem with download dependencies

* Debug jni errors

* Debug jni errors

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Solving jni errors for jni *.dylib and *.so library dependencies

* Custom protobuf.rb formula

* Adding parquet files

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Configure ci workflow for jni and not jni

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding github cache for protobuf lib

* Adding JNI testing cookbooks

* Arrow jni dataset for version 7.0.0

* Arrow jni dataset for version 7.0.0

* Solving error: Failed to collect dependencies

* Solving error: Failed to collect dependencies

* Update java/source/dataset.rst

Co-authored-by: David Li <li.davidm96@gmail.com>
* Solving pr comments

* Update java/source/dataset.rst

Co-authored-by: David Li <li.davidm96@gmail.com>
* Update java/source/dataset.rst

Co-authored-by: David Li <li.davidm96@gmail.com>
* Update java/source/dataset.rst

Co-authored-by: David Li <li.davidm96@gmail.com>
* Update java/source/dataset.rst

Co-authored-by: David Li <li.davidm96@gmail.com>
* Solving pr comments

* Solving pr comments

* Solving pr comments

* Solving pr comments

Co-authored-by: David Li <li.davidm96@gmail.com>
4 months ago [Java] Java cookbook for create arrow read/write IPC format (#136)
david dali susanibar arce [Mon, 21 Feb 2022 14:50:20 +0000 (09:50 -0500)] 
[Java] Java cookbook for create arrow read/write IPC format (#136)

* Adding java cookbook for creating arrow read/write IPC format

* Adding java cookbook for creating arrow read/write IPC format

* Update java/source/io.rst

Co-authored-by: David Li <li.davidm96@gmail.com>
* Update java/source/io.rst

Co-authored-by: David Li <li.davidm96@gmail.com>
* Solving pr comments

* Solving pr comments

* Solving pr comments

* Arrow java io cookbook

* Solving pr comments

* Solving pr comments

* Solving pr comments

* Solving pr comments

* Solving pr comments

Co-authored-by: David Li <li.davidm96@gmail.com>
4 months ago [Python][Flight] Add an 'HTTP basic auth' example (#123)
David Li [Wed, 9 Feb 2022 16:58:27 +0000 (11:58 -0500)] 
[Python][Flight] Add an 'HTTP basic auth' example (#123)

4 months ago [Python] Make Flight tests idempotent (#145)
David Li [Wed, 9 Feb 2022 16:39:45 +0000 (11:39 -0500)] 
[Python] Make Flight tests idempotent (#145)

Use a temporary directory instead of the current working directory.
Also, remove calls to FlightServerBase.serve() (they're unnecessary
since the server is active as soon as it's created).
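
The temporary-directory approach described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the cookbook's actual test code: `write_dataset` and `run_recipe` are hypothetical names, and the file written is a placeholder for whatever a recipe produces.

```python
import os
import tempfile


def write_dataset(directory):
    """Hypothetical stand-in for a recipe step that writes a file."""
    path = os.path.join(directory, "example.txt")
    with open(path, "w") as f:
        f.write("hello")
    return path


def run_recipe():
    # Each run works in a fresh directory that is removed on exit,
    # so nothing is left behind in the current working directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_dataset(tmp)
        with open(path) as f:
            content = f.read()
        existed_during_run = os.path.exists(path)
    # After the "with" block the directory and its files are gone.
    return content, existed_during_run, os.path.exists(path)
```

Because no state survives between calls, running `run_recipe()` any number of times gives the same result, which is the property the commit refers to as idempotent.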

4 months ago [C++][Flight] Add basic Flight service (#124)
David Li [Wed, 9 Feb 2022 16:36:44 +0000 (11:36 -0500)] 
[C++][Flight] Add basic Flight service (#124)

4 months ago [Java]: Java cookbook for create arrow data manipulation (#135)
david dali susanibar arce [Tue, 8 Feb 2022 14:28:14 +0000 (09:28 -0500)] 
[Java]: Java cookbook for create arrow data manipulation (#135)

* Adding java cookbook for arrow data manipulation

* Adding java cookbook for creating arrow data manipulation

* Solving pr comments

Co-authored-by: David Li <li.davidm96@gmail.com>
4 months ago Dropped the googletest dependency down to 1.10. This lets us get it from conda....
Weston Pace [Tue, 8 Feb 2022 10:45:58 +0000 (00:45 -1000)] 
Dropped the googletest dependency down to 1.10.  This lets us get it from conda.  Upped the arrow version to 6.0.0.  Added UseAsync back in, as its removal was based on 7.0.0, which is a touch premature as 7.0.0 is not in conda yet. (#144)

4 months ago Removed call to UseAsync (#143)
Weston Pace [Fri, 4 Feb 2022 22:14:52 +0000 (12:14 -1000)] 
Removed call to UseAsync (#143)

In Arrow 7.0.0 the `arrow::dataset::ScannerBuilder::UseAsync` call was removed.

4 months ago [Java]: Java cookbook for create arrow object (#131)
david dali susanibar arce [Fri, 4 Feb 2022 16:52:59 +0000 (11:52 -0500)] 
[Java]: Java cookbook for create arrow object (#131)

* Adding java cookbook for creating arrow object

* Solving pr comments

* Solving pr comments

Co-authored-by: David Li <li.davidm96@gmail.com>
Co-authored-by: Alessandro Molina <amol@turbogears.org>
5 months ago [Java]: Java cookbook for create arrow schema (#134)
david dali susanibar arce [Thu, 27 Jan 2022 10:19:04 +0000 (05:19 -0500)] 
[Java]: Java cookbook for create arrow schema (#134)

* Adding java cookbook for creating arrow schema

* minor change to trigger ci

Co-authored-by: Alessandro Molina <amol@turbogears.org>
5 months ago [Java] Cookbook CI (#133)
Alessandro Molina [Tue, 25 Jan 2022 16:45:30 +0000 (17:45 +0100)] 
[Java] Cookbook CI (#133)

* Add Java cookbook testing

* Add Java cookbook deploy

5 months ago [Java]: Setup java arrow cookbook doc and testing baseline (#125)
david dali susanibar arce [Tue, 25 Jan 2022 14:25:27 +0000 (09:25 -0500)] 
[Java]: Setup java arrow cookbook doc and testing baseline (#125)

* Setup java arrow cookbook doc and testing baseline

* Update java/ext/javadoctest.py

Co-authored-by: David Li <li.davidm96@gmail.com>
* Adding support for jdk9+ that does not support dash in bash/pipe

Co-authored-by: David Li <li.davidm96@gmail.com>
5 months ago Update reading_and_writing_data.Rmd (#117)
Benjamin Wolfe [Fri, 21 Jan 2022 10:29:02 +0000 (04:29 -0600)] 
Update reading_and_writing_data.Rmd (#117)

Add blank line to format bulleted list correctly

5 months ago Fix Markdown/Jira-style links in README (#128)
David Li [Wed, 19 Jan 2022 19:42:22 +0000 (14:42 -0500)] 
Fix Markdown/Jira-style links in README (#128)

5 months ago Add missing C++ Cookbook link in CONTRIBUTING.md (#127)
David Li [Wed, 19 Jan 2022 17:27:05 +0000 (12:27 -0500)] 
Add missing C++ Cookbook link in CONTRIBUTING.md (#127)

5 months ago [Python] Add Flight streaming example (#109)
David Li [Wed, 19 Jan 2022 10:23:25 +0000 (05:23 -0500)] 
[Python] Add Flight streaming example (#109)

* [Python] Add Flight streaming example

Fixes #86

* [Python] Remove special dataset example

* [Python] Add back missing server shutdown

5 months ago Merge pull request #119 from apache/contributing-typos
Alessandro Molina [Wed, 19 Jan 2022 10:19:51 +0000 (11:19 +0100)] 
Merge pull request #119 from apache/contributing-typos

Add note in CONTRIBUTING.MD about typos

5 months ago [R] Fix hyperlink (#114)
Tony Fujs [Tue, 18 Jan 2022 14:24:06 +0000 (11:24 -0300)] 
[R] Fix hyperlink (#114)

* Fix hyperlink

* Update r/content/arrays.Rmd

* Update r/content/arrays.Rmd

Co-authored-by: David Li <li.davidm96@gmail.com>
Co-authored-by: Nic Crane <thisisnic@gmail.com>
Co-authored-by: David Li <li.davidm96@gmail.com>
5 months ago Swap all instances of Table$create for arrow_table (#112)
Nic Crane [Sun, 16 Jan 2022 18:00:13 +0000 (18:00 +0000)] 
Swap all instances of Table$create for arrow_table (#112)

5 months ago Add note in CONTRIBUTING.MD about typos contributing-typos 119/head
Nic Crane [Sun, 16 Jan 2022 17:57:48 +0000 (17:57 +0000)] 
Add note in CONTRIBUTING.MD about typos

5 months ago Describe recipes and differentiate from user guides (#118)
Will Jones [Sun, 16 Jan 2022 17:53:37 +0000 (09:53 -0800)] 
Describe recipes and differentiate from user guides (#118)

* Describe recipes and differentiate from user guides

* Update CONTRIBUTING.md

Co-authored-by: Nic Crane <thisisnic@gmail.com>
7 months ago Add link to the README (#102)
Nic Crane [Tue, 16 Nov 2021 15:09:12 +0000 (15:09 +0000)] 
Add link to the README (#102)

7 months ago Fix broken build and aesthetic improvements (#103)
Nic Crane [Wed, 3 Nov 2021 21:03:25 +0000 (21:03 +0000)] 
Fix broken build and aesthetic improvements (#103)

* Add arrow logo

* Add missing solution headings

* Shorten section titles

* Delete redundant content

* Add link to R docs

* Rename intro sections

7 months ago Aesthetic improvements to intro page (#101)
Nic Crane [Wed, 3 Nov 2021 12:49:32 +0000 (12:49 +0000)] 
Aesthetic improvements to intro page (#101)

* Add arrow logo

* Use includegraphics and add link to dplyr

7 months ago Update preface (#100)
Nic Crane [Wed, 3 Nov 2021 10:25:22 +0000 (10:25 +0000)] 
Update preface (#100)

7 months ago ARROW-13749: [Doc][Cookbook] Work with functions from other packages via dplyr bindin...
Nic Crane [Wed, 3 Nov 2021 10:06:05 +0000 (10:06 +0000)] 
ARROW-13749: [Doc][Cookbook] Work with functions from other packages via dplyr bindings - R (#95)

* Add content on using functions from other packages

* Add links to packages

7 months ago ARROW-13714: [Doc][Cookbook] Sharing data between R and Python - R (#99)
Nic Crane [Wed, 3 Nov 2021 09:55:47 +0000 (09:55 +0000)] 
ARROW-13714: [Doc][Cookbook] Sharing data between R and Python - R (#99)

* Initial file

* Add to bookdown file

* Add examples

* Add example of PyArrow functions

* Initial file

* Add to bookdown

* Add examples

* Add example of PyArrow functions

* Capitalisation

* Update dependencies

* Add pyarrow installation

* Consistent capitalisation

7 months ago ARROW-13752 [Doc][Cookbook] Searching for values matching a predicate in Arrays...
Nic Crane [Wed, 3 Nov 2021 09:54:45 +0000 (09:54 +0000)] 
ARROW-13752  [Doc][Cookbook] Searching for values matching a predicate in Arrays - R (#96)

* Add in extra recipes

* Rephrase to make more idiomatic

8 months ago ARROW-13713: [Doc][Cookbook] Reading and Writing Compressed Data - R (#91)
Nic Crane [Mon, 1 Nov 2021 05:43:49 +0000 (05:43 +0000)] 
ARROW-13713: [Doc][Cookbook] Reading and Writing Compressed Data - R (#91)

* Add initial recipes

* Add to bookdown

* Move "compressed data" content to the "read and write data" chapter

* Add "Solution" headings

* Write parquet not feather!

* Add .gz ending note

* Add note about defaults

* Add note in see also section

* Add comment about default compression

* Add to comment

8 months ago Update python tests for version 6.0.0 (#98)
Alessandro Molina [Thu, 28 Oct 2021 12:18:59 +0000 (14:18 +0200)] 
Update python tests for version 6.0.0 (#98)

* Update python tests for version 6.0.0

* Adapt to new partitioning convention

8 months ago [R] - Flight recipes (#90)
Nic Crane [Thu, 28 Oct 2021 09:23:46 +0000 (12:23 +0300)] 
[R] - Flight recipes (#90)

* Flight recipes

* Add code to make examples self-contained

* "discussion" -> "see also"

* In in solution headings

* Use sentence case and remove "ing" from first verb in title

* Mention it's a pyarrow thing

* Update r/content/flight.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
8 months ago ARROW-13712: Reading and Writing Compressed Data (#87)
Alessandro Molina [Thu, 28 Oct 2021 09:22:05 +0000 (11:22 +0200)] 
ARROW-13712: Reading and Writing Compressed Data (#87)

* ARROW-13712: Reading and Writing Compressed Data

* Apply suggestions from code review

Co-authored-by: Nic <thisisnic@gmail.com>
* Rewording

* Rewording

* Update python/source/io.rst

Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Nic <thisisnic@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
8 months ago Add a basic dataset reading example (#85)
Weston Pace [Wed, 27 Oct 2021 20:46:20 +0000 (10:46 -1000)] 
Add a basic dataset reading example (#85)

* Added a basic dataset reading example

* Remove cmake debugging

* Adding newline to end of common.cc

* Apply suggestions from code review

Co-authored-by: Nic Crane <thisisnic@gmail.com>
Co-authored-by: Nic Crane <thisisnic@gmail.com>
8 months ago [R] 93 - dplyr chapter feedback (#94)
Nic [Tue, 26 Oct 2021 10:47:52 +0000 (13:47 +0300)] 
[R] 93 - dplyr chapter feedback (#94)

* Fix bullet points

* Ensure it's obvious arrow is doing the work

* chunks

8 months ago ARROW-13730: Adding a column to an existing Table (#81)
Alessandro Molina [Thu, 21 Oct 2021 08:45:54 +0000 (10:45 +0200)] 
ARROW-13730: Adding a column to an existing Table (#81)

8 months ago ARROW-13732: [Doc][Cookbook] Manipulating and analyze Arrow data with dplyr verbs...
Nic [Thu, 21 Oct 2021 08:37:44 +0000 (09:37 +0100)] 
ARROW-13732: [Doc][Cookbook] Manipulating and analyze Arrow data with dplyr verbs - R (#78)

* Update chapter to follow problem/solution/discussion format

* Split data manipulation chapter into tables/arrays and add initial content

* Remove assignment and have simpler chains

* Shorten line

* Add content on using compute functions not implemented in the R package

* Remove the word tidyverse as it's inaccurate

* Add heading

* Entirely refactor

* Add comment with current missing content

* Actually use Arrow

* Add "what you should know" section and do loads of rephrasing and adding examples

* Add "what you should know before" section, and content on calling functions directly

* Add test

* Rename files

* Add tests to code chunks

* Add note about collect/create

* Fix bad link

* Update r/content/arrays.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Update r/content/arrays.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Update r/content/arrays.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Remove "where possible"

* Rephrase to clarify

* restyle

* More rephrasing

* Add some simpler intro content to the dplyr chapter

* Add tests for intro code

* Put dataset in Table$create() call

* Reduce whitespace

* Use arrow_table instead of Table$create

* Use Table$create for the moment while the next version isn't on CRAN yet

* Erroneous renaming

Co-authored-by: Weston Pace <weston.pace@gmail.com>
8 months ago ARROW-13710: Arrow Flight recipe (#84)
Alessandro Molina [Wed, 13 Oct 2021 12:03:52 +0000 (14:03 +0200)] 
ARROW-13710: Arrow Flight recipe (#84)

* Flight RPC

* Create datasets directory

* comment

* Ops, forgot to create repo for real

* Size can change depending on the system

* Apply suggestions from code review

Co-authored-by: David Li <li.davidm96@gmail.com>
* Address code review feedback

Co-authored-by: David Li <li.davidm96@gmail.com>
8 months ago ARROW-13753: Filtering Arrays for values matching a mask filter (#80)
Alessandro Molina [Fri, 8 Oct 2021 08:57:50 +0000 (10:57 +0200)] 
ARROW-13753: Filtering Arrays for values matching a mask filter (#80)

* filtering arrays recipe

* Wrong heading

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
8 months ago ARROW-13751: Recipe for searching for values matching a predicate (#79)
Alessandro Molina [Tue, 5 Oct 2021 09:31:32 +0000 (11:31 +0200)] 
ARROW-13751: Recipe for searching for values matching a predicate (#79)

* Recipe for searching for values matching a predicate

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Wrong heading

Co-authored-by: Weston Pace <weston.pace@gmail.com>
9 months ago [R] - Schemas recipes (#67)
Nic [Fri, 1 Oct 2021 17:27:35 +0000 (17:27 +0000)] 
[R] - Schemas recipes (#67)

* Add the creating schemas recipe

* Add in content on combining schemas, and specifying schemas when reading in files

* Delete unnecessary files, and stop showing test chunks

* Rephrase the bit about converting from R to Arrow

* Remove extraneous word

* Also mention reading in data

* Extra clarity

* missing word

* Add appendices

* Add section on casting, remove "problem" headings, update dataset, move tables to appendix, show example of incompatible data types

* Link between incompatible data types and appendix table

* Add content on combining schemas

* Rephrase

* Add context

* Reorder items in table

* Add recipe for schemas where match or don't match

* Rephrase

* Update code which causes an error to not run

* Relegate unify_schemas to discussion

* Fix rebase

* Remove appendix and link to vignette instead

* Remove examples of everything that could go wrong, as not relevant

* Fix failing test

9 months ago ARROW-13727: Recipe to concatenate two tables (#76)
Alessandro Molina [Thu, 30 Sep 2021 14:19:08 +0000 (16:19 +0200)] 
ARROW-13727: Recipe to concatenate two tables (#76)

* Recipe to concatenate two tables

* Apply suggestions from code review

Co-authored-by: Nic <thisisnic@gmail.com>
Co-authored-by: Nic <thisisnic@gmail.com>
9 months ago Unify schemas recipe (#75)
Alessandro Molina [Thu, 30 Sep 2021 12:43:42 +0000 (14:43 +0200)] 
Unify schemas recipe (#75)

9 months ago Add missing word (#77)
Nic [Mon, 20 Sep 2021 09:54:09 +0000 (09:54 +0000)] 
Add missing word (#77)

9 months ago Use as.data.frame instead of dplyr::collect (#71)
Nic [Wed, 15 Sep 2021 11:28:56 +0000 (11:28 +0000)] 
Use as.data.frame instead of dplyr::collect (#71)

* Use as.data.frame instead of dplyr::collect

* Typo

9 months ago ARROW-13716: Add RecordBatch recipe (#66)
Alessandro Molina [Wed, 15 Sep 2021 11:26:16 +0000 (13:26 +0200)] 
ARROW-13716: Add RecordBatch recipe (#66)

* Add RecordBatch recipe

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Make example obvious

* Apply suggestions from code review

Co-authored-by: Nic <thisisnic@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Nic <thisisnic@gmail.com>
9 months ago Specifying schemas for arrays and tables (#73)
Alessandro Molina [Wed, 15 Sep 2021 11:25:08 +0000 (13:25 +0200)] 
Specifying schemas for arrays and tables (#73)

9 months ago Adding anonymous flag to s3 (#70)
Tomek Drabas [Fri, 10 Sep 2021 23:46:18 +0000 (16:46 -0700)] 
Adding anonymous flag to s3 (#70)

* Adding anonymous flag to s3

* Fixing missing comma

* Info about s3 credentials

9 months ago Committers automatically have all the permissions that collaborators do so this is...
Weston Pace [Thu, 9 Sep 2021 19:04:56 +0000 (09:04 -1000)] 
Committers automatically have all the permissions that collaborators do so this is no longer needed (congrats). (#69)

9 months ago ARROW-13718: [Doc][Cookbook] Creating Arrays - R (#65)
Nic [Thu, 9 Sep 2021 16:38:34 +0000 (16:38 +0000)] 
ARROW-13718: [Doc][Cookbook] Creating Arrays - R (#65)

* Add recipe for creating an Array

9 months ago ARROW-13709: Reading JSON in R recipe (#64)
Nic [Thu, 9 Sep 2021 16:38:09 +0000 (16:38 +0000)] 
ARROW-13709: Reading JSON in R recipe (#64)

* Ensure that test chunks are not rendered

* Add code to delete any temporarily generated files, add recipe for reading JSON

* Rephrase

9 months ago ARROW-13717: Creating arrays recipe (#63)
Alessandro Molina [Tue, 7 Sep 2021 18:02:15 +0000 (20:02 +0200)] 
ARROW-13717: Creating arrays recipe (#63)

* Creating arrays recipe

* shorten pandas too for consistency

10 months ago Added clang-tools (which includes clang-tidy) to environment.yml (#61)
Weston Pace [Wed, 1 Sep 2021 21:25:06 +0000 (11:25 -1000)] 
Added clang-tools (which includes clang-tidy) to environment.yml (#61)

* Added clang-tools (which includes clang-tidy) to environment.yml

* Added libstdc++ to the conda environment

* Only add clang flags if the compiler is clang

10 months ago Creating tables recipe (#51)
Alessandro Molina [Wed, 1 Sep 2021 14:38:35 +0000 (16:38 +0200)] 
Creating tables recipe (#51)

https://issues.apache.org/jira/browse/ARROW-13715

10 months ago Recipe to read line delimited json as of ARROW-13708 (#49)
Alessandro Molina [Wed, 1 Sep 2021 14:37:54 +0000 (16:37 +0200)] 
Recipe to read line delimited json as of ARROW-13708 (#49)

* Recipe to read json

* rename pj to pa.json

* Add colon

10 months ago Added clang-format and clang-tidy files. Added clang-tidy to the build. (#54)
Weston Pace [Wed, 1 Sep 2021 02:18:26 +0000 (16:18 -1000)] 
Added clang-format and clang-tidy files.  Added clang-tidy to the build. (#54)

* Added clang-format and clang-tidy files.  Added clang-tidy to the build.

* Added copyright to .clang-tidy

* Wrapped duplicated cmake code into helper function.  Got rid of defunct lint target

10 months ago Add a workflow for running C++ tests on PRs (#53)
Weston Pace [Tue, 31 Aug 2021 18:35:56 +0000 (08:35 -1000)] 
Add a workflow for running C++ tests on PRs (#53)

10 months ago Update CSV recipe to use pyarrow.csv instead of pandas (#50)
Alessandro Molina [Tue, 31 Aug 2021 18:27:28 +0000 (20:27 +0200)] 
Update CSV recipe to use pyarrow.csv instead of pandas (#50)

* Switch CSV writing to the arrow provided one

* Add incremental recipe

* Update python/source/io.rst

Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
* Update python/source/io.rst

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Update python/source/io.rst

Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
10 months ago Writing Partitioned Datasets recipe for Python (#47)
Alessandro Molina [Tue, 24 Aug 2021 13:05:05 +0000 (15:05 +0200)] 
Writing Partitioned Datasets recipe for Python (#47)

10 months ago Initial C++ cookbook (#22)
Weston Pace [Mon, 23 Aug 2021 19:44:26 +0000 (09:44 -1000)] 
Initial C++ cookbook (#22)

* Initial C++ cookbook

* Addressing PR feedback.  Converted contributing doc from RST to MD (since the extension was MD).

* Update cpp/CONTRIBUTING.md

Co-authored-by: David Li <li.davidm96@gmail.com>
* Creating standalone section for code of conduct

* Update cpp/CONTRIBUTING.md

Co-authored-by: David Li <li.davidm96@gmail.com>
* Addressing PR comments

Co-authored-by: David Li <li.davidm96@gmail.com>
10 months ago [R] Remove unnecessary head()s (#41)
Neal Richardson [Sat, 21 Aug 2021 00:52:11 +0000 (20:52 -0400)] 
[R] Remove unnecessary head()s (#41)

* [R] Remove unnecessary head()s

* Update r/content/reading_and_writing_data.Rmd

10 months ago [R] Fix broken edit link (#39)
Neal Richardson [Thu, 19 Aug 2021 02:35:58 +0000 (22:35 -0400)] 
[R] Fix broken edit link (#39)

* [R] Fix broken edit link

* Update r/content/_bookdown.yml

Co-authored-by: Weston Pace <weston.pace@gmail.com>
10 months ago Explicit array creation (#2)
Nathanaël Leaute [Wed, 18 Aug 2021 04:18:46 +0000 (06:18 +0200)] 
Explicit array creation (#2)

10 months ago Copy .asf.yaml to asf-site branch (#38)
Weston Pace [Tue, 17 Aug 2021 13:24:02 +0000 (03:24 -1000)] 
Copy .asf.yaml to asf-site branch (#38)

10 months ago more debugging
Daniel Gruno [Mon, 16 Aug 2021 12:11:15 +0000 (14:11 +0200)] 
more debugging

10 months ago another whitespace trigger
Daniel Gruno [Mon, 16 Aug 2021 11:50:01 +0000 (13:50 +0200)] 
another whitespace trigger

10 months ago whitespace trigger for debugging .asf.yaml issue
Daniel Gruno [Mon, 16 Aug 2021 11:47:25 +0000 (13:47 +0200)] 
whitespace trigger for debugging .asf.yaml issue

10 months ago [asf infra] trigger resort of .asf.yaml
Daniel Gruno [Mon, 16 Aug 2021 11:40:19 +0000 (13:40 +0200)] 
[asf infra] trigger resort of .asf.yaml

10 months ago Temporarily removing collaborators ending in - (#37)
Weston Pace [Mon, 16 Aug 2021 11:19:15 +0000 (01:19 -1000)] 
Temporarily removing collaborators ending in - (#37)

ASF Infra has a regex to test if a username is a valid GH username.  GH does not allow usernames to end in hyphen.  However, they did allow this at one point and they grandfathered in some names.  ASF Infra is rejecting these grandfathered in names.
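
The kind of check described above can be illustrated with a short sketch. The pattern below is an assumption based on GitHub's publicly stated username rules (alphanumerics and single hyphens, no leading or trailing hyphen, at most 39 characters); it is not the regex ASF Infra actually uses, and `is_valid_github_username` is a hypothetical helper name.

```python
import re

# Hypothetical pattern, NOT the actual ASF Infra regex: one alphanumeric
# character, then up to 38 more characters, where a hyphen is only allowed
# when it is immediately followed by an alphanumeric character.
USERNAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9]|-(?=[A-Za-z0-9])){0,38}")


def is_valid_github_username(name: str) -> bool:
    """Return True if the whole string matches the assumed username rule."""
    return USERNAME_RE.fullmatch(name) is not None


print(is_valid_github_username("thisisnic"))  # True
print(is_valid_github_username("amol-"))      # False: trailing hyphen
```

Under a rule like this, a grandfathered name ending in a hyphen fails validation even though the account exists, which matches the situation the commit message describes.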

10 months ago Add Apache publishing / hosting (#30)
Weston Pace [Mon, 16 Aug 2021 10:54:24 +0000 (00:54 -1000)] 
Add Apache publishing / hosting (#30)

* Add Apache publishing / hosting

* Adding asf-site to the destinations we push to

10 months ago Fix C++ version (#36)
Nic [Fri, 13 Aug 2021 05:52:47 +0000 (05:52 +0000)] 
Fix C++ version (#36)

10 months ago Move tests into implementation-specific folders and workflows (#32)
Nic [Thu, 12 Aug 2021 22:28:43 +0000 (22:28 +0000)] 
Move tests into implementation-specific folders and workflows (#32)

* Move tests into implementation-specific folders and workflows

* Move workflows up a level so they run

* Improve naming

* Disallow multiple concurrent jobs

* Prevent concurrency

* Check concurrency change

10 months ago #13 install latest release (#20)
Nic [Wed, 11 Aug 2021 19:54:47 +0000 (19:54 +0000)] 
#13 install latest release (#20)

* install latest release

* Add Apache license, and simplify non-release builds for now

* Refactor and style code

* Fix typo

* Install binaries for dependencies too

* Run styler and update install_arrow_version logic to have better defaults

* Fix issue with NA var

* Less ambiguous CI

* Got mixed up with RSPM having Linux binaries

* Typofix

* For loops aren't illegal

* Add issue number to TODO

* test empty commit

10 months ago Changing the gh-pages deploy so that the gh-pages branch is always a single commit...
Weston Pace [Wed, 11 Aug 2021 19:53:59 +0000 (09:53 -1000)] 
Changing the gh-pages deploy so that the gh-pages branch is always a single commit instead of something that keeps history (#26)

10 months ago Setting up notifications via the ML of git activity (#31)
Weston Pace [Wed, 11 Aug 2021 19:53:39 +0000 (09:53 -1000)] 
Setting up notifications via the ML of git activity (#31)

10 months ago Adding thisisnic and amol- as collaborators (#14)
Weston Pace [Tue, 10 Aug 2021 02:09:17 +0000 (16:09 -1000)] 
Adding thisisnic and amol- as collaborators (#14)

10 months ago Created a .nojekyll file to tell github not to run jekyll on the gh-pages (#24)
Weston Pace [Sun, 8 Aug 2021 06:56:12 +0000 (20:56 -1000)] 
Created a .nojekyll file to tell github not to run jekyll on the gh-pages (#24)

10 months ago Fixing depend target in deploy workflow to updated name of build target
Weston Pace [Fri, 6 Aug 2021 22:49:10 +0000 (12:49 -1000)] 
Fixing depend target in deploy workflow to updated name of build target

10 months ago Also run CI on pull requests to the main branch (#21)
Nic [Fri, 6 Aug 2021 22:46:06 +0000 (22:46 +0000)] 
Also run CI on pull requests to the main branch (#21)

* Also run CI on pull requests to the main branch

* Separate out test/deploy scripts

* add tests back into deploy stage

* rename jobs for consistency

10 months ago #17 no pacman (#18)
Nic [Fri, 6 Aug 2021 21:35:04 +0000 (21:35 +0000)] 
#17 no pacman (#18)

* remove redundant dependencies

* another redundant dependency

10 months ago Prevent force pushes to main (#15)
Weston Pace [Fri, 6 Aug 2021 07:53:27 +0000 (21:53 -1000)] 
Prevent force pushes to main (#15)

10 months ago The build is dropping a lot of csv/feather/etc files which it's annoying to avoid...
Weston Pace [Fri, 6 Aug 2021 07:52:21 +0000 (21:52 -1000)] 
The build is dropping a lot of csv/feather/etc files which it's annoying to avoid so I'm adding ignore rules (#16)

10 months ago remove duplicated sections and add ASF license (#7)
Nic [Thu, 5 Aug 2021 21:42:28 +0000 (21:42 +0000)] 
remove duplicated sections and add ASF license (#7)

10 months ago Two sections were using the same name across two different Rmd files and that was...
Weston Pace [Thu, 5 Aug 2021 08:43:09 +0000 (22:43 -1000)] 
Two sections were using the same name across two different Rmd files and that was causing the combined Rmd to fail (#4)

* Two sections were using the same name across two different Rmd files and that was causing the combined Rmd to fail

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Nic <thisisnic@gmail.com>

10 months ago Enable issues & gh-pages branch on Github repo (#6)
Nic [Thu, 5 Aug 2021 07:49:11 +0000 (07:49 +0000)] 
Enable issues & gh-pages branch on Github repo (#6)

10 months ago Refactor GH Actions job that uses third party action (#3)
Nic [Thu, 5 Aug 2021 04:32:27 +0000 (04:32 +0000)] 
Refactor GH Actions job that uses third party action (#3)

* remove third-party GH actions

* re-enable some stages

11 months ago Trivial change to see if I can commit
Neal Richardson [Thu, 29 Jul 2021 17:05:47 +0000 (13:05 -0400)] 
Trivial change to see if I can commit

11 months ago Initial content for Arrow Cookbook for Python and R (#1)
Alessandro Molina [Wed, 28 Jul 2021 14:38:20 +0000 (16:38 +0200)] 
Initial content for Arrow Cookbook for Python and R (#1)

* Initial Import

* R cookbook initial commit (#1)

* R Cookbook skeleton and initial chapter

* Move r test script to a separate directory

* Add Apache 2 license

* Add parquet section

* Delete files used to demonstrate failing tests in CI

* Licensing

* Add content for different formats and rearrange headings

* Small change to make the tests run on macOS

* Completed the IO section and added intersphinx with PyArrow

* Add workflow to deploy to GH pages

* Update path

* Rename chapters and fill in section titles

* Commit whitespace to trigger build

* Update bookdown job

* try new job config

* Install nightly Arrow

* Evaluate all relevant bits!

* Deploy to r dir

* Try new workflow

* update build path

* Add email and update paths

* Update job to build all cookbooks

* Delete whitespace to trigger build

* Swap order to see if this fixes build

* Install system dependencies

* Put it back on Mac so it's faster

* Separate steps to diagnose issue

* Brew not sudo

* Switching to ubuntu as I don't understand why python 2

* Don't put results in r directory

* Capitalise 'C'

* Update bookdown link so can click to fork/edit

* Add CI stage that runs tests

* Add examples of manually creating Arrow objects and writing to various formats

* Add S3 parquet

* Partitioned data

* Partitioned Data from S3

* Rename record_batch_create chunk

* CSV recipe requires pandas

* Filter parquet data on read

* Reading/Writing feather files

* remove duplicated chunk name

* tweak create

* Categorical data

* Speed up compiling

* Fix tests

* tests pass

* Data manipulation functions

* Link to compute functions

* Tweak naming

* Add contribution file

* landing page style tweak

* Improve contribution documentation

* Explicitly reference the contribution docs

* ignore build directory

* Change branch name

* Update contents

* Update CONTRIBUTING.md

* Suggestions from Grammarly

* Rename initial chapter

* Update Makefile to allow Arrow version to be specified

* Truncate license file to relevant part

* typo

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Add link to code of conduct

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Capitalise "Array"

* Update r/CONTRIBUTING.md

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/creating_arrow_objects.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Mention dependencies

* Mention that this is not the documentation

* rewording

* Add -jauto by default and indent a print

* The Apache Software Foundation

* reword

* Correct ambiguous and incorrect phrasing

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Reorder sections

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Remove redundant code snippet

* Update reading CSVs

* Add in section on converting from/to Arrow Tables and tibbles

* rephrase list of numbers

* rephrase list of numbers

* Add missing bracket

* Rephrase about parquet containing multiple cols

* rephrased

* Adapt to Arrow 5.0 output

Co-authored-by: Nic <thisisnic@gmail.com>
Co-authored-by: Jonathan Keane <jkeane@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Ian Cook <ianmcook@gmail.com>

11 months ago Initial commit
Wes McKinney [Wed, 14 Jul 2021 21:42:28 +0000 (16:42 -0500)] 
Initial commit