arrow-cookbook.git
Neal Richardson [Thu, 19 Aug 2021 14:49:11 +0000 (10:49 -0400)] 
Update r/content/reading_and_writing_data.Rmd (nealrichardson-patch-2, 41/head)

Neal Richardson [Wed, 18 Aug 2021 12:38:45 +0000 (08:38 -0400)] 
[R] Remove unnecessary head()s

Nathanaël Leaute [Wed, 18 Aug 2021 04:18:46 +0000 (06:18 +0200)] 
Explicit array creation (#2)

Weston Pace [Tue, 17 Aug 2021 13:24:02 +0000 (03:24 -1000)] 
Copy .asf.yaml to asf-site branch (#38)

Daniel Gruno [Mon, 16 Aug 2021 12:11:15 +0000 (14:11 +0200)] 
more debugging

Daniel Gruno [Mon, 16 Aug 2021 11:50:01 +0000 (13:50 +0200)] 
another whitespace trigger

Daniel Gruno [Mon, 16 Aug 2021 11:47:25 +0000 (13:47 +0200)] 
whitespace trigger for debugging .asf.yaml issue

Daniel Gruno [Mon, 16 Aug 2021 11:40:19 +0000 (13:40 +0200)] 
[asf infra] trigger resort of .asf.yaml

Weston Pace [Mon, 16 Aug 2021 11:19:15 +0000 (01:19 -1000)] 
Temporarily removing collaborators ending in - (#37)

ASF Infra has a regex to test whether a username is a valid GH username. GH does not allow usernames to end in a hyphen. However, it did allow this at one point and grandfathered in some names. ASF Infra is rejecting these grandfathered-in names.
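
The mismatch described above can be sketched as follows. This is a minimal illustration only: the pattern and the `is_valid_username` helper are hypothetical, not ASF Infra's actual regex. It assumes a strict rule of alphanumerics separated by single hyphens, which rejects a grandfathered name such as `amol-`.

```python
import re

# Hypothetical pattern (NOT ASF Infra's actual regex): runs of alphanumerics
# separated by single hyphens, so a name cannot start or end with a hyphen.
STRICT_USERNAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def is_valid_username(name: str) -> bool:
    """Return True if `name` passes the strict check sketched above."""
    return STRICT_USERNAME.fullmatch(name) is not None


# `amol-` ends in a hyphen, so the strict check rejects it even though
# the account exists.
for name in ["thisisnic", "amol-"]:
    print(name, is_valid_username(name))
```

A validator like this has no way to distinguish a grandfathered account from a malformed name, which is consistent with the commit's workaround of removing the affected collaborators for now.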

Weston Pace [Mon, 16 Aug 2021 10:54:24 +0000 (00:54 -1000)] 
Add Apache publishing / hosting (#30)

* Add Apache publishing / hosting

* Adding asf-site to the destinations we push to

Nic [Fri, 13 Aug 2021 05:52:47 +0000 (05:52 +0000)] 
Fix C++ version (#36)

Nic [Thu, 12 Aug 2021 22:28:43 +0000 (22:28 +0000)] 
Move tests into implementation-specific folders and workflows (#32)

* Move tests into implementation-specific folders and workflows

* Move workflows up a level so they run

* Improve naming

* Disallow multiple concurrent jobs

* Prevent concurrency

* Check concurrency change

Nic [Wed, 11 Aug 2021 19:54:47 +0000 (19:54 +0000)] 
#13 install latest release (#20)

* install latest release

* Add Apache license, and simplify non-release builds for now

* Refactor and style code

* Fix typo

* Install binaries for dependencies too

* Run styler and update install_arrow_version logic to have better defaults

* Fix issue with NA var

* Less ambiguous CI

* Got mixed up with RSPM having Linux binaries

* Typofix

* For loops aren't illegal

* Add issue number to TODO

* test empty commit

Weston Pace [Wed, 11 Aug 2021 19:53:59 +0000 (09:53 -1000)] 
Changing the gh-pages deploy so that the gh-pages branch is always a single commit instead of something that keeps history (#26)

Weston Pace [Wed, 11 Aug 2021 19:53:39 +0000 (09:53 -1000)] 
Setting up notifications via the ML of git activity (#31)

Weston Pace [Tue, 10 Aug 2021 02:09:17 +0000 (16:09 -1000)] 
Adding thisisnic and amol- as collaborators (#14)

Weston Pace [Sun, 8 Aug 2021 06:56:12 +0000 (20:56 -1000)] 
Created a .nojekyll file to tell github not to run jekyll on the gh-pages (#24)

Weston Pace [Fri, 6 Aug 2021 22:49:10 +0000 (12:49 -1000)] 
Fixing depend target in deploy workflow to updated name of build target

Nic [Fri, 6 Aug 2021 22:46:06 +0000 (22:46 +0000)] 
Also run CI on pull requests to the main branch (#21)

* Also run CI on pull requests to the main branch

* Separate out test/deploy scripts

* add tests back into deploy stage

* rename jobs for consistency

Nic [Fri, 6 Aug 2021 21:35:04 +0000 (21:35 +0000)] 
#17 no pacman (#18)

* remove redundant dependencies

* another redundant dependency

Weston Pace [Fri, 6 Aug 2021 07:53:27 +0000 (21:53 -1000)] 
Prevent force pushes to main (#15)

Weston Pace [Fri, 6 Aug 2021 07:52:21 +0000 (21:52 -1000)] 
The build is dropping a lot of csv/feather/etc files which it's annoying to avoid so I'm adding ignore rules (#16)

Nic [Thu, 5 Aug 2021 21:42:28 +0000 (21:42 +0000)] 
remove duplicated sections and add ASF license (#7)

Weston Pace [Thu, 5 Aug 2021 08:43:09 +0000 (22:43 -1000)] 
Two sections were using the same name across two different Rmd files and that was causing the combined Rmd to fail (#4)

* Two sections were using the same name across two different Rmd files and that was causing the combined Rmd to fail

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Nic <thisisnic@gmail.com>

Nic [Thu, 5 Aug 2021 07:49:11 +0000 (07:49 +0000)] 
Enable issues & gh-pages branch on Github repo (#6)

Nic [Thu, 5 Aug 2021 04:32:27 +0000 (04:32 +0000)] 
Refactor GH Actions job that uses third party action (#3)

* remove third-party GH actions

* re-enable some stages

Neal Richardson [Thu, 29 Jul 2021 17:05:47 +0000 (13:05 -0400)] 
Trivial change to see if I can commit

Alessandro Molina [Wed, 28 Jul 2021 14:38:20 +0000 (16:38 +0200)] 
Initial content for Arrow Cookbook for Python and R (#1)

* Initial Import

* R cookbook initial commit (#1)

* R Cookbook skeleton and initial chapter

* Move r test script to a separate directory

* Add Apache 2 license

* Add parquet section

* Delete files used to demonstrate failing tests in CI

* Licensing

* Add content for different formats and rearrange headings

* Small change to make the tests run on macOS

* Completed the IO section and added intersphinx with PyArrow

* Add workflow to deploy to GH pages

* Update path

* Rename chapters and fill in section titles

* Commit whitespace to trigger build

* Update bookdown job

* try new job config

* Install nightly Arrow

* Evaluate all relevant bits!

* Deploy to r dir

* Try new workflow

* update build path

* Add email and update paths

* Update job to build all cookbooks

* Delete whitespace to trigger build

* Swap order to see if this fixes build

* Install system dependencies

* Put it back on Mac so it's faster

* Separate steps to diagnose issue

* Brew not sudo

* Switching to ubuntu as I don't understand why python 2

* Don't put results in r directory

* Capitalise 'C'

* Update bookdown link so can click to fork/edit

* Add CI stage that runs tests

* Add examples of manually creating Arrow objects and writing to various formats

* Add S3 parquet

* Partitioned data

* Partitioned Data from S3

* Rename record_batch_create chunk

* CSV recipe requires pandas

* Filter parquet data on read

* Reading/Writing feather files

* remove duplicated chunk name

* tweak create

* Categorical data

* Speed up compiling

* Fix tests

* tests pass

* Data manipulation functions

* Link to compute functions

* Tweak naming

* Add contribution file

* landing page style tweak

* Improve contribution documentation

* Explicitly reference the contribution docs

* ignore build directory

* Change branch name

* Update contents

* Update CONTRIBUTING.md

* Suggestions from Grammarly

* Rename initial chapter

* Update Makefile to allow Arrow version to be specified

* Truncate license file to relevant part

* typo

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Add link to code of conduct

Co-authored-by: Ian Cook <ianmcook@gmail.com>
* Capitalise "Array"

* Update r/CONTRIBUTING.md

Co-authored-by: Ian Cook <ianmcook@gmail.com>
* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Update r/content/creating_arrow_objects.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>
* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>
* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>
* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Ian Cook <ianmcook@gmail.com>
* Mention dependencies

* Mention that this is not the documentation

* rewording

* Add -jauto by default and indent a print

* The Apache Software Foundation

* reword

* Correct ambiguous and incorrect phrasing

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>
* Reorder sections

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>
* Remove redundant code snippet

* Update reading CSVs

* Add in section on converting from/to Arrow Tables and tibbles

* rephrase list of numbers

* rephrase list of numbers

* Add missing bracket

* Rephrase about parquet containing multiple cols

* rephrased

* Adapt to Arrow 5.0 output

Co-authored-by: Nic <thisisnic@gmail.com>
Co-authored-by: Jonathan Keane <jkeane@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Ian Cook <ianmcook@gmail.com>

Wes McKinney [Wed, 14 Jul 2021 21:42:28 +0000 (16:42 -0500)] 
Initial commit