arrow-cookbook.git
Nic [Fri, 13 Aug 2021 05:52:47 +0000 (05:52 +0000)] 
Fix C++ version (#36)

Nic [Thu, 12 Aug 2021 22:28:43 +0000 (22:28 +0000)] 
Move tests into implementation-specific folders and workflows (#32)

* Move tests into implementation-specific folders and workflows

* Move workflows up a level so they run

* Improve naming

* Disallow multiple concurrent jobs

* Prevent concurrency

* Check concurrency change

Nic [Wed, 11 Aug 2021 19:54:47 +0000 (19:54 +0000)] 
#13 install latest release (#20)

* install latest release

* Add Apache license, and simplify non-release builds for now

* Refactor and style code

* Fix typo

* Install binaries for dependencies too

* Run styler and update install_arrow_version logic to have better defaults

* Fix issue with NA var

* Less ambiguous CI

* Got mixed up with RSPM having Linux binaries

* Typofix

* For loops aren't illegal

* Add issue number to TODO

* test empty commit

Weston Pace [Wed, 11 Aug 2021 19:53:59 +0000 (09:53 -1000)] 
Changing the gh-pages deploy so that the gh-pages branch is always a single commit instead of something that keeps history (#26)

Weston Pace [Wed, 11 Aug 2021 19:53:39 +0000 (09:53 -1000)] 
Setting up notifications via the ML of git activity (#31)

Weston Pace [Tue, 10 Aug 2021 02:09:17 +0000 (16:09 -1000)] 
Adding thisisnic and amol- as collaborators (#14)

Weston Pace [Sun, 8 Aug 2021 06:56:12 +0000 (20:56 -1000)] 
Created a .nojekyll file to tell github not to run jekyll on the gh-pages (#24)

Weston Pace [Fri, 6 Aug 2021 22:49:10 +0000 (12:49 -1000)] 
Fixing depend target in deploy workflow to updated name of build target

Nic [Fri, 6 Aug 2021 22:46:06 +0000 (22:46 +0000)] 
Also run CI on pull requests to the main branch (#21)

* Also run CI on pull requests to the main branch

* Separate out test/deploy scripts

* add tests back into deploy stage

* rename jobs for consistency

Nic [Fri, 6 Aug 2021 21:35:04 +0000 (21:35 +0000)] 
#17 no pacman (#18)

* remove redundant dependencies

* another redundant dependency

Weston Pace [Fri, 6 Aug 2021 07:53:27 +0000 (21:53 -1000)] 
Prevent force pushes to main (#15)

Weston Pace [Fri, 6 Aug 2021 07:52:21 +0000 (21:52 -1000)] 
The build is dropping a lot of csv/feather/etc files which it's annoying to avoid so I'm adding ignore rules (#16)

Nic [Thu, 5 Aug 2021 21:42:28 +0000 (21:42 +0000)] 
remove duplicated sections and add ASF license (#7)

Weston Pace [Thu, 5 Aug 2021 08:43:09 +0000 (22:43 -1000)] 
Two sections were using the same name across two different Rmd files and that was causing the combined Rmd to fail (#4)

* Two sections were using the same name across two different Rmd files and that was causing the combined Rmd to fail

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Nic <thisisnic@gmail.com>

Nic [Thu, 5 Aug 2021 07:49:11 +0000 (07:49 +0000)]
Enable issues & gh-pages branch on Github repo (#6)

Nic [Thu, 5 Aug 2021 04:32:27 +0000 (04:32 +0000)] 
Refactor GH Actions job that uses third party action (#3)

* remove third-party GH actions

* re-enable some stages

Neal Richardson [Thu, 29 Jul 2021 17:05:47 +0000 (13:05 -0400)] 
Trivial change to see if I can commit

Alessandro Molina [Wed, 28 Jul 2021 14:38:20 +0000 (16:38 +0200)] 
Initial content for Arrow Cookbook for Python and R (#1)

* Initial Import

* R cookbook initial commit (#1)

* R Cookbook skeleton and initial chapter

* Move r test script to a separate directory

* Add Apache 2 license

* Add parquet section

* Delete files used to demonstrate failing tests in CI

* Licensing

* Add content for different formats and rearrange headings

* Small change to make the tests run on macOS

* Completed the IO section and added intersphinx with PyArrow

* Add workflow to deploy to GH pages

* Update path

* Rename chapters and fill in section titles

* Commit whitespace to trigger build

* Update bookdown job

* try new job config

* Install nightly Arrow

* Evaluate all relevant bits!

* Deploy to r dir

* Try new workflow

* update build path

* Add email and update paths

* Update job to build all cookbooks

* Delete whitespace to trigger build

* Swap order to see if this fixes build

* Install system dependencies

* Put it back on Mac so it's faster

* Separate steps to diagnose issue

* Brew not sudo

* Switching to ubuntu as I don't understand why python 2

* Don't put results in r directory

* Capitalise 'C'

* Update bookdown link so can click to fork/edit

* Add CI stage that runs tests

* Add examples of manually creating Arrow objects and writing to various formats

* Add S3 parquet

* Partitioned data

* Partitioned Data from S3

* Rename record_batch_create chunk

* CSV recipe requires pandas

* Filter parquet data on read

* Reading/Writing feather files

* remove duplicated chunk name

* tweak create

* Categorical data

* Speed up compiling

* Fix tests

* tests pass

* Data manipulation functions

* Link to compute functions

* Tweak naming

* Add contribution file

* landing page style tweak

* Improve contribution documentation

* Explicitly reference the contribution docs

* ignore build directory

* Change branch name

* Update contents

* Update CONTRIBUTING.md

* Suggestions from Grammarly

* Rename initial chapter

* Update Makefile to allow Arrow version to be specified

* Truncate license file to relevant part

* typo

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Add link to code of conduct

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Capitalise "Array"

* Update r/CONTRIBUTING.md

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/creating_arrow_objects.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Mention dependencies

* Mention that this is not the documentation

* rewording

* Add -jauto by default and indent a print

* The Apache Software Foundation

* reword

* Correct ambiguous and incorrect phrasing

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Reorder sections

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Remove redundant code snippet

* Update reading CSVs

* Add in section on converting from/to Arrow Tables and tibbles

* rephrase list of numbers

* rephrase list of numbers

* Add missing bracket

* Rephrase about parquet containing multiple cols

* rephrased

* Adapt to Arrow 5.0 output

Co-authored-by: Nic <thisisnic@gmail.com>
Co-authored-by: Jonathan Keane <jkeane@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Ian Cook <ianmcook@gmail.com>

Wes McKinney [Wed, 14 Jul 2021 21:42:28 +0000 (16:42 -0500)]
Initial commit