arrow-datafusion.git

Andrew Lamb [Wed, 18 May 2022 01:29:20 +0000 (21:29 -0400)]
fix `NULL <op> column` evaluation, tests for same (#2510) [master]

Andy Grove [Wed, 18 May 2022 01:16:48 +0000 (19:16 -0600)] 
Move expression utils from sql module to expr crate (#2553)

二手掉包工程师 [Tue, 17 May 2022 19:58:43 +0000 (03:58 +0800)] 
Fix some 404 links in the contribution guide (#2561)

Signed-off-by: hi-rustin <rustin.liu@gmail.com>

Jeremy Dyer [Tue, 17 May 2022 18:36:54 +0000 (14:36 -0400)]
Support for OFFSET in LogicalPlan (#2521)

* Introduce support for OFFSET

* lint fixes

* Slightly modify existing test to include LIMIT and OFFSET

* Uncomment accidental comment out for pre-commit script

* OFFSET should come before LIMIT

* Check for OFFSET <= 0 and add more tests

comphead [Tue, 17 May 2022 17:54:30 +0000 (10:54 -0700)] 
Fix Redundant ScalarValue Boxed Collection (#2523)

* unbox scalars

* fix conflicts

* fix clippy

* Update datafusion/common/src/scalar.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Update datafusion/common/src/scalar.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* boxing datatype

* fixing test

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

二手掉包工程师 [Tue, 17 May 2022 16:05:04 +0000 (00:05 +0800)]
Update datafusion-cli readme cli version (#2559)

Signed-off-by: hi-rustin <rustin.liu@gmail.com>

Andy Grove [Tue, 17 May 2022 11:07:08 +0000 (05:07 -0600)]
MINOR: Move `expr_rewriter.rs` to `datafusion-expr` crate (#2552)

* move expr_rewrite to expr crate

* Move expr_rewriter to expr crate

Andrew Lamb [Tue, 17 May 2022 11:03:43 +0000 (07:03 -0400)] 
Update to arrow-rs 14.0.0 (#2528)

* TEMP: Patch to use apache repo

* Update to arrow 14.0.0

* Consolidate to single OffsetSizeTrait

* Update for new API

* clippy

* moar clippy

* TEMP: patch datafusion cli

* fixup

* Update datafusion-cli deps

Andy Grove [Mon, 16 May 2022 22:12:17 +0000 (16:12 -0600)] 
Remove `scan_csv` methods from `LogicalPlanBuilder` (#2537)

Andy Grove [Mon, 16 May 2022 20:38:54 +0000 (14:38 -0600)] 
MINOR: Fix release packaging issues (#2545)

* fix release issues

* add newline at end of Cargo.toml

Andrew Lamb [Mon, 16 May 2022 17:32:56 +0000 (13:32 -0400)] 
Fix size_of_scalar test (#2531)

* Fix size_of_scalar test

* add comments

* Update tests

Andy Grove [Mon, 16 May 2022 15:00:33 +0000 (09:00 -0600)] 
Remove scan_avro methods from LogicalPlanBuilder (#2540)

Eduard Karacharov [Mon, 16 May 2022 12:23:30 +0000 (15:23 +0300)] 
split ON expressions only by AND operator (#2534)

Andy Grove [Mon, 16 May 2022 12:21:57 +0000 (06:21 -0600)] 
Remove scan_json methods from LogicalPlanBuilder (#2541)

Andy Grove [Mon, 16 May 2022 12:20:32 +0000 (06:20 -0600)] 
Remove `scan_parquet` methods from `LogicalPlanBuilder` (#2539)

* Remove scan_parquet methods from LogicalPlanBuilder

* simplify code

Andy Grove [Mon, 16 May 2022 12:19:03 +0000 (06:19 -0600)] 
MINOR: Move `ExprVisitable` and `exprlist_to_columns` to datafusion-expr crate (#2538)

* Move ExprVisitable to datafusion-expr crate

* move more utility methods

Raphael Taylor-Davies [Mon, 16 May 2022 09:39:49 +0000 (10:39 +0100)] 
Reduce duplication in file scan tests (#2533)

Jon Mease [Fri, 13 May 2022 18:28:51 +0000 (14:28 -0400)] 
Fix projection pushdown produces incorrect results when column names are reused (#2463)

* Add failing select_with_alias_overwrite test

* candidate fix

* reinstate optimization, but check that expressions are plain columns

Tim Van Wassenhove [Fri, 13 May 2022 17:25:25 +0000 (17:25 +0000)] 
ObjectStoreRegistry get_by_uri now returns correct path when "scheme" is provided (#2526)

* demonstrate and fix bug with get uri

* linting

Andy Grove [Fri, 13 May 2022 16:28:44 +0000 (10:28 -0600)] 
Add ORDER BY clause to test (#2524)

Andy Grove [Fri, 13 May 2022 11:06:22 +0000 (05:06 -0600)]
update verification script and add notes on signing keys (#2520) [8.0.0, 8.0.0-rc2, ballista-0.7.0]

Andy Grove [Thu, 12 May 2022 23:13:05 +0000 (17:13 -0600)]
Prepare for datafusion 8.0.0, ballista 0.7.0 release (#2490) [8.0.0-rc1]

* bump versions

* changelog 7.0.0 to 8.0.0

* update versions in docs

* revert changelog

* re-generate changelogs

* revert ballista changelog

* update version missed by script and update script

* regenerate ballista changelog with correct version

Andy Grove [Thu, 12 May 2022 14:47:57 +0000 (08:47 -0600)] 
Add new `ballista-cli` crate (#2495)

* Add new ballista-cli crate

* update dependency diagram

* re-use PrintFormat from datafusion-cli

* re-use PrintOptions from datafusion-cli

* re-use Helper

* re-use functions

* update dev scripts

* update diagram and docs

* stop building DataFusion CLI with ballista in CI

* update user guide

* docs for building ballista-cli with docker

* make version numbers consistent with repo

* update Cargo.lock files for CLIs and add ballista-cli to GitHub workflow

* disable ballista tests

* fix ci

* fix merge conflict

* fix

Matthew Turner [Wed, 11 May 2022 16:03:27 +0000 (12:03 -0400)] 
Add `CREATE VIEW` (#2279)

* Initial commit

* First passing test

* Add OR REPLACE and more tests

* Update doc comment

* More tests

* Add CreateView to Ballista

* Include Q15 for TPCH

* Ignore q15

* Delete view physical plan

Andrew Lamb [Wed, 11 May 2022 14:36:44 +0000 (10:36 -0400)] 
Minor: remove code that is now in arrow (#2511)

Yang Jiang [Wed, 11 May 2022 12:38:07 +0000 (20:38 +0800)] 
Add metrics for ParquetExec (#2499)

* Add metrics for ParquetExec

* fix row_count

Andy Grove [Wed, 11 May 2022 02:35:42 +0000 (20:35 -0600)] 
Enable multi-statement benchmark queries (#2507)

Andy Grove [Wed, 11 May 2022 02:30:51 +0000 (20:30 -0600)] 
MINOR: Add ignored tests for all remaining benchmark queries (#2506)

Andy Grove [Wed, 11 May 2022 02:30:22 +0000 (20:30 -0600)] 
Move `data-access` crate into `datafusion` directory (#2479)

* move data-access crate

* update references

DuRipeng [Wed, 11 May 2022 01:47:08 +0000 (09:47 +0800)] 
Numeric, String, Boolean comparisons with literal NULL (#2481)

Zhang Li [Wed, 11 May 2022 01:46:49 +0000 (09:46 +0800)] 
Optimize MergeJoin by storing joined indices instead of creating small record batches for each match (#2492)

* optimize smj

* fix timer not working problem

* implement smj's fmt_as() and relies_on_input_order()

* add comments

* add join_type checking in freeze methods

Co-authored-by: zhangli20 <zhangli20@kuaishou.com>

Dan Harris [Tue, 10 May 2022 20:54:39 +0000 (23:54 +0300)]
[Ballista] Persist session configs in scheduler (#2501)

Andy Grove [Tue, 10 May 2022 20:53:45 +0000 (14:53 -0600)] 
Add SQL planner support for `grouping()` aggregate expressions (#2486)

* Add SQL planner support for grouping() aggregate function

* complex test with rank and partition

* fix window aggregate case

* code cleanup

Andy Grove [Tue, 10 May 2022 20:01:13 +0000 (14:01 -0600)] 
Add SQL planner support for calling `round` function with two arguments (#2503)

* test

* specify correct signature for ROUND function

Andrew Lamb [Tue, 10 May 2022 19:24:15 +0000 (15:24 -0400)] 
Update to `sqlparser` `0.17.0` (#2500)

* fix: Update parser for indexed field

* Update to sqlparser 0.17

* Update more cargo

* Fix test to use string key rather than identifier

Andy Grove [Mon, 9 May 2022 20:34:16 +0000 (14:34 -0600)] 
Limit cpu cores used when generating changelog (#2494)

Will Jones [Mon, 9 May 2022 18:03:35 +0000 (11:03 -0700)] 
Add support for list_dir() on local fs (#2467)

* Add support for list_dir on local fs

* Format

* Print invalid values in errors

* Update data-access/src/object_store/local.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Use ok_or_else

* Format

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Andy Grove [Mon, 9 May 2022 14:07:45 +0000 (08:07 -0600)]
Fix bug in ballista version upgrade script (#2488)

Andy Grove [Mon, 9 May 2022 12:57:12 +0000 (06:57 -0600)] 
Add SQL planner support for `ROLLUP` and `CUBE` grouping set expressions (#2446)

* Add SQL planner support for ROLLUP and CUBE grouping sets

* prep for review

* fix more todo comments

* code cleanup

* clippy

* fmt and clippy

* revert change

* clippy

Rich [Sun, 8 May 2022 17:12:02 +0000 (13:12 -0400)] 
MINOR: Parameterize changelog script (#2484)

* modify script to respect release branch

* Separate usage and example

* Update dev/release/README.md

Co-authored-by: Andy Grove <andygrove73@gmail.com>

Yang Jiang [Sun, 8 May 2022 15:23:00 +0000 (23:23 +0800)]
Support struct_expr generate struct in sql (#2389)

DuRipeng [Sun, 8 May 2022 13:03:35 +0000 (21:03 +0800)] 
add unit tests of mathematical expressions with null (#2478)

Yijie Shen [Sat, 7 May 2022 16:00:31 +0000 (00:00 +0800)] 
Grouped Aggregate in row format (#2375)

* first move: re-group aggregates functionalities in core/physical_p/aggregates

* basic accumulators

* main updating procedure

* output as record batch

* aggregate with row state

* make row non-optional

* address comments, add docs, part fix #2455

* Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Andrew Lamb [Fri, 6 May 2022 19:49:28 +0000 (15:49 -0400)]
Minor: Move test code from `context.rs` into `sql_integration` (#2473)

Dan Harris [Fri, 6 May 2022 19:49:06 +0000 (15:49 -0400)] 
Fix stage key extraction (#2472)

Yijie Shen [Fri, 6 May 2022 18:07:30 +0000 (02:07 +0800)] 
minor: remove expr dependency from the row crate, update crate-deps.dot/svg (#2470)

Andrew Lamb [Fri, 6 May 2022 18:06:45 +0000 (14:06 -0400)] 
Minor: Use ExprVisitor to find columns referenced by expr (#2471)

Dmitry Patsura [Fri, 6 May 2022 18:06:29 +0000 (21:06 +0300)] 
feat: Support CompoundIdentifier access as GetIndexedField (#2454)

DuRipeng [Fri, 6 May 2022 17:24:46 +0000 (01:24 +0800)] 
Add proper support for `null` literal by introducing `ScalarValue::Null` (#2364)

* introduce null

* fix fmt

Tim Van Wassenhove [Fri, 6 May 2022 17:21:23 +0000 (17:21 +0000)] 
Fix `read_from_registered_table_with_glob_path` fails if path contains // #2465 (#2468)

* reproduce and fix issue

* fix reference reported by clippy

Andy Grove [Fri, 6 May 2022 16:48:35 +0000 (10:48 -0600)] 
Fix bugs in SQL planner with GROUP BY scalar function and alias (#2457)

Andy Grove [Fri, 6 May 2022 15:13:05 +0000 (09:13 -0600)] 
MINOR: Partial fix for SQL aggregate queries with aliases (#2464)

Tim Van Wassenhove [Fri, 6 May 2022 15:11:45 +0000 (15:11 +0000)] 
Issue 2321: Add code formatting instructions to CONTRIBUTING (#2444)

* enhance documentation on how to format/check your code

* Update dev/format-code.sh

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* Update dev/format-code.sh

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* attempt to share lint infra

* corrected invalid /

* mention scripts in formatting instructions

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

DuRipeng [Fri, 6 May 2022 14:52:57 +0000 (22:52 +0800)]
minor: move struct definition out of `aggregate/mod.rs`, etc (#2458)

Jeremy Dyer [Thu, 5 May 2022 19:22:02 +0000 (15:22 -0400)] 
Table provider error propagation (#2438)

* Change return type of get_table_provider from Option<T> to Result<T>

* cargo fmt changes

* Update another error location

* Update datafusion/core/src/execution/context.rs

Co-authored-by: Andy Grove <andygrove73@gmail.com>

* Update datafusion/core/src/execution/context.rs

Co-authored-by: Andy Grove <andygrove73@gmail.com>

* Update error messages now that errors are propagated

* linter updates

* Remove commented out code that was left by mistake

* Cargo fmt

Co-authored-by: Andy Grove <andygrove73@gmail.com>

Tim Van Wassenhove [Thu, 5 May 2022 06:33:37 +0000 (06:33 +0000)]
Issue 2393: Support glob patterns for files (#2394)

* implement globbing on ObjectStore

* remove unused code

* update list_file_with_suffix to use glob_file

* reworked code such that glob_file matches list_file and glob_file_with_suffix list_file_with_suffix

* rework the way we figure out what the greatest common base path is

* refactor tests on longested_search_path_without_glob_pattern

* added comment on / value

* remove unused use stmt

* rework implementation to find largest common path

* revert accidental/temp changes

* added tests to verify globbing

* find inspiration in glob crate to better deal with windows

* when running on windows, the expected path is slightly different (\ instead of /).

* fixed clippy issue

* added section on checks that are executed during a PR build

* updated section (and script) to make explicit this is about formatting

* replace with simple break

* make filter_suffix not-async as it does not need to be async

* no need to collect

* attempt to make tests more understandable

* actually format the code instead of only verifying

* added test with ** as glob pattern as well

* remove changes related to code formatting

* remove unneeded empty line

* run cargo fmt

* Update data-access/src/object_store/mod.rs

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* use try_filter as suggested in pr review

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

George Andronchik [Wed, 4 May 2022 16:09:47 +0000 (00:09 +0800)]
fix: union schema (#2334)

Raphael Taylor-Davies [Wed, 4 May 2022 16:09:03 +0000 (17:09 +0100)] 
Make ExecutionPlan sync (#2307) (#2434)

Andy Grove [Wed, 4 May 2022 15:31:37 +0000 (09:31 -0600)] 
Fix ballista integration tests (#2441)

* make dev/build-ballista-docker.sh executable

* update Dockerfile

Raphael Taylor-Davies [Wed, 4 May 2022 14:12:53 +0000 (15:12 +0100)] 
Fix Ballista executing during plan (#2428)

Raphael Taylor-Davies [Wed, 4 May 2022 14:12:16 +0000 (15:12 +0100)] 
Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199) (#2226)

* Morsel-driven Parallelism using rayon (#2199)

* Fix LIFO spawn ordering

* Further docs for ExecutionPipeline

* Deduplicate concurrent wakes

* Add license headers

* Sort Cargo.toml

* Revert accidental change to ParquetExec

* Handle wakeups triggered by other threads

* Use SeqCst memory ordering

* Review feedback

* Add panic handler

* Cleanup structs

Add test of tokio interoperation

* Review feedback

* Use BatchPartitioner

Cleanup error handling

* Clarify shutdown characteristics

* Fix racy test_panic

* Don't overload Query nomenclature

* Rename QueryResults to ExecutionResults

* Further review feedback

* Merge scheduler into datafusion/core

* Review feedback

* Fix partitioned execution

* Format

* Format Cargo.toml

* Fix doc link

Eduard Karacharov [Wed, 4 May 2022 13:52:16 +0000 (16:52 +0300)] 
Basic support for `IN` and `NOT IN` Subqueries by rewriting them to `SEMI` / `ANTI` (#2421)

* naive in subquery implementation

* 16 and 18 tpch queries enabled in benchmark

* rollback rewriting instead of fail

* try_fold used for input plan rewriting

* test readability & negative test cases

Andrew Lamb [Wed, 4 May 2022 01:36:36 +0000 (21:36 -0400)] 
Upgrade to arrow 13 (#2382)

* Update to use arrow 13

* Updates for API change

* Update Cargo.lock for datafusion cli

* fix clippy

Andy Grove [Wed, 4 May 2022 01:31:55 +0000 (19:31 -0600)] 
MINOR: Make crate READMEs consistent (#2437)

DuRipeng [Wed, 4 May 2022 01:11:23 +0000 (09:11 +0800)] 
`sum(distinct)` support (#2405)

* sum(distinct) support

* fix clippy

* merge state() code logic

* revise annotation

* remove u64->i64 coercion

Andy Grove [Wed, 4 May 2022 00:41:37 +0000 (18:41 -0600)] 
minor: update versions and paths in changelog scripts (#2429)

* update versions and paths in changelog scripts

* Update release process README

* fix crate publish order

* add ASL header for dot file

Andy Grove [Tue, 3 May 2022 23:01:59 +0000 (17:01 -0600)] 
Improve error messages (#2435)

Andy Grove [Tue, 3 May 2022 17:59:47 +0000 (11:59 -0600)] 
Fix bug in subquery join filters referencing outer query (#2416)

jakevin [Tue, 3 May 2022 17:17:50 +0000 (01:17 +0800)] 
minor: remove redundant code (#2432)

Andy Grove [Tue, 3 May 2022 13:38:40 +0000 (07:38 -0600)] 
Update `update_datafusion_versions.py` to include all crates (#2423)

* Update path to core crate in script

* update script to update deps for all df crates

* make script smarter

* also update ballista dependencies on datafusion

* do not bump data-access version

* include data-access

DuRipeng [Tue, 3 May 2022 11:58:37 +0000 (19:58 +0800)] 
minor: format table result vec & remove unnecessary semicolon (#2425)

Hao Xin [Mon, 2 May 2022 23:37:45 +0000 (07:37 +0800)] 
docs: Update the Ballista dev env instructions (#2419)

comphead [Mon, 2 May 2022 19:47:18 +0000 (12:47 -0700)] 
nested query fix (#2402)

Andy Grove [Mon, 2 May 2022 19:46:27 +0000 (13:46 -0600)] 
Allow subqueries without aliases (#2418)

DuRipeng [Mon, 2 May 2022 15:30:26 +0000 (23:30 +0800)] 
remove duplicated function format_state_name() (#2414)

DuRipeng [Mon, 2 May 2022 14:39:05 +0000 (22:39 +0800)] 
make expected result string in ut more readable (#2413)

dependabot[bot] [Mon, 2 May 2022 09:26:57 +0000 (17:26 +0800)] 
Update ordered-float requirement from 2.10 to 3.0 (#2403)

Updates the requirements on [ordered-float](https://github.com/reem/rust-ordered-float) to permit the latest version.
- [Release notes](https://github.com/reem/rust-ordered-float/releases)
- [Commits](https://github.com/reem/rust-ordered-float/compare/v2.10.0...v3.0.0)

---
updated-dependencies:
- dependency-name: ordered-float
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Yijie Shen [Mon, 2 May 2022 00:41:14 +0000 (08:41 +0800)]
Re-organize and rename aggregates physical plan (#2388)

* first move: re-group aggregates functionalities in core/physical_p/aggregates

* address review comments feedback

* naming

DuRipeng [Sun, 1 May 2022 23:21:58 +0000 (07:21 +0800)] 
remove duplicated aggregate() (#2400)

Hao Xin [Sun, 1 May 2022 23:21:37 +0000 (07:21 +0800)] 
docs: update the renamed project Flock (Squirtle) (#2401)

Andy Grove [Sun, 1 May 2022 15:54:21 +0000 (09:54 -0600)] 
minor: SchemaError code cleanup and improvements (#2391)

Andy Grove [Sun, 1 May 2022 02:17:33 +0000 (20:17 -0600)] 
Allow CTEs to be referenced from subquery expressions (#2384)

comphead [Sun, 1 May 2022 00:26:23 +0000 (17:26 -0700)] 
Support type-coercion from Decimal to Float64 (#2396)

Andy Grove [Sun, 1 May 2022 00:26:05 +0000 (18:26 -0600)] 
Implement physical planner support for DATE +/- INTERVAL (#2357)

DuRipeng [Sat, 30 Apr 2022 13:57:04 +0000 (21:57 +0800)] 
refactor distinct_expressions.rs (#2386)

Andy Grove [Fri, 29 Apr 2022 19:35:35 +0000 (13:35 -0600)] 
Fix bugs with CTE aliasing and normalize all identifiers in the SQL planner (#2373)

Andy Grove [Fri, 29 Apr 2022 15:12:14 +0000 (09:12 -0600)] 
Stop optimizing queries twice (#2369)

Andrew Lamb [Fri, 29 Apr 2022 00:45:07 +0000 (20:45 -0400)] 
Add `Expr` to prelude (#2348)

* Add `Expr` to prelude

* cleanup

Dmitry Patsura [Thu, 28 Apr 2022 20:58:35 +0000 (23:58 +0300)] 
feat: Support casting to array of primitive type (#2366)

Andy Grove [Thu, 28 Apr 2022 20:38:27 +0000 (14:38 -0600)] 
Introduce new `DataFusionError::SchemaError` type (#2371)

Andy Grove [Thu, 28 Apr 2022 19:01:06 +0000 (13:01 -0600)] 
Improve documentation for DFSchema combine and merge functions (#2367)

Andy Grove [Thu, 28 Apr 2022 18:29:28 +0000 (12:29 -0600)] 
minor: fix duplicate column bug in subquery support (#2362)

Eduard Karacharov [Thu, 28 Apr 2022 17:49:33 +0000 (20:49 +0300)] 
rewrite approx_median to approx_percentile_cont while planning phase (#2262)

comphead [Thu, 28 Apr 2022 12:53:33 +0000 (05:53 -0700)] 
Implementing math power function for SQL (#2324)

* Implementing POWER function

* Delete pv.yaml

* Delete build-ballista-docker.sh

* Delete ballista.dockerfile

* aligning with latest upstream changes

* Readding docker files

* Formatting

* Leaving only 64bit types

* Adding tests, remove type conversion

* fix for cast

* Update functions.rs

Andy Grove [Wed, 27 Apr 2022 20:54:04 +0000 (14:54 -0600)] 
Add SQL query planner support for scalar subqueries (#2354)

Andy Grove [Wed, 27 Apr 2022 19:51:33 +0000 (13:51 -0600)] 
Add SQL query planner support for IN subqueries (#2352)

Andy Grove [Wed, 27 Apr 2022 19:43:42 +0000 (13:43 -0600)] 
normalize subquery alias (#2359)

Andy Grove [Wed, 27 Apr 2022 10:51:39 +0000 (04:51 -0600)] 
Add SQL planner support for EXISTS subqueries (#2344)

* Add SQL planner support for EXISTS subqueries

* update comments

* improve formatting in test and rename outer_schema to outer_query_schema

* improve formatting in test

Andrew Lamb [Wed, 27 Apr 2022 10:29:52 +0000 (06:29 -0400)] 
Add public Serialization/Deserialization API for `Expr` to/from bytes (#2341)

* Add public Serialization/Deserialization API to/from bytes

* cleanup

* Remove Expr from prelude

* RAT, add Bytes API

* Rename to `bytes`

* Break out registry into its own module

* Add more tests

Andy Grove [Tue, 26 Apr 2022 22:53:43 +0000 (16:53 -0600)] 
Add `Expr::InSubquery` and `Expr::ScalarSubquery` (#2342)