[CARBONDATA-4330] Incremental Dataload of Average aggregate in MV
author ShreelekhyaG <shreelu_gampa@yahoo.com>
Mon, 24 Jan 2022 09:47:48 +0000 (15:17 +0530)
committer Indhumathi27 <indhumathim27@gmail.com>
Thu, 28 Apr 2022 14:28:33 +0000 (19:58 +0530)
commit 45acd67ed742d89d539ec0351f77f08c7762e7de
tree 34be73ab948c150e8ef8cc9323dd63b4b78ac5b3
parent 46b62cf6f79d1d826b498609435337b2ed342bbe

Why is this PR needed?
Currently, when an MV is created with an average aggregate, every load
triggers a full refresh: the whole MV is reloaded whenever new segments
are added, which slows down loading. With incremental data load, only
the newly added segments are loaded into the MV.

What changes were proposed in this PR?
If avg is present in the MV query, rewrite the query to store the sum
and count of the corresponding columns in the MV, and derive avg from them.
Refer: https://docs.google.com/document/d/1kPEMCX50FLZcmyzm6kcIQtUH9KXWDIqh-Hco7NkTp80/edit
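
The idea of storing sum and count instead of avg can be illustrated with a minimal, standalone sketch. This is not the CarbonData implementation: `AvgState`, `aggregate_segment`, and `load_incrementally` are hypothetical names, segments are modelled as plain lists of (key, value) rows, and null values are assumed to be ignored as in SQL avg.

```python
from dataclasses import dataclass


@dataclass
class AvgState:
    """Partial aggregate kept per group: sum and count instead of avg."""
    total: float = 0.0
    count: int = 0

    def merge(self, other: "AvgState") -> "AvgState":
        # Sum and count are additive, so partial states combine exactly.
        return AvgState(self.total + other.total, self.count + other.count)

    def avg(self):
        # avg is derived on read; None for an empty group.
        return self.total / self.count if self.count else None


def aggregate_segment(rows):
    """Group the (key, value) rows of one segment into per-key AvgState."""
    state = {}
    for key, value in rows:
        if value is None:  # assumption: nulls are skipped, as in SQL avg
            continue
        current = state.setdefault(key, AvgState())
        current.total += value
        current.count += 1
    return state


def load_incrementally(mv, segment_rows):
    """Fold only the newly added segment into the existing MV state."""
    for key, partial in aggregate_segment(segment_rows).items():
        mv[key] = mv.get(key, AvgState()).merge(partial)
    return mv


segment_1 = [("a", 10.0), ("a", 20.0), ("b", 5.0)]
segment_2 = [("a", 60.0), ("b", None)]

# Incremental path: each segment is aggregated once and merged in.
mv = {}
load_incrementally(mv, segment_1)
load_incrementally(mv, segment_2)

# Full-refresh path: recompute over all segments, for comparison.
full = aggregate_segment(segment_1 + segment_2)
```

Here `mv` and `full` hold the same sum and count per key, so the derived averages match. Averaging the per-segment averages for key "a" (15.0 and 60.0) would give 37.5 rather than the correct 30.0, which is why a stored avg cannot be merged across segments while sum and count can.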

Does this PR introduce any user interface change?
No

Is any new testcase added?
Yes

This closes #4257
core/src/main/java/org/apache/carbondata/core/view/MVSchema.java
index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexForCleanAndDeleteSegment.scala
integration/spark/src/main/scala/org/apache/carbondata/view/MVCatalogInSpark.scala
integration/spark/src/main/scala/org/apache/carbondata/view/MVRefresher.scala
integration/spark/src/main/scala/org/apache/carbondata/view/MVSchemaWrapper.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/view/CarbonCreateMVCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/optimizer/MVRewrite.scala
integration/spark/src/main/scala/org/apache/spark/sql/parser/MVQueryParser.scala
integration/spark/src/test/scala/org/apache/carbondata/view/MVTest.scala
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/AggregatePushDown.scala