[CARBONDATA-4190] Integrate Carbondata with Spark 3.1.1 version
author Vikram Ahuja <vikramahuja8803@gmail.com>
Tue, 6 Apr 2021 06:55:14 +0000 (12:25 +0530)
committer akashrn5 <akashnilugal@gmail.com>
Wed, 23 Jun 2021 05:24:57 +0000 (10:54 +0530)
commit 8ceb4fdd94a1d92a760a3c96d212fd521e9f3ccf
tree 870bc7f95afe6011cc145769b1217fc83da455a8
parent 18665cc0b72d455ff66bfa63501faec94025b492

Why is this PR needed?
To integrate CarbonData with Spark 3.1.1.

What changes were proposed in this PR?
Refactored the code to support Spark 3.1.1 alongside Spark 2.3 and 2.4.
Changes:

1. Compile-related changes
   1. New Spark package in MV, Streaming and spark-integration.
   2. API-level changes to match the Spark API changes.
2. Spark has moved to the Proleptic Gregorian calendar, so timestamp-related changes are also required in CarbonData.
3. Refactored the show-segments-by-select command.
4. A few Lucene test cases are ignored because of a deadlock in the Spark DAGScheduler, which prevents them from running.
5. Alter rename: parser enabled in Carbon, with a check for Carbon.
6. doExecuteColumnar() changes in CarbonDataSourceScan.scala.
7. char/varchar changes from the Spark side.
8. Rule name changed in MV.
9. CSVParser version changed in the univocity parser.
10. New configs added in SparkTestQueryExecutor to keep some behaviour the same as in Spark 2.3 and 2.4.
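
To illustrate why the calendar change in the list above touches timestamp handling, here is a minimal, self-contained sketch. It is not code from this PR: the class `CalendarShift` and its methods are hypothetical names used only for illustration. It compares the epoch day that the JDK's hybrid Julian/Gregorian calendar (`java.util.GregorianCalendar`) assigns to a date label with the one assigned by the Proleptic Gregorian calendar (`java.time.LocalDate`). The two agree for modern dates and diverge for dates before the 1582 cutover.

```java
import java.time.LocalDate;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not part of CarbonData or Spark.
final class CalendarShift {

    // Epoch day of a date label under the Proleptic Gregorian calendar (java.time).
    static long prolepticEpochDay(int year, int month, int day) {
        return LocalDate.of(year, month, day).toEpochDay();
    }

    // Epoch day of the same label under the hybrid Julian/Gregorian calendar.
    static long hybridEpochDay(int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();                    // drop the current time, keep era = AD
        cal.set(year, month - 1, day);  // Calendar months are 0-based
        return Math.floorDiv(cal.getTimeInMillis(), TimeUnit.DAYS.toMillis(1));
    }

    // Number of days by which the two calendars disagree for one label.
    static long shiftInDays(int year, int month, int day) {
        return hybridEpochDay(year, month, day) - prolepticEpochDay(year, month, day);
    }

    public static void main(String[] args) {
        System.out.println("2021-06-23 shift: " + shiftInDays(2021, 6, 23));
        System.out.println("1000-01-01 shift: " + shiftInDays(1000, 1, 1));
    }
}
```

A value written under one calendar and read under the other can therefore move by several days when it predates the cutover, which is the kind of mismatch that timestamp-related code has to account for.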

Does this PR introduce any user interface change?
No

Is any new testcase added?
No

This closes #4141
178 files changed:
LICENSE
examples/flink/pom.xml
examples/spark/pom.xml
examples/spark/src/main/scala/org/apache/carbondata/examples/StreamingWithRowParserExample.scala
examples/spark/src/main/scala/org/apache/carbondata/examples/StructuredStreamingExample.scala
examples/spark/src/main/scala/org/apache/carbondata/examples/util/ExampleUtils.scala
examples/spark/src/test/scala/org/apache/carbondata/examplesCI/RunExamples.scala
index/examples/pom.xml
index/secondary-index/pom.xml
index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithComplexArrayType.scala
integration/flink-build/pom.xml
integration/flink-proxy/pom.xml
integration/flink/pom.xml
integration/flink/src/test/scala/org/apache/carbon/flink/TestCarbonPartitionWriter.scala
integration/flink/src/test/scala/org/apache/carbon/flink/TestCarbonWriter.scala
integration/hive/pom.xml
integration/presto/pom.xml
integration/spark/pom.xml
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/CarbonDataSourceScanHelper.scala [new file with mode: 0644]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/SparkVersionAdapter.scala [new file with mode: 0644]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/execution/CarbonCodegenSupport.scala [moved from integration/spark/src/main/spark2.3/org/apache/carbondata/spark/adapter/CarbonToSparkAdapter.scala with 70% similarity]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/hive/CarbonAnalyzer.scala [moved from integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonAnalyzer.scala with 100% similarity]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/hive/CarbonSqlAstBuilder.scala [moved from integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonSqlAstBuilder.scala with 100% similarity]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/hive/SqlAstBuilderHelper.scala [moved from integration/spark/src/main/scala/org/apache/spark/sql/hive/SqlAstBuilderHelper.scala with 97% similarity]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/hive/execution/command/CarbonResetCommand.scala [moved from integration/spark/src/main/scala/org/apache/spark/sql/hive/execution/command/CarbonResetCommand.scala with 100% similarity]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/parser/CarbonExtensionSqlParser.scala [moved from integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonExtensionSqlParser.scala with 100% similarity]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/parser/CarbonSparkSqlParser.scala [moved from integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParser.scala with 93% similarity]
integration/spark/src/main/common2.3and2.4/org/apache/spark/sql/parser/SparkSqlAstBuilderWrapper.scala [moved from integration/spark/src/main/spark2.4/org/apache/spark/sql/parser/SparkSqlAstBuilderWrapper.scala with 100% similarity]
integration/spark/src/main/common2.4and3.1/org/apache/spark/sql/CarbonBoundReference.scala [moved from integration/spark/src/main/spark2.4/org/apache/spark/sql/CarbonBoundReference.scala with 99% similarity]
integration/spark/src/main/common2.4and3.1/org/apache/spark/sql/avro/AvroFileFormatFactory.scala [moved from integration/spark/src/main/spark2.4/org/apache/spark/sql/avro/AvroFileFormatFactory.scala with 100% similarity]
integration/spark/src/main/common2.4and3.1/org/apache/spark/sql/execution/CreateDataSourceTableCommand.scala [new file with mode: 0644]
integration/spark/src/main/java/org/apache/spark/sql/CarbonVectorProxy.java
integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
integration/spark/src/main/scala/org/apache/carbondata/spark/StreamingOption.scala
integration/spark/src/main/scala/org/apache/carbondata/spark/load/CsvRDDHelper.scala
integration/spark/src/main/scala/org/apache/carbondata/spark/load/DataLoadProcessorStepOnSpark.scala
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonRDD.scala
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/UpdateDataLoad.scala
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CarbonSparkUtil.scala
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
integration/spark/src/main/scala/org/apache/carbondata/view/MVCatalogInSpark.scala
integration/spark/src/main/scala/org/apache/carbondata/view/MVHelper.scala
integration/spark/src/main/scala/org/apache/spark/sql/CarbonExpressions.scala
integration/spark/src/main/scala/org/apache/spark/sql/CarbonExtensions.scala
integration/spark/src/main/scala/org/apache/spark/sql/CarbonSession.scala
integration/spark/src/main/scala/org/apache/spark/sql/carbondata/execution/datasources/CarbonFileIndex.scala
integration/spark/src/main/scala/org/apache/spark/sql/carbondata/execution/datasources/SparkCarbonFileFormat.scala
integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/CarbonTakeOrderedAndProjectExec.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonShowSegmentsAsSelectCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/DeleteExecution.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/IUDCommonUtil.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/merge/MergeProjection.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateDataSourceTableCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableAsSelectCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonExplainCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/view/CarbonCreateMVCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/joins/BroadCastPolygonFilterPushJoin.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonPlanHelper.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonSourceStrategy.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DDLHelper.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DMLStrategy.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/MixedFormatHandler.scala
integration/spark/src/main/scala/org/apache/spark/sql/execution/streaming/CarbonAppendableStreamSink.scala
integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala
integration/spark/src/main/scala/org/apache/spark/sql/hive/CreateCarbonSourceTableAsSelectCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala
integration/spark/src/main/scala/org/apache/spark/sql/optimizer/CarbonIUDRule.scala
integration/spark/src/main/scala/org/apache/spark/sql/optimizer/CarbonUDFTransformRule.scala
integration/spark/src/main/scala/org/apache/spark/sql/optimizer/MVMatcher.scala
integration/spark/src/main/scala/org/apache/spark/sql/optimizer/MVRewrite.scala
integration/spark/src/main/scala/org/apache/spark/sql/optimizer/MVRewriteRule.scala
integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonExtensionSpark2SqlParser.scala
integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala
integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParserUtil.scala
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/jobs/SparkBlockletIndexLoaderJob.scala
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/joins/BroadCastSIFilterPushJoin.scala
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/optimizer/CarbonSITransformationRule.scala
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/optimizer/CarbonSecondaryIndexOptimizer.scala
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/rdd/CarbonSIRebuildRDD.scala
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/rdd/CarbonSecondaryIndexRDD.scala
integration/spark/src/main/scala/org/apache/spark/sql/test/SparkTestQueryExecutor.scala
integration/spark/src/main/scala/org/apache/spark/sql/util/CreateTableCommonUtil.scala [new file with mode: 0644]
integration/spark/src/main/scala/org/apache/spark/sql/util/SparkSQLUtil.scala
integration/spark/src/main/scala/org/apache/spark/util/CarbonReflectionUtils.scala
integration/spark/src/main/scala/org/apache/spark/util/SparkUtil.scala
integration/spark/src/main/spark2.3/org/apache/spark/sql/CarbonBoundReference.scala
integration/spark/src/main/spark2.3/org/apache/spark/sql/CarbonToSparkAdapter.scala
integration/spark/src/main/spark2.3/org/apache/spark/sql/SparkSqlAdapter.scala
integration/spark/src/main/spark2.3/org/apache/spark/sql/execution/CreateDataSourceTableCommand.scala [new file with mode: 0644]
integration/spark/src/main/spark2.3/org/apache/spark/sql/hive/CarbonSessionStateBuilder.scala
integration/spark/src/main/spark2.4/org/apache/spark/sql/CarbonToSparkAdapter.scala
integration/spark/src/main/spark3.1/org/apache/spark/sql/CarbonDataSourceScanHelper.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/CarbonToSparkAdapter.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/SparkSqlAdapter.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/SparkVersionAdapter.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/execution/CarbonCodegenSupport.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/hive/CarbonAnalyzer.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/hive/CarbonSessionStateBuilder.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/hive/CarbonSqlAstBuilder.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/hive/SqlAstBuilderHelper.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/hive/execution/command/CarbonResetCommand.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/parser/CarbonExtensionSqlParser.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/parser/CarbonSparkSqlParser.scala [new file with mode: 0644]
integration/spark/src/main/spark3.1/org/apache/spark/sql/parser/SparkSqlAstBuilderWrapper.scala [moved from integration/spark/src/main/spark2.3/org/apache/spark/sql/parser/SparkSqlAstBuilderWrapper.scala with 91% similarity]
integration/spark/src/test/scala/org/apache/carbondata/geo/GeoQueryTest.scala
integration/spark/src/test/scala/org/apache/carbondata/index/lucene/LuceneFineGrainIndexSuite.scala
integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/complexType/TestComplexDataType.scala
integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataWithCompression.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/addsegment/AddSegmentTestCase.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/alterTable/TestAlterTableAddColumns.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/badrecordloger/BadRecordLoggerTest.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/booleantype/BooleanDataTypesFilterTest.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/MajorCompactionWithMeasureSortColumns.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestGlobalSortDataLoad.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithHiveSyntaxDefaultFormat.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithHiveSyntaxUnsafe.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadOptions.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/directdictionary/TimestampNoDictionaryColumnTestCase.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/filterexpr/FilterProcessorTestCase.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/filterexpr/TestImplicitFilterExpression.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/index/TestIndexCommand.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/iud/UpdateCarbonTableTestCase.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/localdictionary/LocalDictionarySupportLoadTableTest.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/merge/MergeTestCase.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableQueryTestCase.scala
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/windowsexpr/WindowsExprTestCase.scala
integration/spark/src/test/scala/org/apache/carbondata/view/plans/ExtractJoinConditionsSuite.scala
integration/spark/src/test/scala/org/apache/carbondata/view/plans/LogicalToModularPlanSuite.scala
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/TestAllOperationsOnMV.scala
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/TestPartitionWithMV.scala
integration/spark/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOpName.scala
integration/spark/src/test/scala/org/apache/spark/carbondata/TestStreamingTableQueryFilter.scala
integration/spark/src/test/scala/org/apache/spark/carbondata/TestStreamingTableWithLongString.scala
integration/spark/src/test/scala/org/apache/spark/carbondata/TestStreamingTableWithRowParser.scala
integration/spark/src/test/scala/org/apache/spark/carbondata/bucketing/TableBucketingTestCase.scala
integration/spark/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/AlterTableColumnRenameTestCase.scala
integration/spark/src/test/scala/org/apache/spark/carbondata/vectorreader/VectorReaderTestCase.scala
integration/spark/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceBinaryTest.scala
integration/spark/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala
licenses-binary/LICENSE-paranamer.txt [new file with mode: 0644]
mv/plan/pom.xml
mv/plan/src/main/common2.3and2.4/org/apache/carbondata/mv/plans/modular/SparkVersionHelper.scala [new file with mode: 0644]
mv/plan/src/main/scala/org/apache/carbondata/mv/expressions/modular/subquery.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/AggregatePushDown.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularRelation.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/basicOperators.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/util/BirdcageOptimizer.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/util/Logical2ModularExtractions.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/util/LogicalPlanSignatureGenerator.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/util/Printers.scala
mv/plan/src/main/scala/org/apache/carbondata/mv/plans/util/SQLBuilder.scala
mv/plan/src/main/spark2.3/org/apache/carbondata/mv/plans/modular/ExpressionHelper.scala
mv/plan/src/main/spark3.1/org/apache/carbondata/mv/plans/modular/ExpressionHelper.scala [new file with mode: 0644]
mv/plan/src/main/spark3.1/org/apache/carbondata/mv/plans/modular/SparkVersionHelper.scala [new file with mode: 0644]
pom.xml
processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/RowParserImpl.java
scalastyle-config.xml
sdk/sdk/pom.xml
streaming/pom.xml
streaming/src/main/scala/org/apache/carbondata/streaming/parser/RowStreamParserImp.scala
streaming/src/main/spark2.x/org.apache.carbondata.util/SparkStreamingUtil.scala [moved from integration/spark/src/main/spark2.4/org/apache/carbondata/spark/adapter/CarbonToSparkAdapter.scala with 70% similarity]
streaming/src/main/spark3.1/org/apache/carbondata/util/SparkStreamingUtil.scala [new file with mode: 0644]