[SPARK-2561][SQL] Fix apply schema
author Michael Armbrust <michael@databricks.com>
Tue, 22 Jul 2014 01:18:17 +0000 (18:18 -0700)
committer Michael Armbrust <michael@databricks.com>
Tue, 22 Jul 2014 01:18:35 +0000 (18:18 -0700)
We need to use the analyzed attributes; otherwise we end up with a tree that will never resolve.

Author: Michael Armbrust <michael@databricks.com>

Closes #1470 from marmbrus/fixApplySchema and squashes the following commits:

f968195 [Michael Armbrust] Use analyzed attributes when applying the schema.
4969015 [Michael Armbrust] Add test case.

(cherry picked from commit 511a7314037219c23e824ea5363bf7f1df55bab3)
Signed-off-by: Michael Armbrust <michael@databricks.com>
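
A toy sketch of why the unanalyzed attributes are a problem, for readers unfamiliar with the analyzer. The types below (`Attr`, `UnresolvedAttr`, `ResolvedAttr`, `ExistingRddLeaf`) are hypothetical stand-ins written for illustration, not Catalyst's real classes. The assumption being modelled is that a leaf relation has no children to resolve its attributes against, so any unresolved attribute handed to it stays unresolved.

```scala
// Toy model only: hypothetical stand-ins, not Catalyst's actual classes.
sealed trait Attr { def name: String; def resolved: Boolean }

// An attribute as written in the DSL (e.g. 'key), not yet bound to a column.
case class UnresolvedAttr(name: String) extends Attr { val resolved = false }

// An attribute after analysis, bound to a concrete column and type.
case class ResolvedAttr(name: String, dataType: String) extends Attr { val resolved = true }

// A leaf relation that reports exactly the attributes it was built with.
case class ExistingRddLeaf(output: Seq[Attr]) {
  // No children to resolve against: the leaf is resolved only if every
  // attribute it was handed is already resolved.
  def resolved: Boolean = output.forall(_.resolved)
}

object ApplySchemaSketch {
  def main(args: Array[String]): Unit = {
    // Before the fix (modelled): the leaf is built from the unanalyzed output.
    val before = ExistingRddLeaf(Seq(UnresolvedAttr("key")))
    // After the fix (modelled): the leaf is built from the analyzed output.
    val after = ExistingRddLeaf(Seq(ResolvedAttr("key", "int")))

    assert(!before.resolved) // nothing can ever resolve this leaf
    assert(after.resolved)   // already resolved when constructed
    println(s"before.resolved=${before.resolved}, after.resolved=${after.resolved}")
  }
}
```

In this model the first leaf can never become resolved, which mirrors the "tree that will never resolve" described above; the second is resolved from the start.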
sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala
sql/core/src/test/scala/org/apache/spark/sql/DslQuerySuite.scala

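diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala b/sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala
index d5214a3..d0d2db2 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala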
@@ -410,7 +410,7 @@ class SchemaRDD(
    * @group schema
    */
   private def applySchema(rdd: RDD[Row]): SchemaRDD = {
-    new SchemaRDD(sqlContext, SparkLogicalPlan(ExistingRdd(logicalPlan.output, rdd)))
+    new SchemaRDD(sqlContext, SparkLogicalPlan(ExistingRdd(queryExecution.analyzed.output, rdd)))
   }
 
   // =======================================================================
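diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DslQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DslQuerySuite.scala
index 05aac66..0a69cbc 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DslQuerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DslQuerySuite.scala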
@@ -34,6 +34,12 @@ class DslQuerySuite extends QueryTest {
       testData.collect().toSeq)
   }
 
+  test("repartition") {
+    checkAnswer(
+      testData.select('key).repartition(10).select('key),
+      testData.select('key).collect().toSeq)
+  }
+
   test("agg") {
     checkAnswer(
       testData2.groupBy('a)('a, Sum('b)),