How do I execute a Spark SQL MERGE statement on an Iceberg table in Databricks?
I am trying to set up Apache Iceberg in our Databricks environment, but I get an error when executing a MERGE statement in Spark SQL.
This code:
CREATE TABLE iceberg.db.table (id bigint,data string) USING iceberg;
INSERT INTO iceberg.db.table VALUES (1,'a'),(2,'b'),(3,'c');
INSERT INTO iceberg.db.table SELECT id,data FROM (select * from iceberg.db.table) t WHERE length(data) = 1;
MERGE INTO iceberg.db.table t USING (SELECT * FROM iceberg.db.table) u ON t.id = u.id
WHEN NOT MATCHED THEN INSERT *
produces this error:
Error in sql statement: AnalysisException: MERGE destination only supports Delta sources.
Some(RelationV2[id#116L,data#117] iceberg.db.table
I believe the root cause is that MERGE is also a keyword in the Delta Lake SQL engine. As far as I can tell, the problem comes from the order in which Spark applies its analyzer rules: the MERGE statement triggers the Delta rules, which then throw because the target is not a Delta table. I can read, append to, and overwrite Iceberg tables without any problem.
Main question: how can I get Spark to recognize this as an Iceberg query rather than a Delta one? Alternatively, is it possible to remove the Delta-related SQL rules entirely?
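As a stopgap while MERGE is intercepted by the Delta rules, the `WHEN NOT MATCHED THEN INSERT *` clause can be emulated with an anti-join followed by an append, both of which (per the above) already work on Iceberg tables. A minimal plain-Python sketch of the semantics, using made-up sample rows:

```python
def merge_insert_not_matched(target, source, key="id"):
    """Emulate MERGE ... WHEN NOT MATCHED THEN INSERT *:
    append only those source rows whose key is absent from the target."""
    existing = {row[key] for row in target}
    return target + [row for row in source if row[key] not in existing]

target = [{"id": 1, "data": "a"}, {"id": 2, "data": "b"}]
source = [{"id": 2, "data": "b"}, {"id": 4, "data": "d"}]
merged = merge_insert_not_matched(target, source)
# existing ids 1 and 2 are untouched; only id=4 is inserted
```

In Spark the equivalent would be `source.join(target, "id", "left_anti")` followed by `result.writeTo("iceberg.db.table").append()` (the DataFrameWriterV2 API introduced in Spark 3.0), which avoids the MERGE keyword entirely.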
Environment
Spark version: 3.0.1
Databricks Runtime version: 7.6
Iceberg configuration
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.iceberg.type=hadoop
spark.sql.catalog.iceberg.warehouse=BLOB_STORAGE_CONTAINER
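For comparison, the same settings can be supplied when building a session outside Databricks, e.g. to confirm that MERGE works against Iceberg on open-source Spark 3.0.1. This is a sketch that assumes the iceberg-spark3-runtime jar is on the classpath; `/tmp/iceberg-warehouse` is a placeholder for the real storage location:

```python
from pyspark.sql import SparkSession

# Sketch: same catalog settings as above, applied via the session builder.
# Assumes the Iceberg Spark runtime jar is available on the classpath.
spark = (
    SparkSession.builder
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.iceberg", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.iceberg.type", "hadoop")
    .config("spark.sql.catalog.iceberg.warehouse", "/tmp/iceberg-warehouse")  # placeholder path
    .getOrCreate()
)
```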
Stack trace:
com.databricks.backend.common.rpc.DatabricksExceptions$sqlExecutionException: org.apache.spark.sql.AnalysisException: MERGE destination only supports Delta sources.
Some(RelationV2[id#116L,data#117] iceberg.db.table
);
at com.databricks.sql.transaction.tahoe.DeltaErrors$.notADeltaSourceException(DeltaErrors.scala:343)
at com.databricks.sql.transaction.tahoe.PreprocesstableMerge.apply(PreprocesstableMerge.scala:201)
at com.databricks.sql.transaction.tahoe.PreprocesstableMergeEdge$$anonfun$apply$1.applyOrElse(PreprocesstableMergeEdge.scala:39)
at com.databricks.sql.transaction.tahoe.PreprocesstableMergeEdge$$anonfun$apply$1.applyOrElse(PreprocesstableMergeEdge.scala:36)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsDown$2(AnalysisHelper.scala:112)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsDown$1(AnalysisHelper.scala:112)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:216)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsDown(AnalysisHelper.scala:110)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsDown$(AnalysisHelper.scala:108)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperators(AnalysisHelper.scala:73)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperators$(AnalysisHelper.scala:72)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:29)
at com.databricks.sql.transaction.tahoe.PreprocesstableMergeEdge.apply(PreprocesstableMergeEdge.scala:36)
at com.databricks.sql.transaction.tahoe.PreprocesstableMergeEdge.apply(PreprocesstableMergeEdge.scala:29)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:152)