Why do different stages in a Hive query execution plan do the same thing?
Hive version: 1.1.0-cdh5.15.2. I recently started studying the Hive source code and how it works. Here is the problem I ran into:
explain insert into testv1 select * from test_textfile where val >200;
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-7 depends on stages: Stage-1, consists of Stage-4, Stage-3, Stage-5
  Stage-4
  Stage-0 depends on stages: Stage-4, Stage-6
  Stage-2 depends on stages: Stage-0
  Stage-3
  Stage-5
  Stage-6 depends on stages: Stage-5

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_textfile
            Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (val > 200) (type: boolean)
              Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: UDFToString(val) (type: string)
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: true
                  Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
                      output format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                      serde: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                      name: test.testv1

  Stage: Stage-7
    Conditional Operator

  Stage: Stage-4
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://xlclusterns1/tmp/hive-stagingdir/staging_hive_2021-04-14_15-14-30_205_4974356220876798617-1/-ext-10000

  Stage: Stage-0
    Move Operator
      tables:
          replace: false
          table:
              input format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
              output format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
              serde: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
              name: test.testv1

  Stage: Stage-2
    Stats-Aggr Operator

  Stage: Stage-3
    Map Reduce
      Map Operator Tree:
          TableScan
            File Output Operator
              compressed: true
              table:
                  input format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
                  output format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                  serde: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                  name: test.testv1

  Stage: Stage-5
    Map Reduce
      Map Operator Tree:
          TableScan
            File Output Operator
              compressed: true
              table:
                  input format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
                  output format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                  serde: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                  name: test.testv1

  Stage: Stage-6
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://xlclusterns1/tmp/hive-stagingdir/staging_hive_2021-04-14_15-14-30_205_4974356220876798617-1/-ext-10000
The problem is that I cannot explain why Stage-3 and Stage-5 do the same thing. Does anyone know why this happens?
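One way to probe this, assuming the near-identical stages come from Hive's small-file merge step (an assumption; nothing in the plan itself confirms it), is to switch the merge settings off for the session and re-run the same EXPLAIN. `hive.merge.mapfiles` and `hive.merge.mapredfiles` are standard Hive properties; whether the Conditional Operator and the Stage-3/Stage-5 pair actually disappear on this CDH build is something to check, not a given.

```sql
-- Sketch only: disable output-file merging for the current session.
-- Controls merging of small files after map-only jobs.
set hive.merge.mapfiles=false;
-- Controls merging of small files after map-reduce jobs.
set hive.merge.mapredfiles=false;

-- Same statement as in the question; compare the STAGE DEPENDENCIES
-- section of the new plan with the original one.
explain insert into testv1 select * from test_textfile where val > 200;
```

If the new plan reduces to Stage-1 followed only by the move and stats stages, that would suggest Stage-3 and Stage-5 are alternative branches of the merge step, with the Conditional Operator in Stage-7 choosing one of them at run time.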