This article walks through deploying SparkSQL and using it for simple queries, and is intended as a practical reference.
一、运行环境
- JDK: 1.8.0_45 (64-bit)
- Hadoop: hadoop-2.6.0-cdh6.7.0
- Scala: 2.11.8
- Spark: spark-2.3.1-bin-2.6.0-cdh6.7.0 (must be compiled yourself against this Hadoop version)
- Hive: hive-1.1.0-cdh6.7.0
- MySQL: 5.6
2. Preparing to run SparkSQL
# Start MySQL (the backing database for the Hive metastore)
[root@hadoop001 ~]# su mysqladmin
[mysqladmin@hadoop001 root]$ cd ~
[mysqladmin@hadoop001 ~]$ service mysql start
Starting MySQL  [ OK ]
# Start HDFS
[hadoop@hadoop001 sbin]$ ./start-dfs.sh
# Configure hive-site.xml for SparkSQL (copy it from Hive's conf directory)
[hadoop@hadoop001 ~]$ cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/
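If you have not configured Hive yet, the part of hive-site.xml that SparkSQL actually needs is the metastore connection. A minimal sketch is below; the host, database name, user, and password are placeholders for your own MySQL setup:

```xml
<configuration>
  <!-- JDBC connection to the MySQL database backing the Hive metastore.
       Host, database, user, and password below are placeholder values. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop001:3306/hive_meta?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive_password</value>
  </property>
</configuration>
```

The MySQL connector jar passed on the command line in the next section is what makes this JDBC driver class available to Spark.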
3. Launching SparkSQL
# Launching via spark-shell:
[hadoop@hadoop001 bin]$ ./spark-shell --master local[2] \
  --jars ~/software/mysql-connector-java-5.1.34-bin.jar

scala> spark.sql("use hive_data2").show(false)
scala> spark.sql("select * from emp").show(false)
+-----+------+---------+----+----------+-------+------+------+
|empno|ename |job      |mgr |hiredate  |salary |comm  |deptno|
+-----+------+---------+----+----------+-------+------+------+
|7369 |SMITH |CLERK    |7902|1980-12-17|800.0  |null  |20    |
|7499 |ALLEN |SALESMAN |7698|1981-2-20 |1600.0 |300.0 |30    |
|7521 |WARD  |SALESMAN |7698|1981-2-22 |1250.0 |500.0 |30    |
|7566 |JONES |MANAGER  |7839|1981-4-2  |2975.0 |null  |20    |
|7654 |MARTIN|SALESMAN |7698|1981-9-28 |1250.0 |1400.0|30    |
|7698 |BLAKE |MANAGER  |7839|1981-5-1  |2850.0 |null  |30    |
|7782 |CLARK |MANAGER  |7839|1981-6-9  |2450.0 |null  |10    |
|7788 |SCOTT |ANALYST  |7566|1987-4-19 |3000.0 |null  |20    |
|7839 |KING  |PRESIDENT|null|1981-11-17|5000.0 |null  |10    |
|7844 |TURNER|SALESMAN |7698|1981-9-8  |1500.0 |0.0   |30    |
|7876 |ADAMS |CLERK    |7788|1987-5-23 |1100.0 |null  |20    |
|7900 |JAMES |CLERK    |7698|1981-12-3 |950.0  |null  |30    |
|7902 |FORD  |ANALYST  |7566|1981-12-3 |3000.0 |null  |20    |
|7934 |MILLER|CLERK    |7782|1982-1-23 |1300.0 |null  |10    |
|8888 |HIVE  |PROGRAM  |7839|1988-1-23 |10300.0|null  |null  |
+-----+------+---------+----+----------+-------+------+------+
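The same query can also be written with the DataFrame API instead of a SQL string. A sketch to paste into the same spark-shell session (it assumes the emp table from above is reachable through the metastore; the filter values are just for illustration):

```scala
// Equivalent of "select * from emp" via the DataFrame API.
// In spark-shell, `spark` (a SparkSession) is already defined.
val emp = spark.table("hive_data2.emp")

// A projection plus filter: salesmen earning 1250 or more, ordered by department.
emp.select("empno", "ename", "salary", "deptno")
   .where("job = 'SALESMAN' AND salary >= 1250")
   .orderBy("deptno")
   .show(false)
```

Both forms go through the same Catalyst optimizer, so there is no performance difference; the SQL string is usually more convenient for ad hoc queries, the DataFrame API for code you compile and test.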
# Launching via the spark-sql CLI:
[hadoop@hadoop001 bin]$ ./spark-sql --master local[2] \
  --driver-class-path ~/software/mysql-connector-java-5.1.34-bin.jar

# Switch to the database
spark-sql> use hive_data2;
18/08/30 20:36:52 INFO HiveMetaStore: 0: get_database: hive_data2
18/08/30 20:36:52 INFO audit: ugi=hadoop ip=unknown-ip-addr cmd=get_database: hive_data2
Time taken: 0.114 seconds

# Query the data
spark-sql> select * from emp;
18/08/30 20:37:05 INFO DAGScheduler: Job 0 finished: processCmd at CliDriver.java:376, took 1.292944 s
7369    SMITH   CLERK     7902  1980-12-17  800.0    NULL    20
7499    ALLEN   SALESMAN  7698  1981-2-20   1600.0   300.0   30
7521    WARD    SALESMAN  7698  1981-2-22   1250.0   500.0   30
7566    JONES   MANAGER   7839  1981-4-2    2975.0   NULL    20
7654    MARTIN  SALESMAN  7698  1981-9-28   1250.0   1400.0  30
7698    BLAKE   MANAGER   7839  1981-5-1    2850.0   NULL    30
7782    CLARK   MANAGER   7839  1981-6-9    2450.0   NULL    10
7788    SCOTT   ANALYST   7566  1987-4-19   3000.0   NULL    20
7839    KING    PRESIDENT NULL  1981-11-17  5000.0   NULL    10
7844    TURNER  SALESMAN  7698  1981-9-8    1500.0   0.0     30
7876    ADAMS   CLERK     7788  1987-5-23   1100.0   NULL    20
7900    JAMES   CLERK     7698  1981-12-3   950.0    NULL    30
7902    FORD    ANALYST   7566  1981-12-3   3000.0   NULL    20
7934    MILLER  CLERK     7782  1982-1-23   1300.0   NULL    10
8888    HIVE    PROGRAM   7839  1988-1-23   10300.0  NULL    NULL
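Anything HiveQL supports works at this prompt, not just SELECT *. As one hedged example (output omitted since it depends on your data), an aggregate over the same emp table:

```sql
-- Headcount and average salary per department;
-- rows with a NULL deptno are grouped together.
spark-sql> SELECT deptno, COUNT(*) AS emps, ROUND(AVG(salary), 2) AS avg_sal
         > FROM emp
         > GROUP BY deptno
         > ORDER BY deptno;
```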
That covers deploying SparkSQL and running simple queries against it; hopefully it serves as a useful starting point for further exploration.