
Installing Spark and a Hadoop Cluster on Ubuntu

1. JDK Installation

Extract the JDK tarball manually, then set the environment variables.

1.1 Create a java directory under /usr/

root@ubuntu:~# mkdir /usr/java
root@ubuntu:~# cd /usr/java

1.2 Download the JDK, then extract it

http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
root@ubuntu:~# tar -zxvf jdk-8u144-linux-x64.tar.gz

1.3 Set the environment variables

root@ubuntu:~# vi /etc/profile

Add the following to profile:

#set java environment
JAVA_HOME=/usr/java/jdk1.8.0_144
JRE_HOME=/usr/java/jdk1.8.0_144/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASSPATH PATH

Apply the changes:

root@ubuntu:~# source /etc/profile
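The export lines above take effect only in shells that have sourced /etc/profile. As a standalone sketch of what the PATH line does (throwaway variables only; this does not modify your real profile):

```shell
# Standalone illustration of the append-to-PATH pattern used above
JAVA_HOME=/usr/java/jdk1.8.0_144
PATH=$PATH:$JAVA_HOME/bin
# The JDK bin directory is now the last PATH entry
echo "$PATH" | tr ':' '\n' | tail -n 1
```

Because the JDK directory is appended (not prepended), any java binary already earlier on PATH would still win; append order matters if multiple JDKs are installed.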

1.4 Verify the JDK installation

root@ubuntu:~# java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
root@ubuntu:~#

2. Scala Installation

Download Scala: http://www.scala-lang.org/download/2.11.6.html
root@ubuntu:~# cd /usr
root@ubuntu:/usr# mkdir scala
root@ubuntu:/usr# cd scala/
root@ubuntu:/usr/scala# cp /home/baihu/pragrom/scala-2.11.6.tgz .
root@ubuntu:/usr/scala# ls
scala-2.11.6.tgz
root@ubuntu:/usr/scala# tar -zxf scala-2.11.6.tgz
root@ubuntu:/usr/scala# ls
scala-2.11.6 scala-2.11.6.tgz
root@ubuntu:/usr/scala#
Edit the configuration file to add the Scala settings:


root@ubuntu:/usr/scala# vi /etc/profile


export SCALA_HOME=/usr/scala/scala-2.11.6
export PATH=$PATH:$SCALA_HOME/bin


root@ubuntu:/usr/scala# source /etc/profile
root@ubuntu:/usr/scala# scala -version
Scala code runner version 2.11.6 -- Copyright 2002-2013, LAMP/EPFL

3. Spark Installation

Download the latest Spark release, spark-2.2.0-bin-hadoop2.7.tgz, from http://spark.apache.org/downloads.html

root@ubuntu:/usr# mkdir spark
root@ubuntu:/usr# cd spark/
root@ubuntu:/usr/spark# mv /home/baihu/pragrom/spark-2.2.0-bin-hadoop2.7.tgz .
root@ubuntu:/usr/spark# ls
spark-2.2.0-bin-hadoop2.7.tgz
root@ubuntu:/usr/spark# tar -zxf spark-2.2.0-bin-hadoop2.7.tgz
root@ubuntu:/usr/spark# ls
spark-2.2.0-bin-hadoop2.7  spark-2.2.0-bin-hadoop2.7.tgz
root@ubuntu:/usr/spark# cd spark-2.2.0-bin-hadoop2.7/
root@ubuntu:/usr/spark/spark-2.2.0-bin-hadoop2.7# ls
bin   data      jars     licenses  python  README.md  sbin
conf  examples  LICENSE  NOTICE    R       RELEASE    yarn
root@ubuntu:/usr/spark/spark-2.2.0-bin-hadoop2.7# cd bin/
root@ubuntu:/usr/spark/spark-2.2.0-bin-hadoop2.7/bin# ./pyspark
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/08/02 07:34:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/02 07:34:23 WARN Utils: Your hostname, ubuntu resolves to a loopback address: 127.0.1.1; using 192.168.75.130 instead (on interface eth0)
17/08/02 07:34:23 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/08/02 07:34:52 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
17/08/02 07:34:53 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
17/08/02 07:34:55 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/

Using Python version 2.7.6 (default, Jun 22 2015 17:58:13)
SparkSession available as 'spark'.
>>>
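For consistency with the JDK and Scala setup above, you can also add Spark to /etc/profile so that pyspark and spark-submit are on the PATH from any directory. This is a sketch following the same pattern as the earlier entries; the path assumes the install location used in this walkthrough:

```shell
# Append to /etc/profile (install path as used in this guide)
export SPARK_HOME=/usr/spark/spark-2.2.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
```

Run source /etc/profile afterwards, just as with the JDK and Scala entries.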

Original article: https://www.jb51.cc/ubuntu/351877.html

