  • Configuring a Spark cluster on Linux

    Environment:

    Linux

    Spark 1.6.0

    Hadoop 2.2.0

    I. Install Scala (on every machine)
     
    1. Download scala-2.11.0.tgz
     
    Place it under /opt and extract it: tar -zxvf scala-2.11.0.tgz
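     
    A minimal download-and-unpack sketch (the archive URL is an assumption; use whichever official Scala mirror you prefer):
     
    cd /opt
    # URL is an assumption - any official Scala 2.11.0 archive mirror works
    wget http://www.scala-lang.org/files/archive/scala-2.11.0.tgz
    tar -zxvf scala-2.11.0.tgz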
     
    2. As the hadoop user:
     
    vim /etc/profile
    3. Add the Scala path to the profile file:
     
     export SCALA_HOME=/opt/scala-2.11.0
     export PATH=$PATH:$SCALA_HOME/bin
     
    4. Make the configuration take effect:
    source /etc/profile
    5. Verify that Scala was installed successfully:
    [hadoop@testhdp01 ~]$ scala -version
    Scala code runner version 2.10.1 -- Copyright 2002-2013, LAMP/EPF
    Success.
     
     
    II. Install Spark
     
    1. Build Spark 1.6.0 (I tried building on Linux many times without success, so I built it on a Mac instead).
     
    Enter the Spark source directory and run one of the following build commands (the make-distribution.sh variant also packages the result as a .tgz):
    build/mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
    ./make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.2 -Phive -Phive-thriftserver -Pyarn
    mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -Phive -Phive-thriftserver -DskipTests clean package
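     
    If the Linux builds failed with Maven out-of-memory or PermGen errors (a common cause on Spark 1.x), the Spark build documentation recommends raising Maven's memory limits before building:
     
    # Recommended by the Spark 1.x building docs; run before the mvn commands above
    export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"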
     
    Building with IntelliJ IDEA:
     
    2. Configure Spark
     
    cd /opt/spark-1.6.0-bin-hadoop2.2.0/conf
     
    cp spark-env.sh.template spark-env.sh
     
    cp slaves.template slaves
     
    vim spark-env.sh
     
    Add the following:
    export SCALA_HOME=/opt/scala-2.10.1
    export JAVA_HOME=/opt/jdk1.7.0_51
    export SPARK_MASTER_IP=192.168.22.7
    export HADOOP_HOME=/opt/hadoop-2.2.0
    export SPARK_HOME=/opt/spark-1.6.0-bin-hadoop2.2.0
    export SPARK_LIBRARY_PATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib/native
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop/
    export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
    export SPARK_JAR=$SPARK_HOME/lib/spark-assembly-1.6.0-hadoop2.2.0.jar
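
    Since YARN_CONF_DIR is set above, a quick sanity check of these settings is to submit the bundled SparkPi example to YARN. A minimal sketch, assuming HDFS and YARN are already running and the examples jar from the 1.6.0 binary distribution is present in lib/:

    cd /opt/spark-1.6.0-bin-hadoop2.2.0
    bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn --deploy-mode client \
      lib/spark-examples-1.6.0-hadoop2.2.0.jar 10
    # on success the driver output contains a line like "Pi is roughly 3.14..."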

    On a Mac the configuration is as follows; add this at the top of the file:

    #jdk
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home
    export PATH=$PATH:$JAVA_HOME/bin
    
    #scala
    export SCALA_HOME=/usr/local/Cellar/scala-2.10.4
    export PATH=$PATH:$SCALA_HOME/bin
    
    #hadoop
    export HADOOP_HOME=/usr/local/Cellar/hadoop/2.7.2/libexec
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
    #hive
    export HIVE_HOME=/usr/local/Cellar/hive/2.0.1/libexec
    export SPARK_CLASSPATH=$HIVE_HOME/lib/mysql-connector-java-5.1.28.jar:$SPARK_CLASSPATH
    
    #spark
    export SPARK_HOME=/usr/local/Cellar/spark-1.3.1-bin-hadoop2.6
    export PATH=$PATH:$SPARK_HOME/bin

    3. Configure Spark to support Hive

    vim spark-env.sh
    export HIVE_HOME=/opt/apache-hive-0.13.0
    export SPARK_CLASSPATH=$HIVE_HOME/lib/mysql-connector-java-5.1.26.jar:$SPARK_CLASSPATH
    Copy hive-site.xml from the Hive conf directory into $SPARK_HOME/conf:
    cp /opt/apache-hive-0.13.0/conf/hive-site.xml conf/
    Create a hive.sh file under /etc/profile.d and add the environment variable settings:
    #!/bin/bash
    export HIVE_HOME=/opt/apache-hive-0.13.0
    export PATH=$HIVE_HOME/bin:$PATH
    Make the environment variables take effect:
    source /etc/profile.d/hive.sh
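     
    To check the Hive integration end to end, one option is a quick smoke test from spark-shell. A sketch, assuming the Hive metastore is reachable and the build included -Phive (in Spark 1.6 the shell's sqlContext is then Hive-aware):
     
    source /etc/profile.d/hive.sh
    echo $HIVE_HOME        # should print /opt/apache-hive-0.13.0
    cd /opt/spark-1.6.0-bin-hadoop2.2.0
    # pipe a one-line query into the REPL; it runs and exits
    echo 'sqlContext.sql("SHOW TABLES").show()' | bin/spark-shell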
    4. Configure the cluster
    Enter Spark's conf directory:
    vim slaves
    Delete localhost
    and add the worker node names (the start scripts will SSH into these hosts; see the sketch after the list):
    testhdp02
    testhdp03
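     
    The sbin start scripts log in to every host listed in slaves over SSH, so the hadoop user on the master needs passwordless SSH to each worker; a sketch using the hostnames above:
     
    # run once as the hadoop user on the master; skip ssh-keygen if a key already exists
    ssh-keygen -t rsa
    ssh-copy-id hadoop@testhdp02
    ssh-copy-id hadoop@testhdp03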
     
    Configure the Spark system environment (do this on every worker node):
    sudo su - root
    vim /etc/profile
    export SPARK_HOME=/opt/spark-1.6.0-bin-hadoop2.2.0
    export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
     
    5. Package the configured Spark and send it to the worker nodes.
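     
    A sketch of this step, reusing the hostnames and paths from above (it assumes the hadoop user can write to /opt on the workers):
     
    cd /opt
    tar -zcf spark-1.6.0-bin-hadoop2.2.0.tar.gz spark-1.6.0-bin-hadoop2.2.0
    for host in testhdp02 testhdp03; do
      scp spark-1.6.0-bin-hadoop2.2.0.tar.gz hadoop@$host:/opt/
      ssh hadoop@$host "cd /opt && tar -zxf spark-1.6.0-bin-hadoop2.2.0.tar.gz"
    done
     
    After distribution, running sbin/start-all.sh on the master should bring up a Master process locally and a Worker on each node listed in slaves (check with jps, or the web UI on port 8080).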
     
     
     
    III. Troubleshooting
     
    bin/spark-shell
    then run:
    val textFile = sc.textFile("README.md")
    textFile.count()
     
    The following error appears:
     
    Caused by: java.lang.reflect.InvocationTargetException
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
            ... 61 more
    Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
            at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:135)
            at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:175)
            at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
            ... 66 more
    Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
            at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
            at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
            ... 68 more
     
    Solution:
    The Hadoop configuration (io.compression.codecs in core-site.xml) references com.hadoop.compression.lzo.LzoCodec, but the codec jar is not on Spark's classpath. Edit the spark-env.sh file so that the Hadoop library directories are added to SPARK_CLASSPATH:
     
    export SCALA_HOME=/opt/scala-2.10.1
    export JAVA_HOME=/opt/jdk1.7.0_51
    export SPARK_MASTER_IP=192.168.22.7
    export HADOOP_HOME=/opt/hadoop-2.2.0
    export SPARK_HOME=/opt/spark-1.6.0-bin-hadoop2.2.0
    export SPARK_LIBRARY_PATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib/native
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop/
    export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
    export SPARK_JAR=$SPARK_HOME/lib/spark-assembly-1.6.0-hadoop2.2.0.jar
    export SPARK_CLASSPATH=$SPARK_CLASSPATH:$HADOOP_HOME/share/hadoop/yarn/*:$HADOOP_HOME/share/hadoop/yarn/lib/*:$HADOOP_HOME/share/hadoop/common/*:$HADOOP_HOME/share/hadoop/common/lib/*:$HADOOP_HOME/share/hadoop/hdfs/*:$HADOOP_HOME/share/hadoop/hdfs/lib/*:$HADOOP_HOME/share/hadoop/mapreduce/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*:$HADOOP_HOME/share/hadoop/tools/lib/*:$SPARK_HOME/lib/*
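
    The extra classpath entries only help if a hadoop-lzo jar actually exists somewhere under them; a quick check before restarting the shell:

    # the jar's location varies by install; an empty result means the codec
    # must be installed (or the lzo entry removed from io.compression.codecs)
    find $HADOOP_HOME -name "*lzo*.jar" 2>/dev/null

    Then restart bin/spark-shell and re-run textFile.count(); the ClassNotFoundException should no longer appear.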

     
     
     
     
     
     
     
     
     
     
     