  • Spark Notes

    [root@master bin]# ./spark-shell  --master yarn-client 
    20/03/21 02:48:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    20/03/21 02:48:40 INFO spark.SecurityManager: Changing view acls to: root
    20/03/21 02:48:40 INFO spark.SecurityManager: Changing modify acls to: root
    20/03/21 02:48:40 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    20/03/21 02:48:40 INFO spark.HttpServer: Starting HTTP Server
    20/03/21 02:48:41 INFO server.Server: jetty-8.y.z-SNAPSHOT
    20/03/21 02:48:41 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:50659
    20/03/21 02:48:41 INFO util.Utils: Successfully started service 'HTTP class server' on port 50659.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
          /_/
    
    Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
    Type in expressions to have them evaluated.
    Type :help for more information.
    20/03/21 02:48:52 INFO spark.SparkContext: Running Spark version 1.6.0
    20/03/21 02:48:52 INFO spark.SecurityManager: Changing view acls to: root
    20/03/21 02:48:52 INFO spark.SecurityManager: Changing modify acls to: root
    20/03/21 02:48:52 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    20/03/21 02:48:53 INFO util.Utils: Successfully started service 'sparkDriver' on port 38244.
    20/03/21 02:48:54 INFO slf4j.Slf4jLogger: Slf4jLogger started
    20/03/21 02:48:54 INFO Remoting: Starting remoting
    20/03/21 02:48:55 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.0.90:44009]
    20/03/21 02:48:55 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 44009.
    20/03/21 02:48:55 INFO spark.SparkEnv: Registering MapOutputTracker
    20/03/21 02:48:55 INFO spark.SparkEnv: Registering BlockManagerMaster
    20/03/21 02:48:55 INFO storage.DiskBlockManager: Created local directory at /usr/local/src/spark-1.6.0-bin-hadoop2.6/blockmgr-bae586ec-9c91-4d9b-bbec-682c94e5fe14
    20/03/21 02:48:55 INFO storage.MemoryStore: MemoryStore started with capacity 511.5 MB
    20/03/21 02:48:56 INFO spark.SparkEnv: Registering OutputCommitCoordinator
    20/03/21 02:48:56 INFO server.Server: jetty-8.y.z-SNAPSHOT
    20/03/21 02:48:56 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
    20/03/21 02:48:56 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
    20/03/21 02:48:56 INFO ui.SparkUI: Started SparkUI at http://192.168.0.90:4040
    20/03/21 02:48:57 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.0.90:8032
    20/03/21 02:48:58 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
    20/03/21 02:48:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
    20/03/21 02:48:58 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
    20/03/21 02:48:58 INFO yarn.Client: Setting up container launch context for our AM
    20/03/21 02:48:58 INFO yarn.Client: Setting up the launch environment for our AM container
    20/03/21 02:48:58 INFO yarn.Client: Preparing resources for our AM container
    20/03/21 02:49:00 INFO yarn.Client: Uploading resource file:/usr/local/src/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar -> hdfs://master:9000/user/root/.sparkStaging/application_1584680108277_0036/spark-assembly-1.6.0-hadoop2.6.0.jar
    scala> lines.flatMap(_.split(" "))
    res25: List[String] = List(Preface, “The, Forsyte, Saga”, was, the, title, originally, destined, for, that, part, of, it, which, is, called, “The, Man, of, Property”;, and, to, adopt, it, for, the, collected, chronicles, of, the, Forsyte, family, has, indulged, the, Forsytean, tenacity, that, is, in, all, of, us., The, word, Saga, might, be, objected, to, on, the, ground, that, it, connotes, the, heroic, and, that, there, is, little, heroism, in, these, pages., But, it, is, used, with, a, suitable, irony;, and,, after, all,, this, long, tale,, though, it, may, deal, with, folk, in, frock, coats,, furbelows,, and, a, gilt-edged, period,, is, not, devoid, of, the, essential, heat, of, conflict., Discounting, for, the, gigantic, stature, and, blood-thirstiness, of, old, days,, as, they, ha...
    scala> lines.flatMap(_.split(" ")).map((_,1)).groupBy(_._1).mapValues((_.size)).toArray.sortWith(_._2>_._2).slice(0,10) 
    res26: Array[(String, Int)] = Array((the,5144), (of,3407), (to,2782), (and,2573), (a,2543), (he,2139), (his,1912), (was,1702), (in,1694), (had,1526))
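
    The chained word count above uses plain Scala collection methods (note that res25 is a List, so `lines` here is an in-memory List[String] of the book's text rather than an RDD). A self-contained sketch of the same top-N pipeline, with a small hypothetical input standing in for the Forsyte Saga text:

    ```scala
    // Stand-in for `lines`: in the session above it holds the text of
    // "The Forsyte Saga"; any List[String] behaves the same way.
    val lines = List(
      "the quick brown fox",
      "the lazy dog and the fox"
    )

    // Same pipeline as in the REPL session: split into words, pair each
    // word with 1, group by word, count each group, sort by count
    // descending, and keep the top N.
    val top2 = lines
      .flatMap(_.split(" "))
      .map((_, 1))
      .groupBy(_._1)
      .mapValues(_.size)
      .toArray
      .sortWith(_._2 > _._2)
      .slice(0, 2)
    // top2: Array((the,3), (fox,2))
    ```

    The `map((_, 1))` step is redundant for plain collections (grouping the words directly and taking `_.size` gives the same counts); it mirrors the classic (word, 1) pairing used when the same logic runs as a Spark RDD job.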
    Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala> import spark.sql
    import spark.sql
    
    scala> val orders = sql("select * from badou.orders")
    20/03/22 10:23:16 ERROR metastore.ObjectStore: Version information found in metastore differs 0.13.0 from expected schema version 1.2.0. Schema verififcation is disabled hive.metastore.schema.verification so setting version.
    20/03/22 10:23:20 ERROR metastore.RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)
            at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:891)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
            at com.sun.proxy.$Proxy17.create_database(Unknown Source)
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:644)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
            at com.sun.proxy.$Proxy18.createDatabase(Unknown Source)
            at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:306)
            at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply$mcV$sp(HiveClientImpl.scala:309)
            at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:309)
            at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:309)
            at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:280)
            at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227)
            at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226)
            at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:269)
            at org.apache.spark.sql.hive.client.HiveClientImpl.createDatabase(HiveClientImpl.scala:308)
            at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply$mcV$sp(HiveExternalCatalog.scala:99)
            at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:99)
            at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:99)
            at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:72)
            at org.apache.spark.sql.hive.HiveExternalCatalog.createDatabase(HiveExternalCatalog.scala:98)
            at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:147)
            at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:89)
            at org.apache.spark.sql.hive.HiveSessionCatalog.<init>(HiveSessionCatalog.scala:51)
            at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:49)
            at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
            at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
            at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
            at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
            at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
            at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
            at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
            at $line18.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:24)
            at $line18.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:29)
            at $line18.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)
            at $line18.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)
            at $line18.$read$$iw$$iw$$iw$$iw.<init>(<console>:35)
            at $line18.$read$$iw$$iw$$iw.<init>(<console>:37)
            at $line18.$read$$iw$$iw.<init>(<console>:39)
            at $line18.$read$$iw.<init>(<console>:41)
            at $line18.$read.<init>(<console>:43)
            at $line18.$read$.<init>(<console>:47)
            at $line18.$read$.<clinit>(<console>)
            at $line18.$eval$.$print$lzycompute(<console>:7)
            at $line18.$eval$.$print(<console>:6)
            at $line18.$eval.$print(<console>)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
            at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
            at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
            at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
            at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
            at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
            at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
            at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
            at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
            at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
            at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
            at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
            at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415)
            at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)
            at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
            at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
            at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
            at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
            at org.apache.spark.repl.Main$.doMain(Main.scala:68)
            at org.apache.spark.repl.Main$.main(Main.scala:51)
            at org.apache.spark.repl.Main.main(Main.scala)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
            at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
            at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
            at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
            at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    
    orders: org.apache.spark.sql.DataFrame = [order_id: string, user_id: string ... 5 more fields]
    
    scala> orders.show(10)
    +--------+-------+--------+------------+---------+-----------------+----------------------+
    |order_id|user_id|eval_set|order_number|order_dow|order_hour_of_day|days_since_prior_order|
    +--------+-------+--------+------------+---------+-----------------+----------------------+
    |order_id|user_id|eval_set|order_number|order_dow|order_hour_of_day|  days_since_prior_...|
    | 2539329|      1|   prior|           1|        2|               08|                      |
    | 2398795|      1|   prior|           2|        3|               07|                  15.0|
    |  473747|      1|   prior|           3|        3|               12|                  21.0|
    | 2254736|      1|   prior|           4|        4|               07|                  29.0|
    |  431534|      1|   prior|           5|        4|               15|                  28.0|
    | 3367565|      1|   prior|           6|        2|               07|                  19.0|
    |  550135|      1|   prior|           7|        1|               09|                  20.0|
    | 3108588|      1|   prior|           8|        1|               14|                  14.0|
    | 2295261|      1|   prior|           9|        1|               16|                   0.0|
    +--------+-------+--------+------------+---------+-----------------+----------------------+
    only showing top 10 rows
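
    One thing worth flagging in the output above: the first data row repeats the column names, which suggests the Hive table was loaded from a CSV file whose header line was never skipped. A minimal cleanup in the session itself might be `orders.filter($"order_id" =!= "order_id")` (assuming `import spark.implicits._`); the same idea in self-contained plain Scala, with hypothetical rows mirroring the output:

    ```scala
    // Rows as they come back from the table, with the stray CSV header
    // leaked in as a data row (as in the orders.show(10) output above).
    val rows = List(
      Seq("order_id", "user_id", "eval_set"), // header row stored as data
      Seq("2539329", "1", "prior"),
      Seq("2398795", "1", "prior")
    )

    // Keep only rows whose first field is not the literal column name.
    val dataRows = rows.filterNot(_.headOption.contains("order_id"))
    // dataRows has 2 rows; the header row is gone.
    ```

    Fixing it at the source (e.g. `TBLPROPERTIES ("skip.header.line.count"="1")` on the Hive table) avoids having to filter in every query.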
  • Original post: https://www.cnblogs.com/hackerer/p/12541479.html