Java学习者论坛 (Java Learners Forum)

Views: 391 | Replies: 0

[Default Category] Running a Scala Program Locally on Windows to Call Spark

Posted on 2018-6-8 10:20:28

Use case

  Spark, written in Scala, is an extremely powerful compute engine: it computes in memory and provides convenient facilities for graph processing, stream processing, machine learning, and interactive queries. That makes Scala a natural language for Spark programming. Below is a simple example of using Spark to connect to MySQL and read data.

Procedure

  If you followed the earlier post on setting up a Scala development environment on Windows, that setup already deployed the Spark environment, so you can go straight to writing Spark programs in Scala.

[code]package epoint.com.cn.test001

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object SparkConnMysql {
  def main(args: Array[String]) {
    println("Hello, world!")
    // Run Spark in local mode, so no cluster is needed
    val conf = new SparkConf()
    conf.setAppName("wow,my first spark app")
    conf.setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // JDBC connection settings for the MySQL database and the table to read
    val url = "jdbc:mysql://192.168.114.67:3306/user"
    val table = "user"
    val reader = sqlContext.read.format("jdbc")
    reader.option("url", url)
    reader.option("dbtable", table)
    reader.option("driver", "com.mysql.jdbc.Driver")
    reader.option("user", "root")
    reader.option("password", "11111")
    // Load the table into a DataFrame and print its contents
    val df = reader.load()
    df.show()
  }
}
[/code]
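As an aside, `DataFrameReader.option` returns the reader itself, so the repeated `reader.option(...)` calls above can equally be written in the chained builder style that is more common in Spark code. A minimal sketch, using the same placeholder host, credentials, and table as above:

```scala
// Chained form of the reader setup above; behavior is identical,
// since each option() call mutates and returns the same DataFrameReader.
val df = sqlContext.read.format("jdbc")
  .option("url", "jdbc:mysql://192.168.114.67:3306/user")
  .option("dbtable", "user")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("user", "root")
  .option("password", "11111")
  .load()
```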
Run result:
[code]Hello, world!
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/D:/spark1.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/spark1.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/kettle7.1/inceptor-driver.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/11/21 11:43:53 INFO SparkContext: Running Spark version 1.6.1
17/11/21 11:43:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/11/21 11:43:56 INFO SecurityManager: Changing view acls to: lenovo
17/11/21 11:43:56 INFO SecurityManager: Changing modify acls to: lenovo
17/11/21 11:43:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(lenovo); users with modify permissions: Set(lenovo)
17/11/21 11:43:59 INFO Utils: Successfully started service 'sparkDriver' on port 55824.
17/11/21 11:43:59 INFO Slf4jLogger: Slf4jLogger started
17/11/21 11:43:59 INFO Remoting: Starting remoting
17/11/21 11:43:59 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.114.67:55837]
17/11/21 11:43:59 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55837.
17/11/21 11:43:59 INFO SparkEnv: Registering MapOutputTracker
17/11/21 11:43:59 INFO SparkEnv: Registering BlockManagerMaster
17/11/21 11:43:59 INFO DiskBlockManager: Created local directory at C:\Users\lenovo\AppData\Local\Temp\blockmgr-16383e3c-7cb6-43c7-b300-ccc1a1561bb4
17/11/21 11:43:59 INFO MemoryStore: MemoryStore started with capacity 1129.9 MB
17/11/21 11:44:00 INFO SparkEnv: Registering OutputCommitCoordinator
17/11/21 11:44:00 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/11/21 11:44:00 INFO SparkUI: Started SparkUI at http://192.168.114.67:4040
17/11/21 11:44:00 INFO Executor: Starting executor ID driver on host localhost
17/11/21 11:44:00 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55844.
17/11/21 11:44:00 INFO NettyBlockTransferService: Server created on 55844
17/11/21 11:44:00 INFO BlockManagerMaster: Trying to register BlockManager
17/11/21 11:44:00 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55844 with 1129.9 MB RAM, BlockManagerId(driver, localhost, 55844)
17/11/21 11:44:00 INFO BlockManagerMaster: Registered BlockManager
17/11/21 11:44:05 INFO SparkContext: Starting job: show at SparkConnMysql.scala:25
17/11/21 11:44:05 INFO DAGScheduler: Got job 0 (show at SparkConnMysql.scala:25) with 1 output partitions
17/11/21 11:44:05 INFO DAGScheduler: Final stage: ResultStage 0 (show at SparkConnMysql.scala:25)
17/11/21 11:44:05 INFO DAGScheduler: Parents of final stage: List()
17/11/21 11:44:05 INFO DAGScheduler: Missing parents: List()
17/11/21 11:44:05 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at show at SparkConnMysql.scala:25), which has no missing parents
17/11/21 11:44:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 5.2 KB, free 5.2 KB)
17/11/21 11:44:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.5 KB, free 7.7 KB)
17/11/21 11:44:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55844 (size: 2.5 KB, free: 1129.9 MB)
17/11/21 11:44:06 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/11/21 11:44:06 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at show at SparkConnMysql.scala:25)
17/11/21 11:44:06 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
17/11/21 11:44:06 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 1922 bytes)
17/11/21 11:44:06 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/11/21 11:44:06 INFO JDBCRDD: closed connection
17/11/21 11:44:06 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 3472 bytes result sent to driver
17/11/21 11:44:06 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 224 ms on localhost (1/1)
17/11/21 11:44:06 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/11/21 11:44:06 INFO DAGScheduler: ResultStage 0 (show at SparkConnMysql.scala:25) finished in 0.261 s
17/11/21 11:44:06 INFO DAGScheduler: Job 0 finished: show at SparkConnMysql.scala:25, took 1.467252 s
+---+----+----+------------+------------------+---------+-------+
| id|name| age|       phone|             email|startdate|enddate|
+---+----+----+------------+------------------+---------+-------+
| 11| 徐心三|  24|     2423424|    2423424@qq.com|     null|   null|
| 33| 徐心七|  23|    23232323|          13131@qe|     null|   null|
| 55|  徐彬|  22| 15262301036|徐彬757661238@ww.com|     null|   null|
| 44|  徐成|3333| 23423424332|    2342423@qq.com|     null|   null|
| 66| 徐心四|  23|242342342423|   徐彬23424@qq.com|     null|   null|
| 11| 徐心三|  24|     2423424|    2423424@qq.com|     null|   null|
| 33| 徐心七|  23|    23232323|          13131@qe|     null|   null|
| 55|  徐彬|  22| 15262301036|徐彬757661238@ww.com|     null|   null|
| 44|  徐成|3333| 23423424332|    2342423@qq.com|     null|   null|
| 66| 徐心四|  23|242342342423|   徐彬23424@qq.com|     null|   null|
| 88| 徐心八| 123|   131231312|       123123@qeqe|     null|   null|
| 99| 徐心二|  23|    13131313|   1313133@qeq.com|     null|   null|
|121| 徐心五|  13|   123131231|    1231312@qq.com|     null|   null|
|143| 徐心九|  23|      234234|        徐彬234@wrwr|     null|   null|
+---+----+----+------------+------------------+---------+-------+
only showing top 14 rows
17/11/21 11:44:06 INFO SparkContext: Invoking stop() from shutdown hook
17/11/21 11:44:06 INFO SparkUI: Stopped Spark web UI at http://192.168.114.67:4040
17/11/21 11:44:06 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/11/21 11:44:06 INFO MemoryStore: MemoryStore cleared
17/11/21 11:44:06 INFO BlockManager: BlockManager stopped
17/11/21 11:44:06 INFO BlockManagerMaster: BlockManagerMaster stopped
17/11/21 11:44:06 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/11/21 11:44:06 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/11/21 11:44:06 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
17/11/21 11:44:06 INFO SparkContext: Successfully stopped SparkContext
17/11/21 11:44:07 INFO ShutdownHookManager: Shutdown hook called
17/11/21 11:44:07 INFO ShutdownHookManager: Deleting directory C:\Users\lenovo\AppData\Local\Temp\spark-7877d903-f8f7-4efb-9e0c-7a11ac147153
[/code]
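Once the table is loaded, the DataFrame can also be queried with ordinary SQL through the same `SQLContext`. A minimal sketch against the Spark 1.6 API used in this post, assuming the `df` from the program above and the column names visible in the output:

```scala
// Register the DataFrame as a temporary table (Spark 1.6 API;
// later versions renamed this to createOrReplaceTempView).
df.registerTempTable("user")
// Query it with SQL; columns match the df.show() output above.
sqlContext.sql("SELECT name, phone FROM user WHERE age < 30").show()
```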
