Spark Streaming with Scala and Akka
Apache Spark is a fast and general engine for large-scale data processing. This Typesafe Activator template demonstrates Apache Spark for near-real-time data streaming with Scala and Akka using the Spark Streaming extension.
This tutorial demonstrates how to use Spark Streaming with the Akka actor system so that actors serve as receivers for incoming streams. Since actors can receive data from any input source, this approach extends Spark Streaming's set of built-in streaming sources.
Develop Spark Streaming application
You start developing a Spark Streaming application by creating a SparkConf, followed by a StreamingContext.
val conf = new SparkConf(false) // skip loading external settings
.setMaster("local[*]") // run locally with enough threads
.setAppName("Spark Streaming with Scala and Akka") // name in Spark web UI
val ssc = new StreamingContext(conf, Seconds(1))
With the StreamingContext in place, you can register an actor-based receiver using ssc.actorStream, which returns a stream of type ReceiverInputDStream:
val actorName = "helloer"
val actorStream: ReceiverInputDStream[String] = ssc.actorStream[String](Props[Helloer], actorName)
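The Helloer actor referenced by Props[Helloer] is not shown above. A minimal sketch of such a receiver actor, assuming the Spark 1.x ActorHelper trait (which provides the store method for pushing received data into Spark), could look like this:

```scala
import akka.actor.Actor
import org.apache.spark.streaming.receiver.ActorHelper

// A minimal receiver actor: every String message it receives is
// stored into Spark's memory for processing by the DStream.
class Helloer extends Actor with ActorHelper {
  def receive = {
    case s: String => store(s)
  }
}
```

The actor itself contains no Spark-specific logic beyond store; Spark Streaming creates it on an executor when the context starts.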
Having a DStream lets you define a high-level processing pipeline in Spark Streaming. In the above case, calling the print() method on the stream prints the first ten elements of each RDD generated in this DStream. Nothing happens until the context is started with ssc.start().
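Putting those two steps together, the pipeline definition and start-up can be sketched as follows (assuming the actorStream and ssc values defined above):

```scala
// Define the pipeline: print the first ten elements of each batch.
actorStream.print()

// Start the streaming computation; the receiver actor is created now.
ssc.start()
```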
With the context up and running, the code connects to the remote Akka actor system inside Spark Streaming that hosts the helloer actor, and sends it messages that, as described above, are all printed to standard output.
// requires: import scala.concurrent.Await
//           import scala.concurrent.duration._
val actorSystem = SparkEnv.get.actorSystem
// driverHost and driverPort are the host and port of the Spark driver
val url = s"akka.tcp://spark@$driverHost:$driverPort/user/Supervisor0/$actorName"
val timeout = 100.seconds
val helloer = Await.result(actorSystem.actorSelection(url).resolveOne(timeout), timeout)
helloer ! "Hello"
helloer ! "from"
helloer ! "Apache Spark (Streaming)"
helloer ! "and"
helloer ! "Akka"
helloer ! "and"
helloer ! "Scala"
The Scala version is available at
Run the App
Let's run the sample application.
In Run, select the application to run from the drop-down list under Main Class, and click Start. Feel free to modify, compile, and re-run the sample.
The Spark Documentation offers Setup instructions, programming guides, and other documentation.
If you have questions, don't hesitate to post them to the firstname.lastname@example.org mailing list or to contact the author of the activator.