This application is a Spark 2.1 REST service provider built on an embedded, Reactive-Streams-based, fully asynchronous HTTP server (Akka HTTP).
I wanted to build an interactive REST API service on top of my Apache Spark application that serves use cases like:
- Load a trained model in the SparkSession and quickly run predictions for a user-supplied query.
- Keep your big data cached in the cluster and give users an endpoint to query it.
- Run recurring Spark queries with varying parameters.
As you can see, the core of the application is not primarily a web application or browser interaction, but a REST service that performs big-data cluster computation on Apache Spark.
With Akka HTTP, you normally don't build your application on top of Akka HTTP; you build it on top of whatever makes sense and use Akka HTTP merely for the HTTP integration needs. So I found Akka HTTP to be the right fit for the use cases mentioned above.
- homepage - http://localhost:8001 - says "hello world"
- version - http://localhost:8001/version - queries the shared SparkSession and reports the Spark version
- activeStreams - http://localhost:8001/activeStreams - reports how many Spark streams are currently active
- count - http://localhost:8001/count - runs a sample Spark job that counts the number of elements in a sequence
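The four endpoints above could be wired up with Akka HTTP's routing DSL roughly as follows. This is a minimal sketch, assuming Akka HTTP 10.0.x on Scala 2.11; the object name RestServiceSketch, the response strings, and the 1 to 100 sequence are illustrative assumptions rather than the project's actual code.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import akka.stream.ActorMaterializer
import org.apache.spark.sql.SparkSession

// Hypothetical object name; not taken from the project sources.
object RestServiceSketch {
  // One SparkSession shared by every route. The master and app name
  // are the defaults listed in the application's help text.
  lazy val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkAsRestService")
    .getOrCreate()

  val route: Route =
    get {
      pathEndOrSingleSlash {
        complete("hello world")
      } ~
      path("version") {
        complete(s"spark version: ${spark.version}")
      } ~
      path("activeStreams") {
        complete(s"active streams: ${spark.streams.active.length}")
      } ~
      path("count") {
        complete {
          // Runs a small Spark job per request. The call blocks, so a real
          // service would likely move it to a dedicated dispatcher.
          val n = spark.sparkContext.parallelize(1 to 100).count()
          s"count: $n"
        }
      }
    }

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("spark-rest")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    // Port 8001 is the default from the help text.
    Http().bindAndHandle(route, "localhost", 8001)
  }
}
```

Keeping the SparkSession outside the route definition means Akka HTTP only handles the HTTP layer, while all cluster work stays in ordinary Spark calls.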
The following picture illustrates the routing of an HttpRequest:
It uses Scala 2.11, Spark 2.1, and Akka HTTP. Build it with Maven:
mvn clean install
We can start our application as a stand-alone jar like this:
mvn exec:java
Optionally, you can provide configuration params such as spark-master, akka-port, etc. from the command line. To see the list of configurable params, run:
mvn exec:java -Dexec.args="--help"
OR
mvn exec:java -Dexec.args="-h"
Help content will look something like this:
This application comes as Spark2.1-REST-Service-Provider using an embedded,
Reactive-Streams-based, fully asynchronous HTTP server (i.e., using akka-http).
So, this application needs config params like AkkaWebPort to bind to, SparkMaster
and SparkAppName
Usage: spark-submit spark-as-service-using-embedded-server.jar [options]
Options:
-h, --help
-m, --master <master_url> spark://host:port, mesos://host:port, yarn, or local. Default: local
-n, --name <name> A name of your application. Default: SparkAsRestService
-p, --akkaHttpPort <portnumber> Port where akka-http is binded. Default: 8001
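The options in the help text above could be parsed with a few lines of plain Scala. This is a minimal sketch, not the project's actual parser: ServiceConfig and ArgsSketch are hypothetical names, the defaults are copied from the help text, and handling of -h/--help is left out for brevity.

```scala
// Hypothetical holder for the three options listed in the help text.
final case class ServiceConfig(
  master: String = "local",
  name: String = "SparkAsRestService",
  akkaHttpPort: Int = 8001
)

object ArgsSketch {
  // Consumes the arguments as flag/value pairs, starting from the defaults.
  // An unknown flag, or a flag without a value, raises an error.
  @annotation.tailrec
  def parse(args: List[String], acc: ServiceConfig = ServiceConfig()): ServiceConfig =
    args match {
      case Nil =>
        acc
      case ("-m" | "--master") :: value :: rest =>
        parse(rest, acc.copy(master = value))
      case ("-n" | "--name") :: value :: rest =>
        parse(rest, acc.copy(name = value))
      case ("-p" | "--akkaHttpPort") :: value :: rest =>
        parse(rest, acc.copy(akkaHttpPort = value.toInt))
      case other :: _ =>
        throw new IllegalArgumentException(s"Unknown or incomplete option: $other")
    }
}
```

For example, ArgsSketch.parse(List("--master", "yarn", "-p", "9000")) yields ServiceConfig("yarn", "SparkAsRestService", 9000), while an empty argument list returns the defaults unchanged.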
There are 2 ways to change the default param values:
- Update the src/main/resources/application.conf file directly, then build and run.
- Pass the values on the command line: mvn exec:java -Dexec.args="--master <master> --name <spark-app-name> --akkaHttpPort <port-to-which-akka-should-listen-to>"
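One way the file defaults and command-line values could be combined is with the Typesafe Config library that ships with Akka, letting command-line values take precedence and falling back to the file for anything not supplied. This is a sketch under stated assumptions: the key names (app.sparkMaster, app.sparkAppName, app.akkaHttpPort) and the ConfigSketch object are hypothetical, and the inline string stands in for the real application.conf, whose contents are not shown here.

```scala
import com.typesafe.config.{Config, ConfigFactory}
import scala.collection.JavaConverters._

// Hypothetical object and key names, for illustration only.
object ConfigSketch {
  // Stand-in for src/main/resources/application.conf.
  val fileDefaults: Config = ConfigFactory.parseString(
    """
      |app.sparkMaster = "local"
      |app.sparkAppName = "SparkAsRestService"
      |app.akkaHttpPort = 8001
      |""".stripMargin)

  // Values given on the command line win; missing keys fall back to the file.
  def resolve(overrides: Map[String, String]): Config =
    ConfigFactory.parseMap(overrides.asJava).withFallback(fileDefaults)
}
```

With this layering, ConfigSketch.resolve(Map("app.sparkMaster" -> "yarn")) reports "yarn" for app.sparkMaster and keeps the file's value for every other key.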