Submit your Spark jobs using a REST API.
- Submit a Spark job.
- Resubmit a Spark job using its id.
- Get the user's jobs.
- Get job details.
- Token-based authentication.
Ideas on how to improve the project:
- Create a front end, maybe ScalaJS/React?
- Schedule spark jobs e.g. run every Monday at 1PM.
- Send emails with job statuses.
- Set up a test module to test Spark and DB integration (TravisCI).
- Improve README :)
Here is the Sparxer API description that will help you submit a Spark job, resubmit it, list your submitted jobs, and look up a job's details.
Sparxer uses token-based authentication (Bearer). A generated token is valid for one hour.
TOKEN=$(curl -s -H 'Content-Type: application/json' --data '{"email": "{email}", "password": "{password}"}' -X POST http://{hostname}:{port}/auth/sign-in)
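For example, with Sparxer running on localhost:8080 and a hypothetical account (all values below are placeholders), the sign-in call could look like this:
TOKEN=$(curl -s -H 'Content-Type: application/json' --data '{"email": "admin@example.com", "password": "secret"}' -X POST http://localhost:8080/auth/sign-in)
echo "Token: ${TOKEN}"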
The Sparxer submit API closely mirrors the native spark-submit tool.
curl --data '{json}' -H 'Content-Type: application/json' -H "Authorization: Bearer ${TOKEN}" -X POST http://{hostname}:{port}/spark/submit
JSON format:
{
  "mainClass": "org.apache.spark.examples.SparkPi",
  "master": "local[*]",
  "deployMode": "client",
  "jar": "/opt/spark/examples/jars/spark-examples_2.11-2.4.3.jar",
  "sparkConf": {},
  "args": [
    "100"
  ],
  "envs": {
    "SPARK_HOME": "/opt/spark"
  }
}
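For example, assuming the payload above is saved to a file named job.json (the file name is just an illustration), the submit call could be:
curl -H 'Content-Type: application/json' -H "Authorization: Bearer ${TOKEN}" --data @job.json -X POST http://{hostname}:{port}/spark/submit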
Sparxer allows you to resubmit an existing job by providing its id.
curl -H 'Content-Type: application/json' --data '{"id": 12345}' -H "Authorization: Bearer ${TOKEN}" -X POST http://{hostname}:{port}/spark/resubmit
You can list all jobs submitted by the logged-in user.
curl -H 'Accept: application/json' -H "Authorization: Bearer ${TOKEN}" -X GET http://{hostname}:{port}/jobs
You can get the details of a submitted job, including its configuration and a list of its statuses.
curl -H 'Accept: application/json' -H "Authorization: Bearer ${TOKEN}" -X GET http://{hostname}:{port}/jobs/12345
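Putting it all together, here is a minimal end-to-end sketch. It assumes Sparxer runs on localhost:8080, the credentials are valid, the submit payload above is saved as job.json, and the sign-in endpoint returns the raw token in the response body (as in the sign-in example); the job id 12345 is only a placeholder.
#!/usr/bin/env bash
set -euo pipefail

HOST=localhost
PORT=8080

# Sign in and capture the bearer token (valid for one hour).
TOKEN=$(curl -s -H 'Content-Type: application/json' \
  --data '{"email": "admin@example.com", "password": "secret"}' \
  -X POST "http://${HOST}:${PORT}/auth/sign-in")

# Submit a job described by job.json (see the JSON format above).
curl -s -H 'Content-Type: application/json' -H "Authorization: Bearer ${TOKEN}" \
  --data @job.json -X POST "http://${HOST}:${PORT}/spark/submit"

# Resubmit an existing job by id (12345 is a placeholder).
curl -s -H 'Content-Type: application/json' --data '{"id": 12345}' \
  -H "Authorization: Bearer ${TOKEN}" -X POST "http://${HOST}:${PORT}/spark/resubmit"

# List all jobs submitted by the logged-in user.
curl -s -H 'Accept: application/json' -H "Authorization: Bearer ${TOKEN}" \
  -X GET "http://${HOST}:${PORT}/jobs"

# Fetch the configuration and statuses of a specific job.
curl -s -H 'Accept: application/json' -H "Authorization: Bearer ${TOKEN}" \
  -X GET "http://${HOST}:${PORT}/jobs/12345"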
todo...