Activation via AWS CLI

This page describes how to activate the Xonai Accelerator through the AWS command-line interface (CLI).

Before creating the cluster, copy the following script, set key-pair-name to the name of your EC2 key pair (the .pem file name without the extension), and set BOOTSTRAP_ACTION_SCRIPT to the S3 path where the Xonai activation script is stored.

export KEYNAME=<key-pair-name>
export CLUSTER_NAME=xonai-spark-cluster
export EMR_RELEASE_LABEL=emr-6.7.0
export INSTANCE_TYPE=m5.xlarge
export CONFIG_JSON_LOCATION=./xonai-configuration.json
export BOOTSTRAP_ACTION_SCRIPT=<xonai-activation-script-path>
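
Before launching, it can be worth verifying that the activation script actually exists at the S3 path you configured; a quick check, assuming the variables above are set in the current shell:

```shell
# List the object at the configured S3 path; an empty result means
# the activation script is missing or the path is wrong.
aws s3 ls "$BOOTSTRAP_ACTION_SCRIPT"
```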

Create the EMR cluster with the following command. On successful creation it outputs JSON containing, among other fields, the ID of the new cluster:

aws emr create-cluster \
  --name $CLUSTER_NAME \
  --release-label $EMR_RELEASE_LABEL \
  --service-role EMR_DefaultRole \
  --applications Name=Hadoop Name=Spark \
  --ec2-attributes KeyName=$KEYNAME,InstanceProfile=EMR_EC2_DefaultRole \
  --instance-type $INSTANCE_TYPE \
  --configurations file://$CONFIG_JSON_LOCATION \
  --bootstrap-actions Name='Xonai Accelerator activation',Path=$BOOTSTRAP_ACTION_SCRIPT

Example output:

    "ClusterId": "j-3HWJEKDYQWKCU",

Cluster creation can take up to about five minutes.
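
Instead of re-running describe-cluster by hand, the AWS CLI provides a waiter that blocks until the cluster is up; a sketch, assuming your CLI version supports the emr wait subcommand:

```shell
# Block until the cluster reaches the RUNNING or WAITING state;
# the waiter polls periodically and errors out if the cluster fails.
aws emr wait cluster-running --cluster-id <cluster_id>
```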

Check whether the cluster is ready to run steps with the following command; the output JSON should report a state of “WAITING”:

aws emr describe-cluster --cluster-id <cluster_id>

Example output (truncated):

    "Cluster": {
        "Id": "j-3HWJEKDYQWKCU",
        "Name": "xonai-spark-cluster",
        "Status": {
            "State": "WAITING",
            "StateChangeReason": {
                "Message": "Cluster ready to run steps."

Submit a Spark Application to the EMR Cluster With Xonai

SSH into the master node of the cluster in “WAITING” state with the following command:

aws emr ssh --cluster-id <cluster_id> --key-pair-file ~/<my-key-pair.pem>

Now you can submit a Xonai-accelerated Spark application just like any ordinary Spark application via spark-submit. For example:

spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.memoryOverhead=4g \
  $SPARK_HOME/examples/jars/spark-examples.jar
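
The same job can also be submitted from your local machine as an EMR step, without SSHing in; a sketch, assuming the examples jar sits at EMR's default location /usr/lib/spark/examples/jars:

```shell
# Submit SparkPi as an EMR step; the Args list mirrors the
# spark-submit arguments above. ActionOnFailure=CONTINUE keeps the
# cluster alive if the step fails.
aws emr add-steps --cluster-id <cluster_id> --steps \
  'Type=Spark,Name=SparkPi,ActionOnFailure=CONTINUE,Args=[--class,org.apache.spark.examples.SparkPi,--conf,spark.executor.memoryOverhead=4g,/usr/lib/spark/examples/jars/spark-examples.jar]'
```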

The console output should include a message indicating that the plugin component was initialized.
Not setting spark.executor.memoryOverhead, or setting it too low, will cause the application to fail with an error message describing the problem.

Cluster Termination

When you are done submitting Spark applications, do not forget to terminate the cluster, either via the “Terminate” button in the EMR console or via the CLI:

aws emr terminate-clusters --cluster-ids <cluster_id>
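
If you want to confirm the shutdown completed, the matching waiter blocks until the cluster is gone:

```shell
# Block until the cluster reports the TERMINATED state.
aws emr wait cluster-terminated --cluster-id <cluster_id>
```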

See the official Amazon EMR guide to learn more about launching EMR clusters.