How does a Google Dataproc cluster auto scale?
Job Submission on the cluster
Job
Jobs can be submitted to the created cluster using the console screen.
Job ID: A unique job ID needs to be given to the job.
Cluster: We select the cluster created earlier.
Job Type: We selected the job type “Spark” for our PoC, but the other available job types are Hadoop, Spark SQL, PySpark, MapReduce, Hive, and Pig.
Main class or Jar: The fully qualified class name (FQCN) of the main class of your project, e.g. com.acmesolutions.component.CalculateIncentive.
Arguments: Any additional command-line arguments to be passed to the main class go here. We passed six parameters to our main class.
Jars: A comma-separated list of jars needed to run the job. We chose to keep our jars in a directory on the master node, so each complete path was provided in the Jars input, prefixed with file:///.
Example of Jars:
file:///root/project_dir/ProjectMain.jar,file:///root/project_dir/lib/GoogleMapsService.jar,file:///root/project_dir/lib/HikariCP-java7-2.4.12.jar,file:///root/project_dir/lib/JavaEWAH-0.3.2.jar …
So the jar GoogleMapsService.jar physically resides in the directory /root/project_dir on the master node. Our main class was part of ProjectMain.jar, which also resided in a directory on the master node.
Properties: Any external properties to be passed to the job can be set here. These properties can have an impact on resource allocation & scaling, which we shall discuss in Part II of this article.
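The console fields above map directly onto the payload of Dataproc's jobs.submit API (the same shape the client libraries accept). Here is a minimal Python sketch of that mapping; the project, region, cluster name, arguments, and property values are hypothetical placeholders, and only the main class and jar path come from our example:

```python
def build_spark_job_request(project_id, region, cluster_name,
                            main_class, jar_uris, args=None, properties=None):
    """Build a request body in the shape accepted by Dataproc's jobs.submit API."""
    return {
        "project_id": project_id,
        "region": region,
        "job": {
            "placement": {"cluster_name": cluster_name},
            "spark_job": {
                "main_class": main_class,        # the console's "Main class or jar" (FQCN)
                "jar_file_uris": jar_uris,       # the comma-separated "Jars" field, as a list
                "args": args or [],              # the "Arguments" field
                "properties": properties or {},  # the "Properties" field
            },
        },
    }

request = build_spark_job_request(
    project_id="my-project",        # hypothetical
    region="us-central1",           # hypothetical
    cluster_name="poc-cluster",     # hypothetical
    main_class="com.acmesolutions.component.CalculateIncentive",
    jar_uris=["file:///root/project_dir/ProjectMain.jar"],
    args=["arg1", "arg2"],          # placeholders for the six PoC parameters
)
```

This dict is what you would pass to a Dataproc job-submission call; the console simply fills the same structure from its form fields.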
Load creation
Auto scaling is invoked only when load is induced on the cluster by one or more jobs. In a production scenario, we may need to execute multiple jobs in parallel. For the PoC, we decided to create multiple instances of our test job, each with its own independent directories on HDFS to read from and write into. These jobs were added in a staggered manner to mimic a real-world scenario.
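The load-generation setup above can be sketched in a few lines of Python. The job-ID naming scheme, base HDFS path, and stagger interval below are all illustrative assumptions, not values from the PoC; the point is simply that each job instance gets its own HDFS directories and a delayed start:

```python
def staggered_job_specs(n_jobs, base_dir="hdfs:///poc", stagger_seconds=120):
    """Create n copies of the test job, each with independent HDFS
    read/write directories and a staggered submission delay."""
    specs = []
    for i in range(n_jobs):
        specs.append({
            "job_id": f"test-job-{i}",                  # hypothetical naming scheme
            "input_dir": f"{base_dir}/job{i}/input",    # independent read directory
            "output_dir": f"{base_dir}/job{i}/output",  # independent write directory
            "submit_after_s": i * stagger_seconds,      # staggered start time
        })
    return specs
```

A driver script would then submit each spec once its `submit_after_s` offset has elapsed, so the jobs ramp up the cluster load gradually rather than all at once.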
Auto scaling trigger
Yarn Memory
As we submit multiple parallel jobs to the cluster, it starts to allocate Yarn cores & memory for the jobs on the cluster. As the allocated Yarn memory increases (Figure 1), the pending memory reduces (Figure 2).
It reaches a point where there is no Yarn memory left on the cluster to allocate; this triggers the creation of additional node(s). These additional nodes are copies of the worker node, each having 2 cores and 7.5 GB RAM, making 2 Yarn cores and 6 GB Yarn memory available per node. At the peak of the job execution, the active node count went up to 10. So 20 Yarn cores and 60 GB of Yarn memory were in action at that point in time (Figure 3).
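The peak numbers above are just per-node capacity multiplied out. A quick sanity check in Python, using the per-node figures from our cluster (2 Yarn cores and 6 GB of the machine's 7.5 GB RAM exposed to Yarn, the remainder being reserved for the OS and daemons):

```python
YARN_CORES_PER_NODE = 2   # both machine cores exposed to Yarn
YARN_MEM_GB_PER_NODE = 6  # of the 7.5 GB RAM; the rest is reserved for OS/daemons

def yarn_capacity(active_nodes):
    """Total Yarn cores and Yarn memory (GB) across the active workers."""
    return (active_nodes * YARN_CORES_PER_NODE,
            active_nodes * YARN_MEM_GB_PER_NODE)

cores, mem_gb = yarn_capacity(10)  # peak: 10 active workers -> 20 cores, 60 GB
```

The same function gives the baseline capacity of the initial 2-worker cluster: 4 Yarn cores and 12 GB of Yarn memory.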
Scaled Workers
After the jobs are complete, the cluster automatically scales down by decommissioning the workers, and the cluster returns to its original state of 2 active workers and 8 decommissioned workers, as shown in the following figure:
CPU utilization
The following figure depicts how CPU utilization for the cluster goes down as more nodes are added to it. As pointed out in the auto scaling policy, Yarn's default memory-based auto scaling may leave surplus cores available on the cluster (see the 15% CPU utilization, i.e. 85% CPU availability, around 1 PM).
Task distribution across the workers
The following figure depicts the executors in action executing one of the multiple jobs submitted to the cluster. The job consists of 604 tasks divided among the 8 executors on different worker nodes.
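An even split of 604 tasks over 8 executors works out to roughly 75–76 tasks each. The sketch below is only an approximation of what the figure shows: in reality Spark's scheduler assigns tasks dynamically based on executor availability and data locality, not in fixed chunks.

```python
def distribute_tasks(total_tasks, n_executors):
    """Split tasks as evenly as possible across executors.
    Illustrative only; Spark assigns tasks dynamically at runtime."""
    base, extra = divmod(total_tasks, n_executors)
    # The first `extra` executors take one extra task each.
    return [base + 1 if i < extra else base for i in range(n_executors)]

counts = distribute_tasks(604, 8)  # -> [76, 76, 76, 76, 75, 75, 75, 75]
```

So under a perfectly even split, four executors would process 76 tasks and four would process 75.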
Conclusion
With this, we have covered setting up the cluster, defining the auto scaling policy, submitting jobs to the cluster, inducing load with multiple jobs, and monitoring how the cluster auto scales up & down.