GCE Extensions
To use this Apache Druid extension, include gce-extensions in the extensions load list.
At the moment, this extension only enables Druid to autoscale instances in GCE.
The extension manages the instances to be scaled up and down through the use of the Managed Instance Groups of GCE (MIG from now on). This choice has been made to ease the configuration of the machines and simplify their management.
For this reason, in order to use this extension, the user must have created:
- An instance template with the right machine type and image to be used to run the MiddleManager
- A MIG that has been configured to use the instance template described in the previous point
Moreover, in order to be able to rescale the machines in the MIG, the Overlord must run with a service account that grants the following two scopes from the Compute Engine API:
https://www.googleapis.com/auth/cloud-platform
https://www.googleapis.com/auth/compute
Overlord Dynamic Configuration
The Overlord can dynamically change worker behavior.
The JSON object can be submitted to the Overlord via a POST request at:
http://<OVERLORD_IP>:<port>/druid/indexer/v1/worker
Optional Header Parameters for auditing the config change can also be specified.
Header Param Name | Description | Default |
---|---|---|
X-Druid-Author | author making the config change | "" |
X-Druid-Comment | comment describing the change being done | "" |
A sample worker config spec is shown below:
{
  "autoScaler": {
    "envConfig": {
      "numInstances": 1,
      "projectId": "super-project",
      "zoneName": "us-central-1",
      "managedInstanceGroupName": "druid-middlemanagers"
    },
    "maxNumWorkers": 4,
    "minNumWorkers": 2,
    "type": "gce"
  }
}
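As a rough illustration of how the spec above could be submitted, the following Python sketch builds the POST request with the optional audit headers from the table. The host name, the port 8090, the Content-Type header, and the helper names build_worker_config_request and send are assumptions made for this example; adjust them to your deployment.

```python
import json
import urllib.request

# Sample spec from this page; the values are placeholders, not recommendations.
WORKER_CONFIG = {
    "autoScaler": {
        "envConfig": {
            "numInstances": 1,
            "projectId": "super-project",
            "zoneName": "us-central-1",
            "managedInstanceGroupName": "druid-middlemanagers",
        },
        "maxNumWorkers": 4,
        "minNumWorkers": 2,
        "type": "gce",
    }
}


def build_worker_config_request(overlord_host, port, spec, author="", comment=""):
    """Build (but do not send) the POST request carrying the worker config."""
    url = f"http://{overlord_host}:{port}/druid/indexer/v1/worker"
    headers = {
        # Assumption: the endpoint expects a JSON body.
        "Content-Type": "application/json",
        # Optional audit headers; both default to an empty string.
        "X-Druid-Author": author,
        "X-Druid-Comment": comment,
    }
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request):
    """Send the request and return the HTTP status code (needs a reachable Overlord)."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call such as send(build_worker_config_request("localhost", 8090, WORKER_CONFIG, author="ops", comment="enable GCE autoscaler")) would then submit the configuration, assuming an Overlord is listening at that hypothetical address.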
The configuration of the autoscaler is quite simple and consists of only two levels.
The external level specifies the type (always gce in this case) and two numeric values, maxNumWorkers and minNumWorkers, which define the bounds within which the number of instances must stay at any time.
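To make the role of the two bounds concrete, here is a minimal Python sketch of limiting a desired worker count to the configured range. The function name clamp_worker_count is hypothetical, and this is not the Overlord's actual provisioning logic; it only illustrates what the bounds mean.

```python
def clamp_worker_count(desired, min_num_workers, max_num_workers):
    """Return the desired worker count limited to [min_num_workers, max_num_workers]."""
    return max(min_num_workers, min(desired, max_num_workers))
```

With the sample values (minNumWorkers 2, maxNumWorkers 4), a desired count of 1 would be raised to 2 and a desired count of 10 would be capped at 4.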
The internal level is the envConfig, which specifies:
- The numInstances, the number of workers spawned at each request to provision more workers. It is safe to leave this set to 1
- The projectId, the name of the project in which the MIG resides
- The zoneName, the zone in which the MIG is located
- The managedInstanceGroupName, the MIG containing the instances to be created or removed
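Before submitting a spec, it may help to check it against the two-level structure described above. The sketch below is one possible pre-flight check in Python; the function name validate_gce_autoscaler and the specific rules (positive numInstances, non-empty strings, minNumWorkers not above maxNumWorkers) are assumptions about sensible values, not validation that Druid is documented to perform.

```python
def validate_gce_autoscaler(spec):
    """Return a list of problems found in a worker config spec (empty if none)."""
    problems = []
    autoscaler = spec.get("autoScaler", {})

    # External level: type plus the two numeric bounds.
    if autoscaler.get("type") != "gce":
        problems.append("autoScaler.type must be 'gce'")
    min_workers = autoscaler.get("minNumWorkers")
    max_workers = autoscaler.get("maxNumWorkers")
    if not isinstance(min_workers, int) or not isinstance(max_workers, int):
        problems.append("minNumWorkers and maxNumWorkers must be integers")
    elif min_workers > max_workers:
        problems.append("minNumWorkers must not exceed maxNumWorkers")

    # Internal level: the envConfig object.
    env = autoscaler.get("envConfig", {})
    num_instances = env.get("numInstances")
    if not isinstance(num_instances, int) or num_instances < 1:
        problems.append("envConfig.numInstances must be a positive integer")
    for key in ("projectId", "zoneName", "managedInstanceGroupName"):
        value = env.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"envConfig.{key} must be a non-empty string")
    return problems
```

An empty list means the spec passed these illustrative checks; each string in a non-empty list names one field to correct.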
Please refer to the Overlord Dynamic Configuration section in the main documentation for parameters other than the ones specified here, such as selectStrategy.
Known limitations
- The module internally uses the ListManagedInstances call from the API. While the API documentation states that the call can be paged through using the pageToken argument, the responses to this call do not provide any nextPageToken with which to set that parameter. This means that the extension can operate safely with a maximum of 500 MiddleManager instances at any time (the maximum number of instances returned by each call).
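The effect of the missing token can be illustrated with a generic token-based paging loop. The Python sketch below is not the extension's code; list_all_instances, fetch_page, and fetch_without_token are hypothetical names, and the response shape (managedInstances, nextPageToken) is modeled on the API fields as an assumption. When the response carries no nextPageToken, the loop ends after the first page, so at most 500 instances are ever seen.

```python
MAX_RESULTS_PER_CALL = 500  # per-call maximum stated in the limitation above


def list_all_instances(fetch_page):
    """Collect instances by following nextPageToken until it is absent.

    fetch_page is a hypothetical callable that takes a page token (or None)
    and returns a dict like {"managedInstances": [...], "nextPageToken": "..."}.
    """
    instances = []
    page_token = None
    while True:
        response = fetch_page(page_token)
        instances.extend(response.get("managedInstances", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            # No token in the response: nothing more can be requested.
            break
    return instances


def fetch_without_token(page_token):
    """Simulate a group of 700 instances whose response omits nextPageToken."""
    all_instances = [f"middlemanager-{i}" for i in range(700)]
    return {"managedInstances": all_instances[:MAX_RESULTS_PER_CALL]}
```

In this simulation, list_all_instances(fetch_without_token) returns only the first 500 names even though the simulated group holds 700, which is the situation the limitation describes.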