Middle Manager wording update in docs (#17005)
writer-jill authored Sep 5, 2024
1 parent 40f38f0 commit b4d83a8
Showing 31 changed files with 160 additions and 160 deletions.
24 changes: 12 additions & 12 deletions docs/api-reference/service-status-api.md
@@ -45,7 +45,7 @@ You can use each endpoint with the ports for each type of service. The following
| Router|8888|
| Broker|8082|
| Historical|8083|
| MiddleManager|8091|
| Middle Manager|8091|

### Get service information

@@ -791,11 +791,11 @@ Host: http://OVERLORD_IP:OVERLORD_PORT
</details>


## MiddleManager
## Middle Manager

### Get MiddleManager state status
### Get Middle Manager state status

Retrieves the enabled state of the MiddleManager. Returns JSON object keyed by the combined `druid.host` and `druid.port` with a boolean `true` or `false` state as the value.
Retrieves the enabled state of the Middle Manager process. Returns JSON object keyed by the combined `druid.host` and `druid.port` with a boolean `true` or `false` state as the value.
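
As a rough illustration of the response shape described above, the following Python sketch parses a state-status response body. The helper name `parse_enabled_state`, the sample host `localhost:8091`, and the `host:port` key separator are assumptions made for illustration; only the "object keyed by host and port with a boolean value" shape comes from the description.

```python
import json


def parse_enabled_state(body: str) -> dict[str, bool]:
    """Parse a Middle Manager state response into {"host:port": enabled}."""
    state = json.loads(body)
    if not isinstance(state, dict):
        raise ValueError("expected a JSON object keyed by host and port")
    for key, value in state.items():
        if not isinstance(value, bool):
            raise ValueError(f"expected a boolean state for {key!r}")
    return state


# Hypothetical response body; the host and port are placeholders.
sample_body = '{"localhost:8091": true}'
for host_port, enabled in parse_enabled_state(sample_body).items():
    print(f"{host_port}: {'enabled' if enabled else 'disabled'}")
```

Validating the value type up front makes it easier to spot a response that came from a different endpoint than the one intended.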

#### URL

@@ -810,7 +810,7 @@ Retrieves the enabled state of the MiddleManager. Returns JSON object keyed by t

<br/>

*Successfully retrieved MiddleManager state*
*Successfully retrieved Middle Manager state*

</TabItem>
</Tabs>
@@ -855,7 +855,7 @@ Host: http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT

### Get active tasks

Retrieves a list of active tasks being run on MiddleManager. Returns JSON list of task ID strings. Note that for normal usage, you should use the `/druid/indexer/v1/tasks` [Tasks API](./tasks-api.md) endpoint or one of the task state specific variants instead.
Retrieves a list of active tasks being run on the Middle Manager. Returns JSON list of task ID strings. Note that for normal usage, you should use the `/druid/indexer/v1/tasks` [Tasks API](./tasks-api.md) endpoint or one of the task state specific variants instead.

#### URL

@@ -984,9 +984,9 @@ Host: http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT

</details>

### Disable MiddleManager
### Disable Middle Manager

Disables a MiddleManager, causing it to stop accepting new tasks but complete all existing tasks. Returns a JSON object
Disables a Middle Manager, causing it to stop accepting new tasks but complete all existing tasks. Returns a JSON object
keyed by the combined `druid.host` and `druid.port`.
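
Because a disabled Middle Manager finishes its existing tasks before going idle, a client typically needs to check two things before treating it as drained. The Python sketch below builds the disable request without sending it, then combines the state-status and active-tasks responses documented earlier in this file. The path `/druid/worker/v1/disable`, the base URL, and both helper names are assumptions for illustration; confirm the path against the URL section of the API reference.

```python
import json
from urllib.request import Request

# Assumed endpoint path; confirm it against the URL section of the API reference.
DISABLE_PATH = "/druid/worker/v1/disable"


def build_disable_request(base_url: str) -> Request:
    """Build (but do not send) a POST request that disables a Middle Manager."""
    return Request(base_url.rstrip("/") + DISABLE_PATH, method="POST")


def is_drained(state_body: str, active_tasks_body: str) -> bool:
    """Return True when every listed service is disabled and no tasks remain.

    state_body is a state-status response (object of booleans);
    active_tasks_body is an active-tasks response (list of task ID strings).
    """
    state = json.loads(state_body)
    active_tasks = json.loads(active_tasks_body)
    return not any(state.values()) and len(active_tasks) == 0


# Hypothetical base URL and response bodies.
request = build_disable_request("http://localhost:8091")
print(request.get_method(), request.full_url)
print(is_drained('{"localhost:8091": false}', "[]"))
```

Keeping request construction separate from the drain check lets the check be exercised against recorded responses without a running cluster.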

#### URL
@@ -1002,7 +1002,7 @@ keyed by the combined `druid.host` and `druid.port`.

<br/>

*Successfully disabled MiddleManager*
*Successfully disabled Middle Manager*

</TabItem>
</Tabs>
@@ -1043,9 +1043,9 @@ Host: http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT

</details>

### Enable MiddleManager
### Enable Middle Manager

Enables a MiddleManager, allowing it to accept new tasks again if it was previously disabled. Returns a JSON object keyed by the combined `druid.host` and `druid.port`.
Enables a Middle Manager, allowing it to accept new tasks again if it was previously disabled. Returns a JSON object keyed by the combined `druid.host` and `druid.port`.

#### URL

@@ -1060,7 +1060,7 @@ Enables a MiddleManager, allowing it to accept new tasks again if it was previou

<br/>

*Successfully enabled MiddleManager*
*Successfully enabled Middle Manager*

</TabItem>
</Tabs>
94 changes: 47 additions & 47 deletions docs/configuration/index.md

Large diffs are not rendered by default.

28 changes: 14 additions & 14 deletions docs/design/architecture.md
@@ -40,8 +40,8 @@ Druid has several types of services:
* [Broker](../design/broker.md) handles queries from external clients.
* [Router](../design/router.md) routes requests to Brokers, Coordinators, and Overlords.
* [Historical](../design/historical.md) stores queryable data.
* [MiddleManager](../design/middlemanager.md) and [Peon](../design/peons.md) ingest data.
* [Indexer](../design/indexer.md) serves an alternative to the MiddleManager + Peon task execution system.
* [Middle Manager](../design/middlemanager.md) and [Peon](../design/peons.md) ingest data.
* [Indexer](../design/indexer.md) serves an alternative to the Middle Manager + Peon task execution system.

You can view services in the **Services** tab in the web console:

@@ -63,7 +63,7 @@ Master servers divide operations between Coordinator and Overlord services.

#### Overlord service

[Overlord](../design/overlord.md) services watch over the MiddleManager services on the Data servers and are the controllers of data ingestion into Druid. They are responsible for assigning ingestion tasks to MiddleManagers and for coordinating segment publishing.
[Overlord](../design/overlord.md) services watch over the Middle Manager services on the Data servers and are the controllers of data ingestion into Druid. They are responsible for assigning ingestion tasks to Middle Managers and for coordinating segment publishing.

### Query server

@@ -73,7 +73,7 @@ Query servers divide operations between Broker and Router services.

#### Broker service

[Broker](../design/broker.md) services receive queries from external clients and forward those queries to Data servers. When Brokers receive results from those subqueries, they merge those results and return them to the caller. Typically, you query Brokers rather than querying Historical or MiddleManager services on Data servers directly.
[Broker](../design/broker.md) services receive queries from external clients and forward those queries to Data servers. When Brokers receive results from those subqueries, they merge those results and return them to the caller. Typically, you query Brokers rather than querying Historical or Middle Manager services on Data servers directly.

#### Router service

@@ -85,30 +85,30 @@ The Router service also runs the [web console](../operations/web-console.md), a

A Data server executes ingestion jobs and stores queryable data.

Data servers divide operations between Historical and MiddleManager services.
Data servers divide operations between Historical and Middle Manager services.

#### Historical service

[**Historical**](../design/historical.md) services handle storage and querying on historical data, including any streaming data that has been in the system long enough to be committed. Historical services download segments from deep storage and respond to queries about these segments. They don't accept writes.

#### MiddleManager service
#### Middle Manager service

[**MiddleManager**](../design/middlemanager.md) services handle ingestion of new data into the cluster. They are responsible
[**Middle Manager**](../design/middlemanager.md) services handle ingestion of new data into the cluster. They are responsible
for reading from external data sources and publishing new Druid segments.

##### Peon service

[**Peon**](../design/peons.md) services are task execution engines spawned by MiddleManagers. Each Peon runs a separate JVM and is responsible for executing a single task. Peons always run on the same host as the MiddleManager that spawned them.
[**Peon**](../design/peons.md) services are task execution engines spawned by Middle Managers. Each Peon runs a separate JVM and is responsible for executing a single task. Peons always run on the same host as the Middle Manager that spawned them.

#### Indexer service (optional)

[**Indexer**](../design/indexer.md) services are an alternative to MiddleManagers and Peons. Instead of
[**Indexer**](../design/indexer.md) services are an alternative to Middle Managers and Peons. Instead of
forking separate JVM processes per-task, the Indexer runs tasks as individual threads within a single JVM process.

The Indexer is designed to be easier to configure and deploy compared to the MiddleManager + Peon system and to better enable resource sharing across tasks. The Indexer is a newer feature and is currently designated [experimental](../development/experimental.md) due to the fact that its memory management system is still under
The Indexer is designed to be easier to configure and deploy compared to the Middle Manager + Peon system and to better enable resource sharing across tasks. The Indexer is a newer feature and is currently designated [experimental](../development/experimental.md) due to the fact that its memory management system is still under
development. It will continue to mature in future versions of Druid.

Typically, you would deploy either MiddleManagers or Indexers, but not both.
Typically, you would deploy either Middle Managers or Indexers, but not both.

## Colocation of services

@@ -126,11 +126,11 @@ In clusters with very high segment counts, it can make sense to separate the Coo
You can run the Coordinator and Overlord services as a single combined service by setting the `druid.coordinator.asOverlord.enabled` property.
For more information, see [Coordinator Operation](../configuration/index.md#coordinator-operation).

### Historicals and MiddleManagers
### Historicals and Middle Managers

With higher levels of ingestion or query load, it can make sense to deploy the Historical and MiddleManager services on separate hosts to to avoid CPU and memory contention.
With higher levels of ingestion or query load, it can make sense to deploy the Historical and Middle Manager services on separate hosts to avoid CPU and memory contention.

The Historical service also benefits from having free memory for memory mapped segments, which can be another reason to deploy the Historical and MiddleManager services separately.
The Historical service also benefits from having free memory for memory mapped segments, which can be another reason to deploy the Historical and Middle Manager services separately.

## External dependencies

8 changes: 4 additions & 4 deletions docs/design/indexer.md
@@ -28,17 +28,17 @@ sidebar_label: "Indexer"
Its memory management system is still under development and will be significantly enhanced in later releases.
:::

The Apache Druid Indexer service is an alternative to the MiddleManager + Peon task execution system. Instead of forking a separate JVM process per-task, the Indexer runs tasks as separate threads within a single JVM process.
The Apache Druid Indexer service is an alternative to the Middle Manager + Peon task execution system. Instead of forking a separate JVM process per-task, the Indexer runs tasks as separate threads within a single JVM process.

The Indexer is designed to be easier to configure and deploy compared to the MiddleManager + Peon system and to better enable resource sharing across tasks.
The Indexer is designed to be easier to configure and deploy compared to the Middle Manager + Peon system and to better enable resource sharing across tasks.

## Configuration

For Apache Druid Indexer service configuration, see [Indexer Configuration](../configuration/index.md#indexer).

## HTTP endpoints

The Indexer service shares the same HTTP endpoints as the [MiddleManager](../api-reference/service-status-api.md#middlemanager).
The Indexer service shares the same HTTP endpoints as the [Middle Manager](../api-reference/service-status-api.md#middle-manager).

## Running

@@ -73,7 +73,7 @@ This global limit is evenly divided across the number of task slots configured b

To apply the per-task heap limit, the Indexer overrides `maxBytesInMemory` in task tuning configurations, that is ignoring the default value or any user configured value. It also overrides `maxRowsInMemory` to an essentially unlimited value: the Indexer does not support row limits.

By default, `druid.worker.globalIngestionHeapLimitBytes` is set to 1/6th of the available JVM heap. This default is chosen to align with the default value of `maxBytesInMemory` in task tuning configs when using the MiddleManager + Peon system, which is also 1/6th of the JVM heap.
By default, `druid.worker.globalIngestionHeapLimitBytes` is set to 1/6th of the available JVM heap. This default is chosen to align with the default value of `maxBytesInMemory` in task tuning configs when using the Middle Manager + Peon system, which is also 1/6th of the JVM heap.
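
The arithmetic behind the default can be sketched as follows. This is a minimal Python illustration, assuming integer division for rounding, a hypothetical 12 GiB heap, and four task slots; the function names are invented for the example and are not Druid APIs.

```python
def default_global_ingestion_heap_limit(jvm_heap_bytes: int) -> int:
    """Default global limit: one sixth of the available JVM heap."""
    return jvm_heap_bytes // 6


def per_task_heap_limit(global_limit_bytes: int, task_slots: int) -> int:
    """Evenly divide the global ingestion heap limit across task slots."""
    if task_slots <= 0:
        raise ValueError("task_slots must be positive")
    return global_limit_bytes // task_slots


jvm_heap = 12 * 1024**3  # hypothetical 12 GiB JVM heap
global_limit = default_global_ingestion_heap_limit(jvm_heap)
print(global_limit)                          # 2147483648 bytes (2 GiB)
print(per_task_heap_limit(global_limit, 4))  # 536870912 bytes (512 MiB)
```

With these assumed numbers, each of the four task slots would be limited to 512 MiB of in-heap row data.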

The peak usage for rows held in heap memory relates to the interaction between the `maxBytesInMemory` and `maxPendingPersists` properties in the task tuning configs. When the amount of row data held in-heap by a task reaches the limit specified by `maxBytesInMemory`, a task will persist the in-heap row data. After the persist has been started, the task can again ingest up to `maxBytesInMemory` bytes worth of row data while the persist is running.

4 changes: 2 additions & 2 deletions docs/design/indexing-service.md
@@ -27,8 +27,8 @@ The Apache Druid indexing service is a highly-available, distributed service tha

Indexing [tasks](../ingestion/tasks.md) are responsible for creating and [killing](../ingestion/tasks.md#kill) Druid [segments](../design/segments.md).

The indexing service is composed of three main components: [Peons](../design/peons.md) that can run a single task, [MiddleManagers](../design/middlemanager.md) that manage Peons, and an [Overlord](../design/overlord.md) that manages task distribution to MiddleManagers.
Overlords and MiddleManagers may run on the same process or across multiple processes, while MiddleManagers and Peons always run on the same process.
The indexing service is composed of three main components: [Peons](../design/peons.md) that can run a single task, [Middle Managers](../design/middlemanager.md) that manage Peons, and an [Overlord](../design/overlord.md) that manages task distribution to Middle Managers.
Overlords and Middle Managers may run on the same process or across multiple processes, while Middle Managers and Peons always run on the same process.

Tasks are managed using API endpoints on the Overlord service. Please see [Tasks API](../api-reference/tasks-api.md) for more information.

2 changes: 1 addition & 1 deletion docs/design/metadata-storage.md
@@ -149,7 +149,7 @@ parameters across the cluster at runtime.

### Task-related tables

Task-related tables are created and used by the [Overlord](../design/overlord.md) and [MiddleManager](../design/middlemanager.md) when managing tasks.
Task-related tables are created and used by the [Overlord](../design/overlord.md) and [Middle Manager](../design/middlemanager.md) when managing tasks.

### Audit table

14 changes: 7 additions & 7 deletions docs/design/middlemanager.md
@@ -1,7 +1,7 @@
---
id: middlemanager
title: "MiddleManager service"
sidebar_label: "MiddleManager"
title: "Middle Manager service"
sidebar_label: "Middle Manager"
---

<!--
@@ -23,18 +23,18 @@ sidebar_label: "MiddleManager"
~ under the License.
-->

The MiddleManager service is a worker service that executes submitted tasks. MiddleManagers forward tasks to [Peons](../design/peons.md) that run in separate JVMs.
Druid uses separate JVMs for tasks to isolate resources and logs. Each Peon is capable of running only one task at a time, wheres a MiddleManager may have multiple Peons.
The Middle Manager service is a worker service that executes submitted tasks. Middle Managers forward tasks to [Peons](../design/peons.md) that run in separate JVMs.
Druid uses separate JVMs for tasks to isolate resources and logs. Each Peon is capable of running only one task at a time, whereas a Middle Manager may have multiple Peons.

## Configuration

For Apache Druid MiddleManager service configuration, see [MiddleManager and Peons](../configuration/index.md#middlemanager-and-peons).
For Apache Druid Middle Manager service configuration, see [Middle Manager and Peons](../configuration/index.md#middle-manager-and-peon).

For basic tuning guidance for the MiddleManager service, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#middlemanager).
For basic tuning guidance for the Middle Manager service, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#middle-manager).

## HTTP endpoints

For a list of API endpoints supported by the MiddleManager, see the [Service status API reference](../api-reference/service-status-api.md#middlemanager).
For a list of API endpoints supported by the Middle Manager, see the [Service status API reference](../api-reference/service-status-api.md#middle-manager).

## Running

10 changes: 5 additions & 5 deletions docs/design/overlord.md
@@ -25,8 +25,8 @@ sidebar_label: "Overlord"


The Overlord service is responsible for accepting tasks, coordinating task distribution, creating locks around tasks, and returning statuses to callers. The Overlord can be configured to run in one of two modes - local or remote (local being default).
In local mode, the Overlord is also responsible for creating Peons for executing tasks. When running the Overlord in local mode, all MiddleManager and Peon configurations must be provided as well.
Local mode is typically used for simple workflows. In remote mode, the Overlord and MiddleManager are run in separate services and you can run each on a different server.
In local mode, the Overlord is also responsible for creating Peons for executing tasks. When running the Overlord in local mode, all Middle Manager and Peon configurations must be provided as well.
Local mode is typically used for simple workflows. In remote mode, the Overlord and Middle Manager are run in separate services and you can run each on a different server.
This mode is recommended if you intend to use the indexing service as the single endpoint for all Druid indexing.

## Configuration
@@ -41,7 +41,7 @@ For a list of API endpoints supported by the Overlord, please see the [Service s

## Blacklisted workers

If a MiddleManager has task failures above a threshold, the Overlord will blacklist these MiddleManagers. No more than 20% of the MiddleManagers can be blacklisted. Blacklisted MiddleManagers will be periodically whitelisted.
If a Middle Manager has task failures above a threshold, the Overlord will blacklist these Middle Managers. No more than 20% of the Middle Managers can be blacklisted. Blacklisted Middle Managers will be periodically whitelisted.
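
To make the 20% cap concrete, the Python sketch below computes an upper bound on how many Middle Managers could be blacklisted for a given cluster size. Rounding down and the function name `max_blacklisted_workers` are assumptions for illustration; the exact rounding Druid applies may differ.

```python
import math


def max_blacklisted_workers(total_workers: int, max_percentage: float = 20.0) -> int:
    """Upper bound on blacklisted workers for a percentage cap (rounded down)."""
    if total_workers < 0 or not 0 <= max_percentage <= 100:
        raise ValueError("invalid worker count or percentage")
    return math.floor(total_workers * max_percentage / 100)


# Hypothetical cluster sizes.
for workers in (4, 10, 25):
    print(workers, max_blacklisted_workers(workers))
```

Under this rounding assumption, a cluster with fewer than five Middle Managers would never have one blacklisted at the 20% setting.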

The following variables can be used to set the threshold and blacklist timeouts.

@@ -54,6 +54,6 @@ druid.indexer.runner.maxPercentageBlacklistWorkers

## Autoscaling

The autoscaling mechanisms currently in place are tightly coupled with our deployment infrastructure but the framework should be in place for other implementations. We are highly open to new implementations or extensions of the existing mechanisms. In our own deployments, MiddleManager services are Amazon AWS EC2 nodes and they are provisioned to register themselves in a [galaxy](https://github.com/ning/galaxy) environment.
The autoscaling mechanisms currently in place are tightly coupled with our deployment infrastructure but the framework should be in place for other implementations. We are highly open to new implementations or extensions of the existing mechanisms. In our own deployments, Middle Manager services are Amazon AWS EC2 nodes and they are provisioned to register themselves in a [galaxy](https://github.com/ning/galaxy) environment.

If autoscaling is enabled, new MiddleManagers may be added when a task has been in pending state for too long. MiddleManagers may be terminated if they have not run any tasks for a period of time.
If autoscaling is enabled, new Middle Managers may be added when a task has been in pending state for too long. Middle Managers may be terminated if they have not run any tasks for a period of time.
8 changes: 4 additions & 4 deletions docs/design/peons.md
@@ -23,22 +23,22 @@ sidebar_label: "Peon"
~ under the License.
-->

The Peon service is a task execution engine spawned by the MiddleManager. Each Peon runs a separate JVM and is responsible for executing a single task. Peons always run on the same host as the MiddleManager that spawned them.
The Peon service is a task execution engine spawned by the Middle Manager. Each Peon runs a separate JVM and is responsible for executing a single task. Peons always run on the same host as the Middle Manager that spawned them.

## Configuration

For Apache Druid Peon configuration, see [Peon Query Configuration](../configuration/index.md#peon-query-configuration) and [Additional Peon Configuration](../configuration/index.md#additional-peon-configuration).

For basic tuning guidance for MiddleManager tasks, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#task-configurations).
For basic tuning guidance for Middle Manager tasks, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#task-configurations).

## HTTP endpoints

Peons run a single task in a single JVM. The MiddleManager is responsible for creating Peons for running tasks.
Peons run a single task in a single JVM. The Middle Manager is responsible for creating Peons for running tasks.
Peons should rarely run on their own.

## Running

The Peon should seldom run separately from the MiddleManager, except for development purposes.
The Peon should seldom run separately from the Middle Manager, except for development purposes.

```
org.apache.druid.cli.Main internal peon <task_file> <status_file>
```
