Commit db73f08

Merge pull request #3222 from MarioRobres/scheduler_docs
f #1550: add scheduler manager introduction
2 parents 156596f + 353a777 commit db73f08

2 files changed

Lines changed: 180 additions & 4 deletions

source/installation_and_configuration/opennebula_services/scheduler/scheduler.rst

Lines changed: 6 additions & 0 deletions
@@ -4,6 +4,12 @@
44
Scheduler Configuration
55
=======================
66

7+
The OpenNebula scheduling framework for virtual machine allocation and cluster optimization is designed to:
8+
9+
- **Perform initial placement** of VMs by enforcing capacity control, resource compatibility, affinity groups, and specific placement requirements.
10+
- **Enable cluster-wide load balancing** by dynamically generating placement plans that evenly distribute workloads across available hypervisor nodes.
11+
- **Support user-driven optimization** by allowing administrators to modify or review optimization plans, as well as define custom policies and parameters for workload management.
12+
713
There are three main components related to **OpenNebula virtual machine allocation**:
814

915
1. :ref:`scheduler_scheduler_manager`
Lines changed: 174 additions & 4 deletions
@@ -1,7 +1,177 @@
1-
.. _scheduler_scheduler_manager:
1+
.. _scheduler_manager:
22

3-
=================
3+
=====================
44
Scheduler Manager
5-
=================
5+
=====================
66

7-
TODO
7+
The Scheduler Manager is the core component responsible for generating and handling both placement and optimization plans. It interacts with the scheduler driver, processes VM placement/optimization actions, and triggers the corresponding plan execution.
8+
9+
Schedulers Configuration
10+
------------------------
11+
Scheduler-related parameters are defined in the OpenNebula Daemon configuration file (``/etc/one/oned.conf``):
12+
13+
14+
* ``SCHED_MAD``: Defines the scheduler driver used to manage scheduling operations.
15+
* ``EXECUTABLE``: The path to the scheduler driver executable (absolute or relative to ``/usr/lib/one/mads/``).
16+
* ``ARGUMENTS``: Options passed to the driver, such as:
17+
18+
* ``-t``: Number of concurrent threads.
19+
* ``-p``: Scheduler for initial placement (e.g., ``rank`` for the default match-making algorithm).
20+
* ``-o``: Scheduler for optimizing VM allocation (e.g., ``one_drs``).
21+
22+
These parameters determine when the scheduler dispatches a placement request:
23+
24+
* ``SCHED_MAX_WND_TIME``: Maximum time (in seconds) that a scheduling window remains open.
25+
* ``SCHED_MAX_WND_LENGTH``: Maximum number of queued VMs before a placement action is triggered.
26+
27+
Retry, Timeout, and Concurrency Settings:
28+
29+
* ``SCHED_RETRY_TIME``: Interval (in seconds) at which placement is retried for VMs that could not be allocated.
30+
* ``ACTION_TIMEOUT``: Time (in seconds) after which pending actions are marked as failed due to timeout.
31+
* ``MAX_ACTIONS_PER_HOST``: Limit on the number of concurrent actions at the host level.
32+
* ``MAX_ACTIONS_PER_CLUSTER``: Limit on the number of concurrent actions at the cluster level.
33+
34+
Rescheduling Options:
35+
36+
* ``LIVE_RESCHEDS``: Indicates whether to perform live (``1``) or cold (``0``) migrations when rescheduling.
37+
* ``COLD_MIGRATE_MODE``: Specifies the type of cold migration:
38+
39+
* ``0``: Save (default)
40+
* ``1``: Poweroff
41+
* ``2``: Poweroff-hard
42+
43+
DRS Interval:
44+
45+
* ``DRS_INTERVAL``: Time (in seconds) between Distributed Resource Scheduler actions; set to ``-1`` to disable.
46+
47+
This is an example configuration snippet from ``/etc/one/oned.conf``:
48+
49+
.. code-block:: ini

    SCHED_MAD = [
        EXECUTABLE = "one_sched",
        ARGUMENTS = "-t 15 -p rank -o one_drs"
    ]

    SCHED_MAX_WND_TIME = 10
    SCHED_MAX_WND_LENGTH = 7

    SCHED_RETRY_TIME = 60

    MAX_ACTIONS_PER_HOST = 1
    MAX_ACTIONS_PER_CLUSTER = 30

    ACTION_TIMEOUT = 300

    LIVE_RESCHEDS = 0
    COLD_MIGRATE_MODE = 0

    DRS_INTERVAL = -1
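
The two window parameters can be read as an either/or trigger. The following Python sketch is an illustration only, not OpenNebula's implementation. It assumes that a placement request is dispatched once the window has been open for ``SCHED_MAX_WND_TIME`` seconds or once ``SCHED_MAX_WND_LENGTH`` VMs are queued, whichever happens first, and that an empty queue never triggers a dispatch. The function ``should_dispatch`` and its argument names are hypothetical.

.. code-block:: python

    def should_dispatch(window_open_secs, queued_vms,
                        max_wnd_time=10, max_wnd_length=7):
        """Return True when the (assumed) scheduling window should close.

        Defaults mirror SCHED_MAX_WND_TIME and SCHED_MAX_WND_LENGTH from
        the example configuration; the trigger logic itself is an assumption.
        """
        if queued_vms <= 0:
            # Assumption: nothing queued means nothing to place.
            return False
        window_expired = window_open_secs >= max_wnd_time
        queue_full = queued_vms >= max_wnd_length
        return window_expired or queue_full


    # Seven VMs queued after 3 seconds: the length limit is reached.
    print(should_dispatch(window_open_secs=3, queued_vms=7))
    # Two VMs queued after 3 seconds: keep the window open.
    print(should_dispatch(window_open_secs=3, queued_vms=2))

Under these assumptions, a small ``SCHED_MAX_WND_LENGTH`` favors low placement latency, while a larger value lets the scheduler consider more VMs in a single plan.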
70+
71+
72+
Protocol
73+
--------
74+
75+
The scheduler driver implements two main actions that define the protocol for scheduling:
76+
77+
- **OPTIMIZE:** This action optimizes the workload of a cluster. Its outcomes are:
78+
79+
- **SUCCESS:** A plan is returned and attached to the cluster. The plan can be applied either automatically or manually (after user review).
80+
- **FAILURE:** The optimization cannot be computed; an error is returned and attached to the cluster.
81+
82+
**Format:** ``OPTIMIZE <CLUSTER_ID> <SCHED_ACTION_DOCUMENT>``
83+
84+
- **PLACE:** This action allocates VMs in the *PENDING* state and re-schedules VMs flagged for rescheduling.
85+
86+
- **SUCCESS:** A plan is returned and automatically executed. Note that a successful result may still leave some VMs unallocated when free resources are insufficient.
87+
- **FAILURE:** An error is returned, and the failure is logged to inform the user.
88+
89+
**Format:** ``PLACE - <SCHED_ACTION_DOCUMENT>``
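
The two request formats above differ only in their second field: a cluster ID for ``OPTIMIZE`` and a literal ``-`` for ``PLACE``. The Python sketch below builds such request lines. It is a minimal illustration: the helper ``format_request`` is hypothetical, and the document is treated as an opaque string because its encoding is not specified here.

.. code-block:: python

    def format_request(action, document, cluster_id=None):
        """Build a request line in the documented OPTIMIZE/PLACE format.

        `document` is passed through untouched; how it is encoded on the
        wire is not covered by this sketch.
        """
        action = action.upper()
        if action == "OPTIMIZE":
            if cluster_id is None:
                raise ValueError("OPTIMIZE requires a cluster ID")
            target = str(cluster_id)
        elif action == "PLACE":
            # PLACE is not bound to a cluster, so the field is a dash.
            target = "-"
        else:
            raise ValueError(f"unknown action: {action}")
        return f"{action} {target} {document}"


    print(format_request("PLACE", "<DOC>"))
    print(format_request("OPTIMIZE", "<DOC>", cluster_id=100))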
90+
91+
92+
Data Model
93+
----------
94+
95+
The scheduler driver communicates using an XML-based protocol. Each request is a ``<SCHED_ACTION_MESSAGE>`` document, which includes:
96+
97+
- **/VM_POOL/VM:** Lists all VMs matching the scheduling request:
98+
99+
- For PLACE: VMs in *PENDING* or *RESCHED* states.
100+
- For OPTIMIZE: VMs running in the specified cluster.
101+
102+
- **/HOST_POOL/HOST:** Lists hosts to consider:
103+
104+
- For PLACE: all available hosts.
105+
- For OPTIMIZE: only hosts in the target cluster.
106+
107+
- **/DATASTORE_POOL/DATASTORE:** Lists datastores:
108+
109+
- For PLACE: all datastores.
110+
- For OPTIMIZE: only cluster-associated datastores.
111+
112+
- **/VNET_POOL/VNET:** Lists virtual networks:
113+
114+
- For PLACE: all networks.
115+
- For OPTIMIZE: only cluster-associated networks.
116+
117+
- **/VMGROUP_POOL/VMGROUP:** Lists defined VM groups.
118+
119+
- **/REQUIREMENTS/VM/:** Contains placement requirements for each VM, such as:
120+
121+
- ``<ID>``: VM identifier.
122+
- ``<HOSTS>/<ID>``: IDs of eligible hosts.
123+
- ``<NIC>/<ID>``: NIC identifier.
124+
- ``<NIC>/<VNETS>/<ID>``: Virtual network ID for the NIC.
125+
- ``<DATASTORE>/<ID>``: Datastore ID.
126+
127+
- **/CLUSTER:** The cluster document, including:
128+
129+
- ``TEMPLATE/DRS``: DRS configuration (e.g., ``MIGRATION_THRESHOLD``, ``POLICY``, ``COST_FUNCTION``, ``MODE``).
130+
- ``PLAN``: The previous optimization plan (if any), which may be reused for faster re-optimization.
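
As an illustration of how a driver might consume the ``/REQUIREMENTS/VM`` entries described above, the following Python sketch maps each VM ID to its eligible host IDs. The sample message is hypothetical and heavily trimmed: it only contains the ``ID`` and ``HOSTS/ID`` elements listed in this section, and the helper ``eligible_hosts`` is not part of OpenNebula.

.. code-block:: python

    import xml.etree.ElementTree as ET

    # Hypothetical, trimmed-down message with only the REQUIREMENTS part.
    SAMPLE_MESSAGE = """
    <SCHED_ACTION_MESSAGE>
      <REQUIREMENTS>
        <VM>
          <ID>23</ID>
          <HOSTS><ID>12</ID><ID>15</ID></HOSTS>
        </VM>
        <VM>
          <ID>24</ID>
          <HOSTS><ID>15</ID></HOSTS>
        </VM>
      </REQUIREMENTS>
    </SCHED_ACTION_MESSAGE>
    """


    def eligible_hosts(message_xml):
        """Return {vm_id: [host_id, ...]} from the REQUIREMENTS section."""
        root = ET.fromstring(message_xml.strip())
        result = {}
        for vm in root.findall("./REQUIREMENTS/VM"):
            vm_id = int(vm.findtext("ID"))
            result[vm_id] = [int(h.text) for h in vm.findall("./HOSTS/ID")]
        return result


    print(eligible_hosts(SAMPLE_MESSAGE))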
131+
132+
133+
The result of a scheduling action is an XML plan document. This plan specifies the operations to be executed on VMs and includes detailed information about each action.
134+
135+
- **PLAN/ID:** ID of the cluster the plan applies to (``-1`` for initial placement actions).
136+
137+
- **ACTION:** Each plan action contains:
138+
139+
- ``VM_ID``: Identifier of the target VM.
140+
- ``OPERATION``: The operation to perform (e.g., ``deploy``, ``migrate``, ``poweroff``).
141+
- ``HOST_ID`` and ``DS_ID``: The target host and datastore, specified for operations such as deploy and migrate.
142+
- ``NIC``: (For deploy operations) Contains one or more NIC configurations with:
143+
144+
- ``NIC_ID``: Identifier of the NIC.
145+
- ``NETWORK_ID``: The associated virtual network.
146+
147+
Example of an XML Plan:
148+
149+
.. code-block:: xml

    <PLAN>
      <ID>-1</ID>
      <ACTION>
        <VM_ID>23</VM_ID>
        <OPERATION>deploy</OPERATION>
        <HOST_ID>12</HOST_ID>
        <DS_ID>100</DS_ID>
        <NIC>
          <NIC_ID>0</NIC_ID>
          <NETWORK_ID>101</NETWORK_ID>
        </NIC>
        <NIC>
          <NIC_ID>1</NIC_ID>
          <NETWORK_ID>100</NETWORK_ID>
        </NIC>
      </ACTION>
      <ACTION>
        <VM_ID>24</VM_ID>
        <OPERATION>migrate</OPERATION>
        <HOST_ID>15</HOST_ID>
        <DS_ID>200</DS_ID>
      </ACTION>
      <ACTION>
        <VM_ID>25</VM_ID>
        <OPERATION>poweroff</OPERATION>
      </ACTION>
    </PLAN>
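
To show how such a plan could be consumed, the Python sketch below parses a plan document with the standard library and returns the plan ID together with one dictionary per action. It is only an illustration: ``parse_plan`` and the dictionary keys are hypothetical names, and optional elements (``HOST_ID``, ``DS_ID``, ``NIC``) are simply omitted when absent.

.. code-block:: python

    import xml.etree.ElementTree as ET

    # Shortened version of the example plan above.
    PLAN_XML = """
    <PLAN>
      <ID>-1</ID>
      <ACTION>
        <VM_ID>23</VM_ID>
        <OPERATION>deploy</OPERATION>
        <HOST_ID>12</HOST_ID>
        <DS_ID>100</DS_ID>
        <NIC><NIC_ID>0</NIC_ID><NETWORK_ID>101</NETWORK_ID></NIC>
      </ACTION>
      <ACTION>
        <VM_ID>25</VM_ID>
        <OPERATION>poweroff</OPERATION>
      </ACTION>
    </PLAN>
    """


    def parse_plan(plan_xml):
        """Return (plan_id, actions) parsed from a PLAN document."""
        root = ET.fromstring(plan_xml.strip())
        actions = []
        for node in root.findall("ACTION"):
            action = {
                "vm_id": int(node.findtext("VM_ID")),
                "operation": node.findtext("OPERATION"),
            }
            # HOST_ID and DS_ID only appear for deploy/migrate-like actions.
            for tag, key in (("HOST_ID", "host_id"), ("DS_ID", "ds_id")):
                value = node.findtext(tag)
                if value is not None:
                    action[key] = int(value)
            # Map NIC_ID -> NETWORK_ID for deploy actions.
            nics = {
                int(nic.findtext("NIC_ID")): int(nic.findtext("NETWORK_ID"))
                for nic in node.findall("NIC")
            }
            if nics:
                action["nics"] = nics
            actions.append(action)
        return int(root.findtext("ID")), actions


    plan_id, actions = parse_plan(PLAN_XML)
    print(plan_id)
    for action in actions:
        print(action)

A plan ID of ``-1`` identifies the result of an initial placement, so a consumer can use it to distinguish placement plans from cluster optimization plans.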
