@@ -341,16 +341,13 @@ Working with the scheduler is difficult. Challenges include:
341341 later. For example, data unpacked from the CIB can safely be used anytime
342342 after ``unpack_cib()``, but actions may become optional or required anytime
343343 before ``pcmk__create_graph()``. There's no easy way to deal with this.
344- * Many names of struct members, functions, etc., are suboptimal, but are part
345- of the public API and cannot be changed until an API backward compatibility
346- break.
347344
348345
349346.. index::
350347 single: pcmk_scheduler_t
351348
352- Cluster Working Set
353- ___________________
349+ The Scheduler Object
350+ ____________________
354351
355352The main data object for the scheduler is ``pcmk_scheduler_t``, which contains
356353all information needed about nodes, resources, constraints, etc., both as the
@@ -363,18 +360,21 @@ transition graph XML. The variable name is usually ``scheduler``.
363360Resources
364361_________
365362
366- ``pcmk_resource_t`` is the data object representing cluster resources. A
367- resource has a variant: :term:`primitive`, group, clone, or :term:`bundle`.
363+ ``pcmk_resource_t`` is the data object representing cluster resources. It has a
364+ couple of public members for backward compatibility reasons, but most of the
365+ implementation is in the internal ``pcmk__resource_private_t`` type.
368366
369- The resource object has members for two sets of methods,
370- ``resource_object_functions_t`` from the ``libpe_status`` public API, and
371- ``resource_alloc_functions_t`` whose implementation is internal to
367+ A resource has a variant: :term:`primitive`, group, clone, or :term:`bundle`.
368+
369+ The private resource object has members for two sets of methods,
370+ ``pcmk__rsc_methods_t`` from ``libcrmcommon``, and
371+ ``pcmk__assignment_methods_t`` whose implementation is internal to
372372``libpacemaker``. The actual functions vary by variant.
373373
374- The object functions have basic capabilities such as unpacking the resource
374+ The resource methods have basic capabilities such as unpacking the resource
375375XML, and determining the current or planned location of the resource.
376376
377- The :term:`assignment <assign>` functions have more obscure capabilities needed
377+ The :term:`assignment <assign>` methods have more obscure capabilities needed
378378for scheduling, such as processing location and ordering constraints. For
379379example, ``pcmk__create_internal_constraints()`` simply calls the
380380``internal_constraints()`` method for each top-level resource in the cluster.
@@ -390,25 +390,33 @@ with the highest :term:`score` for a given resource. The scheduler does a bunch
390390of processing to generate the scores, then the actual assignment is
391391straightforward.
392392
393+ The scheduler node implementation is a little confusing.
394+
395+ ``pcmk_node_t`` (``struct pcmk__scored_node``) is the primary object used.
396+
397+ It contains two sub-structs, ``pcmk__node_private_t *priv`` (which is internal)
398+ and ``struct pcmk__node_details *details`` (which is public for backward
399+ compatibility reasons), that contain all node information that is independent
400+ of resource assignment (the node name, etc.).
401+
402+ It contains one other (internal) sub-struct, ``struct pcmk__node_assignment
403+ *assign``, which contains information particular to a specific resource being
404+ assigned.
405+
393406Node lists are frequently used. For example, ``pcmk_scheduler_t`` has a
394- ``nodes`` member which is a list of all nodes in the cluster, and
395- ``pcmk_resource_t`` has a ``running_on`` member which is a list of all nodes on
396- which the resource is (or might be) active. These are lists of ``pcmk_node_t``
397- objects.
407+ ``nodes`` member which is a list of all nodes in the cluster, and the internal
408+ resource object has an ``active_nodes`` member which is a list of all nodes on
409+ which the resource is (or might be) active.
398410
399- The ``pcmk_node_t`` object contains a ``struct pe_node_shared_s *details``
400- member with all node information that is independent of resource assignment
401- (the node name, etc.).
411+ Only the scheduler's ``nodes`` list has the full, original node instances. All
412+ other node lists have shallow copies created by ``pe__copy_node()``, which
413+ share ``details`` and ``priv`` from the main list (but can differ in their
414+ ``assign`` member).
402415
403- The working set's ``nodes`` member contains the original of this information.
404- All other node lists contain copies of ``pcmk_node_t`` where only the
405- ``details`` member points to the originals in the working set's ``nodes`` list.
406- In this way, the other members of ``pcmk_node_t`` (such as ``weight``, which is
407- the node score) may vary by node list, while the common details are shared.
408416
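The shallow-copy arrangement described in the added text can be pictured with a minimal C sketch. Everything here is a hypothetical stand-in (the ``example_`` struct names, their fields, and the copy helper); it is not the real ``pe__copy_node()`` and only assumes what the text states, namely that copies share ``details`` and ``priv`` while owning their own ``assign``.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real Pacemaker structs have different fields. */
struct example_details { const char *name; };
struct example_private { int placeholder; };
struct example_assignment { int score; };

struct example_node {
    struct example_details *details;    /* shared with the original */
    struct example_private *priv;       /* shared with the original */
    struct example_assignment *assign;  /* owned by each copy */
};

/* Shallow copy: duplicate only the per-assignment data, share the rest.
 * Returns NULL on allocation failure.
 */
struct example_node *
example_copy_node(const struct example_node *orig)
{
    struct example_node *copy = NULL;

    assert((orig != NULL) && (orig->assign != NULL));

    copy = calloc(1, sizeof(*copy));
    if (copy == NULL) {
        return NULL;
    }
    copy->assign = malloc(sizeof(*copy->assign));
    if (copy->assign == NULL) {
        free(copy);
        return NULL;
    }
    *copy->assign = *orig->assign;  /* private copy of assignment data */
    copy->details = orig->details;  /* same pointer as the original */
    copy->priv = orig->priv;        /* same pointer as the original */
    return copy;
}

/* Free a copy without touching the shared details/priv. */
void
example_free_node_copy(struct example_node *copy)
{
    if (copy != NULL) {
        free(copy->assign);
        free(copy);
    }
}
```

In this sketch, changing ``copy->assign->score`` leaves the original node's score untouched, while a change made through ``copy->details`` would be visible from every list that shares the node.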
409417.. index::
410418 single: pcmk_action_t
411- single: pe_action_flags
419+ single: pcmk__action_flags
412420
413421Actions
414422_______
@@ -418,16 +426,16 @@ taken. These could be resource actions, cluster-wide actions such as fencing a
418426node, or "pseudo-actions" which are abstractions used as convenient points for
419427ordering other actions against.
420428
421- It has a ``flags`` member which is a bitmask of ``enum pe_action_flags``. The
422- most important of these are ``pe_action_runnable`` (if not set, the action is
423- "blocked" and cannot be added to the transition graph) and
424- ``pe_action_optional`` (actions with this set will not be added to the
425- transition graph; actions often start out as optional, and may become required
426- later).
429+ Its (internal) implementation has a ``flags`` member which is a bitmask of
430+ ``enum pcmk__action_flags``. The most important of these are
431+ ``pcmk__action_runnable`` (if not set, the action is "blocked" and cannot be
432+ added to the transition graph) and ``pcmk__action_optional`` (actions with this
433+ set will not be added to the transition graph; actions often start out as
434+ optional, and may become required later).
427435
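As a rough illustration of how the two flags combine, the sketch below treats an action as a candidate for the transition graph only when the runnable bit is set and the optional bit is clear. The enum, its bit values, and ``example_graph_candidate()`` are hypothetical names invented for this example; the real ``enum pcmk__action_flags`` has different values and the scheduler may apply further conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
enum example_action_flags {
    example_action_optional = (1U << 0),
    example_action_runnable = (1U << 1),
};

static_assert((example_action_optional & example_action_runnable) == 0,
              "example flags must use distinct bits");

/* An action that is blocked (not runnable) or still optional stays out of
 * the graph in this simplified model.
 */
bool
example_graph_candidate(uint32_t flags)
{
    bool runnable = (flags & example_action_runnable) != 0;
    bool optional = (flags & example_action_optional) != 0;

    return runnable && !optional;
}
```

Because an action may start out optional and become required later, a check like this is only meaningful once flag processing has finished.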
428436
429437.. index::
430- single: pe__colocation_t
438+ single: pcmk__colocation_t
431439
432440Colocations
433441___________
@@ -462,30 +470,45 @@ The resource assignment functions have several methods related to colocations:
462470
463471
464472.. index::
465- single: pe__ordering_t
466- single: pe_ordering
473+ single: pcmk__action_relation_t
474+ single: action; relation
467475
468- Orderings
469- _________
476+ Action Relations
477+ ________________
470478
471479Ordering constraints are simple in concept, but they are one of the most
472480important, powerful, and difficult to follow aspects of the scheduler code.
473481
474- ``pe__ordering_t`` is the data object representing an ordering, better thought
475- of as a relationship between two actions, since the relation can be more
476- complex than just "this one runs after that one".
482+ ``pcmk__action_relation_t`` is the data object representing an ordering, better
483+ thought of as a relationship between two actions, since the relation can be
484+ more complex than just "this one runs after that one".
477485
478- For an ordering "A then B", the code generally refers to A as "first" or
486+ For a relation "A then B", the code generally refers to A as "first" or
479487"before", and B as "then" or "after".
480488
481- Much of the power comes from ``enum pe_ordering``, which are flags that
482- determine how an ordering behaves. There are many obscure flags with big
483- effects. A few examples:
484-
485- * ``pe_order_none`` means the ordering is disabled and will be ignored. It's 0,
486- meaning no flags set, so it must be compared with equality rather than
487- ``pcmk_is_set()``.
488- * ``pe_order_optional`` means the ordering does not make either action
489- required, so it only applies if they both become required for other reasons.
490- * ``pe_order_implies_first`` means that if action B becomes required for any
491- reason, then action A will become required as well.
489+ Much of the power comes from ``enum pcmk__action_relation_flags``, which are
490+ flags that determine how a relation behaves. There are many obscure flags with
491+ big effects. A few examples:
492+
493+ * ``pcmk__ar_none`` means the relation is disabled and will be ignored. The
494+ value is 0, meaning no flags set, so it must be compared with equality rather
495+ than ``pcmk_is_set()``.
496+ * ``pcmk__ar_ordered`` without any other flags set means the relation does not
497+ make either action required, so it applies only if they both become required
498+ for other reasons.
499+ * ``pcmk__ar_then_implies_first`` means that if action B becomes required for
500+ any reason, then action A will become required as well.
501+
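The point about comparing the zero-valued flag with equality can be shown with a small C sketch. The enum values and helper names below are hypothetical stand-ins, not the real ``enum pcmk__action_relation_flags`` or ``pcmk_is_set()``; the only assumptions taken from the text are that the "none" value is 0 and that the other values are bit flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
enum example_relation_flags {
    example_ar_none               = 0,
    example_ar_ordered            = (1U << 0),
    example_ar_then_implies_first = (1U << 1),
};

static_assert(example_ar_none == 0, "the 'none' value means no flags set");

/* Generic bitmask test: true if every bit of `flag` is present in `flags`.
 * With flag == 0 this is true for any `flags`, so it cannot detect "none".
 */
bool
example_all_set(uint32_t flags, uint32_t flag)
{
    return (flags & flag) == flag;
}

/* A disabled relation has no flags at all, so test with equality. */
bool
example_relation_disabled(uint32_t flags)
{
    return flags == example_ar_none;
}

/* "Ordered without any other flags set" is likewise an equality test. */
bool
example_relation_only_ordered(uint32_t flags)
{
    return flags == example_ar_ordered;
}
```

In this model, ``example_all_set(flags, example_ar_none)`` returns true even for a relation that has ``example_ar_ordered`` set, which is why the equality form is the one to use.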
502+ Adding a New Scheduler Regression Test
503+ ______________________________________
504+
505+ #. Choose a test name.
506+ #. Copy the uncompressed input CIB to cts/scheduler/xml/TESTNAME.xml. It's
507+ helpful to add an XML comment at the top describing the essential features of
508+ the test (which configuration and status scenarios are being tested).
509+ #. Edit ``cts/cts-scheduler.in`` and add the test name and description to the
510+ ``TESTS`` array.
511+ #. Run ``cts/cts-scheduler --update --run TESTNAME`` to generate the expected
512+ transition graph, scores, etc. Look over the generated files to make sure
513+ they are as expected.
514+ #. Commit your changes.