Commit ff661ee

Merge tag 'cgroup-for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:

 - cpuset changes:

    - Continue separating v1 and v2 implementations by moving more
      v1-specific logic into cpuset-v1.c

    - Improve partition handling. Sibling partitions are no longer
      invalidated on cpuset.cpus conflict, cpuset.cpus changes no
      longer fail in v2, and effective_xcpus computation is made
      consistent

    - Fix partition effective CPUs overlap that caused a warning on
      cpuset removal when sibling partitions shared CPUs

 - Increase the maximum cgroup subsystem count from 16 to 32 to
   accommodate future subsystem additions

 - Misc cleanups and selftest improvements including switching to
   css_is_online() helper, removing dead code and stale documentation
   references, using lockdep_assert_cpuset_lock_held() consistently,
   and adding polling helpers for asynchronously updated cgroup
   statistics

* tag 'cgroup-for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
  cpuset: fix overlap of partition effective CPUs
  cgroup: increase maximum subsystem count from 16 to 32
  cgroup: Remove stale cpu.rt.max reference from documentation
  cpuset: replace direct lockdep_assert_held() with lockdep_assert_cpuset_lock_held()
  cgroup/cpuset: Move the v1 empty cpus/mems check to cpuset1_validate_change()
  cgroup/cpuset: Don't invalidate sibling partitions on cpuset.cpus conflict
  cgroup/cpuset: Don't fail cpuset.cpus change in v2
  cgroup/cpuset: Consistently compute effective_xcpus in update_cpumasks_hier()
  cgroup/cpuset: Streamline rm_siblings_excl_cpus()
  cpuset: remove dead code in cpuset-v1.c
  cpuset: remove v1-specific code from generate_sched_domains
  cpuset: separate generate_sched_domains for v1 and v2
  cpuset: move update_domain_attr_tree to cpuset_v1.c
  cpuset: add cpuset1_init helper for v1 initialization
  cpuset: add cpuset1_online_css helper for v1-specific operations
  cpuset: add lockdep_assert_cpuset_lock_held helper
  cpuset: Remove unnecessary checks in rebuild_sched_domains_locked
  cgroup: switch to css_is_online() helper
  selftests: cgroup: Replace sleep with cg_read_key_long_poll() for waiting on nr_dying_descendants
  selftests: cgroup: make test_memcg_sock robust against delayed sock stats
  ...
2 parents 9bdc648 + 8b1f3c5 commit ff661ee

20 files changed, +593 -475 lines changed
Documentation/admin-guide/cgroup-v2.rst

Lines changed: 27 additions & 17 deletions
@@ -737,9 +737,6 @@ combinations are invalid and should be rejected. Also, if the
 resource is mandatory for execution of processes, process migrations
 may be rejected.
 
-"cpu.rt.max" hard-allocates realtime slices and is an example of this
-type.
-
 
 Interface Files
 ===============
@@ -2561,10 +2558,10 @@ Cpuset Interface Files
 	Users can manually set it to a value that is different from
 	"cpuset.cpus". One constraint in setting it is that the list of
 	CPUs must be exclusive with respect to "cpuset.cpus.exclusive"
-	of its sibling. If "cpuset.cpus.exclusive" of a sibling cgroup
-	isn't set, its "cpuset.cpus" value, if set, cannot be a subset
-	of it to leave at least one CPU available when the exclusive
-	CPUs are taken away.
+	and "cpuset.cpus.exclusive.effective" of its siblings. Another
+	constraint is that it cannot be a superset of "cpuset.cpus"
+	of its sibling in order to leave at least one CPU available to
+	that sibling when the exclusive CPUs are taken away.
 
 	For a parent cgroup, any one of its exclusive CPUs can only
 	be distributed to at most one of its child cgroups. Having an
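The two constraints described in the updated paragraph above can be modelled with plain bitmasks. This is a minimal user-space sketch, not the kernel's implementation: the struct `sibling_masks`, the function `excl_request_ok()`, the use of `uint64_t` as a CPU set, and the choice to treat an empty sibling "cpuset.cpus" as imposing no constraint are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one sibling cgroup; bit N stands for CPU N. */
struct sibling_masks {
	uint64_t excl;      /* sibling's "cpuset.cpus.exclusive" */
	uint64_t excl_eff;  /* sibling's "cpuset.cpus.exclusive.effective" */
	uint64_t cpus;      /* sibling's "cpuset.cpus"; 0 means unset (assumption) */
};

/* Return true if the requested exclusive set passes both documented checks. */
bool excl_request_ok(uint64_t req, const struct sibling_masks *sib)
{
	/* Must not overlap the sibling's exclusive or effective exclusive CPUs. */
	if (req & (sib->excl | sib->excl_eff))
		return false;

	/* Must not be a superset of the sibling's cpuset.cpus, so that the
	 * sibling keeps at least one CPU once the exclusive CPUs are taken. */
	if (sib->cpus && (sib->cpus & ~req) == 0)
		return false;

	return true;
}
```

For example, with a sibling whose "cpuset.cpus" is CPU 2 only, a request for CPUs 2-3 is rejected in this model because it would leave that sibling with no CPU, while a request for CPU 3 alone passes.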
@@ -2584,9 +2581,9 @@ Cpuset Interface Files
 	of this file will always be a subset of its parent's
 	"cpuset.cpus.exclusive.effective" if its parent is not the root
 	cgroup. It will also be a subset of "cpuset.cpus.exclusive"
-	if it is set. If "cpuset.cpus.exclusive" is not set, it is
-	treated to have an implicit value of "cpuset.cpus" in the
-	formation of local partition.
+	if it is set. This file should only be non-empty if either
+	"cpuset.cpus.exclusive" is set or when the current cpuset is
+	a valid partition root.
 
   cpuset.cpus.isolated
 	A read-only and root cgroup only multiple values file.
@@ -2618,20 +2615,33 @@ Cpuset Interface Files
 	There are two types of partitions - local and remote. A local
 	partition is one whose parent cgroup is also a valid partition
 	root. A remote partition is one whose parent cgroup is not a
-	valid partition root itself. Writing to "cpuset.cpus.exclusive"
-	is optional for the creation of a local partition as its
-	"cpuset.cpus.exclusive" file will assume an implicit value that
-	is the same as "cpuset.cpus" if it is not set. Writing the
-	proper "cpuset.cpus.exclusive" values down the cgroup hierarchy
-	before the target partition root is mandatory for the creation
-	of a remote partition.
+	valid partition root itself.
+
+	Writing to "cpuset.cpus.exclusive" is optional for the creation
+	of a local partition as its "cpuset.cpus.exclusive" file will
+	assume an implicit value that is the same as "cpuset.cpus" if it
+	is not set. Writing the proper "cpuset.cpus.exclusive" values
+	down the cgroup hierarchy before the target partition root is
+	mandatory for the creation of a remote partition.
+
+	Not all the CPUs requested in "cpuset.cpus.exclusive" can be
+	used to form a new partition. Only those that were present
+	in its parent's "cpuset.cpus.exclusive.effective" control
+	file can be used. For partitions created without setting
+	"cpuset.cpus.exclusive", exclusive CPUs specified in sibling's
+	"cpuset.cpus.exclusive" or "cpuset.cpus.exclusive.effective"
+	also cannot be used.
 
 	Currently, a remote partition cannot be created under a local
 	partition. All the ancestors of a remote partition root except
 	the root cgroup cannot be a partition root.
 
 	The root cgroup is always a partition root and its state cannot
 	be changed. All other non-root cgroups start out as "member".
+	Even though the "cpuset.cpus.exclusive*" and "cpuset.cpus"
+	control files are not present in the root cgroup, they are
+	implicitly the same as the "/sys/devices/system/cpu/possible"
+	sysfs file.
 
 	When set to "root", the current cgroup is the root of a new
 	partition or scheduling domain. The set of exclusive CPUs is
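The rule added in this hunk about which requested CPUs are usable for a new partition reduces to a few mask operations. The sketch below is only an illustration of the documented behaviour under stated assumptions: `usable_partition_cpus()` is a hypothetical name, CPU sets are modelled as `uint64_t` bitmasks, and an `excl` value of 0 is taken to mean that "cpuset.cpus.exclusive" was never written.

```c
#include <stdint.h>

/*
 * Hypothetical helper: which CPUs could be used to form a partition.
 *   cpus             - the cgroup's "cpuset.cpus"
 *   excl             - its "cpuset.cpus.exclusive" (0 = not set, assumption)
 *   parent_eff       - parent's "cpuset.cpus.exclusive.effective"
 *   siblings_claimed - union of siblings' exclusive and effective exclusive CPUs
 */
uint64_t usable_partition_cpus(uint64_t cpus, uint64_t excl,
			       uint64_t parent_eff, uint64_t siblings_claimed)
{
	/* Explicit request: only CPUs also present in the parent's
	 * effective exclusive set can be used. */
	if (excl)
		return excl & parent_eff;

	/* Implicit request (falls back to cpuset.cpus): CPUs claimed by
	 * siblings are additionally excluded. */
	return cpus & parent_eff & ~siblings_claimed;
}
```

With `cpus = 0xF0`, no explicit exclusive set, a parent effective set of `0xFC` and siblings claiming `0x30`, this model yields `0xC0`, that is, CPUs 6 and 7.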

fs/fs-writeback.c

Lines changed: 1 addition & 1 deletion
@@ -981,7 +981,7 @@ void wbc_account_cgroup_owner(struct writeback_control *wbc, struct folio *folio
 
 	css = mem_cgroup_css_from_folio(folio);
 	/* dead cgroups shouldn't contribute to inode ownership arbitration */
-	if (!(css->flags & CSS_ONLINE))
+	if (!css_is_online(css))
 		return;
 
 	id = css->id;

include/linux/cgroup-defs.h

Lines changed: 4 additions & 4 deletions
@@ -535,10 +535,10 @@ struct cgroup {
 	 * one which may have more subsystems enabled. Controller knobs
 	 * are made available iff it's enabled in ->subtree_control.
 	 */
-	u16 subtree_control;
-	u16 subtree_ss_mask;
-	u16 old_subtree_control;
-	u16 old_subtree_ss_mask;
+	u32 subtree_control;
+	u32 subtree_ss_mask;
+	u32 old_subtree_control;
+	u32 old_subtree_ss_mask;
 
 	/* Private pointers for each registered subsystem */
 	struct cgroup_subsys_state __rcu *subsys[CGROUP_SUBSYS_COUNT];

include/linux/cpuset.h

Lines changed: 2 additions & 0 deletions
@@ -76,6 +76,7 @@ extern void inc_dl_tasks_cs(struct task_struct *task);
 extern void dec_dl_tasks_cs(struct task_struct *task);
 extern void cpuset_lock(void);
 extern void cpuset_unlock(void);
+extern void lockdep_assert_cpuset_lock_held(void);
 extern void cpuset_cpus_allowed_locked(struct task_struct *p, struct cpumask *mask);
 extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
 extern bool cpuset_cpus_allowed_fallback(struct task_struct *p);
@@ -196,6 +197,7 @@ static inline void inc_dl_tasks_cs(struct task_struct *task) { }
 static inline void dec_dl_tasks_cs(struct task_struct *task) { }
 static inline void cpuset_lock(void) { }
 static inline void cpuset_unlock(void) { }
+static inline void lockdep_assert_cpuset_lock_held(void) { }
 
 static inline void cpuset_cpus_allowed_locked(struct task_struct *p,
 					      struct cpumask *mask)

include/linux/memcontrol.h

Lines changed: 1 addition & 1 deletion
@@ -893,7 +893,7 @@ static inline bool mem_cgroup_online(struct mem_cgroup *memcg)
 {
 	if (mem_cgroup_disabled())
 		return true;
-	return !!(memcg->css.flags & CSS_ONLINE);
+	return css_is_online(&memcg->css);
 }
 
 void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
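The hunks that switch to css_is_online() replace open-coded flag tests, including one that needed an explicit `!!` to normalise the result. The following user-space sketch shows why a `bool`-returning helper makes that normalisation unnecessary. It is a model only: `struct css_model`, `css_is_online_model()` and the numeric value chosen for `CSS_ONLINE` here are assumptions for illustration, not the kernel definitions.

```c
#include <stdbool.h>

/* Assumed flag value for this model; the kernel defines its own. */
enum { CSS_ONLINE = 1 << 1 };

/* Hypothetical stand-in for struct cgroup_subsys_state. */
struct css_model {
	unsigned int flags;
};

/*
 * Returning bool converts any non-zero masked value to true, so callers
 * no longer need "!!(flags & CSS_ONLINE)" or "!(flags & CSS_ONLINE)".
 */
bool css_is_online_model(const struct css_model *css)
{
	return css->flags & CSS_ONLINE;
}
```

Call sites then read as `if (!css_is_online_model(css)) return;`, which mirrors the shape of the replacements in this commit.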

include/trace/events/cgroup.h

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ DECLARE_EVENT_CLASS(cgroup_root,
 
 	TP_STRUCT__entry(
 		__field( int, root )
-		__field( u16, ss_mask )
+		__field( u32, ss_mask )
 		__string( name, root->name )
 	),
 

kernel/cgroup/cgroup-internal.h

Lines changed: 4 additions & 4 deletions
@@ -52,7 +52,7 @@ struct cgroup_fs_context {
 	bool cpuset_clone_children;
 	bool none; /* User explicitly requested empty subsystem */
 	bool all_ss; /* Seen 'all' option */
-	u16 subsys_mask; /* Selected subsystems */
+	u32 subsys_mask; /* Selected subsystems */
 	char *name; /* Hierarchy name */
 	char *release_agent; /* Path for release notifications */
 };
@@ -146,7 +146,7 @@ struct cgroup_mgctx {
 	struct cgroup_taskset tset;
 
 	/* subsystems affected by migration */
-	u16 ss_mask;
+	u32 ss_mask;
 };
 
 #define CGROUP_TASKSET_INIT(tset) \
@@ -235,8 +235,8 @@ int cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,
 void cgroup_favor_dynmods(struct cgroup_root *root, bool favor);
 void cgroup_free_root(struct cgroup_root *root);
 void init_cgroup_root(struct cgroup_fs_context *ctx);
-int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask);
-int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask);
+int cgroup_setup_root(struct cgroup_root *root, u32 ss_mask);
+int rebind_subsystems(struct cgroup_root *dst_root, u32 ss_mask);
 int cgroup_do_get_tree(struct fs_context *fc);
 
 int cgroup_migrate_vet_dst(struct cgroup *dst_cgrp);

kernel/cgroup/cgroup-v1.c

Lines changed: 6 additions & 6 deletions
@@ -28,7 +28,7 @@
 #define CGROUP_PIDLIST_DESTROY_DELAY HZ
 
 /* Controllers blocked by the commandline in v1 */
-static u16 cgroup_no_v1_mask;
+static u32 cgroup_no_v1_mask;
 
 /* disable named v1 mounts */
 static bool cgroup_no_v1_named;
@@ -1037,13 +1037,13 @@ int cgroup1_parse_param(struct fs_context *fc, struct fs_parameter *param)
 static int check_cgroupfs_options(struct fs_context *fc)
 {
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
-	u16 mask = U16_MAX;
-	u16 enabled = 0;
+	u32 mask = U32_MAX;
+	u32 enabled = 0;
 	struct cgroup_subsys *ss;
 	int i;
 
 #ifdef CONFIG_CPUSETS
-	mask = ~((u16)1 << cpuset_cgrp_id);
+	mask = ~((u32)1 << cpuset_cgrp_id);
 #endif
 	for_each_subsys(ss, i)
 		if (cgroup_ssid_enabled(i) && !cgroup1_ssid_disabled(i) &&
@@ -1095,7 +1095,7 @@ int cgroup1_reconfigure(struct fs_context *fc)
 	struct kernfs_root *kf_root = kernfs_root_from_sb(fc->root->d_sb);
 	struct cgroup_root *root = cgroup_root_from_kf(kf_root);
 	int ret = 0;
-	u16 added_mask, removed_mask;
+	u32 added_mask, removed_mask;
 
 	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);
 
@@ -1343,7 +1343,7 @@ static int __init cgroup_no_v1(char *str)
 			continue;
 
 		if (!strcmp(token, "all")) {
-			cgroup_no_v1_mask = U16_MAX;
+			cgroup_no_v1_mask = U32_MAX;
 			continue;
 		}
 
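To see why the mask type itself has to widen, and not just the BUILD_BUG_ON limit, consider the "all subsystems except one" expression from check_cgroupfs_options(). The sketch below is a user-space illustration using `<stdint.h>` types in place of the kernel's u16/u32; the function names are hypothetical. In C the shift operand is promoted to `int`, so the shift itself works for an id of 16 or more, but storing the result in a 16-bit mask silently drops the bit.

```c
#include <stdint.h>

/* "All ones except subsystem id", stored in a 16-bit mask (old layout). */
uint16_t mask16_without(unsigned int id)
{
	/* For id >= 16 the cleared bit lies above bit 15 and is lost
	 * when the value is narrowed back to 16 bits. */
	return (uint16_t)~((uint16_t)1 << id);
}

/* The same expression stored in a 32-bit mask (new layout). */
uint32_t mask32_without(unsigned int id)
{
	return ~((uint32_t)1 << id);
}
```

With a hypothetical subsystem id of 20, `mask16_without(20)` comes back as all ones, so the subsystem cannot be excluded at all, while `mask32_without(20)` clears exactly bit 20.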

kernel/cgroup/cgroup.c

Lines changed: 25 additions & 25 deletions
@@ -203,13 +203,13 @@ EXPORT_SYMBOL_GPL(cgrp_dfl_root);
 bool cgrp_dfl_visible;
 
 /* some controllers are not supported in the default hierarchy */
-static u16 cgrp_dfl_inhibit_ss_mask;
+static u32 cgrp_dfl_inhibit_ss_mask;
 
 /* some controllers are implicitly enabled on the default hierarchy */
-static u16 cgrp_dfl_implicit_ss_mask;
+static u32 cgrp_dfl_implicit_ss_mask;
 
 /* some controllers can be threaded on the default hierarchy */
-static u16 cgrp_dfl_threaded_ss_mask;
+static u32 cgrp_dfl_threaded_ss_mask;
 
 /* The list of hierarchy roots */
 LIST_HEAD(cgroup_roots);
@@ -231,10 +231,10 @@ static u64 css_serial_nr_next = 1;
  * These bitmasks identify subsystems with specific features to avoid
  * having to do iterative checks repeatedly.
  */
-static u16 have_fork_callback __read_mostly;
-static u16 have_exit_callback __read_mostly;
-static u16 have_release_callback __read_mostly;
-static u16 have_canfork_callback __read_mostly;
+static u32 have_fork_callback __read_mostly;
+static u32 have_exit_callback __read_mostly;
+static u32 have_release_callback __read_mostly;
+static u32 have_canfork_callback __read_mostly;
 
 static bool have_favordynmods __ro_after_init = IS_ENABLED(CONFIG_CGROUP_FAVOR_DYNMODS);
 
@@ -472,13 +472,13 @@ static bool cgroup_is_valid_domain(struct cgroup *cgrp)
 }
 
 /* subsystems visibly enabled on a cgroup */
-static u16 cgroup_control(struct cgroup *cgrp)
+static u32 cgroup_control(struct cgroup *cgrp)
 {
 	struct cgroup *parent = cgroup_parent(cgrp);
-	u16 root_ss_mask = cgrp->root->subsys_mask;
+	u32 root_ss_mask = cgrp->root->subsys_mask;
 
 	if (parent) {
-		u16 ss_mask = parent->subtree_control;
+		u32 ss_mask = parent->subtree_control;
 
 		/* threaded cgroups can only have threaded controllers */
 		if (cgroup_is_threaded(cgrp))
@@ -493,12 +493,12 @@ static u16 cgroup_control(struct cgroup *cgrp)
 }
 
 /* subsystems enabled on a cgroup */
-static u16 cgroup_ss_mask(struct cgroup *cgrp)
+static u32 cgroup_ss_mask(struct cgroup *cgrp)
 {
 	struct cgroup *parent = cgroup_parent(cgrp);
 
 	if (parent) {
-		u16 ss_mask = parent->subtree_ss_mask;
+		u32 ss_mask = parent->subtree_ss_mask;
 
 		/* threaded cgroups can only have threaded controllers */
 		if (cgroup_is_threaded(cgrp))
@@ -1633,9 +1633,9 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
  * This function calculates which subsystems need to be enabled if
  * @subtree_control is to be applied while restricted to @this_ss_mask.
  */
-static u16 cgroup_calc_subtree_ss_mask(u16 subtree_control, u16 this_ss_mask)
+static u32 cgroup_calc_subtree_ss_mask(u32 subtree_control, u32 this_ss_mask)
 {
-	u16 cur_ss_mask = subtree_control;
+	u32 cur_ss_mask = subtree_control;
 	struct cgroup_subsys *ss;
 	int ssid;
 
@@ -1644,7 +1644,7 @@ static u16 cgroup_calc_subtree_ss_mask(u16 subtree_control, u16 this_ss_mask)
 	cur_ss_mask |= cgrp_dfl_implicit_ss_mask;
 
 	while (true) {
-		u16 new_ss_mask = cur_ss_mask;
+		u32 new_ss_mask = cur_ss_mask;
 
 		do_each_subsys_mask(ss, ssid, cur_ss_mask) {
 			new_ss_mask |= ss->depends_on;
@@ -1848,12 +1848,12 @@ static int css_populate_dir(struct cgroup_subsys_state *css)
 	return ret;
 }
 
-int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask)
+int rebind_subsystems(struct cgroup_root *dst_root, u32 ss_mask)
 {
 	struct cgroup *dcgrp = &dst_root->cgrp;
 	struct cgroup_subsys *ss;
 	int ssid, ret;
-	u16 dfl_disable_ss_mask = 0;
+	u32 dfl_disable_ss_mask = 0;
 
 	lockdep_assert_held(&cgroup_mutex);
 
@@ -2149,7 +2149,7 @@ void init_cgroup_root(struct cgroup_fs_context *ctx)
 		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &root->cgrp.flags);
 }
 
-int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
+int cgroup_setup_root(struct cgroup_root *root, u32 ss_mask)
 {
 	LIST_HEAD(tmp_links);
 	struct cgroup *root_cgrp = &root->cgrp;
@@ -3131,7 +3131,7 @@ void cgroup_procs_write_finish(struct task_struct *task,
 	put_task_struct(task);
 }
 
-static void cgroup_print_ss_mask(struct seq_file *seq, u16 ss_mask)
+static void cgroup_print_ss_mask(struct seq_file *seq, u32 ss_mask)
 {
 	struct cgroup_subsys *ss;
 	bool printed = false;
@@ -3496,9 +3496,9 @@ static void cgroup_finalize_control(struct cgroup *cgrp, int ret)
 	cgroup_apply_control_disable(cgrp);
 }
 
-static int cgroup_vet_subtree_control_enable(struct cgroup *cgrp, u16 enable)
+static int cgroup_vet_subtree_control_enable(struct cgroup *cgrp, u32 enable)
 {
-	u16 domain_enable = enable & ~cgrp_dfl_threaded_ss_mask;
+	u32 domain_enable = enable & ~cgrp_dfl_threaded_ss_mask;
 
 	/* if nothing is getting enabled, nothing to worry about */
 	if (!enable)
@@ -3541,7 +3541,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 					    char *buf, size_t nbytes,
 					    loff_t off)
 {
-	u16 enable = 0, disable = 0;
+	u32 enable = 0, disable = 0;
 	struct cgroup *cgrp, *child;
 	struct cgroup_subsys *ss;
 	char *tok;
@@ -4945,7 +4945,7 @@ bool css_has_online_children(struct cgroup_subsys_state *css)
 
 	rcu_read_lock();
 	css_for_each_child(child, css) {
-		if (child->flags & CSS_ONLINE) {
+		if (css_is_online(child)) {
 			ret = true;
 			break;
 		}
@@ -5750,7 +5750,7 @@ static void offline_css(struct cgroup_subsys_state *css)
 
 	lockdep_assert_held(&cgroup_mutex);
 
-	if (!(css->flags & CSS_ONLINE))
+	if (!css_is_online(css))
 		return;
 
 	if (ss->css_offline)
@@ -6347,7 +6347,7 @@ int __init cgroup_init(void)
 	struct cgroup_subsys *ss;
 	int ssid;
 
-	BUILD_BUG_ON(CGROUP_SUBSYS_COUNT > 16);
+	BUILD_BUG_ON(CGROUP_SUBSYS_COUNT > 32);
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_base_files));
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_psi_files));
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup1_base_files));
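Two ideas from this file can be sketched together in user space: the compile-time guard that ties the subsystem count to the mask width, and the fixed-point loop that cgroup_calc_subtree_ss_mask() uses to pull in dependencies. Everything named here (`DEMO_SUBSYS_COUNT`, `ss_mask_t`, `calc_dep_closure()`, the `depends_on` array) is hypothetical, and the sketch leaves out the restriction to @this_ss_mask and the implicit-controller mask that the kernel function applies.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define DEMO_SUBSYS_COUNT 20		/* assumed count, above the old limit of 16 */
typedef uint32_t ss_mask_t;		/* one bit per subsystem */

/* User-space analogue of BUILD_BUG_ON(CGROUP_SUBSYS_COUNT > 32). */
static_assert(DEMO_SUBSYS_COUNT <= sizeof(ss_mask_t) * CHAR_BIT,
	      "subsystem mask type is too narrow");

/*
 * Repeatedly OR in the dependencies of every enabled subsystem until the
 * mask stops changing. depends_on[i] is the mask subsystem i requires.
 */
ss_mask_t calc_dep_closure(ss_mask_t mask,
			   const ss_mask_t depends_on[DEMO_SUBSYS_COUNT])
{
	for (;;) {
		ss_mask_t next = mask;

		for (unsigned int i = 0; i < DEMO_SUBSYS_COUNT; i++)
			if (mask & ((ss_mask_t)1 << i))
				next |= depends_on[i];

		if (next == mask)
			return mask;
		mask = next;
	}
}
```

If the typedef were changed back to `uint16_t`, the static assertion would fail at compile time for a count of 20, which is the same protection the updated BUILD_BUG_ON provides for up to 32 subsystems.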
