503c358cf1
Kmem accounting of memcg is unusable now, because it lacks slab shrinker support. That means when we hit the limit we will get ENOMEM w/o any chance to recover. What we should do then is to call shrink_slab, which would reclaim old inode/dentry caches from this cgroup. This is what this patch set is intended to do.

Basically, it does two things. First, it introduces the notion of per-memcg slab shrinker. A shrinker that wants to reclaim objects per cgroup should mark itself as SHRINKER_MEMCG_AWARE. Then it will be passed the memory cgroup to scan from in shrink_control->memcg. For such shrinkers shrink_slab iterates over the whole cgroup subtree under the target cgroup and calls the shrinker for each kmem-active memory cgroup.

Secondly, this patch set makes the list_lru structure per-memcg. It's done transparently to list_lru users - everything they have to do is to tell list_lru_init that they want memcg-aware list_lru. Then the list_lru will automatically distribute objects among per-memcg lists basing on which cgroup the object is accounted to. This way to make FS shrinkers (icache, dcache) memcg-aware we only need to make them use memcg-aware list_lru, and this is what this patch set does.

As before, this patch set only enables per-memcg kmem reclaim when the pressure goes from memory.limit, not from memory.kmem.limit. Handling memory.kmem.limit is going to be tricky due to GFP_NOFS allocations, and it is still unclear whether we will have this knob in the unified hierarchy.

This patch (of 9):

NUMA aware slab shrinkers use the list_lru structure to distribute objects coming from different NUMA nodes to different lists. Whenever such a shrinker needs to count or scan objects from a particular node, it issues commands like this:

    count = list_lru_count_node(lru, sc->nid);
    freed = list_lru_walk_node(lru, sc->nid, isolate_func,
                               isolate_arg, &sc->nr_to_scan);

where sc is an instance of the shrink_control structure passed to it from vmscan.

To simplify this, let's add special list_lru functions to be used by shrinkers, list_lru_shrink_count() and list_lru_shrink_walk(), which consolidate the nid and nr_to_scan arguments in the shrink_control structure.

This will also allow us to avoid patching shrinkers that use list_lru when we make shrink_slab() per-memcg - all we will have to do is extend the shrink_control structure to include the target memcg and make list_lru_shrink_{count,walk} handle this appropriately.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
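
The consolidation described in the commit message can be illustrated with a small userspace sketch. Everything below is a toy model, not kernel code: `struct list_lru` is reduced to a per-node object counter, `struct shrink_control` keeps only the `nid` and `nr_to_scan` fields named in the message, the callback type is simplified, and `TOY_MAX_NODES` and `toy_isolate_always` are hypothetical names introduced for illustration. Only the shape of the two wrappers, which pull `nid` and `nr_to_scan` out of `sc`, is meant to mirror what the message describes.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES 4 /* hypothetical limit for this sketch */

/* Toy stand-in for struct list_lru: just one object count per NUMA node. */
struct list_lru {
	unsigned long nr_items[TOY_MAX_NODES];
};

/* Toy stand-in for struct shrink_control: only the fields the message names. */
struct shrink_control {
	int nid;                  /* node to count or scan */
	unsigned long nr_to_scan; /* remaining scan budget */
};

/* Simplified callback: returns nonzero if the object was freed. */
typedef int (*list_lru_walk_cb)(void *cb_arg);

/* Per-node primitives, modelled loosely on the calls quoted in the message. */
static inline unsigned long list_lru_count_node(struct list_lru *lru, int nid)
{
	assert(nid >= 0 && nid < TOY_MAX_NODES);
	return lru->nr_items[nid];
}

static inline unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
					       list_lru_walk_cb isolate,
					       void *cb_arg,
					       unsigned long *nr_to_walk)
{
	unsigned long freed = 0;

	assert(nid >= 0 && nid < TOY_MAX_NODES);
	/* Visit objects until the budget or the node's list is exhausted. */
	while (*nr_to_walk > 0 && lru->nr_items[nid] > 0) {
		(*nr_to_walk)--;
		if (isolate(cb_arg)) {
			lru->nr_items[nid]--;
			freed++;
		}
	}
	return freed;
}

/* Wrappers for shrinkers: nid and nr_to_scan both come from sc. */
static inline unsigned long list_lru_shrink_count(struct list_lru *lru,
						  struct shrink_control *sc)
{
	return list_lru_count_node(lru, sc->nid);
}

static inline unsigned long list_lru_shrink_walk(struct list_lru *lru,
						 struct shrink_control *sc,
						 list_lru_walk_cb isolate,
						 void *cb_arg)
{
	return list_lru_walk_node(lru, sc->nid, isolate, cb_arg,
				  &sc->nr_to_scan);
}

/* Hypothetical callback that frees every object it is shown. */
static inline int toy_isolate_always(void *cb_arg)
{
	(void)cb_arg;
	return 1;
}
```

With wrappers of this shape, a shrinker only ever passes `sc` through. That is why, as the message argues, a later change that adds a target memcg to `shrink_control` would need to touch the wrappers rather than every caller.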
libxfs
Kconfig
kmem.c
kmem.h
Makefile
mrlock.h
uuid.c
uuid.h
xfs_acl.c
xfs_acl.h
xfs_aops.c
xfs_aops.h
xfs_attr_inactive.c
xfs_attr_list.c
xfs_attr.h
xfs_bit.c
xfs_bmap_util.c
xfs_bmap_util.h
xfs_buf_item.c
xfs_buf_item.h
xfs_buf.c
xfs_buf.h
xfs_dir2_readdir.c
xfs_discard.c
xfs_discard.h
xfs_dquot_item.c
xfs_dquot_item.h
xfs_dquot.c
xfs_dquot.h
xfs_error.c
xfs_error.h
xfs_export.c
xfs_export.h
xfs_extent_busy.c
xfs_extent_busy.h
xfs_extfree_item.c
xfs_extfree_item.h
xfs_file.c
xfs_filestream.c
xfs_filestream.h
xfs_fsops.c
xfs_fsops.h
xfs_globals.c
xfs_icache.c
xfs_icache.h
xfs_icreate_item.c
xfs_icreate_item.h
xfs_inode_item.c
xfs_inode_item.h
xfs_inode.c
xfs_inode.h
xfs_ioctl32.c
xfs_ioctl32.h
xfs_ioctl.c
xfs_ioctl.h
xfs_iomap.c
xfs_iomap.h
xfs_iops.c
xfs_iops.h
xfs_itable.c
xfs_itable.h
xfs_linux.h
xfs_log_cil.c
xfs_log_priv.h
xfs_log_recover.c
xfs_log.c
xfs_log.h
xfs_message.c
xfs_message.h
xfs_mount.c
xfs_mount.h
xfs_mru_cache.c
xfs_mru_cache.h
xfs_qm_bhv.c
xfs_qm_syscalls.c
xfs_qm.c
xfs_qm.h
xfs_quota.h
xfs_quotaops.c
xfs_rtalloc.c
xfs_rtalloc.h
xfs_stats.c
xfs_stats.h
xfs_super.c
xfs_super.h
xfs_symlink.c
xfs_symlink.h
xfs_sysctl.c
xfs_sysctl.h
xfs_sysfs.c
xfs_sysfs.h
xfs_trace.c
xfs_trace.h
xfs_trans_ail.c
xfs_trans_buf.c
xfs_trans_dquot.c
xfs_trans_extfree.c
xfs_trans_inode.c
xfs_trans_priv.h
xfs_trans.c
xfs_trans.h
xfs_xattr.c
xfs.h