kernel-ark/fs/xfs
Vladimir Davydov 503c358cf1 list_lru: introduce list_lru_shrink_{count,walk}
Kmem accounting of memcg is unusable now, because it lacks slab shrinker
support.  That means when we hit the limit we will get ENOMEM w/o any
chance to recover.  What we should do then is to call shrink_slab, which
would reclaim old inode/dentry caches from this cgroup.  This is what
this patch set is intended to do.

Basically, it does two things.  First, it introduces the notion of
per-memcg slab shrinker.  A shrinker that wants to reclaim objects per
cgroup should mark itself as SHRINKER_MEMCG_AWARE.  Then it will be
passed the memory cgroup to scan from in shrink_control->memcg.  For
such shrinkers shrink_slab iterates over the whole cgroup subtree under
the target cgroup and calls the shrinker for each kmem-active memory
cgroup.

Secondly, this patch set makes the list_lru structure per-memcg.  It's
done transparently to list_lru users - everything they have to do is to
tell list_lru_init that they want memcg-aware list_lru.  Then the
list_lru will automatically distribute objects among per-memcg lists
basing on which cgroup the object is accounted to.  This way to make FS
shrinkers (icache, dcache) memcg-aware we only need to make them use
memcg-aware list_lru, and this is what this patch set does.

As before, this patch set only enables per-memcg kmem reclaim when the
pressure goes from memory.limit, not from memory.kmem.limit.  Handling
memory.kmem.limit is going to be tricky due to GFP_NOFS allocations, and
it is still unclear whether we will have this knob in the unified
hierarchy.

This patch (of 9):

NUMA aware slab shrinkers use the list_lru structure to distribute
objects coming from different NUMA nodes to different lists.  Whenever
such a shrinker needs to count or scan objects from a particular node,
it issues commands like this:

        count = list_lru_count_node(lru, sc->nid);
        freed = list_lru_walk_node(lru, sc->nid, isolate_func,
                                   isolate_arg, &sc->nr_to_scan);

where sc is an instance of the shrink_control structure passed to it
from vmscan.

To simplify this, let's add special list_lru functions to be used by
shrinkers, list_lru_shrink_count() and list_lru_shrink_walk(), which
consolidate the nid and nr_to_scan arguments in the shrink_control
structure.

This will also allow us to avoid patching shrinkers that use list_lru
when we make shrink_slab() per-memcg - all we will have to do is extend
the shrink_control structure to include the target memcg and make
list_lru_shrink_{count,walk} handle this appropriately.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:08 -08:00
..
libxfs Merge branch 'xfs-buf-type-fixes' into for-next 2015-01-22 09:51:30 +11:00
Kconfig xfs: require 64-bit sector_t 2014-07-30 09:12:05 +10:00
kmem.c xfs: change kmem_free to use generic kvfree() 2015-02-02 09:54:18 +11:00
kmem.h xfs: change kmem_free to use generic kvfree() 2015-02-02 09:54:18 +11:00
Makefile xfs: add xfs_mount sysfs kobject 2014-07-15 08:07:01 +10:00
mrlock.h
uuid.c
uuid.h
xfs_acl.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_acl.h xfs: move acl structures to xfs_format.h 2014-11-28 14:24:37 +11:00
xfs_aops.c xfs: don't allocate an ioend for direct I/O completions 2015-02-02 10:02:09 +11:00
xfs_aops.h xfs: don't allocate an ioend for direct I/O completions 2015-02-02 10:02:09 +11:00
xfs_attr_inactive.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_attr_list.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_attr.h
xfs_bit.c xfs: fix static and extern sparse warnings 2013-10-30 13:59:56 -05:00
xfs_bmap_util.c Merge branch 'xfs-consolidate-format-defs' into for-next 2014-11-28 14:52:16 +11:00
xfs_bmap_util.h xfs: move xfs_bmap_finish prototype 2015-01-09 10:47:14 +11:00
xfs_buf_item.c Merge branch 'xfs-buf-type-fixes' into for-next 2015-01-22 09:51:30 +11:00
xfs_buf_item.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_buf.c list_lru: introduce list_lru_shrink_{count,walk} 2015-02-12 18:54:08 -08:00
xfs_buf.h xfs: split metadata and log buffer completion to separate workqueues 2014-12-04 09:43:17 +11:00
xfs_dir2_readdir.c Merge branch 'xfs-misc-fixes-for-3.19-2' into for-next 2014-12-04 09:46:17 +11:00
xfs_discard.c xfs: merge xfs_ag.h into xfs_format.h 2014-11-28 14:25:04 +11:00
xfs_discard.h
xfs_dquot_item.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_dquot_item.h xfs: remove the quotaoff log format from the quotaoff log item 2013-12-13 11:34:08 +11:00
xfs_dquot.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_dquot.h xfs: fix implicit bool to int conversion 2015-01-09 10:48:58 +11:00
xfs_error.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_error.h xfs: global error sign conversion 2014-06-25 14:58:08 +10:00
xfs_export.c Merge branch 'xfs-misc-fixes-for-3.19-2' into for-next 2014-12-04 09:46:17 +11:00
xfs_export.h
xfs_extent_busy.c xfs: merge xfs_ag.h into xfs_format.h 2014-11-28 14:25:04 +11:00
xfs_extent_busy.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_extfree_item.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_extfree_item.h
xfs_file.c Merge branch 'akpm' (patches from Andrew) 2015-02-10 16:45:56 -08:00
xfs_filestream.c xfs: merge xfs_inum.h into xfs_format.h 2014-11-28 14:27:10 +11:00
xfs_filestream.h xfs: add filestream allocator tracepoints 2014-04-23 07:11:52 +10:00
xfs_fsops.c xfs: growfs should use synchronous transactions 2015-02-05 11:13:21 +11:00
xfs_fsops.h
xfs_globals.c xfs: export log_recovery_delay to delay mount time log recovery 2014-09-09 11:56:13 +10:00
xfs_icache.c Merge branch 'xfs-misc-fixes-for-3.19-2' into for-next 2014-12-04 09:46:17 +11:00
xfs_icache.h xfs: merge xfs_ag.h into xfs_format.h 2014-11-28 14:25:04 +11:00
xfs_icreate_item.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_icreate_item.h
xfs_inode_item.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_inode_item.h xfs: remove the inode log format from the inode log item 2013-12-13 11:34:05 +11:00
xfs_inode.c Merge branch 'xfs-buf-type-fixes' into for-next 2015-01-22 09:51:30 +11:00
xfs_inode.h Merge branch 'xfs-misc-fixes-for-3.20-3' into for-next 2015-02-02 10:03:18 +11:00
xfs_ioctl32.c xfs: remove incorrect error negation in attr_multi ioctl 2015-01-22 10:04:24 +11:00
xfs_ioctl32.h xfs: compat_xfs_bstat does not have forkoff 2014-10-02 09:17:58 +10:00
xfs_ioctl.c Merge branch 'xfs-misc-fixes-for-3.20-4' into for-next 2015-02-10 09:24:25 +11:00
xfs_ioctl.h
xfs_iomap.c xfs: pass a 64-bit count argument to xfs_iomap_write_unwritten 2015-01-09 10:48:12 +11:00
xfs_iomap.h xfs: pass a 64-bit count argument to xfs_iomap_write_unwritten 2015-01-09 10:48:12 +11:00
xfs_iops.c xfs: Add support to RENAME_EXCHANGE flag 2014-12-24 08:51:42 +11:00
xfs_iops.h xfs: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
xfs_itable.c Merge branch 'xfs-misc-fixes-for-3.19-2' into for-next 2014-12-04 09:46:17 +11:00
xfs_itable.h xfs: bulkstat chunk formatting cursor is broken 2014-11-07 08:30:30 +11:00
xfs_linux.h xfs: merge xfs_dinode.h into xfs_format.h 2014-11-28 14:24:06 +11:00
xfs_log_cil.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_log_priv.h xfs: add xlog sysfs kobject and attribute handlers 2014-07-15 08:07:29 +10:00
xfs_log_recover.c Merge branch 'xfs-misc-fixes-for-3.19-2' into for-next 2014-12-04 09:46:17 +11:00
xfs_log.c Merge branch 'xfs-sb-logging-rework' into for-next 2015-01-22 09:20:53 +11:00
xfs_log.h xfs: log vector rounding leaks log space 2014-05-20 08:18:09 +10:00
xfs_message.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_message.h
xfs_mount.c xfs: sanitise sb_bad_features2 handling 2015-01-22 09:10:33 +11:00
xfs_mount.h xfs: consolidate superblock logging functions 2015-01-22 09:10:31 +11:00
xfs_mru_cache.c xfs: mark all internal workqueues as freezable 2014-09-09 11:44:46 +10:00
xfs_mru_cache.h xfs: embedd mru_elem into parent structure 2014-04-23 07:11:51 +10:00
xfs_qm_bhv.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_qm_syscalls.c xfs: update for 3.20-rc1 2015-02-10 16:15:17 -08:00
xfs_qm.c list_lru: introduce list_lru_shrink_{count,walk} 2015-02-12 18:54:08 -08:00
xfs_qm.h xfs: update for 3.20-rc1 2015-02-10 16:15:17 -08:00
xfs_quota.h xfs: split dquot buffer operations out 2013-10-23 14:28:35 -05:00
xfs_quotaops.c quota: Split ->set_xstate callback into two 2015-01-30 12:49:40 +01:00
xfs_rtalloc.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_rtalloc.h xfs: combine xfs_rtmodify_summary and xfs_rtget_summary 2014-09-09 11:58:42 +10:00
xfs_stats.c xfs: support the XFS_BTNUM_FINOBT free inode btree type 2014-04-24 16:00:52 +10:00
xfs_stats.h xfs: support the XFS_BTNUM_FINOBT free inode btree type 2014-04-24 16:00:52 +10:00
xfs_super.c Merge branch 'xfs-misc-fixes-for-3.20-4' into for-next 2015-02-10 09:24:25 +11:00
xfs_super.h xfs: require 64-bit sector_t 2014-07-30 09:12:05 +10:00
xfs_symlink.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_symlink.h xfs: push down inactive transaction mgmt for remote symlinks 2013-10-08 14:53:02 -05:00
xfs_sysctl.c xfs: remove deprecated sysctls 2015-01-09 10:47:43 +11:00
xfs_sysctl.h xfs: export log_recovery_delay to delay mount time log recovery 2014-09-09 11:56:13 +10:00
xfs_sysfs.c xfs: export log_recovery_delay to delay mount time log recovery 2014-09-09 11:56:13 +10:00
xfs_sysfs.h xfs: add debug sysfs attribute set 2014-09-09 11:52:42 +10:00
xfs_trace.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_trace.h xfs: introduce xfs_buf_submit[_wait] 2014-10-02 09:05:14 +10:00
xfs_trans_ail.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_trans_buf.c xfs: only trace buffer items if they exist 2015-02-10 09:23:40 +11:00
xfs_trans_dquot.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_trans_extfree.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_trans_inode.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_trans_priv.h xfs: remove unused ail pointer arg from xfs_trans_ail_cursor_done() 2014-04-14 19:06:05 +10:00
xfs_trans.c xfs: set superblock buffer type correctly 2015-01-22 09:30:23 +11:00
xfs_trans.h xfs: format log items write directly into the linear CIL buffer 2013-12-13 11:34:02 +11:00
xfs_xattr.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs.h