Where we can pass in LOOKUP_DIRECTORY or LOOKUP_REVAL. Any other flags
passed in here are currently ignored.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This function is expected to be called from path-based syscalls to help
them decide whether to try the lookup and call again in the event that
they got an -ESTALE return back on an earier try.
Currently, we only retry the call once on an ESTALE error, but in the
event that we decide that that's not enough in the future, we should be
able to change the logic in this helper without too much effort.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit 8e22cc88d6 removes the (un)lock_super
function definitions but forgets to remove their prototypes.
Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Mark as cancelled an operation that is in progress rather than pending at the
time it is cancelled, and call fscache_complete_op() to cancel an operation so
that blocked ops can be started.
Signed-off-by: David Howells <dhowells@redhat.com>
Convert the fscache_object event IDs from #defines into an enum. Also add an
extra label to the enum to carry the event count and redefine the event mask
in terms of that.
Signed-off-by: David Howells <dhowells@redhat.com>
Make a more complete truncate operation available to CacheFiles (including
security checks and suchlike) so that it can use this to clear invalidated
cache files.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Provide a proper invalidation method rather than relying on the netfs retiring
the cookie it has and getting a new one. The problem with this is that isn't
easy for the netfs to make sure that it has completed/cancelled all its
outstanding storage and retrieval operations on the cookie it is retiring.
Instead, have the cache provide an invalidation method that will cancel or wait
for all currently outstanding operations before invalidating the cache, and
will cause new operations to queue up behind that. Whilst invalidation is in
progress, some requests will be rejected until the cache can stack a barrier on
the operation queue to cause new operations to be deferred behind it.
Signed-off-by: David Howells <dhowells@redhat.com>
Fix the state management of internal fscache operations and the accounting of
what operations are in what states.
This is done by:
(1) Give struct fscache_operation a enum variable that directly represents the
state it's currently in, rather than spreading this knowledge over a bunch
of flags, who's processing the operation at the moment and whether it is
queued or not.
This makes it easier to write assertions to check the state at various
points and to prevent invalid state transitions.
(2) Add an 'operation complete' state and supply a function to indicate the
completion of an operation (fscache_op_complete()) and make things call
it. The final call to fscache_put_operation() can then check that an op
in the appropriate state (complete or cancelled).
(3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
govern the state of an object:
(a) The ->n_ops is now the number of extant operations on the object
and is now decremented by fscache_put_operation() only.
(b) The ->n_in_progress is simply the number of objects that have been
taken off of the object's pending queue for the purposes of being
run. This is decremented by fscache_op_complete() only.
(c) The ->n_exclusive is the number of exclusive ops that have been
submitted and queued or are in progress. It is decremented by
fscache_op_complete() and by fscache_cancel_op().
fscache_put_operation() and fscache_operation_gc() now no longer try to
clean up ->n_exclusive and ->n_in_progress. That was leading to double
decrements against fscache_cancel_op().
fscache_cancel_op() now no longer decrements ->n_ops. That was leading to
double decrements against fscache_put_operation().
fscache_submit_exclusive_op() now decides whether it has to queue an op
based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
will persist in being true even after all preceding operations have been
cancelled or completed. Furthermore, if an object is active and there are
runnable ops against it, there must be at least one op running.
(4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
provide a function to record completion of the pages as they complete.
When n_pages reaches 0, the operation is deemed to be complete and
fscache_op_complete() is called.
Add calls to fscache_retrieval_complete() anywhere we've finished with a
page we've been given to read or allocate for. This includes places where
we just return pages to the netfs for reading from the server and where
accessing the cache fails and we discard the proposed netfs page.
The bugs in the unfixed state management manifest themselves as oopses like the
following where the operation completion gets out of sync with return of the
cookie by the netfs. This is possible because the cache unlocks and returns
all the netfs pages before recording its completion - which means that there's
nothing to stop the netfs discarding them and returning the cookie.
FS-Cache: Cookie 'NFS.fh' still has outstanding reads
------------[ cut here ]------------
kernel BUG at fs/fscache/cookie.c:519!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY
RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282
RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
Stack:
ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
Call Trace:
[<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
[<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
[<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
[<ffffffff810d8d47>] evict+0xa1/0x15c
[<ffffffff810d8e2e>] dispose_list+0x2c/0x38
[<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
[<ffffffff810c56b7>] prune_super+0xd5/0x140
[<ffffffff8109b615>] shrink_slab+0x102/0x1ab
[<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
[<ffffffff8103e009>] ? process_timeout+0xb/0xb
[<ffffffff8109dba3>] kswapd+0x270/0x289
[<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
[<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
[<ffffffff8104bf7a>] kthread+0x7f/0x87
[<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
[<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
[<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
[<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
[<ffffffff813ad6b0>] ? gs_change+0xb/0xb
Signed-off-by: David Howells <dhowells@redhat.com>
Make fscache_relinquish_cookie() log a warning and wait if there are any
outstanding reads left on the cookie it was given.
Signed-off-by: David Howells <dhowells@redhat.com>
Under some circumstances CacheFiles defers the marking of pages with PG_fscache
so that it can take advantage of pagevecs to reduce the number of calls to
fscache_mark_pages_cached() and the netfs's hook to keep track of this.
There are, however, two problems with this:
(1) It can lead to the PG_fscache mark being applied _after_ the page is set
PG_uptodate and unlocked (by the call to fscache_end_io()).
(2) CacheFiles's ref on the page is dropped immediately following
fscache_end_io() - and so may not still be held when the mark is applied.
This can lead to the page being passed back to the allocator before the
mark is applied.
Fix this by, where appropriate, marking the page before calling
fscache_end_io() and releasing the page. This means that we can't take
advantage of pagevecs and have to make a separate call for each page to the
marking routines.
The symptoms of this are Bad Page state errors cropping up under memory
pressure, for example:
BUG: Bad page state in process tar pfn:002da
page:ffffea0000009fb0 count:0 mapcount:0 mapping: (null) index:0x1447
page flags: 0x1000(private_2)
Pid: 4574, comm: tar Tainted: G W 3.1.0-rc4-fsdevel+ #1064
Call Trace:
[<ffffffff8109583c>] ? dump_page+0xb9/0xbe
[<ffffffff81095916>] bad_page+0xd5/0xea
[<ffffffff81095d82>] get_page_from_freelist+0x35b/0x46a
[<ffffffff810961f3>] __alloc_pages_nodemask+0x362/0x662
[<ffffffff810989da>] __do_page_cache_readahead+0x13a/0x267
[<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
[<ffffffff81098d7b>] ra_submit+0x1c/0x20
[<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
[<ffffffff81098ee2>] ? ondemand_readahead+0x163/0x29a
[<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
[<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
[<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
[<ffffffff810c22c4>] do_sync_read+0xba/0xfa
[<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
[<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
[<ffffffff810c29a4>] vfs_read+0xaa/0x13a
[<ffffffff810c2a79>] sys_read+0x45/0x6c
[<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b
As can be seen, PG_private_2 (== PG_fscache) is set in the page flags.
Instrumenting fscache_mark_pages_cached() to verify whether page->mapping was
set appropriately showed that sometimes it wasn't. This led to the discovery
that sometimes the page has apparently been reclaimed by the time the marker
got to see it.
Reported-by: M. Stevens <m@tippett.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
The code that relied on that flag was ripped out of btrfs quite some
time ago, and never added back. Josef indicated that he was going to
take a different approach to the problem in btrfs, and that we
could just eliminate this flag.
Cc: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Latinoware 2012.
There's a slightly non-trivial merge in virtio-net, as we cleaned up the
virtio add_buf interface while DaveM accepted the mq virtio-net patches.
You can see my solution in my pending-rebases branch, if that helps, but I
know you love merging:
https://git.kernel.org/?p=linux/kernel/git/rusty/linux.git;a=commit;h=12e4e64fa66a4c812e4855de32abdb4d819526fe
Cheers,
Rusty.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQz/vKAAoJENkgDmzRrbjx+eYQAK/egj9T8Nnth6mkzdbCFSO7
Bciga2hDiudGCiGojTRGPRSc0VP9LgfvPbY2pxX+R9CfEqR+a8q/rRQhCS79ZwPB
/mJy3HNiCx418HZxgwNtk6vPe0PjJm6SsjbXeB9hB+PQLCbdwA0BjpG6xjF/jitP
noPqhhXreeQgYVxAKoFPvff/Byu2GlNnDdVMQxWRmo8hTKlTCzl0T/7BHRxthhJj
iOrXTFzrT/osPT0zyqlngT03T4wlBvL2Bfw8d/kuRPEZ71dpIctWeH2KzdwXVCrz
hFQGxAz4OWvW3xrNwj7c6O3SWj4VemUMjQqeA/PtRiOEI5gM0Y/Bit47dWL4wM/O
OWUKFHzq4DFs8MmwXBgDDXl5xOjOBH9Ik4FZayn3Y7COT/B8CjFdOC2MdDGmZ9yd
NInumg7FqP+u12g+9Vq8S/b0cfoQm4qFe8VHiPJu+jRmCZglyvLjk7oq/QwW8Gaq
Pkzit1Ey0DWo2KvZ4D/nuXJCuhmzN/AJ10M48lLYZhtOIVg9gsa0xjhfgq4FnvSK
xFCf3rcWnlGIXcOYh/hKU25WaCLzBuqMuSK35A72IujrQOL7OJTk4Oqote3Z3H9B
08XJmyW6SOZdfw17X4Im1jbyuLek///xQJ9Jw/tya7j9lBt8zjJ+FmLPs4mLGEOm
WJv9uZPs+QbIMNky2Lcb
=myDR
-----END PGP SIGNATURE-----
Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio update from Rusty Russell:
"Some nice cleanups, and even a patch my wife did as a "live" demo for
Latinoware 2012.
There's a slightly non-trivial merge in virtio-net, as we cleaned up
the virtio add_buf interface while DaveM accepted the mq virtio-net
patches."
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
virtio_console: Add support for remoteproc serial
virtio_console: Merge struct buffer_token into struct port_buffer
virtio: add drv_to_virtio to make code clearly
virtio: use dev_to_virtio wrapper in virtio
virtio-mmio: Fix irq parsing in command line parameter
virtio_console: Free buffers from out-queue upon close
virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
virtio_console: Use kmalloc instead of kzalloc
virtio_console: Free buffer if splice fails
virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
virtio: console: don't rely on virtqueue_add_buf() returning capacity.
virtio_net: don't rely on virtqueue_add_buf() returning capacity.
virtio-net: remove unused skb_vnet_hdr->num_sg field
virtio-net: correct capacity math on ring full
virtio: move queue_index and num_free fields into core struct virtqueue.
...
This is a batch of fixes for arm-soc platforms, most of it is for OMAP
but there are others too (i.MX, Tegra, ep93xx). Fixes warnings, some
broken platforms and drivers, etc. A bit all over the map really.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0BBZAAoJEIwa5zzehBx3WwgP/jS31XauTUrGLEOCUzarINB/
7ZVGkkVv9cp4AqW1lcBAyQak424ff2hxfhJWxRthBJT/fQ2OFcdZFWLkFEG2kO0y
PZ5WCxI1Q4ZNz8iW9qynIRCyhvzhaTHwA7wsGqmGRl1u2VMfoeBiAPNoxTAnpUEm
05L7EBDVSK++KgvkuoQ2KeWOII9IKNaHH4Yg5y8/guCsTbsWjzo4cjS5MGmo3s7r
6ArPr2h2WvSbayL67aPheBwg6K0ScY/JJQhL/8HgFbnnnL+mDcZ9iH5yKl/b25BX
FnQkjb37p0GUDNhXQOoElifghDF8rIAD6o5WDgTL2h5uun4WImYbMS/CktnLAQeH
wNVvrpmpxi11xf9D3SCRCM4jVAy4u1DVEL/0FElWCx1hn4iixm9hGvQ+YPq6/pMl
LOP87mzmFn+FhPtj7HIDp5B1ECw1xqcP064FHUYhMEntEHfN/Xh6gmDooesDrbUf
VuvjrSRBIeTI98xrNALepqWn5w/veZtIymZdEz9vlcK8gQ+tHNt7oI/RbDb08cap
gyMboWciAm2F6FE9QNlrjQHTbCEIrE5BoryRW87ArTbYhKvnZ1vpW01YmJHhA6LE
v6glfw6FWYAU8wlF4xbyzN1Gy3PB4kUMCGDqZzh4wU1JGXklJm3CbIf73sOwRmg7
lmnw+RYq6z/RMmPjjrb4
=4g/i
-----END PGP SIGNATURE-----
Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"This is a batch of fixes for arm-soc platforms, most of it is for OMAP
but there are others too (i.MX, Tegra, ep93xx). Fixes warnings, some
broken platforms and drivers, etc. A bit all over the map really."
There was some concern about commit 68136b10 ("RM: sunxi: Change device
tree naming scheme for sunxi"), but Tony says:
"Looks like that's trivial to fix as needed, no need to rebuild the
branch to fix that AFAIK.
The fix can be done once Olof is available online again.
Linus, I suggest that you go ahead and pull this if there are no other
issues with this branch."
* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits)
ARM: sunxi: Change device tree naming scheme for sunxi
ARM: ux500: fix missing include
ARM: u300: delete custom pin hog code
ARM: davinci: fix build break due to missing include
ARM: exynos: Fix warning due to missing 'inline' in stub
ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices
ARM i.MX51 clock: Fix regression since enabling MIPI/HSP clocks
ARM: dts: mx27: Fix the AIPI bus for FEC
ARM: OMAP2+: common: remove use of vram
ARM: OMAP3/4: cpuidle: fix sparse and checkpatch warnings
ARM: OMAP4: clock data: DPLLs are missing bypass clocks in their parent lists
ARM: OMAP4: clock data: div_iva_hs_clk is a power-of-two divider
ARM: OMAP4: Fix EMU clock domain always on
ARM: OMAP4460: Workaround ABE DPLL failing to turn-on
ARM: OMAP4: Enhance support for DPLLs with 4X multiplier
ARM: OMAP4: Add function table for non-M4X dplls
ARM: OMAP4: Update timer clock aliases
ARM: OMAP: Move plat/omap-serial.h to include/linux/platform_data/serial-omap.h
ARM: dts: Add build target for omap4-panda-a4
ARM: dts: OMAP2420: Correct H4 board memory size
...
Documentation says that code requiring dma-buf should add it to
select, so inline fallbacks are not going to be used. A link error
will make it obvious what went wrong, instead of silently doing
nothing at runtime.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Rob Clark <rob.clark@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Pull networking fixes from David Miller:
1) Really fix tuntap SKB use after free bug, from Eric Dumazet.
2) Adjust SKB data pointer to point past the transport header before
calling icmpv6_notify() so that the headers are in the state which
that function expects. From Duan Jiong.
3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason
Wang.
4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov.
5) Don't destroy mutex after freeing up device private in mac802154,
fix also from Konstantin Khlebnikov.
6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch.
7) SCTP HMAC kconfig rework, from Neil Horman.
8) Fix SCTP jprobes function signature, otherwise things explode, from
Daniel Borkmann.
9) Fix typo in ipv6-offload Makefile variable reference, from Simon
Arlott.
10) Don't fail USBNET open just because remote wakeup isn't supported,
from Oliver Neukum.
11) be2net driver bug fixes from Sathya Perla.
12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David
Woodhouse.
13) Fix MTU changing regression in 8139cp driver, from John Greene.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
solos-pci: ensure all TX packets are aligned to 4 bytes
solos-pci: add firmware upgrade support for new models
solos-pci: remove superfluous debug output
solos-pci: add GPIO support for newer versions on Geos board
8139cp: Prevent dev_close/cp_interrupt race on MTU change
net: qmi_wwan: add ZTE MF880
drivers/net: Use of_match_ptr() macro in smsc911x.c
drivers/net: Use of_match_ptr() macro in smc91x.c
ipv6: addrconf.c: remove unnecessary "if"
bridge: Correctly encode addresses when dumping mdb entries
bridge: Do not unregister all PF_BRIDGE rtnl operations
use generic usbnet_manage_power()
usbnet: generic manage_power()
usbnet: handle PM failure gracefully
ksz884x: fix receive polling race condition
qlcnic: update driver version
qlcnic: fix unused variable warnings
net: fec: forbid FEC_PTP on SoCs that do not support
be2net: fix wrong frag_idx reported by RX CQ
be2net: fix be_close() to ensure all events are ack'ed
...
error and a missing symbol export.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0l2KAAoJEEFnBt12D9kB+tUQAKMjdtBO4MV9LaSham/yj+bf
f7aGoslEFHloXOvP0/Hg8+bP+Z0El5p7ncCZIROtN2pfhYad1CjbZHhwhJeeRMuJ
vQT+uy9BZ09pWoSvkLCWdUQyEGYWMxaOQtDMVHEdURU7nRBOPlntrCijoS68pcK8
Tw9XsX69Qurk/Z8q8DXf8hmNF49Cyv8ax1rywqjTTT0yzR+UzQGldXUDqEg9bg/t
Qcf03xwUSojdEBQrLo8aaFm32EUguqB02WqS1KMQBaz6FnbqmvVdVyIgVea7kgNi
mws1um6/3B/yDYB3ESKNXeZiF1bW03ccdITOeRe+mBDEz0JjnZQmQb30xaT6kjT/
z4VV1DVFAKFQ4M0HlKmv7xvcBcNdzuhrhiFteGNqDkU+zGwgl4GkiSP/c6l0CRVt
Ij8jHM+YtLmeI1ajx2V9OhM4xzK1Upo6+zi5zgGxLqflvBnFysBVuLjV8/KdeJ6Z
FW5J0iOMbIdRKdBZugj+c47qMuBjXx+BZwUxoaAbgHnFLctkF3cWhc3sLpqNi6pe
6F8GkfEIW8nc6RYLb+Rh4e29mZgAMub+GS3bqKA0bcpOjgm6zwNV27ws7lLxwz2u
d5Xf6uB2nfZGSZJtRa8mvqwEFkMoFPwaAD3XX2DmpSPZ0jTX5X1bRMa4u3UA7uJF
VIdMKi+ZzuVhLpaMtnXc
=srgY
-----END PGP SIGNATURE-----
mergetag object bc1008cf7d
type commit
tag gpio-for-linus
tagger Grant Likely <grant.likely@secretlab.ca> 1355962627 +0000
GPIO device driver bug fixes:
- gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
- gpio/ich: Add missing spinlock init
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0lmWAAoJEEFnBt12D9kBhxEP/0RQ87g52xXFxYpfIRRw6YIY
YlXR0noPEJqYnOUGHI0pi+P6mFDvv95etT3khEsuwVkSbb0c0fEl8m6w8O05aTd9
DHdQ4ilvUE77+xO2uZsRN6VxBgnApaj8qMdLWi09yFnSm43wegWTi38IpnY2W4PE
/qgTWT+pBraW+l4TQ/dA4hvhsTorH1McnbaNWQG1oNHbrnQ4Q43UII8qR02exz6Y
CTgo5qpgvzbz334VQ/VYPgd2NOE1wZ6BihMZr8wNVv25Ip1tWhEh4BbCjpMmGBYC
FnHfmKB3zeDdHbsnV2TZc2vC48qnMblGSkMjKsKZn0IOfH2YSym94L0VaXZhCLkw
hjtZxrDnbkHzpkLc7inc8IxbTG1wyhtDQhIf9HBzO33QPtdXw2v6GVtLJ0Ca9ikB
1T3XJZPZW+JwaEUdsw4UP7ZkJ7cwRG+fxB3iN57QDKj6Hjy+i/hA3GOjbF1VAOsb
xTrLcfnYvQ81BXdMP2rzPo2c/XdwroW6SNNxYq8xkzlVZuRT0kZSObfGSM4zFxKp
idfxwHz6ctXB1oni3XBaHmuyRXMZ9zERuyoNARPIJVw0ylSb93eq7dx+j02JOBtc
RtrBb1oVtOLiVdn6mIDA4LSELdmcbZdXHW4k84CqyNPWSRG4Sbhx3539r1m90vHw
EBnI5XpggxPt/U9qqpxH
=JGvn
-----END PGP SIGNATURE-----
mergetag object d3601e56cf
type commit
tag spi-for-linus
tagger Grant Likely <grant.likely@secretlab.ca> 1355962904 +0000
SPI device driver bug fixes branch for the v3.8 merge window. Most of
this is bug fixes to the core code and the sh-hspi and s3c64xx device
drivers.
There is also a patch here to add DT support to the Atmel driver. This
one should have been in the first round, but I missed it. It's a low
risk change contained within a single driver and the Atmel maintainer
has requested it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0lxNAAoJEEFnBt12D9kBCTMP/0WvS/7iiilM00b5pmclGkRc
Ct3832KnZXn5dDYoSsP1AS2PRJAKbrMzvKMZu+ggjOAlwza87rKJJpt+KdU3r15b
+KElnaStCC88ohxEOwQV4+CJpF3dS1AmrGMQvo9RK8hddpR1dH+FgMdmex104X/A
Ysr0WgqQMY3iGVmnlPxU4EYlR1scnqOpaNADNbMY7ibuJ7LOYk1RYtqmX5yxdnJq
yABILFa10qV9N/cqPBu2Cuhw8Xtp/u70dsQPkALG//pyZnFyhw6rMJ6bowOC2IHM
p3R4IrVqJq1a3nL1IC1XX5/k21b0PBNxfOGqpeU4QoDK4/cIKeiPny1dtoI1UM+2
mpTU99VvitZ0ywolenKnbPdU61UwZ6Sd+ptsZM0Y/wIMiXDWSBDTZqVJRAbINqFb
JwwjtdaFDPgtIb6Yg1WuhLmhcOPfXFIRys7JmnS0VNQCpvvEIdFsR9l5jjMcmluR
W7V69z9FZj5lf7WjNstbAxbCvrw1i65OD6Nr1UBAA17bqNU6a9BCWdNJGwLozraR
k2ZomvfmIMhTP82z5by5UY39M9B5uGOHDxxXZOOvxzB/fxlYgLhiAGPfMY1FMrPH
48gDqQtdOpakL1B/gVELpBLMoKjwEx9jJa/8tJb5wy8EgNSw06BDSN69OclLgmV/
uVnSZWrp62odDLz0qjSR
=trRO
-----END PGP SIGNATURE-----
Merge tags 'dt-for-linus', 'gpio-for-linus' and 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull devicetree, gpio and spi bugfixes from Grant Likely:
"Device tree v3.8 bug fix:
- Fixes an undefined struct device build error and a missing symbol
export.
GPIO device driver bug fixes:
- gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
- gpio/ich: Add missing spinlock init
SPI device driver bug fixes:
- Most of this is bug fixes to the core code and the sh-hspi and
s3c64xx device drivers.
- There is also a patch here to add DT support to the Atmel driver.
This one should have been in the first round, but I missed it.
It's a low risk change contained within a single driver and the
Atmel maintainer has requested it."
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux-2.6:
of: define struct device in of_platform.h if !OF_DEVICE and !OF_ADDRESS
of: Fix export of of_find_matching_node_and_match()
* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6:
gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
gpio/ich: Add missing spinlock init
* tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6:
spi/sh-hspi: fix return value check in hspi_probe().
spi: fix tegra SPI binding examples
spi/atmel: add DT support
of/spi: Fix SPI module loading by using proper "spi:" modalias prefixes.
spi: Change FIFO flush operation and spi channel off
spi: Keep chipselect assertion during one message
- Various cleanups especially in NAND tests
- Add support for NAND flash on BCMA bus
- DT support for sh_flctl and denali NAND drivers
- Kill obsolete/superceded drivers (fortunet, nomadik_nand)
- Fix JFFS2 locking bug in ENOMEM failure path
- New SPI flash chips, as usual
- Support writing in 'reliable mode' for DiskOnChip G4
- Debugfs support in nandsim
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEABECAAYFAlDSAa4ACgkQdwG7hYl686MMcACeNYa//ghPtccb5L+IRXsqaFDL
Yi4AoLWOaOjN8qM4KUF/bfMEkwNGAePz
=DaAQ
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd
Pull MTD updates from David Woodhouse:
- Various cleanups especially in NAND tests
- Add support for NAND flash on BCMA bus
- DT support for sh_flctl and denali NAND drivers
- Kill obsolete/superceded drivers (fortunet, nomadik_nand)
- Fix JFFS2 locking bug in ENOMEM failure path
- New SPI flash chips, as usual
- Support writing in 'reliable mode' for DiskOnChip G4
- Debugfs support in nandsim
* tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd: (96 commits)
mtd: nand: typo in nand_id_has_period() comments
mtd: nand/gpio: use io{read,write}*_rep accessors
mtd: block2mtd: throttle writes by calling balance_dirty_pages_ratelimited.
mtd: nand: gpmi: reset BCH earlier, too, to avoid NAND startup problems
mtd: nand/docg4: fix and improve read of factory bbt
mtd: nand/docg4: reserve bb marker area in ecclayout
mtd: nand/docg4: add support for writing in reliable mode
mtd: mxc_nand: reorder part_probes to let cmdline override other sources
mtd: mxc_nand: fix unbalanced clk_disable() in error path
mtd: nandsim: Introduce debugfs infrastructure
mtd: physmap_of: error checking to prevent a NULL pointer dereference
mtg: docg3: potential divide by zero in doc_write_oob()
mtd: bcm47xxnflash: writing support
mtd: tests/read: initialize buffer for whole next page
mtd: at91: atmel_nand: return bit flips for the PMECC read_page()
mtd: fix recovery after failed write-buffer operation in cfi_cmdset_0002.c
mtd: nand: onfi need to be probed in 8 bits mode
mtd: nand: add NAND_BUSWIDTH_AUTO to autodetect bus width
mtd: nand: print flash size during detection
mted: nand_wait_ready timeout fix
...
Centralise common code for manage_power() in usbnet
by making a generic simple implementation
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a device fails to do remote wakeup, this is no reason
to abort an open totally. This patch just continues without
runtime PM.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
A new driver has been added for the SPEAr platform and the TWL4030/6030
driver has been replaced by two drivers that control the regular PWMs
and the PWM driven LEDs provided by the chips.
The vt8500, tiecap, tiehrpwm, i.MX, LPC32xx and Samsung drivers have all
been improved and the device tree bindings now support the PWM signal
polarity.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJQ0XF+AAoJEN0jrNd/PrOhk5IP/RxjjfVM8z7i0xc6ykNRyv/3
y8jRh1miwPXeamLdW07vF2NILBtDBmZ8TUbfAMt1esf25UST79Rgol/ia+QlBb3q
w/pCDVGwlOW+2qUd34Nlb+CyKXRROIbPcFy29RZn+Z29qdkWn4LVS0nZ0UJPSPel
80P6qxkrFGG8eKsYKB7InA0g4Ds0neWRJAoYsVp5jzyOPFpUILPdaptX+iSEk1v3
VvkIx8eTDPZO9aqn8qL6bcE0g0AneF+dJ4qzLiswlsPMxsFIoVysw6n2JuEyy+FD
x0Ml86Zl4SiNzZa4Pwa1250MwFT3cvnjWbAdLb2CGJVMV/uGU6nuyfXV0Aa1XtjQ
C0k8LNshgiID8/m1/H3+/aUy9Cx/hj1KM4jchrwkCphlBiAEEryKjTPJXDwfjuBI
s5NP4rUwfNVSQT66RaNVZ7atOKyeVu+hwAKO0h6PHOsD1GwhsT+b51/YmWXmb2E5
OgLLOOHVFORfxCrsXRhWcHydMzfplOtfZ4smr4WG0hKUVn2Zp1DK1zDE2rCBB4X1
ZMas4OO9uDRY9IDXZUZUtXcDPDppI6Zx3YeE1/MWzmjWqzyEYFf5OXBbakPr40Rq
lKEwYPNf5yiqIURfFZiGDk61mrwA0Vi3i8vYfTFq1zyX9u8KbXeY8g7AhcyC1qsM
YhfCmJZ0njopu8oENMgn
=m+7I
-----END PGP SIGNATURE-----
Merge tag 'for-3.8-rc1' of git://gitorious.org/linux-pwm/linux-pwm
Pull pwm changes from Thierry Reding:
"A new driver has been added for the SPEAr platform and the
TWL4030/6030 driver has been replaced by two drivers that control the
regular PWMs and the PWM driven LEDs provided by the chips.
The vt8500, tiecap, tiehrpwm, i.MX, LPC32xx and Samsung drivers have
all been improved and the device tree bindings now support the PWM
signal polarity."
Fix up trivial conflicts due to __devinit/exit removal.
* tag 'for-3.8-rc1' of git://gitorious.org/linux-pwm/linux-pwm: (21 commits)
pwm: samsung: add missing s3c->pwm_id assignment
pwm: lpc32xx: Set the chip base for dynamic allocation
pwm: lpc32xx: Properly disable the clock on device removal
pwm: lpc32xx: Fix the PWM polarity
pwm: i.MX: eliminate build warning
pwm: Export of_pwm_xlate_with_flags()
pwm: Remove pwm-twl6030 driver
pwm: New driver to support PWM driven LEDs on TWL4030/6030 series of PMICs
pwm: New driver to support PWMs on TWL4030/6030 series of PMICs
pwm: pwm-tiehrpwm: pinctrl support
pwm: tiehrpwm: Add device-tree binding
pwm: pwm-tiehrpwm: Adding TBCLK gating support.
pwm: pwm-tiecap: pinctrl support
pwm: tiecap: Add device-tree binding
pwm: Add TI PWM subsystem driver
pwm: Device tree support for PWM polarity
pwm: vt8500: Ensure PWM clock is enabled during pwm_config
pwm: vt8500: Fix build error
pwm: spear: Staticize spear_pwm_config()
pwm: Add SPEAr PWM chip driver support
...
Fixes the following warning:
include/linux/of_platform.h:106:13: warning: 'struct device' declared
inside parameter list [enabled by default]
include/linux/of_platform.h:106:13: warning: its scope is only this
definition or declaration, which is probably not what you want [enabled
by default]
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: Rob Herring <robherring2@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
to verify the source of the module (ChromeOS) and/or use standard IMA on it
or other security hooks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0VKlAAoJENkgDmzRrbjxjuEQALVHpD1cSmryOzVwkNn7rVGP
PV3KVbUs+qzUCm2c3AafIIlSBm2LOUl+cR3uNC7di8aHarRF3VHkK2OQ4Fx97ECd
KKBqAyY3R0q1mAKujb/MWwiK0YgosEDIOzGGn2yQhNFsxKqnMB02P4j82IO7+g+w
Cc3XuDyWHoH2I+ySgz0Q8NHAqufD/DMZUKud7jw2Lsv6PuICJ1Oqgl/Gd/muxort
4a5tV3tjhRGywHS/8b2fbDUXkybC5NKK0FN+gyoaROmJ/THeHEQDGXZT9bc2vmVx
HvRy/5k8dzQ6LAJ2mLnPvy0pmv0u7NYMvjxTxxUlUkFMkYuVticikQfwSYDbDPt4
mbsLxchpgi8z4x8HltEERffCX5tldo/5hz1uemqhqIsMRIrRFnlHkSIgkGjVHf2u
LXQBLT8uTm6C0VyNQPrI/hUZzIax7WtKbPSoK9lmExNbKqloEFh/mVXvfQxei2kp
wnUZcnmPIqSvw7b4CWu7HibMYu2VvGBgm3YIfJRi4AQme1mzFYLpZoxF5Pj+Ykbt
T//Hb1EsNQTTFCg7MZhnJSAw/EVUvNDUoullORClyqw6+xxjVKqWpPJgYDRfWOlJ
Xa+s7DNrL+Oo1WWR8l5ruoQszbR8szIyeyPKKxRUcQj2zsqghoWuzKAx2saSEw3W
pNkoJU+dGC7kG/yVAS8N
=uoJj
-----END PGP SIGNATURE-----
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module update from Rusty Russell:
"Nothing all that exciting; a new module-from-fd syscall for those who
want to verify the source of the module (ChromeOS) and/or use standard
IMA on it or other security hooks."
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
MODSIGN: Fix kbuild output when using default extra_certificates
MODSIGN: Avoid using .incbin in C source
modules: don't hand 0 to vmalloc.
module: Remove a extra null character at the top of module->strtab.
ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants
ASN.1: Define indefinite length marker constant
moduleparam: use __UNIQUE_ID()
__UNIQUE_ID()
MODSIGN: Add modules_sign make target
powerpc: add finit_module syscall.
ima: support new kernel module syscall
add finit_module syscall to asm-generic
ARM: add finit_module syscall to ARM
security: introduce kernel_module_from_file hook
module: add flags arg to sys_finit_module()
module: add syscall to load module from fd
to opt in to using GCC's __builtin_bswapXX() intrinsics for byteswapping,
and if we merge this now then the architecture maintainers can enable it
for their arch during the next cycle without dependency issues.
It's worth making it a par-arch opt-in, because although in *theory* the
compiler should never do worse than hand-coded assembler (and of course
it also ought to do a lot better on platforms like Atom and PowerPC which
have load-and-swap or store-and-swap instructions), that isn't always the
case. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46453 for example.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEABECAAYFAlDRvNsACgkQdwG7hYl686O7KACeKQMiuZMLB9ctF5u0Iql+33PF
+WAAnisvZ8HCUjG5E8DF6HWy45r4BjUp
=eeUs
-----END PGP SIGNATURE-----
Merge tag 'byteswap-for-linus-20121219' of git://git.infradead.org/users/dwmw2/byteswap
Pull preparatory gcc intrisics bswap patch from David Woodhouse:
"This single patch is effectively a no-op for now. It enables
architectures to opt in to using GCC's __builtin_bswapXX() intrinsics
for byteswapping, and if we merge this now then the architecture
maintainers can enable it for their arch during the next cycle without
dependency issues.
It's worth making it a par-arch opt-in, because although in *theory*
the compiler should never do worse than hand-coded assembler (and of
course it also ought to do a lot better on platforms like Atom and
PowerPC which have load-and-swap or store-and-swap instructions), that
isn't always the case. See
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46453
for example."
* tag 'byteswap-for-linus-20121219' of git://git.infradead.org/users/dwmw2/byteswap:
byteorder: allow arch to opt to use GCC intrinsics for byteswapping
Commit 8dd2cb7e88 ("block: discard granularity might not be power of
2") changed a couple of 'binary and' operations into modulus operations.
Which turned the harmless case of a zero discard_granularity into a
possible divide-by-zero.
The code also had a much more subtle bug: it was doing the modulus of a
value in bytes using 'sector_t'. That was always conceptually wrong,
but didn't actually matter back when the code assumed a power-of-two
granularity: we only looked at the low bits anyway.
But with potentially arbitrary sector numbers, using a 'sector_t' to
express bytes is very very wrong: depending on configuration it limits
the starting offset of the device to just 32 bits, and any overflow
would result in a wrong value if the modulus wasn't a power-of-two.
So re-write the code to not only protect against the divide-by-zero, but
to do the starting sector arithmetic in sectors, and using the proper
types.
[ For any mathematicians out there: it also looks monumentally stupid to
do the 'modulo granularity' operation *twice*, never mind having a "+
granularity" in the second modulus op.
But that's the easiest way to avoid negative values or overflow, and
it is how the original code was done. ]
Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Doug Anderson <dianders@chromium.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Shaohua Li <shli@fusionio.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull i2c-embedded changes from Wolfram Sang:
- CBUS driver (an I2C variant)
- continued rework of the omap driver
- s3c2410 gets lots of fixes and gains pinctrl support
- at91 gains DMA support
- the GPIO muxer gains devicetree probing
- typical fixes and additions all over
* 'i2c-embedded/for-next' of git://git.pengutronix.de/git/wsa/linux: (45 commits)
i2c: omap: Remove the OMAP_I2C_FLAG_RESET_REGS_POSTIDLE flag
i2c: at91: add dma support
i2c: at91: change struct members indentation
i2c: at91: fix compilation warning
i2c: mxs: Do not disable the I2C SMBus quick mode
i2c: mxs: Handle i2c DMA failure properly
i2c: s3c2410: Remove recently introduced performance overheads
i2c: ocores: Move grlib set/get functions into #ifdef CONFIG_OF block
i2c: s3c2410: Add fix for i2c suspend/resume
i2c: s3c2410: Fix code to free gpios
i2c: i2c-cbus-gpio: introduce driver
i2c: ocores: Add support for the GRLIB port of the controller and use function pointers for getreg and setreg functions
i2c: ocores: Add irq support for sparc
i2c: omap: Move the remove constraint
ARM: dts: cfa10049: Add the i2c muxer buses to the CFA-10049
i2c: s3c2410: do not special case HDMIPHY stuck bus detection
i2c: s3c2410: use exponential back off while polling for bus idle
i2c: s3c2410: do not generate STOP for QUIRK_HDMIPHY
i2c: s3c2410: grab adapter lock while changing i2c clock
i2c: s3c2410: Add support for pinctrl
...
Merge patches from Andrew Morton:
"Most of the rest of MM, plus a few dribs and drabs.
I still have quite a few irritating patches left around: ones with
dubious testing results, lack of review, ones which should have gone
via maintainer trees but the maintainers are slack, etc.
I need to be more activist in getting these things wrapped up outside
the merge window, but they're such a PITA."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (48 commits)
mm/vmscan.c: avoid possible deadlock caused by too_many_isolated()
vmscan: comment too_many_isolated()
mm/kmemleak.c: remove obsolete simple_strtoul
mm/memory_hotplug.c: improve comments
mm/hugetlb: create hugetlb cgroup file in hugetlb_init
mm/mprotect.c: coding-style cleanups
Documentation: ABI: /sys/devices/system/node/
slub: drop mutex before deleting sysfs entry
memcg: add comments clarifying aspects of cache attribute propagation
kmem: add slab-specific documentation about the kmem controller
slub: slub-specific propagation changes
slab: propagate tunable values
memcg: aggregate memcg cache values in slabinfo
memcg/sl[au]b: shrink dead caches
memcg/sl[au]b: track all the memcg children of a kmem_cache
memcg: destroy memcg caches
sl[au]b: allocate objects from memcg cache
sl[au]b: always get the cache from its page in kmem_cache_free()
memcg: skip memcg kmem allocations in specified code regions
memcg: infrastructure to match an allocation to the right cache
...
Build kernel with CONFIG_HUGETLBFS=y,CONFIG_HUGETLB_PAGE=y and
CONFIG_CGROUP_HUGETLB=y, then specify hugepagesz=xx boot option, system
will fail to boot.
This failure is caused by following code path:
setup_hugepagesz
hugetlb_add_hstate
hugetlb_cgroup_file_init
cgroup_add_cftypes
kzalloc <--slab is *not available* yet
For this path, slab is not available yet, so memory allocated will be
failed, and cause WARN_ON() in hugetlb_cgroup_file_init().
So I move hugetlb_cgroup_file_init() into hugetlb_init().
[akpm@linux-foundation.org: tweak coding-style, remove pointless __init on inlined function]
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch clarifies two aspects of cache attribute propagation.
First, the expected context for the for_each_memcg_cache macro in
memcontrol.h. The usages already in the codebase are safe. In mm/slub.c,
it is trivially safe because the lock is acquired right before the loop.
In mm/slab.c, it is less so: the lock is acquired by an outer function a
few steps back in the stack, so a VM_BUG_ON() is added to make sure it is
indeed safe.
A comment is also added to detail why we are returning the value of the
parent cache and ignoring the children's when we propagate the attributes.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLUB allows us to tune a particular cache behavior with sysfs-based
tunables. When creating a new memcg cache copy, we'd like to preserve any
tunables the parent cache already had.
This can be done by tapping into the store attribute function provided by
the allocator. We of course don't need to mess with read-only fields.
Since the attributes can have multiple types and are stored internally by
sysfs, the best strategy is to issue a ->show() in the root cache, and
then ->store() in the memcg cache.
The drawback of that, is that sysfs can allocate up to a page in buffering
for show(), that we are likely not to need, but also can't guarantee. To
avoid always allocating a page for that, we can update the caches at store
time with the maximum attribute size ever stored to the root cache. We
will then get a buffer big enough to hold it. The corolary to this, is
that if no stores happened, nothing will be propagated.
It can also happen that a root cache has its tunables updated during
normal system operation. In this case, we will propagate the change to
all caches that are already active.
[akpm@linux-foundation.org: tweak code to avoid __maybe_unused]
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLAB allows us to tune a particular cache behavior with tunables. When
creating a new memcg cache copy, we'd like to preserve any tunables the
parent cache already had.
This could be done by an explicit call to do_tune_cpucache() after the
cache is created. But this is not very convenient now that the caches are
created from common code, since this function is SLAB-specific.
Another method of doing that is taking advantage of the fact that
do_tune_cpucache() is always called from enable_cpucache(), which is
called at cache initialization. We can just preset the values, and then
things work as expected.
It can also happen that a root cache has its tunables updated during
normal system operation. In this case, we will propagate the change to
all caches that are already active.
This change will require us to move the assignment of root_cache in
memcg_params a bit earlier. We need this to be already set - which
memcg_kmem_register_cache will do - when we reach __kmem_cache_create()
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we create caches in memcgs, we need to display their usage
information somewhere. We'll adopt a scheme similar to /proc/meminfo,
with aggregate totals shown in the global file, and per-group information
stored in the group itself.
For the time being, only reads are allowed in the per-group cache.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This enables us to remove all the children of a kmem_cache being
destroyed, if for example the kernel module it's being used in gets
unloaded. Otherwise, the children will still point to the destroyed
parent.
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement destruction of memcg caches. Right now, only caches where our
reference counter is the last remaining are deleted. If there are any
other reference counters around, we just leave the caches lying around
until they go away.
When that happens, a destruction function is called from the cache code.
Caches are only destroyed in process context, so we queue them up for
later processing in the general case.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We are able to match a cache allocation to a particular memcg. If the
task doesn't change groups during the allocation itself - a rare event,
this will give us a good picture about who is the first group to touch a
cache page.
This patch uses the now available infrastructure by calling
memcg_kmem_get_cache() before all the cache allocations.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct page already has this information. If we start chaining caches,
this information will always be more trustworthy than whatever is passed
into the function.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Create a mechanism that skip memcg allocations during certain pieces of
our core code. It basically works in the same way as
preempt_disable()/preempt_enable(): By marking a region under which all
allocations will be accounted to the root memcg.
We need this to prevent races in early cache creation, when we
allocate data using caches that are not necessarily created already.
Signed-off-by: Glauber Costa <glommer@parallels.com>
yCc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The page allocator is able to bind a page to a memcg when it is
allocated. But for the caches, we'd like to have as many objects as
possible in a page belonging to the same cache.
This is done in this patch by calling memcg_kmem_get_cache in the
beginning of every allocation function. This function is patched out by
static branches when kernel memory controller is not being used.
It assumes that the task allocating, which determines the memcg in the
page allocator, belongs to the same cgroup throughout the whole process.
Misaccounting can happen if the task calls memcg_kmem_get_cache() while
belonging to a cgroup, and later on changes. This is considered
acceptable, and should only happen upon task migration.
Before the cache is created by the memcg core, there is also a possible
imbalance: the task belongs to a memcg, but the cache being allocated from
is the global cache, since the child cache is not yet guaranteed to be
ready. This case is also fine, since in this case the GFP_KMEMCG will not
be passed and the page allocator will not attempt any cgroup accounting.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Every cache that is considered a root cache (basically the "original"
caches, tied to the root memcg/no-memcg) will have an array that should be
large enough to store a cache pointer per each memcg in the system.
Theoreticaly, this is as high as 1 << sizeof(css_id), which is currently
in the 64k pointers range. Most of the time, we won't be using that much.
What goes in this patch, is a simple scheme to dynamically allocate such
an array, in order to minimize memory usage for memcg caches. Because we
would also like to avoid allocations all the time, at least for now, the
array will only grow. It will tend to be big enough to hold the maximum
number of kmem-limited memcgs ever achieved.
We'll allocate it to be a minimum of 64 kmem-limited memcgs. When we have
more than that, we'll start doubling the size of this array every time the
limit is reached.
Because we are only considering kmem limited memcgs, a natural point for
this to happen is when we write to the limit. At that point, we already
have set_limit_mutex held, so that will become our natural synchronization
mechanism.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow a memcg parameter to be passed during cache creation. When the slub
allocator is being used, it will only merge caches that belong to the same
memcg. We'll do this by scanning the global list, and then translating
the cache to a memcg-specific cache
Default function is created as a wrapper, passing NULL to the memcg
version. We only merge caches that belong to the same memcg.
A helper is provided, memcg_css_id: because slub needs a unique cache name
for sysfs. Since this is visible, but not the canonical location for slab
data, the cache name is not used, the css_id should suffice.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For the kmem slab controller, we need to record some extra information in
the kmem_cache structure.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Because those architectures will draw their stacks directly from the page
allocator, rather than the slab cache, we can directly pass __GFP_KMEMCG
flag, and issue the corresponding free_pages.
This code path is taken when the architecture doesn't define
CONFIG_ARCH_THREAD_INFO_ALLOCATOR (only ia64 seems to), and has
THREAD_SIZE >= PAGE_SIZE. Luckily, most - if not all - of the remaining
architectures fall in this category.
This will guarantee that every stack page is accounted to the memcg the
process currently lives on, and will have the allocations to fail if they
go over limit.
For the time being, I am defining a new variant of THREADINFO_GFP, not to
mess with the other path. Once the slab is also tracked by memcg, we can
get rid of that flag.
Tested to successfully protect against :(){ :|:& };:
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Frederic Weisbecker <fweisbec@redhat.com>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We can use static branches to patch the code in or out when not used.
Because the _ACTIVE bit on kmem_accounted is only set after the increment
is done, we guarantee that the root memcg will always be selected for kmem
charges until all call sites are patched (see memcg_kmem_enabled). This
guarantees that no mischarges are applied.
Static branch decrement happens when the last reference count from the
kmem accounting in memcg dies. This will only happen when the charges
drop down to 0.
When that happens, we need to disable the static branch only on those
memcgs that enabled it. To achieve this, we would be forced to complicate
the code by keeping track of which memcgs were the ones that actually
enabled limits, and which ones got it from its parents.
It is a lot simpler just to do static_key_slow_inc() on every child
that is accounted.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is useful to know how many charges are still left after a call to
res_counter_uncharge. While it is possible to issue a res_counter_read
after uncharge, this can be racy.
If we need, for instance, to take some action when the counters drop down
to 0, only one of the callers should see it. This is the same semantics
as the atomic variables in the kernel.
Since the current return value is void, we don't need to worry about
anything breaking due to this change: nobody relied on that, and only
users appearing from now on will be checking this value.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a process tries to allocate a page with the __GFP_KMEMCG flag, the
page allocator will call the corresponding memcg functions to validate
the allocation. Tasks in the root memcg can always proceed.
To avoid adding markers to the page - and a kmem flag that would
necessarily follow, as much as doing page_cgroup lookups for no reason,
whoever is marking its allocations with __GFP_KMEMCG flag is responsible
for telling the page allocator that this is such an allocation at
free_pages() time. This is done by the invocation of
__free_accounted_pages() and free_accounted_pages().
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce infrastructure for tracking kernel memory pages to a given
memcg. This will happen whenever the caller includes the flag
__GFP_KMEMCG flag, and the task belong to a memcg other than the root.
In memcontrol.h those functions are wrapped in inline acessors. The idea
is to later on, patch those with static branches, so we don't incur any
overhead when no mem cgroups with limited kmem are being used.
Users of this functionality shall interact with the memcg core code
through the following functions:
memcg_kmem_newpage_charge: will return true if the group can handle the
allocation. At this point, struct page is not
yet allocated.
memcg_kmem_commit_charge: will either revert the charge, if struct page
allocation failed, or embed memcg information
into page_cgroup.
memcg_kmem_uncharge_page: called at free time, will revert the charge.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This flag is used to indicate to the callees that this allocation is a
kernel allocation in process context, and should be accounted to current's
memcg.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull second round of input updates from Dmitry Torokhov:
"As usual, there are a couple of new drivers, input core now supports
managed input devices (devres), a slew of drivers now have device tree
support and a bunch of fixes and cleanups."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (71 commits)
Input: walkera0701 - fix crash on startup
Input: matrix-keymap - provide a proper module license
Input: gpio_keys_polled - switch to using gpio_request_one()
Input: gpio_keys - switch to using gpio_request_one()
Input: wacom - fix touch support for Bamboo Fun CTH-461
Input: xpad - add a few new VID/PID combinations
Input: xpad - minor formatting fixes
Input: gpio-keys-polled - honor 'autorepeat' setting in platform data
Input: tca8418-keypad - switch to using managed resources
Input: tca8418_keypad - increase severity of failures in probe()
Input: tca8418_keypad - move device ID tables closer to where they are used
Input: tca8418_keypad - use dev_get_platdata() to retrieve platform data
Input: tca8418_keypad - use a temporary variable for parent device
Input: tca8418_keypad - add support for shared interrupt
Input: tca8418_keypad - add support for device tree bindings
Input: remove Compaq iPAQ H3600 (Bitsy) touchscreen driver
Input: bu21013_ts - add support for Device Tree booting
Input: bu21013_ts - move GPIO init and exit functions into the driver
Input: bu21013_ts - request regulator that actually exists
ARM: ux500: Strip out duplicate touch screen platform information
...