kernel-ark

Author	SHA1	Message	Date
Linus Torvalds	a53b75b37a	Bug fixes and cleanups for ext4. We also enable the punch hole functionality for bigalloc file systems. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJS59knAAoJENNvdpvBGATwdPMQAOqi3O7DdDKt6wUkv6LnYSH5 y5MpvsKlvAL9WmS9Uj/S9VnG0eIakLbdl1DcTrnsqgbCryHPlWC8jOsPn22yanlN T+QGfw+1CkPrionsQVzKH17sFImITrcvz9z0vaX9//imSRmhm8JHMgVaboKnMXK2 pdYdwNUGPlG4sUiaZEgX2lD4q57MUYRmGBracTRA4rC7roxRrZRv4MpRe3vdqC9D RNkUfrv6Bv63JsMVYAMzanMCH0IB+96ZhoRAym82H1U7nyAMvDKFWASf8dHbOXDu 64RQlB55Ehqv4wAJ2JBwkH1vi6NIF/LEysfmksctMRDB+rP6i6HSQMOQ3SsjDUWL Llq6J6raeyfHa0sCDh5A4hQhtOFMuhhbqHWFcfoUgGTI3aSnt7zGuh7/afIw++Go tLzqJFRzCgGdnmihLRbjIy9d6eUeEj2m40bg/QOg89Y3EhTsq0kwVf0TXAOskvx4 kBWH4ZFemfUxfDMNIa9uPN99I4/FlBO/9xGneDb6vYtPB/fA3jnaPYu/LIorF6Sc Mw1wH4Qy3oX5j9VcJlvZPY5cGmBLNyQqEdbOBgi6zeEFF1g2kTMf1PQQ++9rcLlV Cf9qc4IrjAUGbwo59gQtuB8OzhNxkl4mhOTVNZ/PW8G2rpufdqGhndUsjTAPNmKM +tf+CjdJuHzPnj7Ig21u =ofDv -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 update from Ted Ts'o: "Bug fixes and cleanups for ext4. We also enable the punch hole functionality for bigalloc file systems" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: delete "set but not used" variables ext4: don't pass freed handle to ext4_walk_page_buffers ext4: avoid clearing beyond i_blocks when truncating an inline data file ext4: ext4_inode_is_fast_symlink should use EXT4_CLUSTER_SIZE ext4: fix a typo in extents.c ext4: use %pd printk specificer ext4: standardize error handling in ext4_da_write_inline_data_begin() ext4: retry allocation when inline->extent conversion failed ext4: enable punch hole for bigalloc	2014-01-28 08:54:16 -08:00
Linus Torvalds	bf3d846b78	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted stuff; the biggest pile here is Christoph's ACL series. Plus assorted cleanups and fixes all over the place... There will be another pile later this week" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits) __dentry_path() fixes vfs: Remove second variable named error in __dentry_path vfs: Is mounted should be testing mnt_ns for NULL or error. Fix race when checking i_size on direct i/o read hfsplus: remove can_set_xattr nfsd: use get_acl and ->set_acl fs: remove generic_acl nfs: use generic posix ACL infrastructure for v3 Posix ACLs gfs2: use generic posix ACL infrastructure jfs: use generic posix ACL infrastructure xfs: use generic posix ACL infrastructure reiserfs: use generic posix ACL infrastructure ocfs2: use generic posix ACL infrastructure jffs2: use generic posix ACL infrastructure hfsplus: use generic posix ACL infrastructure f2fs: use generic posix ACL infrastructure ext2/3/4: use generic posix ACL infrastructure btrfs: use generic posix ACL infrastructure fs: make posix_acl_create more useful fs: make posix_acl_chmod more useful ...	2014-01-28 08:38:04 -08:00
Christoph Hellwig	64e178a711	ext2/3/4: use generic posix ACL infrastructure Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 23:58:19 -05:00
Christoph Hellwig	37bc15392a	fs: make posix_acl_create more useful Rename the current posix_acl_created to __posix_acl_create and add a fully featured helper to set up the ACLs on file creation that uses get_acl(). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 23:58:18 -05:00
Christoph Hellwig	5bf3258fd2	fs: make posix_acl_chmod more useful Rename the current posix_acl_chmod to __posix_acl_chmod and add a fully featured ACL chmod helper that uses the ->set_acl inode operation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 23:58:18 -05:00
Cody P Schafer	d1866bd061	fs/ext4: use rbtree postorder iteration helper instead of opencoding Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Michel Lespinasse <walken@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:37:03 -08:00
Linus Torvalds	1d32bdafaa	xfs: update for v3.14-rc1 For 3.14-rc1 there are fixes in the areas of remote attributes, discard, growfs, memory leaks in recovery, directory v2, quotas, the MAINTAINERS file, allocation alignment, extent list locking, and in xfs_bmapi_allocate. There are cleanups in xfs_setsize_buftarg, removing unused macros, quotas, setattr, and freeing of inode clusters. The in-memory and on-disk log format have been decoupled, a common helper to calculate the number of blocks in an inode cluster has been added, and handling of i_version has been pulled into the filesystems that use it. - cleanup in xfs_setsize_buftarg - removal of remaining unused flags for vop toss/flush/flushinval - fix for memory corruption in xfs_attrlist_by_handle - fix for out-of-date comment in xfs_trans_dqlockedjoin - fix for discard if range length is less than one block - fix for overrun of agfl buffer using growfs on v4 superblock filesystems - pull i_version handling out into the filesystems that use it - don't leak recovery items on error - fix for memory leak in xfs_dir2_node_removename - several cleanups for quotas - fix bad assertion in xfs_qm_vop_create_dqattach - cleanup for xfs_setattr_mode, and add xfs_setattr_time - fix quota assert in xfs_setattr_nonsize - fix an infinite loop when turning off group/project quota before user quota - fix for temporary buffer allocation failure in xfs_dir2_block_to_sf with large directory block sizes - fix Dave's email address in MAINTAINERS - cleanup calculation of freed inode cluster blocks - fix alignment of initial file allocations to match filesystem geometry - decouple in-memory and on-disk log format - introduce a common helper to calculate the number of filesystem blocks in an inode cluster - fixes for extent list locking - fix for off-by-one in xfs_attr3_rmt_verify - fix for missing destroy_work_on_stack in xfs_bmapi_allocate -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJS4C2sAAoJENaLyazVq6ZOF/UP/A/FmscsEbgz+KHtsg1UXP+g EMkrD8WOOzLnm7/GhkfvmmFLQrWrrNPmiP54MPJ/rxpfDb7rV60HgS0iHm13U+IR 0uPyk2NIQKG/Sj6FHCrgn4oh19B0tmer6lLE32UPrlNM7w/0NRm32Ms/jKz7WNvc 9Tqw3qNdtmP7leHG8EyD1ISzqUz6FNSC+qGyOTuBQUqG24LqsmZ1qD2Nw/HEz0ir 1YCqS+cjj8c3WaWMsdjEFdCvEIKS/sMYM+ZihATIl/5ggwtVobP7PH/4cG8Biby3 5wvcl3/jM+xjgGDC1YMQQeSADieus6ER4aZTjCvGLn9RajQTOBjP0QZTu2OHntTI kD/52DfYErthGijS2qJcqXy11AaYtWQU2yTZXuMpaACIM1DDSD4NyGFP99BzbBaX D4BgnC+vmfpbXO37PmophkeZwAvj+9K2BBG7X+g4sLynj/BtmvFCxIyK+SLHE3hN kn+Pn03yTw0VGgvj0krXSllbYKE0LyLElaA6h0LejQgcOZM0W6Q8qaj+RODLitbw HkQNKYCOMCzbdNu8rKAfup9UPZloiMukyMdMaw1MeCgCQSQLkZVxTegmVW9gzpIh QJEARDaOnKB/G/CLCwbPZR09DBpg3yBt/eHjt0VnV+z2x+u9DTInAp6f013g7E/W gU+vnBBPY1AYI8iiDraU =t8nB -----END PGP SIGNATURE----- Merge tag 'xfs-for-linus-v3.14-rc1' of git://oss.sgi.com/xfs/xfs Pull xfs update from Ben Myers: "This is primarily bug fixes, many of which you already have. New stuff includes a series to decouple the in-memory and on-disk log format, helpers in the area of inode clusters, and i_version handling. We decided to try to use more topic branches this release, so there are some merge commits in there on account of that. I'm afraid I didn't do a good job of putting meaningful comments in the first couple of merges. Sorry about that. I think I have the hang of it now. For 3.14-rc1 there are fixes in the areas of remote attributes, discard, growfs, memory leaks in recovery, directory v2, quotas, the MAINTAINERS file, allocation alignment, extent list locking, and in xfs_bmapi_allocate. There are cleanups in xfs_setsize_buftarg, removing unused macros, quotas, setattr, and freeing of inode clusters. The in-memory and on-disk log format have been decoupled, a common helper to calculate the number of blocks in an inode cluster has been added, and handling of i_version has been pulled into the filesystems that use it. - cleanup in xfs_setsize_buftarg - removal of remaining unused flags for vop toss/flush/flushinval - fix for memory corruption in xfs_attrlist_by_handle - fix for out-of-date comment in xfs_trans_dqlockedjoin - fix for discard if range length is less than one block - fix for overrun of agfl buffer using growfs on v4 superblock filesystems - pull i_version handling out into the filesystems that use it - don't leak recovery items on error - fix for memory leak in xfs_dir2_node_removename - several cleanups for quotas - fix bad assertion in xfs_qm_vop_create_dqattach - cleanup for xfs_setattr_mode, and add xfs_setattr_time - fix quota assert in xfs_setattr_nonsize - fix an infinite loop when turning off group/project quota before user quota - fix for temporary buffer allocation failure in xfs_dir2_block_to_sf with large directory block sizes - fix Dave's email address in MAINTAINERS - cleanup calculation of freed inode cluster blocks - fix alignment of initial file allocations to match filesystem geometry - decouple in-memory and on-disk log format - introduce a common helper to calculate the number of filesystem blocks in an inode cluster - fixes for extent list locking - fix for off-by-one in xfs_attr3_rmt_verify - fix for missing destroy_work_on_stack in xfs_bmapi_allocate" * tag 'xfs-for-linus-v3.14-rc1' of git://oss.sgi.com/xfs/xfs: (51 commits) xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK() xfs: fix off-by-one error in xfs_attr3_rmt_verify xfs: assert that we hold the ilock for extent map access xfs: use xfs_ilock_attr_map_shared in xfs_attr_list_int xfs: use xfs_ilock_attr_map_shared in xfs_attr_get xfs: use xfs_ilock_data_map_shared in xfs_qm_dqiterate xfs: use xfs_ilock_data_map_shared in xfs_qm_dqtobp xfs: take the ilock around xfs_bmapi_read in xfs_zero_remaining_bytes xfs: reinstate the ilock in xfs_readdir xfs: add xfs_ilock_attr_map_shared xfs: rename xfs_ilock_map_shared xfs: remove xfs_iunlock_map_shared xfs: no need to lock the inode in xfs_find_handle xfs: use xfs_icluster_size_fsb in xfs_imap xfs: use xfs_icluster_size_fsb in xfs_ifree_cluster xfs: use xfs_icluster_size_fsb in xfs_ialloc_inode_init xfs: use xfs_icluster_size_fsb in xfs_bulkstat xfs: introduce a common helper xfs_icluster_size_fsb xfs: get rid of XFS_IALLOC_BLOCKS macros xfs: get rid of XFS_INODE_CLUSTER_SIZE macros ...	2014-01-23 09:16:20 -08:00
jon ernst	d7092ae297	ext4: delete "set but not used" variables Signed-off-by: Jon Ernst <jonernst07@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>	2014-01-11 13:26:56 -05:00
Theodore Ts'o	8c9367fd9b	ext4: don't pass freed handle to ext4_walk_page_buffers This is harmless, since ext4_walk_page_buffers only passes the handle onto the callback function, and in this call site the function in question, bput_one(), doesn't actually use the handle. But there's no point passing in an invalid handle, and it creates a Coverity warning, so let's just clean it up. Addresses-Coverity-Id: #1091168 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-01-07 13:08:03 -05:00
Theodore Ts'o	09c455aaa8	ext4: avoid clearing beyond i_blocks when truncating an inline data file A missing cast means that when we are truncating a file which is less than 60 bytes, we don't clear the correct area of memory, and in fact we can end up truncating the next inode in the inode table, or worse yet, some other kernel data structure. Addresses-Coverity-Id: #751987 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2014-01-07 12:58:19 -05:00
Yongqiang Yang	65eddb56f4	ext4: ext4_inode_is_fast_symlink should use EXT4_CLUSTER_SIZE Can be reproduced by xfstests 62 with bigalloc and 128bit size inode. Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>	2014-01-06 14:06:18 -05:00
Yongqiang Yang	9e740568bc	ext4: fix a typo in extents.c Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>	2014-01-06 14:05:23 -05:00
David Howells	a34e15cc35	ext4: use %pd printk specificer Use the new %pd printk() specifier in Ext4 to replace passing of dentry name or dentry name and name length * 2 with just passing the dentry. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> cc: Andreas Dilger <adilger.kernel@dilger.ca> cc: linux-ext4@vger.kernel.org	2014-01-06 14:04:23 -05:00
Jan Kara	52e4477758	ext4: standardize error handling in ext4_da_write_inline_data_begin() The function has a bit non-standard (for ext4) error recovery in that it used a mix of 'out' labels and testing for 'handle' being NULL. There isn't a good reason for that in the function so clean it up a bit. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-01-06 14:03:23 -05:00
Jan Kara	bc0ca9df3b	ext4: retry allocation when inline->extent conversion failed Similarly as other ->write_begin functions in ext4, also ext4_da_write_inline_data_begin() should retry allocation if the conversion failed because of ENOSPC. This avoids returning ENOSPC prematurely because of uncommitted block deletions. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-01-06 14:02:23 -05:00
Zheng Liu	9cb00419fa	ext4: enable punch hole for bigalloc After applied this commit (`d23142c6`), ext4 has supported punch hole for a file system with bigalloc feature. But we forgot to enable it. This commit fixes it. Cc: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-01-06 14:01:23 -05:00
Eric Whitney	d0abafac8c	ext4: fix bigalloc regression Commit `f5a44db5d2` introduced a regression on filesystems created with the bigalloc feature (cluster size > blocksize). It causes xfstests generic/006 and /013 to fail with an unexpected JBD2 failure and transaction abort that leaves the test file system in a read only state. Other xfstests run on bigalloc file systems are likely to fail as well. The cause is the accidental use of a cluster mask where a cluster offset was needed in ext4_ext_map_blocks(). Signed-off-by: Eric Whitney <enwlinux@gmail.com>	2014-01-06 14:00:23 -05:00
Theodore Ts'o	f5a44db5d2	ext4: add explicit casts when masking cluster sizes The missing casts can cause the high 64-bits of the physical blocks to be lost. Set up new macros which allows us to make sure the right thing happen, even if at some point we end up supporting larger logical block numbers. Thanks to the Emese Revfy and the PaX security team for reporting this issue. Reported-by: PaX Team <pageexec@freemail.hu> Reported-by: Emese Revfy <re.emese@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-20 09:29:35 -05:00
Jan Kara	34cf865d54	ext4: fix deadlock when writing in ENOSPC conditions Akira-san has been reporting rare deadlocks of his machine when running xfstests test 269 on ext4 filesystem. The problem turned out to be in ext4_da_reserve_metadata() and ext4_da_reserve_space() which called ext4_should_retry_alloc() while holding i_data_sem. Since ext4_should_retry_alloc() can force a transaction commit, this is a lock ordering violation and leads to deadlocks. Fix the problem by just removing the retry loops. These functions should just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that function must take care of retrying after dropping all necessary locks. Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-18 00:44:44 -05:00
Jan Kara	30fac0f75d	ext4: Do not reserve clusters when fs doesn't support extents When the filesystem doesn't support extents (like in ext2/3 compatibility modes), there is no need to reserve any clusters. Space estimates for writing are exact, hole punching doesn't need new metadata, and there are no unwritten extents to convert. This fixes a problem when filesystem still having some free space when accessed with a native ext2/3 driver suddently reports ENOSPC when accessed with ext4 driver. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-08 21:11:59 -05:00
Al Viro	9105bb149b	ext4: fix del_timer() misuse for ->s_err_report That thing should be del_timer_sync(); consider what happens if ext4_put_super() call of del_timer() happens to come just as it's getting run on another CPU. Since that timer reschedules itself to run next day, you are pretty much guaranteed that you'll end up with kfree'd scheduled timer, with usual fun consequences. AFAICS, that's -stable fodder all way back to 2010... [the second del_timer_sync() is almost certainly not needed, but it doesn't hurt either] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-08 20:52:31 -05:00
Christoph Hellwig	dff6efc326	fs: fix iversion handling Currently notify_change directly updates i_version for size updates, which not only is counter to how all other fields are updated through struct iattr, but also breaks XFS, which need inode updates to happen under its own lock, and synchronized to the structure that gets written to the log. Remove the update in the common code, and it to btrfs and ext4, XFS already does a proper updaste internally and currently gets a double update with the existing code. IMHO this is 3.13 and -stable material and should go in through the XFS tree. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Acked-by: Jan Kara <jack@suse.cz> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-12-05 16:36:21 -06:00
Eryu Guan	5946d08937	ext4: check for overlapping extents in ext4_valid_extent_entries() A corrupted ext4 may have out of order leaf extents, i.e. extent: lblk 0--1023, len 1024, pblk 9217, flags: LEAF UNINIT extent: lblk 1000--2047, len 1024, pblk 10241, flags: LEAF UNINIT ^^^^ overlap with previous extent Reading such extent could hit BUG_ON() in ext4_es_cache_extent(). BUG_ON(end < lblk); The problem is that __read_extent_tree_block() tries to cache holes as well but assumes 'lblk' is greater than 'prev' and passes underflowed length to ext4_es_cache_extent(). Fix it by checking for overlapping extents in ext4_valid_extent_entries(). I hit this when fuzz testing ext4, and am able to reproduce it by modifying the on-disk extent by hand. Also add the check for (ee_block + len - 1) in ext4_valid_extent() to make sure the value is not overflow. Ran xfstests on patched ext4 and no regression. Cc: Lukáš Czerner <lczerner@redhat.com> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-03 21:22:21 -05:00
Junho Ryu	4e8d213980	ext4: fix use-after-free in ext4_mb_new_blocks ext4_mb_put_pa should hold pa->pa_lock before accessing pa->pa_count. While ext4_mb_use_preallocated checks pa->pa_deleted first and then increments pa->count later, ext4_mb_put_pa decrements pa->pa_count before holding pa->pa_lock and then sets pa->pa_deleted. * Free sequence ext4_mb_put_pa (1): atomic_dec_and_test pa->pa_count ext4_mb_put_pa (2): lock pa->pa_lock ext4_mb_put_pa (3): check pa->pa_deleted ext4_mb_put_pa (4): set pa->pa_deleted=1 ext4_mb_put_pa (5): unlock pa->pa_lock ext4_mb_put_pa (6): remove pa from a list ext4_mb_pa_callback: free pa * Use sequence ext4_mb_use_preallocated (1): iterate over preallocation ext4_mb_use_preallocated (2): lock pa->pa_lock ext4_mb_use_preallocated (3): check pa->pa_deleted ext4_mb_use_preallocated (4): increase pa->pa_count ext4_mb_use_preallocated (5): unlock pa->pa_lock ext4_mb_release_context: access pa * Use-after-free sequence [initial status] <pa->pa_deleted = 0, pa_count = 1> ext4_mb_use_preallocated (1): iterate over preallocation ext4_mb_use_preallocated (2): lock pa->pa_lock ext4_mb_use_preallocated (3): check pa->pa_deleted ext4_mb_put_pa (1): atomic_dec_and_test pa->pa_count [pa_count decremented] <pa->pa_deleted = 0, pa_count = 0> ext4_mb_use_preallocated (4): increase pa->pa_count [pa_count incremented] <pa->pa_deleted = 0, pa_count = 1> ext4_mb_use_preallocated (5): unlock pa->pa_lock ext4_mb_put_pa (2): lock pa->pa_lock ext4_mb_put_pa (3): check pa->pa_deleted ext4_mb_put_pa (4): set pa->pa_deleted=1 [race condition!] <pa->pa_deleted = 1, pa_count = 1> ext4_mb_put_pa (5): unlock pa->pa_lock ext4_mb_put_pa (6): remove pa from a list ext4_mb_pa_callback: free pa ext4_mb_release_context: access pa AddressSanitizer has detected use-after-free in ext4_mb_new_blocks Bug report: http://goo.gl/rG1On3 Signed-off-by: Junho Ryu <jayr@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-03 18:10:28 -05:00
Theodore Ts'o	ae1495b12d	ext4: call ext4_error_inode() if jbd2_journal_dirty_metadata() fails While it's true that errors can only happen if there is a bug in jbd2_journal_dirty_metadata(), if a bug does happen, we need to halt the kernel or remount the file system read-only in order to avoid further data loss. The ext4_journal_abort_handle() function doesn't do any of this, and while it's likely that this call (since it doesn't adjust refcounts) will likely result in the file system eventually deadlocking since the current transaction will never be able to close, it's much cleaner to call let ext4's error handling system deal with this situation. There's a separate bug here which is that if certain jbd2 errors errors occur and file system is mounted errors=continue, the file system will probably eventually end grind to a halt as described above. But things have been this way in a long time, and usually when we have these sorts of errors it's pretty much a disaster --- and that's why the jbd2 layer aggressively retries memory allocations, which is the most likely cause of these jbd2 errors. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org	2013-12-02 09:31:36 -05:00
Linus Torvalds	4fbf888acc	Ext4 updates for 3.13. Mostly bug fixes and cleanups. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABCAAGBQJSgzHaAAoJENNvdpvBGATwPKkQAIydz5qBwyFyj3S2JieCe0L4 kIqy4QNJtRCPtDKid3aHBjrFmY1QB0bwqFGC/m5zFBlLGxVoPpRB+w01pjZ0uLBl DYDdefsQyjmzEDkcawQsQGuvyXIMJPhD7EGKDgpzxOJUBSC9hytrnfGAue1zzFt1 GVBtjpoJAfS32mxf8Kl/N0biFF3VasFkuvm8gtby2eXM06n5O5/yy/5k8Pv6X/3r ZhfYRq4nUZ2vUQx7Tw5x9Bn0IR5xMDnfj2Bs/gfPBo+GbdhJQzdPK1+l4e03Vp62 sAO9J8Pz+hZs6e7QfZ+FwiBQY8uHVkgTJ/rPWfTEY3bGxvPahfKrFuqvPLh2flpu RVdO8Dxw9NuMqR96mpMxQua7XtnJBknytr82QfVzoYBPIkBMwKUzUKoYxob/FcL3 ztdnnj0YygOan/E/unZn1mnIz6RyO1YjmL7S150kGtyUYWAI/pVHWufHqgY6gUqO 4AQLLPEOXnF9+OITLGgnncWNunimPl0VWCCUyvOLiFK9n3WVFZibR7JDBKM4PzAb pYfgIlySofadALQefs8G1IV+k+3GuiobdmdVUN1GZ609julBi82DhQz31tHy32mO gE0YJVmorvWNO+ap3Sf4UJ6Bk5of29xtSx+cTBQ0CNhs+TmWjT25iTuLAvbvxWnh FUsRfzLrXy0SqZ6+gm5q =F4g3 -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 changes from Ted Ts'o: "Ext4 updates for 3.13. Mostly bug fixes and cleanups" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: add prototypes for macro-generated functions ext4: return non-zero st_blocks for inline data ext4: use prandom_u32() instead of get_random_bytes() ext4: remove unreachable code after ext4_can_extents_be_merged() ext4: remove unreachable code in ext4_can_extents_be_merged() ext4: avoid bh leak in retry path of ext4_expand_extra_isize_ea() ext4: don't count free clusters from a corrupt block group ext4: fix FITRIM in no journal mode ext4: drop set but otherwise unused variable from ext4_add_dirent_to_inline() ext4: change ext4_read_inline_dir() to return 0 on success ext4: pair trace_ext4_writepages & trace_ext4_writepages_result ext4: add ratelimiting to ext4 messages ext4: fix performance regression in ext4_writepages ext4: fixup kerndoc annotation of mpage_map_and_submit_extent() ext4: fix assertion in ext4_add_complete_io()	2013-11-14 17:19:58 +09:00
Linus Torvalds	9bc9ccd7db	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "All kinds of stuff this time around; some more notable parts: - RCU'd vfsmounts handling - new primitives for coredump handling - files_lock is gone - Bruce's delegations handling series - exportfs fixes plus misc stuff all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits) ecryptfs: ->f_op is never NULL locks: break delegations on any attribute modification locks: break delegations on link locks: break delegations on rename locks: helper functions for delegation breaking locks: break delegations on unlink namei: minor vfs_unlink cleanup locks: implement delegations locks: introduce new FL_DELEG lock flag vfs: take i_mutex on renamed file vfs: rename I_MUTEX_QUOTA now that it's not used for quotas vfs: don't use PARENT/CHILD lock classes for non-directories vfs: pull ext4's double-i_mutex-locking into common code exportfs: fix quadratic behavior in filehandle lookup exportfs: better variable name exportfs: move most of reconnect_path to helper function exportfs: eliminate unused "noprogress" counter exportfs: stop retrying once we race with rename/remove exportfs: clear DISCONNECTED on all parents sooner exportfs: more detailed comment for path_reconnect ...	2013-11-13 15:34:18 +09:00
Andreas Dilger	3f61c0cc70	ext4: add prototypes for macro-generated functions It isn't very easy to find the declarations for the functions created by EXT4_INODE_BIT_FNS() because the names are generated by macros: ext4_test_inode_flag, ext4_set_inode_flag, ext4_clear_inode_flag ext4_test_inode_state, ext4_set_inode_state, ext4_clear_inode_state Add explicit declarations for these functions so that grep and tags can find them. Signed-off-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-11 22:40:40 -05:00
Andreas Dilger	9206c56155	ext4: return non-zero st_blocks for inline data Return a non-zero st_blocks to userspace for statfs() and friends. Some versions of tar will assume that files with st_blocks == 0 do not contain any data and will skip reading them entirely. Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-11 22:38:12 -05:00
J. Bruce Fields	375e289ea8	vfs: pull ext4's double-i_mutex-locking into common code We want to do this elsewhere as well. Also catch any attempts to use it for directories (where this ordering would conflict with ancestor-first directory ordering in lock_rename). Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Acked-by: Jeff Layton <jlayton@redhat.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:39 -05:00
Theodore Ts'o	dd1f723bf5	ext4: use prandom_u32() instead of get_random_bytes() Many of the uses of get_random_bytes() do not actually need cryptographically secure random numbers. Replace those uses with a call to prandom_u32(), which is faster and which doesn't consume entropy from the /dev/random driver. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-08 00:14:53 -05:00
Eric Sandeen	f275411440	ext4: remove unreachable code after ext4_can_extents_be_merged() Commit `ec22ba8e` ("ext4: disable merging of uninitialized extents") ensured that if either extent under consideration is uninit, we decline to merge, and ext4_can_extents_be_merged() returns false. So there is no need for the caller to then test whether the extent under consideration is unitialized; if it were, we wouldn't have gotten that far. The comments were also inaccurate; ext4_can_extents_be_merged() no longer XORs the states, it fails if either is uninit. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>	2013-11-07 22:22:08 -05:00
Eric Sandeen	da0169b3b9	ext4: remove unreachable code in ext4_can_extents_be_merged() Commit `ec22ba8e` ("ext4: disable merging of uninitialized extents") ensured that if either extent under consideration is uninit, we decline to merge, and immediately return. But right after that test, we test again for an uninit extent; we can never hit this. So just remove the impossible test and associated variable. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>	2013-11-04 09:58:26 -05:00
Theodore Ts'o	dcb9917ba0	ext4: avoid bh leak in retry path of ext4_expand_extra_isize_ea() Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-10-31 23:00:24 -04:00
Darrick J. Wong	2746f7a170	ext4: don't count free clusters from a corrupt block group A bg that's been flagged "corrupt" by definition has no free blocks, so that the allocator won't be tempted to use the damaged bg. Therefore, we shouldn't count the clusters in the damaged group when calculating free counts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>	2013-10-31 11:46:31 -04:00
Lukas Czerner	8f9ff18920	ext4: fix FITRIM in no journal mode When using FITRIM ioctl on a file system without journal it will only trim the block group once, no matter how many times you invoke FITRIM ioctl and how many block you release from the block group. It is because we only clear EXT4_GROUP_INFO_WAS_TRIMMED_BIT in journal callback. Fix this by clearing the bit in no journal mode as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Jorge Fábregas <jorge.fabregas@gmail.com>	2013-10-30 11:10:52 -04:00
Azat Khuzhin	5ba052fe33	ext4: drop set but otherwise unused variable from ext4_add_dirent_to_inline() Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-30 10:53:10 -04:00
BoxiLiu	48ffdab1c1	ext4: change ext4_read_inline_dir() to return 0 on success In ext4_read_inline_dir(), if there is inline data, the successful return value is the return value of ext4_read_inline_data(). Howewer, this is used by ext4_readdir(), and while it seems harmless to return a positive value on success, it's inconsistent, since historically we've always return 0 on success. Signed-off-by: BoxiLiu <lewis.liulei@huawei.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Tao Ma <boyu.mt@taobao.com>	2013-10-30 08:07:20 -04:00
Ming Lei	bbf023c74d	ext4: pair trace_ext4_writepages & trace_ext4_writepages_result Pair the two trace events to make troubeshooting writepages easier, and it should be more convinient to write a simple script to parse the traces. Cc: linux-ext4@vger.kernel.org Cc: Jan Kara <jack@suse.cz> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-30 07:27:16 -04:00
Theodore Ts'o	efbed4dc58	ext4: add ratelimiting to ext4 messages In the case of a storage device that suddenly disappears, or in the case of significant file system corruption, this can result in a huge flood of messages being sent to the console. This can overflow the file system containing /var/log/messages, or if a serial console is configured, this can slow down the system so much that a hardware watchdog can end up triggering forcing a system reboot. Google-Bug-Id: `7258357` Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-17 21:11:01 -04:00
Ming Lei	aeac589a74	ext4: fix performance regression in ext4_writepages Commit 4e7ea81db5(ext4: restructure writeback path) introduces another performance regression on random write: - one more page may be added to ext4 extent in mpage_prepare_extent_to_map, and will be submitted for I/O so nr_to_write will become -1 before 'done' is set - the worse thing is that dirty pages may still be retrieved from page cache after nr_to_write becomes negative, so lots of small chunks can be submitted to block device when page writeback is catching up with write path, and performance is hurted. On one arm A15 board with sata 3.0 SSD(CPU: 1.5GHz dura core, RAM: 2GB, SATA controller: 3.0Gbps), this patch can improve below test's result from 157MB/sec to 174MB/sec(>10%): dd if=/dev/zero of=./z.img bs=8K count=512K The above test is actually prototype of block write in bonnie++ utility. This patch makes sure no more pages than nr_to_write can be added to extent for mapping, so that nr_to_write won't become negative. Cc: linux-ext4@vger.kernel.org Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-17 18:56:16 -04:00
Linus Torvalds	0056019da4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull tmpfile fix from Al Viro: "A fix for double iput() in ->tmpfile() on ext3 and ext4; I'd fucked it up, Miklos has caught it" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ext[34]: fix double put in tmpfile	2013-10-16 17:18:18 -07:00
Jan Kara	7534e854b9	ext4: fixup kerndoc annotation of mpage_map_and_submit_extent() Document give_up_on_write argument of mpage_map_and_submit_extent(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-16 08:26:08 -04:00
Jan Kara	78371a45df	ext4: fix assertion in ext4_add_complete_io() It doesn't make sense to require io_end->handle when we are in nojournal mode. So update the assertion accordingly to avoid false warnings from ext4_add_complete_io(). Reported-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-16 08:25:11 -04:00
Miklos Szeredi	43ae9e3fc7	ext[34]: fix double put in tmpfile d_tmpfile() already swallowed the inode ref. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-10-15 12:14:06 -04:00
Linus Torvalds	be5090da4a	A bug fix and performance regression fix for ext4. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABCAAGBQJSWZeNAAoJENNvdpvBGATwx6kP/2mVlKlNBXVfGUmLVP3Xb68v 4JhBzlC3ra3TRqVkw6C6kx4fbdq0cW/mDkecmYg2s+aDnswG/94/+yRdU4kQkyne iqN22ZYA7CumZgJvR0Z2ptWksDRpv8H5twgdVbPtad6/2cKmjseUraPo7YZhjDCe O9eRCXyVII305soAddZzZUgWOWCSWpdTW5zBitKaGq5x/K//rY9UlPVSuAo+9KPZ vyBiKJ1R6fDbtyH7JhCdXydMPKzlAPmyqYBQGLyq2GsRsXDp/VljGci6QN0iuZ5k lZsxFg8q0P6/R4Pjr3DDtE0tUbPXEyMxuquh/m4b3pAXRoMMCynyLP2zy7Gc7ec0 ek2ty+sVG06JjseqigHSmS/a+PdZgDY5xEMKhaK4X38lxRPb7apNktolXxxEt6eU OPZsuvma1g+lbkkCdRO5FVwMllb7cuPhuZPGyxZvmP+ON59oT5QOVsDC+55WnHNs Ib11PCTN93Mwhrm1YPNWVV+gWG50eLZQYJam6H4mE4knaXnba6htEhYrdNczoFH4 lcHaJzCDJLnYVRRbKXKdLSSnyz1X9cYJBP9g5ks1iNy7/JreF7WoIAOWvZWCp432 7NC0IOmV4Q4itiCTcSh85rGlsXU8ZA7wK5HILhp9qZmNkw30OMvihNoWoTFiWTJR mVCkm+isBbqMP0nhV5km =ZwlW -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "A bug fix and performance regression fix for ext4" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix memory leak in xattr ext4: fix performance regression in writeback of random writes	2013-10-12 12:55:15 -07:00
Dave Jones	6e4ea8e33b	ext4: fix memory leak in xattr If we take the 2nd retry path in ext4_expand_extra_isize_ea, we potentionally return from the function without having freed these allocations. If we don't do the return, we over-write the previous allocation pointers, so we leak either way. Spotted with Coverity. [ Fixed by tytso to set is and bs to NULL after freeing these pointers, in case in the retry loop we later end up triggering an error causing a jump to cleanup, at which point we could have a double free bug. -- Ted ] Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Cc: stable@vger.kernel.org	2013-10-12 14:39:49 -04:00
Jan Kara	9c12a831d7	ext4: fix performance regression in writeback of random writes The Linux Kernel Performance project guys have reported that commit `4e7ea81db5` introduces a performance regression for the following fio workload: [global] direct=0 ioengine=mmap size=1500M bs=4k pre_read=1 numjobs=1 overwrite=1 loops=5 runtime=300 group_reporting invalidate=0 directory=/mnt/ file_service_type=random:36 file_service_type=random:36 [job0] startdelay=0 rw=randrw filename=data0/f1:data0/f2 [job1] startdelay=0 rw=randrw filename=data0/f2:data0/f1 ... [job7] startdelay=0 rw=randrw filename=data0/f2:data0/f1 The culprit of the problem is that after the commit ext4_writepages() are more aggressive in writing back pages. Thus we have less consecutive dirty pages resulting in more seeking. This increased aggressivity is caused by a bug in the condition terminating ext4_writepages(). We start writing from the beginning of the file even if we should have terminated ext4_writepages() because wbc->nr_to_write <= 0. After fixing the condition the throughput of the fio workload is about 20% better than before writeback reorganization. Reported-by: "Yan, Zheng" <zheng.z.yan@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-09-16 08:24:26 -04:00
Linus Torvalds	ac4de9543a	Merge branch 'akpm' (patches from Andrew Morton) Merge more patches from Andrew Morton: "The rest of MM. Plus one misc cleanup" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (35 commits) mm/Kconfig: add MMU dependency for MIGRATION. kernel: replace strict_strto() with kstrto() mm, thp: count thp_fault_fallback anytime thp fault fails thp: consolidate code between handle_mm_fault() and do_huge_pmd_anonymous_page() thp: do_huge_pmd_anonymous_page() cleanup thp: move maybe_pmd_mkwrite() out of mk_huge_pmd() mm: cleanup add_to_page_cache_locked() thp: account anon transparent huge pages into NR_ANON_PAGES truncate: drop 'oldsize' truncate_pagecache() parameter mm: make lru_add_drain_all() selective memcg: document cgroup dirty/writeback memory statistics memcg: add per cgroup writeback pages accounting memcg: check for proper lock held in mem_cgroup_update_page_stat memcg: remove MEMCG_NR_FILE_MAPPED memcg: reduce function dereference memcg: avoid overflow caused by PAGE_ALIGN memcg: rename RESOURCE_MAX to RES_COUNTER_MAX memcg: correct RESOURCE_MAX to ULLONG_MAX mm: memcg: do not trap chargers with full callstack on OOM mm: memcg: rework and document OOM waiting and wakeup ...	2013-09-12 15:44:27 -07:00
Kirill A. Shutemov	7caef26767	truncate: drop 'oldsize' truncate_pagecache() parameter truncate_pagecache() doesn't care about old size since commit `cedabed49b` ("vfs: Fix vmtruncate() regression"). Let's drop it. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-12 15:38:02 -07:00

1 2 3 4 5 ...

2128 Commits