kernel-ark

Author	SHA1	Message	Date
Mikulas Patocka	7acedc5b98	dm exception store: fix misordered writes We must zero the next chunk on disk before writing out the current chunk, not after. Otherwise if the machine crashes at the wrong time, the "end of metadata" marker may be missing. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: stable@kernel.org	2008-10-21 17:44:56 +01:00
Alasdair G Kergon	7c9e6c1732	dm exception store: refactor zero_area Use a separate buffer for writing zeroes to the on-disk snapshot exception store, make the updating of ps->current_area explicit and refactor the code in preparation for the fix in the next patch. No functional change. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@kernel.org	2008-10-21 17:44:55 +01:00
Mikulas Patocka	f68d4f3d39	dm snapshot: drop unused last_percent The last_percent field is unused - remove it. (It dates from when events were triggered as each X% filled up.) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-21 17:44:53 +01:00
Mikulas Patocka	7c5f78b9d7	dm snapshot: fix primary_pe race Fix a race condition with primary_pe ref_count handling. put_pending_exception runs under dm_snapshot->lock, it does atomic_dec_and_test on primary_pe->ref_count, and later does atomic_read primary_pe->ref_count. __origin_write does atomic_dec_and_test on primary_pe->ref_count without holding dm_snapshot->lock. This opens the following race condition: Assume two CPUs, CPU1 is executing put_pending_exception (and holding dm_snapshot->lock). CPU2 is executing __origin_write in parallel. primary_pe->ref_count == 2. CPU1: if (primary_pe && atomic_dec_and_test(&primary_pe->ref_count)) origin_bios = bio_list_get(&primary_pe->origin_bios); ... decrements primary_pe->ref_count to 1. Doesn't load origin_bios CPU2: if (first && atomic_dec_and_test(&primary_pe->ref_count)) { flush_bios(bio_list_get(&primary_pe->origin_bios)); free_pending_exception(primary_pe); /* If we got here, pe_queue is necessarily empty. */ return r; } ... decrements primary_pe->ref_count to 0, submits pending bios, frees primary_pe. CPU1: if (!primary_pe || primary_pe != pe) free_pending_exception(pe); ... this has no effect. if (primary_pe && !atomic_read(&primary_pe->ref_count)) free_pending_exception(primary_pe); ... sees ref_count == 0 (written by CPU 2), does double free !! This bug can happen only if someone is simultaneously writing to both the origin and the snapshot. If someone is writing only to the origin, __origin_write will submit kcopyd request after it decrements primary_pe->ref_count (so it can't happen that the finished copy races with primary_pe->ref_count decrementation). If someone is writing only to the snapshot, __origin_write isn't invoked at all and the race can't happen. The race happens when someone writes to the snapshot --- this creates pending_exception with primary_pe == NULL and starts copying.
Then, someone writes to the same chunk in the snapshot, and __origin_write races with termination of already submitted request in pending_complete (that calls put_pending_exception). This race may be reason for bugs: http://bugzilla.kernel.org/show_bug.cgi?id=11636 https://bugzilla.redhat.com/show_bug.cgi?id=465825 The patch fixes the code to make sure that: 1. If atomic_dec_and_test(&primary_pe->ref_count) returns false, the process must no longer dereference primary_pe (because someone else may free it under us). 2. If atomic_dec_and_test(&primary_pe->ref_count) returns true, the process is responsible for freeing primary_pe. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: stable@kernel.org	2008-10-21 17:44:51 +01:00
Kazuo Ito	b673c3a819	dm kcopyd: avoid queue shuffle Write throughput to LVM snapshot origin volume is an order of magnitude slower than those to LV without snapshots or snapshot target volumes, especially in the case of sequential writes with O_SYNC on. The following patch originally written by Kevin Jamieson and Jan Blunck and slightly modified for the current RCs by myself tries to improve the performance by modifying the behaviour of kcopyd, so that it pushes back an I/O job to the head of the job queue instead of the tail as process_jobs() currently does when it has to wait for free pages. This way, write requests aren't shuffled to cause extra seeks. I tested the patch against 2.6.27-rc5 and got the following results. The test is a dd command writing to snapshot origin followed by fsync to the file just created/updated. A couple of filesystem benchmarks gave me similar results in case of sequential writes, while random writes didn't suffer much. dd if=/dev/zero of=<somewhere on snapshot origin> bs=4096 count=... [conv=notrunc when updating] 1) linux 2.6.27-rc5 without the patch, write to snapshot origin, average throughput (MB/s) 10M 100M 1000M create,dd 511.46 610.72 11.81 create,dd+fsync 7.10 6.77 8.13 update,dd 431.63 917.41 12.75 update,dd+fsync 7.79 7.43 8.12 compared with write throughput to LV without any snapshots, all dd+fsync and 1000 MiB writes perform very poorly. 10M 100M 1000M create,dd 555.03 608.98 123.29 create,dd+fsync 114.27 72.78 76.65 update,dd 152.34 1267.27 124.04 update,dd+fsync 130.56 77.81 77.84 2) linux 2.6.27-rc5 with the patch, write to snapshot origin, average throughput (MB/s) 10M 100M 1000M create,dd 537.06 589.44 46.21 create,dd+fsync 31.63 29.19 29.23 update,dd 487.59 897.65 37.76 update,dd+fsync 34.12 30.07 26.85 Although still not on par with plain LV performance - cannot be avoided because it's copy on write anyway - this simple patch successfully improves throughput of dd+fsync while not affecting the rest.
Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Kazuo Ito <ito.kazuo@oss.ntt.co.jp> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: stable@kernel.org	2008-10-21 17:44:50 +01:00
Linus Torvalds	ed09441dac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (39 commits) [SCSI] sd: fix compile failure with CONFIG_BLK_DEV_INTEGRITY=n libiscsi: fix locking in iscsi_eh_device_reset libiscsi: check reason why we are stopping iscsi session to determine error value [SCSI] iscsi_tcp: return a descriptive error value during connection errors [SCSI] libiscsi: rename host reset to target reset [SCSI] iscsi class: fix endpoint id handling [SCSI] libiscsi: Support drivers initiating session removal [SCSI] libiscsi: fix data corruption when target has to resend data-in packets [SCSI] sd: Switch kernel printing level for DIF messages [SCSI] sd: Correctly handle all combinations of DIF and DIX [SCSI] sd: Always print actual protection_type [SCSI] sd: Issue correct protection operation [SCSI] scsi_error: fix target reset handling [SCSI] lpfc 8.2.8 v2 : Add statistical reporting control and additional fc vendor events [SCSI] lpfc 8.2.8 v2 : Add sysfs control of target queue depth handling [SCSI] lpfc 8.2.8 v2 : Revert target busy in favor of transport disrupted [SCSI] scsi_dh_alua: remove REQ_NOMERGE [SCSI] lpfc 8.2.8 : update driver version to 8.2.8 [SCSI] lpfc 8.2.8 : Add MSI-X support [SCSI] lpfc 8.2.8 : Update driver to use new Host byte error code DID_TRANSPORT_DISRUPTED ...	2008-10-17 09:00:23 -07:00
Linus Torvalds	c472273f86	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: md: fix input truncation in safe_delay_store() md: check for memory allocation failure in faulty personality md: build failure due to missing delay.h md: Relax minimum size restrictions on chunk_size. md: remove space after function name in declaration and call. md: Remove unnecessary #includes, #defines, and function declarations. md: Convert remaining 1k representations in linear.c to sectors. md: linear.c: Make two local variables sector-based. md: linear: Represent dev_info->size and dev_info->offset in sectors. md: linear.c: Remove broken debug code. md: linear.c: Remove pointless initialization of curr_offset. md: linear.c: Fix typo in comment. md: Don't try to set an array to 'read-auto' if it is already in that state. md: Allow metadata_version to be updated for externally managed metadata. md: Fix rdev_size_store with size == 0	2008-10-16 11:55:11 -07:00
Dan Williams	97ce0a7f9c	md: fix input truncation in safe_delay_store() safe_delay_store() currently truncates the last character of input since it tells strlcpy that the buffer can only hold 'len' characters, off by one. sysfs already null terminates the buffer, so just increase the last argument to strlcpy. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-16 17:03:08 +11:00
Sven Wegener	08ff39f1c8	md: check for memory allocation failure in faulty personality It's a fault injection module, but I don't think we should oops here. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Neil Brown <neilb@suse.de>	2008-10-16 14:16:53 +11:00
Stephen Rothwell	255707274e	md: build failure due to missing delay.h Today's linux-next build (powerpc ppc64_defconfig) failed like this: drivers/md/raid1.c: In function 'sync_request': drivers/md/raid1.c:1759: error: implicit declaration of function 'msleep_interruptible' make[3]: *** [drivers/md/raid1.o] Error 1 make[3]: *** Waiting for unfinished jobs.... drivers/md/raid10.c: In function 'sync_request': drivers/md/raid10.c:1749: error: implicit declaration of function 'msleep_interruptible' make[3]: *** [drivers/md/raid10.o] Error 1 drivers/md/md.c: In function 'md_do_sync': drivers/md/md.c:5915: error: implicit declaration of function 'msleep' Caused by commit 6caa3b0bbdb474647f6bdd8a958ffc46f78d8d58 ("md: Remove unnecessary #includes, #defines, and function declarations"). I added the following patch. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-15 21:57:05 +11:00
Mike Christie	6000a368cd	[SCSI] block: separate failfast into multiple bits. Multipath is best at handling transport errors. If it gets a device error then there is not much the multipath layer can do. It will just access the same device but from a different path. This patch breaks up failfast into device, transport and driver errors. The multipath layers (md and dm multipath) only ask the lower levels to fast fail transport errors. The user of failfast, read ahead, will ask to fast fail on all errors. Note that blk_noretry_request will return true if any failfast bit is set. This allows drivers that do not support the multipath failfast bits to continue to fail on any failfast error like before. Drivers like scsi that are able to fail fast specific errors can check for the specific fail fast type. In the next patch I will convert scsi. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-13 09:28:52 -04:00
NeilBrown	4bbf3771ca	md: Relax minimum size restrictions on chunk_size. Currently, the 'chunk_size' of an array must be at least PAGE_SIZE. This means that moving an array to a machine with a larger PAGE_SIZE, or changing the kernel to use a larger PAGE_SIZE, can stop an array from working. For RAID10 and RAID4/5/6, this is non-trivial to fix as the resync process works on whole pages at a time, and assumes them to be wholly within a stripe. For other raid personalities, this restriction is not needed at all and can be dropped. So remove the test on chunk_size from common code, and add it in just the places where it is needed: raid10 and raid4/5/6. Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
NeilBrown	d710e13812	md: remove space after function name in declaration and call. Having function (args) instead of function(args) makes it harder to search for calls of particular functions. So remove all those spaces. Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
NeilBrown	fb4d8c76e5	md: Remove unnecessary #includes, #defines, and function declarations. A lot of cruft has gathered over the years. Time to remove it. Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
Andre Noll	ab5bd5cbc8	md: Convert remaining 1k representations in linear.c to sectors. This patch renames hash_spacing and preshift to spacing and sector_shift respectively with the following change of semantics: Case 1: (sizeof(sector_t) <= sizeof(u32)). ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In this case, we have sector_shift = preshift = 0 and spacing = 2 * hash_spacing. Hence, the index for the hash table which is computed by the new code in which_dev() as sector / spacing equals the old value which was (sector/2) / hash_spacing. Note also that the value of nb_zone stays the same because both sz and base double. Case 2: (sizeof(sector_t) > sizeof(u32)). ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (aka the shifting dance case). Here we have sector_shift = preshift + 1 and spacing = 2 * hash_spacing during the computation of nb_zone and curr_sector, but spacing = hash_spacing in which_dev() because in the last hunk of the patch for linear.c we shift down conf->spacing (= 2 * hash_spacing) by one more bit than in the old code. Hence in the computation of nb_zone, sz and base have the same value as before, so nb_zone is not affected. Also curr_sector in the next hunk stays the same. In which_dev() the hash table index is computed as (sector >> sector_shift) / spacing In view of sector_shift = preshift + 1 and spacing = hash_spacing, this equals ((sector/2) >> preshift) / hash_spacing which is the value computed by the old code. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
Andre Noll	23242fbb47	md: linear.c: Make two local variables sector-based. This is a preparation for representing also the remaining fields of struct linear_private_data as sectors. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
Andre Noll	6283815d18	md: linear: Represent dev_info->size and dev_info->offset in sectors. Rename them to num_sectors and start_sector which is more descriptive. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
Andre Noll	451708d2a4	md: linear.c: Remove broken debug code. conf->smallest_size is undefined since day one of the git repo. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
Andre Noll	481d86c7eb	md: linear.c: Remove pointless initialization of curr_offset. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
Andre Noll	e61130228e	md: linear.c: Fix typo in comment. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
NeilBrown	80268ee927	md: Don't try to set an array to 'read-auto' if it is already in that state. 'read-auto' is a variant of 'readonly' which will switch to writable on the first write attempt. Calling do_md_stop to set the array readonly when it is already readonly returns an error. So make sure not to do that. Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:12 +11:00
NeilBrown	ea43ddd849	md: Allow metadata_version to be updated for externally managed metadata. For externally managed metadata, the 'metadata_version' sysfs attribute is really just a channel for user-space programs to communicate about how the array is being managed. It can be useful for this to be changed while the array is active. Normally changes to metadata_version are not permitted while the array is active. Change that so that if the metadata is externally managed, the metadata_version can be changed to a different flavour of external management. Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:11 +11:00
Chris Webb	7d3c6f8717	md: Fix rdev_size_store with size == 0 Fix rdev_size_store with size == 0. size == 0 means to use the largest size allowed by the underlying device and is used when modifying an active array. This fixes a regression introduced by commit `d7027458d6` Cc: <stable@kernel.org> Signed-off-by: Chris Webb <chris@arachsys.com> Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-13 11:55:11 +11:00
Alan Jenkins	ce52aebd02	raid, fastboot: hide RAID autodetect option if MD is compiled as a module Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-12 08:25:14 -07:00
Arjan van de Ven	a364092a41	raid: make RAID autodetect default a KConfig option RAID autodetect has the side effect of requiring synchronisation of all device drivers, which can make the boot several seconds longer (I've measured 7 on one of my laptops).... even for systems that don't have RAID setup for the root filesystem (the only FS where this matters). This patch makes the default for autodetect a config option; either way the user can always override via the kernel command line. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: NeilBrown <neilb@suse.de>	2008-10-12 08:25:02 -07:00
Linus Torvalds	b0af205afb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm: detect lost queue dm: publish dm_vcalloc dm: publish dm_table_unplug_all dm: publish dm_get_mapinfo dm: export struct dm_dev dm crypt: avoid unnecessary wait when splitting bio dm crypt: tidy ctx pending dm crypt: fix async inc_pending dm crypt: move dec_pending on error into write_io_submit dm crypt: remove inc_pending from write_io_submit dm crypt: tidy write loop pending dm crypt: tidy crypt alloc dm crypt: tidy inc pending dm exception store: use chunk_t for_areas dm exception store: introduce area_location function dm raid1: kcopyd should stop on error if errors handled dm mpath: remove is_active from struct dm_path dm mpath: use more error codes Fixed up trivial conflict in drivers/md/dm-mpath.c manually.	2008-10-10 11:11:47 -07:00
Alasdair G Kergon	0c2322e4ce	dm: detect lost queue Detect and report buggy drivers that destroy their request_queue. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: Stefan Raspl <raspl@linux.vnet.ibm.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org>	2008-10-10 13:37:13 +01:00
Mikulas Patocka	5416090426	dm: publish dm_vcalloc Publish dm_vcalloc in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:12 +01:00
Mikulas Patocka	ea0ec64094	dm: publish dm_table_unplug_all Publish dm_table_unplug_all in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:11 +01:00
Mikulas Patocka	89343da077	dm: publish dm_get_mapinfo Publish dm_get_mapinfo in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:10 +01:00
Mikulas Patocka	82b1519b34	dm: export struct dm_dev Split struct dm_dev in two and publish the part that other targets need in include/linux/device-mapper.h. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:09 +01:00
Milan Broz	933f01d433	dm crypt: avoid unnecessary wait when splitting bio Don't wait between submitting crypt requests for a bio unless we are short of memory. There are two situations when we must split an encrypted bio: 1) there are no free pages; 2) the new bio would violate underlying device restrictions (e.g. max hw segments). In case (2) we do not need to wait. Add output variable to crypt_alloc_buffer() to distinguish between these cases. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:08 +01:00
Milan Broz	c8081618a9	dm crypt: tidy ctx pending Move the initialisation of ctx->pending into one place, at the start of crypt_convert(). Introduce crypt_finished to indicate whether or not the encryption is finished, for use in a later patch. No functional change. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:08 +01:00
Milan Broz	4e59409891	dm crypt: fix async inc_pending The pending reference count must be incremented before the async work is queued to another thread, not after. Otherwise there's a race if the work completes and decrements the reference count before it gets incremented. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:07 +01:00
Milan Broz	6c031f41db	dm crypt: move dec_pending on error into write_io_submit Make kcryptd_crypt_write_io_submit() responsible for decrementing the pending count after an error. Also fixes a bug in the async path that forgot to decrement it. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:06 +01:00
Alasdair G Kergon	1e37bb8e55	dm crypt: remove inc_pending from write_io_submit Make the caller responsible for incrementing the pending count before calling kcryptd_crypt_write_io_submit() in the non-async case to bring it into line with the async case. Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:05 +01:00
Milan Broz	fc5a5e9aa8	dm crypt: tidy write loop pending Move kcryptd_crypt_write_convert_loop inside kcryptd_crypt_write_convert. This change is needed for a later patch. No functional change. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:04 +01:00
Milan Broz	dc440d1e56	dm crypt: tidy crypt alloc Factor out crypt io allocation code. Later patches will call it from another place. No functional change. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:03 +01:00
Milan Broz	3e1a8bdd05	dm crypt: tidy inc pending Move io pending to one place. No functional change, useful to simplify debugging. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:02 +01:00
Mikulas Patocka	fd14acf6fc	dm exception store: use chunk_t for_areas Change uint32_t into chunk_t to remove 32-bit limitation on the number of chunks on systems with 64-bit sector numbers. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:01 +01:00
Mikulas Patocka	a481db7846	dm exception store: introduce area_location function Move this logic to a function, because it will be reused later. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:37:00 +01:00
Jonathan Brassow	f7c83e2e47	dm raid1: kcopyd should stop on error if errors handled dm-raid1 is setting the 'DM_KCOPYD_IGNORE_ERROR' flag unconditionally when assigning kcopyd work. kcopyd is responsible for copying an assigned section of disk to one or more other disks. The 'DM_KCOPYD_IGNORE_ERROR' flag affects kcopyd in the following way: When not set: kcopyd will immediately stop the copy operation when an error is encountered. When set: kcopyd will try to proceed regardless of errors and try to continue copying any remaining amount. Since dm-raid1 tracks regions of the address space that are (or are not) in sync and it now has the ability to handle these errors, we can safely enable this optimization. This optimization is conditional on whether mirror error handling has been enabled. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:36:59 +01:00
Kiyoshi Ueda	6680073d3e	dm mpath: remove is_active from struct dm_path This patch moves 'is_active' from struct dm_path to struct pgpath as it does not need exporting. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:36:58 +01:00
Benjamin Marzinski	01460f3520	dm mpath: use more error codes This patch allows path errors from the multipath ctr function to propagate up to userspace as errno values from the ioctl() call. This is in response to https://www.redhat.com/archives/dm-devel/2008-May/msg00000.html and https://bugzilla.redhat.com/show_bug.cgi?id=444421 The patch only lets through the errors that it needs to in order to get the path errors from parse_path(). Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-10 13:36:57 +01:00
Denis ChengRq	6feef531f5	block: mark bio_split_pool static Since all bio_split calls refer to the same single bio_split_pool, the bio_split function can use bio_split_pool directly instead of the mempool_t parameter; the mempool_t parameter can then be removed from the bio_split parameter list, and bio_split_pool, which is only referenced in fs/bio.c, can be marked static. Signed-off-by: Denis ChengRq <crquan@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:57:05 +02:00
Mike Anderson	224cb3e981	dm: Call blk_abort_queue on failed paths Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:14 +02:00
Tejun Heo	074a7aca7a	block: move stats from disk to part0 Move stats related fields - stamp, in_flight, dkstats - from disk to part0 and unify stat handling such that... * part_stat_*() now updates part0 together if the specified partition is not part0. ie. part_stat_*() are now essentially all_stat_*(). * {disk|all}_stat_*() are gone. * part_round_stats() is updated similarly. It handles part0 stats automatically and disk_round_stats() is killed. * part_{inc|dec}_in_flight() is implemented which automatically updates part0 stats for parts other than part0. * disk_map_sector_rcu() is updated to return part0 if no part matches. Combined with the above changes, this makes NULL special case handling in callers unnecessary. * Separate stats show code paths for disk are collapsed into part stats show code paths. * Rename disk_stat_lock/unlock() to part_stat_lock/unlock() While at it, reposition stat handling macros a bit and add missing parentheses around macro parameters. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:08 +02:00
Tejun Heo	0762b8bde9	block: always set bdev->bd_part Till now, bdev->bd_part is set only if the bdev was for parts other than part0. This patch makes bdev->bd_part always set so that code paths don't have to differentiate common handling. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:08 +02:00
Tejun Heo	b7db9956e5	block: move policy from disk to part0 Move disk->policy to part0->policy. Implement and use get_disk_ro(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:07 +02:00
Tejun Heo	ed9e198234	block: implement and use {disk|part}_to_dev() Implement {disk|part}_to_dev() and use them to access generic device instead of directly dereferencing {disk|part}->dev. To make sure no user is left behind, rename generic devices fields to __dev. This is in preparation of unifying partition 0 handling with other partitions. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:07 +02:00
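
The ownership rule stated in the "dm snapshot: fix primary_pe race" entry (whoever sees the decrement-and-test return true frees the object; everyone else must stop touching it) can be sketched in userspace C11. This is only an illustration under assumptions, not the kernel code: `struct refobj`, `ref_new`, `ref_dec_and_test` and `ref_put` are hypothetical names, and C11 `atomic_fetch_sub` stands in for the kernel's `atomic_dec_and_test`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object such as primary_pe. */
struct refobj {
	atomic_int ref_count;
	int payload;
};

/* Allocate an object holding initial_refs references (hypothetical helper). */
static struct refobj *ref_new(int initial_refs, int payload)
{
	assert(initial_refs > 0);
	struct refobj *obj = malloc(sizeof(*obj));
	if (!obj)
		return NULL;
	atomic_init(&obj->ref_count, initial_refs);
	obj->payload = payload;
	return obj;
}

/* Userspace analogue of atomic_dec_and_test(): true only for the caller
 * whose decrement brought the count to zero. */
static bool ref_dec_and_test(struct refobj *obj)
{
	return atomic_fetch_sub(&obj->ref_count, 1) == 1;
}

/* Drop one reference. Returns true if this call freed the object. */
static bool ref_put(struct refobj *obj)
{
	if (ref_dec_and_test(obj)) {
		/* Rule 2: we dropped the last reference, so we free it. */
		free(obj);
		return true;
	}
	/* Rule 1: another holder may free obj at any moment from here on,
	 * so it must not be dereferenced again (in particular, no later
	 * re-read of ref_count to decide whether to free). */
	return false;
}
```

With two references outstanding, the first `ref_put()` returns false and the second returns true and frees the object. Because the decision to free is taken from the single atomic decrement rather than from a later read of the counter, two holders cannot both conclude that they own the final reference.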

1008 Commits