A lot of files pull in module.h when all they are really
looking for is the basic EXPORT_SYMBOL functionality. The
recent data from Ingo[1] shows that this is one of several
instances that have a significant impact on compile times,
and it should be targeted for factoring out (as done here).
Note that several commonly used header files in include/*
directly include <linux/module.h> themselves (some 34 of them!).
The most commonly used ones of these will have to be made
independent of module.h before the full benefit of this change
can be realized.
We also transition THIS_MODULE from module.h to export.h,
since there are lots of files with subsystem structs that
in turn will have a struct module *owner and only be doing:
.owner = THIS_MODULE;
and absolutely nothing else modular. So, we also want to have
the THIS_MODULE definition present in the lightweight header.
[1] https://lkml.org/lkml/2011/5/23/76
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c: Functions for byte-swapped smbus_write/read_word_data
i2c-algo-pca: Return standard fault codes
i2c-algo-bit: Return standard fault codes
i2c-algo-bit: Be verbose on bus testing failure
i2c-algo-bit: Let user test buses without failing
i2c/scx200_acb: Fix section mismatch warning in scx200_pci_drv
i2c: I2C_ELEKTOR should depend on HAS_IOPORT
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (33 commits)
iommu/core: Remove global iommu_ops and register_iommu
iommu/msm: Use bus_set_iommu instead of register_iommu
iommu/omap: Use bus_set_iommu instead of register_iommu
iommu/vt-d: Use bus_set_iommu instead of register_iommu
iommu/amd: Use bus_set_iommu instead of register_iommu
iommu/core: Use bus->iommu_ops in the iommu-api
iommu/core: Convert iommu_found to iommu_present
iommu/core: Add bus_type parameter to iommu_domain_alloc
Driver core: Add iommu_ops to bus_type
iommu/core: Define iommu_ops and register_iommu only with CONFIG_IOMMU_API
iommu/amd: Fix wrong shift direction
iommu/omap: always provide iommu debug code
iommu/core: let drivers know if an iommu fault handler isn't installed
iommu/core: export iommu_set_fault_handler()
iommu/omap: Fix build error with !IOMMU_SUPPORT
iommu/omap: Migrate to the generic fault report mechanism
iommu/core: Add fault reporting mechanism
iommu/core: Use PAGE_SIZE instead of hard-coded value
iommu/core: use the existing IS_ALIGNED macro
iommu/msm: ->unmap() should return order of unmapped page
...
Fixup trivial conflicts in drivers/iommu/Makefile: "move omap iommu to
dedicated iommu folder" vs "Rename the DMAR and INTR_REMAP config
options" just happened to touch lines next to each other.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
amd64_edac: Cleanup return type of amd64_determine_edac_cap()
amd64_edac: Add a fix for Erratum 505
EDAC, MCE, AMD: Simplify NB MCE decoder interface
EDAC, MCE, AMD: Drop local coreid reporting
EDAC, MCE, AMD: Print valid addr when reporting an error
EDAC, MCE, AMD: Print CPU number when reporting the error
* 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (75 commits)
KVM: SVM: Keep intercepting task switching with NPT enabled
KVM: s390: implement sigp external call
KVM: s390: fix register setting
KVM: s390: fix return value of kvm_arch_init_vm
KVM: s390: check cpu_id prior to using it
KVM: emulate lapic tsc deadline timer for guest
x86: TSC deadline definitions
KVM: Fix simultaneous NMIs
KVM: x86 emulator: convert push %sreg/pop %sreg to direct decode
KVM: x86 emulator: switch lds/les/lss/lfs/lgs to direct decode
KVM: x86 emulator: streamline decode of segment registers
KVM: x86 emulator: simplify OpMem64 decode
KVM: x86 emulator: switch src decode to decode_operand()
KVM: x86 emulator: qualify OpReg inhibit_byte_regs hack
KVM: x86 emulator: switch OpImmUByte decode to decode_imm()
KVM: x86 emulator: free up some flag bits near src, dst
KVM: x86 emulator: switch src2 to generic decode_operand()
KVM: x86 emulator: expand decode flags to 64 bits
KVM: x86 emulator: split dst decode to a generic decode_operand()
KVM: x86 emulator: move memop, memopp into emulation context
...
* 'fbdev-next' of git://github.com/schandinat/linux-2.6: (270 commits)
video: platinumfb: Add __devexit_p at necessary place
drivers/video: fsl-diu-fb: merge diu_pool into fsl_diu_data
drivers/video: fsl-diu-fb: merge diu_hw into fsl_diu_data
drivers/video: fsl-diu-fb: only DIU modes 0 and 1 are supported
drivers/video: fsl-diu-fb: remove unused panel operating mode support
drivers/video: fsl-diu-fb: use an enum for the AOI index
drivers/video: fsl-diu-fb: add several new video modes
drivers/video: fsl-diu-fb: remove broken screen blanking support
drivers/video: fsl-diu-fb: move some definitions out of the header file
drivers/video: fsl-diu-fb: fix some ioctls
video: da8xx-fb: Increased resolution configuration of revised LCDC IP
OMAPDSS: picodlp: add missing #include <linux/module.h>
fb: fix au1100fb bitrot.
mx3fb: fix NULL pointer dereference in screen blanking.
video: irq: Remove IRQF_DISABLED
smscufx: change edid data to u8 instead of char
OMAPDSS: DISPC: zorder support for DSS overlays
OMAPDSS: DISPC: VIDEO3 pipeline support
OMAPDSS/OMAP_VOUT: Fix incorrect OMAP3-alpha compatibility setting
video/omap: fix build dependencies
...
Fix up conflicts in:
- drivers/staging/xgifb/XGI_main_26.c
Changes to XGIfb_pan_var()
- drivers/video/omap/{lcd_apollon.c,lcd_ldp.c,lcd_overo.c}
Removed (or in the case of apollon.c, merged into the generic
DSS panel in drivers/video/omap2/displays/panel-generic-dpi.c)
Reimplemented at least 17 times discounting error mangling cases
where it could be used.
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Adjust i2c-algo-pca to return fault codes compliant with
Documentation/i2c/fault-codes, rather than the undocumented and
vague -EREMOTEIO.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Adjust i2c-algo-bit to return fault codes compliant with
Documentation/i2c/fault-codes, rather than the undocumented and
vague -EREMOTEIO.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
If bus testing fails due to the bus being seen as busy, it might be
helpful for developers to know which line is unexpectedly low.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Always failing to register I2C buses when the line testing fails is a
little harsh. While such a failure is definitely a bug in the driver
that exposes the affected I2C bus, things may still work fine if the
missing initialization steps are done later, before the I2C bus is
used. So it seems a better debugging tool to just report the test
failure by default. I introduce bit_test=2 if anyone really misses the
original behavior of bit_test=1.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
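The three bit_test levels described in the commit above can be sketched
in plain C. This is a userspace illustration, not the driver code:
register_bus_sketch() and its lines_ok argument are hypothetical
stand-ins for the registration path and the outcome of the line test,
and the -ENODEV return in strict mode is an assumption.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for bus registration.
 *   bit_test == 0: skip the line test entirely
 *   bit_test == 1: run the test, report a failure, register anyway
 *   bit_test >= 2: run the test and refuse to register on failure
 * lines_ok models the outcome of the line test (nonzero = passed).
 */
int register_bus_sketch(int bit_test, int lines_ok)
{
	if (bit_test && !lines_ok) {
		fprintf(stderr, "bus test failed\n");
		if (bit_test >= 2)
			return -ENODEV;	/* assumed error code */
	}
	return 0;	/* bus registered */
}
```

With bit_test=1 a failing test is only reported, so a driver that
finishes its initialization later still gets a usable bus.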
WARNING: drivers/i2c/busses/built-in.o(.data+0x47c8): Section mismatch in reference from the variable scx200_pci_drv to the function .devinit.text:scx200_probe()
The variable scx200_pci_drv references
the function __devinit scx200_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
Signed-off-by: Harvey Yang <harvey.huawei.yang@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
On m68k, I get:
drivers/i2c/busses/i2c-elektor.c: In function ‘pcf_isa_init’:
drivers/i2c/busses/i2c-elektor.c:153: error: implicit declaration of function ‘ioport_map’
drivers/i2c/busses/i2c-elektor.c:153: warning: assignment makes pointer from integer without a cast
drivers/i2c/busses/i2c-elektor.c: In function ‘elektor_probe’:
drivers/i2c/busses/i2c-elektor.c:287: error: implicit declaration of function ‘ioport_unmap’
Since commit 82ed223c26 ("iomap: make IOPORT/PCI
mapping functions conditional"), ioport_map() is only available on platforms
that set HAS_IOPORT.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
AMD processors apparently have a bug in the hardware task switching
support when NPT is enabled. If the task switch triggers an NPF, we can
get wrong EXITINTINFO along with that fault. On resume, spurious
exceptions may then be injected into the guest.
We were able to reproduce this bug when our guest triggered #SS and the
handler was supposed to run on a separate task whose stack pages had
not yet been touched.
Work around the issue by continuing to emulate task switches even in
NPT mode.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Implement sigp external call, which might be required for guests that
issue an external call instead of an emergency signal for IPI.
This fixes an issue with "KVM: unknown SIGP: 0x02" when booting
such an SMP guest.
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
KVM common code does vcpu_load prior to calling our arch ioctls and
vcpu_put after we're done here. Via the kvm_arch_vcpu_load/put
callbacks we do load the fpu and access register state into the
processor, which saves us moving the state on every SIE exit the
kernel handles. However this breaks register setting from userspace,
because of the following sequence:
1a. vcpu load stores userspace register content
1b. vcpu load loads guest register content
2. kvm_arch_vcpu_ioctl_set_fpu/sregs updates saved guest register content
3a. vcpu put stores the guest registers and overwrites the new content
3b. vcpu put loads the userspace register set again
This patch loads the new guest register state into the cpu, so that the correct
(new) set of guest registers will be stored in step 3a.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
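The load/set/put sequence from the commit above can be modelled in
userspace to see why the update is lost. Every name below is
hypothetical, and a single int stands in for the whole fpu and access
register state; this is a sketch of the ordering problem, not the s390
code.

```c
/* Userspace model of the sequence; all names are hypothetical. */
struct vcpu_sketch {
	int cpu_reg;		/* what the processor currently holds */
	int saved_guest;	/* guest register content saved in memory */
	int saved_host;		/* userspace register content saved in memory */
};

void vcpu_load_sketch(struct vcpu_sketch *v)
{
	v->saved_host = v->cpu_reg;	/* step 1a */
	v->cpu_reg = v->saved_guest;	/* step 1b */
}

void vcpu_put_sketch(struct vcpu_sketch *v)
{
	v->saved_guest = v->cpu_reg;	/* step 3a */
	v->cpu_reg = v->saved_host;	/* step 3b */
}

/* Step 2 before the fix: only the saved copy is updated. */
void set_reg_buggy_sketch(struct vcpu_sketch *v, int val)
{
	v->saved_guest = val;
}

/* Step 2 after the fix: the new value is also loaded into the cpu. */
void set_reg_fixed_sketch(struct vcpu_sketch *v, int val)
{
	v->saved_guest = val;
	v->cpu_reg = val;
}

/* Runs load, set (new value 2), put; returns the saved guest value. */
int run_ioctl_sketch(int use_fix)
{
	struct vcpu_sketch v = { .cpu_reg = 100, .saved_guest = 1,
				 .saved_host = 0 };

	vcpu_load_sketch(&v);
	if (use_fix)
		set_reg_fixed_sketch(&v, 2);
	else
		set_reg_buggy_sketch(&v, 2);
	vcpu_put_sketch(&v);
	return v.saved_guest;
}
```

In the buggy variant step 3a writes the stale processor value back over
the new content; in the fixed variant the processor already holds the
new value, so step 3a stores it.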
This patch fixes the return value of kvm_arch_init_vm in case a memory
allocation goes wrong.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
We use the cpu id provided by userspace as array index here. Thus we
clearly need to check it first. Ooops.
CC: <stable@vger.kernel.org>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
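The kind of check the commit above adds can be illustrated with a small
userspace sketch. MAX_CPUS_SKETCH, the slots array and
create_vcpu_sketch() are hypothetical, and the error codes are
assumptions; the point is only that the index is validated before the
array is touched.

```c
#include <errno.h>

#define MAX_CPUS_SKETCH 64	/* assumed array size, for illustration */

struct vcpu_slot_sketch {
	int in_use;
};

static struct vcpu_slot_sketch slots[MAX_CPUS_SKETCH];

/* id comes from userspace, so it must be range-checked before use. */
int create_vcpu_sketch(unsigned int id)
{
	if (id >= MAX_CPUS_SKETCH)
		return -EINVAL;		/* reject before indexing */
	if (slots[id].in_use)
		return -EEXIST;
	slots[id].in_use = 1;
	return 0;
}
```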
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
ARM: mark empty gpio.h files empty
gpio: Fix ARM versatile-express build failure
of: include errno.h
* 'spi/next' of git://git.secretlab.ca/git/linux-2.6:
drivercore: Add helper macro for platform_driver boilerplate
spi: irq: Remove IRQF_DISABLED
OMAP: SPI: Fix the trying to free nonexistent resource error
spi/spi-ep93xx: add module.h include
spi/tegra: fix compilation error in spi-tegra.c
spi: spi-dw: fix all sparse warnings
spi/spi-pl022: Call pl022_dma_remove(pl022) only if enable_dma is true
spi/spi-pl022: calculate_effective_freq() must set rate <= requested rate
spi/spi-pl022: Don't allocate more sg than required.
spi/spi-pl022: Use GFP_ATOMIC for allocation from tasklet
spi/spi-pl022: Resolve formatting issues
* 'gpio/next' of git://git.secretlab.ca/git/linux-2.6:
h8300: Move gpio.h to gpio-internal.h
gpio: pl061: add DT binding support
gpio: fix build error in include/asm-generic/gpio.h
gpiolib: Ensure struct gpio is always defined
irq: Add EXPORT_SYMBOL_GPL to function of irq generic-chip
gpio-ml-ioh: Use NUMA_NO_NODE not GFP_KERNEL
gpio-pch: Use NUMA_NO_NODE not GFP_KERNEL
gpio: langwell: ensure alternate function is cleared
gpio-pch: Support interrupt function
gpio-pch: Save register value in suspend()
gpio-pch: modify gpio_nums and mask
gpio-pch: support ML7223 IOH n-Bus
gpio-pch: add spinlock in suspend/resume processing
gpio-pch: Delete invalid "restore" code in suspend()
gpio-ml-ioh: Fix suspend/resume issue
gpio-ml-ioh: Support interrupt function
gpio-ml-ioh: Delete unnecessary code
gpio/mxc: add chained_irq_enter/exit() to mx3_gpio_irq_handler()
gpio/nomadik: use genirq core to track enablement
gpio/nomadik: disable clocks when unused
It is generally a better idea to make intentionally empty files
contain the human-readable /* empty */ comment; it also makes
the files play nice with "make distclean".
Reported-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
A missing mach/gpio.h prevents building gpiolib on versatile express.
CC drivers/gpio/gpiolib.o
In file included from /.../linux/include/linux/gpio.h:18:0,
from /.../linux/drivers/gpio/gpiolib.c:10:
/.../linux/arch/arm/include/asm/gpio.h:5:23: fatal error: mach/gpio.h: No such file or directory
compilation terminated.
make[3]: *** [drivers/gpio/gpiolib.o] Error 1
make[2]: *** [drivers/gpio] Error 2
make[1]: *** [drivers] Error 2
make: *** [sub-make] Error 2
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
When compiling ath6kl for beagleboard (omap2plus_defconfig plus
CONFIG_ATH6KL, CONFIG_OF disabled) with current linux-next, compilation
fails:
include/linux/of.h:269: error: 'ENOSYS' undeclared (first use in this function)
include/linux/of.h:276: error: 'ENOSYS' undeclared (first use in this function)
include/linux/of.h:289: error: 'ENOSYS' undeclared (first use in this function)
Fix this by including errno.h from of.h.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
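The failure mode in the commit above is the usual one for stub
functions in a header: the stub returns -ENOSYS, so the header itself
must pull in the errno definitions rather than rely on its includer.
A minimal userspace sketch follows, with a hypothetical stub name and
an opaque hypothetical node type.

```c
#include <errno.h>	/* without this include, ENOSYS is undeclared */

struct device_node_sketch;	/* opaque, hypothetical */

/* Stub used when the real implementation is configured out. */
int of_read_u32_stub_sketch(const struct device_node_sketch *np,
			    const char *propname, unsigned int *out_value)
{
	(void)np;
	(void)propname;
	(void)out_value;
	return -ENOSYS;
}
```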
* 'for-linus' of git://ceph.newdream.net/git/ceph-client:
libceph: fix double-free of page vector
ceph: fix 32-bit ino numbers
libceph: force resend of osd requests if we skip an osdmap
ceph: use kernel DNS resolver
ceph: fix ceph_monc_init memory leak
ceph: let the set_layout ioctl set single traits
Revert "ceph: don't truncate dirty pages in invalidate work thread"
ceph: replace leading spaces with tabs
libceph: warn on msg allocation failures
libceph: don't complain on msgpool alloc failures
libceph: always preallocate mon connection
libceph: create messenger with client
ceph: document ioctls
ceph: implement (optional) max read size
ceph: rename rsize -> rasize
ceph: make readpages fully async
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (549 commits)
ALSA: hda - Fix ADC input-amp handling for Cx20549 codec
ALSA: hda - Keep EAPD turned on for old Conexant chips
ALSA: hda/realtek - Fix missing volume controls with ALC260
ASoC: wm8940: Properly set codec->dapm.bias_level
ALSA: hda - Fix pin-config for ASUS W90V
ALSA: hda - Fix surround/CLFE headphone and speaker pins order
ALSA: hda - Fix typo
ALSA: Update the sound git tree URL
ALSA: HDA: Add new revision for ALC662
ASoC: max98095: Convert codec->hw_write to snd_soc_write
ASoC: keep pointer to resource so it can be freed
ASoC: sgtl5000: Fix wrong mask in some snd_soc_update_bits calls
ASoC: wm8996: Fix wrong mask for setting WM8996_AIF_CLOCKING_2
ASoC: da7210: Add support for line out and DAC
ASoC: da7210: Add support for DAPM
ALSA: hda/realtek - Fix DAC assignments of multiple speakers
ASoC: Use SGTL5000_LINREG_VDDD_MASK instead of hardcoded mask value
ASoC: Set sgtl5000->ldo in ldo_regulator_register
ASoC: wm8996: Use SND_SOC_DAPM_AIF_OUT for AIF2 Capture
ASoC: wm8994: Use SND_SOC_DAPM_AIF_OUT for AIF3 Capture
...
* 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
PCI: Clean-up MPS debug output
pci: Clamp pcie_set_readrq() when using "performance" settings
PCI: enable MPS "performance" setting to properly handle bridge MPS
PCI: Workaround for Intel MPS errata
PCI: Add support for PASID capability
PCI: Add implementation for PRI capability
PCI: Export ATS functions to modules
PCI: Move ATS implementation into own file
PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
PCI / PM: Extend PME polling to all PCI devices
PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
PCI: Make pci_setup_bridge() non-static for use by arch code
x86: constify PCI raw ops structures
PCI: Add quirk for known incorrect MPSS
PCI: Add Solarflare vendor ID and SFC4000 device IDs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (83 commits)
mmc: fix compile error when CONFIG_BLOCK is not enabled
mmc: core: Cleanup eMMC4.5 conditionals
mmc: omap_hsmmc: if multiblock reads are broken, disable them
mmc: core: add workaround for controllers with broken multiblock reads
mmc: core: Prevent too long response times for suspend
mmc: recognise SDIO cards with SDIO_CCCR_REV 3.00
mmc: sd: Handle SD3.0 cards not supporting UHS-I bus speed mode
mmc: core: support HPI send command
mmc: core: Add cache control for eMMC4.5 device
mmc: core: Modify the timeout value for writing power class
mmc: core: new discard feature support at eMMC v4.5
mmc: core: mmc sanitize feature support for v4.5
mmc: dw_mmc: modify DATA register offset
mmc: sdhci-pci: add flag for devices that can support runtime PM
mmc: omap_hsmmc: ensure pbias configuration is always done
mmc: core: Add Power Off Notify Feature eMMC 4.5
mmc: sdhci-s3c: fix potential NULL dereference
mmc: replace printk with appropriate display macro
mmc: core: Add default timeout value for CMD6
mmc: sdhci-pci: add runtime pm support
...
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)
leases: fix write-open/read-lease race
nfs: drop unnecessary locking in llseek
ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
vfs: add generic_file_llseek_size
vfs: do (nearly) lockless generic_file_llseek
direct-io: merge direct_io_walker into __blockdev_direct_IO
direct-io: inline the complete submission path
direct-io: separate map_bh from dio
direct-io: use a slab cache for struct dio
direct-io: rearrange fields in dio/dio_submit to avoid holes
direct-io: fix a wrong comment
direct-io: separate fields only used in the submission path from struct dio
vfs: fix spinning prevention in prune_icache_sb
vfs: add a comment to inode_permission()
vfs: pass all mask flags check_acl and posix_acl_permission
vfs: add hex format for MAY_* flag values
vfs: indicate that the permission functions take all the MAY_* flags
compat: sync compat_stats with statfs.
vfs: add "device" tag to /proc/self/mountstats
cleanup: vfs: small comment fix for block_invalidatepage
...
Fix up trivial conflict in fs/gfs2/file.c (llseek changes)
* http://sucs.org/~rohan/git/gfs2-3.0-nmw: (24 commits)
GFS2: Move readahead of metadata during deallocation into its own function
GFS2: Remove two unused variables
GFS2: Misc fixes
GFS2: rewrite fallocate code to write blocks directly
GFS2: speed up delete/unlink performance for large files
GFS2: Fix off-by-one in gfs2_blk2rgrpd
GFS2: Clean up ->page_mkwrite
GFS2: Correctly set goal block after allocation
GFS2: Fix AIL flush issue during fsync
GFS2: Use cached rgrp in gfs2_rlist_add()
GFS2: Call do_strip() directly from recursive_scan()
GFS2: Remove obsolete assert
GFS2: Cache the most recently used resource group in the inode
GFS2: Make resource groups "append only" during life of fs
GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme
GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added
GFS2: Clean up gfs2_create
GFS2: Use ->dirty_inode()
GFS2: Fix bug trap and journaled data fsync
GFS2: Fix inode allocation error path
...
* '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6: (52 commits)
Fix build break when freezer not configured
Add definition for share encryption
CIFS: Make cifs_push_locks send as many locks at once as possible
CIFS: Send as many mandatory unlock ranges at once as possible
CIFS: Implement caching mechanism for posix brlocks
CIFS: Implement caching mechanism for mandatory brlocks
CIFS: Fix DFS handling in cifs_get_file_info
CIFS: Fix error handling in cifs_readv_complete
[CIFS] Fixup trivial checkpatch warning
[CIFS] Show nostrictsync and noperm mount options in /proc/mounts
cifs, freezer: add wait_event_freezekillable and have cifs use it
cifs: allow cifs_max_pending to be readable under /sys/module/cifs/parameters
cifs: tune bdi.ra_pages in accordance with the rsize
cifs: allow for larger rsize= options and change defaults
cifs: convert cifs_readpages to use async reads
cifs: add cifs_async_readv
cifs: fix protocol definition for READ_RSP
cifs: add a callback function to receive the rest of the frame
cifs: break out 3rd receive phase into separate function
cifs: find mid earlier in receive codepath
...
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (69 commits)
xfs: add AIL pushing tracepoints
xfs: put in missed fix for merge problem
xfs: do not flush data workqueues in xfs_flush_buftarg
xfs: remove XFS_bflush
xfs: remove xfs_buf_target_name
xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks
xfs: clean up xfs_ioerror_alert
xfs: clean up buffer allocation
xfs: remove buffers from the delwri list in xfs_buf_stale
xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALE
xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF
xfs: remove XFS_BUF_FINISH_IOWAIT
xfs: remove xfs_get_buftarg_list
xfs: fix buffer flushing during unmount
xfs: optimize fsync on directories
xfs: reduce the number of log forces from tail pushing
xfs: Don't allocate new buffers on every call to _xfs_buf_find
xfs: simplify xfs_trans_ijoin* again
xfs: unlock the inode before log force in xfs_change_file_space
xfs: unlock the inode before log force in xfs_fs_nfs_commit_metadata
...
In setlease, we use i_writecount to decide whether we can give out a
read lease.
In open, we break leases before incrementing i_writecount.
There is therefore a window between the break lease and the i_writecount
increment when setlease could add a new read lease.
This would leave us with a simultaneous write open and read lease, which
shouldn't happen.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This makes NFS follow the standard generic_file_llseek locking scheme.
Cc: Trond.Myklebust@netapp.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This gives ext4 the benefits of unlocked llseek.
Cc: tytso@mit.edu
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Add a generic_file_llseek variant to the VFS that allows passing in
the maximum file size of the file system, instead of always
using maxbytes from the superblock.
This can be used to eliminate some cut'n'paste seek code in ext4.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
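The idea of the variant described above, a seek helper that takes the
size limit as a parameter, can be sketched in userspace as follows.
The struct and function names are hypothetical, overflow handling is
omitted, and this is not the kernel implementation.

```c
#include <errno.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

struct file_sketch {
	long long pos;	/* current file position */
	long long size;	/* current file size */
};

/*
 * Seek with a caller-supplied limit instead of a fixed per-superblock
 * maximum. Returns the new position or a negative errno value.
 */
long long llseek_size_sketch(struct file_sketch *f, long long offset,
			     int whence, long long maxsize)
{
	switch (whence) {
	case SEEK_SET:
		break;
	case SEEK_CUR:
		offset += f->pos;
		break;
	case SEEK_END:
		offset += f->size;
		break;
	default:
		return -EINVAL;
	}
	if (offset < 0 || offset > maxsize)
		return -EINVAL;
	f->pos = offset;
	return offset;
}
```

A file system whose limit differs per file can then pass its own
maxsize rather than duplicating the whole seek routine.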
The i_mutex lock use of generic_file_llseek hurts. Independent processes
accessing the same file synchronize over a single lock, even though
they have no need for synchronization at all.
Under high utilization this can cause llseek to scale very poorly on larger
systems.
This patch does some rethinking of the llseek locking model:
First the 64bit f_pos is not necessarily atomic without locks
on 32bit systems. This can already cause races with read() today.
This was discussed on linux-kernel in the past and deemed acceptable.
The patch does not change that.
Let's look at the different seek variants:
SEEK_SET: Doesn't really need any locking.
If there's a race one writer wins, the other loses.
For 32bit the non atomic update races against read()
stay the same. Without a lock they can also happen
against write() now. The read() race was deemed
acceptable in past discussions, and I think if it's
ok for read it's ok for write too.
=> Don't need a lock.
SEEK_END: This behaves like SEEK_SET plus it reads
the maximum size too. Reading the maximum size would have the
32bit atomic problem. But luckily we already have a way to read
the maximum size without locking (i_size_read), so we
can just use that instead.
Without i_mutex there is no synchronization with write() anymore,
however since the write() update is atomic on 64bit it just behaves
like another racy SEEK_SET. On non atomic 32bit it's the same
as SEEK_SET.
=> Don't need a lock, but need to use i_size_read()
SEEK_CUR: This has a read-modify-write race window
on the same file. One could argue that any application
doing unsynchronized seeks on the same file is already broken.
But for the sake of not adding a regression here I'm
using the file->f_lock to synchronize this. Using this
lock is much better than the inode mutex because it doesn't
synchronize between processes.
=> So still need a lock, but can use f_lock.
This patch implements this new scheme in generic_file_llseek.
I dropped generic_file_llseek_unlocked and changed all callers.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
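The per-variant locking scheme from the commit above can be sketched in
userspace. All names are hypothetical: an atomic_flag spin loop stands
in for file->f_lock, size_read_sketch() stands in for i_size_read(),
and the maximum-size check is left out to keep the locking visible.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

struct inode_sketch {
	long long size;
};

struct file_lk_sketch {
	long long pos;
	atomic_flag f_lock;	/* per-file lock, not shared via the inode */
	struct inode_sketch *inode;
};

void file_lk_init_sketch(struct file_lk_sketch *f, struct inode_sketch *ino)
{
	f->pos = 0;
	f->inode = ino;
	atomic_flag_clear(&f->f_lock);
}

/* Stand-in for a lock-free size read. */
long long size_read_sketch(const struct inode_sketch *ino)
{
	return ino->size;
}

long long llseek_lockless_sketch(struct file_lk_sketch *f, long long offset,
				 int whence)
{
	long long ret;

	switch (whence) {
	case SEEK_SET:		/* no lock: one racing writer wins */
		break;
	case SEEK_END:		/* no lock, but size is read lock-free */
		offset += size_read_sketch(f->inode);
		break;
	case SEEK_CUR:		/* read-modify-write: take the file lock */
		while (atomic_flag_test_and_set(&f->f_lock))
			;	/* spin */
		offset += f->pos;
		if (offset < 0) {
			ret = -EINVAL;
		} else {
			f->pos = offset;
			ret = offset;
		}
		atomic_flag_clear(&f->f_lock);
		return ret;
	default:
		return -EINVAL;
	}
	if (offset < 0)
		return -EINVAL;
	f->pos = offset;
	return offset;
}
```

Only SEEK_CUR serializes, and only against other users of the same open
file, which is the property the commit message argues for.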
This doesn't change anything for the compiler, but hch thought it would
make the code clearer.
I moved the reference counting into its own little inline.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Add inlines to all the submission path functions. While this increases
code size it also gives gcc a lot of optimization opportunities
in this critical hotpath.
In particular -- together with some other changes -- this
allows gcc to get rid of the unnecessary clearing of
sdio at the beginning and optimize the messy parameter passing.
Any function that takes an sdio parameter and is not inlined
would break this optimization, because it cannot be done once the
address of the structure is taken.
Note that benefits are only seen with CONFIG_OPTIMIZE_INLINING
and CONFIG_CC_OPTIMIZE_FOR_SIZE both set to off.
This gives about 2.2% improvement on a large database benchmark
with a high IOPS rate.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Only a single b_private field in the map_bh buffer head is needed after
the submission path. Move map_bh separately to avoid storing
this information in the long term slab.
This avoids the weird 104 byte hole in struct dio_submit which also needed
to be cleared with memset early.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
A direct slab call is slightly faster than kmalloc and can be better cached
per CPU. It also avoids rounding to the next kmalloc slab.
In addition this enforces cache line alignment for struct dio to avoid
any false sharing.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fix most problems reported by pahole.
There is still a weird 104 byte hole after map_bh. I'm not sure what
causes this.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There's nothing on the stack, even before my changes.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This large, but largely mechanical, patch moves all fields in struct dio
that are only used in the submission path into a separate on-stack
data structure. This has the advantage that the memory is very likely
cache hot, which is not guaranteed for memory fresh out of kmalloc.
This also gives gcc more optimization potential because it can more
easily determine that there are no external aliases for these variables.
The sdio initialization is now an initializer instead of a memset.
This allows gcc to break sdio into individual fields and optimize
away unnecessary zeroing (after all the functions are inlined).
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
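The difference between the two ways of zeroing the on-stack structure
mentioned above can be shown with a small sketch. The struct layout and
function names are hypothetical; whether a given compiler actually
drops the zeroing depends on inlining and optimization level.

```c
#include <string.h>

/* Hypothetical subset of submission-path state. */
struct sdio_sketch {
	int blkbits;
	unsigned long long block_in_file;
	int pages_in_io;
	void *cur_page;
};

/* Old style: the structure is cleared through its address. */
int submit_memset_sketch(int blkbits)
{
	struct sdio_sketch sdio;

	memset(&sdio, 0, sizeof(sdio));
	sdio.blkbits = blkbits;
	return sdio.blkbits + sdio.pages_in_io;
}

/*
 * New style: an initializer zeroes all fields without taking the
 * address, so the compiler is free to treat each field as a separate
 * variable and skip stores that are never read.
 */
int submit_init_sketch(int blkbits)
{
	struct sdio_sketch sdio = { 0, };

	sdio.blkbits = blkbits;
	return sdio.blkbits + sdio.pages_in_io;
}
```

Both functions compute the same result; only the form of the zeroing
differs.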
We need to move the inode to the end of the list to actually make the
spinning prevention explained in the comment above it work. With a
plain list_move it will simply stay in place as we're always reclaiming
from the head of the list.
Signed-off-by: Christoph Hellwig <hch@lst.de>
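Why a plain list_move leaves the entry in place can be seen with a
minimal circular list in userspace. The helpers below are hypothetical
re-implementations that only mirror the semantics of list_move() and
list_move_tail(); they are not the kernel's list.h.

```c
struct lnode {
	struct lnode *next, *prev;
};

void list_init_sketch(struct lnode *head)
{
	head->next = head;
	head->prev = head;
}

static void list_del_sketch(struct lnode *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void list_insert_sketch(struct lnode *e, struct lnode *prev,
			       struct lnode *next)
{
	next->prev = e;
	e->next = next;
	e->prev = prev;
	prev->next = e;
}

void list_add_tail_sketch(struct lnode *e, struct lnode *head)
{
	list_insert_sketch(e, head->prev, head);
}

/* Like list_move(): unlink e and reinsert it right after head. */
void list_move_sketch(struct lnode *e, struct lnode *head)
{
	list_del_sketch(e);
	list_insert_sketch(e, head, head->next);
}

/* Like list_move_tail(): unlink e and reinsert it right before head. */
void list_move_tail_sketch(struct lnode *e, struct lnode *head)
{
	list_del_sketch(e);
	list_insert_sketch(e, head->prev, head);
}
```

When the entry being moved is already the first one, as it is when
reclaiming from the head, list_move_sketch() puts it straight back at
the front, while list_move_tail_sketch() sends it to the end so the
next pass sees a different entry.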
Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>