Fix memory corruption and crash on 32-bit x86 systems.
If a !PAE x86 kernel is booted on a 32-bit system with more than 4GB of
RAM, then we call memory_present() with a start/end that goes outside
the scope of MAX_PHYSMEM_BITS.
That causes this loop to happily walk over the limit of the sparse
memory section map:
	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
		unsigned long section = pfn_to_section_nr(pfn);
		struct mem_section *ms;

		sparse_index_init(section, nid);
		set_section_nid(section, nid);

		ms = __nr_to_section(section);
		if (!ms->section_mem_map)
			ms->section_mem_map = sparse_encode_early_nid(nid) |
						SECTION_MARKED_PRESENT;
	}
'ms' will be out of bounds and we'll corrupt a small amount of memory by
encoding the node ID and writing SECTION_MARKED_PRESENT (==0x1) over it.
The corruption might happen when encoding a non-zero node ID, or due to
the SECTION_MARKED_PRESENT which is 0x1:
mmzone.h:#define SECTION_MARKED_PRESENT (1UL<<0)
The fix is to sanity check anything the architecture passes to
sparsemem.
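A minimal userspace sketch of such a sanity check follows. It is an
illustration only: the values of PAGE_SHIFT and MAX_PHYSMEM_BITS are
assumed, and clamp_pfn_range() is a hypothetical helper, not the
function touched by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only (assumed, not taken from any kernel config). */
#define PAGE_SHIFT        12
#define MAX_PHYSMEM_BITS  32

/*
 * Hypothetical helper: trim [*start, *end) so that it stays below the
 * highest page frame number the section map can describe.
 * Returns false if the whole range lies outside that limit.
 */
static bool clamp_pfn_range(unsigned long long *start, unsigned long long *end)
{
    const unsigned long long max_arch_pfn =
        1ULL << (MAX_PHYSMEM_BITS - PAGE_SHIFT);

    assert(*start <= *end);

    if (*start >= max_arch_pfn)
        return false;           /* nothing usable: ignore the range */
    if (*end > max_arch_pfn)
        *end = max_arch_pfn;    /* trim the tail of the range */
    return true;
}
```

With these assumed values the limit is PFN 0x100000, so a range ending
at 0x180000 would be trimmed and a range starting at 0x100000 would be
ignored entirely.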
This bug seems to be rather old (as old as sparsemem support itself),
but the exact incarnation depended on random details like configs, which
made this bug more prominent in v2.6.25-to-be.
An additional enhancement might be to print a warning about ignored or
trimmed memory ranges.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Christoph Lameter <clameter@sgi.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <Yinghai.Lu@sun.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The signal trampolines were accidentally flushing the kernel I$ instead of
the user's. Fix that up, and also add a missing user D$ flush while
we're at it.
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When I cleaned up printk() and split up the printk locking logic in
commit 266c2e0abe ("Make printk() console
semaphore accesses sensible") I had incorrectly moved the call to
have_callable_console() outside of the console semaphore.
That was buggy. The console semaphore protects the console_drivers list
that is used by have_callable_console().
Thanks go to Bongani Hlope who saw this as a hang on shutdown and reboot
and bisected the bug to the right commit, and tested this patch. See
http://lkml.org/lkml/2008/4/11/315
Bisected-and-tested-by: Bongani Hlope <bonganilinux@mweb.co.za>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/sh/kernel/traps_32.c: In function `do_reserved_inst':
arch/sh/kernel/traps_32.c:667: error: implicit declaration of function `do_fpu_inst'
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
commit 54a0151041 broke zImage build on sh arch:
LD vmlinux
SYSMAP System.map
SYSMAP .tmp_System.map
AS arch/sh/boot/compressed/head_32.o
In file included from /k/arch/sh/boot/compressed/head_32.S:11:
/k/include/linux/linkage.h:34: error: syntax error in macro parameter list
Fix it for both sh and sh64.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch fixes some compile errors due to missing save_fpu()
prototypes on sh64 caused by
commit 9bbafce2ee
(sh: Fix occasional FPU register corruption under preempt).
Signed-off-by: Adrian Bunk <adrian.bunk@movial.fi>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This fixes a regression introduced in commit
205c109a7a when switching to
write_begin/write_end operations in JFFS2.
The page offset is miscalculated, leading to corruption of the fragment
lists and subsequently to memory corruption and panics.
[ Side note: the bug is a fairly direct result of the naming. Nick was
likely misled by the use of "offs", since we tend to use the notion of
"offset" not as an absolute position, but as an offset _within_ a page
or allocation.
Alternatively, a "pgoff_t" is a page index, but not a byte offset -
our VM naming can be a bit confusing.
So in this case, a VM person would likely have called this a "pos",
not an "offs", or perhaps talked about byte offsets rather than page
offsets (since it's counted in bytes, not pages). - Linus ]
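To make the naming distinction concrete, here is a small sketch that
splits an absolute byte position into a page index and an offset within
that page. The 4 KiB page size and the helper names are assumptions for
illustration; this is not the JFFS2 code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                       /* assumed: 4 KiB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Page index ("pgoff") containing the absolute byte position 'pos'. */
static uint64_t pos_to_index(uint64_t pos)
{
    return pos >> PAGE_SHIFT;
}

/* Byte offset of 'pos' inside its page; always smaller than PAGE_SIZE. */
static unsigned long pos_to_page_offset(uint64_t pos)
{
    return (unsigned long)(pos & (PAGE_SIZE - 1));
}

/* Inverse mapping: rebuild the absolute position from index and offset. */
static uint64_t index_offset_to_pos(uint64_t index, unsigned long offset)
{
    assert(offset < PAGE_SIZE);
    return (index << PAGE_SHIFT) | offset;
}
```

Mixing up the absolute position with the in-page offset is exactly the
kind of confusion the side note describes.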
Signed-off-by: Alexey Korolev <akorolev@infradead.org>
Signed-off-by: Vasiliy Leonenko <vasiliy.leonenko@mail.ru>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Miklos Szeredi found the bug:
"Basically what happens is that on the server nlm_fopen() calls
nfsd_open() which returns -EACCES, to which nlm_fopen() returns
NLM_LCK_DENIED.
"On the client this will turn into a -EAGAIN (nlm_stat_to_errno()),
which in turn will cause fcntl_setlk() to retry forever."
So, for example, opening a file on an nfs filesystem, changing
permissions to forbid further access, then trying to lock the file,
could result in an infinite loop.
And Trond Myklebust identified the culprit, from Marc Eshel and me:
7723ec9777 "locks: factor out
generic/filesystem switch from setlock code"
That commit claimed to just be reshuffling code, but actually introduced
a behavioral change by calling the lock method repeatedly as long as it
returned -EAGAIN.
We assumed this would be safe, since we assumed a lock of type SETLKW
would only return with either success or an error other than -EAGAIN.
However, nfs can in fact return -EAGAIN in this situation, and
independently of whether that behavior is correct or not, we don't
actually need this change, and it seems far safer not to depend on such
assumptions about the filesystem's ->lock method.
Therefore, revert the problematic part of the original commit. This
leaves vfs_lock_file() and its other callers unchanged, while returning
fcntl_setlk and fcntl_setlk64 to their former behavior.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] make ali_atapi_dma static
[libata] sata_svw: fix reversed port count
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (31 commits)
[BRIDGE]: Fix crash in __ip_route_output_key with bridge netfilter
[NETFILTER]: ipt_CLUSTERIP: fix race between clusterip_config_find_get and _entry_put
[IPV6] ADDRCONF: Don't generate temporary address for ip6-ip6 interface.
[IPV6] ADDRCONF: Ensure disabling multicast RS even if privacy extensions are disabled.
[IPV6]: Use appropriate sock tclass setting for routing lookup.
[IPV6]: IPv6 extension header structures need to be packed.
[IPV6]: Fix ipv6 address fetching in raw6_icmp_error().
[NET]: Return more appropriate error from eth_validate_addr().
[ISDN]: Do not validate ISDN net device address prior to interface-up
[NET]: Fix kernel-doc for skb_segment
[SOCK] sk_stamp: should be initialized to ktime_set(-1L, 0)
net: check for underlength tap writes
net: make struct tun_struct private to tun.c
[SCTP]: IPv4 vs IPv6 addresses mess in sctp_inet[6]addr_event.
[SCTP]: Fix compiler warning about const qualifiers
[SCTP]: Fix protocol violation when receiving an error lenght INIT-ACK
[SCTP]: Add check for hmac_algo parameter in sctp_verify_param()
[NET_SCHED] cls_u32: refcounting fix for u32_delete()
[DCCP]: Fix skb->cb conflicts with IP
[AX25]: Potential ax25_uid_assoc-s leaks on module unload.
...
Correctly determine the address of an illegal instruction. The EPCR0 register
holds this value (masked by EPCR0_PC) if the validity bit is set (masked by
EPCR0_V). So the test as to whether the contents of the register are usable
should involve checking the _V bit, not the _PC bits.
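As a sketch of the corrected test, assuming an illustrative bit layout
(the mask values below are assumptions, and epcr0_fault_address() is a
hypothetical helper rather than the code changed here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define EPCR0_V   0x00000001u   /* validity bit */
#define EPCR0_PC  0xfffffffcu   /* address bits */

/*
 * Store the faulting address in *pc and return true only if the
 * validity bit says the register contents are usable.
 */
static bool epcr0_fault_address(uint32_t epcr0, uint32_t *pc)
{
    assert(pc != NULL);

    if (!(epcr0 & EPCR0_V))
        return false;           /* contents are not usable */
    *pc = epcr0 & EPCR0_PC;
    return true;
}
```

Testing the address bits instead would reject a valid register whose
address field happens to be zero and accept stale address bits when
the validity bit is clear.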
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
revert "sched: fix fair sleepers" (e22ecef1d2),
because it is causing audio skipping, see:
http://bugzilla.kernel.org/show_bug.cgi?id=10428
the patch is correct and the real cause of the skipping is not
understood (tracing makes it go away), but time has run out so we'll
revert it and re-try in 2.6.26.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The bridge netfilter code attaches a fake dst_entry with a pointer to a
fake net_device structure to skbs it passes up to IPv4 netfilter. This
leads to crashes when the skb is passed to __ip_route_output_key when
dereferencing the namespace pointer.
Since bridging can currently only operate in the init_net namespace,
the easiest fix for now is to initialize the nd_net pointer of the
fake net_device struct to &init_net.
Should fix bugzilla 10323: http://bugzilla.kernel.org/show_bug.cgi?id=10323
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Consider we are putting a clusterip_config entry with the "entries"
count == 1, and on the other CPU there's a clusterip_config_find_get
in progress:
  CPU1:                                    CPU2:
  clusterip_config_entry_put:              clusterip_config_find_get:
  if (atomic_dec_and_test(&c->entries)) {
          /* true */
                                           read_lock_bh(&clusterip_lock);
                                           c = __clusterip_config_find(clusterip);
                                           /* found - it's still in list */
                                           ...
                                           atomic_inc(&c->entries);
                                           read_unlock_bh(&clusterip_lock);
          write_lock_bh(&clusterip_lock);
          list_del(&c->list);
          write_unlock_bh(&clusterip_lock);
          ...
          dev_put(c->dev);
Oops! We now have an entry returned by clusterip_config_find_get
that is a) no longer in the list and b) has a stale dev pointer.
The trouble starts when CPU2 releases the entry: it removes it from
the list a second time, corrupting the list, and puts the stale dev
pointer.
The fix is to call atomic_dec_and_test under the clusterip_lock.
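The locking rule can be sketched in userspace as follows. This is a
simplified model under stated assumptions: a pthread mutex stands in
for clusterip_lock, a plain counter for the atomic, a boolean for list
membership, and all names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified config entry; not the kernel structure. */
struct config_sketch {
    int  entries;   /* reference count, protected by list_lock */
    bool on_list;   /* stands in for list membership */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lookup: take a reference only while the entry is still on the list. */
static bool config_find_get(struct config_sketch *c)
{
    bool found;

    assert(c != NULL);
    pthread_mutex_lock(&list_lock);
    found = c->on_list;
    if (found)
        c->entries++;
    pthread_mutex_unlock(&list_lock);
    return found;
}

/*
 * Release: the decrement and the unlink happen under the same lock the
 * lookup takes, so a lookup can never find an entry whose count has
 * already dropped to zero. Returns true if the last reference went away.
 */
static bool config_entry_put(struct config_sketch *c)
{
    bool last;

    assert(c != NULL);
    pthread_mutex_lock(&list_lock);
    last = (--c->entries == 0);
    if (last)
        c->on_list = false;
    pthread_mutex_unlock(&list_lock);
    return last;
}
```

Because the count reaching zero and the removal from the list form one
critical section, the window shown in the CPU1/CPU2 trace is closed.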
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
As far as I can remember, I was going to disable privacy extensions
on all "tunnel" interfaces. Disable it on ip6-ip6 interface as well.
Also, just remove ifdefs for SIT for simplicity.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct ipv6_opt_hdr is the common structure for IPv6 extension
headers, and it is common to increment the pointer to get
the real content. On the other hand, since the structure
consists only of 1-byte next-header field and 1-byte length
field, size of that structure depends on architecture; 2 or 4.
Add "packed" attribute to get 2.
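A small sketch of the effect, using a hypothetical two-byte header
struct (the attribute syntax is the GCC/Clang extension; the struct and
helper names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Two one-byte fields; "packed" is a GCC/Clang extension. */
struct opt_hdr_sketch {
    uint8_t nexthdr;
    uint8_t hdrlen;
} __attribute__((packed));

/* Compile-time check that no padding was added on this target. */
_Static_assert(sizeof(struct opt_hdr_sketch) == 2, "header must be 2 bytes");

/* Step past the fixed header to reach the option payload. */
static const uint8_t *opt_payload(const struct opt_hdr_sketch *hdr)
{
    return (const uint8_t *)(hdr + 1);
}
```

Pointer arithmetic such as hdr + 1 only lands on the payload if the
structure size is exactly 2 on every architecture.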
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes kernel bugzilla 10437
Based almost entirely upon a patch by Dmitry Butskoy.
When deciding what raw sockets to deliver the ICMPv6
to, we should use the addresses in the ICMPv6 quoted
IPV6 header, not the top-level one.
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Bolle wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9923 would have been much easier to
> track down if eth_validate_addr() would somehow complain aloud if an address
> is invalid. Shouldn't it make at least some noise?
I guess it should return -EADDRNOTAVAIL similar to eth_mac_addr()
when validation fails.
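A userspace sketch of that behaviour, assuming the usual rule that a
usable address is neither all-zero nor multicast. The function names
are hypothetical stand-ins, not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* A usable unicast MAC is neither all-zero nor multicast (bit 0 of byte 0). */
static bool mac_is_valid(const uint8_t addr[ETH_ALEN])
{
    bool all_zero = true;

    for (size_t i = 0; i < ETH_ALEN; i++)
        if (addr[i] != 0)
            all_zero = false;
    return !all_zero && !(addr[0] & 0x01);
}

/* Hypothetical validation hook: 0 on success, -EADDRNOTAVAIL otherwise. */
static int validate_addr_sketch(const uint8_t addr[ETH_ALEN])
{
    assert(addr != NULL);
    return mac_is_valid(addr) ? 0 : -EADDRNOTAVAIL;
}
```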
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit bada339 (Validate device addr prior to interface-up) caused a regression
in the ISDN network code, see: http://bugzilla.kernel.org/show_bug.cgi?id=9923
The trivial fix is to remove the pointer to eth_validate_addr() in the
net_device struct in isdn_net_init().
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel-doc comment for skb_segment is clearly wrong. This states
what it actually does.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the user gives a packet under 14 bytes, we'll end up reading off the end
of the skb (not oopsing, just reading off the end).
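A sketch of the length check, in userspace and with hypothetical names.
The -EINVAL return value is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN 14   /* destination MAC + source MAC + EtherType */

/*
 * Read the EtherType from a frame supplied by the user. Frames shorter
 * than an Ethernet header are rejected before any header byte is read.
 */
static int frame_ethertype(const uint8_t *frame, size_t len, uint16_t *type)
{
    assert(frame != NULL && type != NULL);

    if (len < ETH_HLEN)
        return -EINVAL;   /* too short to hold an Ethernet header */
    *type = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}
```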
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Max Krasnyanskiy <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no reason for this to be in the header, and it just hurts
recompile time.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Max Krasnyanskiy <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All IP addresses that are present in a system are duplicated on
struct sctp_sockaddr_entry. They are linked in the global list
called sctp_local_addr_list. And this struct unions IPv4 and IPv6
addresses.
So there can be a rare case when a sockaddr_in.sin_addr coincides
with the corresponding part of the sockaddr_in6 and the notifier
for IPv4 will carry away an IPv6 entry.
The fix is to check the family before comparing the addresses.
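The idea can be sketched with the standard socket address types. The
union and helper below are hypothetical stand-ins, not the SCTP
structures.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical stand-in for a union holding either address family. */
union addr_sketch {
    struct sockaddr     sa;
    struct sockaddr_in  v4;
    struct sockaddr_in6 v6;
};

/* True only if the entry is an IPv4 address equal to 'needle'. */
static bool entry_matches_v4(const union addr_sketch *entry,
                             struct in_addr needle)
{
    assert(entry != NULL);

    if (entry->sa.sa_family != AF_INET)
        return false;   /* family first: never compare against IPv6 bytes */
    return entry->v4.sin_addr.s_addr == needle.s_addr;
}
```

Without the family test, the four bytes that overlay sin_addr inside an
IPv6 entry could match an IPv4 address by coincidence.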
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix 3 warnings about discarding const qualifiers:
net/sctp/ulpevent.c:862: warning: passing argument 1 of 'sctp_event2skb' discards qualifiers from pointer target type
net/sctp/sm_statefuns.c:4393: warning: passing argument 1 of 'SCTP_ASOC' discards qualifiers from pointer target type
net/sctp/socket.c:5874: warning: passing argument 1 of 'cmsg_nxthdr' discards qualifiers from pointer target type
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an INIT-ACK with an invalid length is received during COOKIE-WAIT,
a 0-vtag ABORT is sent in response. This apparently violates the
protocol. This patch does the following:
1 If the INIT-ACK contains all the fixed parameters, use init-tag
recorded from INIT-ACK as vtag.
2 If the INIT-ACK doesn't contain all the fixed parameters,
just reflect its vtag.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC 4895 has the following text:
The HMAC algorithm based on SHA-1 MUST be supported and
included in the HMAC-ALGO parameter.
As a result, we need to check in sctp_verify_param() that HMAC_SHA1 is
present in the list. If not, we should probably treat this as a
protocol violation.
It should also be a protocol violation if the HMAC parameter is empty.
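A sketch of the check on an already-parsed identifier list. The helper
name and the host-byte-order array are assumptions for illustration;
the real parameter parsing is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HMAC_ID_SHA1 1   /* HMAC identifier for SHA-1 */

/*
 * Hypothetical check on a list of HMAC identifiers (host byte order).
 * An empty list, or one that lacks SHA-1, is rejected.
 */
static bool hmac_algo_list_ok(const uint16_t *ids, size_t n)
{
    assert(ids != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        if (ids[i] == HMAC_ID_SHA1)
            return true;
    return false;
}
```

An empty parameter falls out of the same loop: with no identifiers
there is nothing to match, so the list is rejected.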
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Deleting of nonroot hnodes mostly doesn't work in u32_delete():
refcnt == 1 is expected, but such hnodes' refcnts are initialized
with 0 and charged only with "link" nodes. Now they'll start with
1 like usual. Thanks to Patrick McHardy for an improving suggestion.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_queue_xmit() and the other IP output functions expect to get a skb
with clear or properly initialized skb->cb. Unlike TCP and UDP, the
dccp_skb_cb doesn't contain a struct inet_skb_parm at the beginning,
so the DCCP-specific data is interpreted by the IP output functions.
This can cause false negatives for the conditional POST_ROUTING hook
invocation, making the packet bypass the hook.
Add an inet_skb_parm/inet6_skb_parm union to the beginning of
dccp_skb_cb to avoid clashes. Also add a BUILD_BUG_ON to make
sure it fits in the cb.
[ Combined with patch from Gerrit Renker to remove two now unnecessary
memsets of IPCB(skb)->opt ]
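The layout idea can be sketched as follows. Every type here is a
hypothetical stand-in and the 48-byte control buffer size is an
assumption; the point is the leading union plus a build-time size check
in the spirit of BUILD_BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

#define CB_SIZE 48   /* assumed size of the per-packet control buffer */

/* Hypothetical stand-ins for the IPv4/IPv6 per-packet parameters. */
struct ip4_parm_sketch { unsigned char opt[16]; };
struct ip6_parm_sketch { unsigned short iif, nhoff; };

/* Protocol-private data placed after the area the IP layer interprets. */
struct proto_cb_sketch {
    union {
        struct ip4_parm_sketch h4;
        struct ip6_parm_sketch h6;
    } header;
    unsigned int  seq;
    unsigned char type;
};

/* Build-time guard: fail compilation if the layout outgrows the buffer. */
_Static_assert(sizeof(struct proto_cb_sketch) <= CB_SIZE, "cb too small");
```

With the union first, whatever the IP output path reads at the start of
the buffer is IP state rather than protocol-private fields.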
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ax25_uid_free call walks the ax25_uid_list and releases entries
from it. The problem is that after the first call to hlist_del_init
the hlist_for_each_entry (which hides behind the ax25_uid_for_each)
will consider the current position to be the last and will return.
Thus, the remainder of the list is never freed.
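The general pattern for freeing a list while walking it is to remember
the successor before the current node is unlinked. A userspace sketch
with a hypothetical singly linked list (not the kernel hlist API):

```c
#include <assert.h>
#include <stdlib.h>

struct node_sketch {
    struct node_sketch *next;
    int uid;
};

/* Push a new node on the front of the list; returns 0, or -1 on OOM. */
static int list_push(struct node_sketch **head, int uid)
{
    struct node_sketch *n = malloc(sizeof(*n));

    if (!n)
        return -1;
    n->uid = uid;
    n->next = *head;
    *head = n;
    return 0;
}

/* Free every node; returns the number of nodes freed. */
static int list_free_all(struct node_sketch **head)
{
    struct node_sketch *n, *next;
    int freed = 0;

    assert(head != NULL);
    for (n = *head; n != NULL; n = next) {
        next = n->next;   /* remember the successor before unlinking */
        n->next = NULL;   /* mimics a delete that clears the node's links */
        free(n);
        freed++;
    }
    *head = NULL;
    return freed;
}
```

If the loop read n->next only after the links were cleared, it would
stop after the first node, which is the failure described above.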
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver stores the PCI resource addresses in 'unsigned long' variables
before calling ioremap_nocache() on them. This guarantees a kernel oops when the
registers are accessed on PPC 44x platforms which (being 32-bit) have PCI
memory space mapped beyond 4 GB.
The arch/ppc/ kernel has a fixup in ioremap() that creates an illusion that
the PCI memory resource is mapped below 4 GB, but arch/powerpc/ code got rid
of this trick, having instead CONFIG_RESOURCES_64BIT enabled.
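The truncation can be shown in a few lines. Since 'unsigned long' is 64
bits wide on many build hosts, the sketch models the 32-bit case with
uint32_t; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models a 32-bit 'unsigned long' regardless of the build host. */
typedef uint32_t ulong32_sketch;
/* Wide enough to hold a resource address above 4 GB. */
typedef uint64_t resource_addr_sketch;

/* What survives when a 64-bit address is stored in a 32-bit variable. */
static ulong32_sketch store_in_ulong32(resource_addr_sketch addr)
{
    return (ulong32_sketch)addr;   /* upper 32 bits are silently dropped */
}

/* True if the address can be stored in 32 bits without loss. */
static bool address_survives(resource_addr_sketch addr)
{
    return (resource_addr_sketch)store_in_ulong32(addr) == addr;
}
```

An address such as 0x180000000 becomes 0x80000000 after the store, so
the subsequent mapping would point at the wrong physical location.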
[ Bump driver version and release date -DaveM ]
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PPP support in generic HDLC in Linux 2.6.25 is broken and will cause
a kernel panic when a device configured in PPP mode is activated.
It will be replaced by new PPP implementation after Linux 2.6.25 is
released.
This affects only PPP support in generic HDLC (mostly Hitachi SCA
and SCA-II based drivers, wanxl, and a few others). Standalone syncppp
and async PPP support are not affected.
Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch fixes two weaknesses in send/receive packet handling which may
lead to kernel panics during DLPAR memory add operations.
Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This critical patch fixes a mac address issue recently introduced. If the
device's mac address was in correct order and the flag
NVREG_TRANSMITPOLL_MAC_ADDR_REV was set, during nv_remove the flag would get
cleared. During next load, the mac address would get reversed because the
flag is missing.
As it has been indicated previously, the flag is cleared across a low power
transition. Therefore, the driver should set the mac address back into the
reversed order when clearing the flag.
Also, the driver should set back the flag after a low power transition to
protect against kexec command calling nv_probe a second time.
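For reference, the byte reversal itself is a simple in-place swap. The
sketch below shows only that operation, with a hypothetical helper
name; the register and flag handling of the driver is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Reverse the byte order of a MAC address in place. */
static void mac_reverse(uint8_t mac[ETH_ALEN])
{
    assert(mac != NULL);

    for (size_t i = 0; i < ETH_ALEN / 2; i++) {
        uint8_t tmp = mac[i];

        mac[i] = mac[ETH_ALEN - 1 - i];
        mac[ETH_ALEN - 1 - i] = tmp;
    }
}
```

Reversing twice restores the original order, which is why the flag and
the stored byte order have to be kept in step with each other.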
Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Call phy_disconnect() on remove routine. Otherwise the phy timer
causes a kernel crash when unloading.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The new Fixed PHY method, the fixed-link property, isn't
implemented for ucc_geth, which makes fixed PHYs non-functional.
Add support for the new method to restore the Fixed PHY
functionality.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Any usage of sky2 on new Yukon Supreme would cause a NULL dereference.
The chip is very new, so the support is still untested; vendor has
not sent any eval hardware.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch makes the needlessly global ali_atapi_dma static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
According to Broadcom, two chips have their port counts flipped. The proper
count is:
0x241 is 8 ports
0x242 is 4 ports
Reported by Yohei Honda on kernel bz 10424.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* 'docs' of git://git.lwn.net/linux-2.6:
Add additional examples in Documentation/spinlocks.txt
Move sched-rt-group.txt to scheduler/
Documentation: move rpc-cache.txt to filesystems/
Documentation: move nfsroot.txt to filesystems/
Spell out behavior of atomic_dec_and_lock() in kerneldoc
Fix a typo in highres.txt
Fixes to the seq_file document
Fill out information on patch tags in SubmittingPatches
Add the seq_file documentation