kernel-ark

Author	SHA1	Message	Date
Marcel Holtmann	e9c4bec63e	[Bluetooth] Make use of virtual devices tree The Bluetooth subsystem currently uses a platform device for devices with no parent. It is a better idea to use the new virtual devices tree for these. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2006-10-15 23:14:29 -07:00
Marcel Holtmann	df5c37ea9a	[Bluetooth] Handle return values from driver core functions Some return values of the driver core register and create functions are not handled and so might cause unexpected problems. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2006-10-15 23:14:28 -07:00
Marcel Holtmann	e9c5702e3c	[Bluetooth] Fix compat ioctl for BNEP, CMTP and HIDP There exists no attempt do deal with the fact that a structure with a uint32_t followed by a pointer is going to be different for 32-bit and 64-bit userspace. Any 32-bit process trying to use it will be failing with -EFAULT if it's lucky; suffering from having data dumped at a random address if it's not. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2006-10-15 23:14:27 -07:00
Jan Dittmer	39c850863d	[IPV6] sit: Add missing MODULE_LICENSE This is missing the MODULE_LICENSE statements and taints the kernel upon loading. License is obvious from the beginning of the file. Signed-off-by: Jan Dittmer <jdi@l4x.org> Signed-off-by: Joerg Roedel <joro-lkml@zlug.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:21 -07:00
YOSHIFUJI Hideaki	f1a95859a8	[IPV6]: Remove bogus WARN_ON in Proxy-NA handling. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:20 -07:00
Thomas Graf	adaa70bbdf	[IPv6] rules: Use RT6_LOOKUP_F_HAS_SADDR and fix source based selectors Fixes rt6_lookup() to provide the source address in the flow and sets RT6_LOOKUP_F_HAS_SADDR whenever it is present in the flow. Avoids unnecessary prefix comparisons by checking for a prefix length first. Fixes the rule logic to not match packets if a source selector has been specified but no source address is available. Thanks to Kim Nordlund <kim.nordlund@nokia.com> for working on this patch with me. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:19 -07:00
David S. Miller	918049f013	[XFRM]: Fix xfrm_state_num going negative. Missing counter bump when hashing in a new ACQ xfrm_state. Now that we have two spots to do the hash grow check, break it out into a helper function. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:18 -07:00
Eric Dumazet	4663afe2c8	[NET]: reduce sizeof(struct inet_peer), cleanup, change in peer_check_expire() 1) shrink struct inet_peer on 64 bits platforms.	2006-10-15 23:14:17 -07:00
Paul Moore	ea614d7f4f	NetLabel: the CIPSOv4 passthrough mapping does not pass categories correctly The CIPSO passthrough mapping had a problem when sending categories which would cause no or incorrect categories to be sent on the wire with a packet. This patch fixes the problem which was a simple off-by-one bug. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-10-15 23:14:16 -07:00
Paul Moore	044a68ed8a	NetLabel: only deref the CIPSOv4 standard map fields when using standard mapping Fix several places in the CIPSO code where it was dereferencing fields which did not have valid pointers by moving those pointer dereferences into code blocks where the pointers are valid. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-10-15 23:14:14 -07:00
Stephen Hemminger	1a620698c2	[BRIDGE]: flush forwarding table when device carrier off Flush the forwarding table when carrier is lost. This helps for availability because we don't want to forward to a downed device and new packets may come in on other links. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:13 -07:00
Pablo Neira Ayuso	9ea8cfd6aa	[NETFILTER]: ctnetlink: Remove debugging messages Remove (compilation-breaking) debugging messages introduced at early development stage. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:11 -07:00
Patrick McHardy	c08de5d530	[NETFILTER]: xt_CONNSECMARK: fix Kconfig dependencies CONNSECMARK needs conntrack, add missing dependency to fix linking error with CONNSECMARK=y and CONNTRACK=m. Reported by Toralf Förster <toralf.foerster@gmx.de>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:09 -07:00
Patrick McHardy	a9f54596fa	[NETFILTER]: ipt_ECN/ipt_TOS: fix incorrect checksum update Even though the tos field is only a single byte large, the values need to be converted to net-endian for the checkum update so they are in the corrent byte position. Also fix incorrect endian annotations. Reported by Stephane Chazelas <Stephane_Chazelas@yahoo.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:08 -07:00
Patrick McHardy	f603b6ec50	[NETFILTER]: arp_tables: missing unregistration on module unload Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:07 -07:00
Patrick McHardy	f64ad5bb04	[NETFILTER]: fix cut-and-paste error in exit functions Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:06 -07:00
Patrick McHardy	be60358e94	[DECNET]: Use correct config option for routing by fwmark in compare_keys() Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:05 -07:00
Akinbou Mita	30bdbe397b	[PKT_SCHED] sch_htb: use rb_first() cleanup Use rb_first() to get first entry in rb tree. Signed-off-by: Akinbou Mita <akinobu.mita@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-12 01:52:05 -07:00
Patrick McHardy	b974179abe	[RTNETLINK]: Fix use of wrong skb in do_getlink() skb is the netlink query, nskb is the reply message. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-12 01:50:30 -07:00
Patrick McHardy	52c41a3224	[DECNET]: Fix sfuzz hanging on 2.6.18 Dave Jones wrote: > sfuzz D 724EF62A 2828 28717 28691 (NOTLB) > cd69fe98 00000082 0000012d 724ef62a 0001971a 00000010 00000007 df6d22b0 > dfd81080 725bbc5e 0001971a 000cc634 00000001 df6d23bc c140e260 00000202 > de1d5ba0 cd69fea0 de1d5ba0 00000000 00000000 de1d5b60 de1d5b8c de1d5ba0 > Call Trace: > [<c05b1708>] lock_sock+0x75/0xa6 > [<e0b0b604>] dn_getname+0x18/0x5f [decnet] > [<c05b083b>] sys_getsockname+0x5c/0xb0 > [<c05b0b46>] sys_socketcall+0xef/0x261 > [<c0403f97>] syscall_call+0x7/0xb > DWARF2 unwinder stuck at syscall_call+0x7/0xb > > I wonder if the plethora of lockdep related changes inadvertantly broke something? Looks like unbalanced locking. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-12 01:48:20 -07:00
David S. Miller	8238b218ec	[NET]: Do not memcmp() over pad bytes of struct flowi. They are not necessarily initialized to zero by the compiler, for example when using run-time initializers of automatic on-stack variables. Noticed by Eric Dumazet and Patrick McHardy. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-12 00:49:15 -07:00
YOSHIFUJI Hideaki	9469c7b4aa	[NET]: Use typesafe inet_twsk() inline function instead of cast. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:58 -07:00
YOSHIFUJI Hideaki	496c98dff8	[NET]: Use hton{l,s}() for non-initializers. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:56 -07:00
YOSHIFUJI Hideaki	4244f8a9f8	[TCP]: Use TCPOLEN_TSTAMP_ALIGNED macro instead of magic number. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:54 -07:00
Joerg Roedel	0be669bb37	[IPV6]: Seperate sit driver to extra module (addrconf.c changes) This patch contains the changes to net/ipv6/addrconf.c to remove sit specific code if the sit driver is not selected. Signed-off-by: Joerg Roedel <joro-lkml@zlug.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:52 -07:00
Joerg Roedel	989e5b96e1	[IPV6]: Seperate sit driver to extra module This patch removes the driver of the IPv6-in-IPv4 tunnel driver (sit) from the IPv6 module. It adds an option to Kconfig which makes it possible to compile it as a seperate module. Signed-off-by: Joerg Roedel <joro-lkml@zlug.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:50 -07:00
Miklos Szeredi	effee6a000	[NET]: File descriptor loss while receiving SCM_RIGHTS If more than one file descriptor was sent with an SCM_RIGHTS message, and on the receiving end, after installing a nonzero (but not all) file descritpors the process runs out of fds, then the already installed fds will be lost (userspace will have no way of knowing about them). The following patch makes sure, that at least the already installed fds are sent to userspace. It doesn't solve the issue of losing file descriptors in case of an EFAULT on the userspace buffer. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:48 -07:00
Vlad Yasevich	6aa2551cf1	[SCTP]: Fix the RX queue size shown in /proc/net/sctp/assocs output. Show the true receive buffer usage. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:46 -07:00
Vlad Yasevich	331c4ee7fa	[SCTP]: Fix receive buffer accounting. When doing receiver buffer accounting, we always used skb->truesize. This is problematic when processing bundled DATA chunks because for every DATA chunk that could be small part of one large skb, we would charge the size of the entire skb. The new approach is to store the size of the DATA chunk we are accounting for in the sctp_ulpevent structure and use that stored value for accounting. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:44 -07:00
Venkat Yekkirala	3bccfbc7a7	IPsec: fix handling of errors for socket policies This treats the security errors encountered in the case of socket policy matching, the same as how these are treated in the case of main/sub policies, which is to return a full lookup failure. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-10-11 23:59:39 -07:00
Venkat Yekkirala	5b368e61c2	IPsec: correct semantics for SELinux policy matching Currently when an IPSec policy rule doesn't specify a security context, it is assumed to be "unlabeled" by SELinux, and so the IPSec policy rule fails to match to a flow that it would otherwise match to, unless one has explicitly added an SELinux policy rule allowing the flow to "polmatch" to the "unlabeled" IPSec policy rules. In the absence of such an explicitly added SELinux policy rule, the IPSec policy rule fails to match and so the packet(s) flow in clear text without the otherwise applicable xfrm(s) applied. The above SELinux behavior violates the SELinux security notion of "deny by default" which should actually translate to "encrypt by default" in the above case. This was first reported by Evgeniy Polyakov and the way James Morris was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. With this patch applied, SELinux "polmatching" of flows Vs. IPSec policy rules will only come into play when there's a explicit context specified for the IPSec policy rule (which also means there's corresponding SELinux policy allowing appropriate domains/flows to polmatch to this context). Secondly, when a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return errors other than access denied, such as -EINVAL. We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The solution for this is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). This patch: Fix the selinux side of things. This makes sure SELinux polmatching of flow contexts to IPSec policy rules comes into play only when an explicit context is associated with the IPSec policy rule. Also, this no longer defaults the context of a socket policy to the context of the socket since the "no explicit context" case is now handled properly. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-10-11 23:59:37 -07:00
James Morris	134b0fc544	IPsec: propagate security module errors up from flow_cache_lookup When a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return an access denied permission (or other error). We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The way I was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. The first SYNACK would be blocked, because of an uncached lookup via flow_cache_lookup(), which would fail to resolve an xfrm policy because the SELinux policy is checked at that point via the resolver. However, retransmitted SYNACKs would then find a cached flow entry when calling into flow_cache_lookup() with a null xfrm policy, which is interpreted by xfrm_lookup() as the packet not having any associated policy and similarly to the first case, allowing it to pass without transformation. The solution presented here is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). Signed-off-by: James Morris <jmorris@namei.org>	2006-10-11 23:59:34 -07:00
paul.moore@hp.com	ffb733c650	NetLabel: fix a cache race condition Testing revealed a problem with the NetLabel cache where a cached entry could be freed while in use by the LSM layer causing an oops and other problems. This patch fixes that problem by introducing a reference counter to the cache entry so that it is only freed when it is no longer in use. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-10-11 23:59:29 -07:00
Alexey Dobriyan	d136fe7243	[PATCH] Finish annotations of struct vlan_ethhdr Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-10 16:15:34 -07:00
Al Viro	cfbdbab063	[PATCH] net/sunrpc/auth_gss/svcauth_gss.c endianness regression Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-10 15:37:24 -07:00
Al Viro	86b95c1213	[PATCH] strndup() would better take size_t, not int Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-10 15:37:24 -07:00
Al Viro	5e7ddac75d	[PATCH] ptrdiff_t is %t, not %z Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-10 15:37:23 -07:00
Al Viro	28c4dadd3a	[PATCH] tipc __user annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-10 15:37:21 -07:00
NeilBrown	c6b0a9f87b	[PATCH] knfsd: tidy up up meaning of 'buffer size' in nfsd/sunrpc There is some confusion about the meaning of 'bufsz' for a sunrpc server. In some cases it is the largest message that can be sent or received. In other cases it is the largest 'payload' that can be included in a NFS message. In either case, it is not possible for both the request and the reply to be this large. One of the request or reply may only be one page long, which fits nicely with NFS. So we remove 'bufsz' and replace it with two numbers: 'max_payload' and 'max_mesg'. Max_payload is the size that the server requests. It is used by the server to check the max size allowed on a particular connection: depending on the protocol a lower limit might be used. max_mesg is the largest single message that can be sent or received. It is calculated as the max_payload, rounded up to a multiple of PAGE_SIZE, and with PAGE_SIZE added to overhead. Only one of the request and reply may be this size. The other must be at most one page. Cc: Greg Banks <gnb@sgi.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-06 08:53:41 -07:00
Linus Torvalds	fefd26b3b8	Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/configh * master.kernel.org:/pub/scm/linux/kernel/git/davej/configh: Remove all inclusions of <linux/config.h> Manually resolved trivial path conflicts due to removed files in the sound/oss/ subdirectory.	2006-10-04 09:59:57 -07:00
Linus Torvalds	d002ec481c	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [XFRM]: BEET mode [TCP]: Kill warning in tcp_clean_rtx_queue(). [NET_SCHED]: Remove old estimator implementation [ATM]: [zatm] always *pcr in alloc_shaper() [ATM]: [ambassador] Change the return type to reflect reality [ATM]: kmalloc to kzalloc patches for drivers/atm [TIPC]: fix printk warning [XFRM]: Clearing xfrm_policy_count[] to zero during flush is incorrect. [XFRM] STATE: Use destination address for src hash. [NEIGH]: always use hash_mask under tbl lock [UDP]: Fix MSG_PROBE crash [UDP6]: Fix flowi clobbering [NET_SCHED]: Revert "HTB: fix incorrect use of RB_EMPTY_NODE" [NETFILTER]: ebt_mark: add or/and/xor action support to mark target [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function [NETFILTER]: Honour source routing for LVS-NAT [NETFILTER]: add type parameter to ip_route_me_harder [NETFILTER]: Kconfig: fix xt_physdev dependencies	2006-10-04 08:26:19 -07:00
J.Bruce Fields	8f8e05c570	[PATCH] knfsd: svcrpc: use consistent variable name for the reply state The rpc reply has multiple levels of error returns. The code here contributes to the confusion by using "accept_statp" for a pointer to what the rfc (and wireshark, etc.) refer to as the "reply_stat". (The confusion is compounded by the fact that the rfc also has an "accept_stat" which follows the reply_stat in the succesful case.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:19 -07:00
J.Bruce Fields	5b304bc5bf	[PATCH] knfsd: svcrpc: gss: fix failure on SVC_DENIED in integrity case If the request is denied after gss_accept was called, we shouldn't try to wrap the reply. We were checking the accept_stat but not the reply_stat. To check the reply_stat in _release, we need a pointer to before (rather than after) the verifier, so modify body_start appropriately. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:19 -07:00
J.Bruce Fields	3c15a48664	[PATCH] knfsd: svcrpc: gss: factor out some common wrapping code Factor out some common code from the integrity and privacy cases. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:19 -07:00
Olaf Kirch	bc5fea4299	[PATCH] knfsd: register all RPC programs with portmapper by default The NFSACL patches introduced support for multiple RPC services listening on the same transport. However, only the first of these services was registered with portmapper. This was perfectly fine for nfsacl, as you traditionally do not want these to show up in a portmapper listing. The patch below changes the default behavior to always register all services listening on a given transport, but retains the old behavior for nfsacl services. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:19 -07:00
Greg Banks	7b2b1fee30	[PATCH] knfsd: knfsd: cache ipmap per TCP socket Speed up high call-rate workloads by caching the struct ip_map for the peer on the connected struct svc_sock instead of looking it up in the ip_map cache hashtable on every call. This helps workloads using AUTH_SYS authentication over TCP. Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16 synthetic client threads simulating an rsync (i.e. recursive directory listing) workload reading from an i386 RH9 install image (161480 regular files in 10841 directories) on the server. That tree is small enough to fill in the server's RAM so no disk traffic was involved. This setup gives a sustained call rate in excess of 60000 calls/sec before being CPU-bound on the server. Profiling showed strcmp(), called from ip_map_match(), was taking 4.8% of each CPU, and ip_map_lookup() was taking 2.9%. This patch drops both contribution into the profile noise. Note that the above result overstates this value of this patch for most workloads. The synthetic clients are all using separate IP addresses, so there are 64 entries in the ip_map cache hash. Because the kernel measured contained the bug fixed in commit commit `1f1e030bf7` and was running on 64bit little-endian machine, probably all of those 64 entries were on a single chain, thus increasing the cost of ip_map_lookup(). With a modern kernel you would need more clients to see the same amount of performance improvement. This patch has helped to scale knfsd to handle a deployment with 2000 NFS clients. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:16 -07:00
Greg Banks	7adae489fe	[PATCH] knfsd: Prepare knfsd for support of rsize/wsize of up to 1MB, over TCP The limit over UDP remains at 32K. Also, make some of the apparently arbitrary sizing constants clearer. The biggest change here involves replacing NFSSVC_MAXBLKSIZE by a function of the rqstp. This allows it to be different for different protocols (udp/tcp) and also allows it to depend on the servers declared sv_bufsiz. Note that we don't actually increase sv_bufsz for nfs yet. That comes next. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:16 -07:00
NeilBrown	3cc03b164c	[PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom .. by allocating the array of 'kvec' in 'struct svc_rqst'. As we plan to increase RPCSVC_MAXPAGES from 8 upto 256, we can no longer allocate an array of this size on the stack. So we allocate it in 'struct svc_rqst'. However svc_rqst contains (indirectly) an array of the same type and size (actually several, but they are in a union). So rather than waste space, we move those arrays out of the separately allocated union and into svc_rqst to share with the kvec moved out of svc_tcp_recvfrom (various arrays are used at different times, so there is no conflict). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:15 -07:00
NeilBrown	4452435948	[PATCH] knfsd: Replace two page lists in struct svc_rqst with one We are planning to increase RPCSVC_MAXPAGES from about 8 to about 256. This means we need to be a bit careful about arrays of size RPCSVC_MAXPAGES. struct svc_rqst contains two such arrays. However the there are never more that RPCSVC_MAXPAGES pages in the two arrays together, so only one array is needed. The two arrays are for the pages holding the request, and the pages holding the reply. Instead of two arrays, we can simply keep an index into where the first reply page is. This patch also removes a number of small inline functions that probably server to obscure what is going on rather than clarify it, and opencode the needed functionality. Also remove the 'rq_restailpage' variable as it is always 0. i.e. if the response 'xdr' structure has a non-empty tail it is always in the same pages as the head. check counters are initilised and incr properly check for consistant usage of ++ etc maybe extra some inlines for common approach general review Signed-off-by: Neil Brown <neilb@suse.de> Cc: Magnus Maatta <novell@kiruna.se> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:15 -07:00
NeilBrown	5680c44632	[PATCH] knfsd: Fixed handling of lockd fail when adding nfsd socket Arrgg.. We cannot 'lockd_up' before 'svc_addsock' as we don't know the protocol yet.... So switch it around again and save the name of the created sockets so that it can be closed if lock_up fails. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:15 -07:00

1 2 3 4 5 ...

3231 Commits