Commit Graph

27 Commits

Author SHA1 Message Date
Sage Weil
5dfc589a84 ceph: unregister bdi before kill_anon_super releases device name
Unregister and destroy the bdi in put_super, after mount is r/o, but before
put_anon_super releases the device name.

For symmetry, bdi_destroy in destroy_client (we bdi_init in create_client).

Only set s_bdi if bdi_register succeeds, since we use it to decide whether
to bdi_unregister.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-04 16:14:46 -07:00
Sage Weil
c8f16584ac ceph: print more useful version info on module load
Decouple the client version from the server side.  Print relevant protocol
and map version info instead.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-03 10:49:23 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Yehuda Sadeh
422d2cb8f9 ceph: reset osd after relevant messages timed out
This simplifies the process of timing out messages. We
keep lru of current messages that are in flight. If a
timeout has passed, we reset the osd connection, so that
messages will be retransmitted.  This is a failsafe in case
we hit some sort of problem sending out message to the OSD.
Normally, we'll get notification via an updated osdmap if
there are problems.

If a request is older than the keepalive timeout, send a
keepalive to ensure we detect any breaks in the TCP connection.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-04 11:26:35 -08:00
Sage Weil
85ccce43a3 ceph: clean up readdir caps reservation
Use a global counter for the minimum number of allocated caps instead of
hard coding a check against readdir_max.  This takes into account multiple
client instances, and avoids examining the superblock mount options when a
cap is dropped.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-17 10:02:43 -08:00
Yehuda Sadeh
f5a2041bd9 ceph: put unused osd connections on lru
Instead of removing osd connection immediately when the
requests list is empty, put the osd connection on an lru.
Only if that osd has not been used for more than a specified
time, will it be removed.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-11 11:48:48 -08:00
Sage Weil
9bd2e6f8ba ceph: allow renewal of auth credentials
Add infrastructure to allow the mon_client to periodically renew its auth
credentials.  Also add a messenger callback that will force such a renewal
if a peer rejects our authenticator.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-10 15:04:47 -08:00
Sage Weil
e0e3271074 ceph: only unregister registered bdi
Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-23 08:17:18 -08:00
Yehuda Sadeh
2baba25019 ceph: writeback congestion control
Set bdi congestion bit when amount of write data in flight exceeds adjustable
threshold.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-21 16:39:56 -08:00
Sage Weil
9ec7cab14e ceph: hex dump corrupt server data to KERN_DEBUG
Also, print fsid using standard format, NOT hex dump.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-21 16:39:52 -08:00
Yehuda Sadeh
dc14657c9c ceph: mount fails immediately on error
Signed-off-by: Yehuda Sadeh <yehuda@newdream.net>
2009-11-20 14:24:46 -08:00
Sage Weil
0743304d87 ceph: fix debugfs entry, simplify fsid checks
We may first learn our fsid from any of the mon, osd, or mds maps
(whichever the monitor sends first).  Consolidate checks in a single
helper.  Initialize the client debugfs entry then, since we need the
fsid (and global_id) for the directory name.

Also remove dead mount code.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-20 14:24:27 -08:00
Sage Weil
b9bfb93ce2 ceph: move mempool creation to ceph_create_client
Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-18 16:20:08 -08:00
Sage Weil
4e7a5dcd1b ceph: negotiate authentication protocol; implement AUTH_NONE protocol
When we open a monitor session, we send an initial AUTH message listing
the auth protocols we support, our entity name, and (possibly) a previously
assigned global_id.  The monitor chooses a protocol and responds with an
initial message.

Initially implement AUTH_NONE, a dummy protocol that provides no security,
but works within the new framework.  It generates 'authorizers' that are
used when connecting to (mds, osd) services that simply state our entity
name and global_id.

This is a wire protocol change.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-18 16:19:57 -08:00
Sage Weil
5f44f14260 ceph: handle errors during osd client init
Unwind initializing if we get ENOMEM during client initialization.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-18 15:02:36 -08:00
Sage Weil
6a18be16f7 ceph: fix sparse endian warning
Use the __le macro, even though for -1 it doesn't matter.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-04 16:36:12 -08:00
Sage Weil
859e7b1493 ceph: init/destroy bdi in client create/destroy helpers
This keeps bdi setup/teardown in line with client life cycle.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-02 09:32:47 -08:00
Sage Weil
6b8051855d ceph: allocate and parse mount args before client instance
This simplifies much of the error handling during mount.  It also means
that we have the mount args before client creation, and we can initialize
based on those options.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-27 11:57:03 -07:00
Sage Weil
e53c2fe075 ceph: fix, clean up string mount arg parsing
Clearly demark int and string argument options, and do not try to convert
string arguments to ints.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-27 11:17:25 -07:00
Sage Weil
6ca874e92d ceph: silence uninitialized variable warning
Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-26 22:07:59 -07:00
Sage Weil
7b813c4602 ceph: reduce parse_mount_args stack usage
Since we've increased the max mon count, we shouldn't put the addr array
on the parse_mount_args stack.  Put it on the heap instead.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-26 22:07:53 -07:00
Sage Weil
ecb19c4649 ceph: remove small mon addr limit; use CEPH_MAX_MON where appropriate
Get rid of separate max mon limit; use the system limit instead.  This
allows mounts when there are lots of mon addrs provided by mount.ceph (as
with a host with lots of A/AAAA records).

Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-22 10:53:17 -07:00
Sage Weil
8fa9765576 ceph: enable readahead
Initialized bdi->ra_pages to enable readahead.  Use 512KB default.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-16 14:44:43 -07:00
Sage Weil
f2cf418cec ceph: initialize sb->s_bdi, bdi_unregister after kill_anon_super
Writeback doesn't work without the bdi set, and writeback on
umount doesn't work if we unregister the bdi too early.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-14 14:09:07 -07:00
Sage Weil
572033069d ceph: remove unused CEPH_MSG_{OSD,MDS}_GETMAP
Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-12 10:29:44 -07:00
Sage Weil
fa0b72e9e2 ceph: show meaningful version on module load
Kill the old git revision; print the ceph version and protocol
versions instead.

Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-07 10:59:10 -07:00
Sage Weil
16725b9d2a ceph: super.c
Mount option parsing, client setup and teardown, and a few odds and
ends (e.g., statfs).

Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-06 11:31:07 -07:00