USERSPACE VERBS ACCESS

The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
enables direct userspace access to IB hardware via "verbs," as
described in chapter 11 of the InfiniBand Architecture Specification.

To use the verbs, the libibverbs library, available from
http://www.openfabrics.org/, is required. libibverbs contains a
device-independent API for using the ib_uverbs interface.
libibverbs also requires the appropriate device-dependent kernel and
userspace drivers for your InfiniBand hardware. For example, to use
a Mellanox HCA, you will need the ib_mthca kernel module and the
libmthca userspace driver to be installed.

User-kernel communication

Userspace communicates with the kernel for slow path, resource
management operations via the /dev/infiniband/uverbsN character
devices. Fast path operations are typically performed by writing
directly to hardware registers mmap()ed into userspace, with no
system call or context switch into the kernel.

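As a minimal sketch of the slow path entry point, the C fragment below
builds the device file name and opens it. The helper names uverbs_path()
and uverbs_open() are hypothetical and are not part of libibverbs; real
applications would normally let libibverbs open the device for them.

```c
#include <fcntl.h>
#include <stdio.h>

/* Build "/dev/infiniband/uverbsN" for device index n.
 * Returns 0 on success, -1 if n is negative or the buffer is too small. */
int uverbs_path(int n, char *buf, size_t len)
{
    if (n < 0)
        return -1;
    int written = snprintf(buf, len, "/dev/infiniband/uverbs%d", n);
    return (written < 0 || (size_t)written >= len) ? -1 : 0;
}

/* Open the command channel for device n.
 * Returns a file descriptor, or -1 with errno set by open(), for example
 * when no device node exists on this machine. */
int uverbs_open(int n)
{
    char path[64];

    if (uverbs_path(n, path, sizeof(path)) != 0)
        return -1;
    return open(path, O_RDWR);
}
```

A caller would check the result of uverbs_open(0) for -1 before issuing
any commands on the descriptor.
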
Commands are sent to the kernel via write()s on these device files.
The ABI is defined in drivers/infiniband/include/ib_user_verbs.h.
The structs for commands that require a response from the kernel
contain a 64-bit field used to pass a pointer to an output buffer.
Status is returned to userspace as the return value of the write()
system call.

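The write()-based convention can be sketched as follows. The structs
demo_cmd and demo_resp and all helper names are hypothetical stand-ins
chosen for illustration; the real layouts are those in the ABI header
and will differ, so this fragment is not meant to be sent to a real
device. It only shows how a userspace pointer is carried in a fixed
64-bit field and how the write() return value serves as the status.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical command layout, for illustration only. */
struct demo_cmd {
    uint32_t command;   /* hypothetical opcode */
    uint32_t reserved;
    uint64_t response;  /* userspace pointer to the output buffer */
};

/* Hypothetical response layout that the kernel would fill in. */
struct demo_resp {
    uint32_t handle;
    uint32_t reserved;
};

/* Store a pointer in a fixed 64-bit field so that the struct layout is
 * the same for 32-bit and 64-bit userspace. */
uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

void *u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}

/* Prepare a command whose result should be written to *resp. */
void demo_cmd_init(struct demo_cmd *cmd, uint32_t opcode,
                   struct demo_resp *resp)
{
    memset(cmd, 0, sizeof(*cmd));
    cmd->command = opcode;
    cmd->response = ptr_to_u64(resp);
}

/* Issue the command with a single write(); the return value of write()
 * is the status seen by userspace. */
ssize_t demo_cmd_send(int fd, const struct demo_cmd *cmd)
{
    return write(fd, cmd, sizeof(*cmd));
}
```

Carrying the pointer as a 64-bit integer rather than as a C pointer
keeps the command size independent of the word size of the process.
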
Resource management

Since creation and destruction of all IB resources is done by
commands passed through a file descriptor, the kernel can keep track
of which resources are attached to a given userspace context. The
ib_uverbs module maintains idr tables that are used to translate
between kernel pointers and opaque userspace handles, so that kernel
pointers are never exposed to userspace and userspace cannot trick
the kernel into following a bogus pointer.

This also allows the kernel to clean up when a process exits and
prevent one process from touching another process's resources.

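The handle translation idea can be illustrated with a toy per-context
table. This is not the kernel idr API; demo_ctx and the demo_* helpers
are hypothetical names, and the fixed-size array is an assumption made
to keep the sketch short. It shows the three properties described
above: userspace only sees small integers, bogus handles are rejected,
and everything a context owns can be released in one pass.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_HANDLES 16

/* Toy per-context table: slot i holds the object for handle i. */
struct demo_ctx {
    void *objs[DEMO_MAX_HANDLES];
};

/* Allocate a handle for obj.
 * Returns the handle, or -1 if obj is NULL or the table is full. */
int demo_handle_alloc(struct demo_ctx *ctx, void *obj)
{
    if (obj == NULL)
        return -1;
    for (int i = 0; i < DEMO_MAX_HANDLES; i++) {
        if (ctx->objs[i] == NULL) {
            ctx->objs[i] = obj;
            return i;
        }
    }
    return -1;
}

/* Translate a handle supplied by userspace back into a pointer.
 * Out-of-range or unallocated handles yield NULL, never a wild pointer. */
void *demo_handle_lookup(const struct demo_ctx *ctx, int64_t handle)
{
    if (handle < 0 || handle >= DEMO_MAX_HANDLES)
        return NULL;
    return ctx->objs[handle];
}

/* Release every object the context still owns, as on process exit.
 * Calls release() on each object if it is non-NULL.
 * Returns the number of objects released. */
int demo_ctx_cleanup(struct demo_ctx *ctx, void (*release)(void *))
{
    int released = 0;

    for (int i = 0; i < DEMO_MAX_HANDLES; i++) {
        if (ctx->objs[i] != NULL) {
            if (release != NULL)
                release(ctx->objs[i]);
            ctx->objs[i] = NULL;
            released++;
        }
    }
    return released;
}
```

Because each context has its own table, a handle that is valid in one
context resolves to nothing in another.
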
Memory pinning

Direct userspace I/O requires that memory regions that are potential
I/O targets be kept resident at the same physical address. The
ib_uverbs module manages pinning and unpinning memory regions via
get_user_pages() and put_page() calls. It also accounts for the
amount of memory pinned in the process's locked_vm, and checks that
unprivileged processes do not exceed their RLIMIT_MEMLOCK limit.

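From the userspace side, a process can estimate in advance whether a
buffer is likely to fit under its limit. The sketch below only
approximates the kernel's check; pages_spanned() and fits_memlock() are
hypothetical helpers, and the caller is assumed to supply its own count
of pages already locked.

```c
#include <stdint.h>
#include <sys/resource.h>
#include <unistd.h>

/* Number of pages touched by the byte range [addr, addr + len). */
uint64_t pages_spanned(uint64_t addr, uint64_t len, uint64_t page_size)
{
    if (len == 0)
        return 0;
    uint64_t first = addr / page_size;
    uint64_t last = (addr + len - 1) / page_size;
    return last - first + 1;
}

/* Returns 1 if pinning len bytes at addr would stay within the soft
 * RLIMIT_MEMLOCK given already_locked_pages, 0 if it would not, and
 * -1 if the limit or the page size cannot be determined. */
int fits_memlock(uint64_t addr, uint64_t len, uint64_t already_locked_pages)
{
    struct rlimit rl;
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0 || getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;

    uint64_t limit_pages = (uint64_t)rl.rlim_cur / (uint64_t)page_size;
    uint64_t wanted = pages_spanned(addr, len, (uint64_t)page_size);
    return already_locked_pages + wanted <= limit_pages;
}
```

Note that a buffer which is not page aligned can span one more page
than its length alone suggests, which is why the address is part of
the calculation.
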
Pages that are pinned multiple times are counted each time they are
pinned, so the value of locked_vm may be an overestimate of the
number of pages pinned by a process.

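A small worked example of this overestimate, using hypothetical names
(struct region, charged_pages(), distinct_pages()) and a deliberately
simple scan: two pinned regions that overlap by two pages are charged
for the sum of their sizes, although fewer distinct pages are pinned.

```c
#include <stdint.h>

/* A pinned region described in whole pages (illustrative only). */
struct region {
    uint64_t first_page;
    uint64_t npages;
};

/* Accounting as described in the text: every pin is charged in full. */
uint64_t charged_pages(const struct region *r, int n)
{
    uint64_t total = 0;

    for (int i = 0; i < n; i++)
        total += r[i].npages;
    return total;
}

/* Number of distinct pages covered by the regions. A page is counted
 * only if no earlier region already covers it. */
uint64_t distinct_pages(const struct region *r, int n)
{
    uint64_t count = 0;

    for (int i = 0; i < n; i++) {
        uint64_t end = r[i].first_page + r[i].npages;

        for (uint64_t p = r[i].first_page; p < end; p++) {
            int seen = 0;

            for (int j = 0; j < i && !seen; j++)
                seen = p >= r[j].first_page &&
                       p < r[j].first_page + r[j].npages;
            if (!seen)
                count++;
        }
    }
    return count;
}
```

For regions covering pages 10 to 13 and pages 12 to 15, charged_pages()
gives 8 while distinct_pages() gives 6.
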
/dev files

To create the appropriate character device files automatically with
udev, a rule like

    KERNEL=="uverbs*", NAME="infiniband/%k"

can be used. This will create device nodes named

    /dev/infiniband/uverbs0

and so on. Since the InfiniBand userspace verbs should be safe for
use by non-privileged processes, it may be useful to add an
appropriate MODE or GROUP to the udev rule.

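As an illustration of the MODE and GROUP keys, the rule could be
extended in either of the following ways. The group name "rdma" is an
assumption made for this example; use whatever group suits the local
policy.

```
# World-accessible device nodes
KERNEL=="uverbs*", NAME="infiniband/%k", MODE="0666"

# Access restricted to members of one group ("rdma" is hypothetical)
KERNEL=="uverbs*", NAME="infiniband/%k", GROUP="rdma", MODE="0660"
```

Only one of the two rules should be installed at a time.
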