kernel-ark

Author	SHA1	Message	Date
Alexander Schmidt	b9012e0a42	IB/ehca: Generate flush status CQ entries When a QP goes into error state, it is required that CQ entries with a flush error status are delivered to the application for any outstanding work requests. eHCA does not do this in hardware, so this patch adds software flush CQE generation to the ehca driver. Whenever a QP gets into error state, it is added to the QP error list of its respective CQ. If the error QP list of a CQ is not empty, poll_cq() generates flush CQEs before polling the actual CQ. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-09-20 20:05:21 -07:00
Hoang-Nam Nguyen	2b5e6b120e	IB/ehca: Add PMA support This patch enables ehca to redirect any PMA queries to the actual PMA QP. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Reviewed-by: Joachim Fenkes <fenkes@de.ibm.com> Reviewed-by: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:42 -08:00
Hoang-Nam Nguyen	bbdd267ef2	IB/ehca: Add "port connection autodetect mode" This patch enhances ehca with a capability to "autodetect" the ports being connected physically. In order to utilize that function the module option nr_ports must be set to -1 (default is 2 - two ports). This feature is experimental and will made the default later. More detail: If the user connects only one port to the switch, current code requires 1) port one to be connected and 2) module option nr_ports=1 to be given. If autodetect is enabled, ehca will not wait at creation of the GSI QP for the respective port to become active. Since firmware does not accept modify_qp() while the port is down at initialization, we need to cache all calls to modify_qp() for the SMI/GSI QP and just return a good return code. When a port is activated and we get a PORT_ACTIVE event, we replay the cached modify-qp() parms and re-trigger any posted recv WRs. Only then do we forward the PORT_ACTIVE event to registered clients. The result of this autodetect patch is that all ports will be accessible by the users. Depending on their respective cabling only those ports that are connected properly will become operable. If a user tries to modify a regular QP of a non-connected port, modify_qp() will fail. Furthermore, ibv_devinfo should show the port state accordingly. Note that this patch primarily improves the loading behaviour of ehca. If the cable is removed while the driver is operating and plugged in again, firmware will handle that properly by sending an appropriate async event. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:44 -08:00
Joachim Fenkes	51aaa54eb9	IB/ehca: Fix static rate calculation The IPD (inter-packet delay) formula was a little off and assumed a fixed physical link rate; fix the formula and query the actual physical link rate, now that we can get it. Also, refactor the calculation into a common function ehca_calc_ipd() and use that instead of duplicating code. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-13 15:27:00 -08:00
Hoang-Nam Nguyen	2b94397adc	IB/ehca: Fix warnings issued by checkpatch.pl Run the existing ehca code through checkpatch.pl and clean up the worst of the coding style violations. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:40 -07:00
Joachim Fenkes	8705ce5b90	IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events When firmware reports a nondisruptive port configuration change event, previous versions of the eHCA driver didn't forward the event to consumers like IPoIB. Add code that determines the type of configuration change by comparing old and new port attributes and reports it. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	a6a12947fb	IB/ehca: add Shared Receive Queue support Support SRQs on eHCA2. Since an SRQ is a QP for eHCA2, a lot of code (structures, create, destroy, post_recv) can be shared between QP and SRQ. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Roland Dreier	f7c6a7b5d5	IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules Export ib_umem_get()/ib_umem_release() and put low-level drivers in control of when to call ib_umem_get() to pin and DMA map userspace, rather than always calling it in ib_uverbs_reg_mr() before calling the low-level driver's reg_user_mr method. Also move these functions to be in the ib_core module instead of ib_uverbs, so that driver modules using them do not depend on ib_uverbs. This has a number of advantages: - It is better design from the standpoint of making generic code a library that can be used or overridden by device-specific code as the details of specific devices dictate. - Drivers that do not need to pin userspace memory regions do not need to take the performance hit of calling ib_mem_get(). For example, although I have not tried to implement it in this patch, the ipath driver should be able to avoid pinning memory and just use copy_{to,from}_user() to access userspace memory regions. - Buffers that need special mapping treatment can be identified by the low-level driver. For example, it may be possible to solve some Altix-specific memory ordering issues with mthca CQs in userspace by mapping CQ buffers with extra flags. - Drivers that need to pin and DMA map userspace memory for things other than memory regions can use ib_umem_get() directly, instead of hacks using extra parameters to their reg_phys_mr method. For example, the mlx4 driver that is pending being merged needs to pin and DMA map QP and CQ buffers, but it does not need to create a memory key for these buffers. So the cleanest solution is for mlx4 to call ib_umem_get() in the create_qp and create_cq methods. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:37 -07:00
Roland Dreier	ed23a72778	IB: Return "maybe missed event" hint from ib_req_notify_cq() The semantics defined by the InfiniBand specification say that completion events are only generated when a completions is added to a completion queue (CQ) after completion notification is requested. In other words, this means that the following race is possible: while (CQ is not empty) ib_poll_cq(CQ); // new completion is added after while loop is exited ib_req_notify_cq(CQ); // no event is generated for the existing completion To close this race, the IB spec recommends doing another poll of the CQ after requesting notification. However, it is not always possible to arrange code this way (for example, we have found that NAPI for IPoIB cannot poll after requesting notification). Also, some hardware (eg Mellanox HCAs) actually will generate an event for completions added before the call to ib_req_notify_cq() -- which is allowed by the spec, since there's no way for any upper-layer consumer to know exactly when a completion was really added -- so the extra poll of the CQ is just a waste. Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for ib_req_notify_cq() so that it can return a hint about whether the a completion may have been added before the request for notification. The return value of ib_req_notify_cq() is extended so: < 0 means an error occurred while requesting notification == 0 means notification was requested successfully, and if IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events were missed and it is safe to wait for another event. > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed in. It means that the consumer must poll the CQ again to make sure it is empty to avoid the race described above. We add a flag to enable this behavior rather than turning it on unconditionally, because checking for missed events may incur significant overhead for some low-level drivers, and consumers that don't care about the results of this test shouldn't be forced to pay for the test. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Michael S. Tsirkin	f4fd0b224d	IB: Add CQ comp_vector support Add a num_comp_vectors member to struct ib_device and extend ib_create_cq() to pass in a comp_vector parameter -- this parallels the userspace libibverbs API. Update all hardware drivers to set num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector value. Pass the value of num_comp_vectors to userspace rather than hard-coding a value of 1. We want multiple CQ event vector support (via MSI-X or similar for adapters that can generate multiple interrupts), but it's not clear how many vectors we want, or how we want to deal with policy issues such as how to decide which vector to use or how to set up interrupt affinity. This patch is useful for experimenting, since no core changes will be necessary when updating a driver to support multiple vectors, and we know that we want to make at least these changes anyway. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Hoang-Nam Nguyen	4c34bdf58c	IB/ehca: Remove use of do_mmap() This patch removes do_mmap() from ehca: - Call remap_pfn_range() for hardware register block - Use vm_insert_page() to register memory allocated for completion queues and queue pairs - The actual mmap() call/trigger is now controlled by user space, ie. libehca Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-04 14:11:57 -08:00
Hoang-Nam Nguyen	f2d9136133	IB/ehca: Use proper GFP_ flags for get_zeroed_page() Here is a patch for ehca to use proper flag, ie. GFP_ATOMIC resp. GFP_KERNEL, when calling get_zeroed_page() to prevent "Bug: scheduling while atomic...". This error does not cause a kernel panic but makes ipoib un-usable afterwards. It is reproducible on 2.6.20-rc4 if one does ifconfig down during a flood ping test. I have not observed this error in earlier releases incl. 2.6.20-rc1. This error occurs when a qp event/irq is received and ehca event handler allocates a control block/page to obtain HCA error data block. Use of GFP_ATOMIC when in interrupt context prevents this issue. Signed-off-by Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-01-09 14:14:24 -08:00
Hoang-Nam Nguyen	7e28db5d8f	IB/ehca: Assure 4K alignment for firmware control blocks Assure 4K alignment for firmware control blocks in 64K page mode, because kzalloc()'s result address might not be 4K aligned if 64K pages are enabled. Thus, we introduce wrappers called ehca_{alloc,free}_fw_ctrlblock(), which use a slab cache for objects with 4K length and 4K alignment in order to alloc/free firmware control blocks in 64K page mode. In 4K page mode those wrappers just are defines of get_zeroed_page() and free_page(). Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-09 12:41:57 -08:00
Ralph Campbell	9bc57e2d19	IB/uverbs: Pass userspace data to modify_srq and modify_qp methods Pass a struct ib_udata to the low-level driver's ->modify_srq() and ->modify_qp() methods, so that it can get to the device-specific data passed in by the userspace driver. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:25 -07:00
Heiko J Schick	fab97220c9	IB/ehca: Add driver for IBM eHCA InfiniBand adapters Add a driver for IBM GX bus InfiniBand adapters, which are usable with some pSeries/System p systems. Signed-off-by: Heiko J Schick <schickhj.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:22 -07:00

15 Commits