The aim of this locking rework is that ioctls which a compositor should be
might call for every frame (set_cursor, page_flip, addfb, rmfb and
getfb/create_handle) should not be able to block on kms background
activities like output detection. And since each EDID read takes about
25ms (in the best case), that always means we'll drop at least one frame.
The solution is to add per-crtc locking for these ioctls, and restrict
background activities to only use the global lock. Change-the-world type
of events (modeset, dpms, ...) need to grab all locks.
Two tricky parts arose in the conversion:
- A lot of current code assumes that a kms fb object can't disappear while
holding the global lock, since the current code serializes fb
destruction with it. Hence proper lifetime management using the already
created refcounting for fbs need to be instantiated for all ioctls and
interfaces/users.
- The rmfb ioctl removes the to-be-deleted fb from all active users. But
unconditionally taking the global kms lock to do so introduces an
unacceptable potential stall point. And obviously changing the userspace
abi isn't on the table, either. Hence this conversion opportunistically
checks whether the rmfb ioctl holds the very last reference, which
guarantees that the fb isn't in active use on any crtc or plane (thanks
to the conversion to the new lifetime rules using proper refcounting).
Only if this is not the case will the code go through the slowpath and
grab all modeset locks. Sane compositors will never hit this path and so
avoid the stall, but userspace relying on these semantics will also not
break.
All these cases are exercised by the newly added subtests for the i-g-t
kms_flip, tested on a machine where a full detect cycle takes around 100
ms. It works, and no frames are dropped any more with these patches
applied. kms_flip also contains a special case to exercise the
above-describe rmfb slowpath.
* 'drm-kms-locking' of git://people.freedesktop.org/~danvet/drm-intel: (335 commits)
drm/fb_helper: check whether fbcon is bound
drm/doc: updates for new framebuffer lifetime rules
drm: don't hold crtc mutexes for connector ->detect callbacks
drm: only grab the crtc lock for pageflips
drm: optimize drm_framebuffer_remove
drm/vmwgfx: add proper framebuffer refcounting
drm/i915: dump refcount into framebuffer debugfs file
drm: refcounting for crtc framebuffers
drm: refcounting for sprite framebuffers
drm: fb refcounting for dirtyfb_ioctl
drm: don't take modeset locks in getfb ioctl
drm: push modeset_lock_all into ->fb_create driver callbacks
drm: nest modeset locks within fpriv->fbs_lock
drm: reference framebuffers which are on the idr
drm: revamp framebuffer cleanup interfaces
drm: create drm_framebuffer_lookup
drm: revamp locking around fb creation/destruction
drm: only take the crtc lock for ->cursor_move
drm: only take the crtc lock for ->cursor_set
drm: add per-crtc locks
...
Daniel writes:
- seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris
Wilson
- some leftover kill-agp on gen6+ patches from Ben
- hotplug improvements from Damien
- clear fb when allocated from stolen, avoids dirt on the fbcon (Chris)
- Stolen mem support from Chris Wilson, one of the many steps to get to
real fastboot support.
- Some DDI code cleanups from Paulo.
- Some refactorings around lvds and dp code.
- some random little bits&pieces
* tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits)
drm/i915: Return the real error code from intel_set_mode()
drm/i915: Make GSM void
drm/i915: Move GSM mapping into dev_priv
drm/i915: Move even more gtt code to i915_gem_gtt
drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno
drm/i915: Introduce i915_gem_set_seqno()
drm/i915: Always clear semaphore mboxes on seqno wrap
drm/i915: Initialize hardware semaphore state on ring init
drm/i915: Introduce ring set_seqno
drm/i915: Missed conversion to gtt_pte_t
drm/i915: Bug on unsupported swizzled platforms
drm/i915: BUG() if fences are used on unsupported platform
drm/i915: fixup overlay stolen memory leak
drm/i915: clean up PIPECONF bpc #defines
drm/i915: add intel_dp_set_signal_levels
drm/i915: remove leftover display.update_wm assignment
drm/i915: check for the PCH when setting pch_transcoder
drm/i915: Clear the stolen fb before enabling
drm/i915: Access to snooped system memory through the GTT is incoherent
drm/i915: Remove stale comment about intel_dp_detect()
...
Conflicts:
drivers/gpu/drm/i915/intel_display.c
Avoid clobbering adjacent blocks if they happen to expire earlier and
amalgamate together to form the requested hole.
In passing this fixes a regression from
commit ea7b1dd448
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Feb 18 17:59:12 2011 +0100
drm: mm: track free areas implicitly
which swaps the end address for size (with a potential overflow) and
effectively causes the eviction code to clobber almost all earlier
buffers above the evictee.
v2: Check the original hole not the adjusted as the coloring may confuse
us when later searching for the overlapping nodes. Also make sure that
we do apply the range restriction and color adjustment in the same
order for both scanning, searching and insertion.
v3: Send the version that was actually tested.
Note that this seems to be ducttape of decent quality ot paper over
some of our unbind related gpu hangs reported since 3.7. It is not
fully effective though, and certainly doesn't fix the underlying bug.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Added note plus bugzilla link and tested-by.]
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55984
Tested-by: Norbert Preining <preining@logic.at>
Acked-by: Dave Airlie <airlied@gmail.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Required by i915 in order to avoid the allocation in the middle of
manipulating the drm_mm lists.
Use a pair of stubs to preserve the existing EXPORT_SYMBOLs for
backporting; to be removed later.
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: bikeshedded-away the atomic parameter, it's not yet used
anywhere.]
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will be used i915 in forthcoming patches in order to measure the
largest contiguous chunk of memory available for enabling chipset
features.
v2: Try to make the macro marginally safer and more readable by not
depending upon the drm_mm_hole_node_end() being non-zero. Note that we
need to open code list_for_each() in order to update the hole_start,
hole_end variable on each iteration and keep the macro sane.
v3: Tidy up few BUG_ONs that fell foul of adding additional tests to
drm_mm_hole_node_start().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To be used later by i915 to preallocate exact blocks of space from the
range manager.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In order to support snoopable memory on non-LLC architectures (so that
we can bind vgem objects into the i915 GATT for example), we have to
avoid the prefetcher on the GPU from crossing memory domains and so
prevent allocation of a snoopable PTE immediately following an uncached
PTE. To do that, we need to extend the range allocator with support for
tracking and segregating different node colours.
This will be used by i915 to segregate memory domains within the GTT.
v2: Now with more drm_mm helpers and less driver interference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
The looping helper didn't do anything due to a superficial
semicolon. Furthermore one of the two dump functions suffered
from copy&paste fail.
While staring at the code I've also noticed that the replace
helper (currently unused) is a bit broken.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With the switch to implicit free space accounting one pointer
got unused when scanning. Use it to create a single-linked list
to ensure correct unwinding of the scan state.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The old api has a two-step process: First search for a suitable
free hole, then allocate from that specific hole. No user used
this to do anything clever. So drop it for the embeddable variant
of the drm_mm api (the old one retains this ability, for the time
being).
With struct drm_mm_node embedded, we cannot track allocations
anymore by checking for a NULL pointer. So keep track of this
and add a small helper drm_mm_node_allocated.
Also add a function to move allocations between different struct
drm_mm_node.
v2: Implement suggestions by Chris Wilson.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The idea is to track free holes implicitly by marking the allocation
immediatly preceeding a hole.
To avoid an ugly corner case add a dummy head_node to struct drm_mm
to track the hole that spans to complete allocation area when the
memory manager is empty.
To guarantee that there's always a preceeding/following node (that might
be marked as hole_follows == 1), move the mm->node_list list_head to the
head_node.
The main allocator and fair-lru scan code actually becomes simpler.
Only the debug code slightly suffers because free areas are no longer
explicit.
Also add drm_mm_for_each_node (which will be much more useful when
struct drm_mm_node is embeddable).
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Nouveau was checking drm_mm internals on teardown to see whether the
memory manager was initialized. Hide these internals in a small
inline helper function.
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
These helper functions can be used to efficiently scan lru list
for eviction. Eviction becomes a three stage process:
1. Scanning through the lru list until a suitable hole has been found.
2. Scan backwards to restore drm_mm consistency and find out which
objects fall into the hole.
3. Evict the objects that fall into the hole.
These helper functions don't allocate any memory (at the price of
not allowing any other concurrent operations). Hence this can also be
used for ttm (which does lru scanning under a spinlock).
Evicting objects in this fashion should be more fair than the current
approach by i915 (scan the lru for a object large enough to contain
the new object). It's also more efficient than the current approach used
by ttm (uncoditionally evict objects from the lru until there's enough
free space).
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Thomas Hellstrom <thellstrom@vmwgfx.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Yeah, I've kinda noticed that fl_entry is the free stack. Still
give it (and the memory node list ml_entry) decent names.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Thomas Hellstrom <thellstrom@vmwgfx.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Only ever assigned, never used.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[glisse: I will re-add if needed for range-restricted allocations]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
drm_mm_debug_table will print the memory manager state
in table allowing to give a snapshot of the manager at
given point in time. Usefull for debugging.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds code to the drm_mm to talk to debugfs, and adds
support to radeon to add the VRAM and GTT mm lists to debugfs.
I tested with spinlock debugging and it doesn't give out.
Signed-off-by: Dave Airlie <airlied@redhat.com>
also for the atomic path by using a common code-path.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
this is a TTM preparation patch, it rearranges the mm and
add operations needed to do mm operations in atomic context.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>