Merge misc updates from Andrew Morton:
"257 patches.
Subsystems affected by this patch series: scripts, ocfs2, vfs, and
mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
cleanups, kfence, and damon)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
mm/damon: remove return value from before_terminate callback
mm/damon: fix a few spelling mistakes in comments and a pr_debug message
mm/damon: simplify stop mechanism
Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
Docs/admin-guide/mm/damon/start: simplify the content
Docs/admin-guide/mm/damon/start: fix a wrong link
Docs/admin-guide/mm/damon/start: fix wrong example commands
mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
mm/damon: remove unnecessary variable initialization
Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
selftests/damon: support watermarks
mm/damon/dbgfs: support watermarks
mm/damon/schemes: activate schemes based on a watermarks mechanism
tools/selftests/damon: update for regions prioritization of schemes
mm/damon/dbgfs: support prioritization weights
mm/damon/vaddr,paddr: support pageout prioritization
mm/damon/schemes: prioritize regions within the quotas
mm/damon/selftests: support schemes quotas
mm/damon/dbgfs: support quotas of schemes
...
Rename memblock_free_ptr() to memblock_free() and use memblock_free()
when freeing a virtual pointer so that memblock_free() will be a
counterpart of memblock_alloc()
The callers are updated with the below semantic patch and manual
addition of (void *) casting to pointers that are represented by
unsigned long variables.
@@
identifier vaddr;
expression size;
@@
(
- memblock_phys_free(__pa(vaddr), size);
+ memblock_free(vaddr, size);
|
- memblock_free_ptr(vaddr, size);
+ memblock_free(vaddr, size);
)
[sfr@canb.auug.org.au: fixup]
Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au
Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since memblock_free() operates on a physical range, make its name
reflect it and rename it to memblock_phys_free(), so it will be a
logical counterpart to memblock_phys_alloc().
The callers are updated with the below semantic patch:
@@
expression addr;
expression size;
@@
- memblock_free(addr, size);
+ memblock_phys_free(addr, size);
Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Convert /reserved-memory bindings to schemas
- Convert a bunch of NFC bindings to schemas
- Convert bindings to schema: Xilinx USB, Freescale DDR controller, Arm
CCI-400, UBlox Neo-6M, 1-Wire GPIO, MSI controller, ASpeed LPC, OMAP
and Inside-Secure HWRNG, register-bit-led, OV5640, Silead GSL1680,
Elan ekth3000, Marvell bluetooth, TI wlcore, TI bluetooth, ESP ESP8089,
tlm,trusted-foundations, Microchip cap11xx, Ralink SoCs and boards,
and TI sysc
- New binding schemas for: msi-ranges, Aspeed UART routing controller,
palmbus, Xylon LogiCVC display controller, Mediatek's MT7621 SDRAM
memory controller, and Apple M1 PCIe host
- Run schema checks for %.dtb targets
- Improve build time when using DT_SCHEMA_FILES
- Improve error message when dtschema is not found
- Various doc reference fixes in MAINTAINERS
- Convert architectures to common CPU h/w ID parsing function
of_get_cpu_hwid().
- Allow for empty NUMA node IDs which may be hotplugged
- Cleanup of __fdt_scan_reserved_mem()
- Constify device_node parameters
- Update dtc to upstream v1.6.1-19-g0a3a9d3449c8. Adds new checks
'node_name_vs_property_name' and 'interrupt_map'.
- Enable dtc 'unit_address_format' warning by default
- Fix unittest EXPECT text for gpio hog errors
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmGBrj8QHHJvYmhAa2Vy
bmVsLm9yZwAKCRD6+121jbxhw3M1D/9gpaVBqp+Q5hZZLWOjz/WkAsExZ71N/8Lh
rn64XWYQNJ6R1PINkBtlooJy6wTCIMfNs3IEmkAVEXVEj1Nvu7uEZwYbb96B4dJ4
EiMv/Vz0EphoqnBvICT86XfNZduP1sZ5M11pdv2dNvwJrEvvi98VLDvSucvxorn8
sm5jsqWOAwroiCR+u8BWW3qH3sugL1BOAwraMoUbosZAo0SpNH4WBdcBz4+v8lUS
5N8Y8Q6dB6fEqdbVpzMblN2B9c/TEb1VYaeGXRUyQsIUQJajX3xnR8RDnTKLBtsS
FAKGQORemLwVzBVKeZKbhlqXAJbl701LuKHRLiVerb9UGi+tk4AX9Rgg1Whrp7w4
UYi+k4Ozus1vDaKsemB1voabSgYYY+aNTRezltdtPz0a+eQJWPUt1xQB5m68cGO4
TZI+KfExxyGVa8iDgv4AWhvXqbR3+PUTUvel2xEIkRscWmMjXF/+oQXy8QYn2Aok
S9750/3EUQCbKi9ZUjPLRzd5CuPP2E97i8V2WdOgRse3+H7pPg5IcEq7oQYe9A62
SnRFjPz1X5g4Hh3bRVmcAGmDzbZJrl9dULvYVdiUWiqzfmHxN7MXO9FIxv3NKVfp
6jgr5vVVi1ShDnCh3ns4mYUwQ7j72dsONyklbVBbNtGjeeZcv5MEeg9ZAoVvO+lh
9DNNSGSd2g==
=dQa6
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
- Convert /reserved-memory bindings to schemas
- Convert a bunch of NFC bindings to schemas
- Convert bindings to schema: Xilinx USB, Freescale DDR controller, Arm
CCI-400, UBlox Neo-6M, 1-Wire GPIO, MSI controller, ASpeed LPC, OMAP
and Inside-Secure HWRNG, register-bit-led, OV5640, Silead GSL1680,
Elan ekth3000, Marvell bluetooth, TI wlcore, TI bluetooth, ESP
ESP8089, tlm,trusted-foundations, Microchip cap11xx, Ralink SoCs and
boards, and TI sysc
- New binding schemas for: msi-ranges, Aspeed UART routing controller,
palmbus, Xylon LogiCVC display controller, Mediatek's MT7621 SDRAM
memory controller, and Apple M1 PCIe host
- Run schema checks for %.dtb targets
- Improve build time when using DT_SCHEMA_FILES
- Improve error message when dtschema is not found
- Various doc reference fixes in MAINTAINERS
- Convert architectures to common CPU h/w ID parsing function
of_get_cpu_hwid().
- Allow for empty NUMA node IDs which may be hotplugged
- Cleanup of __fdt_scan_reserved_mem()
- Constify device_node parameters
- Update dtc to upstream v1.6.1-19-g0a3a9d3449c8. Adds new checks
'node_name_vs_property_name' and 'interrupt_map'.
- Enable dtc 'unit_address_format' warning by default
- Fix unittest EXPECT text for gpio hog errors
* tag 'devicetree-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (97 commits)
dt-bindings: net: ti,bluetooth: Document default max-speed
dt-bindings: pci: rcar-pci-ep: Document r8a7795
dt-bindings: net: qcom,ipa: IPA does support up to two iommus
of/fdt: Remove of_scan_flat_dt() usage for __fdt_scan_reserved_mem()
of: unittest: document intentional interrupt-map provider build warning
of: unittest: fix EXPECT text for gpio hog errors
of/unittest: Disable new dtc node_name_vs_property_name and interrupt_map warnings
scripts/dtc: Update to upstream version v1.6.1-19-g0a3a9d3449c8
dt-bindings: arm: firmware: tlm,trusted-foundations: Convert txt bindings to yaml
dt-bindings: display: tilcd: Fix endpoint addressing in example
dt-bindings: input: microchip,cap11xx: Convert txt bindings to yaml
dt-bindings: ufs: exynos-ufs: add exynosautov9 compatible
dt-bindings: ufs: exynos-ufs: add io-coherency property
dt-bindings: mips: convert Ralink SoCs and boards to schema
dt-bindings: display: xilinx: Fix example with psgtr
dt-bindings: net: nfc: nxp,pn544: Convert txt bindings to yaml
dt-bindings: Add a help message when dtschema tools are missing
dt-bindings: bus: ti-sysc: Update to use yaml binding
dt-bindings: sram: Allow numbers in sram region node name
dt-bindings: display: Document the Xylon LogiCVC display controller
...
* More progress on the protected VM front, now with the full
fixed feature set as well as the limitation of some hypercalls
after initialisation.
* Cleanup of the RAZ/WI sysreg handling, which was pointlessly
complicated
* Fixes for the vgic placement in the IPA space, together with a
bunch of selftests
* More memcg accounting of the memory allocated on behalf of a guest
* Timer and vgic selftests
* Workarounds for the Apple M1 broken vgic implementation
* KConfig cleanups
* New kvmarm.mode=none option, for those who really dislike us
RISC-V:
* New KVM port.
x86:
* New API to control TSC offset from userspace
* TSC scaling for nested hypervisors on SVM
* Switch masterclock protection from raw_spin_lock to seqcount
* Clean up function prototypes in the page fault code and avoid
repeated memslot lookups
* Convey the exit reason to userspace on emulation failure
* Configure time between NX page recovery iterations
* Expose Predictive Store Forwarding Disable CPUID leaf
* Allocate page tracking data structures lazily (if the i915
KVM-GT functionality is not compiled in)
* Cleanups, fixes and optimizations for the shadow MMU code
s390:
* SIGP Fixes
* initial preparations for lazy destroy of secure VMs
* storage key improvements/fixes
* Log the guest CPNC
Starting from this release, KVM-PPC patches will come from
Michael Ellerman's PPC tree.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGBOiEUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNowwf/axlx3g9sgCwQHr12/6UF/7hL/RwP
9z+pGiUzjl2YQE+RjSvLqyd6zXh+h4dOdOKbZDLSkSTbcral/8U70ojKnQsXM0XM
1LoymxBTJqkgQBLm9LjYreEbzrPV4irk4ygEmuk3CPOHZu8xX1ei6c5LdandtM/n
XVUkXsQY+STkmnGv4P3GcPoDththCr0tBTWrFWtxa0w9hYOxx0ay1AZFlgM4FFX0
QFuRc8VBLoDJpIUjbkhsIRIbrlHc/YDGjuYnAU7lV/CIME8vf2BW6uBwIZJdYcDj
0ejozLjodEnuKXQGnc8sXFioLX2gbMyQJEvwCgRvUu/EU7ncFm1lfs7THQ==
=UxKM
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- More progress on the protected VM front, now with the full fixed
feature set as well as the limitation of some hypercalls after
initialisation.
- Cleanup of the RAZ/WI sysreg handling, which was pointlessly
complicated
- Fixes for the vgic placement in the IPA space, together with a
bunch of selftests
- More memcg accounting of the memory allocated on behalf of a guest
- Timer and vgic selftests
- Workarounds for the Apple M1 broken vgic implementation
- KConfig cleanups
- New kvmarm.mode=none option, for those who really dislike us
RISC-V:
- New KVM port.
x86:
- New API to control TSC offset from userspace
- TSC scaling for nested hypervisors on SVM
- Switch masterclock protection from raw_spin_lock to seqcount
- Clean up function prototypes in the page fault code and avoid
repeated memslot lookups
- Convey the exit reason to userspace on emulation failure
- Configure time between NX page recovery iterations
- Expose Predictive Store Forwarding Disable CPUID leaf
- Allocate page tracking data structures lazily (if the i915 KVM-GT
functionality is not compiled in)
- Cleanups, fixes and optimizations for the shadow MMU code
s390:
- SIGP Fixes
- initial preparations for lazy destroy of secure VMs
- storage key improvements/fixes
- Log the guest CPNC
Starting from this release, KVM-PPC patches will come from Michael
Ellerman's PPC tree"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
RISC-V: KVM: fix boolreturn.cocci warnings
RISC-V: KVM: remove unneeded semicolon
RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions
RISC-V: KVM: Factor-out FP virtualization into separate sources
KVM: s390: add debug statement for diag 318 CPNC data
KVM: s390: pv: properly handle page flags for protected guests
KVM: s390: Fix handle_sske page fault handling
KVM: x86: SGX must obey the KVM_INTERNAL_ERROR_EMULATION protocol
KVM: x86: On emulation failure, convey the exit reason, etc. to userspace
KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info
KVM: x86: Clarify the kvm_run.emulation_failure structure layout
KVM: s390: Add a routine for setting userspace CPU state
KVM: s390: Simplify SIGP Set Arch handling
KVM: s390: pv: avoid stalls when making pages secure
KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm
KVM: s390: pv: avoid double free of sida page
KVM: s390: pv: add macros for UVC CC values
s390/mm: optimize reset_guest_reference_bit()
s390/mm: optimize set_guest_storage_key()
s390/mm: no need for pte_alloc_map_lock() if we know the pmd is present
...
- kprobes: Restructured stack unwinder to show properly on x86 when a stack
dump happens from a kretprobe callback.
- Fix to bootconfig parsing
- Have tracefs allow owner and group permissions by default (only denying
others). There's been pressure to allow non root to tracefs in a
controlled fashion, and using groups is probably the safest.
- Bootconfig memory managament updates.
- Bootconfig clean up to have the tools directory be less dependent on
changes in the kernel tree.
- Allow perf to be traced by function tracer.
- Rewrite of function graph tracer to be a callback from the function tracer
instead of having its own trampoline (this change will happen on an arch
by arch basis, and currently only x86_64 implements it).
- Allow multiple direct trampolines (bpf hooks to functions) be batched
together in one synchronization.
- Allow histogram triggers to add variables that can perform calculations
against the event's fields.
- Use the linker to determine architecture callbacks from the ftrace
trampoline to allow for proper parameter prototypes and prevent warnings
from the compiler.
- Extend histogram triggers to key off of variables.
- Have trace recursion use bit magic to determine preempt context over if
branches.
- Have trace recursion disable preemption as all use cases do anyway.
- Added testing for verification of tracing utilities.
- Various small clean ups and fixes.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYYBdxhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qp1sAQD2oYFwaG3sx872gj/myBcHIBSKdiki
Hry5csd8zYDBpgD+Poylopt5JIbeDuoYw/BedgEXmscZ8Qr7VzjAXdnv/Q4=
=Loz8
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
- kprobes: Restructured stack unwinder to show properly on x86 when a
stack dump happens from a kretprobe callback.
- Fix to bootconfig parsing
- Have tracefs allow owner and group permissions by default (only
denying others). There's been pressure to allow non root to tracefs
in a controlled fashion, and using groups is probably the safest.
- Bootconfig memory managament updates.
- Bootconfig clean up to have the tools directory be less dependent on
changes in the kernel tree.
- Allow perf to be traced by function tracer.
- Rewrite of function graph tracer to be a callback from the function
tracer instead of having its own trampoline (this change will happen
on an arch by arch basis, and currently only x86_64 implements it).
- Allow multiple direct trampolines (bpf hooks to functions) be batched
together in one synchronization.
- Allow histogram triggers to add variables that can perform
calculations against the event's fields.
- Use the linker to determine architecture callbacks from the ftrace
trampoline to allow for proper parameter prototypes and prevent
warnings from the compiler.
- Extend histogram triggers to key off of variables.
- Have trace recursion use bit magic to determine preempt context over
if branches.
- Have trace recursion disable preemption as all use cases do anyway.
- Added testing for verification of tracing utilities.
- Various small clean ups and fixes.
* tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits)
tracing/histogram: Fix semicolon.cocci warnings
tracing/histogram: Fix documentation inline emphasis warning
tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together
tracing: Show size of requested perf buffer
bootconfig: Initialize ret in xbc_parse_tree()
ftrace: do CPU checking after preemption disabled
ftrace: disable preemption when recursion locked
tracing/histogram: Document expression arithmetic and constants
tracing/histogram: Optimize division by a power of 2
tracing/histogram: Covert expr to const if both operands are constants
tracing/histogram: Simplify handling of .sym-offset in expressions
tracing: Fix operator precedence for hist triggers expression
tracing: Add division and multiplication support for hist triggers
tracing: Add support for creating hist trigger variables from literal
selftests/ftrace: Stop tracing while reading the trace file by default
MAINTAINERS: Update KPROBES and TRACING entries
test_kprobes: Move it from kernel/ to lib/
docs, kprobes: Remove invalid URL and add new reference
samples/kretprobes: Fix return value if register_kretprobe() failed
lib/bootconfig: Fix the xbc_get_info kerneldoc
...
Cross-architecture update to move task_struct::cpu back into thread_info
on arm64, x86, s390, powerpc, and riscv. All Acked by arch maintainers.
Quoting Ard Biesheuvel:
"Move task_struct::cpu back into thread_info
Keeping CPU in task_struct is problematic for architectures that define
raw_smp_processor_id() in terms of this field, as it requires
linux/sched.h to be included, which causes a lot of pain in terms of
circular dependencies (aka 'header soup')
This series moves it back into thread_info (where it came from) for all
architectures that enable THREAD_INFO_IN_TASK, addressing the header
soup issue as well as some pointless differences in the implementations
of task_cpu() and set_task_cpu()."
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmGAEPYWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJq4wEACItgLuyzPgB2eSLVMc3sHPIWcn
EUWbAWsuzJH79wmJtn2AKxW/C5OLBNGeoNjkXQvFN3ULkQDPrfCpB4x/tB6CjIQI
WRDf8kO7oaAD85ZrbSwyFl/MFfrD67f6H1HZoB9FKWAzuv/Bp2xQ0Kf06Dv4HEZp
CzprzZuWtjHB+qgyy+EpGOge3zbFmCuYPE2QpMYLWgs1rcVW9OYvoCI6AYtNefrC
6Kl6CbmBb1k6lFxkhM7wvRcIJthBl6Bajpc3Z2uL1aLb27dVpQZs3YpY859Knb6U
ZpOQCRJOMui3HOxyF3bDUI37y0XVLm6xaNM6C/7i0XS1GiFlSxkGVamg+Mp7anpI
+hdK5kqtSagaBC9CaJvRHnWIex1npQAfiyDNdyiEbrsUJ1dp6/zZcQSe4/m/XRbi
vywQPGxU9f1ASshzHsGU2TJf7Ps7qHulUsS5fKwmHU2ZjQnbYCoPN10JGO9gKjOX
yioN5xsKnbPY9j0ys3l9XBqaMJ8KAr1XspplTGIMZIVbjNMlqrfgbg8Qn8T8WGM7
oUqudMIxczilj0/iEGfGRxBeFaYAfhGQCDnxNlNX9g7Xe/gHTJgNYlHVxL55jHNu
AoPE3Gd0X8K9fbov0BCB6a21XwGJ6Wj+FSrnvuyWrRuy8JWiDFJaVKUBEcalKr7a
MhoUNQPu5M83OdC42A==
=PzvV
-----END PGP SIGNATURE-----
Merge tag 'cpu-to-thread_info-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull thread_info update to move 'cpu' back from task_struct from Kees Cook:
"Cross-architecture update to move task_struct::cpu back into
thread_info on arm64, x86, s390, powerpc, and riscv. All Acked by arch
maintainers.
Quoting Ard Biesheuvel:
'Move task_struct::cpu back into thread_info
Keeping CPU in task_struct is problematic for architectures that
define raw_smp_processor_id() in terms of this field, as it
requires linux/sched.h to be included, which causes a lot of pain
in terms of circular dependencies (aka 'header soup')
This series moves it back into thread_info (where it came from)
for all architectures that enable THREAD_INFO_IN_TASK, addressing
the header soup issue as well as some pointless differences in the
implementations of task_cpu() and set_task_cpu()'"
* tag 'cpu-to-thread_info-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
riscv: rely on core code to keep thread_info::cpu updated
powerpc: smp: remove hack to obtain offset of task_struct::cpu
sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y
powerpc: add CPU field to struct thread_info
s390: add CPU field to struct thread_info
x86: add CPU field to struct thread_info
arm64: add CPU field to struct thread_info
- Revert the printk format based wchan() symbol resolution as it can leak
the raw value in case that the symbol is not resolvable.
- Make wchan() more robust and work with all kind of unwinders by
enforcing that the task stays blocked while unwinding is in progress.
- Prevent sched_fork() from accessing an invalid sched_task_group
- Improve asymmetric packing logic
- Extend scheduler statistics to RT and DL scheduling classes and add
statistics for bandwith burst to the SCHED_FAIR class.
- Properly account SCHED_IDLE entities
- Prevent a potential deadlock when initial priority is assigned to a
newly created kthread. A recent change to plug a race between cpuset and
__sched_setscheduler() introduced a new lock dependency which is now
triggered. Break the lock dependency chain by moving the priority
assignment to the thread function.
- Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
- Improve idle balancing in general and especially for NOHZ enabled
systems.
- Provide proper interfaces for live patching so it does not have to
fiddle with scheduler internals.
- Add cluster aware scheduling support.
- A small set of tweaks for RT (irqwork, wait_task_inactive(), various
scheduler options and delaying mmdrop)
- The usual small tweaks and improvements all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/OUkTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoR/5D/9ikdGNpKg9osNqJ3GjAmxsK6kVkB29
iFe2k8pIpWDToWQf/wQRGih4Yj3Cl49QSnZcPIibh2/12EB1qrrW6iSPJkInz8Ec
/1LS5/Vewn2OyoxyXZjdvGC5gTXEodSbIazASvX7nvdMeI4gsAsL5etzrMJirT/t
aymqvr7zovvywrwMTQJrGjUMo9l4ewE8tafMNNhRu1BHU1U4ojM9yvThyRAAcmp7
3Xy49A+Yq3IgrvYI4u8FMK5Zh08KaxSFjiLhePGm/bF+wSfYmWop2TP1jY05W2Uo
ti8hfbJMUoFRYuMxAiEldkItnc0wV4M9PtWZZ/x+B71bs65Y4Zjt9cW+rxJv2+m1
vzV31EsQwGnOti072dzWN4c/cZqngVXAjaNtErvDwJUr+Tw1ayv9KUvuodMQqZY6
mu68bFUO2kV9EMe1CBOv51Uy1RGHyLj3rlNqrkw+Xp5ISE9Ad2vhUEiRp5bQx5Ci
V/XFhGZkGUluh0vccrdFlNYZwhj8cZEzkOPCnPSeZ+bq8SyZE6xuHH/lTP1CJCOy
s800rW1huM+kgV+zRN8adDkGXibAk9N3RtVGnQXmuEy8gB9LZmQg+JeM2wsc9B+6
i0gdqZnsjNAfoK+BBAG4holxptSL8/eOJsFH8ZNIoxQ+iqooyPx9tFX7yXnRTBQj
d2qWG7UvoseT+g==
=fgtS
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Thomas Gleixner:
- Revert the printk format based wchan() symbol resolution as it can
leak the raw value in case that the symbol is not resolvable.
- Make wchan() more robust and work with all kind of unwinders by
enforcing that the task stays blocked while unwinding is in progress.
- Prevent sched_fork() from accessing an invalid sched_task_group
- Improve asymmetric packing logic
- Extend scheduler statistics to RT and DL scheduling classes and add
statistics for bandwith burst to the SCHED_FAIR class.
- Properly account SCHED_IDLE entities
- Prevent a potential deadlock when initial priority is assigned to a
newly created kthread. A recent change to plug a race between cpuset
and __sched_setscheduler() introduced a new lock dependency which is
now triggered. Break the lock dependency chain by moving the priority
assignment to the thread function.
- Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
- Improve idle balancing in general and especially for NOHZ enabled
systems.
- Provide proper interfaces for live patching so it does not have to
fiddle with scheduler internals.
- Add cluster aware scheduling support.
- A small set of tweaks for RT (irqwork, wait_task_inactive(), various
scheduler options and delaying mmdrop)
- The usual small tweaks and improvements all over the place
* tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
sched/fair: Cleanup newidle_balance
sched/fair: Remove sysctl_sched_migration_cost condition
sched/fair: Wait before decaying max_newidle_lb_cost
sched/fair: Skip update_blocked_averages if we are defering load balance
sched/fair: Account update_blocked_averages in newidle_balance cost
x86: Fix __get_wchan() for !STACKTRACE
sched,x86: Fix L2 cache mask
sched/core: Remove rq_relock()
sched: Improve wake_up_all_idle_cpus() take #2
irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT
irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT
irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.
sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ
sched: Add cluster scheduler level for x86
sched: Add cluster scheduler level in core and related Kconfig for ARM64
topology: Represent clusters of CPUs within a die
sched: Disable -Wunused-but-set-variable
sched: Add wrapper for get_wchan() to keep task blocked
x86: Fix get_wchan() to support the ORC unwinder
proc: Use task_is_running() for wchan in /proc/$pid/stat
...
Core changes:
- Prevent a potential deadlock when initial priority is assigned to a
newly created interrupt thread. A recent change to plug a race between
cpuset and __sched_setscheduler() introduced a new lock dependency
which is now triggered. Break the lock dependency chain by moving the
priority assignment to the thread function.
- A couple of small updates to make the irq core RT safe.
- Confine the irq_cpu_online/offline() API to the only left unfixable
user Cavium Octeon so that it does not grow new usage.
- A small documentation update
Driver changes:
- A large cross architecture rework to move irq_enter/exit() into the
architecture code to make addressing the NOHZ_FULL/RCU issues simpler.
- The obligatory new irq chip driver for Microchip EIC
- Modularize a few irq chip drivers
- Expand usage of devm_*() helpers throughout the driver code
- The usual small fixes and improvements all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF+8BUTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWs2EACeNbL93aIFokd2/RllRSr4VvMjKNyW
PpA0RYDOz1Jh4ldK+7b/EYapKgAkR3yyOtz+jyjRE7jsQK0pQeLtYNLd3cTzsD7K
LCvl8rq6cbRqyFoSC15UKKNbQ/f+o/3LeGPoipr5NQZRMepxk2J/yBCNRXHvIbe6
oLMQJUgw7KKtvCrCUX9OSei4F09T1qsNrIYb7QafP5+v0zndAT7uKNivWrKGFrsh
Uk9epoH3hIkvQERkpmzwJEJaq6oyqhoYQy7ZRGayEPwIdCyivJGZrVX0mZk1LX58
uc8u5grIslX9MqZEQWBweR5y7nISB494NGKmoCInu66U/+3DSOg3AGH2Rfw8PNFZ
lMKdXzYoDgv2y6LeiLtTUKV4K1NBRXo0BhwSGbPw0o6C03/x003kG824Y+/naU75
6q05BZSia1PagPV3e0UAm0A2Rnjj/5uso2fEk0eGBSGM27jf9SQcSE8DVrEiLRd1
2N5uAXbMdfu4xACsEI1Uxu1KNOSQnUhBCy0X6Ppj1a083kLG7jg/126ebb05R8G4
MF79PFt+xUPSzmuKc/xwCdANtW+zzoyjYl5w6mwELBJ9veNbPShokGBTN/qzjXKZ
vdr3/pXx95lRAzFnGOnETesm3IyObruU4K8NbMKd2b+eYa0w1WuZCKnutGLfsqxg
byhCEw459e3P2g==
=r6ln
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt subsystem:
Core changes:
- Prevent a potential deadlock when initial priority is assigned to a
newly created interrupt thread. A recent change to plug a race
between cpuset and __sched_setscheduler() introduced a new lock
dependency which is now triggered. Break the lock dependency chain
by moving the priority assignment to the thread function.
- A couple of small updates to make the irq core RT safe.
- Confine the irq_cpu_online/offline() API to the only left unfixable
user Cavium Octeon so that it does not grow new usage.
- A small documentation update
Driver changes:
- A large cross architecture rework to move irq_enter/exit() into the
architecture code to make addressing the NOHZ_FULL/RCU issues
simpler.
- The obligatory new irq chip driver for Microchip EIC
- Modularize a few irq chip drivers
- Expand usage of devm_*() helpers throughout the driver code
- The usual small fixes and improvements all over the place"
* tag 'irq-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
h8300: Fix linux/irqchip.h include mess
dt-bindings: irqchip: renesas-irqc: Document r8a774e1 bindings
MIPS: irq: Avoid an unused-variable error
genirq: Hide irq_cpu_{on,off}line() behind a deprecated option
irqchip/mips-gic: Get rid of the reliance on irq_cpu_online()
MIPS: loongson64: Drop call to irq_cpu_offline()
irq: remove handle_domain_{irq,nmi}()
irq: remove CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY
irq: riscv: perform irqentry in entry code
irq: openrisc: perform irqentry in entry code
irq: csky: perform irqentry in entry code
irq: arm64: perform irqentry in entry code
irq: arm: perform irqentry in entry code
irq: add a (temporary) CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY
irq: nds32: avoid CONFIG_HANDLE_DOMAIN_IRQ
irq: arc: avoid CONFIG_HANDLE_DOMAIN_IRQ
irq: add generic_handle_arch_irq()
irq: unexport handle_irq_desc()
irq: simplify handle_domain_{irq,nmi}()
irq: mips: simplify do_domain_IRQ()
...
The trap vector marked by label .Lsecondary_park must align on a
4-byte boundary, as the {m,s}tvec is defined to require 4-byte
alignment.
Signed-off-by: Chen Lu <181250012@smail.nju.edu.cn>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Fixes: e011995e82 ("RISC-V: Move relocate and few other functions out of __init")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
As the documentation explained, ftrace_test_recursion_trylock()
and ftrace_test_recursion_unlock() were supposed to disable and
enable preemption properly, however currently this work is done
outside of the function, which could be missing by mistake.
And since the internal using of trace_test_and_set_recursion()
and trace_clear_recursion() also require preemption disabled, we
can just merge the logical.
This patch will make sure the preemption has been disabled when
trace_test_and_set_recursion() return bit >= 0, and
trace_clear_recursion() will enable the preemption if previously
enabled.
Link: https://lkml.kernel.org/r/13bde807-779c-aa4c-0672-20515ae365ea@linux.alibaba.com
CC: Petr Mladek <pmladek@suse.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Jisheng Zhang <jszhang@kernel.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Miroslav Benes <mbenes@suse.cz>
Reported-by: Abaci <abaci@linux.alibaba.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
[ Removed extra line in comment - SDR ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
In preparation for removing HANDLE_DOMAIN_IRQ_IRQENTRY, have arch/riscv
perform all the irqentry accounting in its entry code. As arch/riscv
uses GENERIC_IRQ_MULTI_HANDLER, we can use generic_handle_arch_irq() to
do so.
Since generic_handle_arch_irq() handles the irq entry and setting the
irq regs, and happens before the irqchip code calls handle_IPI(), we can
remove the redundant irq entry and irq regs manipulation from
handle_IPI().
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Having a stable wchan means the process must be blocked and for it to
stay that way while performing stack unwinding.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm]
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org
Most of ARCHs use empty ftrace_dyn_arch_init(), introduce a weak common
ftrace_dyn_arch_init() to cleanup them.
Link: https://lkml.kernel.org/r/20210909090216.1955240-1-o451686892@gmail.com
Acked-by: Heiko Carstens <hca@linux.ibm.com> (s390)
Acked-by: Helge Deller <deller@gmx.de> (parisc)
Signed-off-by: Weizhao Ouyang <o451686892@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
This patch adds floating point (F and D extension) context save/restore
for guest VCPUs. The FP context is saved and restored lazily only when
kernel enter/exits the in-kernel run loop and not during the KVM world
switch. This way FP save/restore has minimal impact on KVM performance.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
We will get stage2 page faults whenever Guest/VM access SW emulated
MMIO device or unmapped Guest RAM.
This patch implements MMIO read/write emulation by extracting MMIO
details from the trapped load/store instruction and forwarding the
MMIO read/write to user-space. The actual MMIO emulation will happen
in user-space and KVM kernel module will only take care of register
updates before resuming the trapped VCPU.
The handling for stage2 page faults for unmapped Guest RAM will be
implemeted by a separate patch later.
[jiangyifei: ioeventfd and in-kernel mmio device support]
Signed-off-by: Yifei Jiang <jiangyifei@huawei.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
This patch implements the VCPU world-switch for KVM RISC-V.
The KVM RISC-V world-switch (i.e. __kvm_riscv_switch_to()) mostly
switches general purpose registers, SSTATUS, STVEC, SSCRATCH and
HSTATUS CSRs. Other CSRs are switched via vcpu_load() and vcpu_put()
interface in kvm_arch_vcpu_load() and kvm_arch_vcpu_put() functions
respectively.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
riscv architectures relying on mmap_sem for write in their
arch_setup_additional_pages. If the waiting task gets killed by the oom
killer it would block oom_reaper from asynchronous address space reclaim
and reduce the chances of timely OOM resolving. Wait for the lock in
the killable mode and return with EINTR if the task got killed while
waiting.
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
As commit 601255ae3c ("arm64: vdso: move data page before code pages"), the
same issue exists on riscv, testcase is shown below, make sure that vdso.so is
bigger than page size,
struct timespec tp;
clock_gettime(5, &tp);
printf("tv_sec: %ld, tv_nsec: %ld\n", tp.tv_sec, tp.tv_nsec);
without this patch, test result : tv_sec: 0, tv_nsec: 0
with this patch, test result : tv_sec: 1629271537, tv_nsec: 748000000
Move the vdso data page in front of the VDSO area to fix the issue.
Fixes: ad5d1122b8 ("riscv: use vDSO common flow to reduce the latency of the time-related functions")
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The asm/vdso.h will be included in vdso.lds.S in the next patch, the
following cleanup is needed to avoid syntax error:
1.the declaration of sys_riscv_flush_icache() is moved into asm/syscall.h.
2.the definition of struct vdso_data is moved into kernel/vdso.c.
2.the definition of VDSO_SYMBOL is placed under "#ifndef __ASSEMBLY__".
Also remove the redundant linux/types.h include.
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Since now there is kretprobe_trampoline_addr() for referring the
address of kretprobe trampoline code, we don't need to access
kretprobe_trampoline directly.
Make it harder to refer by renaming it to __kretprobe_trampoline().
Link: https://lkml.kernel.org/r/163163045446.489837.14510577516938803097.stgit@devnote2
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The __kretprobe_trampoline_handler() callback, called from low level
arch kprobes methods, has the 'trampoline_address' parameter, which is
entirely superfluous as it basically just replicates:
dereference_kernel_function_descriptor(kretprobe_trampoline)
In fact we had bugs in arch code where it wasn't replicated correctly.
So remove this superfluous parameter and use kretprobe_trampoline_addr()
instead.
Link: https://lkml.kernel.org/r/163163044546.489837.13505751885476015002.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
This clean up the error/notification messages in kprobes related code.
Basically this defines 'pr_fmt()' macros for each files and update
the messages which describes
- what happened,
- what is the kernel going to do or not do,
- is the kernel fine,
- what can the user do about it.
Also, if the message is not needed (e.g. the function returns unique
error code, or other error message is already shown.) remove it,
and replace the message with WARN_*() macros if suitable.
Link: https://lkml.kernel.org/r/163163036568.489837.14085396178727185469.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Now that the core code switched back to using thread_info::cpu to keep
a task's CPU number, we no longer need to keep it in sync explicitly. So
just drop the code that does this.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
- Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the
original hotplug code and now causing trouble with the ARM64 cache
topology setup due to the pointless SMP function call. It's not longer
required as the hotplug callbacks are guaranteed to be invoked on the
upcoming CPU.
- Remove the deprecated and now unused CPU hotplug functions
- Rewrite the CPU hotplug API documentation
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmE+VhMTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoUZmD/9Q7XO8EgfitIh3sMO53spOv6ql1aWK
1bHZmnFZL/txdIJiEgouf7wV4YgPgadJtZcK6//V/wGhYj5dB+z6otj+LwdrjjQT
dgaXN6a27My0kvoyNCP2V3Xc9g6Q6XXAUadw+d7aWGqZvg5yAr+AdRgGmK3Ct2a1
AsNjiG1HJsBMWv6eKnweOwfE6FbQpwFH4vXlldQi59QaMIOteMUwx9f64ZNyZWSe
FNqVF2EVmLEmjMzhWSBzYqVdZBEUuEsPM2Y2UYqGAs7Wtwttoupredvplzsf2uJ/
sCrDQspdgZsiD1EnjaSogLFUSfdRFd+9KvvChhuR8FSjPMNU+cWf62SAjVlUGIpI
QI2G6S7707LPbun8KSlbqsXD2zKmZ9U+SkTdwJFpRhkket73uVYtuuR0PjSxUrxt
BaULcpjKjf2joMji7BMvY7AR5bwnbDS+NUtqZpqhaUYHCjOZrPglGeUlLqth5epw
SMP21BQq8Ys9M5/6dA3ATUYaE1vJb2ES7jn6sULVJ9e9RuupdCl3KfdGCaH9fiWg
dfcowI9ACI+ZZ4OPJVvR/nlEVnK3GREYS5w3S/Ay1kLYpAfvGH2l3idzclfHMvWT
ywB2uyRKowAT/Ig7mL7t3Y7ZOMLTzG8KxPfl8ar8Ja+oqDbEL5VOnIQXev3gxBgC
1f4K8WUVGl+xXQ==
=uAYt
-----END PGP SIGNATURE-----
Merge tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
"Updates for the SMP and CPU hotplug:
- Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the
original hotplug code and now causing trouble with the ARM64 cache
topology setup due to the pointless SMP function call.
It's not longer required as the hotplug callbacks are guaranteed to
be invoked on the upcoming CPU.
- Remove the deprecated and now unused CPU hotplug functions
- Rewrite the CPU hotplug API documentation"
* tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation: core-api/cpuhotplug: Rewrite the API section
cpu/hotplug: Remove deprecated CPU-hotplug functions.
thermal: Replace deprecated CPU-hotplug functions.
drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
* A pair of defconfig additions, for NVMe and the EFI filesystem
localization options.
* A larger address space for stack randomization.
* A cleanup to our install rules.
* A DTS update for the Microchip Icicle board, to fix the serial
console.
* Support for build-time table sorting, which allows us to have
__ex_table read-only.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmE84S4THHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiYAlD/9NMmgarbs7kWtQjNPleAonwbgRHjW8
3yVXDhxuAIihkr+grje6eahon17gbLcLhzSq89fUBUKe0gvRLbjV/So8P2pYre2K
l5xIl+F72ULNM6KkjitPe972kFHnuR7IBlCy4zQhoNhc1XGd4qExE9T51v3dT3vg
flIJwuc1lg/Knz96EoiNMufS3UjNqNDxxa54anplGXZW25jumnQELFlQTsxlRO3F
FO+3JsiF7wzaxCNpWzmSsvMmXAaW9XerGKAIumtzRsXQ4EqPBfYdn/p+4A11QIUv
R4cdsl/QP5XST3zR74ZXblaVCHXUZun3N0FHbdhmTcu2stdcs7q3GtWfR6H+OiQm
igWwRn0MLPEiqmx12Ss+WELZOB/tyA14HHj6HE9gxbPcYcXO62Ok7Qg6gP4qMYkL
3v6yqZ/WWdJEi7mzRrk5mTZAMgAdXf5Je6eAOxY3hCbwtC2UMOA1qCj3Ir9XW/pN
TC/SDLDSqV9AfAZ2hRqxV1VLCASh0mTqt7yq+Vp/jBd4S8MevXV6LS9YGMxjTR2v
OzUMS53oZhddLSemZRi/cp1B8PP0Ih2A/f816iY+9MswJWZAkR6TqJL8TfSbQh2u
f3yTVAkRvavi7gnJ/s1HuzavRFtv452zmrBGhGh4IsGctSPWn/C+U2N5Pl3bEDhw
88R+rkAJkfLRkw==
=4SV6
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- A pair of defconfig additions, for NVMe and the EFI filesystem
localization options.
- A larger address space for stack randomization.
- A cleanup to our install rules.
- A DTS update for the Microchip Icicle board, to fix the serial
console.
- Support for build-time table sorting, which allows us to have
__ex_table read-only.
* tag 'riscv-for-linus-5.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Move EXCEPTION_TABLE to RO_DATA segment
riscv: Enable BUILDTIME_TABLE_SORT
riscv: dts: microchip: mpfs-icicle: Fix serial console
riscv: move the (z)install rules to arch/riscv/Makefile
riscv: Improve stack randomisation on RV64
riscv: defconfig: enable NLS_CODEPAGE_437, NLS_ISO8859_1
riscv: defconfig: enable BLK_DEV_NVME
_ex_table section is read-only, so move it to RO_DATA.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Ensure that all usage sites of get/put_online_cpus() except for the
struggler in drivers/thermal are gone. So the last user and the deprecated
inlines can be removed.
Merge more updates from Andrew Morton:
"147 patches, based on 7d2a07b769.
Subsystems affected by this patch series: mm (memory-hotplug, rmap,
ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
selftests, ipc, and scripts"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
scripts: check_extable: fix typo in user error message
mm/workingset: correct kernel-doc notations
ipc: replace costly bailout check in sysvipc_find_ipc()
selftests/memfd: remove unused variable
Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
configs: remove the obsolete CONFIG_INPUT_POLLDEV
prctl: allow to setup brk for et_dyn executables
pid: cleanup the stale comment mentioning pidmap_init().
kernel/fork.c: unexport get_{mm,task}_exe_file
coredump: fix memleak in dump_vma_snapshot()
fs/coredump.c: log if a core dump is aborted due to changed file permissions
nilfs2: use refcount_dec_and_lock() to fix potential UAF
nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
nilfs2: fix NULL pointer in nilfs_##name##_attr_release
nilfs2: fix memory leak in nilfs_sysfs_create_device_group
trap: cleanup trap_init()
init: move usermodehelper_enable() to populate_rootfs()
...
There are some empty trap_init() definitions in different ARCHs, Introduce
a new weak trap_init() function to clean them up.
Link: https://lkml.kernel.org/r/20210812123602.76356-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm32]
Acked-by: Vineet Gupta [arc]
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <palmerdabbelt@google.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Support for PC-relative instructions (auipc and branches) in kprobes.
* Support for forced IRQ threading.
* Support for the hlt/nohlt kernel command line options, via the generic
idle loop.
* Support for showing the edge/level triggered behavior of interrupts in
/proc/interrupts.
* A handful of cleanups to our address mapping mechanisms.
* Support for allocating gigantic hugepages via CMA.
* Support for the undefined behavior sanitizer.
* A handful of cleanups to the VDSO that allow the kernel to build with
LLD.
* Support for hugepage migration.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmEzxA4THHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYifuvD/93Tr9oAy6dwkGoa/VIzBc/nklCeTiO
cv2W9eFQMG+PdRh1Ezi9FrLO6qZoADGOPPcnuc2HTTe58CWWWg0juC862H/N2k3V
+3LGhHTv3E7NsEZNa4YhQXw6pPmj5OhbHab87HNowHpuNJGpKYE6Q+QbnYl5vorI
9YS10pRpqvO6tdHUW335Z5N0nW9X3HCHD6NUb1ZV8qd34QzP+XeCJMqVXpoOijNO
96lhyDYPivM2Rhi+uvtSjUeWzbddnN/P9IywfECddHFJyGvoEtiIROo9d2sR6pla
lAa1YOw0ybD/UPHtL/diEU+K8ivNVKdRN/YvlzHS4CdCrPyuWSUmBMozsOBMBwf1
UZCMDwmP2sQkHu1Ht9g1fXwk2R85j1s+qE5cZEXyAEULKfHlmt4vygd1xgm8DTf9
xgS+np2g0frcmKU8cLPmC7Sr0YI1fPeDMRkBXTYE1SmBF3eHrZCCbeM1H97p2mUC
zs/Dvyoj7e25HGyYqxN3lHZzhMluqh7wCQSOQMB2YxVwdiA4VjyjxLF4MKAdXL8i
F1QTETsjLGu6UiMfNIQ4RVs981/fGYSANdOvMfwo447gVH6JaxC51sLo6YIDtXss
tLzchHNu5IMNqB7XOb1gU8ZKeXruG9lrxwUOyEIYAVi6pHCbkdLV0A8ymwEjj06b
aGV4xFPaAGvDLQ==
=rDB3
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- support PC-relative instructions (auipc and branches) in kprobes
- support for forced IRQ threading
- support for the hlt/nohlt kernel command line options, via the
generic idle loop
- show the edge/level triggered behavior of interrupts
in /proc/interrupts
- a handful of cleanups to our address mapping mechanisms
- support for allocating gigantic hugepages via CMA
- support for the undefined behavior sanitizer (UBSAN)
- a handful of cleanups to the VDSO that allow the kernel to build with
LLD.
- support for hugepage migration
* tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
riscv: add support for hugepage migration
RISC-V: Fix VDSO build for !MMU
riscv: use strscpy to replace strlcpy
riscv: explicitly use symbol offsets for VDSO
riscv: Enable Undefined Behavior Sanitizer UBSAN
riscv: Keep the riscv Kconfig selects sorted
riscv: Support allocating gigantic hugepages using CMA
riscv: fix the global name pfn_base confliction error
riscv: Move early fdt mapping creation in its own function
riscv: Simplify BUILTIN_DTB device tree mapping handling
riscv: Use __maybe_unused instead of #ifdefs around variable declarations
riscv: Get rid of map_size parameter to create_kernel_page_table
riscv: Introduce va_kernel_pa_offset for 32-bit kernel
riscv: Optimize kernel virtual address conversion macro
dt-bindings: riscv: add starfive jh7100 bindings
riscv: Enable GENERIC_IRQ_SHOW_LEVEL
riscv: Enable idle generic idle loop
riscv: Allow forced irq threading
riscv: Implement thread_struct whitelist for hardened usercopy
riscv: kprobes: implement the branch instructions
...
DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework
to ensure that the cache related functions are called on the upcoming CPU
because the notifier itself could run on any online CPU.
The hotplug state machine guarantees that the callbacks are invoked on the
upcoming CPU. So there is no need to have this SMP function call
obfuscation. That indirection was missed when the hotplug notifiers were
converted.
This also solves the problem of ARM64 init_cache_level() invoking ACPI
functions which take a semaphore in that context. That's invalid as SMP
function calls run with interrupts disabled. Running it just from the
callback in context of the CPU hotplug thread solves this.
Fixes: 8571890e15 ("arm64: Add support for ACPI based firmware tables")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx
The strlcpy should not be used because it doesn't limit the source
length. As linus says, it's a completely useless function if you
can't implicitly trust the source string - but that is almost always
why people think they should use it! All in all the BSD function
will lead some potential bugs.
But the strscpy doesn't require reading memory from the src string
beyond the specified "count" bytes, and since the return value is
easier to error-check than strlcpy()'s. In addition, the implementation
is robust to the string changing out from underneath it, unlike the
current strlcpy() implementation.
Thus, We prefer using strscpy instead of strlcpy.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The current implementation of the `__rt_sigaction` reference computed an
absolute offset relative to the mapped base of the VDSO. While this can
be handled in the medlow model, the medany model cannot handle this as
it is meant to be position independent. The current implementation
relied on the BFD linker relaxing the PC-relative relocation into an
absolute relocation as it was a near-zero address allowing it to be
referenced relative to `zero`.
We now extract the offsets and create a generated header allowing the
build with LLVM and lld to succeed as we no longer depend on the linker
rewriting address references near zero. This change was largely
modelled after the ARM64 target which does something similar.
Signed-off-by: Saleem Abdulrasool <abdulras@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Select ARCH_HAS_UBSAN_SANITIZE_ALL in order to allow the user to
enable CONFIG_UBSAN_SANITIZE_ALL and instrument the entire kernel for
ubsan checks.
VDSO is excluded because its build doesn't include the
__ubsan_handle_*() functions from lib/ubsan.c, and the VDSO has no
sane way to report errors even if it has definitions of these functions.
Passed lib/test_ubsan.c test.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The value of FP registers in the core dump file comes from the
thread.fstate. However, kernel saves the FP registers to the thread.fstate
only before scheduling out the process. If no process switch happens
during the exception handling process, kernel will not have a chance to
save the latest value of FP registers to thread.fstate. It will cause the
value of FP registers in the core dump file may be incorrect. To solve this
problem, this patch force lets kernel save the FP register into the
thread.fstate if the target task_struct equals the current.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: b8c8a9590e ("RISC-V: Add FP register ptrace support for gdb.")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Function init_resources() allocates a boot memory block to hold an array of
resources which it adds to iomem_resource. The array is filled in from its
end and the function then attempts to free any unused memory at the
beginning. The problem is that size of the unused memory is incorrectly
calculated and this can result in releasing memory which is in use by
active resources. Their data then gets corrupted later when the memory is
reused by a different part of the system.
Fix the size of the released memory to correctly match the number of unused
resource entries.
Fixes: ffe0e52612 ("RISC-V: Improve init_resources()")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>
Tested-by: Sunil V L <sunilvl@ventanamicro.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The RISC-V special option '-mno-relax' which to disable linker relaxations
is supported by GCC8+. For GCC7 and lower versions do not support this
option.
Fixes: fba8a8674f ("RISC-V: Add kexec support")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This has been tested by probing a module that contains each of the
flavors of branches we have.
Signed-off-by: Chen Lifu <chenlifu@huawei.com>
[Palmer: commit message, fix kconfig errors]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This has been tested by probing a module that contains an auipc
instruction.
Signed-off-by: Chen Lifu <chenlifu@huawei.com>
[Palmer: commit message]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
In addition to We have a handful of new features for 5.14:
* Support for transparent huge pages.
* Support for generic PCI resources mapping.
* Support for the mem= kernel parameter.
* Support for KFENCE.
* A handful of fixes to avoid W+X mappings in the kernel.
* Support for VMAP_STACK based overflow detection.
* An optimized copy_{to,from}_user.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmDn3XITHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiW4UD/9Z0BJNXG9rOERofyFWwb7EYchT551A
Coi8BFpuUCZfT9qonuBzQcPrAlXH/T9yMDGiShZio9jh29bnaOIqo5NrvNjB88VU
LdarNeiumPM4SCQFIsbIBnRrk5OQDtzsPx+dS5xVUQlnHUV26xBakAJo3K3FxG9Y
bl2JU+LTvP52eBKpKHp0i2pbuC2z0dBEu9Y3d4q8phI+YeolJ0rgOCbMra+maCls
kk5TeROrDJpTg5INsWpVNvKvRk+sNlh0K9AsZHL1fIA8d2UFyTeCltUNjkWkIZA/
SBiS+ruxtLD9rTPOrF1i+rvW+rs/gL1LkPjOvoilTMuNm5ltCu5MLsflX6oZ5wX4
2vAXup6HQzLcJpCCq2QyMIl2s8ORV+gY89nYmRYHLzCoqGtF/WhbSFSKjqNM1+Kf
/M4C9q3kEopkdlHuKahR8dP4wYz6kP1kEijv83P21ahUFsShSLyLAegYoKO0pNeC
H/WOWMBqpG+rE80Tsctfo/z3g16Wc51yesYjXsOP5iRYoytG2uGLtzG7fPTjtyuC
Fa3Ue/9d/W/XyZ7bzolvGRoaeoHqoIXb07MsDLDhO2cHYg6B/wIvHXfPcM7ovR6f
VT0aRu85dvvewlukuqyH5+rwSPNQfzHo32GIiK7h8QjjIEh1axOFi+7oXGfbcIUw
EP0u4AZsK9iHZg==
=uB+H
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
"We have a handful of new features for 5.14:
- Support for transparent huge pages.
- Support for generic PCI resources mapping.
- Support for the mem= kernel parameter.
- Support for KFENCE.
- A handful of fixes to avoid W+X mappings in the kernel.
- Support for VMAP_STACK based overflow detection.
- An optimized copy_{to,from}_user"
* tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (37 commits)
riscv: xip: Fix duplicate included asm/pgtable.h
riscv: Fix PTDUMP output now BPF region moved back to module region
riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
riscv: add VMAP_STACK overflow detection
riscv: ptrace: add argn syntax
riscv: mm: fix build errors caused by mk_pmd()
riscv: Introduce structure that group all variables regarding kernel mapping
riscv: Map the kernel with correct permissions the first time
riscv: Introduce set_kernel_memory helper
riscv: Enable KFENCE for riscv64
RISC-V: Use asm-generic for {in,out}{bwlq}
riscv: add ASID-based tlbflushing methods
riscv: pass the mm_struct to __sbi_tlb_flush_range
riscv: Add mem kernel parameter support
riscv: Simplify xip and !xip kernel address conversion macros
riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
riscv: Only initialize swiotlb when necessary
riscv: fix typo in init.c
riscv: Cleanup unused functions
riscv: mm: Use better bitmap_zalloc()
...
Clean up the following includecheck warning:
./arch/riscv/kernel/vmlinux-xip.lds.S: asm/pgtable.h is included more
than once.
No functional change.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
We have a lot of variables that are used to hold kernel mapping addresses,
offsets between physical and virtual mappings and some others used for XIP
kernels: they are all defined at different places in mm/init.c, so group
them into a single structure with, for some of them, more explicit and concise
names.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This contains both the short-term fix for the W+X boot mappings and the
larger cleanup.
* riscv-wx-mappings:
riscv: Map the kernel with correct permissions the first time
riscv: Introduce set_kernel_memory helper
riscv: Simplify xip and !xip kernel address conversion macros
riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
riscv: mm: Fix W+X mappings at boot
For 64-bit kernels, we map all the kernel with write and execute
permissions and afterwards remove writability from text and executability
from data.
For 32-bit kernels, the kernel mapping resides in the linear mapping, so we
map all the linear mapping as writable and executable and afterwards we
remove those properties for unused memory and kernel mapping as
described above.
Change this behavior to directly map the kernel with correct permissions
and avoid going through the whole mapping to fix the permissions.
At the same time, this fixes an issue introduced by commit 2bfc6cd81b
("riscv: Move kernel mapping outside of linear mapping") as reported
here https://github.com/starfive-tech/linux/issues/17.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>