c5e22feffd
Both ACPI and DT provide the ability to describe additional layers of topology between that of individual cores and higher level constructs such as the level at which the last level cache is shared. In ACPI this can be represented in PPTT as a Processor Hierarchy Node Structure [1] that is the parent of the CPU cores and in turn has a parent Processor Hierarchy Node Structure representing a higher level of topology.

For example, Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each cluster has 4 CPUs. All clusters share L3 cache data, but each cluster has a local L3 tag. On the other hand, each cluster shares some internal system bus.

[Diagram: six clusters of four CPUs each; every cluster is connected to its own L3 tag, and all clusters share a single L3 data block.]

That means spreading tasks among clusters will bring more bandwidth, while packing tasks within one cluster will lead to smaller cache synchronization latency. So both kernel and userspace have a chance to leverage this topology and deploy tasks accordingly, to achieve either smaller cache latency within one cluster or an even distribution of load among clusters for higher throughput.

This patch exposes cluster topology to both kernel and userspace. Libraries like hwloc will learn about clusters through cluster_cpus and the related sysfs attributes. A PoC of hwloc support is at [2].

Note that this patch only handles the ACPI case.

Special consideration is needed for SMT processors, where it is necessary to move 2 levels up the hierarchy from the leaf nodes (thus skipping the processor core level).

Note that arm64 / ACPI does not provide any means of identifying a die level in the topology, but that may be unrelated to the cluster level.

[1] ACPI Specification 6.3 - section 5.2.29.1 processor hierarchy node structure (Type 0)
[2] https://github.com/hisilicon/hwloc/tree/linux-cluster

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210924085104.44806-2-21cnbao@gmail.com
124 lines
5.5 KiB
Plaintext
What:		/sys/devices/system/cpu/dscr_default
Date:		13-May-2014
KernelVersion:	v3.15.0
Contact:
Description:	Writes are equivalent to writing to
		/sys/devices/system/cpu/cpuN/dscr on all CPUs.
		Reads return the last written value or 0.
		This value is not a global default: it is a way to set
		all per-CPU defaults at the same time.
Values:		64 bit unsigned integer (bit field)

What:		/sys/devices/system/cpu/cpu[0-9]+/dscr
Date:		13-May-2014
KernelVersion:	v3.15.0
Contact:
Description:	Default value for the Data Stream Control Register (DSCR) on
		a CPU.
		This default value is used when the kernel is executing and
		for any process that has not set the DSCR itself.
		If a process ever sets the DSCR (via direct access to the
		SPR) that value will be persisted for that process and used
		on any CPU where it executes (overriding the value described
		here).
		If set by a process it will be inherited by child processes.
Values:		64 bit unsigned integer (bit field)

What:		/sys/devices/system/cpu/cpuX/topology/physical_package_id
Description:	physical package ID of cpuX. Typically corresponds to a physical
		socket number, but the actual value is architecture and platform
		dependent.
Values:		integer

What:		/sys/devices/system/cpu/cpuX/topology/die_id
Description:	the CPU die ID of cpuX. Typically it is the hardware platform's
		identifier (rather than the kernel's). The actual value is
		architecture and platform dependent.
Values:		integer

What:		/sys/devices/system/cpu/cpuX/topology/core_id
Description:	the CPU core ID of cpuX. Typically it is the hardware platform's
		identifier (rather than the kernel's). The actual value is
		architecture and platform dependent.
Values:		integer

What:		/sys/devices/system/cpu/cpuX/topology/cluster_id
Description:	the cluster ID of cpuX. Typically it is the hardware platform's
		identifier (rather than the kernel's). The actual value is
		architecture and platform dependent.
Values:		integer

What:		/sys/devices/system/cpu/cpuX/topology/book_id
Description:	the book ID of cpuX. Typically it is the hardware platform's
		identifier (rather than the kernel's). The actual value is
		architecture and platform dependent. It is only used on s390.
Values:		integer

What:		/sys/devices/system/cpu/cpuX/topology/drawer_id
Description:	the drawer ID of cpuX. Typically it is the hardware platform's
		identifier (rather than the kernel's). The actual value is
		architecture and platform dependent. It is only used on s390.
Values:		integer

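The ID attributes above are plain integers, one per file. A minimal sketch of reading them for one CPU follows; the helper name read_topology_ids and its root parameter are illustrative assumptions, not part of any kernel or library interface. Attributes that a given architecture or kernel does not expose (for example book_id outside s390) are skipped.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")

# Attribute names taken from the entries documented above.
ID_ATTRS = ("physical_package_id", "die_id", "cluster_id", "core_id",
            "book_id", "drawer_id")


def read_topology_ids(cpu, root=SYSFS_CPU):
    """Return {attribute: int} for the ID files that exist for one CPU.

    Hypothetical helper: 'root' is overridable only so the function can be
    exercised against a fake directory tree.
    """
    topo = Path(root) / f"cpu{cpu}" / "topology"
    ids = {}
    for name in ID_ATTRS:
        try:
            ids[name] = int((topo / name).read_text().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not an integer: leave it out.
            continue
    return ids
```

Calling read_topology_ids(0) on a running system returns whichever of these IDs the platform provides; the values themselves remain architecture and platform dependent.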
What:		/sys/devices/system/cpu/cpuX/topology/core_cpus
Description:	internal kernel map of CPUs within the same core.
		(deprecated name: "thread_siblings")
Values:		hexadecimal bitmask.

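The hexadecimal bitmask attributes can be decoded by treating the text as one large hexadecimal number in which bit N stands for CPU N. The sketch below assumes the usual kernel cpumask formatting, where wide masks are printed as comma-separated 32-bit groups with the most significant group first; parse_cpumask is an illustrative name.

```python
def parse_cpumask(text):
    """Convert a sysfs hexadecimal CPU bitmask into a sorted list of CPU numbers.

    Assumption: groups after the first are zero-padded to 8 hex digits
    (e.g. "00000001,00000000"), so dropping the commas yields one number.
    """
    value = int(text.strip().replace(",", ""), 16)
    cpus = []
    bit = 0
    while value:
        if value & 1:
            cpus.append(bit)  # bit N set means CPU N is in the mask
        value >>= 1
        bit += 1
    return cpus
```

For instance, a core_cpus value of "0f" decodes to CPUs 0 through 3.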
What:		/sys/devices/system/cpu/cpuX/topology/core_cpus_list
Description:	human-readable list of CPUs within the same core.
		The format is like 0-3, 8-11, 14, 17.
		(deprecated name: "thread_siblings_list").
Values:		decimal list.

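The decimal list attributes use comma-separated single CPUs and inclusive ranges. A minimal sketch of expanding such a list is shown here; parse_cpu_list is an illustrative name, and the parser tolerates optional spaces after commas so it accepts both the form shown in the description and the form without spaces.

```python
def parse_cpu_list(text):
    """Expand a list such as "0-3,8-11,14,17" into individual CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue  # empty file or stray separator
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))  # ranges are inclusive
        else:
            cpus.append(int(part))
    return cpus
```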
What:		/sys/devices/system/cpu/cpuX/topology/package_cpus
Description:	internal kernel map of the CPUs sharing the same physical_package_id.
		(deprecated name: "core_siblings").
Values:		hexadecimal bitmask.

What:		/sys/devices/system/cpu/cpuX/topology/package_cpus_list
Description:	human-readable list of CPUs sharing the same physical_package_id.
		The format is like 0-3, 8-11, 14, 17.
		(deprecated name: "core_siblings_list")
Values:		decimal list.

What:		/sys/devices/system/cpu/cpuX/topology/die_cpus
Description:	internal kernel map of CPUs within the same die.
Values:		hexadecimal bitmask.

What:		/sys/devices/system/cpu/cpuX/topology/die_cpus_list
Description:	human-readable list of CPUs within the same die.
		The format is like 0-3, 8-11, 14, 17.
Values:		decimal list.

What:		/sys/devices/system/cpu/cpuX/topology/cluster_cpus
Description:	internal kernel map of CPUs within the same cluster.
Values:		hexadecimal bitmask.

What:		/sys/devices/system/cpu/cpuX/topology/cluster_cpus_list
Description:	human-readable list of CPUs within the same cluster.
		The format is like 0-3, 8-11, 14, 17.
Values:		decimal list.

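Userspace that wants to spread or pack tasks by cluster first needs the set of distinct clusters. One possible approach, sketched below, is to read cluster_cpus_list for every cpuN directory and collect the unique CPU groups. The function names and the root parameter are illustrative assumptions; on kernels that do not expose the cluster attributes the result is simply empty.

```python
import re
from pathlib import Path


def expand_cpu_list(text):
    """Expand "0-3,8" style text into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def clusters(root="/sys/devices/system/cpu"):
    """Return the distinct clusters as a sorted list of CPU-number tuples."""
    found = set()
    for cpu_dir in Path(root).glob("cpu[0-9]*"):
        if not re.fullmatch(r"cpu\d+", cpu_dir.name):
            continue  # ignore anything that is not a cpuN directory
        attr = cpu_dir / "topology" / "cluster_cpus_list"
        try:
            cpus = tuple(expand_cpu_list(attr.read_text()))
        except OSError:
            continue  # attribute absent for this CPU
        if cpus:
            found.add(cpus)  # every CPU of a cluster reports the same list
    return sorted(found)
```

Each returned tuple could then be used as an affinity set, for example with os.sched_setaffinity, to keep cooperating tasks inside one cluster or to place one task per cluster.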
What:		/sys/devices/system/cpu/cpuX/topology/book_siblings
Description:	internal kernel map of cpuX's hardware threads within the same
		book_id. It is only used on s390.
Values:		hexadecimal bitmask.

What:		/sys/devices/system/cpu/cpuX/topology/book_siblings_list
Description:	human-readable list of cpuX's hardware threads within the same
		book_id.
		The format is like 0-3, 8-11, 14, 17. It is only used on s390.
Values:		decimal list.

What:		/sys/devices/system/cpu/cpuX/topology/drawer_siblings
Description:	internal kernel map of cpuX's hardware threads within the same
		drawer_id. It is only used on s390.
Values:		hexadecimal bitmask.

What:		/sys/devices/system/cpu/cpuX/topology/drawer_siblings_list
Description:	human-readable list of cpuX's hardware threads within the same
		drawer_id.
		The format is like 0-3, 8-11, 14, 17. It is only used on s390.
Values:		decimal list.