Sync with upstream branch release/2.34/master

Upstream commit: 91c2e6c3db44297bf4cb3a2e3c40236c5b6a0b23

- dlfcn: Implement the RTLD_DI_PHDR request type for dlinfo
- manual: Document the dlinfo function
- x86: Fix fallback for wcsncmp_avx2 in strcmp-avx2.S [BZ #28896]
- x86: Fix bug in strncmp-evex and strncmp-avx2 [BZ #28895]
- x86: Set .text section in memset-vec-unaligned-erms
- x86-64: Optimize bzero
- x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)
- x86: Improve vec generation in memset-vec-unaligned-erms.S
- x86-64: Fix strcmp-evex.S
- x86-64: Fix strcmp-avx2.S
- x86: Optimize strcmp-evex.S
- x86: Optimize strcmp-avx2.S
- manual: Clarify that abbreviations of long options are allowed
- Add HWCAP2_AFP, HWCAP2_RPRES from Linux 5.17 to AArch64 bits/hwcap.h
- aarch64: Add HWCAP2_ECV from Linux 5.16
- Add SOL_MPTCP, SOL_MCTP from Linux 5.16 to bits/socket.h
- Update kernel version to 5.17 in tst-mman-consts.py
- Update kernel version to 5.16 in tst-mman-consts.py
- Update syscall lists for Linux 5.17
- Add ARPHRD_CAN, ARPHRD_MCTP to net/if_arp.h
- Update kernel version to 5.15 in tst-mman-consts.py
- Add PF_MCTP, AF_MCTP from Linux 5.15 to bits/socket.h
Florian Weimer 2022-05-12 20:17:16 +02:00
parent 8f8d2743e4
commit 329e925ee9
23 changed files with 6268 additions and 1 deletions

@@ -0,0 +1,35 @@
commit bc6fba3c8048b11c9f73db03339c97a2fec3f0cf
Author: Joseph Myers <joseph@codesourcery.com>
Date: Wed Nov 17 14:25:16 2021 +0000

    Add PF_MCTP, AF_MCTP from Linux 5.15 to bits/socket.h

    Linux 5.15 adds a new address / protocol family PF_MCTP / AF_MCTP; add
    these constants to bits/socket.h.

    Tested for x86_64.

    (cherry picked from commit bdeb7a8fa9989d18dab6310753d04d908125dc1d)

diff --git a/sysdeps/unix/sysv/linux/bits/socket.h b/sysdeps/unix/sysv/linux/bits/socket.h
index a011a8c0959b9970..7bb9e863d7329da9 100644
--- a/sysdeps/unix/sysv/linux/bits/socket.h
+++ b/sysdeps/unix/sysv/linux/bits/socket.h
@@ -86,7 +86,8 @@ typedef __socklen_t socklen_t;
#define PF_QIPCRTR 42 /* Qualcomm IPC Router. */
#define PF_SMC 43 /* SMC sockets. */
#define PF_XDP 44 /* XDP sockets. */
-#define PF_MAX 45 /* For now.. */
+#define PF_MCTP 45 /* Management component transport protocol. */
+#define PF_MAX 46 /* For now.. */
/* Address families. */
#define AF_UNSPEC PF_UNSPEC
@@ -137,6 +138,7 @@ typedef __socklen_t socklen_t;
#define AF_QIPCRTR PF_QIPCRTR
#define AF_SMC PF_SMC
#define AF_XDP PF_XDP
+#define AF_MCTP PF_MCTP
#define AF_MAX PF_MAX
/* Socket level values. Others are defined in the appropriate headers.
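A minimal sketch of how a program might use the new family constant while still building against a C library that predates this patch. The fallback value 45 is the one the patch assigns; `mctp_supported` is a hypothetical helper name, not part of glibc, and whether the socket call succeeds depends on the running kernel.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallbacks for headers older than this patch; 45 is the value the
   patch assigns to PF_MCTP.  */
#ifndef PF_MCTP
# define PF_MCTP 45
#endif
#ifndef AF_MCTP
# define AF_MCTP PF_MCTP
#endif

/* Hypothetical helper: returns 1 if the kernel accepts an AF_MCTP
   datagram socket, 0 if the address family is not supported, and -1
   for any other failure (for example, missing permissions).  */
int
mctp_supported (void)
{
  int fd = socket (AF_MCTP, SOCK_DGRAM, 0);
  if (fd >= 0)
    {
      close (fd);
      return 1;
    }
  return errno == EAFNOSUPPORT ? 0 : -1;
}
```

Probing at run time matters because the header constant only says that the C library knows the number, not that the kernel was built with MCTP support.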

@@ -0,0 +1,27 @@
commit fd5dbfd1cd98cb2f12f9e9f7004a4d25ab0c977f
Author: Joseph Myers <joseph@codesourcery.com>
Date: Mon Nov 22 15:30:12 2021 +0000

    Update kernel version to 5.15 in tst-mman-consts.py

    This patch updates the kernel version in the test tst-mman-consts.py
    to 5.15. (There are no new MAP_* constants covered by this test in
    5.15 that need any other header changes.)

    Tested with build-many-glibcs.py.

    (cherry picked from commit 5c3ece451d46a7d8721311609bfcb6faafacb39e)

diff --git a/sysdeps/unix/sysv/linux/tst-mman-consts.py b/sysdeps/unix/sysv/linux/tst-mman-consts.py
index 810433c238f31c25..eeccdfd04dae57ab 100644
--- a/sysdeps/unix/sysv/linux/tst-mman-consts.py
+++ b/sysdeps/unix/sysv/linux/tst-mman-consts.py
@@ -33,7 +33,7 @@ def main():
help='C compiler (including options) to use')
args = parser.parse_args()
linux_version_headers = glibcsyscalls.linux_kernel_version(args.cc)
- linux_version_glibc = (5, 14)
+ linux_version_glibc = (5, 15)
sys.exit(glibcextract.compare_macro_consts(
'#define _GNU_SOURCE 1\n'
'#include <sys/mman.h>\n',

@@ -0,0 +1,28 @@
commit 5146b73d72ced9bab125e986aa99ef5fe2f88475
Author: Joseph Myers <joseph@codesourcery.com>
Date: Mon Dec 20 15:38:32 2021 +0000

    Add ARPHRD_CAN, ARPHRD_MCTP to net/if_arp.h

    Add the constant ARPHRD_MCTP, from Linux 5.15, to net/if_arp.h, along
    with ARPHRD_CAN which was added to Linux in version 2.6.25 (commit
    cd05acfe65ed2cf2db683fa9a6adb8d35635263b, "[CAN]: Allocate protocol
    numbers for PF_CAN") but apparently missed for glibc at the time.

    Tested for x86_64.

    (cherry picked from commit a94d9659cd69dbc70d3494b1cbbbb5a1551675c5)

diff --git a/sysdeps/unix/sysv/linux/net/if_arp.h b/sysdeps/unix/sysv/linux/net/if_arp.h
index 2a8933cde7cf236d..42910b776660def1 100644
--- a/sysdeps/unix/sysv/linux/net/if_arp.h
+++ b/sysdeps/unix/sysv/linux/net/if_arp.h
@@ -95,6 +95,8 @@ struct arphdr
#define ARPHRD_ROSE 270
#define ARPHRD_X25 271 /* CCITT X.25. */
#define ARPHRD_HWX25 272 /* Boards with X.25 in firmware. */
+#define ARPHRD_CAN 280 /* Controller Area Network. */
+#define ARPHRD_MCTP 290
#define ARPHRD_PPP 512
#define ARPHRD_CISCO 513 /* Cisco HDLC. */
#define ARPHRD_HDLC ARPHRD_CISCO

@@ -0,0 +1,337 @@
commit 6af165658d0999ac2c4e9ce88bee020fbc2ee49f
Author: Joseph Myers <joseph@codesourcery.com>
Date: Wed Mar 23 17:11:56 2022 +0000

    Update syscall lists for Linux 5.17

    Linux 5.17 has one new syscall, set_mempolicy_home_node. Update
    syscall-names.list and regenerate the arch-syscall.h headers with
    build-many-glibcs.py update-syscalls.

    Tested with build-many-glibcs.py.

    (cherry picked from commit 8ef9196b26793830515402ea95aca2629f7721ec)

diff --git a/sysdeps/unix/sysv/linux/aarch64/arch-syscall.h b/sysdeps/unix/sysv/linux/aarch64/arch-syscall.h
index 9905ebedf298954c..4fcb6da80af37e9e 100644
--- a/sysdeps/unix/sysv/linux/aarch64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/aarch64/arch-syscall.h
@@ -236,6 +236,7 @@
#define __NR_sendmsg 211
#define __NR_sendto 206
#define __NR_set_mempolicy 237
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 99
#define __NR_set_tid_address 96
#define __NR_setdomainname 162
diff --git a/sysdeps/unix/sysv/linux/alpha/arch-syscall.h b/sysdeps/unix/sysv/linux/alpha/arch-syscall.h
index ee8085be69958b25..0cf74c1a96bb1235 100644
--- a/sysdeps/unix/sysv/linux/alpha/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/alpha/arch-syscall.h
@@ -391,6 +391,7 @@
#define __NR_sendmsg 114
#define __NR_sendto 133
#define __NR_set_mempolicy 431
+#define __NR_set_mempolicy_home_node 560
#define __NR_set_robust_list 466
#define __NR_set_tid_address 411
#define __NR_setdomainname 166
diff --git a/sysdeps/unix/sysv/linux/arc/arch-syscall.h b/sysdeps/unix/sysv/linux/arc/arch-syscall.h
index 1b626d97705d545a..c1207aaa12be6a51 100644
--- a/sysdeps/unix/sysv/linux/arc/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/arc/arch-syscall.h
@@ -238,6 +238,7 @@
#define __NR_sendmsg 211
#define __NR_sendto 206
#define __NR_set_mempolicy 237
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 99
#define __NR_set_tid_address 96
#define __NR_setdomainname 162
diff --git a/sysdeps/unix/sysv/linux/arm/arch-syscall.h b/sysdeps/unix/sysv/linux/arm/arch-syscall.h
index 96ef8db9368e7de4..e7ba04c106d8af7d 100644
--- a/sysdeps/unix/sysv/linux/arm/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/arm/arch-syscall.h
@@ -302,6 +302,7 @@
#define __NR_sendmsg 296
#define __NR_sendto 290
#define __NR_set_mempolicy 321
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 338
#define __NR_set_tid_address 256
#define __NR_set_tls 983045
diff --git a/sysdeps/unix/sysv/linux/csky/arch-syscall.h b/sysdeps/unix/sysv/linux/csky/arch-syscall.h
index 96910154ed6a5c1b..dc9383758ebc641b 100644
--- a/sysdeps/unix/sysv/linux/csky/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/csky/arch-syscall.h
@@ -250,6 +250,7 @@
#define __NR_sendmsg 211
#define __NR_sendto 206
#define __NR_set_mempolicy 237
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 99
#define __NR_set_thread_area 244
#define __NR_set_tid_address 96
diff --git a/sysdeps/unix/sysv/linux/hppa/arch-syscall.h b/sysdeps/unix/sysv/linux/hppa/arch-syscall.h
index 36675fd48e6f50c5..767f1287a30b473e 100644
--- a/sysdeps/unix/sysv/linux/hppa/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/hppa/arch-syscall.h
@@ -289,6 +289,7 @@
#define __NR_sendmsg 183
#define __NR_sendto 82
#define __NR_set_mempolicy 262
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 289
#define __NR_set_tid_address 237
#define __NR_setdomainname 121
diff --git a/sysdeps/unix/sysv/linux/i386/arch-syscall.h b/sysdeps/unix/sysv/linux/i386/arch-syscall.h
index c86ccbda4681066c..1998f0d76a444cac 100644
--- a/sysdeps/unix/sysv/linux/i386/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/i386/arch-syscall.h
@@ -323,6 +323,7 @@
#define __NR_sendmsg 370
#define __NR_sendto 369
#define __NR_set_mempolicy 276
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 311
#define __NR_set_thread_area 243
#define __NR_set_tid_address 258
diff --git a/sysdeps/unix/sysv/linux/ia64/arch-syscall.h b/sysdeps/unix/sysv/linux/ia64/arch-syscall.h
index d898bce404955ef0..b2eab1b93d70b9de 100644
--- a/sysdeps/unix/sysv/linux/ia64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/ia64/arch-syscall.h
@@ -272,6 +272,7 @@
#define __NR_sendmsg 1205
#define __NR_sendto 1199
#define __NR_set_mempolicy 1261
+#define __NR_set_mempolicy_home_node 1474
#define __NR_set_robust_list 1298
#define __NR_set_tid_address 1233
#define __NR_setdomainname 1129
diff --git a/sysdeps/unix/sysv/linux/m68k/arch-syscall.h b/sysdeps/unix/sysv/linux/m68k/arch-syscall.h
index fe721b809076abeb..5fc3723772f92516 100644
--- a/sysdeps/unix/sysv/linux/m68k/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/m68k/arch-syscall.h
@@ -310,6 +310,7 @@
#define __NR_sendmsg 367
#define __NR_sendto 366
#define __NR_set_mempolicy 270
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 304
#define __NR_set_thread_area 334
#define __NR_set_tid_address 253
diff --git a/sysdeps/unix/sysv/linux/microblaze/arch-syscall.h b/sysdeps/unix/sysv/linux/microblaze/arch-syscall.h
index 6e10c3661db96a1e..b6e9b007e496cd80 100644
--- a/sysdeps/unix/sysv/linux/microblaze/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/microblaze/arch-syscall.h
@@ -326,6 +326,7 @@
#define __NR_sendmsg 360
#define __NR_sendto 353
#define __NR_set_mempolicy 276
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 311
#define __NR_set_thread_area 243
#define __NR_set_tid_address 258
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h b/sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h
index 26a6d594a2222f15..b3a3871f8ab8a23e 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h
@@ -308,6 +308,7 @@
#define __NR_sendmsg 4179
#define __NR_sendto 4180
#define __NR_set_mempolicy 4270
+#define __NR_set_mempolicy_home_node 4450
#define __NR_set_robust_list 4309
#define __NR_set_thread_area 4283
#define __NR_set_tid_address 4252
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h b/sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h
index 83e0d49c5e3ca1bc..b462182723aff286 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h
@@ -288,6 +288,7 @@
#define __NR_sendmsg 6045
#define __NR_sendto 6043
#define __NR_set_mempolicy 6233
+#define __NR_set_mempolicy_home_node 6450
#define __NR_set_robust_list 6272
#define __NR_set_thread_area 6246
#define __NR_set_tid_address 6213
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h b/sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h
index d6747c542f63202b..a9d6b94572e93001 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h
@@ -270,6 +270,7 @@
#define __NR_sendmsg 5045
#define __NR_sendto 5043
#define __NR_set_mempolicy 5229
+#define __NR_set_mempolicy_home_node 5450
#define __NR_set_robust_list 5268
#define __NR_set_thread_area 5242
#define __NR_set_tid_address 5212
diff --git a/sysdeps/unix/sysv/linux/nios2/arch-syscall.h b/sysdeps/unix/sysv/linux/nios2/arch-syscall.h
index 4ee209bc4475ea7d..809a219ef32a45ef 100644
--- a/sysdeps/unix/sysv/linux/nios2/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/nios2/arch-syscall.h
@@ -250,6 +250,7 @@
#define __NR_sendmsg 211
#define __NR_sendto 206
#define __NR_set_mempolicy 237
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 99
#define __NR_set_tid_address 96
#define __NR_setdomainname 162
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/arch-syscall.h b/sysdeps/unix/sysv/linux/powerpc/powerpc32/arch-syscall.h
index 497299fbc47a708c..627831ebae1b9e90 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/arch-syscall.h
@@ -319,6 +319,7 @@
#define __NR_sendmsg 341
#define __NR_sendto 335
#define __NR_set_mempolicy 261
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 300
#define __NR_set_tid_address 232
#define __NR_setdomainname 121
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/arch-syscall.h b/sysdeps/unix/sysv/linux/powerpc/powerpc64/arch-syscall.h
index e840279f171b10b9..bae597199d79eaad 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/arch-syscall.h
@@ -298,6 +298,7 @@
#define __NR_sendmsg 341
#define __NR_sendto 335
#define __NR_set_mempolicy 261
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 300
#define __NR_set_tid_address 232
#define __NR_setdomainname 121
diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h b/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
index 73ef74c005e5a2bb..bf4be80f8d380963 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
@@ -228,6 +228,7 @@
#define __NR_sendmsg 211
#define __NR_sendto 206
#define __NR_set_mempolicy 237
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 99
#define __NR_set_tid_address 96
#define __NR_setdomainname 162
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h b/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
index 919a79ee91177459..d656aedcc2be6009 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
@@ -235,6 +235,7 @@
#define __NR_sendmsg 211
#define __NR_sendto 206
#define __NR_set_mempolicy 237
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 99
#define __NR_set_tid_address 96
#define __NR_setdomainname 162
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/arch-syscall.h b/sysdeps/unix/sysv/linux/s390/s390-32/arch-syscall.h
index 005c0ada7aab85a1..57025107e82c9439 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/arch-syscall.h
@@ -311,6 +311,7 @@
#define __NR_sendmsg 370
#define __NR_sendto 369
#define __NR_set_mempolicy 270
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 304
#define __NR_set_tid_address 252
#define __NR_setdomainname 121
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/arch-syscall.h b/sysdeps/unix/sysv/linux/s390/s390-64/arch-syscall.h
index 9131fddcc16116e4..72e19c6d569fbf9b 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/arch-syscall.h
@@ -278,6 +278,7 @@
#define __NR_sendmsg 370
#define __NR_sendto 369
#define __NR_set_mempolicy 270
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 304
#define __NR_set_tid_address 252
#define __NR_setdomainname 121
diff --git a/sysdeps/unix/sysv/linux/sh/arch-syscall.h b/sysdeps/unix/sysv/linux/sh/arch-syscall.h
index d8fb041568ecb4da..d52b522d9cac87ef 100644
--- a/sysdeps/unix/sysv/linux/sh/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/sh/arch-syscall.h
@@ -303,6 +303,7 @@
#define __NR_sendmsg 355
#define __NR_sendto 349
#define __NR_set_mempolicy 276
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 311
#define __NR_set_tid_address 258
#define __NR_setdomainname 121
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/arch-syscall.h b/sysdeps/unix/sysv/linux/sparc/sparc32/arch-syscall.h
index 2bc014fe6a1a1f4a..d3f4d8aa3edb4795 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/arch-syscall.h
@@ -310,6 +310,7 @@
#define __NR_sendmsg 114
#define __NR_sendto 133
#define __NR_set_mempolicy 305
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 300
#define __NR_set_tid_address 166
#define __NR_setdomainname 163
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/arch-syscall.h b/sysdeps/unix/sysv/linux/sparc/sparc64/arch-syscall.h
index 76dbbe595ffe868f..2cc03d7a24453335 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/arch-syscall.h
@@ -286,6 +286,7 @@
#define __NR_sendmsg 114
#define __NR_sendto 133
#define __NR_set_mempolicy 305
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 300
#define __NR_set_tid_address 166
#define __NR_setdomainname 163
diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
index 0bc2af37dfa1eeb5..e2743c649586d97a 100644
--- a/sysdeps/unix/sysv/linux/syscall-names.list
+++ b/sysdeps/unix/sysv/linux/syscall-names.list
@@ -21,8 +21,8 @@
# This file can list all potential system calls. The names are only
# used if the installed kernel headers also provide them.
-# The list of system calls is current as of Linux 5.16.
-kernel 5.16
+# The list of system calls is current as of Linux 5.17.
+kernel 5.17
FAST_atomic_update
FAST_cmpxchg
@@ -523,6 +523,7 @@ sendmmsg
sendmsg
sendto
set_mempolicy
+set_mempolicy_home_node
set_robust_list
set_thread_area
set_tid_address
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/arch-syscall.h b/sysdeps/unix/sysv/linux/x86_64/64/arch-syscall.h
index 28558279b48a1ef4..b4ab892ec183e32d 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/x86_64/64/arch-syscall.h
@@ -278,6 +278,7 @@
#define __NR_sendmsg 46
#define __NR_sendto 44
#define __NR_set_mempolicy 238
+#define __NR_set_mempolicy_home_node 450
#define __NR_set_robust_list 273
#define __NR_set_thread_area 205
#define __NR_set_tid_address 218
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h b/sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h
index c1ab8ec45e8b8fd3..772559c87b3625b8 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h
@@ -270,6 +270,7 @@
#define __NR_sendmsg 1073742342
#define __NR_sendto 1073741868
#define __NR_set_mempolicy 1073742062
+#define __NR_set_mempolicy_home_node 1073742274
#define __NR_set_robust_list 1073742354
#define __NR_set_thread_area 1073742029
#define __NR_set_tid_address 1073742042
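The regenerated headers above give set_mempolicy_home_node the number 450 on most ABIs and a shifted value on a few. The sketch below records that pattern as a lookup; the per-ABI offsets are simply read off the numbers in this patch (an observation, not taken from kernel documentation), and the enum and function names are hypothetical.

```c
/* Hypothetical ABI tags for the headers touched by the patch.  */
enum abi
{
  ABI_GENERIC,   /* aarch64, arm, i386, x86_64, riscv, s390, ...  */
  ABI_ALPHA,
  ABI_IA64,
  ABI_MIPS_O32,
  ABI_MIPS_N64,
  ABI_MIPS_N32,
  ABI_X32
};

/* Returns the __NR_set_mempolicy_home_node value listed in the patch
   for the given ABI, expressed as 450 plus the offset observed there.  */
long
set_mempolicy_home_node_nr (enum abi abi)
{
  const long generic = 450;
  switch (abi)
    {
    case ABI_ALPHA:    return generic + 110;
    case ABI_IA64:     return generic + 1024;
    case ABI_MIPS_O32: return generic + 4000;
    case ABI_MIPS_N64: return generic + 5000;
    case ABI_MIPS_N32: return generic + 6000;
    case ABI_X32:      return generic + 0x40000000L;
    default:           return generic;
    }
}
```

Application code would normally use the `__NR_set_mempolicy_home_node` macro from the installed headers instead of a table like this; the lookup only makes the regularity in the generated files easier to see.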

@@ -0,0 +1,27 @@
commit 81181ba5d916fc49bd737f603e28a3c2dc8430b4
Author: Joseph Myers <joseph@codesourcery.com>
Date: Wed Feb 16 14:19:24 2022 +0000

    Update kernel version to 5.16 in tst-mman-consts.py

    This patch updates the kernel version in the test tst-mman-consts.py
    to 5.16. (There are no new MAP_* constants covered by this test in
    5.16 that need any other header changes.)

    Tested with build-many-glibcs.py.

    (cherry picked from commit 790a607e234aa10d4b977a1b80aebe8a2acac970)

diff --git a/sysdeps/unix/sysv/linux/tst-mman-consts.py b/sysdeps/unix/sysv/linux/tst-mman-consts.py
index eeccdfd04dae57ab..8102d80b6660e523 100644
--- a/sysdeps/unix/sysv/linux/tst-mman-consts.py
+++ b/sysdeps/unix/sysv/linux/tst-mman-consts.py
@@ -33,7 +33,7 @@ def main():
help='C compiler (including options) to use')
args = parser.parse_args()
linux_version_headers = glibcsyscalls.linux_kernel_version(args.cc)
- linux_version_glibc = (5, 15)
+ linux_version_glibc = (5, 16)
sys.exit(glibcextract.compare_macro_consts(
'#define _GNU_SOURCE 1\n'
'#include <sys/mman.h>\n',

@@ -0,0 +1,27 @@
commit 0499c3a95fb864284fef36d3e9c5a54f6646b2db
Author: Joseph Myers <joseph@codesourcery.com>
Date: Thu Mar 24 15:35:27 2022 +0000

    Update kernel version to 5.17 in tst-mman-consts.py

    This patch updates the kernel version in the test tst-mman-consts.py
    to 5.17. (There are no new MAP_* constants covered by this test in
    5.17 that need any other header changes.)

    Tested with build-many-glibcs.py.

    (cherry picked from commit 23808a422e6036accaba7236fd3b9a0d7ab7e8ee)

diff --git a/sysdeps/unix/sysv/linux/tst-mman-consts.py b/sysdeps/unix/sysv/linux/tst-mman-consts.py
index 8102d80b6660e523..724c7375c3a1623b 100644
--- a/sysdeps/unix/sysv/linux/tst-mman-consts.py
+++ b/sysdeps/unix/sysv/linux/tst-mman-consts.py
@@ -33,7 +33,7 @@ def main():
help='C compiler (including options) to use')
args = parser.parse_args()
linux_version_headers = glibcsyscalls.linux_kernel_version(args.cc)
- linux_version_glibc = (5, 16)
+ linux_version_glibc = (5, 17)
sys.exit(glibcextract.compare_macro_consts(
'#define _GNU_SOURCE 1\n'
'#include <sys/mman.h>\n',

@@ -0,0 +1,26 @@
commit f858bc309315a03ff6b1a048f59405c159d23430
Author: Joseph Myers <joseph@codesourcery.com>
Date: Mon Feb 21 22:49:36 2022 +0000

    Add SOL_MPTCP, SOL_MCTP from Linux 5.16 to bits/socket.h

    Linux 5.16 adds constants SOL_MPTCP and SOL_MCTP to the getsockopt /
    setsockopt levels; add these constants to bits/socket.h.

    Tested for x86_64.

    (cherry picked from commit fdc1ae67fef27eea1445bab4bdfe2f0fb3bc7aa1)

diff --git a/sysdeps/unix/sysv/linux/bits/socket.h b/sysdeps/unix/sysv/linux/bits/socket.h
index 7bb9e863d7329da9..c81fab840918924e 100644
--- a/sysdeps/unix/sysv/linux/bits/socket.h
+++ b/sysdeps/unix/sysv/linux/bits/socket.h
@@ -169,6 +169,8 @@ typedef __socklen_t socklen_t;
#define SOL_KCM 281
#define SOL_TLS 282
#define SOL_XDP 283
+#define SOL_MPTCP 284
+#define SOL_MCTP 285
/* Maximum queue length specifiable by listen. */
#define SOMAXCONN 4096

@@ -0,0 +1,21 @@
commit c108e87026d61d6744e3e55704e0bea937243f5a
Author: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Tue Dec 14 11:15:07 2021 +0000

    aarch64: Add HWCAP2_ECV from Linux 5.16

    Indicates the availability of enhanced counter virtualization extension
    of armv8.6-a with self-synchronized virtual counter CNTVCTSS_EL0 usable
    in userspace.

    (cherry picked from commit 5a1be8ebdf6f02d4efec6e5f12ad06db17511f90)

diff --git a/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h b/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
index 30fda0a4a347695e..04cc762015a7230a 100644
--- a/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
+++ b/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
@@ -74,3 +74,4 @@
#define HWCAP2_RNG (1 << 16)
#define HWCAP2_BTI (1 << 17)
#define HWCAP2_MTE (1 << 18)
+#define HWCAP2_ECV (1 << 19)

@@ -0,0 +1,21 @@
commit 97cb8227b864b8ea0d99a4a50e4163baad3e1c72
Author: Joseph Myers <joseph@codesourcery.com>
Date: Mon Mar 28 13:16:48 2022 +0000

    Add HWCAP2_AFP, HWCAP2_RPRES from Linux 5.17 to AArch64 bits/hwcap.h

    Add the new HWCAP2_AFP and HWCAP2_RPRES constants from Linux 5.17.

    Tested with build-many-glibcs.py for aarch64-linux-gnu.

    (cherry picked from commit 866c599182e87f116440b5d854f9e99533c48eb3)

diff --git a/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h b/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
index 04cc762015a7230a..9a5c4116b3fe9903 100644
--- a/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
+++ b/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
@@ -75,3 +75,5 @@
#define HWCAP2_BTI (1 << 17)
#define HWCAP2_MTE (1 << 18)
#define HWCAP2_ECV (1 << 19)
+#define HWCAP2_AFP (1 << 20)
+#define HWCAP2_RPRES (1 << 21)
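A short sketch of how the three new AArch64 bits from the two hwcap patches might be tested at run time. The fallback definitions repeat the values added by those patches so the code also builds against older headers; `hwcap2_has` and `read_hwcap2` are hypothetical helper names, and the bit meanings only apply when the program runs on AArch64 Linux.

```c
#include <sys/auxv.h>

/* Fallbacks with the values added by the patches above.  */
#ifndef HWCAP2_ECV
# define HWCAP2_ECV (1 << 19)
#endif
#ifndef HWCAP2_AFP
# define HWCAP2_AFP (1 << 20)
#endif
#ifndef HWCAP2_RPRES
# define HWCAP2_RPRES (1 << 21)
#endif

/* Hypothetical helper: nonzero if BIT is set in the HWCAP2 word.  */
int
hwcap2_has (unsigned long hwcap2, unsigned long bit)
{
  return (hwcap2 & bit) != 0;
}

/* Reads the AT_HWCAP2 auxiliary vector entry; getauxval returns 0 if
   the entry is absent.  Interpreting the result with the macros above
   is only valid on AArch64.  */
unsigned long
read_hwcap2 (void)
{
  return getauxval (AT_HWCAP2);
}
```

A caller on AArch64 could then write `hwcap2_has (read_hwcap2 (), HWCAP2_ECV)` before relying on CNTVCTSS_EL0.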

@@ -0,0 +1,29 @@
commit 31af92b9c8cf753992d45c801a855a02060afc08
Author: Siddhesh Poyarekar <siddhesh@sourceware.org>
Date: Wed May 4 15:56:47 2022 +0530

    manual: Clarify that abbreviations of long options are allowed

    The man page and code comments clearly state that abbreviations of long
    option names are recognized correctly as long as they are unique.
    Document this fact in the glibc manual as well.

    Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
    Reviewed-by: Florian Weimer <fweimer@redhat.com>
    Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>

    (cherry picked from commit db1efe02c9f15affc3908d6ae73875b82898a489)

diff --git a/manual/getopt.texi b/manual/getopt.texi
index 5485fc46946631f7..b4c0b15ac2060560 100644
--- a/manual/getopt.texi
+++ b/manual/getopt.texi
@@ -250,7 +250,8 @@ option, and stores the option's argument (if it has one) in @code{optarg}.
When @code{getopt_long} encounters a long option, it takes actions based
on the @code{flag} and @code{val} fields of the definition of that
-option.
+option. The option name may be abbreviated as long as the abbreviation is
+unique.
If @code{flag} is a null pointer, then @code{getopt_long} returns the
contents of @code{val} to indicate which option it found. You should
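The abbreviation rule documented by this patch can be illustrated with a small sketch. The option table and the helper name `first_long_option` are made up for the example; only `getopt_long`, `struct option`, `optind`, and `opterr` come from the C library.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option table with two names sharing the prefix "ver".  */
static const struct option long_opts[] =
{
  { "verbose", no_argument, NULL, 'v' },
  { "version", no_argument, NULL, 'V' },
  { NULL, 0, NULL, 0 }
};

/* Parses the first option in ARGV and returns what getopt_long
   reports: 'v', 'V', '?' for an unknown or ambiguous name, or -1 when
   there is no option.  Resets optind so it can be called repeatedly
   and clears opterr to keep diagnostics off stderr.  */
int
first_long_option (int argc, char **argv)
{
  optind = 1;
  opterr = 0;
  return getopt_long (argc, argv, "", long_opts, NULL);
}
```

With this table, `--verb` and `--vers` each identify one option, while `--ver` matches both names and is therefore reported as an error.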

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -0,0 +1,29 @@
commit d299032743e05571ef326c838a5ecf6ef5b3e9c3
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Feb 4 11:09:10 2022 -0800

    x86-64: Fix strcmp-avx2.S

    Change "movl %edx, %rdx" to "movl %edx, %edx" in:

    commit b77b06e0e296f1a2276c27a67e1d44f2cfa38d45
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date: Mon Jan 10 15:35:38 2022 -0600

        x86: Optimize strcmp-avx2.S

    (cherry picked from commit c15efd011cea3d8f0494269eb539583215a1feed)

diff --git a/sysdeps/x86_64/multiarch/strcmp-avx2.S b/sysdeps/x86_64/multiarch/strcmp-avx2.S
index a0d1c65db11028bc..cdded412a70bad10 100644
--- a/sysdeps/x86_64/multiarch/strcmp-avx2.S
+++ b/sysdeps/x86_64/multiarch/strcmp-avx2.S
@@ -106,7 +106,7 @@ ENTRY(STRCMP)
# ifdef USE_AS_STRNCMP
# ifdef __ILP32__
/* Clear the upper 32 bits. */
- movl %edx, %rdx
+ movl %edx, %edx
# endif
cmp $1, %RDX_LP
/* Signed comparison intentional. We use this branch to also
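The corrected instruction relies on the x86-64 rule that writing a 32-bit register zero-extends into the full 64-bit register, so `movl %edx, %edx` clears bits 32 to 63 of `%rdx`. The C model below is only an illustration of that effect under the `__ILP32__` (x32) case, where the length argument is 32 bits wide; `clear_upper_32` is a hypothetical name.

```c
#include <stdint.h>

/* Models the effect of `movl %edx, %edx` on the full %rdx register:
   the low 32 bits are kept and the upper 32 bits become zero.  */
uint64_t
clear_upper_32 (uint64_t rdx)
{
  return (uint64_t) (uint32_t) rdx;
}
```

For instance, a register holding `0xdeadbeef00000005` would be reduced to 5, which is the length the caller actually passed.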

@@ -0,0 +1,29 @@
commit 53ddafe917a8af17b16beb794c29e5b09b86d534
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Feb 4 11:11:08 2022 -0800

    x86-64: Fix strcmp-evex.S

    Change "movl %edx, %rdx" to "movl %edx, %edx" in:

    commit 8418eb3ff4b781d31c4ed5dc6c0bd7356bc45db9
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date: Mon Jan 10 15:35:39 2022 -0600

        x86: Optimize strcmp-evex.S

    (cherry picked from commit 0e0199a9e02ebe42e2b36958964d63f03573c382)

diff --git a/sysdeps/x86_64/multiarch/strcmp-evex.S b/sysdeps/x86_64/multiarch/strcmp-evex.S
index 99d8409af27327ad..ed56af8ecdad48b2 100644
--- a/sysdeps/x86_64/multiarch/strcmp-evex.S
+++ b/sysdeps/x86_64/multiarch/strcmp-evex.S
@@ -116,7 +116,7 @@ ENTRY(STRCMP)
# ifdef USE_AS_STRNCMP
# ifdef __ILP32__
/* Clear the upper 32 bits. */
- movl %edx, %rdx
+ movl %edx, %edx
# endif
cmp $1, %RDX_LP
/* Signed comparison intentional. We use this branch to also

@@ -0,0 +1,451 @@
commit ea19c490a3f5628d55ded271cbb753e66b2f05e8
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Sun Feb 6 00:54:18 2022 -0600

    x86: Improve vec generation in memset-vec-unaligned-erms.S

    No bug.

    Split vec generation into multiple steps. This allows the
    broadcast in AVX2 to use 'xmm' registers for the L(less_vec)
    case. This saves an expensive lane-cross instruction and removes
    the need for 'vzeroupper'.

    For SSE2 replace 2x 'punpck' instructions with zero-idiom 'pxor' for
    byte broadcast.

    Results for memset-avx2 small (geomean of N = 20 benchset runs).

    size, New Time, Old Time, New / Old
    0, 4.100, 3.831, 0.934
    1, 5.074, 4.399, 0.867
    2, 4.433, 4.411, 0.995
    4, 4.487, 4.415, 0.984
    8, 4.454, 4.396, 0.987
    16, 4.502, 4.443, 0.987

    All relevant string/wcsmbs tests are passing.
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit b62ace2740a106222e124cc86956448fa07abf4d)

diff --git a/sysdeps/x86_64/memset.S b/sysdeps/x86_64/memset.S
index 0137eba4cdd9f830..34ee0bfdcb81fb39 100644
--- a/sysdeps/x86_64/memset.S
+++ b/sysdeps/x86_64/memset.S
@@ -28,17 +28,22 @@
#define VMOVU movups
#define VMOVA movaps
-#define MEMSET_VDUP_TO_VEC0_AND_SET_RETURN(d, r) \
+# define MEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
movd d, %xmm0; \
- movq r, %rax; \
- punpcklbw %xmm0, %xmm0; \
- punpcklwd %xmm0, %xmm0; \
- pshufd $0, %xmm0, %xmm0
+ pxor %xmm1, %xmm1; \
+ pshufb %xmm1, %xmm0; \
+ movq r, %rax
-#define WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN(d, r) \
+# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
movd d, %xmm0; \
- movq r, %rax; \
- pshufd $0, %xmm0, %xmm0
+ pshufd $0, %xmm0, %xmm0; \
+ movq r, %rax
+
+# define MEMSET_VDUP_TO_VEC0_HIGH()
+# define MEMSET_VDUP_TO_VEC0_LOW()
+
+# define WMEMSET_VDUP_TO_VEC0_HIGH()
+# define WMEMSET_VDUP_TO_VEC0_LOW()
#define SECTION(p) p
diff --git a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
index 1af668af0aeda59e..c0bf2875d03d51ab 100644
--- a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
@@ -10,15 +10,18 @@
# define VMOVU vmovdqu
# define VMOVA vmovdqa
-# define MEMSET_VDUP_TO_VEC0_AND_SET_RETURN(d, r) \
+# define MEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
vmovd d, %xmm0; \
- movq r, %rax; \
- vpbroadcastb %xmm0, %ymm0
+ movq r, %rax;
-# define WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN(d, r) \
- vmovd d, %xmm0; \
- movq r, %rax; \
- vpbroadcastd %xmm0, %ymm0
+# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
+ MEMSET_SET_VEC0_AND_SET_RETURN(d, r)
+
+# define MEMSET_VDUP_TO_VEC0_HIGH() vpbroadcastb %xmm0, %ymm0
+# define MEMSET_VDUP_TO_VEC0_LOW() vpbroadcastb %xmm0, %xmm0
+
+# define WMEMSET_VDUP_TO_VEC0_HIGH() vpbroadcastd %xmm0, %ymm0
+# define WMEMSET_VDUP_TO_VEC0_LOW() vpbroadcastd %xmm0, %xmm0
# ifndef SECTION
# define SECTION(p) p##.avx
@@ -30,5 +33,6 @@
# define WMEMSET_SYMBOL(p,s) p##_avx2_##s
# endif
+# define USE_XMM_LESS_VEC
# include "memset-vec-unaligned-erms.S"
#endif
diff --git a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
index f14d6f8493c21a36..5241216a77bf72b7 100644
--- a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
@@ -15,13 +15,19 @@
# define VZEROUPPER
-# define MEMSET_VDUP_TO_VEC0_AND_SET_RETURN(d, r) \
- movq r, %rax; \
- vpbroadcastb d, %VEC0
+# define MEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
+ vpbroadcastb d, %VEC0; \
+ movq r, %rax
-# define WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN(d, r) \
- movq r, %rax; \
- vpbroadcastd d, %VEC0
+# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
+ vpbroadcastd d, %VEC0; \
+ movq r, %rax
+
+# define MEMSET_VDUP_TO_VEC0_HIGH()
+# define MEMSET_VDUP_TO_VEC0_LOW()
+
+# define WMEMSET_VDUP_TO_VEC0_HIGH()
+# define WMEMSET_VDUP_TO_VEC0_LOW()
# define SECTION(p) p##.evex512
# define MEMSET_SYMBOL(p,s) p##_avx512_##s
diff --git a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
index 64b09e77cc20cc42..637002150659123c 100644
--- a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
@@ -15,13 +15,19 @@
# define VZEROUPPER
-# define MEMSET_VDUP_TO_VEC0_AND_SET_RETURN(d, r) \
- movq r, %rax; \
- vpbroadcastb d, %VEC0
+# define MEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
+ vpbroadcastb d, %VEC0; \
+ movq r, %rax
-# define WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN(d, r) \
- movq r, %rax; \
- vpbroadcastd d, %VEC0
+# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
+ vpbroadcastd d, %VEC0; \
+ movq r, %rax
+
+# define MEMSET_VDUP_TO_VEC0_HIGH()
+# define MEMSET_VDUP_TO_VEC0_LOW()
+
+# define WMEMSET_VDUP_TO_VEC0_HIGH()
+# define WMEMSET_VDUP_TO_VEC0_LOW()
# define SECTION(p) p##.evex
# define MEMSET_SYMBOL(p,s) p##_evex_##s
diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index e723413a664c088f..c8db87dcbf69f0d8 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -58,8 +58,10 @@
#ifndef MOVQ
# if VEC_SIZE > 16
# define MOVQ vmovq
+# define MOVD vmovd
# else
# define MOVQ movq
+# define MOVD movd
# endif
#endif
@@ -72,9 +74,17 @@
#if defined USE_WITH_EVEX || defined USE_WITH_AVX512
# define END_REG rcx
# define LOOP_REG rdi
+# define LESS_VEC_REG rax
#else
# define END_REG rdi
# define LOOP_REG rdx
+# define LESS_VEC_REG rdi
+#endif
+
+#ifdef USE_XMM_LESS_VEC
+# define XMM_SMALL 1
+#else
+# define XMM_SMALL 0
#endif
#define PAGE_SIZE 4096
@@ -110,8 +120,12 @@ END_CHK (WMEMSET_CHK_SYMBOL (__wmemset_chk, unaligned))
ENTRY (WMEMSET_SYMBOL (__wmemset, unaligned))
shl $2, %RDX_LP
- WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN (%esi, %rdi)
- jmp L(entry_from_bzero)
+ WMEMSET_SET_VEC0_AND_SET_RETURN (%esi, %rdi)
+ WMEMSET_VDUP_TO_VEC0_LOW()
+ cmpq $VEC_SIZE, %rdx
+ jb L(less_vec_no_vdup)
+ WMEMSET_VDUP_TO_VEC0_HIGH()
+ jmp L(entry_from_wmemset)
END (WMEMSET_SYMBOL (__wmemset, unaligned))
#endif
@@ -123,7 +137,7 @@ END_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned))
#endif
ENTRY (MEMSET_SYMBOL (__memset, unaligned))
- MEMSET_VDUP_TO_VEC0_AND_SET_RETURN (%esi, %rdi)
+ MEMSET_SET_VEC0_AND_SET_RETURN (%esi, %rdi)
# ifdef __ILP32__
/* Clear the upper 32 bits. */
mov %edx, %edx
@@ -131,6 +145,8 @@ ENTRY (MEMSET_SYMBOL (__memset, unaligned))
L(entry_from_bzero):
cmpq $VEC_SIZE, %rdx
jb L(less_vec)
+ MEMSET_VDUP_TO_VEC0_HIGH()
+L(entry_from_wmemset):
cmpq $(VEC_SIZE * 2), %rdx
ja L(more_2x_vec)
/* From VEC and to 2 * VEC. No branch when size == VEC_SIZE. */
@@ -179,27 +195,27 @@ END_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned_erms))
# endif
ENTRY_P2ALIGN (MEMSET_SYMBOL (__memset, unaligned_erms), 6)
- MEMSET_VDUP_TO_VEC0_AND_SET_RETURN (%esi, %rdi)
+ MEMSET_SET_VEC0_AND_SET_RETURN (%esi, %rdi)
# ifdef __ILP32__
/* Clear the upper 32 bits. */
mov %edx, %edx
# endif
cmp $VEC_SIZE, %RDX_LP
jb L(less_vec)
+ MEMSET_VDUP_TO_VEC0_HIGH ()
cmp $(VEC_SIZE * 2), %RDX_LP
ja L(stosb_more_2x_vec)
- /* From VEC and to 2 * VEC. No branch when size == VEC_SIZE.
- */
- VMOVU %VEC(0), (%rax)
- VMOVU %VEC(0), -VEC_SIZE(%rax, %rdx)
+ /* From VEC and to 2 * VEC. No branch when size == VEC_SIZE. */
+ VMOVU %VEC(0), (%rdi)
+ VMOVU %VEC(0), (VEC_SIZE * -1)(%rdi, %rdx)
VZEROUPPER_RETURN
#endif
- .p2align 4,, 10
+ .p2align 4,, 4
L(last_2x_vec):
#ifdef USE_LESS_VEC_MASK_STORE
- VMOVU %VEC(0), (VEC_SIZE * 2 + LOOP_4X_OFFSET)(%rcx)
- VMOVU %VEC(0), (VEC_SIZE * 3 + LOOP_4X_OFFSET)(%rcx)
+ VMOVU %VEC(0), (VEC_SIZE * -2)(%rdi, %rdx)
+ VMOVU %VEC(0), (VEC_SIZE * -1)(%rdi, %rdx)
#else
VMOVU %VEC(0), (VEC_SIZE * -2)(%rdi)
VMOVU %VEC(0), (VEC_SIZE * -1)(%rdi)
@@ -212,6 +228,7 @@ L(last_2x_vec):
#ifdef USE_LESS_VEC_MASK_STORE
.p2align 4,, 10
L(less_vec):
+L(less_vec_no_vdup):
/* Less than 1 VEC. */
# if VEC_SIZE != 16 && VEC_SIZE != 32 && VEC_SIZE != 64
# error Unsupported VEC_SIZE!
@@ -262,28 +279,18 @@ L(stosb_more_2x_vec):
/* Fallthrough goes to L(loop_4x_vec). Tests for memset (2x, 4x]
and (4x, 8x] jump to target. */
L(more_2x_vec):
-
- /* Two different methods of setting up pointers / compare. The
- two methods are based on the fact that EVEX/AVX512 mov
- instructions take more bytes then AVX2/SSE2 mov instructions. As
- well that EVEX/AVX512 machines also have fast LEA_BID. Both
- setup and END_REG to avoid complex address mode. For EVEX/AVX512
- this saves code size and keeps a few targets in one fetch block.
- For AVX2/SSE2 this helps prevent AGU bottlenecks. */
-#if defined USE_WITH_EVEX || defined USE_WITH_AVX512
- /* If EVEX/AVX512 compute END_REG - (VEC_SIZE * 4 +
- LOOP_4X_OFFSET) with LEA_BID. */
-
- /* END_REG is rcx for EVEX/AVX512. */
- leaq -(VEC_SIZE * 4 + LOOP_4X_OFFSET)(%rdi, %rdx), %END_REG
-#endif
-
- /* Stores to first 2x VEC before cmp as any path forward will
- require it. */
- VMOVU %VEC(0), (%rax)
- VMOVU %VEC(0), VEC_SIZE(%rax)
+ /* Store next 2x vec regardless. */
+ VMOVU %VEC(0), (%rdi)
+ VMOVU %VEC(0), (VEC_SIZE * 1)(%rdi)
+ /* Two different methods of setting up pointers / compare. The two
+ methods are based on the fact that EVEX/AVX512 mov instructions take
+ more bytes then AVX2/SSE2 mov instructions. As well that EVEX/AVX512
+ machines also have fast LEA_BID. Both setup and END_REG to avoid complex
+ address mode. For EVEX/AVX512 this saves code size and keeps a few
+ targets in one fetch block. For AVX2/SSE2 this helps prevent AGU
+ bottlenecks. */
#if !(defined USE_WITH_EVEX || defined USE_WITH_AVX512)
/* If AVX2/SSE2 compute END_REG (rdi) with ALU. */
addq %rdx, %END_REG
@@ -292,6 +299,15 @@ L(more_2x_vec):
cmpq $(VEC_SIZE * 4), %rdx
jbe L(last_2x_vec)
+
+#if defined USE_WITH_EVEX || defined USE_WITH_AVX512
+ /* If EVEX/AVX512 compute END_REG - (VEC_SIZE * 4 + LOOP_4X_OFFSET) with
+ LEA_BID. */
+
+ /* END_REG is rcx for EVEX/AVX512. */
+ leaq -(VEC_SIZE * 4 + LOOP_4X_OFFSET)(%rdi, %rdx), %END_REG
+#endif
+
/* Store next 2x vec regardless. */
VMOVU %VEC(0), (VEC_SIZE * 2)(%rax)
VMOVU %VEC(0), (VEC_SIZE * 3)(%rax)
@@ -355,65 +371,93 @@ L(stosb_local):
/* Define L(less_vec) only if not otherwise defined. */
.p2align 4
L(less_vec):
+ /* Broadcast esi to partial register (i.e VEC_SIZE == 32 broadcast to
+ xmm). This is only does anything for AVX2. */
+ MEMSET_VDUP_TO_VEC0_LOW ()
+L(less_vec_no_vdup):
#endif
L(cross_page):
#if VEC_SIZE > 32
cmpl $32, %edx
- jae L(between_32_63)
+ jge L(between_32_63)
#endif
#if VEC_SIZE > 16
cmpl $16, %edx
- jae L(between_16_31)
+ jge L(between_16_31)
+#endif
+#ifndef USE_XMM_LESS_VEC
+ MOVQ %XMM0, %rcx
#endif
- MOVQ %XMM0, %rdi
cmpl $8, %edx
- jae L(between_8_15)
+ jge L(between_8_15)
cmpl $4, %edx
- jae L(between_4_7)
+ jge L(between_4_7)
cmpl $1, %edx
- ja L(between_2_3)
- jb L(return)
- movb %sil, (%rax)
- VZEROUPPER_RETURN
+ jg L(between_2_3)
+ jl L(between_0_0)
+ movb %sil, (%LESS_VEC_REG)
+L(between_0_0):
+ ret
- /* Align small targets only if not doing so would cross a fetch
- line. */
+ /* Align small targets only if not doing so would cross a fetch line.
+ */
#if VEC_SIZE > 32
.p2align 4,, SMALL_MEMSET_ALIGN(MOV_SIZE, RET_SIZE)
/* From 32 to 63. No branch when size == 32. */
L(between_32_63):
- VMOVU %YMM0, (%rax)
- VMOVU %YMM0, -32(%rax, %rdx)
+ VMOVU %YMM0, (%LESS_VEC_REG)
+ VMOVU %YMM0, -32(%LESS_VEC_REG, %rdx)
VZEROUPPER_RETURN
#endif
#if VEC_SIZE >= 32
- .p2align 4,, SMALL_MEMSET_ALIGN(MOV_SIZE, RET_SIZE)
+ .p2align 4,, SMALL_MEMSET_ALIGN(MOV_SIZE, 1)
L(between_16_31):
/* From 16 to 31. No branch when size == 16. */
- VMOVU %XMM0, (%rax)
- VMOVU %XMM0, -16(%rax, %rdx)
- VZEROUPPER_RETURN
+ VMOVU %XMM0, (%LESS_VEC_REG)
+ VMOVU %XMM0, -16(%LESS_VEC_REG, %rdx)
+ ret
#endif
- .p2align 4,, SMALL_MEMSET_ALIGN(3, RET_SIZE)
+ /* Move size is 3 for SSE2, EVEX, and AVX512. Move size is 4 for AVX2.
+ */
+ .p2align 4,, SMALL_MEMSET_ALIGN(3 + XMM_SMALL, 1)
L(between_8_15):
/* From 8 to 15. No branch when size == 8. */
- movq %rdi, (%rax)
- movq %rdi, -8(%rax, %rdx)
- VZEROUPPER_RETURN
+#ifdef USE_XMM_LESS_VEC
+ MOVQ %XMM0, (%rdi)
+ MOVQ %XMM0, -8(%rdi, %rdx)
+#else
+ movq %rcx, (%LESS_VEC_REG)
+ movq %rcx, -8(%LESS_VEC_REG, %rdx)
+#endif
+ ret
- .p2align 4,, SMALL_MEMSET_ALIGN(2, RET_SIZE)
+ /* Move size is 2 for SSE2, EVEX, and AVX512. Move size is 4 for AVX2.
+ */
+ .p2align 4,, SMALL_MEMSET_ALIGN(2 << XMM_SMALL, 1)
L(between_4_7):
/* From 4 to 7. No branch when size == 4. */
- movl %edi, (%rax)
- movl %edi, -4(%rax, %rdx)
- VZEROUPPER_RETURN
+#ifdef USE_XMM_LESS_VEC
+ MOVD %XMM0, (%rdi)
+ MOVD %XMM0, -4(%rdi, %rdx)
+#else
+ movl %ecx, (%LESS_VEC_REG)
+ movl %ecx, -4(%LESS_VEC_REG, %rdx)
+#endif
+ ret
- .p2align 4,, SMALL_MEMSET_ALIGN(3, RET_SIZE)
+ /* 4 * XMM_SMALL for the third mov for AVX2. */
+ .p2align 4,, 4 * XMM_SMALL + SMALL_MEMSET_ALIGN(3, 1)
L(between_2_3):
/* From 2 to 3. No branch when size == 2. */
- movw %di, (%rax)
- movb %dil, -1(%rax, %rdx)
- VZEROUPPER_RETURN
+#ifdef USE_XMM_LESS_VEC
+ movb %sil, (%rdi)
+ movb %sil, 1(%rdi)
+ movb %sil, -1(%rdi, %rdx)
+#else
+ movw %cx, (%LESS_VEC_REG)
+ movb %sil, -1(%LESS_VEC_REG, %rdx)
+#endif
+ ret
END (MEMSET_SYMBOL (__memset, unaligned_erms))


@ -0,0 +1,35 @@
commit 190ea5f7e4e7e98b9b6e3f29835ae8b1f6a5442e
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Mon Feb 7 00:32:23 2022 -0600
x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)
commit b62ace2740a106222e124cc86956448fa07abf4d
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Sun Feb 6 00:54:18 2022 -0600
x86: Improve vec generation in memset-vec-unaligned-erms.S
Revert usage of 'pshufb' in broadcast logic as it is an SSSE3
instruction and memset.S is restricted to only SSE2 instructions.
(cherry picked from commit 1b0c60f95bbe2eded80b2bb5be75c0e45b11cde1)
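
For readers unfamiliar with the replacement sequence, the following is a minimal sketch (not part of the upstream patch) of the same SSE2-only byte broadcast written with compiler intrinsics. The helper names `broadcast_byte_sse2` and `broadcast_matches` are hypothetical and exist only for illustration; the intrinsics are assumed to map onto the `movd` / `punpcklbw` / `punpcklwd` / `pshufd` instructions named in the diff below.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics only; no SSSE3 header needed.  */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: broadcast the low byte of C to all 16 byte lanes
   of an XMM register using only SSE2 operations.  */
static __m128i
broadcast_byte_sse2 (int c)
{
  __m128i v = _mm_cvtsi32_si128 (c);   /* movd: C lands in dword 0.  */
  v = _mm_unpacklo_epi8 (v, v);        /* punpcklbw: byte 0 fills bytes 0-1.  */
  v = _mm_unpacklo_epi16 (v, v);       /* punpcklwd: word 0 fills words 0-1.  */
  return _mm_shuffle_epi32 (v, 0);     /* pshufd $0: dword 0 fills all dwords.  */
}

/* Hypothetical check: return 1 if every byte of the broadcast result
   equals the low byte of C, as memset would write it.  */
static int
broadcast_matches (int c)
{
  uint8_t out[16];
  uint8_t expect[16];

  _mm_storeu_si128 ((__m128i *) out, broadcast_byte_sse2 (c));
  memset (expect, c, sizeof expect);
  return memcmp (out, expect, sizeof out) == 0;
}
```

Only the low byte of the input matters: after the two unpack steps the low dword holds four copies of that byte, so any higher input bits are discarded by the final shuffle.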
diff --git a/sysdeps/x86_64/memset.S b/sysdeps/x86_64/memset.S
index 34ee0bfdcb81fb39..954471e5a5bf225b 100644
--- a/sysdeps/x86_64/memset.S
+++ b/sysdeps/x86_64/memset.S
@@ -30,9 +30,10 @@
# define MEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
movd d, %xmm0; \
- pxor %xmm1, %xmm1; \
- pshufb %xmm1, %xmm0; \
- movq r, %rax
+ movq r, %rax; \
+ punpcklbw %xmm0, %xmm0; \
+ punpcklwd %xmm0, %xmm0; \
+ pshufd $0, %xmm0, %xmm0
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
movd d, %xmm0; \


@ -0,0 +1,719 @@
commit 5cb6329652696e79d6d576165ea87e332c9de106
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Feb 7 05:55:15 2022 -0800
x86-64: Optimize bzero
memset with zero as the value to set is by far the majority value (99%+
for Python3 and GCC).
bzero can be slightly more optimized for this case by using a zero-idiom
xor for broadcasting the set value to a register (vector or GPR).
Co-developed-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit 3d9f171bfb5325bd5f427e9fc386453358c6e840)
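
As a reminder of the semantics being optimized, here is a small sketch (not taken from the patch) of bzero expressed in terms of memset. The names `sketch_bzero` and `bzero_demo_ok` are hypothetical. The point is that the fill value is known to be zero at entry, which is what allows the assembly to materialize it with a zero-idiom xor instead of a broadcast, and that the length arrives as the second argument rather than the third.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference version: bzero (s, n) behaves like
   memset (s, 0, n), with the length as the second argument.  */
static void
sketch_bzero (void *s, size_t n)
{
  memset (s, 0, n);
}

/* Hypothetical check: zero the middle of a buffer and confirm that
   only the requested bytes changed.  Returns 1 on success.  */
static int
bzero_demo_ok (void)
{
  unsigned char buf[8];

  memset (buf, 0xaa, sizeof buf);
  sketch_bzero (buf + 2, 4);

  for (size_t i = 0; i < sizeof buf; i++)
    {
      unsigned char want = (i >= 2 && i < 6) ? 0 : 0xaa;
      if (buf[i] != want)
        return 0;
    }
  return 1;
}
```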
diff --git a/sysdeps/x86_64/memset.S b/sysdeps/x86_64/memset.S
index 954471e5a5bf225b..0358210c7ff3a976 100644
--- a/sysdeps/x86_64/memset.S
+++ b/sysdeps/x86_64/memset.S
@@ -35,6 +35,9 @@
punpcklwd %xmm0, %xmm0; \
pshufd $0, %xmm0, %xmm0
+# define BZERO_ZERO_VEC0() \
+ pxor %xmm0, %xmm0
+
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
movd d, %xmm0; \
pshufd $0, %xmm0, %xmm0; \
@@ -53,6 +56,10 @@
# define MEMSET_SYMBOL(p,s) memset
#endif
+#ifndef BZERO_SYMBOL
+# define BZERO_SYMBOL(p,s) __bzero
+#endif
+
#ifndef WMEMSET_SYMBOL
# define WMEMSET_CHK_SYMBOL(p,s) p
# define WMEMSET_SYMBOL(p,s) __wmemset
@@ -63,6 +70,7 @@
libc_hidden_builtin_def (memset)
#if IS_IN (libc)
+weak_alias (__bzero, bzero)
libc_hidden_def (__wmemset)
weak_alias (__wmemset, wmemset)
libc_hidden_weak (wmemset)
diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index 26be40959ce62895..37d8d6f0bd2d10cc 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -1,85 +1,130 @@
ifeq ($(subdir),string)
-sysdep_routines += strncat-c stpncpy-c strncpy-c \
- strcmp-sse2 strcmp-sse2-unaligned strcmp-ssse3 \
- strcmp-sse4_2 strcmp-avx2 \
- strncmp-sse2 strncmp-ssse3 strncmp-sse4_2 strncmp-avx2 \
- memchr-sse2 rawmemchr-sse2 memchr-avx2 rawmemchr-avx2 \
- memrchr-sse2 memrchr-avx2 \
- memcmp-sse2 \
- memcmp-avx2-movbe \
- memcmp-sse4 memcpy-ssse3 \
- memmove-ssse3 \
- memcpy-ssse3-back \
- memmove-ssse3-back \
- memmove-avx512-no-vzeroupper \
- strcasecmp_l-sse2 strcasecmp_l-ssse3 \
- strcasecmp_l-sse4_2 strcasecmp_l-avx \
- strncase_l-sse2 strncase_l-ssse3 \
- strncase_l-sse4_2 strncase_l-avx \
- strchr-sse2 strchrnul-sse2 strchr-avx2 strchrnul-avx2 \
- strrchr-sse2 strrchr-avx2 \
- strlen-sse2 strnlen-sse2 strlen-avx2 strnlen-avx2 \
- strcat-avx2 strncat-avx2 \
- strcat-ssse3 strncat-ssse3\
- strcpy-avx2 strncpy-avx2 \
- strcpy-sse2 stpcpy-sse2 \
- strcpy-ssse3 strncpy-ssse3 stpcpy-ssse3 stpncpy-ssse3 \
- strcpy-sse2-unaligned strncpy-sse2-unaligned \
- stpcpy-sse2-unaligned stpncpy-sse2-unaligned \
- stpcpy-avx2 stpncpy-avx2 \
- strcat-sse2 \
- strcat-sse2-unaligned strncat-sse2-unaligned \
- strchr-sse2-no-bsf memcmp-ssse3 strstr-sse2-unaligned \
- strcspn-sse2 strpbrk-sse2 strspn-sse2 \
- strcspn-c strpbrk-c strspn-c varshift \
- memset-avx512-no-vzeroupper \
- memmove-sse2-unaligned-erms \
- memmove-avx-unaligned-erms \
- memmove-avx512-unaligned-erms \
- memset-sse2-unaligned-erms \
- memset-avx2-unaligned-erms \
- memset-avx512-unaligned-erms \
- memchr-avx2-rtm \
- memcmp-avx2-movbe-rtm \
- memmove-avx-unaligned-erms-rtm \
- memrchr-avx2-rtm \
- memset-avx2-unaligned-erms-rtm \
- rawmemchr-avx2-rtm \
- strchr-avx2-rtm \
- strcmp-avx2-rtm \
- strchrnul-avx2-rtm \
- stpcpy-avx2-rtm \
- stpncpy-avx2-rtm \
- strcat-avx2-rtm \
- strcpy-avx2-rtm \
- strlen-avx2-rtm \
- strncat-avx2-rtm \
- strncmp-avx2-rtm \
- strncpy-avx2-rtm \
- strnlen-avx2-rtm \
- strrchr-avx2-rtm \
- memchr-evex \
- memcmp-evex-movbe \
- memmove-evex-unaligned-erms \
- memrchr-evex \
- memset-evex-unaligned-erms \
- rawmemchr-evex \
- stpcpy-evex \
- stpncpy-evex \
- strcat-evex \
- strchr-evex \
- strchrnul-evex \
- strcmp-evex \
- strcpy-evex \
- strlen-evex \
- strncat-evex \
- strncmp-evex \
- strncpy-evex \
- strnlen-evex \
- strrchr-evex \
- memchr-evex-rtm \
- rawmemchr-evex-rtm
+sysdep_routines += \
+ bzero \
+ memchr-avx2 \
+ memchr-avx2-rtm \
+ memchr-evex \
+ memchr-evex-rtm \
+ memchr-sse2 \
+ memcmp-avx2-movbe \
+ memcmp-avx2-movbe-rtm \
+ memcmp-evex-movbe \
+ memcmp-sse2 \
+ memcmp-sse4 \
+ memcmp-ssse3 \
+ memcpy-ssse3 \
+ memcpy-ssse3-back \
+ memmove-avx-unaligned-erms \
+ memmove-avx-unaligned-erms-rtm \
+ memmove-avx512-no-vzeroupper \
+ memmove-avx512-unaligned-erms \
+ memmove-evex-unaligned-erms \
+ memmove-sse2-unaligned-erms \
+ memmove-ssse3 \
+ memmove-ssse3-back \
+ memrchr-avx2 \
+ memrchr-avx2-rtm \
+ memrchr-evex \
+ memrchr-sse2 \
+ memset-avx2-unaligned-erms \
+ memset-avx2-unaligned-erms-rtm \
+ memset-avx512-no-vzeroupper \
+ memset-avx512-unaligned-erms \
+ memset-evex-unaligned-erms \
+ memset-sse2-unaligned-erms \
+ rawmemchr-avx2 \
+ rawmemchr-avx2-rtm \
+ rawmemchr-evex \
+ rawmemchr-evex-rtm \
+ rawmemchr-sse2 \
+ stpcpy-avx2 \
+ stpcpy-avx2-rtm \
+ stpcpy-evex \
+ stpcpy-sse2 \
+ stpcpy-sse2-unaligned \
+ stpcpy-ssse3 \
+ stpncpy-avx2 \
+ stpncpy-avx2-rtm \
+ stpncpy-c \
+ stpncpy-evex \
+ stpncpy-sse2-unaligned \
+ stpncpy-ssse3 \
+ strcasecmp_l-avx \
+ strcasecmp_l-sse2 \
+ strcasecmp_l-sse4_2 \
+ strcasecmp_l-ssse3 \
+ strcat-avx2 \
+ strcat-avx2-rtm \
+ strcat-evex \
+ strcat-sse2 \
+ strcat-sse2-unaligned \
+ strcat-ssse3 \
+ strchr-avx2 \
+ strchr-avx2-rtm \
+ strchr-evex \
+ strchr-sse2 \
+ strchr-sse2-no-bsf \
+ strchrnul-avx2 \
+ strchrnul-avx2-rtm \
+ strchrnul-evex \
+ strchrnul-sse2 \
+ strcmp-avx2 \
+ strcmp-avx2-rtm \
+ strcmp-evex \
+ strcmp-sse2 \
+ strcmp-sse2-unaligned \
+ strcmp-sse4_2 \
+ strcmp-ssse3 \
+ strcpy-avx2 \
+ strcpy-avx2-rtm \
+ strcpy-evex \
+ strcpy-sse2 \
+ strcpy-sse2-unaligned \
+ strcpy-ssse3 \
+ strcspn-c \
+ strcspn-sse2 \
+ strlen-avx2 \
+ strlen-avx2-rtm \
+ strlen-evex \
+ strlen-sse2 \
+ strncase_l-avx \
+ strncase_l-sse2 \
+ strncase_l-sse4_2 \
+ strncase_l-ssse3 \
+ strncat-avx2 \
+ strncat-avx2-rtm \
+ strncat-c \
+ strncat-evex \
+ strncat-sse2-unaligned \
+ strncat-ssse3 \
+ strncmp-avx2 \
+ strncmp-avx2-rtm \
+ strncmp-evex \
+ strncmp-sse2 \
+ strncmp-sse4_2 \
+ strncmp-ssse3 \
+ strncpy-avx2 \
+ strncpy-avx2-rtm \
+ strncpy-c \
+ strncpy-evex \
+ strncpy-sse2-unaligned \
+ strncpy-ssse3 \
+ strnlen-avx2 \
+ strnlen-avx2-rtm \
+ strnlen-evex \
+ strnlen-sse2 \
+ strpbrk-c \
+ strpbrk-sse2 \
+ strrchr-avx2 \
+ strrchr-avx2-rtm \
+ strrchr-evex \
+ strrchr-sse2 \
+ strspn-c \
+ strspn-sse2 \
+ strstr-sse2-unaligned \
+ varshift \
+# sysdep_routines
CFLAGS-varshift.c += -msse4
CFLAGS-strcspn-c.c += -msse4
CFLAGS-strpbrk-c.c += -msse4
diff --git a/sysdeps/x86_64/multiarch/bzero.c b/sysdeps/x86_64/multiarch/bzero.c
new file mode 100644
index 0000000000000000..13e399a9a1fbdeb2
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/bzero.c
@@ -0,0 +1,108 @@
+/* Multiple versions of bzero.
+ All versions must be listed in ifunc-impl-list.c.
+ Copyright (C) 2022 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* Define multiple versions only for the definition in libc. */
+#if IS_IN (libc)
+# define __bzero __redirect___bzero
+# include <string.h>
+# undef __bzero
+
+/* OPTIMIZE1 definition required for bzero patch. */
+# define OPTIMIZE1(name) EVALUATOR1 (SYMBOL_NAME, name)
+# define SYMBOL_NAME __bzero
+# include <init-arch.h>
+
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (sse2_unaligned)
+ attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (sse2_unaligned_erms)
+ attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned) attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_erms)
+ attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_rtm)
+ attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_erms_rtm)
+ attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (evex_unaligned)
+ attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (evex_unaligned_erms)
+ attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx512_unaligned)
+ attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx512_unaligned_erms)
+ attribute_hidden;
+
+static inline void *
+IFUNC_SELECTOR (void)
+{
+ const struct cpu_features* cpu_features = __get_cpu_features ();
+
+ if (CPU_FEATURE_USABLE_P (cpu_features, AVX512F)
+ && !CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_AVX512))
+ {
+ if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
+ && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
+ && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
+ {
+ if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
+ return OPTIMIZE1 (avx512_unaligned_erms);
+
+ return OPTIMIZE1 (avx512_unaligned);
+ }
+ }
+
+ if (CPU_FEATURE_USABLE_P (cpu_features, AVX2))
+ {
+ if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
+ && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
+ && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
+ {
+ if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
+ return OPTIMIZE1 (evex_unaligned_erms);
+
+ return OPTIMIZE1 (evex_unaligned);
+ }
+
+ if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
+ {
+ if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
+ return OPTIMIZE1 (avx2_unaligned_erms_rtm);
+
+ return OPTIMIZE1 (avx2_unaligned_rtm);
+ }
+
+ if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_VZEROUPPER))
+ {
+ if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
+ return OPTIMIZE1 (avx2_unaligned_erms);
+
+ return OPTIMIZE1 (avx2_unaligned);
+ }
+ }
+
+ if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
+ return OPTIMIZE1 (sse2_unaligned_erms);
+
+ return OPTIMIZE1 (sse2_unaligned);
+}
+
+libc_ifunc_redirected (__redirect___bzero, __bzero, IFUNC_SELECTOR ());
+
+weak_alias (__bzero, bzero)
+#endif
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index 39ab10613bb0ffea..4992d7bd3206a7c0 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -282,6 +282,48 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__memset_avx512_no_vzeroupper)
)
+ /* Support sysdeps/x86_64/multiarch/bzero.c. */
+ IFUNC_IMPL (i, name, bzero,
+ IFUNC_IMPL_ADD (array, i, bzero, 1,
+ __bzero_sse2_unaligned)
+ IFUNC_IMPL_ADD (array, i, bzero, 1,
+ __bzero_sse2_unaligned_erms)
+ IFUNC_IMPL_ADD (array, i, bzero,
+ CPU_FEATURE_USABLE (AVX2),
+ __bzero_avx2_unaligned)
+ IFUNC_IMPL_ADD (array, i, bzero,
+ CPU_FEATURE_USABLE (AVX2),
+ __bzero_avx2_unaligned_erms)
+ IFUNC_IMPL_ADD (array, i, bzero,
+ (CPU_FEATURE_USABLE (AVX2)
+ && CPU_FEATURE_USABLE (RTM)),
+ __bzero_avx2_unaligned_rtm)
+ IFUNC_IMPL_ADD (array, i, bzero,
+ (CPU_FEATURE_USABLE (AVX2)
+ && CPU_FEATURE_USABLE (RTM)),
+ __bzero_avx2_unaligned_erms_rtm)
+ IFUNC_IMPL_ADD (array, i, bzero,
+ (CPU_FEATURE_USABLE (AVX512VL)
+ && CPU_FEATURE_USABLE (AVX512BW)
+ && CPU_FEATURE_USABLE (BMI2)),
+ __bzero_evex_unaligned)
+ IFUNC_IMPL_ADD (array, i, bzero,
+ (CPU_FEATURE_USABLE (AVX512VL)
+ && CPU_FEATURE_USABLE (AVX512BW)
+ && CPU_FEATURE_USABLE (BMI2)),
+ __bzero_evex_unaligned_erms)
+ IFUNC_IMPL_ADD (array, i, bzero,
+ (CPU_FEATURE_USABLE (AVX512VL)
+ && CPU_FEATURE_USABLE (AVX512BW)
+ && CPU_FEATURE_USABLE (BMI2)),
+ __bzero_avx512_unaligned_erms)
+ IFUNC_IMPL_ADD (array, i, bzero,
+ (CPU_FEATURE_USABLE (AVX512VL)
+ && CPU_FEATURE_USABLE (AVX512BW)
+ && CPU_FEATURE_USABLE (BMI2)),
+ __bzero_avx512_unaligned)
+ )
+
/* Support sysdeps/x86_64/multiarch/rawmemchr.c. */
IFUNC_IMPL (i, name, rawmemchr,
IFUNC_IMPL_ADD (array, i, rawmemchr,
diff --git a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
index 8ac3e479bba488be..5a5ee6f67299400b 100644
--- a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
+++ b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
@@ -5,6 +5,7 @@
#define SECTION(p) p##.avx.rtm
#define MEMSET_SYMBOL(p,s) p##_avx2_##s##_rtm
+#define BZERO_SYMBOL(p,s) p##_avx2_##s##_rtm
#define WMEMSET_SYMBOL(p,s) p##_avx2_##s##_rtm
#include "memset-avx2-unaligned-erms.S"
diff --git a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
index c0bf2875d03d51ab..a093a2831f3dfa0d 100644
--- a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
@@ -14,6 +14,9 @@
vmovd d, %xmm0; \
movq r, %rax;
+# define BZERO_ZERO_VEC0() \
+ vpxor %xmm0, %xmm0, %xmm0
+
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
MEMSET_SET_VEC0_AND_SET_RETURN(d, r)
@@ -29,6 +32,9 @@
# ifndef MEMSET_SYMBOL
# define MEMSET_SYMBOL(p,s) p##_avx2_##s
# endif
+# ifndef BZERO_SYMBOL
+# define BZERO_SYMBOL(p,s) p##_avx2_##s
+# endif
# ifndef WMEMSET_SYMBOL
# define WMEMSET_SYMBOL(p,s) p##_avx2_##s
# endif
diff --git a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
index 5241216a77bf72b7..727c92133a15900f 100644
--- a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
@@ -19,6 +19,9 @@
vpbroadcastb d, %VEC0; \
movq r, %rax
+# define BZERO_ZERO_VEC0() \
+ vpxorq %XMM0, %XMM0, %XMM0
+
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
vpbroadcastd d, %VEC0; \
movq r, %rax
diff --git a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
index 637002150659123c..5d8fa78f05476b10 100644
--- a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
@@ -19,6 +19,9 @@
vpbroadcastb d, %VEC0; \
movq r, %rax
+# define BZERO_ZERO_VEC0() \
+ vpxorq %XMM0, %XMM0, %XMM0
+
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
vpbroadcastd d, %VEC0; \
movq r, %rax
diff --git a/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
index e4e95fc19fe48d2d..bac74ac37fd3c144 100644
--- a/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
@@ -22,6 +22,7 @@
#if IS_IN (libc)
# define MEMSET_SYMBOL(p,s) p##_sse2_##s
+# define BZERO_SYMBOL(p,s) MEMSET_SYMBOL (p, s)
# define WMEMSET_SYMBOL(p,s) p##_sse2_##s
# ifdef SHARED
diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index c8db87dcbf69f0d8..39a096a594ccb5b6 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -26,6 +26,10 @@
#include <sysdep.h>
+#ifndef BZERO_SYMBOL
+# define BZERO_SYMBOL(p,s) MEMSET_SYMBOL (p, s)
+#endif
+
#ifndef MEMSET_CHK_SYMBOL
# define MEMSET_CHK_SYMBOL(p,s) MEMSET_SYMBOL(p, s)
#endif
@@ -87,6 +91,18 @@
# define XMM_SMALL 0
#endif
+#ifdef USE_LESS_VEC_MASK_STORE
+# define SET_REG64 rcx
+# define SET_REG32 ecx
+# define SET_REG16 cx
+# define SET_REG8 cl
+#else
+# define SET_REG64 rsi
+# define SET_REG32 esi
+# define SET_REG16 si
+# define SET_REG8 sil
+#endif
+
#define PAGE_SIZE 4096
/* Macro to calculate size of small memset block for aligning
@@ -96,18 +112,6 @@
#ifndef SECTION
# error SECTION is not defined!
-#endif
-
- .section SECTION(.text),"ax",@progbits
-#if VEC_SIZE == 16 && IS_IN (libc)
-ENTRY (__bzero)
- mov %RDI_LP, %RAX_LP /* Set return value. */
- mov %RSI_LP, %RDX_LP /* Set n. */
- xorl %esi, %esi
- pxor %XMM0, %XMM0
- jmp L(entry_from_bzero)
-END (__bzero)
-weak_alias (__bzero, bzero)
#endif
#if IS_IN (libc)
@@ -123,12 +127,37 @@ ENTRY (WMEMSET_SYMBOL (__wmemset, unaligned))
WMEMSET_SET_VEC0_AND_SET_RETURN (%esi, %rdi)
WMEMSET_VDUP_TO_VEC0_LOW()
cmpq $VEC_SIZE, %rdx
- jb L(less_vec_no_vdup)
+ jb L(less_vec_from_wmemset)
WMEMSET_VDUP_TO_VEC0_HIGH()
jmp L(entry_from_wmemset)
END (WMEMSET_SYMBOL (__wmemset, unaligned))
#endif
+ENTRY (BZERO_SYMBOL(__bzero, unaligned))
+#if VEC_SIZE > 16
+ BZERO_ZERO_VEC0 ()
+#endif
+ mov %RDI_LP, %RAX_LP
+ mov %RSI_LP, %RDX_LP
+#ifndef USE_LESS_VEC_MASK_STORE
+ xorl %esi, %esi
+#endif
+ cmp $VEC_SIZE, %RDX_LP
+ jb L(less_vec_no_vdup)
+#ifdef USE_LESS_VEC_MASK_STORE
+ xorl %esi, %esi
+#endif
+#if VEC_SIZE <= 16
+ BZERO_ZERO_VEC0 ()
+#endif
+ cmp $(VEC_SIZE * 2), %RDX_LP
+ ja L(more_2x_vec)
+ /* From VEC and to 2 * VEC. No branch when size == VEC_SIZE. */
+ VMOVU %VEC(0), (%rdi)
+ VMOVU %VEC(0), (VEC_SIZE * -1)(%rdi, %rdx)
+ VZEROUPPER_RETURN
+END (BZERO_SYMBOL(__bzero, unaligned))
+
#if defined SHARED && IS_IN (libc)
ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned))
cmp %RDX_LP, %RCX_LP
@@ -142,7 +171,6 @@ ENTRY (MEMSET_SYMBOL (__memset, unaligned))
/* Clear the upper 32 bits. */
mov %edx, %edx
# endif
-L(entry_from_bzero):
cmpq $VEC_SIZE, %rdx
jb L(less_vec)
MEMSET_VDUP_TO_VEC0_HIGH()
@@ -187,6 +215,31 @@ END (__memset_erms)
END (MEMSET_SYMBOL (__memset, erms))
# endif
+ENTRY_P2ALIGN (BZERO_SYMBOL(__bzero, unaligned_erms), 6)
+# if VEC_SIZE > 16
+ BZERO_ZERO_VEC0 ()
+# endif
+ mov %RDI_LP, %RAX_LP
+ mov %RSI_LP, %RDX_LP
+# ifndef USE_LESS_VEC_MASK_STORE
+ xorl %esi, %esi
+# endif
+ cmp $VEC_SIZE, %RDX_LP
+ jb L(less_vec_no_vdup)
+# ifdef USE_LESS_VEC_MASK_STORE
+ xorl %esi, %esi
+# endif
+# if VEC_SIZE <= 16
+ BZERO_ZERO_VEC0 ()
+# endif
+ cmp $(VEC_SIZE * 2), %RDX_LP
+ ja L(stosb_more_2x_vec)
+ /* From VEC and to 2 * VEC. No branch when size == VEC_SIZE. */
+ VMOVU %VEC(0), (%rdi)
+ VMOVU %VEC(0), (VEC_SIZE * -1)(%rdi, %rdx)
+ VZEROUPPER_RETURN
+END (BZERO_SYMBOL(__bzero, unaligned_erms))
+
# if defined SHARED && IS_IN (libc)
ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned_erms))
cmp %RDX_LP, %RCX_LP
@@ -229,6 +282,7 @@ L(last_2x_vec):
.p2align 4,, 10
L(less_vec):
L(less_vec_no_vdup):
+L(less_vec_from_wmemset):
/* Less than 1 VEC. */
# if VEC_SIZE != 16 && VEC_SIZE != 32 && VEC_SIZE != 64
# error Unsupported VEC_SIZE!
@@ -374,8 +428,11 @@ L(less_vec):
/* Broadcast esi to partial register (i.e VEC_SIZE == 32 broadcast to
xmm). This is only does anything for AVX2. */
MEMSET_VDUP_TO_VEC0_LOW ()
+L(less_vec_from_wmemset):
+#if VEC_SIZE > 16
L(less_vec_no_vdup):
#endif
+#endif
L(cross_page):
#if VEC_SIZE > 32
cmpl $32, %edx
@@ -386,7 +443,10 @@ L(cross_page):
jge L(between_16_31)
#endif
#ifndef USE_XMM_LESS_VEC
- MOVQ %XMM0, %rcx
+ MOVQ %XMM0, %SET_REG64
+#endif
+#if VEC_SIZE <= 16
+L(less_vec_no_vdup):
#endif
cmpl $8, %edx
jge L(between_8_15)
@@ -395,7 +455,7 @@ L(cross_page):
cmpl $1, %edx
jg L(between_2_3)
jl L(between_0_0)
- movb %sil, (%LESS_VEC_REG)
+ movb %SET_REG8, (%LESS_VEC_REG)
L(between_0_0):
ret
@@ -428,8 +488,8 @@ L(between_8_15):
MOVQ %XMM0, (%rdi)
MOVQ %XMM0, -8(%rdi, %rdx)
#else
- movq %rcx, (%LESS_VEC_REG)
- movq %rcx, -8(%LESS_VEC_REG, %rdx)
+ movq %SET_REG64, (%LESS_VEC_REG)
+ movq %SET_REG64, -8(%LESS_VEC_REG, %rdx)
#endif
ret
@@ -442,8 +502,8 @@ L(between_4_7):
MOVD %XMM0, (%rdi)
MOVD %XMM0, -4(%rdi, %rdx)
#else
- movl %ecx, (%LESS_VEC_REG)
- movl %ecx, -4(%LESS_VEC_REG, %rdx)
+ movl %SET_REG32, (%LESS_VEC_REG)
+ movl %SET_REG32, -4(%LESS_VEC_REG, %rdx)
#endif
ret
@@ -452,12 +512,12 @@ L(between_4_7):
L(between_2_3):
/* From 2 to 3. No branch when size == 2. */
#ifdef USE_XMM_LESS_VEC
- movb %sil, (%rdi)
- movb %sil, 1(%rdi)
- movb %sil, -1(%rdi, %rdx)
+ movb %SET_REG8, (%rdi)
+ movb %SET_REG8, 1(%rdi)
+ movb %SET_REG8, -1(%rdi, %rdx)
#else
- movw %cx, (%LESS_VEC_REG)
- movb %sil, -1(%LESS_VEC_REG, %rdx)
+ movw %SET_REG16, (%LESS_VEC_REG)
+ movb %SET_REG8, -1(%LESS_VEC_REG, %rdx)
#endif
ret
END (MEMSET_SYMBOL (__memset, unaligned_erms))


@ -0,0 +1,29 @@
commit 70509f9b4807295b2b4b43bffe110580fc0381ef
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Sat Feb 12 00:45:00 2022 -0600
x86: Set .text section in memset-vec-unaligned-erms
commit 3d9f171bfb5325bd5f427e9fc386453358c6e840
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Feb 7 05:55:15 2022 -0800
x86-64: Optimize bzero
That commit removed the directive setting the .text section for the code.
This commit adds it back.
(cherry picked from commit 7912236f4a597deb092650ca79f33504ddb4af28)
diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index 39a096a594ccb5b6..d9c577fb5ff9700f 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -114,6 +114,7 @@
# error SECTION is not defined!
#endif
+ .section SECTION(.text), "ax", @progbits
#if IS_IN (libc)
# if defined SHARED
ENTRY_CHK (WMEMSET_CHK_SYMBOL (__wmemset_chk, unaligned))


@ -0,0 +1,76 @@
commit 5373c90f2ea3c3fa9931a684c9b81c648dfbe8d7
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Tue Feb 15 20:27:21 2022 -0600
x86: Fix bug in strncmp-evex and strncmp-avx2 [BZ #28895]
Logic can read before the start of `s1` / `s2` if both `s1` and `s2`
are near the start of a page. To avoid having the result contaminated by
these comparisons the `strcmp` variants would mask off these
comparisons. This was missing in the `strncmp` variants causing
the bug. This commit adds the masking to `strncmp` so that out of
range comparisons don't affect the result.
test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass, as
well as a full xcheck on x86_64 linux.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit e108c02a5e23c8c88ce66d8705d4a24bb6b9a8bf)
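
The effect of the added `andl %r10d, %ecx` can be illustrated with a small scalar sketch. This is not the upstream code: the exact contents of `%r10d` are not shown in this excerpt, so the sketch assumes a validity mask whose low bits are cleared for the bytes that were loaded from before the start of the string. The function names and the `skip` parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit, or 32 if none is set (the result a
   32-bit tzcnt gives for a zero input).  */
static unsigned
lowest_set_bit (uint32_t x)
{
  unsigned i = 0;

  if (x == 0)
    return 32;
  while ((x & 1) == 0)
    {
      x >>= 1;
      i++;
    }
  return i;
}

/* Buggy shape: bit I of HITS is set when byte I of the loaded vector
   is a mismatch or null, including bytes before the string start.  */
static unsigned
first_hit_unmasked (uint32_t hits)
{
  return lowest_set_bit (hits);
}

/* Fixed shape: clear the bits for the SKIP leading bytes that lie
   before the string start (assumes SKIP < 32), then search.  */
static unsigned
first_hit_masked (uint32_t hits, unsigned skip)
{
  uint32_t valid = ~(uint32_t) 0 << skip;

  return lowest_set_bit (hits & valid);
}
```

With a stale hit at bit 2 and a real one at bit 9, and five out-of-range leading bytes, the unmasked search reports position 2 while the masked search reports position 9.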
diff --git a/string/test-strncmp.c b/string/test-strncmp.c
index 97e831d88fd24316..56e23670ae7f90e4 100644
--- a/string/test-strncmp.c
+++ b/string/test-strncmp.c
@@ -438,13 +438,23 @@ check3 (void)
static void
check4 (void)
{
- const CHAR *s1 = L ("abc");
- CHAR *s2 = STRDUP (s1);
+ /* To trigger bug 28895; We need 1) both s1 and s2 to be within 32 bytes of
+ the end of the page. 2) For there to be no mismatch/null byte before the
+ first page cross. 3) For length (`n`) to be large enough for one string to
+ cross the page. And 4) for there to be either mismatch/null bytes before
+ the start of the strings. */
+
+ size_t size = 10;
+ size_t addr_mask = (getpagesize () - 1) ^ (sizeof (CHAR) - 1);
+ CHAR *s1 = (CHAR *)(buf1 + (addr_mask & 0xffa));
+ CHAR *s2 = (CHAR *)(buf2 + (addr_mask & 0xfed));
+ int exp_result;
+ STRCPY (s1, L ("tst-tlsmod%"));
+ STRCPY (s2, L ("tst-tls-manydynamic73mod"));
+ exp_result = SIMPLE_STRNCMP (s1, s2, size);
FOR_EACH_IMPL (impl, 0)
- check_result (impl, s1, s2, SIZE_MAX, 0);
-
- free (s2);
+ check_result (impl, s1, s2, size, exp_result);
}
int
diff --git a/sysdeps/x86_64/multiarch/strcmp-avx2.S b/sysdeps/x86_64/multiarch/strcmp-avx2.S
index cdded412a70bad10..f9bdc5ccd03aa1f9 100644
--- a/sysdeps/x86_64/multiarch/strcmp-avx2.S
+++ b/sysdeps/x86_64/multiarch/strcmp-avx2.S
@@ -661,6 +661,7 @@ L(ret8):
# ifdef USE_AS_STRNCMP
.p2align 4,, 10
L(return_page_cross_end_check):
+ andl %r10d, %ecx
tzcntl %ecx, %ecx
leal -VEC_SIZE(%rax, %rcx), %ecx
cmpl %ecx, %edx
diff --git a/sysdeps/x86_64/multiarch/strcmp-evex.S b/sysdeps/x86_64/multiarch/strcmp-evex.S
index ed56af8ecdad48b2..0dfa62bd149c02b4 100644
--- a/sysdeps/x86_64/multiarch/strcmp-evex.S
+++ b/sysdeps/x86_64/multiarch/strcmp-evex.S
@@ -689,6 +689,7 @@ L(ret8):
# ifdef USE_AS_STRNCMP
.p2align 4,, 10
L(return_page_cross_end_check):
+ andl %r10d, %ecx
tzcntl %ecx, %ecx
leal -VEC_SIZE(%rax, %rcx, SIZE_OF_CHAR), %ecx
# ifdef USE_AS_WCSCMP

@ -0,0 +1,71 @@
commit e123f08ad5ea4691bc37430ce536988c221332d6
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Thu Mar 24 15:50:33 2022 -0500
x86: Fix fallback for wcsncmp_avx2 in strcmp-avx2.S [BZ #28896]
Overflow case for __wcsncmp_avx2_rtm should be __wcscmp_avx2_rtm not
__wcscmp_avx2.
commit ddf0992cf57a93200e0c782e2a94d0733a5a0b87
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Sun Jan 9 16:02:21 2022 -0600
x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755]
The wrong fallback function was set for `__wcsncmp_avx2_rtm`. It fell
back to `__wcscmp_avx2` instead of `__wcscmp_avx2_rtm`, which
can cause spurious aborts.
This change will need to be backported.
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 9fef7039a7d04947bc89296ee0d187bc8d89b772)
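The length check that selects the fallback can be sketched in C. This models only the `shrq $56` test visible in the hunk below, assuming a 64-bit length and 4-byte wide characters; the function name uses_overflow_fallback is hypothetical and nothing here reproduces the RTM-specific behaviour of the real code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the length check only; n is the wcsncmp length in wide
   characters, and a 64-bit size with 4-byte wide characters is assumed.  */
static bool
uses_overflow_fallback (uint64_t n)
{
  /* Mirrors `shrq $56, %rcx; jnz ...`: taken when any of the top 8 bits
     of n is set, before n is scaled by 4 to a byte count.  */
  return (n >> 56) != 0;
}
```

The new function_overflow2 test passes SIZE_MAX >> 4 as the length. On a 64-bit target that value still has bits set above bit 55, so under this model it takes the fallback branch that the fix redirects.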
diff --git a/sysdeps/x86/tst-strncmp-rtm.c b/sysdeps/x86/tst-strncmp-rtm.c
index aef9866cf2fbe774..ba6543be8ce13927 100644
--- a/sysdeps/x86/tst-strncmp-rtm.c
+++ b/sysdeps/x86/tst-strncmp-rtm.c
@@ -70,6 +70,16 @@ function_overflow (void)
return 1;
}
+__attribute__ ((noinline, noclone))
+static int
+function_overflow2 (void)
+{
+ if (STRNCMP (string1, string2, SIZE_MAX >> 4) == 0)
+ return 0;
+ else
+ return 1;
+}
+
static int
do_test (void)
{
@@ -77,5 +87,10 @@ do_test (void)
if (status != EXIT_SUCCESS)
return status;
status = do_test_1 (TEST_NAME, LOOP, prepare, function_overflow);
+ if (status != EXIT_SUCCESS)
+ return status;
+ status = do_test_1 (TEST_NAME, LOOP, prepare, function_overflow2);
+ if (status != EXIT_SUCCESS)
+ return status;
return status;
}
diff --git a/sysdeps/x86_64/multiarch/strcmp-avx2.S b/sysdeps/x86_64/multiarch/strcmp-avx2.S
index f9bdc5ccd03aa1f9..09a73942086f9c9f 100644
--- a/sysdeps/x86_64/multiarch/strcmp-avx2.S
+++ b/sysdeps/x86_64/multiarch/strcmp-avx2.S
@@ -122,7 +122,7 @@ ENTRY(STRCMP)
are cases where length is large enough that it can never be a
bound on valid memory so just use wcscmp. */
shrq $56, %rcx
- jnz __wcscmp_avx2
+ jnz OVERFLOW_STRCMP
leaq (, %rdx, 4), %rdx
# endif

@ -0,0 +1,170 @@
commit e4a2fb76efb45210c541ee3f8ef32f317783c3a8
Author: Florian Weimer <fweimer@redhat.com>
Date: Wed May 11 20:30:49 2022 +0200
manual: Document the dlinfo function
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 93804a1ee084d4bdc620b2b9f91615c7da0fabe1)
Also includes partial backport of commit 5d28a8962dcb6ec056b81d730e
(the addition of manual/dynlink.texi).
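As a rough illustration of the interface this manual text documents, the following sketch queries the link map and namespace identifier of the main program. The helper name query_main_program is hypothetical, and using dlopen (NULL, ...) to obtain a handle for the main program is a choice made for the example, not something taken from the patch.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical helper: store the link map and namespace ID of the main
   program.  Returns 0 on success and -1 on failure, in which case
   dlerror () describes the problem.  */
static int
query_main_program (struct link_map **map_out, Lmid_t *lmid_out)
{
  /* A null file name yields a handle for the main program.  */
  void *handle = dlopen (NULL, RTLD_NOW);
  if (handle == NULL)
    return -1;

  /* RTLD_DI_LINKMAP expects the address of a struct link_map *.  */
  int rc = dlinfo (handle, RTLD_DI_LINKMAP, map_out);

  /* RTLD_DI_LMID expects the address of an Lmid_t.  */
  if (rc == 0)
    rc = dlinfo (handle, RTLD_DI_LMID, lmid_out);

  dlclose (handle);
  return rc;
}
```

A caller would typically print dlerror () when the helper returns -1. On glibc versions before 2.34 the program may also need to be linked with -ldl.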
diff --git a/manual/Makefile b/manual/Makefile
index e83444341e282916..31678681ef059e0f 100644
--- a/manual/Makefile
+++ b/manual/Makefile
@@ -39,7 +39,7 @@ chapters = $(addsuffix .texi, \
pipe socket terminal syslog math arith time \
resource setjmp signal startup process ipc job \
nss users sysinfo conf crypt debug threads \
- probes tunables)
+ dynlink probes tunables)
appendices = lang.texi header.texi install.texi maint.texi platform.texi \
contrib.texi
licenses = freemanuals.texi lgpl-2.1.texi fdl-1.3.texi
diff --git a/manual/dynlink.texi b/manual/dynlink.texi
new file mode 100644
index 0000000000000000..dbf3de11769d8e57
--- /dev/null
+++ b/manual/dynlink.texi
@@ -0,0 +1,100 @@
+@node Dynamic Linker
+@c @node Dynamic Linker, Internal Probes, Threads, Top
+@c %MENU% Loading programs and shared objects.
+@chapter Dynamic Linker
+@cindex dynamic linker
+@cindex dynamic loader
+
+The @dfn{dynamic linker} is responsible for loading dynamically linked
+programs and their dependencies (in the form of shared objects). The
+dynamic linker in @theglibc{} also supports loading shared objects (such
+as plugins) later at run time.
+
+Dynamic linkers are sometimes called @dfn{dynamic loaders}.
+
+@menu
+* Dynamic Linker Introspection:: Interfaces for querying mapping information.
+@end menu
+
+@node Dynamic Linker Introspection
+@section Dynamic Linker Introspection
+
+@Theglibc{} provides various functions for querying information from the
+dynamic linker.
+
+@deftypefun {int} dlinfo (void *@var{handle}, int @var{request}, void *@var{arg})
+@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acunsafe{@acucorrupt{}}}
+@standards{GNU, dlfcn.h}
+This function returns information about @var{handle} in the memory
+location @var{arg}, based on @var{request}. The @var{handle} argument
+must be a pointer returned by @code{dlopen} or @code{dlmopen}; it must
+not have been closed by @code{dlclose}.
+
+On success, @code{dlinfo} returns 0. If there is an error, the function
+returns @math{-1}, and @code{dlerror} can be used to obtain a
+corresponding error message.
+
+The following operations are defined for use with @var{request}:
+
+@vtable @code
+@item RTLD_DI_LINKMAP
+The corresponding @code{struct link_map} pointer for @var{handle} is
+written to @code{*@var{arg}}. The @var{arg} argument must be the
+address of an object of type @code{struct link_map *}.
+
+@item RTLD_DI_LMID
+The namespace identifier of @var{handle} is written to
+@code{*@var{arg}}. The @var{arg} argument must be the address of an
+object of type @code{Lmid_t}.
+
+@item RTLD_DI_ORIGIN
+The value of the @code{$ORIGIN} dynamic string token for @var{handle} is
+written to the character array starting at @var{arg} as a
+null-terminated string.
+
+This request type should not be used because it is prone to buffer
+overflows.
+
+@item RTLD_DI_SERINFO
+@itemx RTLD_DI_SERINFOSIZE
+These requests can be used to obtain search path information for
+@var{handle}. For both requests, @var{arg} must point to a
+@code{Dl_serinfo} object. The @code{RTLD_DI_SERINFOSIZE} request must
+be made first; it updates the @code{dls_size} and @code{dls_cnt} members
+of the @code{Dl_serinfo} object. The caller should then allocate memory
+to store at least @code{dls_size} bytes and pass that buffer to a
+@code{RTLD_DI_SERINFO} request. This second request fills the
+@code{dls_serpath} array. The number of array elements was returned in
+the @code{dls_cnt} member in the initial @code{RTLD_DI_SERINFOSIZE}
+request. The caller is responsible for freeing the allocated buffer.
+
+This interface is prone to buffer overflows in multi-threaded processes
+because the required size can change between the
+@code{RTLD_DI_SERINFOSIZE} and @code{RTLD_DI_SERINFO} requests.
+
+@item RTLD_DI_TLS_DATA
+This request writes the address of the TLS block (in the current thread)
+for the shared object identified by @var{handle} to @code{*@var{arg}}.
+The argument @var{arg} must be the address of an object of type
+@code{void *}. A null pointer is written if the object does not have
+any associated TLS block.
+
+@item RTLD_DI_TLS_MODID
+This request writes the TLS module ID for the shared object @var{handle}
+to @code{*@var{arg}}. The argument @var{arg} must be the address of an
+object of type @code{size_t}. The module ID is zero if the object
+does not have an associated TLS block.
+@end vtable
+
+The @code{dlinfo} function is a GNU extension.
+@end deftypefun
+
+@c FIXME these are undocumented:
+@c dladdr
+@c dladdr1
+@c dlclose
+@c dlerror
+@c dlmopen
+@c dlopen
+@c dlsym
+@c dlvsym
diff --git a/manual/libdl.texi b/manual/libdl.texi
deleted file mode 100644
index e3fe0452d9f41d47..0000000000000000
--- a/manual/libdl.texi
+++ /dev/null
@@ -1,10 +0,0 @@
-@c FIXME these are undocumented:
-@c dladdr
-@c dladdr1
-@c dlclose
-@c dlerror
-@c dlinfo
-@c dlmopen
-@c dlopen
-@c dlsym
-@c dlvsym
diff --git a/manual/probes.texi b/manual/probes.texi
index 4aae76b81921f347..ee019e651706f492 100644
--- a/manual/probes.texi
+++ b/manual/probes.texi
@@ -1,5 +1,5 @@
@node Internal Probes
-@c @node Internal Probes, Tunables, Threads, Top
+@c @node Internal Probes, Tunables, Dynamic Linker, Top
@c %MENU% Probes to monitor libc internal behavior
@chapter Internal probes
diff --git a/manual/threads.texi b/manual/threads.texi
index 06b6b277a1228af1..7f166bfa87e88c36 100644
--- a/manual/threads.texi
+++ b/manual/threads.texi
@@ -1,5 +1,5 @@
@node Threads
-@c @node Threads, Internal Probes, Debugging Support, Top
+@c @node Threads, Dynamic Linker, Debugging Support, Top
@c %MENU% Functions, constants, and data types for working with threads
@chapter Threads
@cindex threads

@ -0,0 +1,256 @@
commit 91c2e6c3db44297bf4cb3a2e3c40236c5b6a0b23
Author: Florian Weimer <fweimer@redhat.com>
Date: Fri Apr 29 17:00:53 2022 +0200
dlfcn: Implement the RTLD_DI_PHDR request type for dlinfo
The information is theoretically available via dl_iterate_phdr as
well, but that approach is very slow if there are many shared
objects.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit d056c212130280c0a54d9a4f72170ec621b70ce5)
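A minimal sketch of how a caller might use the new request for the main program follows. The helper name main_program_phdrs is hypothetical. The version guard assumes that the first upstream release containing RTLD_DI_PHDR is glibc 2.36; a backport such as this one would not be detected by it. The fallback through the auxiliary vector relies on the equivalence that the test in this patch checks for the executable.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>
#include <sys/auxv.h>

/* Hypothetical helper: store the address of the main program's program
   header array in *phdr_out and return the number of entries, or -1 on
   error.  */
static int
main_program_phdrs (const ElfW(Phdr) **phdr_out)
{
#if __GLIBC_PREREQ (2, 36)   /* Assumed first release with RTLD_DI_PHDR.  */
  void *handle = dlopen (NULL, RTLD_NOW);
  if (handle == NULL)
    return -1;
  /* For this request the return value is the header count, not 0.  */
  int phnum = dlinfo (handle, RTLD_DI_PHDR, phdr_out);
  dlclose (handle);
  return phnum;
#else
  /* Older glibc: for the executable only, the kernel supplies the same
     information in the auxiliary vector.  */
  *phdr_out = (const ElfW(Phdr) *) getauxval (AT_PHDR);
  return (int) getauxval (AT_PHNUM);
#endif
}
```

Unlike dl_iterate_phdr, the dlinfo call works on a single handle, so no callback and no walk over all loaded objects is needed.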
diff --git a/dlfcn/Makefile b/dlfcn/Makefile
index 6bbfbb8344da05cb..d3965427dabed898 100644
--- a/dlfcn/Makefile
+++ b/dlfcn/Makefile
@@ -73,6 +73,10 @@ tststatic3-ENV = $(tststatic-ENV)
tststatic4-ENV = $(tststatic-ENV)
tststatic5-ENV = $(tststatic-ENV)
+tests-internal += \
+ tst-dlinfo-phdr \
+ # tests-internal
+
ifneq (,$(CXX))
modules-names += bug-atexit3-lib
else
diff --git a/dlfcn/dlfcn.h b/dlfcn/dlfcn.h
index 4a3b870a487ea789..24388cfedae4dd67 100644
--- a/dlfcn/dlfcn.h
+++ b/dlfcn/dlfcn.h
@@ -162,7 +162,12 @@ enum
segment, or if the calling thread has not allocated a block for it. */
RTLD_DI_TLS_DATA = 10,
- RTLD_DI_MAX = 10
+ /* Treat ARG as const ElfW(Phdr) **, and store the address of the
+ program header array at that location. The dlinfo call returns
+ the number of program headers in the array. */
+ RTLD_DI_PHDR = 11,
+
+ RTLD_DI_MAX = 11
};
diff --git a/dlfcn/dlinfo.c b/dlfcn/dlinfo.c
index 47d2daa96fa5986f..1842925fb7c594dd 100644
--- a/dlfcn/dlinfo.c
+++ b/dlfcn/dlinfo.c
@@ -28,6 +28,10 @@ struct dlinfo_args
void *handle;
int request;
void *arg;
+
+ /* This is the value that is returned from dlinfo if no error is
+ signaled. */
+ int result;
};
static void
@@ -40,6 +44,7 @@ dlinfo_doit (void *argsblock)
{
case RTLD_DI_CONFIGADDR:
default:
+ args->result = -1;
_dl_signal_error (0, NULL, NULL, N_("unsupported dlinfo request"));
break;
@@ -75,6 +80,11 @@ dlinfo_doit (void *argsblock)
*(void **) args->arg = data;
break;
}
+
+ case RTLD_DI_PHDR:
+ *(const ElfW(Phdr) **) args->arg = l->l_phdr;
+ args->result = l->l_phnum;
+ break;
}
}
@@ -82,7 +92,8 @@ static int
dlinfo_implementation (void *handle, int request, void *arg)
{
struct dlinfo_args args = { handle, request, arg };
- return _dlerror_run (&dlinfo_doit, &args) ? -1 : 0;
+ _dlerror_run (&dlinfo_doit, &args);
+ return args.result;
}
#ifdef SHARED
diff --git a/dlfcn/tst-dlinfo-phdr.c b/dlfcn/tst-dlinfo-phdr.c
new file mode 100644
index 0000000000000000..a15a7d48ebd3b976
--- /dev/null
+++ b/dlfcn/tst-dlinfo-phdr.c
@@ -0,0 +1,125 @@
+/* Test for dlinfo (RTLD_DI_PHDR).
+ Copyright (C) 2022 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <dlfcn.h>
+#include <link.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/auxv.h>
+
+#include <support/check.h>
+#include <support/xdlfcn.h>
+
+/* Used to verify that the program header array appears as expected
+ among the dl_iterate_phdr callback invocations. */
+
+struct dlip_callback_args
+{
+ struct link_map *l; /* l->l_addr is used to find the object. */
+ const ElfW(Phdr) *phdr; /* Expected program header pointer. */
+ int phnum; /* Expected program header count. */
+ bool found; /* True if l->l_addr has been found. */
+};
+
+static int
+dlip_callback (struct dl_phdr_info *dlpi, size_t size, void *closure)
+{
+ TEST_COMPARE (sizeof (*dlpi), size);
+ struct dlip_callback_args *args = closure;
+
+ if (dlpi->dlpi_addr == args->l->l_addr)
+ {
+ TEST_VERIFY (!args->found);
+ args->found = true;
+ TEST_VERIFY (args->phdr == dlpi->dlpi_phdr);
+ TEST_COMPARE (args->phnum, dlpi->dlpi_phnum);
+ }
+
+ return 0;
+}
+
+static int
+do_test (void)
+{
+ /* Avoid a copy relocation. */
+ struct r_debug *debug = xdlsym (RTLD_DEFAULT, "_r_debug");
+ struct link_map *l = (struct link_map *) debug->r_map;
+ TEST_VERIFY_EXIT (l != NULL);
+
+ do
+ {
+ printf ("info: checking link map %p (%p) for \"%s\"\n",
+ l, l->l_phdr, l->l_name);
+
+ /* Cause dlerror () to return an error message. */
+ dlsym (RTLD_DEFAULT, "does-not-exist");
+
+ /* Use the extension that link maps are valid dlopen handles. */
+ const ElfW(Phdr) *phdr;
+ int phnum = dlinfo (l, RTLD_DI_PHDR, &phdr);
+ TEST_VERIFY (phnum >= 0);
+ /* Verify that the error message has been cleared. */
+ TEST_COMPARE_STRING (dlerror (), NULL);
+
+ TEST_VERIFY (phdr == l->l_phdr);
+ TEST_COMPARE (phnum, l->l_phnum);
+
+ /* Check that we can find PT_DYNAMIC among the array. */
+ {
+ bool dynamic_found = false;
+ for (int i = 0; i < phnum; ++i)
+ if (phdr[i].p_type == PT_DYNAMIC)
+ {
+ dynamic_found = true;
+ TEST_COMPARE ((ElfW(Addr)) l->l_ld, l->l_addr + phdr[i].p_vaddr);
+ }
+ TEST_VERIFY (dynamic_found);
+ }
+
+ /* Check that dl_iterate_phdr finds the link map with the same
+ program headers. */
+ {
+ struct dlip_callback_args args =
+ {
+ .l = l,
+ .phdr = phdr,
+ .phnum = phnum,
+ .found = false,
+ };
+ TEST_COMPARE (dl_iterate_phdr (dlip_callback, &args), 0);
+ TEST_VERIFY (args.found);
+ }
+
+ if (l->l_prev == NULL)
+ {
+ /* This is the executable, so the information is also
+ available via getauxval. */
+ TEST_COMPARE_STRING (l->l_name, "");
+ TEST_VERIFY (phdr == (const ElfW(Phdr) *) getauxval (AT_PHDR));
+ TEST_COMPARE (phnum, getauxval (AT_PHNUM));
+ }
+
+ l = l->l_next;
+ }
+ while (l != NULL);
+
+ return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/manual/dynlink.texi b/manual/dynlink.texi
index dbf3de11769d8e57..7dcac64889e389fd 100644
--- a/manual/dynlink.texi
+++ b/manual/dynlink.texi
@@ -30,9 +30,9 @@ location @var{arg}, based on @var{request}. The @var{handle} argument
must be a pointer returned by @code{dlopen} or @code{dlmopen}; it must
not have been closed by @code{dlclose}.
-On success, @code{dlinfo} returns 0. If there is an error, the function
-returns @math{-1}, and @code{dlerror} can be used to obtain a
-corresponding error message.
+On success, @code{dlinfo} returns 0 for most request types; exceptions
+are noted below. If there is an error, the function returns @math{-1},
+and @code{dlerror} can be used to obtain a corresponding error message.
The following operations are defined for use with @var{request}:
@@ -84,6 +84,15 @@ This request writes the TLS module ID for the shared object @var{handle}
to @code{*@var{arg}}. The argument @var{arg} must be the address of an
object of type @code{size_t}. The module ID is zero if the object
does not have an associated TLS block.
+
+@item RTLD_DI_PHDR
+This request writes the address of the program header array to
+@code{*@var{arg}}. The argument @var{arg} must be the address of an
+object of type @code{const ElfW(Phdr) *} (that is,
+@code{const Elf32_Phdr *} or @code{const Elf64_Phdr *}, as appropriate
+for the current architecture). For this request, the value returned by
+@code{dlinfo} is the number of program headers in the program header
+array.
@end vtable
The @code{dlinfo} function is a GNU extension.

@ -148,7 +148,7 @@ end \
Summary: The GNU libc libraries
Name: glibc
Version: %{glibcversion}
-Release: 32%{?dist}
+Release: 33%{?dist}
# In general, GPLv2+ is used by programs, LGPLv2+ is used for
# libraries.
@ -461,6 +461,28 @@ Patch253: glibc-upstream-2.34-187.patch
Patch254: glibc-upstream-2.34-188.patch
Patch255: glibc-upstream-2.34-189.patch
Patch256: glibc-upstream-2.34-190.patch
Patch257: glibc-upstream-2.34-191.patch
Patch258: glibc-upstream-2.34-192.patch
Patch259: glibc-upstream-2.34-193.patch
Patch260: glibc-upstream-2.34-194.patch
Patch261: glibc-upstream-2.34-195.patch
Patch262: glibc-upstream-2.34-196.patch
Patch263: glibc-upstream-2.34-197.patch
Patch264: glibc-upstream-2.34-198.patch
Patch265: glibc-upstream-2.34-199.patch
Patch266: glibc-upstream-2.34-200.patch
Patch267: glibc-upstream-2.34-201.patch
Patch268: glibc-upstream-2.34-202.patch
Patch269: glibc-upstream-2.34-203.patch
Patch270: glibc-upstream-2.34-204.patch
Patch271: glibc-upstream-2.34-205.patch
Patch272: glibc-upstream-2.34-206.patch
Patch273: glibc-upstream-2.34-207.patch
Patch274: glibc-upstream-2.34-208.patch
Patch275: glibc-upstream-2.34-209.patch
Patch276: glibc-upstream-2.34-210.patch
Patch277: glibc-upstream-2.34-211.patch
Patch278: glibc-upstream-2.34-212.patch
##############################################################################
# Continued list of core "glibc" package information:
@ -2517,6 +2539,32 @@ fi
%files -f compat-libpthread-nonshared.filelist -n compat-libpthread-nonshared
%changelog
* Thu May 12 2022 Florian Weimer <fweimer@redhat.com> - 2.34-33
- Sync with upstream branch release/2.34/master,
commit 91c2e6c3db44297bf4cb3a2e3c40236c5b6a0b23:
- dlfcn: Implement the RTLD_DI_PHDR request type for dlinfo
- manual: Document the dlinfo function
- x86: Fix fallback for wcsncmp_avx2 in strcmp-avx2.S [BZ #28896]
- x86: Fix bug in strncmp-evex and strncmp-avx2 [BZ #28895]
- x86: Set .text section in memset-vec-unaligned-erms
- x86-64: Optimize bzero
- x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)
- x86: Improve vec generation in memset-vec-unaligned-erms.S
- x86-64: Fix strcmp-evex.S
- x86-64: Fix strcmp-avx2.S
- x86: Optimize strcmp-evex.S
- x86: Optimize strcmp-avx2.S
- manual: Clarify that abbreviations of long options are allowed
- Add HWCAP2_AFP, HWCAP2_RPRES from Linux 5.17 to AArch64 bits/hwcap.h
- aarch64: Add HWCAP2_ECV from Linux 5.16
- Add SOL_MPTCP, SOL_MCTP from Linux 5.16 to bits/socket.h
- Update kernel version to 5.17 in tst-mman-consts.py
- Update kernel version to 5.16 in tst-mman-consts.py
- Update syscall lists for Linux 5.17
- Add ARPHRD_CAN, ARPHRD_MCTP to net/if_arp.h
- Update kernel version to 5.15 in tst-mman-consts.py
- Add PF_MCTP, AF_MCTP from Linux 5.15 to bits/socket.h
* Thu Apr 28 2022 Carlos O'Donell <carlos@redhat.com> - 2.34-32
- Sync with upstream branch release/2.34/master,
commit c66c92181ddbd82306537a608e8c0282587131de: