glibc/glibc-upstream-2.34-290.patch
Arjun Shankar b68cfb3330 Sync with upstream branch release/2.34/master
Upstream commit: b2f32e746492615a6eb3e66fac1e766e32e8deb1

- malloc: Simplify implementation of __malloc_assert
- Update syscall-names.list for Linux 5.18
- x86: Add missing IS_IN (libc) check to strncmp-sse4_2.S
- x86: Move mem{p}{mov|cpy}_{chk_}erms to its own file
- x86: Move and slightly improve memset_erms
- x86: Add definition for __wmemset_chk AVX2 RTM in ifunc impl list
- x86: Put wcs{n}len-sse4.1 in the sse4.1 text section
- x86: Align entry for memrchr to 64-bytes.
- x86: Add BMI1/BMI2 checks for ISA_V3 check
- x86: Cleanup bounds checking in large memcpy case
- x86: Add bounds `x86_non_temporal_threshold`
- x86: Add sse42 implementation to strcmp's ifunc
- x86: Fix misordered logic for setting `rep_movsb_stop_threshold`
- x86: Align varshift table to 32-bytes
- x86: ZERO_UPPER_VEC_REGISTERS_RETURN_XTEST expect no transactions
- x86: Shrink code size of memchr-evex.S
- x86: Shrink code size of memchr-avx2.S
- x86: Optimize memrchr-avx2.S
- x86: Optimize memrchr-evex.S
- x86: Optimize memrchr-sse2.S
- x86: Add COND_VZEROUPPER that can replace vzeroupper if no `ret`
- x86: Create header for VEC classes in x86 strings library
- x86_64: Add strstr function with 512-bit EVEX
- x86-64: Ignore r_addend for R_X86_64_GLOB_DAT/R_X86_64_JUMP_SLOT
- x86_64: Implement evex512 version of strlen, strnlen, wcslen and wcsnlen
- x86_64: Remove bzero optimization
- x86_64: Remove end of line trailing spaces
- nptl: Fix ___pthread_unregister_cancel_restore asynchronous restore
- linux: Fix mq_timereceive check for 32 bit fallback code (BZ 29304)
2022-07-22 13:20:49 +02:00

57 lines
2.5 KiB
Diff

commit 6e008c884dad5a25f91085c68d044bb5e2d63761
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Tue Jun 14 13:50:11 2022 -0700
x86: Fix misordered logic for setting `rep_movsb_stop_threshold`
Move the setting of `rep_movsb_stop_threshold` to after the tunables
have been collected so that the `rep_movsb_stop_threshold` (which
is used to redirect control flow to the non_temporal case) will
use any user value for `non_temporal_threshold` (set using
glibc.cpu.x86_non_temporal_threshold)
(cherry picked from commit 035591551400cfc810b07244a015c9411e8bff7c)
diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index 2e43e67e4f4037d3..560bf260e8fbd7bf 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -898,18 +898,6 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
if (CPU_FEATURE_USABLE_P (cpu_features, FSRM))
rep_movsb_threshold = 2112;
- unsigned long int rep_movsb_stop_threshold;
- /* ERMS feature is implemented from AMD Zen3 architecture and it is
- performing poorly for data above L2 cache size. Henceforth, adding
- an upper bound threshold parameter to limit the usage of Enhanced
- REP MOVSB operations and setting its value to L2 cache size. */
- if (cpu_features->basic.kind == arch_kind_amd)
- rep_movsb_stop_threshold = core;
- /* Setting the upper bound of ERMS to the computed value of
- non-temporal threshold for architectures other than AMD. */
- else
- rep_movsb_stop_threshold = non_temporal_threshold;
-
/* The default threshold to use Enhanced REP STOSB. */
unsigned long int rep_stosb_threshold = 2048;
@@ -951,6 +939,18 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
SIZE_MAX);
#endif
+ unsigned long int rep_movsb_stop_threshold;
+ /* ERMS feature is implemented from AMD Zen3 architecture and it is
+ performing poorly for data above L2 cache size. Henceforth, adding
+ an upper bound threshold parameter to limit the usage of Enhanced
+ REP MOVSB operations and setting its value to L2 cache size. */
+ if (cpu_features->basic.kind == arch_kind_amd)
+ rep_movsb_stop_threshold = core;
+ /* Setting the upper bound of ERMS to the computed value of
+ non-temporal threshold for architectures other than AMD. */
+ else
+ rep_movsb_stop_threshold = non_temporal_threshold;
+
cpu_features->data_cache_size = data;
cpu_features->shared_cache_size = shared;
cpu_features->non_temporal_threshold = non_temporal_threshold;