7829fb09a2
In commit 0b053c9518
("lib: memzero_explicit: use barrier instead
of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
case LTO would decide to inline memzero_explicit() and eventually
find out it could be eliminated as a dead store.
While using barrier() works well for the case of gcc, recent efforts
from LLVMLinux people suggest to use llvm as an alternative to gcc,
and there, Stephan found in a simple stand-alone user space example
that llvm could nevertheless optimize and thus eliminate the memset().
A similar issue has been observed in the referenced llvm bug report,
which is regarded as not-a-bug.
Based on some experiments, icc is a bit of a special case: while it
doesn't seem to eliminate the memset(), it could do so with its own
implementation, which would lead to similar findings as with llvm.
The fix in this patch now works for all three compilers (also tested
with more aggressive optimization levels). Arguably, in the current
kernel tree it's more of a theoretical issue, but imho, it's better
to be pedantic about it.
It's clearly visible with gcc/llvm though, with the below code: if we
had used only barrier() here, llvm would have omitted the clearing;
not so with the barrier_data() variant:
static inline void memzero_explicit(void *s, size_t count)
{
	memset(s, 0, count);
	barrier_data(s);
}

int main(void)
{
	char buff[20];

	memzero_explicit(buff, sizeof(buff));
	return 0;
}
$ gcc -O2 test.c
$ gdb a.out
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400400 <+0>: lea -0x28(%rsp),%rax
0x0000000000400405 <+5>: movq $0x0,-0x28(%rsp)
0x000000000040040e <+14>: movq $0x0,-0x20(%rsp)
0x0000000000400417 <+23>: movl $0x0,-0x18(%rsp)
0x000000000040041f <+31>: xor %eax,%eax
0x0000000000400421 <+33>: retq
End of assembler dump.
$ clang -O2 test.c
$ gdb a.out
(gdb) disassemble main
Dump of assembler code for function main:
0x00000000004004f0 <+0>: xorps %xmm0,%xmm0
0x00000000004004f3 <+3>: movaps %xmm0,-0x18(%rsp)
0x00000000004004f8 <+8>: movl $0x0,-0x8(%rsp)
0x0000000000400500 <+16>: lea -0x18(%rsp),%rax
0x0000000000400505 <+21>: xor %eax,%eax
0x0000000000400507 <+23>: retq
End of assembler dump.
Since gcc, clang, and also icc define __GNUC__, it's sufficient to define
this in compiler-gcc.h only for it to be picked up. For a fallback or
otherwise unsupported compiler, we define it as a barrier. Similarly for
ecc, which does not support gcc inline asm.
Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
Reported-by: Stephan Mueller <smueller@chronox.de>
Tested-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Stephan Mueller <smueller@chronox.de>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: mancha security <mancha1@zoho.com>
Cc: Mark Charlebois <charlebm@gmail.com>
Cc: Behan Webster <behanw@converseincode.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
44 lines
1.1 KiB
C
#ifndef __LINUX_COMPILER_H
#error "Please don't include <linux/compiler-intel.h> directly, include <linux/compiler.h> instead."
#endif

#ifdef __ECC

/* Some compiler specific definitions are overwritten here
 * for Intel ECC compiler
 */

#include <asm/intrinsics.h>

/* Intel ECC compiler doesn't support gcc specific asm stmts.
 * It uses intrinsics to do the equivalent things.
 */
#undef barrier_data
#undef RELOC_HIDE
#undef OPTIMIZER_HIDE_VAR

#define barrier_data(ptr) barrier()

#define RELOC_HIDE(ptr, off) \
  ({ unsigned long __ptr; \
     __ptr = (unsigned long) (ptr); \
    (typeof(ptr)) (__ptr + (off)); })

/* This should act as an optimization barrier on var.
 * Given that this compiler does not have inline assembly, a compiler barrier
 * is the best we can do.
 */
#define OPTIMIZER_HIDE_VAR(var) barrier()

/* Intel ECC compiler doesn't support __builtin_types_compatible_p() */
#define __must_be_array(a) 0

#endif

#ifndef __HAVE_BUILTIN_BSWAP16__
/* icc has this, but it's called _bswap16 */
#define __HAVE_BUILTIN_BSWAP16__
#define __builtin_bswap16 _bswap16
#endif