kernel-ark/kernel
Richard Guy Briggs 32a1dbaece audit: try harder to send to auditd upon netlink failure
There are several reports of the kernel losing contact with auditd when
it is, in fact, still running.  When this happens, kernel syslogs show:
	"audit: *NO* daemon at audit_pid=<pid>"
although auditd is still running, and is apparently happy, listening on
the netlink socket. The pid in the "*NO* daemon" message matches the pid
of the running auditd process.  Restarting auditd solves this.

The problem appears to happen randomly, and doesn't seem to be strongly
correlated to the rate of audit events being logged.  The problem
happens fairly regularly (every few days), but not yet reproduced to
order.

On production kernels, BUG_ON() is a no-op, so any error will trigger
this.

Commit 34eab0a7cd ("audit: prevent an older auditd shutdown from
orphaning a newer auditd startup") eliminates one possible cause.  This
isn't the case here, since the PID in the error message and the PID of
the running auditd match.

The primary expected cause of error here is -ECONNREFUSED when the audit
daemon goes away, when netlink_getsockbyportid() can't find the auditd
portid entry in the netlink audit table (or there is no receive
function).  If -EPERM is returned, that situation isn't likely to be
resolved in a timely fashion without administrator intervention.  In
both cases, reset the audit_pid.  This does not rule out a race
condition.  SELinux is expected to return zero since this isn't an INET
or INET6 socket.  Other LSMs may have other return codes.  Log the error
code for better diagnosis in the future.

In the case of -ENOMEM, the situation could be temporary, based on local
or general availability of buffers.  -EAGAIN should never happen since
the netlink audit (kernel) socket is set to MAX_SCHEDULE_TIMEOUT.
-ERESTARTSYS and -EINTR are not expected since this kernel thread is not
expected to receive signals.  In these cases (or any other unexpected
ones for now), report the error and re-schedule the thread, retrying up
to 5 times.

v2:
	Removed BUG_ON().
	Moved comma in pr_*() statements.
	Removed audit_strerror() text.

Reported-by: Vipin Rathor <v.rathor@gmail.com>
Reported-by: <ctcard@hotmail.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: applied rgb's fixup patch to correct audit_log_lost() format issues]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-11-04 08:23:50 -05:00
..
bpf bpf: fix out of bounds access in verifier log 2015-09-09 14:11:55 -07:00
configs kconfig: add xenconfig defconfig helper 2015-06-16 11:04:29 +01:00
debug debug: prevent entering debug mode on panic/exception. 2015-02-19 12:39:03 -06:00
events perf: Fix races in computing the header sizes 2015-09-18 09:20:26 +02:00
gcov gcov: add support for GCC 5.1 2015-06-30 19:44:57 -07:00
irq genirq/msi: Do not use pci_msi_[un]mask_irq as default methods 2015-10-16 12:40:43 +02:00
livepatch livepatch: Improve error handling in klp_disable_func() 2015-07-14 22:48:06 +02:00
locking locking/lockdep: Fix hlock->pin_count reset on lock stack rebuilds 2015-09-23 09:48:53 +02:00
power Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block 2015-09-02 13:10:25 -07:00
printk kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
rcu rcu: Suppress lockdep false positive for rcp->exp_funnel_mutex 2015-09-20 21:01:22 -07:00
sched sched/core: Add missing lockdep_unpin() annotations 2015-10-23 12:02:10 +02:00
time timekeeping: Increment clock_was_set_seq in timekeeping_init() 2015-10-16 15:50:22 +02:00
trace tracing: Do not allow stack_tracer to record stack in NMI 2015-10-20 21:52:23 -04:00
.gitignore
acct.c acct: check FMODE_CAN_WRITE 2015-04-11 22:27:55 -04:00
async.c
audit_fsnotify.c audit: clean simple fsnotify implementation 2015-08-06 16:14:53 -04:00
audit_tree.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-09-08 13:34:59 -07:00
audit_watch.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-09-08 13:34:59 -07:00
audit.c audit: try harder to send to auditd upon netlink failure 2015-11-04 08:23:50 -05:00
audit.h Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-09-08 13:34:59 -07:00
auditfilter.c audit: implement audit by executable 2015-08-06 16:17:25 -04:00
auditsc.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-09-08 13:34:59 -07:00
backtracetest.c
bounds.c
capability.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
cgroup_freezer.c cgroup: allow a cgroup subsystem to reject a fork 2015-07-14 17:29:23 -04:00
cgroup_pids.c cgroup: pids: fix invalid get/put usage 2015-08-25 14:19:25 -04:00
cgroup.c Revert "sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem" 2015-09-16 11:51:12 -04:00
compat.c compat: cleanup coding in compat_get_bitmap() and compat_put_bitmap() 2015-06-04 23:57:18 +02:00
configs.c
context_tracking.c context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-05-07 12:02:51 +02:00
cpu_pm.c kernel/cpu_pm: fix cpu_cluster_pm_exit comment 2015-09-03 02:42:20 +02:00
cpu.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-08-31 20:26:22 -07:00
cpuset.c cpuset: use trialcs->mems_allowed as a temp variable 2015-08-10 11:18:41 -04:00
crash_dump.c
cred.c kernel/cred.c: remove unnecessary kdebug atomic reads 2015-09-10 13:29:01 -07:00
delayacct.c
dma.c
elfcore.c
exec_domain.c Remove rest of exec domains. 2015-04-12 21:03:31 +02:00
exit.c kernel: exit: fix typo in comment 2015-08-07 13:59:49 +02:00
extable.c kernel/extable.c: remove duplicated include 2015-09-10 13:29:01 -07:00
fork.c Revert "sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem" 2015-09-16 11:51:12 -04:00
freezer.c
futex_compat.c
futex.c futex: Make should_fail_futex() static 2015-07-20 21:43:54 +02:00
groups.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
hung_task.c kernel/hung_task.c: change hung_task.c to use for_each_process_thread() 2015-04-15 16:35:22 -07:00
irq_work.c
jump_label.c locking/static_keys: Add selftest 2015-08-03 11:34:16 +02:00
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS 2015-05-12 09:46:00 +02:00
Kconfig.preempt
kexec_core.c kexec: export KERNEL_IMAGE_SIZE to vmcoreinfo 2015-09-10 13:29:01 -07:00
kexec_file.c kexec: split kexec_file syscall code to kexec_file.c 2015-09-10 13:29:01 -07:00
kexec_internal.h kexec: split kexec_file syscall code to kexec_file.c 2015-09-10 13:29:01 -07:00
kexec.c kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
kmod.c kmod: don't run async usermode helper as a child of kworker thread 2015-10-23 17:55:10 +09:00
kprobes.c perf/x86/hw_breakpoints: Disallow kernel breakpoints unless kprobe-safe 2015-08-04 10:16:54 +02:00
ksysfs.c kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
kthread.c kernel/kthread.c:kthread_create_on_node(): clarify documentation 2015-09-04 16:54:41 -07:00
latencytop.c
Makefile sys_membarrier(): system-wide memory barrier (generic, x86) 2015-09-11 15:21:34 -07:00
membarrier.c sys_membarrier(): system-wide memory barrier (generic, x86) 2015-09-11 15:21:34 -07:00
memremap.c memremap: fix highmem support 2015-10-26 16:55:56 -04:00
module_signing.c PKCS#7: Appropriately restrict authenticated attributes and content type 2015-08-12 17:01:01 +01:00
module-internal.h
module.c module: Fix locking in symbol_put_addr() 2015-08-24 10:37:01 +09:30
notifier.c Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-09-01 08:40:25 -07:00
nsproxy.c
padata.c
panic.c kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path 2015-06-30 19:44:57 -07:00
params.c Minor merge needed, due to function move. 2015-07-01 10:49:25 -07:00
pid_namespace.c
pid.c rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN() 2015-07-22 15:27:32 -07:00
profile.c mm: rename alloc_pages_exact_node() to __alloc_pages_node() 2015-09-08 15:35:28 -07:00
ptrace.c seccomp: add ptrace options for suspend/resume 2015-07-15 11:52:52 -07:00
range.c
reboot.c kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
relay.c kernel/relay.c: use kvfree() in relay_free_page_array() 2015-06-30 19:44:59 -07:00
resource.c mm: enhance region_is_ram() to region_intersects() 2015-08-10 23:07:05 -04:00
seccomp.c Merge tag 'seccomp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into next 2015-07-20 17:19:19 +10:00
signal.c signal: fix information leak in copy_siginfo_to_user 2015-08-07 04:39:40 +03:00
smp.c smp: Fix error case handling in smp_call_function_*() 2015-04-19 13:19:23 -07:00
smpboot.c smpboot: allow passing the cpumask on per-cpu thread registration 2015-09-04 16:54:41 -07:00
smpboot.h
softirq.c
stacktrace.c
stop_machine.c stop_machine: Remove cpu_stop_work's from list in cpu_stop_park() 2015-08-03 12:21:28 +02:00
sys_ni.c sys_membarrier(): system-wide memory barrier (generic, x86) 2015-09-11 15:21:34 -07:00
sys.c vfs: Commit to never having exectuables on proc and sysfs. 2015-07-10 10:39:25 -05:00
sysctl_binary.c
sysctl.c sysctl: fix int -> unsigned long assignments in INT_MIN case 2015-09-10 13:29:01 -07:00
task_work.c task_work: remove fifo ordering guarantee 2015-09-05 13:46:58 -07:00
taskstats.c
test_kprobes.c
torture.c rcu: Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() 2015-05-27 12:56:15 -07:00
tracepoint.c
tsacct.c
uid16.c
up.c
user_namespace.c capabilities: ambient capabilities 2015-09-04 16:54:41 -07:00
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog.c watchdog: rename watchdog_suspend() and watchdog_resume() 2015-09-04 16:54:41 -07:00
workqueue_internal.h
workqueue.c workqueue: make sure delayed work run in local cpu 2015-09-30 13:06:46 -04:00