Discussion:
[PATCH v11 0/3] printk: Make printk() completely async
(too old to reply)
Sergey Senozhatsky
2016-04-07 16:34:16 UTC
Permalink
Hello,

This patch set makes printk() completely asynchronous: new messages
are getting upended to the kernel printk buffer, but instead of 'direct'
printing the actual print job is performed by a dedicated kthread.
This has the advantage that printing always happens from a schedulable
context and thus we don't lockup any particular CPU or even interrupts.

against next-20160407

v11:
-- switch default to sync printk
-- make `synchronous' param RW (Andrew, Jan)
-- set RT priority to printk kthread (Andrew)
-- correct comments (Andrew)

v10:
-- simplify printk_kthread_need_flush_console (Jan, Petr)

v9:
-- move need_flush_console assignment down in vprintk_emit (Jan)
-- simplify need_flush_console assignment rules (Petr)
-- clear need_flush_console in printing function (Petr)
-- rename need_flush_console (Petr)

v8:
-- rename kthread printing function (Petr)
-- clear need_flush_console in console_unlock() under logbuf (Petr)

v7:
-- do not set global printk_sync in panic in vrintk_emit() (Petr)
-- simplify vprintk_emit(). drop some of local variables (Petr)
-- move handling of LOGLEVEL_SCHED messages back to printk_deferred()
so we wake_up_process()/console_trylock() in vprintk_emit() only
for !in_sched messages

v6:
-- move wake_up_process out of logbuf lock (Jan, Byungchul)
-- do not disable async printk in recursion handling code.
-- rebase against next-20160321 (w/NMI patches)

v5:
-- make printk.synchronous RO (Petr)
-- make printing_func() correct and do not use wait_queue (Petr)
-- do not panic() when can't allocate printing thread (Petr)
-- do not wake_up_process() only in IRQ, prefer vprintk_emit() (Jan)
-- move wake_up_klogd_work_func() to a separate patch (Petr)
-- move wake_up_process() under logbuf lock so printk recursion logic can
help us out
-- switch to sync_print mode if printk recursion occured
-- drop "printk: Skip messages on oops" patch

v4:
-- do not directly wake_up() the printing kthread from vprintk_emit(), need
to go via IRQ->wake_up() to avoid sched deadlocks (Jan)

v3:
-- use a dedicated kthread for printing instead of using wq (Jan, Tetsuo, Tejun)

v2:
- use dedicated printk workqueue with WQ_MEM_RECLAIM bit
- fallback to system-wide workqueue only if allocation of printk_wq has
failed
- do not use system_wq as a fallback wq. both console_lock() and onsole_unlock()
can spend a significant amount of time; so we need to use system_long_wq.
- rework sync/!sync detection logic
a) we can have deferred (in_sched) messages before we allocate printk_wq,
so the only way to handle those messages is via IRQ context
b) even in printk.synchronous mode, deferred messages must not be printed
directly, and should go via IRQ context
c) even if we allocated printk_wq and have !sync_printk mode, we must route
deferred messages via IRQ context
- so this adds additional bool flags to vprint_emit() and introduces a new
pending bit to `printk_pending'
- fix build on !PRINTK configs


Jan Kara (2):
printk: Make printk() completely async
printk: Make wake_up_klogd_work_func() async

Sergey Senozhatsky (1):
printk: make printk.synchronous param rw

Documentation/kernel-parameters.txt | 12 +++
kernel/printk/printk.c | 155 +++++++++++++++++++++++++++++++++---
2 files changed, 157 insertions(+), 10 deletions(-)
--
2.8.0
Sergey Senozhatsky
2016-04-07 16:34:31 UTC
Permalink
From: Jan Kara <***@suse.cz>

Currently, printk() sometimes waits for message to be printed to console
and sometimes it does not (when console_sem is held by some other
process). In case printk() grabs console_sem and starts printing to
console, it prints messages from kernel printk buffer until the buffer
is empty. When serial console is attached, printing is slow and thus
other CPUs in the system have plenty of time to append new messages to
the buffer while one CPU is printing. Thus the CPU can spend unbounded
amount of time doing printing in console_unlock(). This is especially
serious problem if the printk() calling console_unlock() was called with
interrupts disabled.

In practice users have observed a CPU can spend tens of seconds printing
in console_unlock() (usually during boot when hundreds of SCSI devices
are discovered) resulting in RCU stalls (CPU doing printing doesn't
reach quiescent state for a long time), softlockup reports (IPIs for the
printing CPU don't get served and thus other CPUs are spinning waiting
for the printing CPU to process IPIs), and eventually a machine death
(as messages from stalls and lockups append to printk buffer faster than
we are able to print). So these machines are unable to boot with serial
console attached. Another observed issue is that due to slow printk,
hardware discovery is slow and udev times out before kernel manages to
discover all the attached HW. Also during artificial stress testing SATA
disk disappears from the system because its interrupts aren't served for
too long.

This patch makes printk() completely asynchronous (similar to what
printk_deferred() did until now). It appends message to the kernel
printk buffer and wake_up()s a special dedicated kthread to do the
printing to console. This has the advantage that printing always happens
from a schedulable contex and thus we don't lockup any particular CPU or
even interrupts. Also it has the advantage that printk() is fast and
thus kernel booting is not slowed down by slow serial console.
Disadvantage of this method is that in case of crash there is higher
chance that important messages won't appear in console output (we may
need working scheduling to print message to console). We somewhat
mitigate this risk by switching printk to the original method of
immediate printing to console if oops is in progress.

Async printk, for the time being, is considered to be less reliable
than the synchronous one, so by default we keep printk operating in
synchronous mode. There is a printk.synchronous kernel parameter which
permits to select sync/async mode as a boot parameter or later on
from user space via sysfs knob.

printk() is expected to work under different conditions and in different
scenarios, including corner cases of OOM when all of the workers are busy
(e.g. allocating memory), thus printk() uses its own dedicated printing
kthread, rather than relying on workqueue (even with WQ_MEM_RECLAIM bit
set we potentially can receive delays in printing until workqueue
declares a ->mayday, as noted by Tetsuo Handa).

Signed-off-by: Jan Kara <***@suse.cz>
Signed-off-by: Sergey Senozhatsky <***@gmail.com>
---
Documentation/kernel-parameters.txt | 12 +++++
kernel/printk/printk.c | 99 ++++++++++++++++++++++++++++++++++---
2 files changed, 104 insertions(+), 7 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 6152ee4..17b9cb3 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3125,6 +3125,18 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
printk.time= Show timing data prefixed to each printk message line
Format: <bool> (1/Y/y=enable, 0/N/n=disable)

+ printk.synchronous=
+ This parameter controls whether kernel messages must be
+ printed to console synchronously or asynchronously.
+ Asynchronous printing avoids kernel stalling behind
+ slow serial console and thus avoids softlockups,
+ interrupt timeouts, or userspace timing out during
+ heavy printing. printk switches back to synchronous mode
+ during early boot or when oops is happening. For the
+ time being, synchronous mode considered to be more
+ reliable, besides, for debugging, printing messages to
+ console immediately (synchronous mode) is desirable.
+
processor.max_cstate= [HW,ACPI]
Limit processor to maximum C-state
max_cstate=9 overrides any DMI blacklist limit.
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index bfbf284..64a98ea 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -46,6 +46,8 @@
#include <linux/utsname.h>
#include <linux/ctype.h>
#include <linux/uio.h>
+#include <linux/kthread.h>
+#include <linux/sched/rt.h>

#include <asm/uaccess.h>
#include <asm-generic/sections.h>
@@ -284,6 +286,16 @@ static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
static char *log_buf = __log_buf;
static u32 log_buf_len = __LOG_BUF_LEN;

+/* Control whether printing to console must be synchronous. */
+static bool __read_mostly printk_sync = true;
+module_param_named(synchronous, printk_sync, bool, S_IRUGO);
+MODULE_PARM_DESC(synchronous, "make printing to console synchronous");
+
+/* Printing kthread for async printk */
+static struct task_struct *printk_kthread;
+/* When `true' printing thread has messages to print */
+static bool printk_kthread_need_flush_console;
+
/* Return log buffer address */
char *log_buf_addr_get(void)
{
@@ -1608,6 +1620,8 @@ asmlinkage int vprintk_emit(int facility, int level,
const char *dict, size_t dictlen,
const char *fmt, va_list args)
{
+ /* cpu currently holding logbuf_lock in this function */
+ static unsigned int logbuf_cpu = UINT_MAX;
static bool recursion_bug;
static char textbuf[LOG_LINE_MAX];
char *text = textbuf;
@@ -1617,8 +1631,6 @@ asmlinkage int vprintk_emit(int facility, int level,
int this_cpu;
int printed_len = 0;
bool in_sched = false;
- /* cpu currently holding logbuf_lock in this function */
- static unsigned int logbuf_cpu = UINT_MAX;

if (level == LOGLEVEL_SCHED) {
level = LOGLEVEL_DEFAULT;
@@ -1757,12 +1769,35 @@ asmlinkage int vprintk_emit(int facility, int level,
if (!in_sched) {
lockdep_off();
/*
- * Try to acquire and then immediately release the console
- * semaphore. The release will print out buffers and wake up
- * /dev/kmsg and syslog() users.
+ * Attempt to print the messages to console asynchronously so
+ * that the kernel doesn't get stalled due to slow serial
+ * console. That can lead to softlockups, lost interrupts, or
+ * userspace timing out under heavy printing load.
+ *
+ * However we resort to synchronous printing of messages during
+ * early boot, when synchronous printing was explicitly
+ * requested by a kernel parameter, or when console_verbose()
+ * was called to print everything during panic / oops.
+ * Unlike bust_spinlocks() and oops_in_progress,
+ * console_verbose() sets console_loglevel to MOTORMOUTH and
+ * never clears it, while oops_in_progress can go back to 0,
+ * switching printk back to async mode; we want printk to
+ * operate in sync mode once panic() occurred.
*/
- if (console_trylock())
- console_unlock();
+ if (console_loglevel != CONSOLE_LOGLEVEL_MOTORMOUTH &&
+ printk_kthread) {
+ /* Offload printing to a schedulable context. */
+ printk_kthread_need_flush_console = true;
+ wake_up_process(printk_kthread);
+ } else {
+ /*
+ * Try to acquire and then immediately release the
+ * console semaphore. The release will print out
+ * buffers and wake up /dev/kmsg and syslog() users.
+ */
+ if (console_trylock())
+ console_unlock();
+ }
lockdep_on();
}

@@ -2722,6 +2757,56 @@ static int __init printk_late_init(void)
late_initcall(printk_late_init);

#if defined CONFIG_PRINTK
+static int printk_kthread_func(void *data)
+{
+ while (1) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ if (!printk_kthread_need_flush_console)
+ schedule();
+
+ __set_current_state(TASK_RUNNING);
+ /*
+ * Avoid an infinite loop when console_unlock() cannot
+ * access consoles, e.g. because console_suspended is
+ * true. schedule(), someone else will print the messages
+ * from resume_console().
+ */
+ printk_kthread_need_flush_console = false;
+
+ console_lock();
+ console_unlock();
+ }
+
+ return 0;
+}
+
+/*
+ * Init async printk via late_initcall, after core/arch/device/etc.
+ * initialization.
+ */
+static int __init init_printk_kthread(void)
+{
+ struct task_struct *thread;
+ struct sched_param param = {
+ .sched_priority = MAX_RT_PRIO - 1,
+ };
+
+ if (printk_sync)
+ return 0;
+
+ thread = kthread_run(printk_kthread_func, NULL, "printk");
+ if (IS_ERR(thread)) {
+ pr_err("printk: unable to create printing thread\n");
+ printk_sync = true;
+ return PTR_ERR(thread);
+ }
+
+ sched_setscheduler(thread, SCHED_FIFO, &param);
+ printk_kthread = thread;
+ return 0;
+}
+late_initcall(init_printk_kthread);
+
/*
* Delayed printk version, for scheduler-internal messages:
*/
--
2.8.0
Sergey Senozhatsky
2016-04-07 16:34:31 UTC
Permalink
From: Jan Kara <***@suse.cz>

Offload printing of scheduler deferred messages from IRQ context
to a schedulable printing kthread, when possible (the same way we
do it in vprintk_emit()). Otherwise, the CPU can spend unbounded
amount of time doing printing in console_unlock() from IRQ.

Signed-off-by: Jan Kara <***@suse.cz>
Signed-off-by: Sergey Senozhatsky <***@gmail.com>
Reviewed-by: Petr Mladek <***@suse.com>
---
kernel/printk/printk.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 64a98ea..89f5441 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2820,9 +2820,16 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
int pending = __this_cpu_xchg(printk_pending, 0);

if (pending & PRINTK_PENDING_OUTPUT) {
- /* If trylock fails, someone else is doing the printing */
- if (console_trylock())
- console_unlock();
+ if (printk_kthread) {
+ wake_up_process(printk_kthread);
+ } else {
+ /*
+ * If trylock fails, someone else is doing
+ * the printing
+ */
+ if (console_trylock())
+ console_unlock();
+ }
}

if (pending & PRINTK_PENDING_WAKEUP)
--
2.8.0
Sergey Senozhatsky
2016-04-07 16:34:32 UTC
Permalink
Change `synchronous' printk param to be RW, so user space
can change printk mode back and forth to/from sync mode
(which is considered to be more reliable).

Signed-off-by: Sergey Senozhatsky <***@gmail.com>
---
kernel/printk/printk.c | 63 ++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 53 insertions(+), 10 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 89f5441..5ae6f73 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -288,9 +288,6 @@ static u32 log_buf_len = __LOG_BUF_LEN;

/* Control whether printing to console must be synchronous. */
static bool __read_mostly printk_sync = true;
-module_param_named(synchronous, printk_sync, bool, S_IRUGO);
-MODULE_PARM_DESC(synchronous, "make printing to console synchronous");
-
/* Printing kthread for async printk */
static struct task_struct *printk_kthread;
/* When `true' printing thread has messages to print */
@@ -1785,7 +1782,7 @@ asmlinkage int vprintk_emit(int facility, int level,
* operate in sync mode once panic() occurred.
*/
if (console_loglevel != CONSOLE_LOGLEVEL_MOTORMOUTH &&
- printk_kthread) {
+ !printk_sync && printk_kthread) {
/* Offload printing to a schedulable context. */
printk_kthread_need_flush_console = true;
wake_up_process(printk_kthread);
@@ -2757,6 +2754,16 @@ static int __init printk_late_init(void)
late_initcall(printk_late_init);

#if defined CONFIG_PRINTK
+/*
+ * kernel_param_ops.set is called from two places:
+ * - from parse_args()->printk_sync_set(), when we can't kthread_run(), so
+ * we just set the param value. The actual initalization happens later,
+ * from late_initcall().
+ *
+ * - from user space via sysfs knob; we can kthread_run() there (if needed).
+ */
+static bool printk_initcall_done;
+
static int printk_kthread_func(void *data)
{
while (1) {
@@ -2780,11 +2787,7 @@ static int printk_kthread_func(void *data)
return 0;
}

-/*
- * Init async printk via late_initcall, after core/arch/device/etc.
- * initialization.
- */
-static int __init init_printk_kthread(void)
+static int __init_printk_kthread(void)
{
struct task_struct *thread;
struct sched_param param = {
@@ -2794,6 +2797,9 @@ static int __init init_printk_kthread(void)
if (printk_sync)
return 0;

+ if (printk_kthread)
+ return 0;
+
thread = kthread_run(printk_kthread_func, NULL, "printk");
if (IS_ERR(thread)) {
pr_err("printk: unable to create printing thread\n");
@@ -2805,6 +2811,43 @@ static int __init init_printk_kthread(void)
printk_kthread = thread;
return 0;
}
+
+static int printk_sync_set(const char *val, const struct kernel_param *kp)
+{
+ static DEFINE_MUTEX(printk_sync_lock);
+ int ret;
+
+ mutex_lock(&printk_sync_lock);
+ ret = param_set_bool(val, kp);
+ if (ret) {
+ mutex_unlock(&printk_sync_lock);
+ return ret;
+ }
+
+ if (printk_initcall_done)
+ ret = __init_printk_kthread();
+ mutex_unlock(&printk_sync_lock);
+ return ret;
+}
+
+static const struct kernel_param_ops param_ops_printk_sync = {
+ .set = printk_sync_set,
+ .get = param_get_bool,
+};
+
+module_param_cb(synchronous, &param_ops_printk_sync, &printk_sync,
+ S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(synchronous, "make printing to console synchronous");
+
+/*
+ * Init async printk via late_initcall, after core/arch/device/etc.
+ * initialization.
+ */
+static __init int init_printk_kthread(void)
+{
+ printk_initcall_done = true;
+ return __init_printk_kthread();
+}
late_initcall(init_printk_kthread);

/*
@@ -2820,7 +2863,7 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
int pending = __this_cpu_xchg(printk_pending, 0);

if (pending & PRINTK_PENDING_OUTPUT) {
- if (printk_kthread) {
+ if (!printk_sync && printk_kthread) {
wake_up_process(printk_kthread);
} else {
/*
--
2.8.0
Pan Xinhui
2016-04-08 04:06:17 UTC
Permalink
Post by Sergey Senozhatsky
Change `synchronous' printk param to be RW, so user space
can change printk mode back and forth to/from sync mode
(which is considered to be more reliable).
---
kernel/printk/printk.c | 63 ++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 53 insertions(+), 10 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 89f5441..5ae6f73 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -288,9 +288,6 @@ static u32 log_buf_len = __LOG_BUF_LEN;
/* Control whether printing to console must be synchronous. */
static bool __read_mostly printk_sync = true;
-module_param_named(synchronous, printk_sync, bool, S_IRUGO);
-MODULE_PARM_DESC(synchronous, "make printing to console synchronous");
-
/* Printing kthread for async printk */
static struct task_struct *printk_kthread;
/* When `true' printing thread has messages to print */
@@ -1785,7 +1782,7 @@ asmlinkage int vprintk_emit(int facility, int level,
* operate in sync mode once panic() occurred.
*/
if (console_loglevel != CONSOLE_LOGLEVEL_MOTORMOUTH &&
- printk_kthread) {
+ !printk_sync && printk_kthread) {
/* Offload printing to a schedulable context. */
printk_kthread_need_flush_console = true;
wake_up_process(printk_kthread);
@@ -2757,6 +2754,16 @@ static int __init printk_late_init(void)
late_initcall(printk_late_init);
#if defined CONFIG_PRINTK
+/*
+ * - from parse_args()->printk_sync_set(), when we can't kthread_run(), so
+ * we just set the param value. The actual initalization happens later,
+ * from late_initcall().
+ *
+ * - from user space via sysfs knob; we can kthread_run() there (if needed).
+ */
+static bool printk_initcall_done;
+
static int printk_kthread_func(void *data)
{
while (1) {
@@ -2780,11 +2787,7 @@ static int printk_kthread_func(void *data)
return 0;
}
-/*
- * Init async printk via late_initcall, after core/arch/device/etc.
- * initialization.
- */
-static int __init init_printk_kthread(void)
+static int __init_printk_kthread(void)
{
struct task_struct *thread;
struct sched_param param = {
@@ -2794,6 +2797,9 @@ static int __init init_printk_kthread(void)
if (printk_sync)
return 0;
+ if (printk_kthread)
+ return 0;
+
thread = kthread_run(printk_kthread_func, NULL, "printk");
if (IS_ERR(thread)) {
pr_err("printk: unable to create printing thread\n");
@@ -2805,6 +2811,43 @@ static int __init init_printk_kthread(void)
printk_kthread = thread;
return 0;
}
+
+static int printk_sync_set(const char *val, const struct kernel_param *kp)
+{
+ static DEFINE_MUTEX(printk_sync_lock);
+ int ret;
+
+ mutex_lock(&printk_sync_lock);
+ ret = param_set_bool(val, kp);
+ if (ret) {
+ mutex_unlock(&printk_sync_lock);
+ return ret;
+ }
+
+ if (printk_initcall_done)
+ ret = __init_printk_kthread();
+ mutex_unlock(&printk_sync_lock);
+ return ret;
+}
+
+static const struct kernel_param_ops param_ops_printk_sync = {
+ .set = printk_sync_set,
+ .get = param_get_bool,
+};
+
+module_param_cb(synchronous, &param_ops_printk_sync, &printk_sync,
+ S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(synchronous, "make printing to console synchronous");
+
+/*
+ * Init async printk via late_initcall, after core/arch/device/etc.
+ * initialization.
+ */
+static __init int init_printk_kthread(void)
+{
+ printk_initcall_done = true;
+ return __init_printk_kthread();
hello,

One confusion, Why not use a lock to protect __init_printk_kthread from parallel call? Otherwise I think there is a race.
But for simplicity, maybe you could write codes as below.

+ int ret = __init_printk_kthread();
+ printk_initcall_done = true;
+ return ret;

In my opinion, using a lock is better.

thanks
xinhui
Post by Sergey Senozhatsky
+}
late_initcall(init_printk_kthread);
/*
@@ -2820,7 +2863,7 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
int pending = __this_cpu_xchg(printk_pending, 0);
if (pending & PRINTK_PENDING_OUTPUT) {
- if (printk_kthread) {
+ if (!printk_sync && printk_kthread) {
wake_up_process(printk_kthread);
} else {
/*
Sergey Senozhatsky
2016-04-08 05:28:06 UTC
Permalink
On (04/08/16 12:04), Pan Xinhui wrote:
[..]
Post by Pan Xinhui
Post by Sergey Senozhatsky
+/*
+ * Init async printk via late_initcall, after core/arch/device/etc.
+ * initialization.
+ */
+static __init int init_printk_kthread(void)
+{
+ printk_initcall_done = true;
+ return __init_printk_kthread();
hello,
One confusion, Why not use a lock to protect __init_printk_kthread from parallel call? Otherwise I think there is a race.
But for simplicity, maybe you could write codes as below.
+ int ret = __init_printk_kthread();
+ printk_initcall_done = true;
+ return ret;
In my opinion, using a lock is better.
Hello,

I though about this, but isn't late_initcall() happening before kernel
starts /sbin/init? who can race with

late_initcall() -> init_printk_kthread() -> __init_printk_kthread()?

looking at

static int __ref kernel_init(void *unused)
{
int ret;

kernel_init_freeable();
/* need to finish all async __init code before freeing the memory */
async_synchronize_full();
free_initmem();
.

if (!try_to_run_init_process("/sbin/init") ||
!try_to_run_init_process("/etc/init") ||
!try_to_run_init_process("/bin/init") ||
!try_to_run_init_process("/bin/sh"))
return 0;

__init (and init_printk_kthread is __init) is finished and freed by the
time kernel try_to_run_init_process. isn't it?


sysfs knob -> __init_printk_kthread() is protected by printk_sync_lock
mutex, obviously there can be parallel calls from user space.

-ss
Pan Xinhui
2016-04-08 05:46:19 UTC
Permalink
Post by Sergey Senozhatsky
[..]
Post by Pan Xinhui
Post by Sergey Senozhatsky
+/*
+ * Init async printk via late_initcall, after core/arch/device/etc.
+ * initialization.
+ */
+static __init int init_printk_kthread(void)
+{
+ printk_initcall_done = true;
+ return __init_printk_kthread();
hello,
One confusion, Why not use a lock to protect __init_printk_kthread from parallel call? Otherwise I think there is a race.
But for simplicity, maybe you could write codes as below.
+ int ret = __init_printk_kthread();
+ printk_initcall_done = true;
+ return ret;
In my opinion, using a lock is better.
Hello,
I though about this, but isn't late_initcall() happening before kernel
starts /sbin/init? who can race with
late_initcall() -> init_printk_kthread() -> __init_printk_kthread()?
yep, you are right. I made a mistake.
Post by Sergey Senozhatsky
looking at
static int __ref kernel_init(void *unused)
{
int ret;
kernel_init_freeable();
/* need to finish all async __init code before freeing the memory */
async_synchronize_full();
free_initmem();
..
if (!try_to_run_init_process("/sbin/init") ||
!try_to_run_init_process("/etc/init") ||
!try_to_run_init_process("/bin/init") ||
!try_to_run_init_process("/bin/sh"))
return 0;
__init (and init_printk_kthread is __init) is finished and freed by the
time kernel try_to_run_init_process. isn't it?
good explanation. thanks
Post by Sergey Senozhatsky
sysfs knob -> __init_printk_kthread() is protected by printk_sync_lock
mutex, obviously there can be parallel calls from user space.
-ss
Petr Mladek
2016-04-21 11:07:34 UTC
Permalink
Post by Sergey Senozhatsky
[..]
Post by Pan Xinhui
Post by Sergey Senozhatsky
+/*
+ * Init async printk via late_initcall, after core/arch/device/etc.
+ * initialization.
+ */
+static __init int init_printk_kthread(void)
+{
+ printk_initcall_done = true;
+ return __init_printk_kthread();
hello,
One confusion, Why not use a lock to protect __init_printk_kthread from parallel call? Otherwise I think there is a race.
But for simplicity, maybe you could write codes as below.
+ int ret = __init_printk_kthread();
+ printk_initcall_done = true;
+ return ret;
In my opinion, using a lock is better.
Hello,
I though about this, but isn't late_initcall() happening before kernel
starts /sbin/init? who can race with
late_initcall() -> init_printk_kthread() -> __init_printk_kthread()?
looking at
static int __ref kernel_init(void *unused)
{
int ret;
kernel_init_freeable();
/* need to finish all async __init code before freeing the memory */
async_synchronize_full();
free_initmem();
..
if (!try_to_run_init_process("/sbin/init") ||
!try_to_run_init_process("/etc/init") ||
!try_to_run_init_process("/bin/init") ||
!try_to_run_init_process("/bin/sh"))
return 0;
__init (and init_printk_kthread is __init) is finished and freed by the
time kernel try_to_run_init_process. isn't it?
Please, what is the purpose of "printk_initcall_done" then? If I get
this correctly, it will always be true when printk_sync_set() is
called.
Post by Sergey Senozhatsky
sysfs knob -> __init_printk_kthread() is protected by printk_sync_lock
mutex, obviously there can be parallel calls from user space.
I think that this could not happen. We have discussed similar problem
with livepatching some time ago. AFAIK, writes to sysfs are
synchronized out of box.

Huh, the code is not trivial, so I rather generated a backtrace to be
on the safe side:

[ 370.628592] WARNING: CPU: 2 PID: 3768 at kernel/printk/printk.c:2827 printk_sync_set+0x42/0x70
[ 370.628593] Modules linked in:
[ 370.628597] CPU: 2 PID: 3768 Comm: bash Not tainted 4.6.0-rc4-4-default+ #2691
[ 370.628599] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 370.628600] 0000000000000000 ffff88003709fd28 ffffffff814206fc 0000000000000000
[ 370.628604] 0000000000000000 ffff88003709fd68 ffffffff8106b4e1 00000b0b00000246
[ 370.628606] 0000000000000000 ffffffff81dbce18 ffff88013923a3e0 ffff88013a705c80
[ 370.628609] Call Trace:
[ 370.628614] [<ffffffff814206fc>] dump_stack+0x85/0xc9
[ 370.628617] [<ffffffff8106b4e1>] __warn+0xd1/0xf0
[ 370.628619] [<ffffffff8106b5cd>] warn_slowpath_null+0x1d/0x20
[ 370.628621] [<ffffffff810d4112>] printk_sync_set+0x42/0x70
[ 370.628624] [<ffffffff8108ecfa>] param_attr_store+0x6a/0xd0
[ 370.628627] [<ffffffff8108dffd>] module_attr_store+0x1d/0x30
[ 370.628631] [<ffffffff8129ad44>] sysfs_kf_write+0x44/0x60
[ 370.628634] [<ffffffff8129a344>] kernfs_fop_write+0x144/0x190
[ 370.628637] [<ffffffff81214e08>] __vfs_write+0x28/0xe0
[ 370.628641] [<ffffffff810c04e9>] ? percpu_down_read+0x49/0x80
[ 370.628643] [<ffffffff81218fc1>] ? __sb_start_write+0xd1/0xf0
[ 370.628645] [<ffffffff81218fc1>] ? __sb_start_write+0xd1/0xf0
[ 370.628647] [<ffffffff81215d12>] vfs_write+0xa2/0x1a0
[ 370.628650] [<ffffffff812362b6>] ? __fget_light+0x66/0x90
[ 370.628652] [<ffffffff81217089>] SyS_write+0x49/0xa0
[ 370.628656] [<ffffffff8191c37c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 370.628658] ---[ end trace eb22d849233b4c4c ]---

and the writes are really synchronized. At least here:

kernfs_fop_write()
mutex_lock(&of->mutex)
ops->write()
mutex_unlock(&of->mutex);

It means that the printk_sync_lock is superfluous.


Otherwise the patch looks fine. I am bit concerned because
the synchronization between the setting and testing of
printk_sync/printk_kthread is a bit weak. But it works
because we always either wakeup the kthread or call
the console directly. So, we are on the safe side.


Best Regards,
Petr

PS: I am sorry for the late comment. I was not able to do so
immediately and I wrongly marked the mail as read :-(
Sergey Senozhatsky
2016-04-22 01:27:09 UTC
Permalink
Hello,

On (04/21/16 13:07), Petr Mladek wrote:
[..]
Post by Petr Mladek
Please, what is the purpose of "printk_initcall_done" then? If I get
this correctly, it will always be true when printk_sync_set() is
called.
well, this is a bit ugly, yes. kernel_param_ops defines ->set callback
as printk_sync_set(). and this ->set callback is getting called from 2
different paths (but it's really about underlying __init_printk_kthread()).

__init_printk_kthread() can be executed from:


1) when command line is getting parsed, and we have printk.synchronous=[0|1]

[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.6.0-rc4-next-20160421 ... printk.synchronous=0
[..]
[ 0.000000] [<ffffffff8106857b>] printk_sync_set+0x12/0x52
[ 0.000000] [<ffffffff8104d9f7>] parse_args+0x1ad/0x2bb
[ 0.000000] [<ffffffff8106a3b8>] ? vprintk_default+0x18/0x1a
[ 0.000000] [<ffffffff8188fbda>] start_kernel+0x175/0x378
[ 0.000000] [<ffffffff8188f852>] ? set_init_arg+0x55/0x55
[ 0.000000] [<ffffffff8188f28e>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff8188f3c8>] x86_64_start_kernel+0x138/0x148

can't invoke __init_printk_kthread().



2) late_initcall(init_printk_kthread)

can invoke __init_printk_kthread() at this point.



2) when write to /sys/module/printk/parameters/synchronous happens from user space

[ 65.441956] [<ffffffff8106857b>] printk_sync_set+0x12/0x52
[ 65.441959] [<ffffffff8104db6a>] param_attr_store+0x65/0x8b
[ 65.441960] [<ffffffff8104cf7d>] module_attr_store+0x19/0x25
[ 65.441963] [<ffffffff811411da>] sysfs_kf_write+0x36/0x38
[ 65.441964] [<ffffffff81140657>] kernfs_fop_write+0xe8/0x12e
[ 65.441966] [<ffffffff810f2535>] __vfs_write+0x21/0xc3
[ 65.441967] [<ffffffff810f1888>] ? filp_close+0x57/0x61
[ 65.441969] [<ffffffff81064ed9>] ? percpu_down_read+0xe/0x37
[ 65.441970] [<ffffffff810f2751>] vfs_write+0xb9/0x143
[ 65.441971] [<ffffffff810f28a8>] SyS_write+0x49/0x84
[ 65.441974] [<ffffffff8144959b>] entry_SYSCALL_64_fastpath+0x13/0x8f

can invoke __init_printk_kthread().



alternatively, I had this idea to re-define ->set() callback in init_printk_kthread().

IOW, by default we use param_set_bool(), so parse_args() will not cause any
problems:

static /*** can't 'const' anymore ***/ struct kernel_param_ops param_ops_printk_sync = {
.set = param_set_bool,
.get = param_get_bool,
};

and change it to printk_sync_set() in:

static __init int init_printk_kthread(void)
{
param_ops_printk_sync.set = printk_sync_set;
return __init_printk_kthread();
}

but I think that this bool flag is simpler to understand after all.
Post by Petr Mladek
Post by Sergey Senozhatsky
sysfs knob -> __init_printk_kthread() is protected by printk_sync_lock
mutex, obviously there can be parallel calls from user space.
I think that this could not happen. We have discussed similar problem
with livepatching some time ago. AFAIK, writes to sysfs are
synchronized out of box.
oh, indeed.

kernfs_fop_write(struct file *file)
{
..
mutex_lock(&((struct seq_file *)file->private_data)->mutex);
ops->write();
mutex_unlock(&((struct seq_file *)file->private_data)->mutex);
..
}

good to know! will drop the mutex and re-spin.

[..]
Post by Petr Mladek
Otherwise the patch looks fine. I am bit concerned because
the synchronization between the setting and testing of
printk_sync/printk_kthread is a bit weak. But it works
because we always either wakeup the kthread or call
the console directly. So, we are on the safe side.
thanks.
Post by Petr Mladek
PS: I am sorry for the late comment. I was not able to do so
immediately and I wrongly marked the mail as read :-(
no prob!

-ss
Petr Mladek
2016-04-22 08:41:21 UTC
Permalink
Post by Sergey Senozhatsky
Hello,
[..]
Post by Petr Mladek
Please, what is the purpose of "printk_initcall_done" then? If I get
this correctly, it will always be true when printk_sync_set() is
called.
well, this is a bit ugly, yes. kernel_param_ops defines ->set callback
as printk_sync_set(). and this ->set callback is getting called from 2
different paths (but it's really about underlying __init_printk_kthread()).
Ah, I see and feel shame. It is actually explained in the comment
above printk_initcall_done declaration. Well, the explanation confused
me a bit ;-) I suggest to change it sligtly:

/*
* printk_sync_set() can be called from two places:
*
* - early from start_kernel()->parse_args(). But we can't kthread_run()
* at this stage, so we just set the param value. The actual
* initalization happens later, from the late_initcall().
*
* - even later from user space via sysfs knob. We can kthread_run()
* there if needed.
*/

Or we could write this even more explicitely:

/*
* Prevent starting kthreads from start_kernel()->parse_args(). It is not
* possible at this stage. Instead do so via the inticall.
*/
static bool printk_initcall_done;

In each case, I would move the comment and the declaration right above the
printk_sync_set().
Post by Sergey Senozhatsky
alternatively, I had this idea to re-define ->set() callback in init_printk_kthread().
IOW, by default we use param_set_bool(), so parse_args() will not cause any
static /*** can't 'const' anymore ***/ struct kernel_param_ops param_ops_printk_sync = {
.set = param_set_bool,
.get = param_get_bool,
};
static __init int init_printk_kthread(void)
{
param_ops_printk_sync.set = printk_sync_set;
return __init_printk_kthread();
}
but I think that this bool flag is simpler to understand after all.
In addition, there would be problems to handle a possible change via
the sysfs knob. The bool flag looks much better to me :-)

Thanks a lot for detailed explanation.

Best Regards,
Petr
Sergey Senozhatsky
2016-04-22 12:14:46 UTC
Permalink
Hello,
Post by Petr Mladek
Ah, I see and feel shame. It is actually explained in the comment
above printk_initcall_done declaration. Well, the explanation confused
/*
*
* - early from start_kernel()->parse_args(). But we can't kthread_run()
* at this stage, so we just set the param value. The actual
* initalization happens later, from the late_initcall().
*
* - even later from user space via sysfs knob. We can kthread_run()
* there if needed.
*/
/*
* Prevent starting kthreads from start_kernel()->parse_args(). It is not
* possible at this stage. Instead do so via the inticall.
*/
static bool printk_initcall_done;
will take a look, thanks!
Post by Petr Mladek
In each case, I would move the comment and the declaration right above the
printk_sync_set().
I'm also thinking about renaming it to `printk_kthread_can_run', feels
like this name gives better description.

will resend shortly.

-ss

Jan Kara
2016-04-20 15:16:42 UTC
Permalink
Post by Sergey Senozhatsky
Change `synchronous' printk param to be RW, so user space
can change printk mode back and forth to/from sync mode
(which is considered to be more reliable).
I finally found time to have a look. The patch looks good to me. You can
add:

Reviewed-by: Jan Kara <***@suse.cz>

Honza
--
Jan Kara <***@suse.com>
SUSE Labs, CR
Sergey Senozhatsky
2016-04-21 02:02:07 UTC
Permalink
On (04/20/16 17:16), Jan Kara wrote:
[..]
Post by Jan Kara
I finally found time to have a look. The patch looks good to me. You can
thanks.

-ss
Sergey Senozhatsky
2016-04-16 02:54:17 UTC
Permalink
Post by Sergey Senozhatsky
Hello,
This patch set makes printk() completely asynchronous: new messages
are getting upended to the kernel printk buffer, but instead of 'direct'
printing the actual print job is performed by a dedicated kthread.
This has the advantage that printing always happens from a schedulable
context and thus we don't lockup any particular CPU or even interrupts.
Hello,

Sir, is there anything else you want me to improve in this patch set?

-ss
Post by Sergey Senozhatsky
against next-20160407
-- switch default to sync printk
-- make `synchronous' param RW (Andrew, Jan)
-- set RT priority to printk kthread (Andrew)
-- correct comments (Andrew)
-- simplify printk_kthread_need_flush_console (Jan, Petr)
-- move need_flush_console assignment down in vprintk_emit (Jan)
-- simplify need_flush_console assignment rules (Petr)
-- clear need_flush_console in printing function (Petr)
-- rename need_flush_console (Petr)
-- rename kthread printing function (Petr)
-- clear need_flush_console in console_unlock() under logbuf (Petr)
-- do not set global printk_sync in panic in vrintk_emit() (Petr)
-- simplify vprintk_emit(). drop some of local variables (Petr)
-- move handling of LOGLEVEL_SCHED messages back to printk_deferred()
so we wake_up_process()/console_trylock() in vprintk_emit() only
for !in_sched messages
-- move wake_up_process out of logbuf lock (Jan, Byungchul)
-- do not disable async printk in recursion handling code.
-- rebase against next-20160321 (w/NMI patches)
-- make printk.synchronous RO (Petr)
-- make printing_func() correct and do not use wait_queue (Petr)
-- do not panic() when can't allocate printing thread (Petr)
-- do not wake_up_process() only in IRQ, prefer vprintk_emit() (Jan)
-- move wake_up_klogd_work_func() to a separate patch (Petr)
-- move wake_up_process() under logbuf lock so printk recursion logic can
help us out
-- switch to sync_print mode if printk recursion occured
-- drop "printk: Skip messages on oops" patch
-- do not directly wake_up() the printing kthread from vprintk_emit(), need
to go via IRQ->wake_up() to avoid sched deadlocks (Jan)
-- use a dedicated kthread for printing instead of using wq (Jan, Tetsuo, Tejun)
- use dedicated printk workqueue with WQ_MEM_RECLAIM bit
- fallback to system-wide workqueue only if allocation of printk_wq has
failed
- do not use system_wq as a fallback wq. both console_lock() and onsole_unlock()
can spend a significant amount of time; so we need to use system_long_wq.
- rework sync/!sync detection logic
a) we can have deferred (in_sched) messages before we allocate printk_wq,
so the only way to handle those messages is via IRQ context
b) even in printk.synchronous mode, deferred messages must not be printed
directly, and should go via IRQ context
c) even if we allocated printk_wq and have !sync_printk mode, we must route
deferred messages via IRQ context
- so this adds additional bool flags to vprint_emit() and introduces a new
pending bit to `printk_pending'
- fix build on !PRINTK configs
printk: Make printk() completely async
printk: Make wake_up_klogd_work_func() async
printk: make printk.synchronous param rw
Documentation/kernel-parameters.txt | 12 +++
kernel/printk/printk.c | 155 +++++++++++++++++++++++++++++++++---
2 files changed, 157 insertions(+), 10 deletions(-)
--
2.8.0
Joe Perches
2016-04-16 05:45:29 UTC
Permalink
Post by Sergey Senozhatsky
Post by Sergey Senozhatsky
Hello,
This patch set makes printk() completely asynchronous: new messages
are getting upended to the kernel printk buffer, but instead of 'direct'
printing the actual print job is performed by a dedicated kthread.
This has the advantage that printing always happens from a schedulable
context and thus we don't lockup any particular CPU or even interrupts.
Hello,
Sir, is there anything else you want me to improve in this patch set?
I'm not sir, but my preference would be to move as much of the
async/thread functionality as possible into a separate file.
Sergey Senozhatsky
2016-04-21 02:13:15 UTC
Permalink
On (04/15/16 22:44), Joe Perches wrote:
[..]
Post by Joe Perches
Post by Sergey Senozhatsky
Sir, is there anything else you want me to improve in this patch set?
I'm not sir, but my preference would be to move as much of the
async/thread functionality as possible into a separate file.
hm, we are talking about some 50-60 lines of code in total (seems
that the patch set adds more comments than code), but your point
is interesting. let's say, if there will be more opinions that it
better land in async_printk.{c,h} files, then I'll take a look on
it. how does it sound?

-ss
Joe Perches
2016-04-21 02:17:27 UTC
Permalink
Post by Sergey Senozhatsky
[..]
Post by Joe Perches
Post by Sergey Senozhatsky
Sir, is there anything else you want me to improve in this patch set?
I'm not sir, but my preference would be to move as much of the
async/thread functionality as possible into a separate file.
hm, we are talking about some 50-60 lines of code in total (seems
that the patch set adds more comments than code), but your point
is interesting. let's say, if there will be more opinions that it
better land in async_printk.{c,h} files, then I'll take a look on
it. how does it sound?
I think printk.c is pretty large, complicated and should
be broken up into several bits.

I did that once, but it's a real development timing issue 
https://lkml.org/lkml/2012/10/17/41

cheers, Joe
Jan Kara
2016-04-21 13:19:38 UTC
Permalink
Post by Joe Perches
Post by Sergey Senozhatsky
[..]
Post by Joe Perches
Post by Sergey Senozhatsky
Sir, is there anything else you want me to improve in this patch set?
I'm not sir, but my preference would be to move as much of the
async/thread functionality as possible into a separate file.
hm, we are talking about some 50-60 lines of code in total (seems
that the patch set adds more comments than code), but your point
is interesting. let's say, if there will be more opinions that it
better land in async_printk.{c,h} files, then I'll take a look on
it. how does it sound?
I think printk.c is pretty large, complicated and should
be broken up into several bits.
I did that once, but it's a real development timing issue 
https://lkml.org/lkml/2012/10/17/41
As much as that may be a useful excercise, I don't think it really belongs
in this rather small patch set.

Honza
--
Jan Kara <***@suse.com>
SUSE Labs, CR
Loading...