Bug#817816: System freeze with CPU hotplug
Package: linux-image-4.4.0-1-amd64
Version: 4.4.4-1
Severity: critical
Justification: renders system unusable
When I run the following commands, the system freezes:
# echo 0 | tee /sys/devices/system/cpu/cpu*/online && echo 1 | tee /sys/devices/system/cpu/cpu*/online
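Since the freeze does not happen on every attempt, repeating the offline/online cycle may make it easier to trigger. The following is only a sketch of such a loop, not something taken from the original test: the names CPU_ROOT, ROUNDS, set_all_cpus and hotplug_cycle are hypothetical, and CPU_ROOT exists only so the functions can be tried against a scratch directory before being pointed at the real sysfs tree.

```shell
#!/bin/sh
# Sketch only: CPU_ROOT and ROUNDS are hypothetical knobs, not part of the report.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"
ROUNDS="${ROUNDS:-100}"

# Write $1 (0 = offline, 1 = online) to every cpuN/online file under CPU_ROOT.
set_all_cpus() {
    for f in "$CPU_ROOT"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue    # the glob matched nothing
        echo "$1" > "$f" || echo "write to $f failed" >&2
    done
}

# Offline and then online all hotpluggable CPUs, ROUNDS times.
hotplug_cycle() {
    i=0
    while [ "$i" -lt "$ROUNDS" ]; do
        set_all_cpus 0
        set_all_cpus 1
        i=$((i + 1))
    done
}
```

To use it on the affected machine, source the file as root and call hotplug_cycle. A failed write is reported and skipped rather than aborting the loop, which roughly matches how tee behaves in the one-line command.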
The freeze occurs intermittently while the CPUs are being offlined or onlined. The system runs in a VMware virtual machine with 4 processors. Here is the gdb output (registers, disassembly and backtrace) captured while the kernel was stuck in the infinite loop:
-----------------------------------------------------------------------------------------------------------------------[regs]
RAX: 0xFFFFFFFF8160AC40 RBX: 0x0000000000000003 RCX: 0x0000000000000000 RDX: 0x0000000000000003 o d i t s Z a P c
RSI: 0x0000000000000286 RDI: 0xFFFF88004E6B7D98 RBP: 0xFFFF88004E6B7D98 RSP: 0xFFFF88007CFCFDC0 RIP: 0xFFFFFFFF8110AD52
R8 : 0xFFFF88007F60F380 R9 : 0x0000000000000000 R10: 0xFFFFFFFF81B004C0 R11: 0x0000000000000000 R12: 0xFFFF88004E6B7DBC
R13: 0x0000000000000282 R14: 0xFFFF88007F60F300 R15: 0xFFFFFFFF8110AD10
CS: 0010 DS: 0000 ES: 0000 FS: 0000 GS: 0000 SS: 0018
-----------------------------------------------------------------------------------------------------------------------[code]
=> 0xffffffff8110ad52 <multi_cpu_stop+66>: mov ebx,DWORD PTR [rbp+0x20]
0xffffffff8110ad55 <multi_cpu_stop+69>: cmp edx,ebx
0xffffffff8110ad57 <multi_cpu_stop+71>: je 0xffffffff8110ad74 <multi_cpu_stop+100>
0xffffffff8110ad59 <multi_cpu_stop+73>: cmp ebx,0x2
0xffffffff8110ad5c <multi_cpu_stop+76>: je 0xffffffff8110ad9e <multi_cpu_stop+142>
0xffffffff8110ad5e <multi_cpu_stop+78>: cmp ebx,0x3
0xffffffff8110ad61 <multi_cpu_stop+81>: jne 0xffffffff8110ad68 <multi_cpu_stop+88>
0xffffffff8110ad63 <multi_cpu_stop+83>: test r14b,r14b
-----------------------------------------------------------------------------------------------------------------------------
multi_cpu_stop (data=0xffff88004e6b7d98) at /build/linux-xT7CCq/linux-4.4.4/kernel/stop_machine.c:197
197 /build/linux-xT7CCq/linux-4.4.4/kernel/stop_machine.c: No such file or directory.
gdb$ bt
#0 multi_cpu_stop (data=0xffff88004e6b7d98) at /build/linux-xT7CCq/linux-4.4.4/kernel/stop_machine.c:197
#1 0xffffffff8110afed in cpu_stopper_thread (cpu=<optimized out>) at /build/linux-xT7CCq/linux-4.4.4/kernel/stop_machine.c:456
#2 0xffffffff81097d49 in smpboot_thread_fn (data=0xffff88004e6b7d98) at /build/linux-xT7CCq/linux-4.4.4/kernel/smpboot.c:163
#3 0xffffffff81094dfd in kthread (_create=0xffff88007ce82100) at /build/linux-xT7CCq/linux-4.4.4/kernel/kthread.c:209
#4 0xffffffff8158ed8f in ret_from_fork () at /build/linux-xT7CCq/linux-4.4.4/arch/x86/entry/entry_64.S:486
#5 0x0000000000000000 in ?? ()
gdb$ x/x $rbp+0x20
0xffff88004e6b7db8: 0x00000003
In multi_cpu_stop(), curstate equals MULTI_STOP_RUN (the value 0x3 read above) and never seems to become MULTI_STOP_EXIT. Let me know if you need any additional information.
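For reference, the raw value printed by gdb can be mapped to an enumerator name with a small helper. This is a minimal sketch: the function name multi_stop_state_name is hypothetical, and the numbering assumes the order in which enum multi_stop_state is declared in kernel/stop_machine.c of Linux 4.4 (NONE, PREPARE, DISABLE_IRQ, RUN, EXIT).

```shell
#!/bin/sh
# Hypothetical helper: map a numeric multi_stop_state value (decimal or 0x hex)
# to its name, assuming the enum order of kernel/stop_machine.c in Linux 4.4.
multi_stop_state_name() {
    case $(( $1 )) in
        0) echo MULTI_STOP_NONE ;;
        1) echo MULTI_STOP_PREPARE ;;
        2) echo MULTI_STOP_DISABLE_IRQ ;;
        3) echo MULTI_STOP_RUN ;;
        4) echo MULTI_STOP_EXIT ;;
        *) echo "unknown state: $1" >&2; return 1 ;;
    esac
}
```

Calling multi_stop_state_name 0x00000003 with the value read at $rbp+0x20 yields MULTI_STOP_RUN, which is consistent with the cmp ebx,0x3 branch visible in the disassembly.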
Thanks