Ticket #156 (new defect)

Opened 4 years ago

Last modified 4 years ago

Crashing in SUSE when USB device disconnected during write

Reported by: niamurph
Owned by:
Milestone: libusb/libusbx 1.2.0
Component: libusb-1.0 Linux backend
Keywords: disconnect
Cc: niamurph@…
Blocked By:
Blocks:

Description (last modified by stuge)

I am having a problem with libusb 1.0.9 on SUSE Enterprise Linux SP2. When I disconnect a USB device
while writing to it, I get an intermittent crash. As the dump below shows, the itransfer structure has
num_iso_packets = 149879376, which is large enough that I presume it is corruption.

Note that this looks like it might be related to the remaining race condition described at http://libusb.org/ticket/81#comment:14
Comment 15 of that ticket says there might be a solution in 1.0.10, but I do not know whether that is being actively worked on. Of course, it is not certain that the race condition I am seeing is the same one.

Core was generated by `vxc -t -a 10.53.59.250 -d CSFniamurph2'.
Program terminated with signal 11, Segmentation fault.
#0  op_clear_transfer_priv (itransfer=0x89f9c38) at os/linux_usbfs.c:1965
1965    os/linux_usbfs.c: No such file or directory.
        in os/linux_usbfs.c
(gdb) bt
#0  op_clear_transfer_priv (itransfer=0x89f9c38) at os/linux_usbfs.c:1965
#1  0xaa1941ea in usbi_handle_disconnect (handle=0xa12d18f8) at io.c:2450
#2  0xaa1955de in op_handle_events (ctx=0x8a43920, fds=0xa12d4708, nfds=3, num_ready=0)
    at os/linux_usbfs.c:2366
#3  0xaa193607 in handle_events (ctx=0x8a43920, tv=0xa84eb014) at io.c:1944
#4  0xaa193e57 in libusb_handle_events_timeout_completed (ctx=0x8a43920, tv=0xa84eb04c, 
    completed=0xa84eb098) at io.c:2024
#5  0xaa193f19 in libusb_handle_events_completed (ctx=0x8a43920, completed=0xa84eb098)
    at io.c:2123
#6  0xaa194677 in do_sync_bulk_transfer (dev_handle=0xa12d18f8, endpoint=<optimized out>, 
    buffer=0xaa1af7d9 "\002\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\311"..., length=7129, transferred=0xa84eb0dc, timeout=1000, type=3 '\003') at sync.c:178
#7  0xaa1a7b82 in hid_write () from /opt/cisco/vxc/plugins/libCiscoKeyboardPlugin_4387.so
#8  0xaa1a6a5a in createLCDBody ()
   from /opt/cisco/vxc/plugins/libCiscoKeyboardPlugin_4387.so
#9  0xaa1a45dc in updateLCD () from /opt/cisco/vxc/plugins/libCiscoKeyboardPlugin_4387.so
#10 0xaa1a0d0d in lcdEventUpdate ()
   from /opt/cisco/vxc/plugins/libCiscoKeyboardPlugin_4387.so
#11 0xb42d2879 in start_thread () from /lib/libpthread.so.0
#12 0xb3d17ffe in clone () from /lib/libc.so.6
Backtrace stopped: Not enough registers or memory available to unwind further
(gdb) print itransfer
$1 = (struct usbi_transfer *) 0x89f9c38
(gdb) print *itransfer
$2 = {num_iso_packets = 149879376, list = {prev = 0x8a43970, next = 0x8a43970}, 
  timeout = {tv_sec = 105075, tv_usec = 368421}, transferred = 0, flags = 12 '\f', 
  lock = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = -1, __nusers = 0, {
        __spins = 0, __list = {__next = 0x0}}}, 
    __size = '\000' <repeats 12 times>"\377, \377\377\377\000\000\000\000\000\000\000", 
    __align = 0}}
(gdb)
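
One way to sanity-check the corruption guess is to look at the num_iso_packets value in hex and compare it with the pointers in the dump above. The sketch below is only an illustration: format_hex32 and abs_distance are hypothetical helpers, not part of libusb, and the idea that the field holds a stale heap pointer is an assumption that the ticket does not confirm.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 32-bit field as hex so it can be
 * compared by eye with the pointers printed by gdb. */
static void format_hex32(uint32_t value, char *out, size_t out_len)
{
    snprintf(out, out_len, "0x%08x", (unsigned int)value);
}

/* Hypothetical helper: absolute distance between two 32-bit values,
 * used only as a rough "same memory region" hint. */
static uint32_t abs_distance(uint32_t a, uint32_t b)
{
    return a > b ? a - b : b - a;
}
```

Calling format_hex32(149879376u, buf, sizeof buf) gives 0x08eefa50, which lies within a few megabytes of itransfer (0x89f9c38) and ctx (0x8a43920). That would be consistent with the structure having been freed and its memory reused, but it does not prove it.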

I modified the code to return early from op_clear_transfer_priv when the itransfer has flags=12 (interrupted and device disconnected)
to see how far I could get. (I think that change could cause a memory leak, but this was an experiment.) It still crashed, now with the following:

Program terminated with signal 11, Segmentation fault.
#0  0xaa0e0852 in libusb_submit_transfer (transfer=0x89fe03c) at io.c:1290
1290    io.c: No such file or directory.
        in io.c
(gdb) bt
#0  0xaa0e0852 in libusb_submit_transfer (transfer=0x89fe03c) at io.c:1290
#1  0xaa0e191f in libusb_control_transfer (dev_handle=0x8d83e58, bmRequestType=33 '!', 
    bRequest=9 '\t', wValue=<optimized out>, wIndex=<optimized out>, 
    data=0xaa0fac00 "\001\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336\333\336"..., wLength=<optimized out>, timeout=1000) at sync.c:98
#2  0xaa0f4b27 in hid_write () from /opt/cisco/vxc/plugins/libCiscoKeyboardPlugin_4387.so
#3  0xaa0f3a34 in createLCDBody ()
   from /opt/cisco/vxc/plugins/libCiscoKeyboardPlugin_4387.so
#4  0xaa0f15dc in updateLCD () from /opt/cisco/vxc/plugins/libCiscoKeyboardPlugin_4387.so
#5  0xaa0edd0d in lcdEventUpdate ()
   from /opt/cisco/vxc/plugins/libCiscoKeyboardPlugin_4387.so
#6  0xb421f879 in start_thread () from /lib/libpthread.so.0
#7  0xb3c64ffe in clone () from /lib/libc.so.6
Backtrace stopped: Not enough registers or memory available to unwind further
(gdb) print transfer
$1 = (struct libusb_transfer *) 0x89fe03c
(gdb) print *transfer
$2 = {dev_handle = 0x8d83e58, flags = 2 '\002', endpoint = 0 '\000', type = 0 '\000', 
  timeout = 1000, status = LIBUSB_TRANSFER_COMPLETED, length = 7137, actual_length = 0, 
  callback = 0xaa0e1590 <ctrl_transfer_cb>, user_data = 0xa8438098, 
  buffer = 0x89f8b00 "!\t\001\002", num_iso_packets = 0, iso_packet_desc = 0x89fe064}
(gdb) print *transfer->dv_handle
There is no member named dv_handle.
(gdb) print *transfer->dev_handle
$3 = {lock = {__data = {__lock = 143085064, __count = 33, __owner = -1278254104, 
      __kind = 144697920, __nusers = 144697928, {__spins = 144698128, __list = {
          __next = 0x89feb10}}}, 
    __size = "\bN\207\b!\000\000\000\350c\317\263@\352\237\bH\352\237\b\020\353\237\b", 
    __align = 143085064}, claimed_interfaces = 0, list = {prev = 0x0, next = 0x20}, 
  dev = 0x48, os_priv = 0x8d83e80 ""}
(gdb)
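
For reference, the flags=12 value tested in the experiment above can be decoded bit by bit. The sketch below assumes the internal usbi_transfer_flags bits in libusb 1.0.9 (libusbi.h) are laid out as shown; the SKETCH_ names and the helper function are hypothetical, and the bit values should be checked against the actual source before relying on them.

```c
#include <stdint.h>

/* Assumed mirror of the internal usbi_transfer_flags bits in libusb 1.0.9.
 * The names are local to this sketch; verify the values in libusbi.h. */
enum sketch_transfer_flags {
    SKETCH_TIMED_OUT           = 1 << 0,
    SKETCH_OS_HANDLES_TIMEOUT  = 1 << 1,
    SKETCH_CANCELLING          = 1 << 2,
    SKETCH_DEVICE_DISAPPEARED  = 1 << 3
};

/* Hypothetical helper: returns 1 when both the "cancelling" and the
 * "device disappeared" bits are set, 0 otherwise. */
static int is_cancelled_after_disconnect(uint8_t flags)
{
    const uint8_t mask = SKETCH_CANCELLING | SKETCH_DEVICE_DISAPPEARED;
    return (flags & mask) == mask;
}
```

Under that assumption, 12 is 4 + 8, that is, cancelling plus device disappeared, which matches the "interrupted and device disconnected" reading given above.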

Change History

comment:1 Changed 4 years ago by stuge

  • Component changed from libusb-1.0 to libusb-1.0 Linux backend
  • Description modified

Yeah, I also think that you are hitting the race that I refer to in my added comment in the commit message for the #81 fix. Unfortunately it's not trivial.

comment:2 Changed 4 years ago by hjelmn

  • Milestone set to 1.0.16

The libusbx commit https://github.com/libusbx/libusbx/commit/3fadb8b4facf5106f1127e389e1855013c1701ca will probably fix this one.

I will pull it to my repository.
