Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-working CDC class serial port #65

Open
pcjc2 opened this issue Mar 28, 2024 · 20 comments
Open

Non-working CDC class serial port #65

pcjc2 opened this issue Mar 28, 2024 · 20 comments

Comments

@pcjc2
Copy link

pcjc2 commented Mar 28, 2024

I've been having issues getting any serial communications to work when exporting a CDC class USB device (from a raspberry pi based industrial HMI).

The device seems to appear in Windows, but I have no serial data transferred. Wireshark does show usbip protocol data being exchanged (just short / empty URBs I think).

I double checked that using usbip on a Linux box, I can successfully transfer serial data to the device.

The endpoint device is our own hardware, based on an Atmel SAM4S processor, using the Atmel CDC driver. (So it's not necessarily bug free from experience with their early USB support code).

Currently using the latest release version of usbip-win2. Will try to see if the WDM driver works or not, as I've got a feeling one of the other older usbip projects did previously connect with these boards.

Any known issues that might affect a CDC class serial interface with the current usbip-win2 driver?

Any suggestions what debug steps to take to help identify the issue.

Peter

@vadimgrn
Copy link
Owner

Hi, driver log is required
https://github.com/vadimgrn/usbip-win2/tree/develop?tab=readme-ov-file#obtaining-usbip-logs-on-windows

@pcjc2
Copy link
Author

pcjc2 commented Mar 28, 2024

usbip-flt.txt
usbip-ude.txt

CDC serial device is 1-1.3.1 on my remote system.

One thing I notice in devcon, is that the port comes up as "ports" class, not "usb". I tried (naively) adding your filter driver to that "just in case", but no difference.

USB\VID_03EB&PID_2404\12...EF
Name: Communication Device Class ASF example (COM8)
Setup Class: {4d36e978-e325-11ce-bfc1-08002be10318} Ports
Upper class filters:
usbip2_filter
Controlling service:
usbser

(NB: I added the usbip2_filter. No apparent change in behaviour, but I mention it in case it makes the logs above strange in any way!)

@pcjc2
Copy link
Author

pcjc2 commented Mar 28, 2024

Dumping the device using Thesycon USB descriptor dumper V2.17.0 in case this contains any clues.

(This is on the Windows machine running usbip-win2, not directly attached to the hardware).

Information for device XYZ (VID=0x03EB PID=0x2404):

*** ERROR: Descriptor has errors! ***


Connection Information:

Device current bus speed: HighSpeed
Device supports USB 1.1 specification
Device supports USB 2.0 specification
Device address: 0x0001
Current configuration value: 0x01
Number of open pipes: 1


Device Descriptor:

0x12 bLength
0x01 bDescriptorType
0x0200 bcdUSB
0x02 bDeviceClass (Communication Device Class)
0x00 bDeviceSubClass
0x00 bDeviceProtocol
0x40 bMaxPacketSize0 (64 bytes)
0x03EB idVendor
0x2404 idProduct
0x0100 bcdDevice
0x01 iManufacturer "ABCD"
0x02 iProduct "XYZ"
0x03 iSerialNumber "12...EF"
0x01 bNumConfigurations

Device Qualifier Descriptor is not available. Error code: 0x000005AA


Configuration Descriptor:

0x09 bLength
0x02 bDescriptorType
0x0043 wTotalLength (67 bytes)
0x02 bNumInterfaces
0x01 bConfigurationValue
0x00 iConfiguration
0xC0 bmAttributes (Self-powered Device)
0x32 bMaxPower (100 mA)

Interface Descriptor:

0x09 bLength
0x04 bDescriptorType
0x00 bInterfaceNumber
0x00 bAlternateSetting
0x01 bNumEndPoints
0x02 bInterfaceClass (Communication Device Class)
0x02 bInterfaceSubClass (Abstract Control Model - ACM)
0x01 bInterfaceProtocol (ITU-T V.250)
0x00 iInterface

CDC Header Functional Descriptor:

0x05 bFunctionalLength
0x24 bDescriptorType
0x00 bDescriptorSubtype
0x0110 bcdCDC

CDC Abstract Control Management Functional Descriptor:

0x04 bFunctionalLength
0x24 bDescriptorType
0x02 bDescriptorSubtype
0x02 bmCapabilities

CDC Union Functional Descriptor:

0x05 bFunctionalLength
0x24 bDescriptorType
0x06 bDescriptorSubtype
0x00 bControlInterface
0x01 bSubordinateInterface(0)

CDC Call Management Functional Descriptor:

0x05 bFunctionalLength
0x24 bDescriptorType
0x01 bDescriptorSubtype
0x03 bmCapabilities
0x01 bDataInterface

Endpoint Descriptor:

0x07 bLength
0x05 bDescriptorType
0x83 bEndpointAddress (IN endpoint 3)
0x03 bmAttributes (Transfer: Interrupt / Synch: None / Usage: Data)
0x0040 wMaxPacketSize (1 x 64 bytes)
0x10 bInterval (32768 microframes)

Interface Descriptor:

0x09 bLength
0x04 bDescriptorType
0x01 bInterfaceNumber
0x00 bAlternateSetting
0x02 bNumEndPoints
0x0A bInterfaceClass (CDC Data)
0x00 bInterfaceSubClass
0x00 bInterfaceProtocol
0x00 iInterface

Endpoint Descriptor:

0x07 bLength
0x05 bDescriptorType
0x81 bEndpointAddress (IN endpoint 1)
0x02 bmAttributes (Transfer: Bulk / Synch: None / Usage: Data)
0x0040 wMaxPacketSize (64 bytes)
0x00 bInterval
*** ERROR: Invalid wMaxPacketSize. Must be 512 bytes in high speed mode.

Endpoint Descriptor:

0x07 bLength
0x05 bDescriptorType
0x02 bEndpointAddress (OUT endpoint 2)
0x02 bmAttributes (Transfer: Bulk / Synch: None / Usage: Data)
0x0040 wMaxPacketSize (64 bytes)
0x00 bInterval
*** ERROR: Invalid wMaxPacketSize. Must be 512 bytes in high speed mode.

Microsoft OS Descriptor is not available. Error code: 0x000005AA


String Descriptor Table

Index LANGID String
0x00 0x0000 0x0409
0x01 0x0409 "ABCD"
0x02 0x0409 "XYZ"
0x03 0x0409 "12...EF"


Connection path for device:
usbip root hub #1
Root Hub
XYZ (VID=0x03EB PID=0x2404) Port: 1

Running on: Windows 10 or greater (Build Version 19045)

Brought to you by TDD v2.17.0, Feb 23 2021, 14:04:02

@pcjc2
Copy link
Author

pcjc2 commented Mar 28, 2024

TL;DR none of these things helped....

Just tried the old WDM driver version, also behaves the same (no obvious serial data transferred). I don't have any way to provoke the remote device to send data towards the PC spontaneously, so it isn't possible to determine if the data gets PC -> device but is blocked device -> PC.

Thinking along the lines of the audio device issues mentioned in other issues (and looking at the descriptor errors reported by the TDD tool, "*** ERROR: Invalid wMaxPacketSize. Must be 512 bytes in high speed mode." I tried pulling the source-code and re-building with the following change.

diff --git a/driver/libdrv/usb_util.cpp b/driver/libdrv/usb_util.cpp
index a0da79f4..d795f818 100644
--- a/driver/libdrv/usb_util.cpp
+++ b/driver/libdrv/usb_util.cpp
@@ -22,7 +22,7 @@ usb_device_speed get_usb_speed(USHORT bcdUSB)
        case 0x0110:
                return USB_SPEED_FULL;
        case 0x0200:
-               return USB_SPEED_HIGH;
+               return USB_SPEED_FULL; // USB_SPEED_HIGH;
        case 0x0250:
                return USB_SPEED_WIRELESS;
        case 0x0300:

The device supporting the USB2.0 standard doesn't actually mean it is connected at high speed, and is emitting descriptors that are valid for being connected at high-speed, like this function suggests. The device will (should) alter its descriptors to be "correct" if it actually connects at high speed. At the server end, this device (which only supports "Full" speed) connects at Full speed, and has descriptors that match this.

Obviously my patch is just to test and see if it makes any difference to the windows built in CDC class serial driver. The TDD tool picks up the device as connected at "Full speed" now with this change, and is happy the descriptors are valid.

The serial port still doesn't achieve communication unfortunately. Usb speed can be passed as attach parameter if that really fix such issues.

@vadimgrn
Copy link
Owner

vadimgrn commented Mar 29, 2024

I see that USB 3.0 device, for example, can be plugged in USB 2.0 port on server side. But UdecxUsbDeviceInitSetSpeed must be called before the creation of virtual device. At that point I have a response from the server on OP_REQ_IMPORT. This response contains this structure.

struct usbip_usb_device 
{
        char path[usbip::DEV_PATH_MAX];
        char busid[usbip::BUS_ID_SIZE];

        UINT32 busnum;
        UINT32 devnum;
        UINT32 speed; // enum usb_device_speed

        UINT16 idVendor;
        UINT16 idProduct;
        UINT16 bcdDevice; // Device Release Number

        UINT8 bDeviceClass;
        UINT8 bDeviceSubClass;
        UINT8 bDeviceProtocol;

        UINT8 bConfigurationValue;
        
        UINT8 bNumConfigurations;
        UINT8 bNumInterfaces;
};

I do not have any other information to find out USB device speed.

@vadimgrn
Copy link
Owner

[1]53a8 log:usbip_usb_device(path '/sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.3/1-1.3.1', busid 1-1.3.1, busnum 1, devnum 18, USB_SPEED_FULL,vid 0x3eb, pid 0x2404, rev 0x100, class/sub/proto 2/0/0, bConfigurationValue 1, bNumConfigurations 1, bNumInterfaces 2)
[1]53a8 prepare_init:USB_SPEED_FULL -> UdecxUsbFullSpeed

@pcjc2
Copy link
Author

pcjc2 commented Mar 29, 2024

Yes, the server is reporting "Full" speed for all my exported devices.

Do we have the impression that the Microsoft UDE driver is buggy, and is not correctly emulating this device speed? Is there any way to contact Microsoft and query this I wonder?

I wonder if the timing of INTERRUPT URBs is causing issues, as when the FULL speed descriptor bInterval value (16) is interpreted as a HIGH speed descriptor (2^(16-1) = 32,768 micro-frames (although the reference below suggests the polling interval will actually be 32 micro-frames, so ~ 4ms interval).

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/usbspec/ns-usbspec-_usb_endpoint_descriptor

I tried to test the WDM driver with the FT232 serial adapter (which works in the UDE driver). WDM driver doesn't work with the FT232 adapter either, so my test failing to get the CDC device to work with the WDM driver after patching to adjust the emulated device speed to "Full" might not be conclusive.

My next thought is to try altering the device descriptor to make it more appropriate for High speed (at least in respect of bInterval). Unfortunately I don't think I can actually increase the buffer sizes on the hardware to make that valid as per High speed device spec.

Did you spot anything interesting in the device logs? I saw some unlinks in Wireshark captures, and entries on the server side:

[122305.351193] usbip-host 1-1.5.1: recv a header, 0
[122305.451389] usbip-host 1-1.5.1: reset full-speed USB device number 9 using dwc_otg
[122305.583611] usbip-host 1-1.5.1: device reset
[125108.161770] usbip-host 1-1.5.1: stub up
[125109.166709] usbip-host 1-1.5.1: usb_set_interface done: inf 1 alt 0
[125109.543995] usbip-host 1-1.5.1: unlinked by a call to usb_unlink_urb()
[125109.734184] usbip-host 1-1.5.1: unlinked by a call to usb_unlink_urb()

@pcjc2
Copy link
Author

pcjc2 commented Mar 29, 2024

Is there a good place within the driver to intercept and adjust descriptors for testing?

I tried this:

diff --git a/drivers/ude/device.cpp b/drivers/ude/device.cpp
index b35b95bb..45aa5c3f 100644
--- a/drivers/ude/device.cpp
+++ b/drivers/ude/device.cpp
@@ -348,7 +348,12 @@ PAGED NTSTATUS endpoint_add(_In_ UDECXUSBDEVICE device, _In_ UDECX_USB_ENDPOINT_
                 NT_ASSERT(epd.bLength == len);
                 NT_ASSERT(sizeof(endp.descriptor) >= len);
                 RtlCopyMemory(&endp.descriptor, &epd, len);
+                // XXX: KLUDGE TEST
+                if (endp.descriptor.bInterval == 16) {
+                    endp.descriptor.bInterval = 4;
+                }
                 insert_endpoint_list(endp);
+
         } else {
                 NT_ASSERT(epd == EP0);
                 static_cast<USB_ENDPOINT_DESCRIPTOR&>(endp.descriptor) = epd;

But it doesn't affect the descriptors being reported by TDD. I guess the place tried to manipulate the descriptor is not correct. (I don't know what code-path descriptors are fetched by a user-space program like TDD).

@vadimgrn
Copy link
Owner

vadimgrn commented Mar 29, 2024

Grep post_process_transfer_buffer

UDE has several oddities, for that reason the filter driver was developed. The weirdest thing is that transfers can arrive before reporting about changing ifnum/altnum.

@pcjc2
Copy link
Author

pcjc2 commented Mar 31, 2024

I wrote some horrific code to hot-patch the descriptors as read, and so far didn't get any better behavior from the serial communications. For my ultimate application, I think I need to switch to using serial/IP to attach to my electronic devices remotely... I should be able to add a TCP/IP transport to our debug software. Should avoid the high-rate of network traffic associated with the INTERRUPT transfers.

For general utility though, I'd love to help get usbip-win2 working with more USB devices, as the ability to remote connect an arbitrary USB device (such as motor controller) saves me standing out in the rain tethered to an electrical panel. (Siemens motor drive controller is another device that fails to operate correctly with usbip-win2, but I suspect this may be due to Siemens communicating with the USB on the device with a custom driver-stack).

Did you have any thoughts on the UCX part of the USB3.0 driver stack? I can't tell if it looks relevant. Seem you can catch various events EVT_UCX_... that might be interesting for your driver? What I can't quite see is whether the UDE driver actually implements the virtual xHCI controller in a way that you can hook UCX into:

@pcjc2
Copy link
Author

pcjc2 commented Mar 31, 2024

Is this a good reference? (I guess you've seen it)

https://github.com/microsoft/UDE/blob/master/UDEMbimClientSample/

https://github.com/microsoft/UDE/blob/master/UDEMbimClientSample/controller.cpp has some notes on extra steps taken (for compatibility) to ensure the virtual controller looks exactly like any other to the system.

@pcjc2
Copy link
Author

pcjc2 commented Mar 31, 2024

Finally - something interesting!

On a hunch, I tried disabling the filter driver with:
devcon.exe classfilter usb upper !usbip2_filter

Rebooted computer, connected remote serial port... and IT WORKS!

Does this give any clue where the issue might be?. (Currently my driver is built with a few other kludges, so when I have time, I need to try the stock build of the driver with the filter disabled).

ROOT\USB\0000
Name: USBip 3.X Emulated Host Controller
Setup Class: {36fc9e60-c465-11cf-8056-444553540000} USB
Upper class filters:
USBPcap
Controlling service:
usbip2_ude

I'm wondering if instead of the filter driver (or in addition), the main driver needs to catch some UCX events. (I get the impression that the emulated xHCI driver in the UDE stack uses Ucx01000.sys internally.)

@vadimgrn
Copy link
Owner

vadimgrn commented Apr 1, 2024

The filter driver is required to implement QueryBusTime for ISOCH devices. Also it reports select configuration/interface because Ude swallows them or reports too late. If it works without the filter driver, look in its log to find out the issue. Or comment out some parts of it.

I'd like to get rid of the filter, it is the extra source of errors. But that's not possible until Microsoft implements at least QueryBusTime.

@vadimgrn
Copy link
Owner

vadimgrn commented Apr 1, 2024

I don't think that UCX is relevant for UDE drivers, there are no any references to it in UDE docs. It may be used internally by UDE to implement usb 3.0 hub.

@pcjc2
Copy link
Author

pcjc2 commented Apr 1, 2024

Unfortunately disabling the filter driver doesn't seem to be the fix (possibly it affects some timings though).

I've been able to reproduce the device not working after removing some of my descriptor editing patches and rebuild the driver. I added some back in, but now can't currently make the device work.

At least some aspect might be state on the server, as I've got multiple devices exported. At one point, one would consistently work, the other would consistently NOT work. After issuing an unbind / bind on the hub driver at the server, I've got both consistently not working, and can't seem to reproduce whatever triggered them to start working when I removed the filter driver.

@pcjc2
Copy link
Author

pcjc2 commented Apr 1, 2024

I figured out how to temporarily get the device working. It appears the port is not getting to the "configured" state on the Linux end of things.

For the non-working USB devices, we have:

pi@raspberrypi:~ $ cat /sys/bus/usb/devices/1-1.5:1.0/1-1.5-port1/state
addressed
pi@raspberrypi:~ $ cat /sys/bus/usb/devices/1-1.5:1.0/1-1.5-port5/state
addressed

When attaching with a Linux usbip client, we get to:

pi@raspberrypi:~ $ cat /sys/bus/usb/devices/1-1.5:1.0/1-1.5-port1/state
configured
pi@raspberrypi:~ $ cat /sys/bus/usb/devices/1-1.5:1.0/1-1.5-port5/state
configured

Interestingly, after detaching from the Linux usbip, I can then attach with usbip-win2 and the device works.

I'm guessing somewhere along the lines, the emulated device is not being (properly) configured by the USB stack on Windows, and this stops the unit working.

In the non-working state, Linux side dmesg has entries such as:

[437570.317338] usbip-host 1-1.5.1: stub up
[437571.187614] usbip-host 1-1.5.1: endpoint 0 is stalled
[437571.255878] usbip-host 1-1.5.1: endpoint 0 is stalled
[437571.308142] usbip-host 1-1.5.1: endpoint 0 is stalled
[437571.376613] usbip-host 1-1.5.1: endpoint 0 is stalled
[437571.427412] usbip-host 1-1.5.1: CMD_SUBMIT: invalid epnum 1
[437571.485910] usbip-host 1-1.5.1: CMD_SUBMIT: invalid epnum 3
[437571.536741] usbip-host 1-1.5.1: endpoint 0 is stalled
[437571.587565] usbip-host 1-1.5.1: CMD_SUBMIT: invalid epnum 1
[437571.597457] usbip-host 1-1.5.1: CMD_SUBMIT: invalid epnum 3
[437571.686012] usbip-host 1-1.5.1: endpoint 0 is stalled

Can you point me to where in the windows driver/stack you'd expect the usb device configuration to be initialised?

@vadimgrn
Copy link
Owner

vadimgrn commented Apr 1, 2024

Grep endpoints_configure, select_configuration

@pcjc2
Copy link
Author

pcjc2 commented Apr 1, 2024

I'm getting the very distinct feeling that I only accidentally got things working a couple of times when reloading drivers (and filter) I had broken the filter driver functionality. In these circumstances, the serial device might work, if it had been left appropriately configured by a previous (e.g. Linux) usbip client.

I probably need to focus on determining the difference in config set by the usbip-win2 vs. directly connected windows, and see if I can work out where things start to go wrong.

In the mean time, I've started a small attempt at a serial/ip windows driver based on this project, as I think it should have almost all the tricky mechanics needed to do that... I'm tempted to just UDE emulate USB serial adapter locally, rather than try to write serial port driver. It is convoluted doing in that way, but it might also help me learn more about how the usbser.sys driver behaves. (Hopefully I can strip out the usbip protocol stuff and just push serial data into the emulated endpoint pipes.)

@pcjc2
Copy link
Author

pcjc2 commented Apr 2, 2024

More info, I've tried a few combinations and exploring the wire-side USB traffic. Observations:

Directly to windows PC works fine. See a single SET_CONFIGURATION (1) and then things jump straight into CDC SET_LINE_CODING, SET_CONTORL_LINE_STATE etc...

Directly to raspberry PI, similar.

Via usb-winip2 (plugged to raspberry PI), things are strange.

Initial enumeration as per Linux, then the port is bound to usbip. We have a SET_CONFIGURATION (0) and various discriptors read. Then we get to the bit (I think) windows is doing:

SET_CONFIGURATION(1)
SET_CONFIGURATION(1)
SET_INTERFACE(1, alt=0)
SET_INTERFACE(1, alt=0)

These are doubled up on the wire. (Need to check I've not got two filter drivers loaded... this is looking slightly suspiciously like I might need a reboot). [EDIT... checked, still get duplication with reboot and upstream driver release].

SET_INTERFACE isn't seen in the working / directly attached traces, so I'm wondering if it might be part of the issue.
[EDIT - probably not. I don't think it causes any harm].

The very interesting / strange part is after I connect to the port with a terminal emulator, we start to see INTERRUPT polling on endpoint 3 (0x83?), which is part of the CDC control side. Normally we don't see traffic on this. These are all NAK'd by the hardware device.

[EDIT... checked wireshark. Only see one or two attempts to end URBs on EP#3, they time-out / get unlinked. The NAK storm on the wire seems to be generated at the server side. Seeing 6-7 NAKs on EP#3 for every bulk IN on EP#1. Would be nice if I could stop the NAKs, as potentially the problem could be the embedded USB stack on the device is being overwhelmed.

@pcjc2
Copy link
Author

pcjc2 commented Apr 3, 2024

Possibly the USB implementation in the device is faulty, but disabling the SELECT_INTERFACE urb from being sent along as follows, seems to help get these particular devices working:

diff --git a/drivers/ude_filter/int_dev_ctrl.cpp b/drivers/ude_filter/int_dev_ctrl.cpp
index c2e2c1bf..909c3309 100644
--- a/drivers/ude_filter/int_dev_ctrl.cpp
+++ b/drivers/ude_filter/int_dev_ctrl.cpp
@@ -161,7 +161,7 @@ void post_process_urb(_In_ filter_ext &fltr, _Inout_ libdrv::RemoveLockGuard &lc
                        char buf[libdrv::SELECT_INTERFACE_STR_BUFSZ];
                        TraceDbg("dev %04x, %s", ptr04x(fltr.self), libdrv::select_interface_str(buf, sizeof(buf), r));
                }
-               send = true;
+               send = false;
                break;
        case URB_FUNCTION_SELECT_CONFIGURATION:
                static_assert(is_request_function(URB_FUNCTION_SELECT_CONFIGURATION));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants