Discussion:
[RFC/PATCH] USB reset on atomic context
Inaky Perez-Gonzalez
2008-08-13 19:05:55 UTC
Permalink
I don't like that for a facility that's not strictly needed
and could be implemented in an alternative way.
There are simply too many drivers (considering usbfs a driver) that have
trouble implementing pre/post_reset
Well, it's not strictly needed because before it was almost impossible to
do reset from an atomic context -- from previous conversations had in the
mailing lists , one of the resons suggested for the lack of final error
handling in many drivers was that.

What is the alternative way you were thinking of?
The reset-device workqueue entry could be per-interface instead of
per-device (stored in the usb_interface structure rather than the
usb_device structure). The locking would become a little more
complicated but it would do what you want.
I don't see we are gaining anything from this--it was my first implementation
try though. However, resets are device-wide. It doesn't matter who is resetting
the device, an interface driver or a device-wide driver, the whole thing is
going down. This is how it is now when you call usb_reset_device().

So if an interface driver calls usb_*_reset_device(), the rest of the intefaces
will have to suck it up, because they are going to be reset. We can't do anything
about it (other than hooking up the pre calls).

But then, this call doesn't change any of that; this call just allows that same API
function to be called from an atomic context in a safe way. I guess this is why I
I don't understand the controversy.

OTOH, there are a couple of little bugs I found I need to address...more on the
other subthread.

Thanks,
--
Inaky
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Inaky Perez-Gonzalez
2008-08-13 19:07:04 UTC
Permalink
Hi All
This patch introduces a new call to be able to do a USB reset
from an atomic contect. This is quite helpful in USB callbacks
to handle errors (when the only thing that can be done is to
do a device reset). The Wireless USB code for using wire adaptors
needs this functionality.
I've since changed how resets are handled in the UWB/WUSB code and this
is no longer necessary for HWAs (reset requests go via the UWB stack
which carries them in out in the uwbd thread).
Ouch...doesn't that make the error handling more twisted? Where do you have
the code so i can take a look?

I don't see how it'd be different for DWA though -- the base WA code is
the same both for HWA and DWA.
--
Inaky

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Vrabel
2008-08-14 12:17:51 UTC
Permalink
Post by Inaky Perez-Gonzalez
Hi All
This patch introduces a new call to be able to do a USB reset
from an atomic contect. This is quite helpful in USB callbacks
to handle errors (when the only thing that can be done is to
do a device reset). The Wireless USB code for using wire adaptors
needs this functionality.
I've since changed how resets are handled in the UWB/WUSB code and this
is no longer necessary for HWAs (reset requests go via the UWB stack
which carries them in out in the uwbd thread).
Ouch...doesn't that make the error handling more twisted? Where do you have
the code so i can take a look?
It's not so bad. The attached patch gives you some idea of the changes.
Most of them are related to the changes to the struct uwb_event.

I also think it makes more sense as it will allow the RC to inform PALs
of impending resets in a h/w independent way (the code does make any use
of this yet).

There are still problems with events still queued after a RC has been
removed, but these aren't unique to the reset event.
Post by Inaky Perez-Gonzalez
I don't see how it'd be different for DWA though -- the base WA code is
the same both for HWA and DWA.
DWAs don't have a radio controller or other PALs to worry about.

David
--
David Vrabel, Senior Software Engineer, Drivers
CSR, Churchill House, Cambridge Business Park, Tel: +44 (0)1223 692562
Cowley Road, Cambridge, CB4 0WZ http://www.csr.com/
Inaky Perez-Gonzalez
2008-08-14 18:30:49 UTC
Permalink
Post by David Vrabel
Post by Inaky Perez-Gonzalez
Hi All
This patch introduces a new call to be able to do a USB reset
from an atomic contect. This is quite helpful in USB callbacks
to handle errors (when the only thing that can be done is to
do a device reset). The Wireless USB code for using wire adaptors
needs this functionality.
I've since changed how resets are handled in the UWB/WUSB code and this
is no longer necessary for HWAs (reset requests go via the UWB stack
which carries them in out in the uwbd thread).
Ouch...doesn't that make the error handling more twisted? Where do you have
the code so i can take a look?
It's not so bad. The attached patch gives you some idea of the changes.
Most of them are related to the changes to the struct uwb_event.
I also think it makes more sense as it will allow the RC to inform PALs
of impending resets in a h/w independent way (the code does make any use
of this yet).
Oh, I see what you mean -- out of couriosity could the same effect be
achievable with pre_reset() handlers for PALs that are hanging out of
that same USB device? [granted, that'd not work for PCI based ones].
--
Inaky
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Inaky Perez-Gonzalez
2008-08-13 19:20:15 UTC
Permalink
I'm not sure I remember all of them either! The main issues that
What happens if a driver requests a reset and then is unbound
before the reset happens?
The reset is still pending until it happens -- the way I see it is: the
intention of the driver was to reset the device. The fact that it is
delayed because it decided that from an atomic context is a mere artifact.

If the driver decided to reset the device, is because the device has to
be reset to bring it to a known state, so it should be completed.
What happens if a driver requests a reset while it is being
bound or unbound?
Same as above. While bound it would cause the driver to be unbound and
then reenumeration. While unbinding it'd cause nothing but the reset,
as driver is "unbound" once we exit .disconnect().
Neither of these appears to be a showstopper for your patch. But there
are some questionable aspects, noted below.
diff -r d60c9a74e581 drivers/usb/core/hub.c
--- a/drivers/usb/core/hub.c Tue Aug 12 09:14:24 2008 -0700
+++ b/drivers/usb/core/hub.c Tue Aug 12 15:14:57 2008 -0700
@@ -3447,3 +3447,33 @@ int usb_reset_device(struct usb_device *
return ret;
}
EXPORT_SYMBOL_GPL(usb_reset_device);
+
+
+void __usb_queue_reset_device(struct work_struct *ws)
+{
+ struct usb_device *udev =
+ container_of(ws, struct usb_device, reset_ws);
+ usb_reset_device(udev);
+ usb_put_dev(udev);
+}
You don't need usb_put_dev here (or the corresponding usb_get_dev)
since the workqueue entry is reliably cancelled before the device
structure is destroyed.
/me mumbles to himself how still he feels unueasy about it, but
yeah, you are right. I guess this is me still thinking about the
first patch I implemented where this was completely outside the stack.

In fact, there was also a bug in there -- when cancelling the workqueue
on device_disable() we also had to drop that reference count if work
was pending.
However you _do_ need to lock and unlock the device.
lock_device_for_reset? or just lock_device?

I need also some help here: how does that interact with being called from
.probe() or .disconnect()...as in where is the lock taken?

Would it ok to do something dirty like?

result = usb_trylock_device(udev);
if (result < 0)
did_lock = 0;
usb_reset_device(udev);
if (did_lock)
usb_unlock_device(udev);

[sorry, can't find the locking rules, but looks to me that at least
.probe and .disconnect would have it locked and would need that
hack or something equivalent and more proper].
If this routine were put in usb.c rather than hub.c then it could be
made static.
Good point.
+/**
+ * Reset a USB device from an atomic context
+ *
+ *
+ * This function can be used to reset a USB device from an atomic
+ * context, where usb_composite_device_reset() or usb_device_reset()
+ * won't work.
+ *
+ * If a reset is pending already, it won't schedule another one.
+ * No need to lock/unlock because schedule_work() does it's own.
s/it's/its/
ack
diff -r d60c9a74e581 include/linux/usb.h
--- a/include/linux/usb.h Tue Aug 12 09:14:24 2008 -0700
+++ b/include/linux/usb.h Tue Aug 12 15:14:57 2008 -0700
@@ -326,6 +326,7 @@ struct usb_bus {
struct mon_bus *mon_bus; /* non-null when associated */
int monitored; /* non-zero when monitored */
#endif
+ struct work_struct reset_ws; /* for resets in atomic context */
This appears to be a mistake, left over from an earlier incarnation.
Yeah, it was a pasting error.

Ok, I'll work on the fixes and repost the patch -- I'll add that ugly locking
stuff waiting for a better sollution or confirmation.

Thanks!
--
Inaky
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alan Stern
2008-08-13 19:54:09 UTC
Permalink
Post by Inaky Perez-Gonzalez
What happens if a driver requests a reset and then is unbound
before the reset happens?
The reset is still pending until it happens -- the way I see it is: the
intention of the driver was to reset the device. The fact that it is
delayed because it decided that from an atomic context is a mere artifact.
If the driver decided to reset the device, is because the device has to
be reset to bring it to a known state, so it should be completed.
I tend to agree. Oliver is more concerned about this.
Post by Inaky Perez-Gonzalez
What happens if a driver requests a reset while it is being
bound or unbound?
Same as above. While bound it would cause the driver to be unbound and
then reenumeration. While unbinding it'd cause nothing but the reset,
as driver is "unbound" once we exit .disconnect().
Yes -- what I meant was more like "What _should_ happen if a driver
requests a reset while it is being bound or unbound?" In particular,
if the request is made while the disconnect method is running, should
it be ignored?

Or even more to the point, if the actual reset is delayed until the
disconnect method is running (or has completed!), should it be skipped?
This is related to the locking question...
Post by Inaky Perez-Gonzalez
However you _do_ need to lock and unlock the device.
lock_device_for_reset? or just lock_device?
It depends. For the patch in its current state, plain old
usb_lock_device. For what Oliver wants to do, it should be
usb_lock_device_for_reset.

The difference is that usb_lock_device_for_reset requires an interface
argument. With your current patch, the interface information has been
lost when the time comes to do the locking.
Post by Inaky Perez-Gonzalez
I need also some help here: how does that interact with being called from
.probe() or .disconnect()...as in where is the lock taken?
The situation is a bit of a mess. In fact, usb_lock_device_for_reset
is strictly speaking not correct; it assumes that if it was called
while the interface is being bound then it was called by the binding
thread. That whole thing ought to be rewritten.
Post by Inaky Perez-Gonzalez
Would it ok to do something dirty like?
result = usb_trylock_device(udev);
if (result < 0)
did_lock = 0;
usb_reset_device(udev);
if (did_lock)
usb_unlock_device(udev);
[sorry, can't find the locking rules, but looks to me that at least
.probe and .disconnect would have it locked and would need that
hack or something equivalent and more proper].
The intended usage goes like this:

did_lock = usb_trylock_device(udev);
if (did_lock >= 0) {
usb_reset_device(udev);
if (did_lock)
usb_unlock_device(udev);
}

This also could stand some improvement.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Inaky Perez-Gonzalez
2008-08-13 22:36:27 UTC
Permalink
Post by Alan Stern
Post by Inaky Perez-Gonzalez
What happens if a driver requests a reset while it is being
bound or unbound?
Same as above. While bound it would cause the driver to be unbound and
then reenumeration. While unbinding it'd cause nothing but the reset,
as driver is "unbound" once we exit .disconnect().
Yes -- what I meant was more like "What _should_ happen if a driver
requests a reset while it is being bound or unbound?"
What happens now when we call usb_reset_device() from .probe()? The call
is allowed, and it will unbind the drivers and reset the device. So
when .probe() finishes, .disconnect() is inmediately called. I don't
really see an issue with it.
Post by Alan Stern
In particular,
if the request is made while the disconnect method is running, should
it be ignored?
Or even more to the point, if the actual reset is delayed until the
disconnect method is running (or has completed!), should it be skipped?
This is related to the locking question...
No, it should not. Reset cleans hw state and thus, if issues, it is because
it is needed. It should proceed. It might have to wait until .disconnect()
finishes though (due to locking).
Post by Alan Stern
Post by Inaky Perez-Gonzalez
However you _do_ need to lock and unlock the device.
lock_device_for_reset? or just lock_device?
It depends. For the patch in its current state, plain old
usb_lock_device. For what Oliver wants to do, it should be
usb_lock_device_for_reset.
If my device fails to come out of suspend, I might want to use
reset to recover it. Likewise, using lock_device_for_reset()
would rule out being able to use queue_reset_device() while
the device is probing or releasing (versys lock_device() just
delaying it until .probe() or .disconnect() finish).
Post by Alan Stern
did_lock = usb_trylock_device(udev);
if (did_lock >= 0) {
usb_reset_device(udev);
if (did_lock)
usb_unlock_device(udev);
}
This also could stand some improvement.
I'd say move to usb_lock_device(); that way when in .probe() or .disconnect()
context it still works, just delayed until those calls are done.

My only concern is then if anything in .probe() or .disconnect() flushes
the system workqueue...then we have a deadlock. That's technically a bug in
the driver, but caused by the patch.

I could hijack the ksuspend_usbd workqueue to do it, as that is internal, but
that'd also mean ksuspend_usb_{init,cleanup} need to be taken out of the
#if CONFIG_PM (who doesn't do PM these days, anyway?).
--
Inaky
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alan Stern
2008-08-14 15:33:38 UTC
Permalink
Post by Inaky Perez-Gonzalez
Post by Alan Stern
Post by Inaky Perez-Gonzalez
What happens if a driver requests a reset while it is being
bound or unbound?
Same as above. While bound it would cause the driver to be unbound and
then reenumeration. While unbinding it'd cause nothing but the reset,
as driver is "unbound" once we exit .disconnect().
Yes -- what I meant was more like "What _should_ happen if a driver
requests a reset while it is being bound or unbound?"
What happens now when we call usb_reset_device() from .probe()? The call
is allowed, and it will unbind the drivers and reset the device.
Not strictly true. The call is allowed, and if the drivers have proper
pre_reset and post_reset methods then they won't be unbound.
(Conversely, drivers that don't have pre_reset and post_reset methods
_will_ be unbound -- except for the driver currently being probed.)
Post by Inaky Perez-Gonzalez
So
when .probe() finishes, .disconnect() is inmediately called. I don't
really see an issue with it.
It isn't an issue, except as regards the locking. If a driver calls
usb_reset_device directly from within its probe thread, there's no
problem. But what if it queues a delayed reset request?

My feeling is that the queued request should wait until probe returns.
If the probe was successful then the reset can be carried out;
otherwise it should be cancelled just as in the case of unbinding.
Post by Inaky Perez-Gonzalez
Post by Alan Stern
Or even more to the point, if the actual reset is delayed until the
disconnect method is running (or has completed!), should it be skipped?
This is related to the locking question...
No, it should not. Reset cleans hw state and thus, if issues, it is because
it is needed. It should proceed. It might have to wait until .disconnect()
finishes though (due to locking).
Well, there's the difficulty. After disconnect has returned, the
driver has absolutely no claim on the device. It shouldn't be able to
_touch_ the device, let alone reset it! That's Oliver's main argument.

If a driver feels strongly that a reset really is needed to clean the
hardware state before unbinding, it should call usb_reset_device
directly from its disconnect method. It should _not_ force the
disconnect method to block until a queued reset has finished. That
approach is very likely to cause deadlock.

(Note incidentally, that usb-storage's disconnect method _does_ wait
until a reset running in another thread has finished. Fortunately that
other thread isn't doing the equivalent of a delayed reset, but it's
still something to watch out for -- the same locking mechanism has to
work for both cases.)
Post by Inaky Perez-Gonzalez
If my device fails to come out of suspend, I might want to use
reset to recover it.
usbcore already does this. Well, in reality it does a logical
disconnect followed by a reset and re-enumeration.
Post by Inaky Perez-Gonzalez
Likewise, using lock_device_for_reset()
would rule out being able to use queue_reset_device() while
the device is probing or releasing (versys lock_device() just
delaying it until .probe() or .disconnect() finish).
Agreed. For this reason, if we want interface drivers to be able to
use usb_queue_reset_device while binding or unbinding,
usb_lock_device_for_reset will have to be rewritten first.

On the other hand, if you want to call usb_queue_reset_device from some
part of the core without regard to interfaces or their drivers, then
usb_lock_device is the way to go. I don't know why you would want to
do this, but maybe there are good reasons for it in Wireless USB.
Post by Inaky Perez-Gonzalez
I'd say move to usb_lock_device(); that way when in .probe() or .disconnect()
context it still works, just delayed until those calls are done.
Unfortunately we _can't_ simply delay until disconnect is done. Not if
we want the queued reset to be cancelled when the interface is unbound.
There needs to be some synchronization between the unbinding code and
the workqueue code. In fact, the unbinding routine has to wait until
the workqueue routine has agreed to cancel the reset.

This means the workqueue routine can't simply use usb_lock_device,
since it may need to wake up while the device is still locked. This is
another reason for rewriting usb_lock_device_for_reset.
Post by Inaky Perez-Gonzalez
My only concern is then if anything in .probe() or .disconnect() flushes
the system workqueue...then we have a deadlock. That's technically a bug in
the driver, but caused by the patch.
A workqueue should _never_ be flushed except by the code that manages
the workqueue itself.
Post by Inaky Perez-Gonzalez
I could hijack the ksuspend_usbd workqueue to do it, as that is internal, but
that'd also mean ksuspend_usb_{init,cleanup} need to be taken out of the
#if CONFIG_PM (who doesn't do PM these days, anyway?).
s390 doesn't. Maybe others as well.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Oliver Neukum
2008-08-14 08:40:06 UTC
Permalink
Post by Alan Stern
Post by Inaky Perez-Gonzalez
What happens if a driver requests a reset and then is unbound
before the reset happens?
The reset is still pending until it happens -- the way I see it is: the
intention of the driver was to reset the device. The fact that it is
delayed because it decided that from an atomic context is a mere artifact.
If the driver decided to reset the device, is because the device has to
be reset to bring it to a known state, so it should be completed.
That's an assumption. You code infrastructure. There's no basis to be sure
that it'll be used only for error handling.
Post by Alan Stern
I tend to agree. Oliver is more concerned about this.
Two related reasons:

1. Consistency
A driver directly resetting a device after returning from disconnect()
is buggy.
2. Guarantee to user space
Now we guarantee to user space collectively that once all drivers are
kicked off the device the device is user space's. User space can confidently
do things like flashing the device's rom subject to proper locking
in user space without fear of a reset initiated in the kernel.

The basic idea is good and will lead to better error handling but it belongs
on the interface level.

Regards
Oliver


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Inaky Perez-Gonzalez
2008-08-14 18:41:50 UTC
Permalink
Post by Oliver Neukum
Post by Alan Stern
Post by Inaky Perez-Gonzalez
...
If the driver decided to reset the device, is because the device has to
be reset to bring it to a known state, so it should be completed.
That's an assumption. You code infrastructure. There's no basis to be sure
that it'll be used only for error handling.
Post by Alan Stern
I tend to agree. Oliver is more concerned about this.
1. Consistency
A driver directly resetting a device after returning from disconnect()
is buggy.
2. Guarantee to user space
Now we guarantee to user space collectively that once all drivers are
kicked off the device the device is user space's. User space can confidently
do things like flashing the device's rom subject to proper locking
in user space without fear of a reset initiated in the kernel.
Okay, so as it seems there is consensus that a delayed reset shouldn't
happen after disconnect returns, I'll bend to that. More on it on the other
thread.
--
Inaky
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alan Stern
2008-08-13 19:24:26 UTC
Permalink
Post by Inaky Perez-Gonzalez
I don't see we are gaining anything from this--it was my first implementation
try though. However, resets are device-wide. It doesn't matter who is resetting
the device, an interface driver or a device-wide driver, the whole thing is
going down. This is how it is now when you call usb_reset_device().
So if an interface driver calls usb_*_reset_device(), the rest of the intefaces
will have to suck it up, because they are going to be reset. We can't do anything
about it (other than hooking up the pre calls).
I think Oliver is concerned about this possibility (which can occur
even for devices having only one interface):

A driver binds to an interface.

The driver queues a delayed-reset request.

Before the reset is carried out, the driver gets unbound from
the interface.

Another driver binds to that interface, or usbfs claims it.

The reset finally occurs, messing up the new driver.

If the delayed-reset request was stored in the usb_interface structure
then the core would know to cancel it when the interface was unbound.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Inaky Perez-Gonzalez
2008-08-13 22:39:13 UTC
Permalink
+ did_lock = result < 0 ? 0 : 1;
Isn't this a strangely convoluted way to write
did_lock = result >= 0;
ack to that....
I'm looking at what you're doing with great interest, because of
ub_reset_task(). But I cannot jump on the bandwagon just yet,
because ub wants to know when an oustanding reset exists, and
therefore it needs a notification of reset end. Maybe for WUSB
it's enough to fire a reset request and forget it happened.
We could add code for that:

usb_queue_reset_pending(udev)
{
return work_pending(&udev->reset_ws);
}

Why does it need to know about an outstanding request, anyway?

We could add something to __usb_queue_reset() for notification,
something like waking up a wait queue (udev->reset_wq). Would that work
for you?
--
Inaky
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Pete Zaitcev
2008-08-13 23:40:17 UTC
Permalink
Why does [ub] need to know about an outstanding request, anyway?
It has to suspend other activity while request is under way,
doesn't it?

-- Pete
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Inaky Perez-Gonzalez
2008-08-14 19:43:57 UTC
Permalink
b) if (cancel_work_sync(&udev->reset_ws))
__usb_queue_reset_device(&udev->reset_ws)
In both cases, if there was a pending reset, it is executed before we
finish disconnecting the driver from the interface.
You've got the right idea, but doing it correctly will be more
difficult owing to the locking. The basic point is that when a driver
unbinds, we should either wait for its queued reset to complete or
cancel the queued reset. IMO the most sensible action is to wait until
the disconnect method returns and then cancel any queued reset.
Well, in this case the locking becomes moot -- it has been decided that
a reset scheduled from a .disconnect() should be killed, then the
"barrier" just kills it.
[note: I really want to avoid binding to the interface -- it becomes
messier]
See, now that's a completely different matter. If you're not bound to
an interface then you're acting on behalf of the device as a whole --
in which case this discussion is moot.
I guess v3 is going to be interface-based, as you guys seem to like
it way more.
Do you guys see any other places where that "barrier" should be done?
It is needed in usb_unbind_interface and in the "not registered" path
of usb_driver_release_interface.
ack
--
Inaky

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Loading...