这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@kainino0x
Copy link
Contributor

@kainino0x kainino0x commented Jan 31, 2019

Adds the most minimal API I could find for handling device/adapter loss cases. Please see the "case studies" section and example code to better understand how this works.

Also fixes #11.

Copy link
Member

@kenrussell kenrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kai and I talked through these scenarios in person, and Kai iterated on the API design multiple times as he wrote example code which would recover from the various failure scenarios. Looks good to me.

@kainino0x kainino0x requested review from Kangz and kdashg February 1, 2019 02:39
@kainino0x kainino0x requested a review from litherum February 1, 2019 03:12
@Kangz
Copy link
Contributor

Kangz commented Feb 1, 2019

Thanks for putting this up, it trades synchronicity of create device in favor of making it easy to write robust device and adapter loss handling. Nice!

@kainino0x kainino0x force-pushed the errors2-deviceloss branch 3 times, most recently from 9d67aa9 to 49c447b Compare February 2, 2019 02:06
@beaufortfrancois
Copy link
Contributor

beaufortfrancois commented Feb 4, 2019

What do you think of using an event handler instead of this devicelost callback function? Mixing callback functions and promises isn't something I've seen a lot in Web APIs. Note that this proposal would be similar to how Web Bluetooth and Web USB handles device disconnections.

See my proposal below

Promise<GPUDevice> requestDevice(GPUDeviceDescriptor descriptor);
partial interface GPUDevice : EventTarget {
  attribute EventHandler ondevicelost;
};

ondevicelost is an EventHandler which is called when something goes fatally wrong on the GPU device (e.g. unexpected out-of-memory, crash, or native device loss).

[
    Constructor(DOMString type, GPUDeviceLostEventInit gpuDeviceLostEventInitDict),
    Exposed=Window
]
interface GPUDeviceLostEvent : Event {
  readonly attribute DOMString message;
};

dictionary GPUDeviceLostEventInit : EventInit {
  required DOMString message;
};

Fire an event named "devicelost" at GPUDevice using GPUDeviceLostEvent with its message attribute initialized to the error message that triggered device lost.


Example would have to be updated as well. Note that this.device needs to be reset before and after this change.

    try {
      this.device = await this.adapter.requestDevice({ /* options */ });
      this.device.addEventListener('devicelost', function(event) {
        // Device was lost.
        console.error("device lost", info);
        // Try to get a device again.
        this.ensureDevice();
      });
      this.rendering = true;
    } catch (e) {

@kainino0x
Copy link
Contributor Author

kainino0x commented Feb 4, 2019

For a while before I opened this PR I had it as an event rather than a callback. However I tentatively changed it to a callback and left this TODO:

(TODO: this could probably be an ondevicelost event instead. However, this (1) makes it required and (2) makes it clear that no error can be missed between creation and event listener registration.)

Do you have specific advice?

  1. Not necessary for it to be required, it's just nice because it forces developers to notice they need to do something there.
  2. I'm not sure whether we can guarantee the event handler will be called if the device is lost before it's set. This seems to work in WebGL today, but I'm not sure if it's guaranteed/guaranteeable:
(() => {
  const cvs = document.createElement('canvas');
  const gl = cvs.getContext('webgl');
  const ext = gl.getExtension('WEBGL_lose_context');
  ext.loseContext();
  cvs.addEventListener('webglcontextlost', (e) => { console.warn(e); });
})();

This is necessary because the app has to know it's never going to leave a context loss on the floor, because if it does it won't know to reinitialize.

callback GPUDeviceLostCallback = void (GPUDeviceLostInfo info);
interface GPUDeviceLostInfo {
readonly attribute GPUDevice device;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RC: remove or rename to lostDevice

@beaufortfrancois
Copy link
Contributor

2. I'm not sure whether we can guarantee the event handler will be called if the device is lost before it's set. This seems to work in WebGL today, but I'm not sure if it's guaranteed/guaranteeable:

In that case, we could have a lostboolean attribute added to GPUDevice that would reflect the device lost state. This is what does the Web Bluetooth spec. Note that WebUSB doesn't do this.

partial interface GPUDevice : EventTarget {
  readonly attribute boolean lost;
  attribute EventHandler ondevicelost;
};

@kainino0x
Copy link
Contributor Author

I've been trying hard for there to not be two separate signals the app has to check to make sure they do things right, so I'd strongly like to avoid it.

If it were something like readonly attribute Promise<void> ondevicelost, and app has to do device.ondevicelost.then(...), it would guarantee that a late-registered handler would be called. But this kind of design would be very unusual, I think.

@grorg
Copy link
Contributor

grorg commented Feb 11, 2019

Discussed at the 11 Feb 2019 WebGPU Meeting

@kainino0x
Copy link
Contributor Author

Discussed requestAdapter vs listAdapters at the meeting. The issue is somewhat orthogonal.

I will consider at Rafael's other comments, try to describe the meaning of requestAdapter rejection, and then we'll try to merge this ASAP so it doesn't collide with requestAdapter vs listAdapters.

Copy link
Contributor

@RafaelCintron RafaelCintron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though we made the decision to leave requestAdapter in the API for now, there are changes I'd like to see to this PR in light of that decision.

});
this.rendering = true;
} catch (e) {
// Request failed (likely due to adapter loss).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the adapter can also become lost, we should change requestAdapter so that it accepts onAdapterLost callback like you've done for requestDevice. If we assume that all devices become lost when an adapter becomes lost, we can even dispense with onDeviceLost in favor of just having onAdapterLost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually considered and specifically avoided this additional signal. I don't think there is any case in which the application would need to know about an adapter loss, aside from knowing that the device(s) were lost. They can try adapter.createDevice to discover whether the adapter will no longer vend any new devices.

Also, since one device on an adapter can be lost without other devices on the same adapter being lost, the signal needed to be on the device.

// Device was lost.
console.error("device lost", info);
// Try to get a device again.
this.ensureDevice();
Copy link
Contributor

@RafaelCintron RafaelCintron Feb 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the device becomes lost because the adapter becomes lost, this code will end up in an infinite loop of rejected promises at the hands of requestDevice. I think if either the device or the adapter becomes lost, the developer should re-request an adapter.

Update: Re-reading the code sample code a second time, I realize that it will not result in an infinite loop. The first requestDevice after receiving the callback will get rejected, causing the catch clause to run and clear out the adapter.

As @kainino0x points out below, there are cases where the device can become lost without the adapter becoming lost so this seems warranted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This said, the sample code could just always try on a newly requested adapter instead of retrying on the same adapter before trying on a new adapter. Both would be valid choices for an app, but the distinction also seems minimal. Maybe it would be better for the sample to be simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided not to simplify the example, because it's useful to demonstrate the use case. However, I did refactor the example to be iterative instead of recursive, which is clearer. I think it is equivalent.

@kainino0x kainino0x force-pushed the errors2-deviceloss branch 3 times, most recently from 7dcac62 to 653eb9a Compare February 13, 2019 01:51
@kainino0x
Copy link
Contributor Author

FYI: Updated the example (and rebased), but haven't resolved onDeviceLost vs event vs promise yet. So that part is not ready for review.

@kainino0x
Copy link
Contributor Author

kainino0x commented Feb 14, 2019

Here is a more comprehensive investigation of the options I've considered so far. @beaufortfrancois and @jungkees in particular, would you be able to comment on this?

Premises:

  1. A device will never be lost more than once.
  2. An application needs to be able to guarantee it won't miss a device loss happening.
    1. An application probably should not be able to write code that may or may not miss a device loss. (That is, any code they write should either never miss, or always miss (because the code is wrong).)
  3. An application should be able to register a handler that won't collide with other applications on the same page. (It would be common for multiple apps, e.g. chart widgets or model viewers, to exist together in the same page.)
    1. An application probably should not be able to easily collide with other applications on the same page.
  4. It probably shouldn't mismatch other Web Platform APIs.

Options:

  • Option 1: Event handler registered before device creation (precedent: canvas receives webglcontextlost; navigator.usb receives ondisconnect; navigator.bluetooth receives ongattserverdisconnected).
    • Doesn't express premise 1.
    • Satisfies premise 3, but not premise 3.i.: Each app on the page requires extra logic to make sure it doesn't collide with others. Since it doesn't happen by default, apps won't be compatible with running multiple-per-page unless they add this extra logic.
    • (Aside: In WebGL, canvas can only have one context, so this works.)
    • This could work if requestAdapter always vended a new adapter object, rather than returning the same one multiple times.
let device;
adapter.addEventListener('ondevicelost', (e) => {
  if (e.lostDevice !== device) { return; }
  // ...
});
device = await adapter.createDevice(...);
  • Option 2: Event handler on device: registered after device creation (edit: ServiceWorker has precedent for this).
    • May break premise 2; definitely breaks premise 2.i. Not sure we can guarantee the event is delivered to the handler. Even if we could, it's tricky for the app to make sure it registers the event listener early enough (just inserting an extra await between creation and registration could break it).
const device = await adapter.createDevice(...); // Note that this awaits
device.addEventListener('ondevicelost', (e) => { /* ... */ }); // Not guaranteed?
  • Option 3: Provide callback on creation (as an argument to createDevice, or in the device descriptor).

    • Unusual (breaks premise 4).
  • Option 4: Provide event listener on creation (as an argument to createDevice, or in the device descriptor).

    • Unusual (breaks premise 4).
    • Assuming the device is an EventTarget, an application can still accidentally do the wrong thing by doing the addEventListener too late.
  • Option 5: Device has readonly attribute Promise<GPUDeviceLostInfo> onlost. It is always pending until device loss.

    • Unusual (breaks premise 4). (Edit: Promises guide gives this as an example, and ServiceWorker actually uses it.)
    • Easy to guarantee the event is handled.
    • Could potentially be merged with getObjectStatus, if getObjectStatus only resolves on invalid (pending forever on valid or on device loss). (edit: unlikely)
const device = await adapter.createDevice(...);
device.onlost.then(() => { /* ... */ }); // Guaranteed, no matter how late it is added.

@RafaelCintron
Copy link
Contributor

RafaelCintron commented Feb 20, 2019

@kainino0x , thank you for the comprehensive investigation.

Given the fact that a device cannot become lost more than once, I think the promise approach is best because it handles multiple clients and guarantees notifications will not be missed.

I'm not concerned about breaking premise 4 because in the other APIs you list, a 'device' can become, restored/reconnected, etc. Hence, a callback or event is the only viable option because there can be multiple notifications. WebGPU does not have this problem.

@kainino0x
Copy link
Contributor Author

I think I forgot to mention above: the first option (webglcontextlost-style) would work if requestAdapter always vends a new Adapter object instead of returning the same one for the same physical adapter.

@jungkees
Copy link

For #198 (comment), I think the promise option fits very well. As @RafaelCintron mentioned, it's a state transition rather than an event that can fire multiple times.

I'd like to suggest .lost or .whenLost instead of .onlost. onlost sounds like a name of an event handler. The example 4 in https://w3ctag.github.io/promises-guide/#webidl-examples shows that it uses like stateMachine.loaded without on. It's also different from .lost() which is an action. So, device.lost sounds fine to me. (https://w3ctag.github.io/promises-guide/#state-transitions has a guide when to use a method and when to use an attribute.)

And actually we have a similar attribute in the platform: https://w3c.github.io/ServiceWorker/#dom-serviceworkercontainer-ready. People initially proposed .whenReady but landed on .ready.

FYI, there's a patter for option 2 (event handler on device) as well. We can define a task source on the EventTarget (like https://w3c.github.io/ServiceWorker/#dfn-client-message-queue) and queue events before the event handlers or event listeners are attached. We can make the task source hold that events until it is enabled. And we can enable it when the first event handler is attached and/or by addig an API to enable it like https://w3c.github.io/ServiceWorker/#navigator-service-worker-startMessages.

@kainino0x
Copy link
Contributor Author

Thank you very much for the review, and pointer to the prior work with ServiceWorkerContainer.ready and option 2. It'll be helpful to take a look at how those work.

@kainino0x
Copy link
Contributor Author

Applied changes to have Promise<GPUDeviceLostInfo> lost; on GPUDevice.

Also changed the example again to remove the loop; ensureDevice doesn't need to try more than twice to get a device, because it always starts from the top on device loss. I think(?) the loop was an artifact of an old intermediate state.

@kainino0x
Copy link
Contributor Author

Changes are not too major; see 8ecd8bb.

Maybe we can cover and finish this up at the upcoming meeting. Apologies again for the late review request.

Copy link
Contributor

@beaufortfrancois beaufortfrancois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the GPUDevice.lost promise pattern. The thing that is unusual is the fact that the promise may never resolve during the life of the web app if everything goes well.
Maybe adding a note like:

The returned lost promise will never reject and may be pending forever.

@kainino0x
Copy link
Contributor Author

Thanks for the review!

Copy link
Contributor

@Kangz Kangz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good, thanks for the many many iterations @kainino0x ! Hopefully we can resolve any remaining issues in today's call and merge the PR.

@kainino0x
Copy link
Contributor Author

Done. I'm going to go ahead with merge. Thanks all!

@kainino0x kainino0x changed the title ErrorHandling.md: Fatal Errors with Promises+callbacks ErrorHandling.md: Fatal Errors with Promises Feb 28, 2019
@kainino0x kainino0x merged commit afd563d into gpuweb:master Feb 28, 2019
@kainino0x kainino0x deleted the errors2-deviceloss branch February 28, 2019 23:36
JusSn pushed a commit to JusSn/gpuweb that referenced this pull request Mar 26, 2019
ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this pull request Sep 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Better specified context events

9 participants