ErrorHandling.md: redo Telemetry with events #196

kainino0x · 2019-01-31T21:49:05Z

Completely separates Telemetry from Fallback/Fatal Errors. Simplifies the log entry API to hopefully include exactly what is necessary (although there is a TODO about whether object should be included).

kenrussell

Per review on #197: I think it's worth continuing to categorize types of errors so that an application can, in a guaranteed fashion, watch for any out-of-memory errors and attempt to recover from them.

Per the other pull request, only adding fallible Promise-based allocation entry points for buffers and textures doesn't seem a robust enough mechanism.

Sorry for not thinking this through more in our face-to-face discussions.

What do you think?

kainino0x · 2019-02-01T03:05:35Z

EDIT: moved comment here: #197 (comment) because it makes more sense on that PR

kenrussell · 2019-02-01T08:04:53Z

Consider this: OpenGL errors are categorized, and no more than one error of any particular type is reported at a time. Out-of-memory is essentially the only one an app would like to try to recover from. Does it seem a good idea to allow that recovery to be robust - no matter what kind of GPU object was allocated that triggered the OOM?
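
For illustration, the categorized model described above could look roughly like the following JavaScript sketch. The `CategorizedErrorLog` class, its methods, and the category strings are hypothetical names chosen for this example; they are not part of OpenGL or of any proposed WebGPU API. The sketch keeps at most one pending error per category, so an out-of-memory error can be checked for no matter which allocation caused it.

```javascript
// Hypothetical sketch: at most one pending error is kept per category,
// loosely modeled on the categorized OpenGL errors described above.
class CategorizedErrorLog {
  constructor() {
    this.pending = new Map(); // category -> first message recorded
  }

  // Record an error. Later errors in the same category are dropped
  // until the pending one has been taken.
  record(category, message) {
    if (!this.pending.has(category)) {
      this.pending.set(category, message);
    }
  }

  // Return and clear the pending error for a category, or null if none.
  take(category) {
    const message = this.pending.has(category) ? this.pending.get(category) : null;
    this.pending.delete(category);
    return message;
  }
}

const errorLog = new CategorizedErrorLog();
errorLog.record('out-of-memory', 'texture allocation failed');
errorLog.record('out-of-memory', 'buffer allocation failed'); // dropped
errorLog.record('validation', 'invalid bind group');

const oom = errorLog.take('out-of-memory');
if (oom !== null) {
  // An application could free caches or lower quality settings here.
  console.log(`Recovering from OOM: ${oom}`);
}
```

The point of the design is that recovery code only needs to ask about one category, independent of the kind of GPU object involved.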

beaufortfrancois · 2019-02-04T10:45:34Z

Thank you @kainino0x for doing this work! It is much easier to digest with 3 separate PRs.
Here are my 2 cents below.

TL;DR:

- Rename gpulogentry to validationerror for clarity.
- Use an EventHandler for GPUDevice.
- Remove object and use debug label in newly renamed message attribute.

partial interface GPUDevice : EventTarget {
  attribute EventHandler onvalidationerror;
};

onvalidationerror is an EventHandler which is called whenever a validation error occurs in the API (including operations on "invalid" WebGPU objects).

[
    Constructor(DOMString type, GPUValidationErrorEventInit gpuValidationErrorEventInitDict),
    Exposed=Window
]
interface GPUValidationErrorEvent : Event {
  readonly attribute DOMString message;
};

dictionary GPUValidationErrorEventInit : EventInit {
  required DOMString message;
};

Fire an event named "validationerror" at GPUDevice using GPUValidationErrorEvent with its message attribute initialized to the validation error message. It should include the debug label of the object if possible.
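
As a rough sketch of the firing step above, the following JavaScript uses only the standard `Event` and `EventTarget` classes (available in browsers and in Node.js 15+). The names `SketchValidationErrorEvent`, `fireValidationError`, and `fakeDevice` are assumptions made for this example; a real `GPUDevice` and `GPUValidationErrorEvent` would be provided by the browser, and the `onvalidationerror` attribute is not modeled here.

```javascript
// Sketch only: a stand-in for the proposed GPUValidationErrorEvent.
class SketchValidationErrorEvent extends Event {
  #message;

  constructor(type, eventInitDict) {
    // `message` is marked `required` in the proposed init dictionary.
    if (!eventInitDict || typeof eventInitDict.message !== 'string') {
      throw new TypeError("The 'message' member is required.");
    }
    super(type, eventInitDict);
    this.#message = eventInitDict.message;
  }

  // Read-only, mirroring `readonly attribute DOMString message`.
  get message() {
    return this.#message;
  }
}

// Hypothetical helper: builds the message (prefixing the debug label when
// one is available) and fires the event at the target.
function fireValidationError(target, message, debugLabel) {
  const fullMessage = debugLabel ? `[${debugLabel}] ${message}` : message;
  target.dispatchEvent(
    new SketchValidationErrorEvent('validationerror', { message: fullMessage })
  );
}

const fakeDevice = new EventTarget(); // stand-in for a GPUDevice
const received = [];
fakeDevice.addEventListener('validationerror', (event) => {
  received.push(event.message);
});

fireValidationError(fakeDevice, 'buffer size exceeds limit', 'vertexBuffer');
fireValidationError(fakeDevice, 'invalid usage flags');
```

Note that `dispatchEvent` runs listeners synchronously, whereas the proposal discusses firing on an event loop; the sketch does not attempt to model that.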

Moreover, in your other PRs, you've included some JavaScript examples, and I really liked them. Doing the same here would be nice as well.

const gpuAdapter = await gpu.requestAdapter({ /* options */ });
const gpuDevice = await gpuAdapter.requestDevice({ /* options */ });

gpuDevice.addEventListener('validationerror', function(event) {
  console.log(event.message);
  // TODO: Push logs to remote server for telemetry.
});
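
The TODO in the listener above could be filled in along these lines. This is a minimal sketch, assuming the application supplies its own upload function: `createTelemetryBuffer` and the `send` callback are hypothetical names, and no particular reporting endpoint or API is implied.

```javascript
// Hypothetical helper: collects messages and hands them to `send` in batches.
// `send` stands in for whatever upload mechanism the application uses.
function createTelemetryBuffer(send, maxBatchSize = 10) {
  let batch = [];

  // Hand the current batch to `send` and start a new one.
  function flush() {
    if (batch.length === 0) return;
    const toSend = batch;
    batch = [];
    send(toSend);
  }

  // Queue one message, flushing once the batch is full.
  function push(message) {
    batch.push(message);
    if (batch.length >= maxBatchSize) flush();
  }

  return { push, flush };
}

// Usage with an in-memory `send`, so the sketch runs without a network.
const sentBatches = [];
const telemetry = createTelemetryBuffer((messages) => sentBatches.push(messages), 2);

telemetry.push('error A');
telemetry.push('error B'); // reaches maxBatchSize and triggers a flush
telemetry.push('error C');
telemetry.flush(); // send whatever is left
```

Inside the `validationerror` listener, the call would simply be `telemetry.push(event.message)`.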

kainino0x · 2019-02-05T19:08:48Z

Applied your changes, thank you for the input! Not 100% certain that all log entries would be validation errors, but did that rename for now.

Can you check the wording for "fires on the main thread event loop"? Not sure how to word it properly.

beaufortfrancois

By the way, did you have a look at https://developers.google.com/web/updates/2018/09/reportingapi?
It may be good to mention it when talking about telemetry.

beaufortfrancois · 2019-02-07T09:29:12Z

design/ErrorHandling.md

-   (An application may request a new device, or choose to fallback to other content.)
- - `"out-of-memory"`: an allocation failed because too much memory was used by the application (CPU or GPU).
-   This includes recoverable out of memory errors that aren't opt-ed in to be handled by the application when the resource was created.
+The `"validationerror"` event always fires on the main thread (Window) event loop.


You may want to use the existing WebGL task source as defined in https://www.khronos.org/registry/webgl/specs/latest/1.0/#5.15 and say something like:

The task source for all tasks queued [HTML] in this section is the WebGL task source.

For info, in Picture-in-Picture, I wrote:

The task source for all the tasks queued in this specification is the media element event task source of the video element in question.

For more, check out https://cs.chromium.org/chromium/src/third_party/blink/public/platform/task_type.h

design/sketch.webidl

beaufortfrancois · 2019-02-07T09:32:42Z

design/ErrorHandling.md

+When there is a validation error in the API (including operations on "invalid" WebGPU objects), an error is logged.
+When a validation error is discovered by the WebGPU implementation, it may fire a `"validationerror"` event on the `GPUDevice`.
+These events should not be used by applications to recover from expected, recoverable errors.
+Instead, the error log may be used for handling unexpected errors in deployment, for bug reporting and telemetry.


Nit: validation errors log

kainino0x · 2019-02-08T19:27:33Z

@beaufortfrancois I have glanced at the Reporting API, but I haven't dug into it enough to understand whether it relates to the kind of reporting we need. Do you know whether we should try to integrate (or merge) this stuff with it?

kainino0x · 2019-02-08T19:41:01Z

Reading a little further, it seems like:

- It could be good to surface these errors through it (though ideally we would surface only one error per app error).
- There is not enough adoption to merge with it, but our error stream can feed into it later on.

grorg · 2019-02-11T19:57:52Z

Discussed at the 11 Feb 2019 WebGPU Meeting

kainino0x · 2019-02-11T20:25:17Z

Merged at the meeting; any further comments can be addressed later.

* fix warnings * Replace usages of constants.ts with `as const`

kainino0x force-pushed the errors2-telemetry branch 3 times, most recently from ace9dfc to 7ab26f4 Compare January 31, 2019 22:25

kenrussell reviewed Feb 1, 2019

kainino0x mentioned this pull request Feb 1, 2019

ErrorHandling.md: tryCreate* #197

Closed

kainino0x force-pushed the errors2-telemetry branch from 7ab26f4 to dd3fa8a Compare February 5, 2019 19:07

kainino0x force-pushed the errors2-telemetry branch from dd3fa8a to dfbd918 Compare February 5, 2019 19:10

beaufortfrancois requested changes Feb 7, 2019

ErrorHandling.md: redo Telemetry with events

47af2ea

kainino0x force-pushed the errors2-telemetry branch from dfbd918 to 47af2ea Compare February 8, 2019 03:37

kdashg approved these changes Feb 11, 2019

kainino0x merged commit 6274491 into gpuweb:master Feb 11, 2019

kainino0x deleted the errors2-telemetry branch February 11, 2019 20:25

JusSn pushed a commit to JusSn/gpuweb that referenced this pull request Mar 26, 2019

ErrorHandling.md: redo Telemetry with events (gpuweb#196)

7bd9498

ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this pull request Sep 6, 2022

updates for gpuweb#181,gpuweb#196,gpuweb#197,gpuweb#198

f165075

ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this pull request Sep 6, 2022

Replace usages of constants.ts with as const (gpuweb#196)

26b45b2
