
Make Sec-CH-UA-Form-Factor a future proof way of selecting content #344

@nielsbasjes

Description

As proposed here by @djmitche (CC: @miketaylr), this is a new issue to discuss my ideas for a possible future-proof design of the Sec-CH-UA-Form-Factor header.

My overall proposal is to make the Sec-CH-UA-Form-Factor header a clear, future-proof indicator of the kind of content and interaction that is suitable for the device/client the user is using at the moment. So in addition to indicating the things we see today, it should also be extensible to future developments (like contact-lens screens, brain interfaces, ...).

From the browser's perspective, this header is its way of telling the website what the best content is that the device can consume. I think it should be present on the first request, which makes it something that should be allowed into the low-entropy group and therefore cannot contain any detail about the device.

So in the proposal below I'm trying to stay at a useful level of detail without going into fingerprinting detail (which is a tricky balance).

Note that this also makes Sec-CH-UA-Mobile effectively an extreme simplification of this header (perhaps, in time, even obsolete).

Essentially a site needs to know from the device:

  • What kind of output/content does the device support?
    • The screen size (watch, phone ... TV ... )
    • Can it do VR? Can it mix the surroundings with the VR content (i.e. does it support VR & AR & MR --> XR)?
    • Is it a technically limited screen (e.g. eInk does not support fast changes like animations)?
  • What kind of input/interaction does the device support?
    • Keyboard, Mouse, Touch, Gesture, ...
    • Perhaps also include kinds of sensors: Camera, Orientation, GPS, ...
  • What kind of attention from the user can you expect?
    • High like in a browser or game
    • Medium like in a TV (a question is asked and then I stand up to get the remote)
    • Low like in a car (I'm driving ...)

More extensively the kinds of values I have in mind:

The Output capabilities/ScreenType/Size indicator

  • None: No screen, Headless, Server-to-Server, etc.
  • Watch: A (usually handheld, usually touch) screen < 2"
  • Phone: A (usually handheld, usually touch) screen between 2" and 7"
  • Tablet: A (usually handheld, usually touch) screen between 7" and 14"
  • Desktop: A (usually movable but not handheld, usually no touch) screen between 15" and 30"
  • TV: A fixed (usually wall mounted, no touch) large screen > 32"
  • VR: A VR Headset that CANNOT mix the images from the outside world in view. So only suitable for VR content
  • XR: A VR Headset that CAN mix the images from the outside world in view. So suitable for all VR/AR/MR content which is commonly called XR.
  • eInk: A slow tablet sized display that only has 2 colors/greyscale and does not support fast changes like animations, movies and games.

NOTE: The above list needs discussion as it still mixes screen size and content capabilities.

The Interaction capabilities indicator (multi valued)

  • None: No screen, Headless, Server-to-Server ... so no human interactions.
  • Keyboard: A keyboard interaction
  • Mouse: A mouse interaction
  • Touch: A touch screen
  • Game: A gamepad-type controller (mini joysticks) as used on PlayStation, Xbox, Nintendo Switch, etc., suitable for fast interactions.
  • Remote: A controller with only arrow keys, OK and Cancel buttons, as used with many TVs and set-top boxes (like the "Google Chromecast with Google TV" and "Apple TV"). Only suitable for slow interactions.
  • Gesture: A device that looks at gestures and motion of the user.
  • Voice: A voice controlled device.
  • Camera: It has a camera
  • Orientation: It has an orientation sensor
  • Location: It has location information (GPS and such)

If multiple interaction capabilities are present, these should always be in alphabetical order to reduce fingerprinting.
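On the sending side, the alphabetical-ordering rule could be enforced with something as simple as this (illustrative only):

```python
def normalize_interactions(values: list[str]) -> list[str]:
    """Deduplicate and sort interaction tokens alphabetically so that
    devices with the same capabilities always send an identical list,
    removing value ordering as a fingerprinting surface."""
    return sorted(set(values))

# Example: regardless of the order the platform reports capabilities in,
# the emitted list is always the same.
normalize_interactions(["Touch", "Camera", "Touch", "Location"])
```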

The Attention level indicator

  • None: No screen, Headless, Server-to-Server ... so no human attention at all.
  • Low: You can expect not to get a response at all from the user, as they have other priorities. Common usage: Car
  • Medium: The user should be able to respond within a minute. Common usage: TV (time to find the remote)
  • High: The user should be able to respond within 1 or 2 seconds. Common usage: Normal Websites, Gaming

Kinds of field values I would expect

| Device            | Screen/Content | Interaction                       | Attention |
|-------------------|----------------|-----------------------------------|-----------|
| Watch             | Watch          | Location;Orientation;Touch        | High      |
| Phone             | Phone          | Camera;Location;Orientation;Touch | High      |
| Tablet            | Tablet         | Camera;Location;Orientation;Touch | High      |
| Amazon Echo       | Tablet         | Touch                             | Medium    |
| PS5               | TV             | Game                              | High      |
| Nintendo Switch   | Phone          | Game                              | High      |
| Tesla             | Tablet         | Location;Touch                    | Low       |
| Google TV         | TV             | Remote                            | Medium    |
| Apple Vision Pro  | XR             | Camera;Gesture;Orientation        | High      |
| PS4 VR Headset    | VR             | Gesture;Orientation               | High      |
| PS5 VR Headset    | XR             | Camera;Gesture;Orientation        | High      |

Note that this may seem to add a lot of entropy, but I think it doesn't, because just about all phones will have the same list here, and it is already known from the other headers that the device is a phone. The same goes for all tablets, all game consoles, etc.

To be discussed: How to fit this into the header.
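Purely as a strawman for that discussion (the header names and syntax below are hypothetical, not part of this proposal), one option would be three separate low-entropy hints encoded as Structured Field lists:

```
Sec-CH-UA-Screen: "Phone"
Sec-CH-UA-Interaction: "Camera", "Location", "Orientation", "Touch"
Sec-CH-UA-Attention: "High"
```

Whether this is three headers or one combined header with parameters is exactly the open question.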
