It's not terribly far off from what pre-rendered or FMV games like Myst are doing.
Cirk2
AMD mobile chips tend to use a "half-generation" refresh of the main GPU architecture. For example, the RDNA3 desktop line is gfx1100, gfx1101, or gfx1102 depending on the model. The new mobile GPUs in the Strix Point CPUs (i.e. the new Framework models) are gfx1150/gfx1151, denoting their RDNA3.5 refresh.
UDNA refers to AMD no longer maintaining two internal architecture lines, RDNA for graphics (Radeon cards) and CDNA for compute (like the MI line of chips), and instead having one architecture for both, labeled UDNA.
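If you want to see which gfx target your own GPU reports, a minimal sketch on a ROCm build of PyTorch (assuming the gcnArchName property is exposed, as it is on recent ROCm builds) looks roughly like this:

    import torch

    # torch.version.hip is None on CUDA builds, so this only means anything on ROCm.
    if torch.version.hip is None or not torch.cuda.is_available():
        raise SystemExit("No ROCm-enabled GPU visible to this PyTorch build")

    props = torch.cuda.get_device_properties(0)
    # On ROCm builds the gfx target (e.g. "gfx1100" or "gfx1151") is reported as gcnArchName.
    print(props.name, getattr(props, "gcnArchName", "gcnArchName not exposed by this build"))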
Yeah I really want to see the numbers on this.
It seems to be an absolute deal-breaker for them, suggesting an actually noticeable delay. So it should be at least one frame (~16.7 ms on a 60 Hz display) of additional latency compared to X11. Especially since there is earlier work using a click-based measurement setup showing no such delay.
The Pi is either emitting a mouse move or detecting one on the physical device, and measuring the time that passes until the photodiode/photoresistor (not an LED) detects the cursor moving away.
It's essentially the same setup you would use to measure input-to-photon latency if you don't have a high-speed camera.
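A minimal sketch of that style of measurement, assuming python-evdev for the synthetic mouse move and RPi.GPIO for the photodiode circuit (the pin number and the simple thresholding are made up for illustration):

    import time
    from evdev import UInput, ecodes
    import RPi.GPIO as GPIO

    SENSOR_PIN = 17  # hypothetical GPIO pin the photodiode circuit is wired to

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(SENSOR_PIN, GPIO.IN)

    # Virtual relative-motion device so the Pi can emit the mouse move itself.
    ui = UInput({ecodes.EV_REL: [ecodes.REL_X, ecodes.REL_Y]}, name="latency-probe")

    baseline = GPIO.input(SENSOR_PIN)   # sensor state while the cursor sits over it
    start = time.monotonic_ns()

    ui.write(ecodes.EV_REL, ecodes.REL_X, 100)  # move the cursor away from the sensor
    ui.syn()

    # Busy-wait until the photodiode no longer sees the bright cursor.
    while GPIO.input(SENSOR_PIN) == baseline:
        pass

    elapsed_ms = (time.monotonic_ns() - start) / 1e6
    print(f"input-to-photon latency: {elapsed_ms:.2f} ms")

    ui.close()
    GPIO.cleanup()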
Edit: Ah no, there's another setup with a Pi Pico in there... Yeah, it's binding an LED lighting up to a button press to have two visual indicators (LED and cursor reaction) you can measure using a high-speed camera. Them only having a 90 fps phone camera means the measurements are only in 11 ms increments though.
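The Pico side of that is basically just mirroring a button onto an LED; a minimal MicroPython sketch, with the pin numbers being pure assumptions:

    from machine import Pin

    # Pin numbers are placeholders; wire them to match the actual board.
    button = Pin(14, Pin.IN, Pin.PULL_UP)   # pressing pulls the pin low
    led = Pin(25, Pin.OUT)                  # on-board LED on the original Pico

    while True:
        # Keep the LED lit while the button is held, so the camera can compare
        # LED-on (physical press) against the cursor reacting on screen.
        led.value(0 if button.value() else 1)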
It's highly dependent on the implementation.
https://www.pugetsystems.com/labs/articles/stable-diffusion-performance-professional-gpus/
The experience on Linux is good (use Docker, otherwise Python is dependency hell), but the basic torch-based implementations (Automatic1111, ComfyUI) have bad performance. I have not managed to get SHARK to run on Linux; the project is very Windows-focused and has no documentation for setup besides "run the installer".
Basically all of the VRAM trickery in torch depends on xformers, which is low-level CUDA code and therefore does not work on AMD. AMD has an ongoing project to port it, but it's currently too incomplete to work.
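For what it's worth, newer torch versions ship a built-in memory-efficient attention that doesn't need xformers; it also runs on ROCm builds, though kernel coverage varies. A minimal sketch (the shapes are arbitrary):

    import torch
    import torch.nn.functional as F

    # Dummy attention inputs: (batch, heads, sequence length, head dim).
    # ROCm devices are also addressed as "cuda" in torch.
    q = torch.randn(1, 8, 4096, 64, device="cuda", dtype=torch.float16)
    k = torch.randn(1, 8, 4096, 64, device="cuda", dtype=torch.float16)
    v = torch.randn(1, 8, 4096, 64, device="cuda", dtype=torch.float16)

    # Built-in fused/memory-efficient attention; picks the best available backend
    # instead of requiring the CUDA-only xformers package.
    out = F.scaled_dot_product_attention(q, k, v)
    print(out.shape)  # torch.Size([1, 8, 4096, 64])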
Searched in the source:
So it's the AirPlay protocol.