Yep. Vulkan is the usual recommendation for cross-vendor setups, most commonly where integrated graphics is in the mix.
I actually had the Ti and XTX variants, so VRAM was 12 + 24 GB = 36 GB total. Vulkan is implemented across vendors, and as a point of reference, Vulkan-based llama.cpp yielded similar (though slightly worse) performance than CUDA on the 3080ti.
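For anyone wanting to reproduce the comparison, this is roughly how you'd build both backends side by side on a recent llama.cpp checkout. The `GGML_CUDA` / `GGML_VULKAN` cmake flags are the current names (older trees used `LLAMA_CUBLAS` / `LLAMA_VULKAN`), and the `build-cuda` / `build-vulkan` directory names are just my convention:

```bash
# Build llama.cpp twice, once per backend, to compare on the same machine.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# CUDA build (NVIDIA cards only)
cmake -B build-cuda -DGGML_CUDA=ON
cmake --build build-cuda --config Release

# Vulkan build (cross-vendor: NVIDIA, AMD, Intel, iGPUs)
cmake -B build-vulkan -DGGML_VULKAN=ON
cmake --build build-vulkan --config Release
```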
I don't have this well documented but, from memory, Llama 3.1 8B at Q4 could reliably get around 110 tok/s on CUDA and around 100 tok/s on Vulkan on the same machine.
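Those numbers are from memory rather than a saved run, but llama.cpp ships a `llama-bench` tool if anyone wants comparable figures; the tg (token generation) t/s column is the number I'm quoting. Model filename below is a placeholder:

```bash
# Run the same GGUF through both builds and compare the tg t/s column.
./build-cuda/bin/llama-bench   -m llama-3.1-8b-instruct-q4_k_m.gguf -n 128 -ngl 99
./build-vulkan/bin/llama-bench -m llama-3.1-8b-instruct-q4_k_m.gguf -n 128 -ngl 99
```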
I used this setup specifically to take advantage of the vastly increased VRAM from having two cards. I was able to run 32B Q4 models that wouldn't fit in the VRAM of either card on its own, and I tracked power and RAM usage with LACT. Performance seemed pretty great compared to a friend of mine running the same models on a 4x 4060 Ti setup using just CUDA.
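The multi-GPU part is just llama.cpp's layer splitting; this is roughly the shape of the command with the Vulkan build (the model filename is illustrative, and `-ts` takes per-device proportions, which I've set to roughly match the two cards' VRAM):

```bash
# Split a 32B Q4 model layer-wise across both cards:
#   -ngl 99    offload (effectively) all layers to GPU
#   -sm layer  assign whole layers to each device
#   -ts 12,24  per-device split, roughly proportional to 12 GB + 24 GB VRAM
./build-vulkan/bin/llama-cli -m some-32b-q4_k_m.gguf -ngl 99 -sm layer -ts 12,24 -p "Hello"
```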
If this is interesting to a lot of people, I could put the setup back together to answer more questions / do a separate post. I took it apart because it physically needed more space than my case could accommodate, and I had the 3080ti literally hanging off a riser.
Alright.... Well played.
For those of you not on TikTok