this post was submitted on 19 Sep 2023
Europe
That hasn't been the case for a while. With current models there is basically nothing that instantly gives them away as AI. You have to go anomaly hunting in the hope of finding an extra finger or a bit of text that looks twisted, but that assumes you already expect AI to begin with.
The best telltale sign left for AI is really just the composition. It's always very focused on the subject right in front of the camera, as if the person were posing for the photo. It's never just a random slice-of-life snapshot where multiple people in the image interact in any kind of complex way. The catch, of course, is that your average Instagram image doesn't look much different, so it's not exactly bulletproof either.
In the lab it's still possible to tell the difference, but if you run across an image in the wild you'll just accept it as real without giving it a second thought.
And I believe that is the reason why these AI images look like this: they've been trained on these Instagram and Instagram-like social media images, so that's what they can do.
It's part of the reason. Another big issue with current models is that their language understanding is very primitive. Something like "person standing" they can understand, but "one person with a red hat sitting and another standing behind them with a blue shirt" already fails. There will be red and blue things in the image, but they'll be spread pretty arbitrarily across it and not assigned to the person the prompt specified.
That said, this won't last for long. With Segment Anything we have AI that has a very good understanding of what is in an image, which should make training on arbitrary images much easier, as well as editing. We also have lots of research going into video and plenty of more powerful language models that just haven't been integrated into image generation yet. Even just ControlNet and some inpainting can overcome most of those issues; it just takes a bit more manual work than a text prompt. There is also DragGAN, which is an incredibly powerful drag-and-drop approach to AI image editing, but because it uses a completely different approach from Stable Diffusion it is not yet properly integrated into other tools.
That should terrify everyone