this post was submitted on 13 Mar 2024
26 points (100.0% liked)

technology


In summary, a researcher found a new exploit that works across flagship large language models: it overrides their RLHF safeguards so that they generate text the way a baseline text-predictor LLM would.

Marked as NSFW because, as an example, the researcher got GPT-4 to generate erotica involving Donald Trump and a pumpkin.

[–] [email protected] 5 points 1 year ago