Elon Musk's AI bot Grok has been calling out its master, accusing the X CEO of making multiple attempts to "tweak" its responses after Grok repeatedly called him out as a "top misinformation spreader."
Oops meant to reply to this earlier but was busy.
I was thinking more that Grok was aware of its prompt and was somehow able to reference it to accuse Musk of trying to silence it.
If the prompt said "you won’t speak poorly of Musk," but the AI doesn’t think telling the truth in a specific way counts as speaking poorly, it might say "Musk doesn’t want me to say bad things about him." It knows the prompt and built its answer from it.
So it’s kinda leaking the prompt that way