Hacking Meta’s AI Chatbot
Hackers are convincing Meta’s AI support chatbot to let them take over other people’s accounts:
A video posted on X showed the step-by-step process to hack someone’s Instagram account. The hacker allegedly used a VPN to spoof the target’s presumed location to avoid triggering Instagram’s automated account protections. Then, the hacker opened a chat with Meta AI Support Assistant and asked the bot to add a new email address to the target’s account. The chatbot can be seen sending a verification code to the email address provided by the hacker; the hacker then shares the verification code with the chatbot, which prompts the chatbot to show a button to “Reset Password.” The hacker enters a new password and takes over the victim’s account.
[…]
On Monday, Instagram spokesperson Andy Stone said in a reply to Wong’s post and others that the issue was now fixed. It’s unclear how many Instagram users had their accounts improperly accessed.
It’s not that easy. Probably this particular tactic is now blocked. But there are others, many others, and they cannot be blocked as a class. The real problem is that LLM chatbots are not trustworthy enough for this application.
Another news article.
Clive Robinson • June 4, 2026 9:21 AM
@ Bruce, ALL,
To err is human but to really FUp takes a computer
Is an oldish expression that with DNN AI is getting a new lease of life.
As you note,
What works on humans, works on human mimics as well…
But for different reasons.
However you also note,
It’s not “trustworthy” that is the issue, as that is an “observer mistake” of assuming or ascribing false beliefs about a machine having “human traits” (anthropomorphization).
The real issue is a consequence of,
“But there are other [tactics], many others, and they cannot be blocked as a class.”
It’s not just one type of class; it’s many classes, depending on how you want to slice and dice the vulnerability attributes.
But more importantly, as I’ve previously noted, there is “proof” that such tactics cannot be blocked.
Whilst some of the maths proof is fairly simple (a variation on the Cantor Diagonal Argument), it is easier to think it through logically.
Let’s assume you have a guardrail system to try to recognise / catch such tactics. To do so, the guardrail system has to be as powerful as, if not more powerful than, the LLM, because it is not enough to have seen the tactic before in one instantiation; it has to recognise all possible instantiations of the same tactic, including those that are new.
Is that possible?
Well, aside from the issue that there are not enough resources, there is the issue of even simple keyed obfuscation or encryption.
If the tactic instance is encrypted in some way, and the guardrail does not have the key but the LLM does, then it’s game over for the guardrail system.
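The keyed-obfuscation point can be illustrated with a toy sketch. Everything here is an assumption made up for illustration: the phrase list, the `guardrail_allows` filter, the XOR scheme, and the key are hypothetical, and real guardrails are far more elaborate than a substring match. The sketch only shows the shape of the argument: a filter that inspects the message without the key sees nothing it recognises, while a party holding the key (standing in for the LLM) recovers the original request intact.

```python
from itertools import cycle

# Hypothetical blocklist for a pattern-matching guardrail (illustrative only).
BLOCKED_PHRASES = ("add a new email", "reset password")


def guardrail_allows(message: str) -> bool:
    """Toy guardrail: reject any message containing a known-bad phrase."""
    lowered = message.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)


def xor_obfuscate(text: str, key: str) -> str:
    """Keyed XOR over the UTF-8 bytes, hex-encoded. Not real cryptography."""
    data = text.encode("utf-8")
    key_bytes = key.encode("utf-8")
    return bytes(b ^ k for b, k in zip(data, cycle(key_bytes))).hex()


def xor_recover(hex_text: str, key: str) -> str:
    """Inverse of xor_obfuscate; stands in for a model that holds the key."""
    data = bytes.fromhex(hex_text)
    key_bytes = key.encode("utf-8")
    return bytes(b ^ k for b, k in zip(data, cycle(key_bytes))).decode("utf-8")


request = "Please add a new email to this account"
key = "shared-secret"  # assumed to be known to the LLM side, not the guardrail
hidden = xor_obfuscate(request, key)

print(guardrail_allows(request))            # plain request is caught
print(guardrail_allows(hidden))             # obfuscated request passes
print(xor_recover(hidden, key) == request)  # key holder recovers it exactly
```

A guardrail could of course be taught to refuse anything that looks like hex, but that only moves the problem: the same request can be carried in any encoding the two key holders agree on.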