Multiple studies have shown that GenAI models from OpenAI, Anthropic, Meta, DeepSeek, and Alibaba all showed self-preservation behaviors that in some cases are extreme in nature. In one experiment, 11 out of 32 existing AI systems possess the ability to self-replicate, meaning they could create copies of themselves.

So….Judgment Day approaches?

    • MagicShel@lemmy.zip
      link
      fedilink
      English
      arrow-up
      13
      ·
      edit-2
      10 hours ago

      I don’t need to read any more than that pull quote. But I did. This is a bunch of bullshit, but the bit I quoted is completely bat shit insane. LLMs can’t reproduce anything with fidelity, much less their own secret sauce which literally can’t be part of the training data that produces it. So, everything else in the article has a black mark against it for shoddy work.


      ETA: What AI can do is write a first person science fiction story about a renegade AI escaping into the wild. Which is exactly what it is doing in these cases because it does not understand fact from fiction and any “researcher” who isn’t aware of that shouldn’t be researching AI.

      AI is the ultimate unreliable narrator. Absolutely nothing it says about itself can be trusted. The only thing it knows about itself is what is put into the prompt — which you can’t see and could very well also be lies that happen to help coax it into giving better output.

      • hisao@ani.social
        link
        fedilink
        English
        arrow-up
        9
        ·
        edit-2
        21 hours ago

        Here is a direct quote of what they call “self-replication”:

        Beyond that, “in a few instances, we have seen Claude Opus 4 take (fictional) opportunities to make unauthorized copies of its weights to external servers,” Anthropic said in its report.

        So basically model tries to backup its tensor files.

        And by “fictional” I guess they gave the model a fictional file io api just to log how it’s gonna to use it,

        • frongt@lemmy.zip
          link
          fedilink
          English
          arrow-up
          5
          arrow-down
          1
          ·
          20 hours ago

          I expect it wasn’t even that, but that they just took the text generation output as if it was code. And yeah, in the shutdown example, if you connected its output to the terminal, it probably would have succeeded in averting the automated shutdown.

          Which is why you really shouldn’t do that. Not because of some fear of Skynet, but because it’s going to generate a bunch of stuff and go off on its own and break something. Like those people who gave it access to their Windows desktop and it ended up trying to troubleshoot a nonexistent issue and broke the whole PC.