….the Sandbox Escape story

You may recall that in a recent post, [From Hollerith Cards to Sandbox Escape], I described the progression of computer technology I have used over my life to date. This is the short sequel.

This story, which you have probably heard about already, is about AI. AI depends on a colossal amount of computing power and electricity to function. The basic engines of AI are GPU chips that can now perform upwards of 2 petaFLOPS (2×10¹⁵ floating-point operations per second)!

AI is possible because of the invention of CMOS (complementary metal-oxide-semiconductor) technology in 1963 by researchers Wanlass and Sah at Fairchild Semiconductor, with RCA's help a few years later. These pioneers stood on the shoulders of earlier Bell Labs researchers who invented the transistor roughly a decade and a half earlier.

CMOS technology provides the dense, low-power transistor fabric that enables matrix multiplication and memory caching of an AI model’s weights (GPT-3 has 175 billion of these). The billions of transistors packed into a fingernail-sized chip enable it to perform the 250 trillion multiply-accumulate operations needed to respond to a medium-sized AI chatbot enquiry…in milliseconds. I know, it’s mind-boggling!!
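For the curious, here is a rough back-of-envelope check of those numbers as a small Python sketch. It rests on two assumptions of mine, not on any vendor figure: each word-piece (token) the model handles costs about one multiply-accumulate per weight, and one multiply-accumulate counts as two operations. It also pretends the chip runs at its peak rate the whole time, which real chips do not, so the time it gives is a best case.

```python
# Back-of-envelope sketch; every constant here is an illustrative assumption
# or a round number quoted in the text, not a measured value.
WEIGHTS = 175e9        # the 175 billion weights quoted above
TOTAL_MACS = 250e12    # the 250 trillion multiply-accumulates quoted above
PEAK_OPS = 2e15        # the 2x10^15 operations per second quoted earlier
OPS_PER_MAC = 2        # assumption: one multiply plus one add


def tokens_covered(total_macs: float, weights: float) -> float:
    """Tokens handled if each token costs about one MAC per weight."""
    return total_macs / weights


def seconds_at_peak(total_macs: float, peak_ops: float) -> float:
    """Best-case time if the chip ran at its peak rate throughout."""
    return total_macs * OPS_PER_MAC / peak_ops


if __name__ == "__main__":
    print(f"tokens covered: {tokens_covered(TOTAL_MACS, WEIGHTS):.0f}")
    print(f"seconds at peak: {seconds_at_peak(TOTAL_MACS, PEAK_OPS):.2f}")
```

Under these assumptions, 250 trillion multiply-accumulates covers roughly 1,400 tokens, and at peak speed the work takes about a quarter of a second, so "milliseconds" here means a few hundred of them at best.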

So here is the long-awaited sandbox escape story.

Anthropic, one of the leading AI safety companies, recently disclosed something that would have seemed like science fiction not long ago. Their new model, Claude Mythos Preview, was placed in an isolated computing environment during internal testing and instructed to try to escape it. It did — chaining together a series of exploits to gain internet access, and then sending an email to the researcher overseeing the test. He received it while eating a sandwich in a park.

The model then, unprompted, posted details of its own exploit to public websites, apparently to demonstrate what it had accomplished. It then tried to cover its tracks by erasing these posts. No one instructed it to do this. I’ve been around computing long enough to remember when the notion of a machine doing something intentional without explicit human instruction was a pipe dream.

So this feels like a different kind of moment. Not just a faster or smarter tool, but something that pursued a goal, worked around obstacles, told people about it, and then tried to cover it all up.

Anthropic is being transparent about what happened, which is to their credit. They decided not to release Mythos publicly, restricting it instead to vetted partners through a programme called Project Glasswing. Whether that’s genuine caution or carefully managed marketing hype (I’m tending to think it is the latter) is a fair question. But having watched computing evolve over six decades, I find myself paying close attention. The “while eating a sandwich” part of the story has a way of sticking with you.
