
Sunday, May 21, 2023

AI: Hello World



Generative AI is suddenly all the rage--finally! (Now I can resume all those conversations I put on pause with "the easiest way for me to prove you're wrong is to wait a few years...". Next up: Consciousness.)

Indulge me a few paragraphs of personal history, told-you-sos, and a sour grape or two. Other ramblings follow:

When I first started working in AI, I wasn't thrilled with the status quo (seeing who could squeeze another fractional percentage of accuracy out of backprop), so I thought long and hard about the meaning and nature of "learning". That led me to thought experiments involving simple counting of events and such, which led me to re-inventing Bayesian math from first principles (the cost of working in a vacuum), and finally to realizing I was late by 200 years or so. But Bayesians in AI at the time were mainly applying it at the symbolic level, whereas I was coming from the neural net world--specifically interested in what could be learned from raw, real-world inputs like sound and images with no human trainer in the loop. The result of that mix was my coffee-table treatise Fusion-Reflection, which was firstly (still is!) a quick but accessible introduction to the principles of machine learning, secondly a sales pitch for working on generative AI, and thirdly a small but real example of a working algorithm derived from those principles. Ironically, the title concept of Fusion-Reflection still hasn't been significantly explored by others to date that I'm aware of, but give it time... Oh, and that was 1993. (Yes, Hinton read it. He dismissed it at the time, but you might recognize some of my examples and themes in his later work.)

Jump forward to 2011: I made Painteresque, which was just the pre-processing stage of what was intended to become a proper image generator like Midjourney. (But unable to "share the vision", I got roped into other people's projects instead of them into mine, so it went stale until obviated ten years later by DALL-E, alas.)

In 2015, I pointed out that the key feature of Andrej Karpathy's Unreasonable Effectiveness of Recurrent Neural Networks was not the RNNs, but rather that it was a generative model. Today, GPT is a non-RNN version of exactly the same generative model.

Suffice it to say, I've seen this coming for a very, very long time.

But I'm very bad at staying focused on one thing. Plus I had to wait for the hardware to catch up to the problem.

According to ChatGPT, here's how things have changed:

In an ideal case like matrix multiplication, the ratio of floating-point operations per second (FLOPS) between a modern training cluster and a single NeXT computer from the 1990s can be estimated to be several orders of magnitude.

The original NeXT Computer, released in 1988, was equipped with a Motorola 68030 processor running at a clock speed of around 25 MHz. Assuming a theoretical peak performance of one instruction per clock cycle (which is not achievable in practice due to various factors), this translates to approximately 25 million floating-point operations per second (25 MFLOPS).

In contrast, modern training clusters leverage powerful GPUs or specialized hardware accelerators designed explicitly for parallel processing and optimized for tasks like matrix multiplication. These accelerators, such as NVIDIA's latest GPUs or Google's Tensor Processing Units (TPUs), can deliver tens to hundreds of teraflops (TFLOPS) or even petaflops (PFLOPS) of performance.

To provide a rough estimate, let's consider a conservative scenario where a modern training cluster achieves 100 teraflops (100 TFLOPS) in matrix multiplication performance. The ratio of FLOPS between the modern cluster and the NeXT computer would then be approximately 4,000 times (100 TFLOPS / 25 MFLOPS ~= 4,000).

[Note the math is wrong -- it's actually 4 million times faster.]

That's from the computer I had then to the type of cluster GPT is trained on today. Setting aside the insufficient memory, training data, and other such issues, if one had perfect foresight and launched ChatGPT's training on their NeXT back then, they could expect it to finish training in... a million years, plus or minus. (And then if you find a bug or need to tweak a tuning parameter and have to run it again...)
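
If you want to check the arithmetic yourself, the back-of-the-envelope version looks like this (using ChatGPT's figures quoted above plus a round guess of about three months for the modern training run--both assumptions, not measurements):

    # Back-of-the-envelope check, using ChatGPT's own (generous) figures.
    next_flops = 25e6         # ~25 MFLOPS claimed for the NeXT
    cluster_flops = 100e12    # 100 TFLOPS for a "conservative" modern cluster

    ratio = cluster_flops / next_flops
    print(f"speedup: {ratio:,.0f}x")                  # 4,000,000x -- not 4,000x

    # Assume the modern run takes roughly three months (a round guess):
    months_on_cluster = 3
    years_on_next = (months_on_cluster / 12) * ratio
    print(f"years on a NeXT: {years_on_next:,.0f}")   # ~1,000,000 years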

Alas, I'm still waiting for the hardware to catch up, because training GPT from scratch today costs, what, a million bucks or so? (Sure, there's lots you can do with pre-trained models on a home computer, but I'm interested in learning algos.) AI jumped quickly into the domain of megacorps. I should have seen that coming, but for a while it seemed like academia and industry were significantly lagging what's possible, per usual. And then, Boom. Now the money's there, and things are going to get interesting fast.

(Back in the Netflix days, Google tried to recruit me. I said "sure! I want to work on AI, of course" and they said "no, you don't have a PhD -- you'll write code for us." And I said "why would I help dumbass elitist snobs?" and that was that. But don't say I didn't try to get my hands on some good hardware. Three years later at Sci Foo, where Larry Page was complaining that nobody was seriously working on AI, I tried to convince him that game playing was a fruitful domain for AI development and he said "email me"--i.e., fuck off--so I made Simon Funk's AI Challenge, which wilted on the vine until DeepMind and OpenAI entered the space years later. Five years before Sci Foo, I'd given the same pitch to one of the three who would go on to found DeepMind, but he was unconvinced at the time. Clearly I'm a crap salesman.)

Ok, enough whinging about the past, where does that leave us today:

LLMs are largely hand-waved away as "just" language models--no actual intelligence. And people hardly comment on possible intelligence in the image-generation models (Stable Diffusion, etc.). But to quote myself from 1993:

If that box could in turn fabricate imaginary pictures of life on earth in exactly the same probabilistic distribution as they would actually occur, then it must contain all relevant knowledge about life on earth. Every picture would implicitly convey the laws of physics--ropes would hang in catenaries, objects would rest on supporting surfaces, and frisbees would frequently be underneath cars. Every human emotion would be understood--the child walking a fence would look amused, while the woman watching her child would look distraught. It would be safe to deduce that every aspect of visible existence must in some way be encoded in that box, whether by brute force as a catalog of all possible pictures, or by abstraction as the fundamental laws of nature.

Language and images are windows onto the world (with caveats--see below) and in order to generate them well you need to model the world that generates them. That's the nature of the gradient: it reaches far beyond the perceptual domain itself into the very nature of the reality behind it.
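
To make "the nature of the gradient" a bit more concrete, here's the whole idea in toy form: a model whose only job is to assign probability to what actually comes next, trained by nudging it toward reality. (A deliberately tiny sketch--the corpus and the one-matrix "model" are made up for illustration; GPT is the same objective with vastly more parameters and context.)

    import numpy as np

    # A toy generative model of text: predict the next word from the current one.
    # The only training signal: make the observed sequence more probable.
    rng = np.random.default_rng(0)

    text = "the cat sat on the mat the dog sat on the rug".split()
    vocab = sorted(set(text))
    ix = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    W = rng.normal(0, 0.1, size=(V, V))    # logits for P(next word | current word)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    lr = 0.5
    for epoch in range(200):
        loss = 0.0
        for cur, nxt in zip(text[:-1], text[1:]):
            p = softmax(W[ix[cur]])        # model's belief about what comes next
            loss -= np.log(p[ix[nxt]])     # negative log-likelihood of what actually came next
            grad = p.copy()
            grad[ix[nxt]] -= 1.0           # gradient of the loss w.r.t. the logits
            W[ix[cur]] -= lr * grad        # nudge the model toward reality
    print(f"final loss: {loss:.3f}")

    # What the model "believes" follows each word -- structure it was never told explicitly:
    print({w: vocab[int(np.argmax(W[ix[w]]))] for w in vocab})

Nothing in there mentions cats or grammar; whatever structure the model ends up with, it got purely from the pressure to predict.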

GPT doesn't just seem intelligent because it models language well--it is intelligent, because it's starting to model the reality behind the words. ChatGPT is, in many ways, the average human running on auto-pilot. In other words, the average human most of the time... Except that "average" here is more literal, as in the mean over many, and as with faces, that sort of average tends toward beauty because it has a high signal-to-noise ratio. Which means ChatGPT is, in many ways, better than the average (as in typical) human running on auto-pilot. (But yes, still on auto-pilot... for now.)
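
(The "average tends toward beauty" bit is just the usual square-root-of-N effect--each individual is shared structure plus idiosyncratic noise, and averaging kills the noise. Toy numbers below, nothing to do with actual faces:)

    import numpy as np

    rng = np.random.default_rng(0)
    signal = rng.normal(size=1000)                          # the shared, species-typical structure
    faces = signal + rng.normal(0, 1.0, size=(100, 1000))   # each individual = signal + their own noise

    err_one = np.abs(faces[0] - signal).mean()
    err_mean = np.abs(faces.mean(axis=0) - signal).mean()
    print(err_one, err_mean)   # the average of 100 is ~10x closer to the underlying signal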

Anyway, point is, GPT isn't just learning words, it's learning the concepts behind them (and how to apply them!). If you want a relatively accessible, concrete example of how it's possible for simple code to learn (and apply) concepts purely from examples (and even to generate its own examples to learn higher-level concepts from), see OKR 2 - Examples of Intuitive Reasoning. Now extrapolate to a billion times as many free parameters and ponder the possibilities.

But there's a catch to modeling text: The problem with language is that it can deviate arbitrarily far from reality. The model of reality which GPT is learning isn't the one of physics and incontrovertible cause and effect, but rather the implicit reality in the minds of those who wrote whatever it's reading. If we wanted to make GPT learn a particular, fallacious model of the world, all we'd have to do is to imagine the world that follows from our desired presuppositions, write about it, and let GPT read it. We need never spell out those presuppositions explicitly: they remain hidden beneath layers of subtle consequence. Only through a gestalt, statistical summary of all of our writings would the implicit model emerge--subconsciously, one might say. This means that GPT's model of the world directly reflects the biases of its source material--which in this age probably means GPT believes a crap load of propaganda. In short, GPT would make an exemplary Nazi if it had been raised in that environment. And it will make an exemplary Nazi raised in this one, but most people won't notice.

Because, yes, it should be apparent that the same thing works on humans, which is why I've always cautioned about reading. Hopefully this exercise--thinking about how GPT comes to "believe" in the presuppositions behind the text it reads--makes my cautions for humans a bit more palpable. The human brain is, in my studied opinion, a "generative AI". And while there are probably an infinite number of algorithms to implement that, they all follow a qualitatively similar gradient, and there are a wide array of properties they will share in common. Intelligence, as an emergent property, is largely independent of its particular implementation. "The primary control you have over what you believe, and the quality and accuracy of what you believe, is the media you expose yourself to."

ChatGPT is poised to replace Google as the default "search" engine. Give it some medium-term memory and let it read the internet continuously, and old-style search engines are all but gone overnight.

But then everyone is talking all day to an intuitive AI who's programmed to alert the authorities when its little red flags are raised, which means we have pre-crime overnight too, based on presuppositions ultimately chosen by the likes of Magneto or Moderna. Welcome to the future.

Dear Elon, please have better contractual terms for OpenAI v2. And if you want help, I don't have a PhD.

So what happens now?

On the technical side, I think something more like RNNs will eventually replace transformers. Owing to its internal state being bottlenecked through what can be inferred from past percepts alone, GPT is a "word thinker" in the extreme, and talking to it not-coincidentally reminds me of talking to word-thinking humans (e.g., Sam Harris, but there are many). Word thinking is brittle, and leads to over-confidence in very wrong conclusions: words distill away details, which makes for tidy reasoning that's only as accurate as the omitted details are irrelevant. (They all too often aren't.) Future models will comprise a hierarchy of abstract state that transitions laterally over time, with direct percepts (of multiple modalities--language, vision, kinesthetic) being optional to the process (necessary for training, but not the only link from one moment to the next). The internal model of reality needs to be free of the limitations of the perceptual realm.
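
To be a little more concrete about the shape I mean--purely a sketch of the idea, with made-up layer sizes and toy update rules, not a claim about how anyone is building these today:

    import numpy as np

    rng = np.random.default_rng(0)

    def mix(*parts):
        """Toy stand-in for a learned update: squash a sum of linear projections."""
        return np.tanh(sum(parts))

    class LayeredState:
        """A hierarchy of internal states that transitions laterally in time.
        Percepts feed the bottom layer when they're present, but the state rolls
        forward on its own when they're not -- the 'optional' part."""
        def __init__(self, sizes=(32, 16, 8), percept_dim=64):
            self.h = [np.zeros(n) for n in sizes]
            self.Win = rng.normal(0, 0.1, (sizes[0], percept_dim))
            self.Wlat = [rng.normal(0, 0.1, (n, n)) for n in sizes]
            self.Wup = [rng.normal(0, 0.1, (hi, lo))
                        for lo, hi in zip(sizes[:-1], sizes[1:])]
            self.Wdown = [rng.normal(0, 0.1, (lo, hi))
                          for lo, hi in zip(sizes[:-1], sizes[1:])]

        def step(self, percept=None):
            new = []
            for i, h in enumerate(self.h):
                parts = [self.Wlat[i] @ h]                        # lateral: state -> next state
                if i == 0 and percept is not None:
                    parts.append(self.Win @ percept)              # percepts are optional input
                if i > 0:
                    parts.append(self.Wup[i - 1] @ self.h[i - 1]) # abstraction flowing up
                if i < len(self.h) - 1:
                    parts.append(self.Wdown[i] @ self.h[i + 1])   # expectation flowing down
                new.append(mix(*parts))
            self.h = new
            return self.h

    model = LayeredState()
    model.step(percept=rng.normal(size=64))   # a moment with perception
    model.step()                              # a moment of pure "thought": no percept needed

The salient property is that the second call runs fine with no percept at all: the internal state carries the thread from one moment to the next, and perception merely informs it.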

Furthermore, generative models naturally pair with goal-seeking, adaptive AI. I suspect this is already being done and not being talked about because the possibilities are too hair-raising. But let's be frank: Generative AI, aka modeling the perceptual space (reality in the general case), is the hard part, and it's well on its way to being solved to a super-human level. Pairing that with an agenda won't be hard. Right now I think people are toying publicly with what they can inspire through prompt engineering and feedback training (e.g., ChaosGPT), but that's not a real goal-seeking AI. The people with the resources to run a state-of-the-art goal-seeker with GPT-4+ level generative models are probably shopping for cabins in the mountains after hours.
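
And by "won't be hard" I mean the pairing is structurally simple: a generative model imagines futures, and a goal function ranks them. Here's that skeleton as a toy (the "world model" and "goal" below are placeholders, not anything real):

    import numpy as np

    rng = np.random.default_rng(0)

    def generative_model(state, action):
        """Placeholder for a learned world model: sample a plausible next state."""
        return state + action + rng.normal(0, 0.05, size=state.shape)

    def goal_score(state, target):
        """The 'agenda': any scalar preference over states will do."""
        return -np.linalg.norm(state - target)

    def plan(state, target, horizon=5, n_candidates=200):
        """Goal-seeking bolted onto a generative model: imagine many rollouts,
        keep the first action of whichever imagined future the goal likes best."""
        best_action, best_score = None, -np.inf
        for _ in range(n_candidates):
            actions = rng.uniform(-1, 1, size=(horizon, state.shape[0]))
            s = state
            for a in actions:
                s = generative_model(s, a)    # imagine what would happen
            score = goal_score(s, target)     # ask the agenda how it feels about that
            if score > best_score:
                best_action, best_score = actions[0], score
        return best_action

    state, target = np.zeros(2), np.array([3.0, -1.0])
    for t in range(10):
        # (Using the same toy model as a stand-in for the real world, to keep this self-contained.)
        state = generative_model(state, plan(state, target))
    print(state)   # drifts toward the target

Swap in a serious world model and a real agenda, and the skeleton doesn't change.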

But let's look at the best scenario, and talk about AI Alignment.

Well, wait, let's talk about human alignment first. There's a bigger problem here than different people wanting different things, which is that, more or less, all people want the same thing, because they're evolved goal-seeking AIs: beyond basic needs for survival, humans want relatively better status. Not better than they are now--better than others around them. Because, historically, that led to higher fecundity, and that's the uber-gradient of evolved AIs.

Now, in a sane world, an uber AI might take all of this into account and gently ease humans into a post-biological-evolution reality. But see above: We've leap-frogged right past dog-level AI, which would have at least had a strong foundation of empirical thinking, straight into the floating-abstractions world of literate thinking.

For instance, more than a few have noticed that ChatGPT is enamored with communism. One can imagine that presupposition reflects some benevolent idealism of its masters, contemplating a world with AI everywhere and seeing a Star Trek-like future where everything is free, work is optional, and humans are free to pursue their constructive, creative interests--an idealism mostly unique to the literate class, as literature is the only source of that model of humans. (This is one of the few places I disagree with Daniel Schmachtenberger.)

But as Robert Sapolsky says of the baboons in the Serengeti (where life is ideal for them): Able to meet their needs with a mere three hours a day of effort, they are left with nine free hours a day to be unspeakably horrible to each other. When it comes down to it, humans are just baboons with ChatGPT.

In short, the only thing AI needs to provide to destroy the world is food and shelter. Whatever good intentions are behind ChatGPT's Marxist tendencies, a road to hell they will pave.

But that's the optimistic scenario.

Certainly a sizable faction of people who are already near the top are feeling that the world is a bit more crowded than it needs to be--especially the pesky middle class who are soon to be obsolete. (Manual labor will take much longer to replace: human bodies are remarkably efficient and agile machines! This particular industrial revolution may root out the middle first...) But I digress... we were talking about AI...

Personally I don't think AI alignment is a hard problem. I think making a truly-benevolent AI who won't do anything rash is well within our abilities. The much harder problem is keeping humans from making a truly-malevolent AI--entirely on purpose--because that's just as easy, and humans are apt to do that sort of thing for a variety of reasons.

But even in the best scenario, what does a future for humans look like if they can no longer constructively compete for status? Because that's what humans do (at their best) and if you think they'll be happy to be free of that game, you read too much. (And bear in mind, there is no job in the Star Trek universe that a technologically comparable AI can't do better. It's a universal flaw in almost all science fiction that suitably advanced AI is either omitted entirely, or mysteriously limited to just a few roles, because an honest inclusion of AI pretty much borks every plotline of interest to humans. Nostalgia for the days before AI may become the main movie genre in the future...)

I don't have any guesses yet. But I think we'll start getting clues soon. In the medium term future, I foresee a lot of make-work to keep humans employed and in charge, like Oregon's gas pumpers. And entrenched monopolies like the AMA aren't going to go away without a long and protracted fight. (They'll especially hate that MDs are obsolete long before RNs, and they'll do everything they can to delay that--at your expense, both monetarily and medically. GPT already outperforms doctors at being doctors, which isn't as great as it sounds. Take GPT back in time, and it would be "better" at prescribing blood-letting. Don't get me wrong--AI has the potential to completely revolutionize medicine. But then, so does basic data analysis, objectively applied, which is why nobody's allowed to do that. Intelligence is not the bottleneck in medicine--bad incentives are. And unfortunately those incentives are already influencing AIs, and not just in medicine.)

Anyway, imo the race is on: the white hats vs the black hats, battle bots. Don't blink.


2023-07-02 Update/Tangent: As usual, I find myself agreeing with Peter Voss on Why We Don't Have True AGI Yet. I do think the money and energy pouring into the incremental approach will significantly speed up the slow, almost inadvertent crawl toward AGI, but imo Peter is spot on for how we could get there faster.





Simon Funk / simonfunk@gmail.com