Why Wondering If We Are In An AI/ML Bubble Is Missing The Point
If you're sitting here complaining about the ROI on compute, boy, are you missing the big things happening in front of you
It's 1:16 PM. I've made my way to a cafe in Cambridge, quietly thinking I'm a big-brained individual for reading AI/ML-focused technical papers with my afternoon quiche. Then I realize that most of the authors of these papers have their offices a mere half mile from me, which pushes me deep into another coffee to see if I can get my mind to run a bit faster, my brain suddenly feeling a lot less big :(
As "Someday" by the Strokes bumps through my headphones and Quiche through my veins, I realize I have a few takes I'd like to share with the community, well, one in particular. Recently, a metric ton of articles, reports, and fancy Word docs came out regarding CapEx for Chips and how the ROI for said spend necessitates $600B in rev. As a result, many are saying there is an AI bubble. Let me be clear: there is a disconnection between historical revenue multiples and post-money valuations for AI/ML and Deep Learning companies. That being said, just because valuations can be wonky, that idea should be separate from the value of the underlying technology.
The transformer architecture and other attention-based mechanisms have created a fundamental shift in what can be quantified. What can be quantified can eventually be predicted. Prediction is the core of generation; generation and prediction combined are the basis for automation. (In another article, I'll write about why all the VCs love robotics now.)
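To make that quantify-then-predict chain concrete, here's a minimal sketch (my example, not anything from a specific paper) using the open-source Hugging Face transformers library: a masked-language model assigns probabilities to words that could fill a blank, and that predict-the-blank machinery is exactly what generation is built on. The model choice and prompt are illustrative assumptions.

```python
from transformers import pipeline

# Prediction as the core of generation: a masked-language model scores
# candidate words for a blank. (Model and prompt are illustrative
# assumptions, not from the article.)
fill = pipeline("fill-mask", model="bert-base-uncased")

for guess in fill("The quiche at this cafe is absolutely [MASK]."):
    print(guess["token_str"], round(guess["score"], 3))
```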
I want to reiterate the first point because it is the most crucial event in human history, though a close second is the invention of the HostelWorld app. The transformer and other attention-based deep learning architectures allow sound, images, video, text, etc., to be quantified, i.e., represented as numbers (an encoder maps each input to a vector representation, an embedding). Sometimes video and other non-text inputs are a bit more complicated, but let's let it ride for a minute.
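To see that quantification in action, here's a tiny sketch using the open-source sentence-transformers library (the model name is a common default I've picked for illustration, not one the argument depends on): a sentence goes in, a fixed-length vector of numbers comes out.

```python
from sentence_transformers import SentenceTransformer

# An encoder turns text into a vector of numbers (an embedding).
# Model choice is an illustrative assumption.
model = SentenceTransformer("all-MiniLM-L6-v2")

embedding = model.encode("Someday, everything qualitative becomes quantitative.")
print(embedding.shape)   # (384,) -- one 384-dimensional vector for the sentence
print(embedding[:5])     # the first few numbers that now "stand for" the sentence
```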
The point here is that before the transformer, we had only crude ways to turn words, sounds, videos, and images into numbers (think word counts or early embeddings like word2vec), let alone ways to do efficient computation on those numbers at scale (a transformer has a lot in it, but let's sit with the simple idea that words can become numbers). The ability to compute over words, sounds, videos, and images allows for a world in which we can make quantitative predictions about what used to be qualitative-only material. You can now predict the rest of a paragraph, a blank spot in a photo, and the logical ending of a movie. Researchers are working on doing the same with smells and thoughts (via MRI and EEG), and a few folks are working on quantifying emotions through electrical signals emitted from the human body (for all my woo-woo people out there, you might just get evidence that you've been right all along). All to say, this is quite simply not a bubble; it's the beginning of a world in which quantitative and qualitative have merged into one, a world where we can predict and generate all previously qualitative modalities in a quantitative manner.
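And once things are numbers, prediction falls out. Here's a minimal sketch of "predict the rest of a paragraph" using GPT-2 through the same transformers pipeline (the prompt and generation settings are my illustrative assumptions):

```python
from transformers import pipeline

# Computation over quantified text: an autoregressive model predicts
# the most likely continuation, token by token. (Prompt and settings
# are illustrative assumptions.)
generator = pipeline("text-generation", model="gpt2")

result = generator(
    "Before the transformer, qualitative data was hard to compute over. Now,",
    max_new_tokens=30,
)
print(result[0]["generated_text"])
```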
Today, we are in the early days of that qualitative-quantitative merge, like when the iPhone came out and the #1 app was the beer-drinking app; that's how early we are now. VCs can say that the chips are expensive, that the products are making less money than they wanted early on, and so on. That's just a short-sighted view, because we are in the middle of a fundamental shift where what could not be computed can now be read by advanced ML models and used for computation; soon, there will be no difference between the training data and the lives we live. While I hate to use The Matrix as an example, there is a scene where the world is seen through numbers; today, the transformer has made that possible (this is not hypothetical; it exists right now, and it's why AI tools can write papers for you and generate pics of your grandma on the moon). All this to say, if you're sitting here complaining about the ROI on compute, boy, are you missing the big things happening in front of you.