It seems as if the only thing you ever hear about these days is artificial intelligence. And a lot of people and companies are riding the great AI Hype Train.
But is it overblown? What is it really about, anyway?
Buying Your Ticket to the AI Hype Train
Evidently, the term artificial intelligence predates even my birth. But why is it now so, so very hot?
In part, we can all point fingers at ChatGPT. In late 2022, OpenAI released it, built on newish technology, and it took off, fast! Kind of like an express train, if you will.
By early 2023, kids were already using it to write papers.
As a result, parents and educators started to get nervous. Really, really nervous. But can you blame them?
How Did the AI Hype Train Pull Into the Station?
But let’s back up a bit. AI didn’t just spring out fully formed, like Athena from the head of Zeus. In some ways, it pays to have had an eclectic career, because I can honestly explain a couple of its origination points.
Databases
I’m sure most adults have heard of databases. But how many know, exactly, what one is? Well, in a way, it’s a kind of interactive list. It’s a means of organizing (basing) information (data).
Okay, so that was clear as mud.
To best explain databases, I like to turn to a personal favorite explanation.
The Database is Coming From Inside Your House
Wait, what?
We all have a database. You, me, your weird neighbor who lives down the street, and the King of England all have at least one database. And I am more than willing to bet that it’s the same type of database.
I repeat: what?
It’s definitely on your phone, and it may also still be on paper.
I am talking about your address list.
Why is an Address List a Database?
Your address list contains a ton of nuggets of information. Here, I’ll explain.
Say, you have an Uncle Dave Smith, who lives in Idaho, but used to live in Pennsylvania. And he’s married to your Aunt Susie Smith, but she was married before, to a man named William Jones. During her first marriage, Susie was known as Susie Jones.
Susie and William had a child together, Lou Jones. Later, Susie and Dave had a child together, Carol. Lou is away at college, in Colorado. Carol is engaged to be married to Fred Roe.
Are you with me so far?
If you wanted to list everyone who currently lives in Idaho, you’d get Dave, Susie, Carol, and maybe Lou (after all, college is generally not your permanent mailing address) and possibly also Fred.
Who fits in a set of people who have ever been named Jones? That would be William and Lou. But it’s also Susie.
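If you’d like to see those two questions asked of an actual database, here is a minimal sketch using Python’s built-in sqlite3 module. The table layout, the column names, and the two helper functions are all made up for illustration. I’m also assuming Carol goes by Smith, and I’ve left William’s and Fred’s states blank because the story never says where they live.

```python
import sqlite3

# Hypothetical address list: one row per person.
# Columns (all invented for this sketch): first name, current last name,
# any former last name, and current state.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (first TEXT, last TEXT, former_last TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        ("Dave", "Smith", None, "Idaho"),
        ("Susie", "Smith", "Jones", "Idaho"),
        ("William", "Jones", None, None),     # state unknown in the story
        ("Lou", "Jones", None, "Colorado"),   # college address, maybe not permanent
        ("Carol", "Smith", None, "Idaho"),    # surname is an assumption
        ("Fred", "Roe", None, None),          # state unknown in the story
    ],
)


def lives_in(state):
    """Return the first names of everyone whose current state matches."""
    rows = conn.execute(
        "SELECT first FROM people WHERE state = ? ORDER BY first", (state,)
    )
    return [row[0] for row in rows]


def ever_named(surname):
    """Return the first names of everyone who has ever used this surname."""
    rows = conn.execute(
        "SELECT first FROM people WHERE last = ? OR former_last = ? ORDER BY first",
        (surname, surname),
    )
    return [row[0] for row in rows]


print(lives_in("Idaho"))
print(ever_named("Jones"))
```

With only the states we know for sure, the Idaho list is Carol, Dave, and Susie. Lou and Fred are the "maybe" cases, and the database can only answer from what has been typed into it.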
Now Multiply That Times a Hundred
Let’s say you’re Carol and Fred’s wedding planner. You need to send out the invitations. And let’s say you’re sending so many invitations that it pays to batch mail everything. Using the database, you come up with four people in Idaho.
With a large family and an invitation list as long as your arm, you end up with a lot of data to comb through. A database automatically helps you pull out whatever you want (assuming the data is in there).
The Wonderful World of Granularity
Databases have fields. A field is a specific bit of information. Above, we have first names and we have state addresses. But we also have some relationship info. And while we don’t have ages or dates of birth, we can infer that Susie is older than both Lou and Carol, that William is older than Lou, and that Dave is older than Carol. None of them is necessarily older than Fred. And since Susie’s marriage to William came first, we can also reasonably infer that William is older than Carol.
Now add the usual trappings of an address book, such as full name, address, phone number with area code, and ZIP code. With this information, you have even more inferences you can draw.
For example, if two people don’t share a full address, but they share a ZIP code, you know that means they live close to each other. If the wedding is somewhere they could drive to, but it’s a long drive, you could add a note suggesting to those people that they travel together.
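The ZIP code trick above can be sketched in a few lines of Python. Everything in this example is invented for illustration: the guest names, the street addresses, the ZIP codes, and the carpool_candidates function. It is one possible way to do it, not the way any particular address book app works.

```python
from collections import defaultdict
from itertools import combinations

# Made-up sample rows: (name, street address, ZIP code).
# None of these come from the story above.
guests = [
    ("Guest A", "12 Elm St", "83702"),
    ("Guest B", "98 Oak Ave", "83702"),
    ("Guest C", "12 Elm St", "83702"),   # same household as Guest A
    ("Guest D", "5 Pine Rd", "80302"),
]


def carpool_candidates(rows):
    """Pair up guests who share a ZIP code but not a street address."""
    by_zip = defaultdict(list)
    for name, street, zip_code in rows:
        by_zip[zip_code].append((name, street))

    pairs = []
    for zip_code, members in by_zip.items():
        for (name1, street1), (name2, street2) in combinations(members, 2):
            if street1 != street2:  # same address means same household, so skip
                pairs.append((name1, name2, zip_code))
    return pairs


for name1, name2, zip_code in carpool_candidates(guests):
    print(f"{name1} and {name2} both live in {zip_code}")
```

Guests A and C share a household, so they are not paired with each other, and Guest D has nobody nearby.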
What Does This Have to do With the AI Hype Train?
The generative and predictive AI you’ve been hearing about is really just a fancy kind of database.
Say what?
There’s a ton of information, and all your computer does is look it up. Just like you look up Aunt Susie’s address in a book or on your phone.
Except a computer does this millions of times faster.
Now it’s time to look at the other piece.
Language Models
A language model is a list of words. But unlike a database, it contains a bit more info: each word comes with probabilities. This isn’t really about the chance of someone saying the word infant vs the word baby.
Rather, it’s the chance of someone saying the word the or the word pickle. We don’t see those words as even close to being interchangeable, but a computer doesn’t know that. That is, unless it is taught. Otherwise, they’re just items on a list to a computer.
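To make the "list of words with probabilities" idea concrete, here is a toy sketch in Python. The sample sentence and the word_probabilities function are made up for illustration, and this is a drastic simplification: real language models are trained on vastly more text and estimate the chance of a word given the words that came before it.

```python
from collections import Counter

# Tiny made-up sample text. A real model would see far, far more.
corpus = "the cat ate the pickle and the dog ate the bone"


def word_probabilities(text):
    """Return each word's share of all the words in the text."""
    words = text.split()
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


probs = word_probabilities(corpus)
print(f"the:    {probs['the']:.2f}")
print(f"pickle: {probs['pickle']:.2f}")
```

To the computer, the and pickle are just two entries with different numbers attached. Nothing in the list says that one is an article and the other is a snack.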
But where and how does such a huge model come together?
The Derailing of the AI Hype Train
To build a large language model, you need content. Lots and lots of more or less properly written content. This content should cover a large swath of human thought and activity. It has to be very broad in scope.
So, the developers turned to a place where they knew there was a ton of content, more or less properly written, covering great, big chunks of the human experience.
The internet.
Except there’s just one problem.
The Fly in the Ointment
They didn’t get most people’s permission to use the content. Also, they never checked it for accuracy or tone. A computer can’t figure those things out (yet). But you and I can. For example, we can tell when someone’s joking about something.
The AI takes it seriously.
And what about all the personal data online? The EU’s GDPR requires a lawful basis, such as an individual’s clear consent, for processing their personal data. Did the AI’s creators take the time to figure out which of the trillions of web pages have personal data?
The answer to that would clearly be: no.
Finally, there’s also the matter of copyright. There’s a ton of original material online. It may be snippets of professionally written fiction, like in a blurb. Or it could be places for posting fiction, like Wattpad.
Did the creators of the language model used in AI stop to ask the authors whether they could have permission to train the model on their prose or poetry?
What do you think?
The Caboose at the End of the AI Hype Train
So, AI is mainly just a fancier, easier-to-use version of the databases that have been around for decades. And the training process for its language model is more than a little suspect. It can’t read your mind. It’s not Skynet. Yet.
There are plenty of companies that are trying to replace content writers with generative AI. But this technology, in that area, really isn’t ready for prime time. Predictive AI, on the other hand, more or less is.
Predictive is the kind of AI being used to comb through thousands of records and compare them against one patient’s medical test results to determine the likelihood of that patient getting cancer. This is the kind of speed that humans just can’t match.
So when you read another breathless article or blog post about artificial intelligence, check to see if the author is riding the AI hype train.
Because Casey Jones, you’d better watch your speed.