If you noticed the blog was down for a few minutes now and then over the past couple of days, that was me making technical changes. I’m all done now, and things can get back to normal, or at least what passes for normal around here. I also spent a little time exploring some artificial intelligence bot sites. They write very poor fiction. There’s no voice, just plot and some characterization. That’s not to say they aren’t helpful. They are. The bots keep surprising me with how the plots they create vary from what I expect.
This interests me a lot. All of the bots are neural networks. Without getting too technical, neural nets are computer programs that work as we imagine our brains do. We learn from birth by processing experiences. Our brains (maybe) take input and try to predict what will happen next. For example, a little kid will touch a hot stove. The result is a painful burn. Her brain creates a strong path that associates injury with touching a hot stove. That’s a very simple example. The key is prediction.
A neural net works in a similar way. It takes data, any data, and tests what might come next. For example, it could scan the first word in this post. Then, it looks to see if it has seen the word before. It has! It sees that 100 different words have followed that word in past tests. Each time a prediction was right, the number associated with that prediction was incremented. It guesses the word with the highest number. If it’s wrong, it goes to the next highest. After billions of trials, patterns emerge.
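For the curious, here is a tiny Python sketch of the counting idea described above. To be clear, this is a simple word-pair counter, not a real neural net, and it is nowhere near how something like ChatGPT is built. The names `train` and `guesses` and the sample sentence are made up for illustration.

```python
from collections import Counter, defaultdict


def train(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    # Walk through adjacent word pairs and bump the count for each pair seen.
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def guesses(follows, word):
    """Return candidate next words, highest count first."""
    counts = follows.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common()]


# Hypothetical sample text, just for illustration.
sample = "the cat sat on the mat and the cat slept"
model = train(sample)

# "cat" followed "the" twice and "mat" once, so "cat" is the first guess.
print(guesses(model, "the"))
```

Feed it a word it has seen and it hands back its guesses, best first. Feed it a word it has never seen and it has nothing to offer, which is one reason the real thing needs so much data.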
The people who set up the problem they wanted to solve had no idea what the neural net would produce. This is particularly true when you set a big one loose on massive data sets like all digital text on the Internet (by the way, no one has actually attempted that). The bigger the data set, the more interesting the results. Models like ChatGPT are trained using massive numbers of cloud processors. GPT-4 reportedly learned from virtually all digitized books and 1.5 million websites; ours is one of them.
With models that large, no one has much of a clue exactly what sort of information paths are built in these neural nets. We have seen that these big models do something the data scientists call hallucination. The models will answer factual questions with answers that look correct. References to books and academic papers are provided. Upon analysis, it turns out that the “facts” are false, and the references point to nonexistent books and papers. What the hell?
The same sort of problem is observed with AI-driven cars. They will get into accidents that the human programmers can’t explain. It’s impossible to program every possible situation a car can encounter. AI is used to train a model that will end up driving the car. In fairness, these models are very good. Yes, they have accidents, but way fewer than human drivers. The problem is that people expect machines to be perfect. If a robot-driven car kills just one pedestrian, the car is considered a failure. It doesn’t matter that over the same number of miles, humans killed ten pedestrians. We humans are unfair that way.
Because a useful neural net model is so big, we don’t understand how to debug one. Maybe we will end up with AI psychiatrists who help artificial intelligence with problems like lying and missing dangerous situations. In the meantime, it’s fun to play with giant models like ChatGPT.