GPT-3: A layman’s explanation

I am linking to the summary highlights of a brilliant write-up on GPT-3. I hope that you like it!

  • Enter a principle that has emerged during the last week as developers have played around with it: garbage in, garbage out.
  • The input of “what’s up?” doesn’t provide any context, so GPT-3 generates text in a random direction each time.
  • If you provide GPT-3 with context, say, the first couple of paragraphs of an episode from Tinkered Thinking, it will generate a continuation of that episode that is shockingly on point and very believable.
  • Really answering the question of how exactly a machine learns is similar to asking: how does a person arise from all the chatter between neurons in that person’s brain?
  • No one can fully answer the question of what’s really going on when a neural net is being trained in a machine learning context.
  • (To dive a little deeper, it’s useful to know that an infant has around 100 billion neurons, and as we learn how to exist in the world, we pare down this number significantly.)
  • GPT-3 was trained to generate text using a computational model that bears a lot of similarity to the jumble of neurons we call a brain.
  • Here’s an interesting fact about GPT-3: the neural net behind it contains 175 billion parameters, or what you might think of as neurons.
  • An actual human neuron is far more complex in how it listens to and signals its neighbors.
  • This presents a level of complexity that far exceeds what is going on with the nodes or neurons in a machine learning context.
  • How does a neural net learn in order for something like GPT-3 to work?
  • GPT-3 was trained using text from the internet – an amount of text that is just inconceivable for a single human being to think about reading.
  • That was the block of text that was given to the neural net for its training.
  • Run this game (hide the next word, have the neural net guess it, then nudge its parameters based on how wrong the guess was) an astronomical number of times with an inconceivably large amount of text, and after a while the neural net gets pretty good at the game it’s playing; a toy sketch of this game appears after this list.
  • A good way to think about embeddings, the strings of numbers that stand in for words inside the model, is to realize that humans use their own embeddings: words themselves are stand-ins for the things and ideas they refer to.
  • This discussion of embeddings is important because it’s pretty magical to realize that GPT-3 doesn’t know any words, only the numbers that represent them; a small numeric sketch of this idea follows the list.
  • Through all of this weighted calibration using embedded language, GPT-3 has ‘learned’ the subtle rules that dictate how we humans pick our words in different contexts.
  • GPT-3 has essentially played that probability game a ridiculously unfathomable number of times in every context that humans have written about and that’s been plopped on the internet.
  • Unlike a real human therapist, who needs to write down information about sessions in files that they reference, a GPT-powered therapist could have a perfect memory of absolutely everything you’ve ever said.
  • But this specific GPT has been trained to flag details that don’t seem connected with the main text, or details that hint at the creation or use of loopholes in existing law, because, naturally, it has read all of existing law and has a perfect memory of all legal text.
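To make the “guess the next word” game a little more concrete, here is a deliberately tiny sketch in Python. This is not how GPT-3 works internally: GPT-3 is a 175-billion-parameter neural net, not a word-pair counter, and every name here (corpus, predict_next, generate) is invented for illustration. The sketch only captures the spirit of the probability game: learn from text which words tend to follow which, then keep guessing.

```python
import random
from collections import defaultdict

# A toy corpus standing in for "text from the internet"; GPT-3's real
# training data is unimaginably larger.
corpus = (
    "what's up with the weather today "
    "what's up with you today "
    "the weather today is cold and grey "
    "the weather today is sunny and warm"
).split()

# "Training" here is just counting how often each word follows each word.
# GPT-3 instead adjusts billions of parameters, but the goal is the same:
# learn which words tend to follow which contexts.
next_word_counts = defaultdict(lambda: defaultdict(int))
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word):
    """Sample the next word in proportion to how often it followed `word`."""
    candidates = next_word_counts.get(word)
    if not candidates:
        return random.choice(corpus)  # nothing learned for this word: guess blindly
    words, counts = zip(*candidates.items())
    return random.choices(words, weights=counts, k=1)[0]

def generate(prompt, length=6):
    """Keep playing the guess-the-next-word game starting from `prompt`."""
    words = prompt.split()
    for _ in range(length):
        words.append(predict_next(words[-1]))
    return " ".join(words)

# Because the continuation is sampled from probabilities rather than looked
# up, the same prompt can wander in a different direction on every run.
print(generate("what's up"))
print(generate("the weather today"))
```

Run it a few times and the same prompt produces a different continuation each time, which is the toy version of why a context-free “what’s up?” sends GPT-3 off in a random direction on every attempt.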
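The bullets also lean on the idea of embeddings, the numbers GPT-3 actually manipulates in place of words. Below is a minimal, made-up sketch of that idea: the three-number vectors and their values are toy assumptions (real embeddings are learned during training and have thousands of dimensions), but they show how “similar words get similar numbers” can actually be measured.

```python
import math

# Toy, hand-made embeddings: each word becomes a short list of numbers.
# The values and the 3-dimensional size are invented purely for illustration.
embeddings = {
    "cat":   [0.9, 0.1, 0.0],
    "dog":   [0.8, 0.2, 0.1],
    "piano": [0.0, 0.9, 0.8],
}

def cosine_similarity(a, b):
    """How closely two embedding vectors point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# The model never sees the letters c-a-t, only the numbers, yet the
# relationships between words survive the translation into numbers.
print(cosine_similarity(embeddings["cat"], embeddings["dog"]))    # close to 1: related words
print(cosine_similarity(embeddings["cat"], embeddings["piano"]))  # close to 0: unrelated words
```

In this toy picture, “GPT-3 doesn’t know any words” just means that everything it learns and generates happens in terms of vectors like these, which are only converted back into words at the very end.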