ChatGPT doesn’t have “recency”

Interesting take here:

Generative AI Won’t Revolutionize Search — Yet

But here also lies ChatGPT’s first problem: In its current form, ChatGPT is not a search engine, primarily because it doesn’t have access to real-time information the way a web-crawling search engine does. ChatGPT was trained on a massive dataset with an October 2021 cut-off. This training process gave ChatGPT an impressive amount of static knowledge, as well as the ability to understand and produce human language. However, it doesn’t “know” anything beyond that. As far as ChatGPT is concerned, Russia hasn’t invaded Ukraine, FTX is a successful crypto exchange, Queen Elizabeth is alive, and Covid hasn’t reached the Omicron stage. This is likely why in December 2022 OpenAI CEO Sam Altman said, “It’s a mistake to be relying on [ChatGPT] for anything important right now.”

The cost involved:

The most obvious challenge is the tremendous amount of processing power needed to continuously train an LLM, and the financial cost associated with these resources. Google covers the cost of search by selling ads, allowing it to provide the service free of charge. The higher energy cost of LLMs make that harder to pull off, particularly if the aim is to process queries at the rate Google does, which is estimated to be in the tens of thousands per second (or a few billion a day). One potential solution may be to train the model less frequently and to avoid applying it to search queries that cover fast-evolving topics.

This raises a very important question: why did Microsoft invest in the “upgrade”? Big tech is firmly on its quest for “singularity”. As I have mentioned before, and it merits repetition, they want to be the singular source of truth, or of truth “reimagined”. This will take time, but it will gradually be accomplished.

There are some “online tools” for checking what it produces, but those tools are limited, for obvious reasons. ChatGPT can create a completely fictional reference, and what will stop the technology from creating fake websites in seconds when queried for its source (or to hide its digital tracks)?

This raises the issue of transparency: Users have no idea what sources are behind an answer with a tool like ChatGPT, and the AIs won’t provide them when asked. This creates a dangerous situation where a biased machine may be taken by the user as an objective tool that must be correct. OpenAI is working on addressing this challenge with WebGPT, a version of the AI tool that is trained to cite its sources, but its efficacy remains to be seen.

The HBR article suggests a role for it in “vertical search engines” serving niche specialities, but that’s wishful thinking. Mainly because of the costs, it takes considerable effort for any such tool to plant itself as the default. Who will eventually pay for the perceived gains in efficiency?
