Perhaps Implementing Customized AI Solutions Is Not That Expensive

I’ve been a fan of computers since I learned to program in COBOL as a 12-year-old. Although I chose not to become a professional programmer, I’ve held direct and indirect supervisory roles over technology throughout my career, beginning with my first job as a consultant with Price Waterhouse. Major innovations in IT have always fascinated me.

I read about a recently leaked Google document titled “We Have No Moat, And Neither Does OpenAI.” It was not about the dangers of AI. Instead, the document was about how open-source AI will outcompete Google and OpenAI.

I found an excellent overview of the document on Simon Willison’s Weblog. Mr. Willison writes that “while OpenAI and Google continue to race to build the most powerful language models, their efforts are rapidly being eclipsed by the work happening in the open-source community.” One example cited in the paper is Vicuna: An Open Source ChatBot. The cost to train its latest version, Vicuna-13B, was around $300. The link I included to Vicuna discusses how the developers were able to train it so cheaply and quickly using open source.

The anonymous author of the paper is bullish on a technique called LoRA, or Low-Rank Adaptation of Large Language Models. LoRA freezes the pre-trained weights of a large language model, such as Meta’s LLaMA, and injects trainable rank decomposition matrices into each layer of the Transformer architecture (what makes the generative AI LLMs work). This sharply reduces the number of trainable parameters for downstream tasks, which in turn reduces the Graphics Processing Unit (GPU) memory requirement. For companies adapting a model to their own application, this saves substantial time and money. Mr. Willison points out that this means “as new and better datasets and tasks become available, the model can be cheaply updated without having to pay the cost of a full run.”
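To make the savings concrete, here is a minimal sketch in plain Python of the idea described above. It is not taken from the leaked document or from any particular library: the function names are hypothetical, and the 4096 x 4096 layer size and rank of 8 are illustrative assumptions. The frozen weight matrix W is left untouched, and only two small matrices, A and B, would be trained.

```python
import random


def lora_trainable_params(d_out, d_in, rank):
    """Count trainable parameters when a d_out x d_in weight matrix is
    adapted with two low-rank matrices: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute y = W x + scale * B (A x).
    W stays frozen; only A and B would be updated during fine-tuning."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


# Illustrative sizes only (assumed, not from the article).
d_out, d_in, rank = 4096, 4096, 8
full = d_out * d_in                               # parameters in the frozen matrix
lora = lora_trainable_params(d_out, d_in, rank)   # parameters actually trained
print(f"full: {full:,}  LoRA: {lora:,}  reduction: {full // lora}x")

# Tiny demo: with B initialised to zeros, the adapted layer starts out
# behaving exactly like the frozen one.
random.seed(0)
W = [[random.random() for _ in range(3)] for _ in range(3)]
A = [[random.random() for _ in range(3)]]   # rank 1: shape 1 x 3
B = [[0.0] for _ in range(3)]               # shape 3 x 1, all zeros
x = [1.0, 2.0, 3.0]
print(lora_forward(W, A, B, x) == matvec(W, x))
```

Under these assumed sizes, a single layer goes from roughly 16.8 million trainable values to about 65 thousand. A reduction of that kind, repeated across the adapted layers, is why fine-tuning can fit on much more modest hardware.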

Some of the “major open problems” have been solved according to the Google author. For example:

LLMs on a Phone: People are running foundation models on a Pixel 6 at 5 tokens/sec. (btw, OpenAI just announced the release of a ChatGPT app for your smartphone)
Scalable Personal AI: You can finetune a personalized AI on your laptop in an evening.
Responsible Release: This one is obviated, not solved – There are entire websites of art models with no restrictions whatsoever, and text will not be far behind.
Multimodality: The current multimodal ScienceQA SOTA was trained in an hour.

Because of these open-source developments, the anonymous Google author points out that “focusing on maintaining some of the largest models on the planet actually puts us at a disadvantage.” Competitors hire Google technologists all the time, writes the author. Also, research institutions around the world build on each other’s work and can explore more broadly than Google can with its proprietary work. In fact, contributions from institutions and individuals led to the creation of Stable Diffusion in the image generation space. According to the author, Stable Diffusion has outstripped OpenAI’s DALL-E in terms of functionality.

Without diving deeper into technical language, I believe the author is correct that these smaller models will outstrip the large LLMs because they can be iterated on new datasets more quickly and more cheaply. In turn, this means that we’ll continue to see the introduction of smaller, tailored AI products that may be easier to work with than larger LLMs like ChatGPT.

A genuine concern might be finding the right talent to help companies evaluate, select, and implement the AI solutions that will benefit them the most. If this talent shortage is an issue for companies, imagine how big an issue it might become for educators trying to remain current and keep their future graduates up to speed on the rapidly changing field of AI applications.

Lastly, while these smaller, more personalized AI products might be reasonable solutions for smaller companies, larger companies have begun to ban their employees, particularly computer programmers, from using large LLMs like ChatGPT. The reason cited is that ChatGPT captures the data you submit to it unless you change the settings. Companies like Apple don’t want copies of their proprietary code or intellectual property submitted to ChatGPT for a programming fix or write-up enhancement. Welcome again to the world of rapidly changing solutions thanks to AI.