Foundation Model: It’s all about PMF

Crystal Liu
10 min read · Sep 6, 2023


Ever since LLaMA-2 became commercially free to use in July, the debate over open-source vs. closed-source LLMs has only gotten more heated. You can argue from both sides. I agree that open-source models are quickly catching up in quality and making AI more accessible. But I also think open-source LLMs are somewhat overhyped: many startups I've seen still choose to build on top of OpenAI APIs for the easier setup. In my opinion, it's not a "winner takes all" situation; it's about choosing the right customer segments to serve and finding product-market fit, and both can exist and win.

How can you access or consume foundation models nowadays?

Now that OpenAI has also made its fine-tuning API and ChatGPT Enterprise available, it has become easier for the general public to develop and deploy fine-tuned or domain-specific models, and to build applications and even agents on top. Besides building base models from scratch, below are the different ways of consuming foundation models.

  1. Closed-source:
  • Access closed-source LLMs that are pre-trained and fine-tuned by third parties such as OpenAI, Cohere, and Anthropic directly via API;
  • Fine-tune on closed-source OpenAI models (fine-tuning API made available by OpenAI on Aug 22);
  • Customize with own data / private deployment services provided by OpenAI or Cohere (i.e., have them fine-tune for you with your own data).

2. Fine-tune on top of open-source base models:

  • Fine-tune using third-party tools such as Weights & Biases;
  • Fine-tune internally (might need to hire ML engineers).

** Here, one needs to consider where the model is hosted, on the cloud or on-premise, which can significantly impact costs.

Different Ways of Consuming LLMs

Who are the buyers?

  1. Hyperscalers build their own LLMs, but the goal is not to sell those AI capabilities standalone; it is to integrate them into their products and charge a premium for better productivity and user experience.
  2. Large enterprises want to differentiate themselves by fine-tuning with lots of private customer data, so they either fine-tune on their own or have third-party (3P) model developers fine-tune for them. Companies in financial services or healthcare that value privacy and hold sensitive customer data would consider private development.
  3. AI-native startups tend to develop domain-specific models once they have enough customer data. To date, most AI-native startups I've come across still build their products on top of 3P models first.
  4. Individual researchers and SMBs can also use LLMs for research or product prototyping.

Despite the hype around open-source LLMs, most companies still choose the OpenAI API. Here's why:

  1. Open-source LLMs are not as cheap as you'd think:

When people talk about how accessible open-source models are from a cost perspective, they often miss the fact that you need to host the model yourself while fine-tuning it on your own, which can get very expensive.

From a cost perspective, here is the category-by-category breakdown you would need to consider:

  • Inference costs: essentially the usage-based API fees you pay to OpenAI or Anthropic, usually charged by the number of tokens fed into and generated by the fine-tuned LLM.
  • Hosting: hosting models for fine-tuning and serving is what open-source users need to pay attention to. The benefit of closed-source models is that you don't have to worry about hosting infrastructure, since the models are trained and fine-tuned on the vendor's GPUs and other infrastructure. It can get challenging for companies building on top of open-source models, especially on-premise: not only is securing GPUs extremely difficult these days, but the hosting costs are hard to justify until your product scales to a very large number of users.
  • Fine-tuning: those building on top of open-source models need to pay for third-party fine-tuning tools or frameworks, and/or for hiring ML engineers. Neither comes cheap.

Let's do the math. Say we are building an app that generates meeting summaries, and assume we are already at scale with 10,000 daily users. Assume that, on average, these users' meetings add up to 50M tokens of input (~37.5M words) and 10M tokens of output (~7.5M words) in total every day, that 1B tokens are used for fine-tuning, and that 8 A100 GPUs are used to self-host LLaMA-2 70B.

Cost Comparisons of Closed-Source vs. Open-Source LLMs
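A back-of-envelope version of this comparison can be sketched in code. All the rates below are my own illustrative assumptions (GPT-4-class per-token list prices, a typical cloud A100 rental rate, and a fully loaded ML-engineer salary), not quotes from any vendor:

```python
# Rough cost sketch for the meeting-summary scenario above.
# Every price constant here is an assumption for illustration only.
DAYS = 30

# --- Closed-source: pay per token (assumed GPT-4-class rates) ---
INPUT_PRICE = 0.03 / 1000    # $ per input token (assumption)
OUTPUT_PRICE = 0.06 / 1000   # $ per output token (assumption)
TRAIN_PRICE = 0.008 / 1000   # $ per fine-tuning token (assumption)

daily_input_tokens = 50_000_000     # 50M tokens in, per the scenario
daily_output_tokens = 10_000_000    # 10M tokens out, per the scenario
finetune_tokens = 1_000_000_000     # 1B fine-tuning tokens, per the scenario

api_monthly = DAYS * (daily_input_tokens * INPUT_PRICE
                      + daily_output_tokens * OUTPUT_PRICE)
finetune_once = finetune_tokens * TRAIN_PRICE

# --- Open-source: rent GPUs and hire (assumed rates) ---
A100_HOURLY = 2.5                   # $ per A100 per hour (assumption)
NUM_GPUS = 8                        # 8x A100 for LLaMA-2 70B, per the scenario
gpu_monthly = NUM_GPUS * A100_HOURLY * 24 * DAYS
engineer_monthly = 250_000 / 12     # assumed fully loaded ML-engineer cost

open_monthly = gpu_monthly + engineer_monthly

print(f"Closed-source: ${api_monthly:,.0f}/mo + ${finetune_once:,.0f} one-time fine-tune")
print(f"Self-hosted:   ${open_monthly:,.0f}/mo (GPUs + one ML engineer)")
```

Under these assumed rates, the per-token bill at this scale exceeds the fixed self-hosting cost; with lower volume or cheaper per-token pricing, the comparison flips, which is exactly why the crossover point matters.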

Of course, this is an extreme case with a heavy query volume. So for starters, i.e., companies with fewer than 10K monthly users, it makes sense to use the OpenAI API to play around with ideas and test products, without having to worry about hiring engineers or buying GPUs. Closed-source models are much cheaper in this case, as you are essentially renting only a small portion of the proprietary vendor's services and infrastructure.

However, for heavy usage as shown above, the upfront costs of open-source models can be justified once your startup scales past 10K MAUs.

My thesis for going open-source once you have enough customer traction to justify the upfront costs is that you no longer pay a premium at every layer of the stack. When accessing proprietary models, you are essentially paying a premium for GPUs, cloud hosting, pre-training, fine-tuning, and inference. Taking it in-house saves you that markup.

2. It is far easier to call an API than to build your own fine-tuning infrastructure and team.

Especially for startup builders who want to test ideas and build quick prototypes, calling the OpenAI API is much easier than building on top of open-source LLMs. The latter requires managing infrastructure resources, from selecting the right environment for hosting and deployment to scaling those resources as the product takes off.
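To make the effort gap concrete: a closed-source prototype is essentially one JSON payload POSTed to a hosted endpoint. The sketch below builds (but does not send) such a request; the model name and endpoint follow OpenAI's public chat-completions API, and the transcript is a made-up example:

```python
import json

def build_summary_request(transcript: str) -> dict:
    """Build the entire 'infrastructure' of a closed-source prototype:
    one chat-completions request payload."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": "Summarize the meeting transcript."},
            {"role": "user", "content": transcript},
        ],
        "temperature": 0.2,
    }

payload = build_summary_request("Alice: shipping Friday. Bob: QA signs off Thursday.")
print(json.dumps(payload, indent=2))

# Sending it is one HTTP call, e.g.:
#   requests.post("https://api.openai.com/v1/chat/completions",
#                 headers={"Authorization": f"Bearer {API_KEY}"}, json=payload)
# The open-source path instead begins with provisioning GPUs, standing up a
# serving stack, and loading 70B weights before the same loop can even run.
```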

Additionally, finding AI talent with both fine-tuning skills and domain knowledge is time-consuming and costly. According to ZDNet, in a survey of 1,420 IT leaders, 67% faced an AI talent shortage. CNBC noted that there were 169,045 jobs in the U.S. requiring AI skills in June 2023, and that simply having the words "AI skills" in a job description yields a 45% salary premium. Top AI engineers are even more expensive to hire: OpenAI engineers have a median total compensation of $930K.

3. Accuracy still matters!

Although open-source LLMs are catching up with closed-source LLMs in quality, LLaMA-2 still trails GPT-3.5 and GPT-4 in performance, as the chart below shows. One could argue that fine-tuning can close some of those gaps, but I'd say the remaining difference matters to many enterprises. Consider that customer-facing enterprises today still use AI tools mostly for internal use cases: would they trade quality for free open-source models? I'd imagine they care more about customer experience, and the cost savings from open-source models might not be worth it if customer experience suffers. Maybe one day, when open-source models actually outperform closed-source ones, enterprises will invest more in them.

Performance benchmark of LLaMA-2 vs. other closed-source LLMs. Source: https://scontent-sjc3-1.xx.fbcdn.net/v/t39.2365-6/10000000_662098952474184_2584067087619170692_n.pdf?_nc_cat=105&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=bGkSwJy8Xa4AX9mXq1Q&_nc_ht=scontent-sjc3-1.xx&oh=00_AfDRAb3a59yn_yhVerLwkEeWhgE0K8sHj78jDTcTsqe4mQ&oe=64FBB4BF

Open vs. Closed — It’s about matching product-market fit.

Below are the main criteria companies might weigh when selecting between building on top of open-source models and using closed-source models directly.

Pros and Cons of Open-Source vs. Closed-Source LLMs

From the chart above, we can see that self-hosted and proprietary models each have their own pros and cons. So my thesis is that they should target different customer profiles with different needs and use cases. From a customer perspective, here is what each segment might prefer:

  1. Closed-source models:
  • Enterprises with high accuracy requirements: those deploying AI products externally for customer use, or those in verticals such as legal or healthcare, want higher-quality models. I've also seen many agent startups building on top of the OpenAI API for the same reason.
  • Enterprises with horizontal applications: companies selling generic productivity tools into multiple sectors benefit from general-purpose closed-source models, which offer higher quality.
  • Enterprises without AI talent and technical capabilities are probably better suited to easy-to-start solutions such as ChatGPT Enterprise or the fine-tuning API.
  • Emerging AI-native startups: these companies usually don't have private customer data to start with. They can build on top of closed-source models to gain traction and collect customer data before starting their own domain-specific models.

2. Open-source models:

  • Enterprises with vertical applications and lots of customer data are better suited to fine-tuning open-source models for specialized tasks, because their business units likely have plenty of domain expertise and data to start with.
  • Enterprises with technical talent and resources: it is much easier for these companies to set up the hosting environment. With internal engineers who know the ins and outs of the company, they can fine-tune the models that fit them best.
  • Enterprises with privacy concerns are suited to either fine-tuning on private datasets or private deployments. Service companies like SambaNova help enterprises with privacy concerns (such as government agencies or financial institutions) build their own AI models on top of LLaMA-2.
  • AI-native startups with enough traction: I've seen startups pivot from the OpenAI API to LLaMA-2 recently. Now that they've collected enough customer data to fine-tune domain-specific models, they want more control over, and better visibility into, the models.

From a model-developer perspective, though, are the customers categorized above really the ones they should target? OpenAI has made it increasingly easy for starters to access high-quality models, but is this monetization strategy sustainable? In today's market, where many AI use cases are still unclear, few startups can scale well, so their use of the OpenAI API might stop when they struggle to find customers. Or, if they do manage to scale, would they stick with OpenAI as inference costs grow with their customer base? Those companies might shift to open-source models to build domain-specific models once they've accrued enough customer data.

More importantly, can proprietary model developers recoup their upfront investments and high development costs by targeting entry-level customers? It might not be OpenAI's concern, but I do find that many closed-source model developers have this product-market-fit issue. At the end of the day, investors are not going to keep covering their huge model training and development bills. So they need to figure out the right customers to target, and whether to sell access to models (e.g., Cohere) or AI-driven end products (e.g., AI21 Labs). In my opinion, a services model is a way out for closed-source model developers: helping enterprises build and deploy customized models. At least in today's market, where enterprises lack technical talent and resources, they might prefer a full-stack solution, and model developers can also charge more through enterprise sales.

Limitations / Concerns:

  1. How to scale? For proprietary model developers pursuing a services business model, the ability to scale efficiently is a concern. Because each customer has different needs, the services cannot be replicated easily, and the enterprise sales team needs to build an extensive customer pipeline and a relationship with each account. The sales cycle can also get long.
  2. Maintenance. For those self-hosting their models, continuous development and maintenance can get tricky. Companies need observability into each layer of the stack, and identifying issues and bugs can be challenging.
  3. Will proprietary models maintain their quality lead? Not only are open-source models catching up, but it is also getting harder for OpenAI to collect more data: over 15% of the 100 most popular websites now block OpenAI's web crawler, GPTBot. With researchers and developers contributing to the open-source community, open-source models could reach higher quality in the near future.

Future opportunities:

  1. Model hosting, deployment, and orchestration platforms. Challenges such as deploying and serving models at scale are still unsolved today. I'm looking for startups that are AWS SageMaker equivalents: those that can serve models at lower cost, more securely, and while handling the complexity well. As models get smaller and MoE (mixture-of-experts) models are developed, I'm also interested in orchestration platforms and frameworks.
  2. Fine-tuning tools. As open-source models become increasingly available, I'm also interested in startups that offer fine-tuning capabilities, particularly those that enable collaboration with internal functional teams and that work with expert models.
  3. Model evaluation and testing. There are already many models, and there will be more. Companies may use different models for different use cases, and there are many factors to consider (as listed above) when selecting one. So I'm excited about evaluation and testing tools that help enterprises and developers make better LLM choices.
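The model-selection problem in point 3 can be made concrete with even a toy harness: score each candidate model on the same evaluation set, then pick per use case. The sketch below uses hypothetical stand-in "models" (simple functions); a real tool would wrap API calls or local checkpoints and add cost, latency, and safety metrics:

```python
from typing import Callable

# Hypothetical stand-in models. In practice these would wrap calls to a
# closed-source API or a locally hosted open-source checkpoint.
def model_a(prompt: str) -> str:
    return prompt.split()[-1]   # always answers with the last word

def model_b(prompt: str) -> str:
    return prompt.split()[0]    # always answers with the first word

def exact_match_accuracy(model: Callable[[str], str],
                         eval_set: list[tuple[str, str]]) -> float:
    """Fraction of prompts where the model output equals the reference."""
    hits = sum(model(prompt) == reference for prompt, reference in eval_set)
    return hits / len(eval_set)

# Tiny made-up eval set: (prompt, reference answer) pairs.
EVAL_SET = [
    ("the capital of France is Paris", "Paris"),
    ("two plus two equals four", "four"),
]

scores = {name: exact_match_accuracy(model, EVAL_SET)
          for name, model in [("model_a", model_a), ("model_b", model_b)]}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

The same pattern scales to real benchmarks: swap the stand-in functions for API clients and the exact-match metric for whatever the use case actually rewards.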

Closing Thoughts:

Although I do believe open-source models will be widely adopted in the future, we are not there yet. In an ideal world where infrastructure and model hosting are not so expensive, AI talent is not so hard to find, time-to-market is not so long, and open-source models have better quality, I'd expect open-source models to have a very bright future. But as companies are still figuring out the right use cases for LLM adoption, and as developers are still playing around with different developer tools, it will take some time for open-source models to become mainstream.


Crystal Liu

Tech Investor | AI/ML, Infra, Cloud, SaaS, Hardware | Wharton MBA