SLMs vs. LLMs

What's the difference between an SLM and an LLM?

The main difference between an SLM (Small Language Model) and an LLM (Large Language Model) is size. SLMs range from 7B to 70B parameters, with the sweet spot sitting between 7B and 13B parameters.
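
To make the size difference concrete, here is a quick back-of-the-envelope sketch (in Python) of the weight memory these parameter counts imply; it counts weights only, ignoring activations and KV cache:

```python
# Rough weight-memory footprint: bytes = parameters * bytes_per_parameter.
# Weights only; activations and KV cache add more on top.
BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}

for params_b in (7, 13, 70):
    params = params_b * 1e9
    sizes = ", ".join(
        f"{dtype}: {params * n / 1e9:.1f} GB" for dtype, n in BYTES_PER_PARAM.items()
    )
    print(f"{params_b}B parameters -> {sizes}")
```

For example, a 7B model needs roughly 14 GB of weight memory at fp16, while a 70B model needs about 140 GB, which is the difference between one GPU and a multi-GPU cluster.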

Why shouldn’t we just use a closed-source model, like GPT-4 or Claude?

Closed-source models have several drawbacks for companies that want to build their own use-case specific models.

Even though closed-source general-purpose LLMs are very powerful, calling them at scale can be very costly. You are also at the mercy of the provider: when their API goes down or traffic spikes, your performance suffers.

Closed-source training pipelines and datasets are typically a 'black box': you have very little insight into how the model was trained or what data it was trained on.

Training with closed-source models exposes your data to third-party platforms, which compromises data security. And you never own a closed-source model. Your model should be your asset: just as data was the new gold, your LLM should be your new gold, owned and controlled by you.

In most cases, a smaller, more specialized language model running in your own cloud is the better business case. This is what we can help you build.

Why do forward-thinking companies prefer smaller-parameter models over larger ones?

If you go deep on domain adaptation, you can get away with a much smaller model. Much of the data and knowledge baked into large general-purpose models is unnecessary for many business use cases, and is overkill for the ROI you are trying to achieve.

How can we trust that our data will be secure?

One of the core pillars of cloud computing is the VPC (virtual private cloud) deployment. When we build your SLMs, everything from pre-training to deployment stays inside your own private cloud. This is the case for both your data AND your SLMs.
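
As an illustration of what "everything stays in your cloud" means in practice, here is a minimal sketch of fully local inference with the Hugging Face transformers library. The model name is just an example, and the sketch assumes the open weights have already been mirrored inside your VPC, so no external API is ever called:

```python
# Minimal local-inference sketch: the model runs entirely on machines you
# control, so prompts and outputs never leave your VPC (assuming the
# open-weights checkpoint was already downloaded/mirrored internally).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example open-source SLM
    device_map="auto",
)

prompt = "Summarize our Q3 incident report in two sentences:"
result = generator(prompt, max_new_tokens=128, do_sample=False)
print(result[0]["generated_text"])
```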

We have already built our own model in-house. Can we incorporate features from other open-source models?

The short answer is yes. SLM adaptation systems are, for the most part, open-source model agnostic, so you can bring your existing model into the adaptation pipeline. A few companies have built SLM adaptation with a modular focus.

What is so unique about Open Source LLMs?

The Open Source community has built a way to take the power of LLMs and bring them down into smaller, specialized, and scalable models. They do this with SLM adaptation systems that offer pre-training, alignment, and continuous retrieval-augmented generation in one place. Furthermore, with Open Source and SLM adaptation systems, both your data and your models are 100% owned by you from start to finish.
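
As a toy illustration of the retrieval-augmented generation piece, here is a minimal sketch that retrieves the most relevant internal document with TF-IDF and grounds the prompt in it. The document snippets are placeholders, and a production system would use learned embeddings and a vector store instead:

```python
# Toy retrieval-augmented generation: retrieve the best-matching internal
# document, then prepend it to the prompt so the SLM answers with grounding.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Refunds are processed within 5 business days of approval.",
    "Enterprise SLAs guarantee 99.9% uptime for the inference endpoint.",
    "All customer data is stored encrypted inside the customer's VPC.",
]
question = "How long do refunds take?"

vectorizer = TfidfVectorizer().fit(docs + [question])
doc_vecs = vectorizer.transform(docs)
q_vec = vectorizer.transform([question])

best = cosine_similarity(q_vec, doc_vecs).argmax()  # index of best-matching doc

prompt = f"Context: {docs[best]}\n\nQuestion: {question}\nAnswer:"
print(prompt)  # this grounded prompt is what gets sent to your in-VPC SLM
```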

Do I own my model? Do I own my stack?

Yes, your model will be 100% yours. It runs inside your cloud, inside your VPC, and you can wield it any way you see fit.

Do you pretrain our model from scratch?

We do not go all the way back to pre-training from scratch. We start with a strong open-source model, such as Mistral or Llama 2, and extend the pre-training of the existing model by injecting new data into it. The end result is a model that keeps all the great general capability of the base model and gains all the domain knowledge of your injected corpus. This makes for a much more powerful model than pre-training from scratch, at a fraction of the price. This is also called domain-adaptive continual pre-training.
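
Here is a minimal sketch of what domain-adaptive continual pre-training looks like with the Hugging Face Trainer. The corpus path and hyperparameters are placeholders; a real run would add sequence packing, a learning-rate schedule tuned to avoid catastrophic forgetting, and distributed training:

```python
# Continued pre-training sketch: resume causal-LM training of an open-source
# base model on a domain corpus (placeholder file "domain_corpus.txt").
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "mistralai/Mistral-7B-v0.1"  # example open-source base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

# Tokenize the raw domain corpus.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="slm-domain-adapted",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=2e-5,  # low LR helps preserve general capability
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("slm-domain-adapted")
```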
