# Snowflake Arctic 128 Experts MoE

{% embed url="<https://www.snowflake.com/>" %}

### <mark style="color:blue;">Snowflake says it wants to give back to the community, enrich the collective knowledge, and empower others to succeed. With this release it is not just unveiling the model; it is also sharing its research insights through a comprehensive cookbook designed to expedite the learning process for anyone looking to build world-class MoE models.</mark>

{% embed url="<https://www.snowflake.com/blog/arctic-open-efficient-foundation-language-models-snowflake/>" %}

<figure><img src="/files/tMpNnJQvBu4qYVI8sCNc" alt=""><figcaption></figcaption></figure>

Snowflake has released a new open-source large language model called Arctic, which uses a novel architecture called a **"dense hybrid Transformer"** with 128 experts (smaller sub-networks within the model). This approach, called a Mixture of Experts (MoE), is claimed to provide several benefits:

1. **Training efficiency:** Because the model is split into many small "expert" networks and only a few of them run for any given token, training can be more computationally efficient and less expensive than training one large dense model. The article states that Arctic's training cost was under $2 million, much lower than the roughly $60 million estimated for models like GPT-4.
2. **Model performance:** Despite using smaller expert models, the combination of 128 experts allows Arctic to achieve high performance on enterprise tasks like coding, SQL generation, and instruction following - what Snowflake calls "Enterprise intelligence".
3. **Scalability:** Having many smaller expert models makes it easier to scale up the overall model size and capabilities by adding more experts, compared to scaling up a single large model.
4. **Specialization:** Each expert can potentially specialize in specific tasks or domains, allowing the overall model to handle a diverse set of tasks effectively.
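
The routing idea behind these benefits can be sketched in a few lines of plain Python. This is a minimal illustration, not Snowflake's implementation: the helper names (`route`, `moe_forward`), the toy experts, and the random router scores are all hypothetical, and selecting the top 2 experts per token is an assumption made for the example. It shows that only the selected experts are evaluated for a given input.

```python
import math
import random

NUM_EXPERTS = 128  # matches the expert count described above
TOP_K = 2          # assumption: number of experts activated per token


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are renormalised so they sum to 1 over the selected experts.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


def moe_forward(x, router_logits, experts, top_k=TOP_K):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route(router_logits, top_k))


# Toy experts (hypothetical): expert k simply scales its input by k + 1.
experts = [lambda x, k=k: x * (k + 1) for k in range(NUM_EXPERTS)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router
selected = route(logits)
output = moe_forward(1.0, logits, experts)
print(selected, output)
```

In a real model the router scores come from a learned layer and each expert is a feed-forward network, but the control flow is the same: score all experts, run only a few.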

The key innovation claimed is that Snowflake's dense hybrid architecture reduces the communication overhead between the experts during training, which has been a major inefficiency in traditional Mixture of Experts approaches. This makes it possible to train very large MoE models, such as Arctic with its 128 experts, in a cost-effective manner.
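
To see why a sparse design can be large yet cheap to run, it helps to separate total parameters from the parameters active for each token. The sketch below is back-of-the-envelope arithmetic, not an official calculation. The function `moe_param_counts` is hypothetical, and the input figures (a 10B dense component, 128 experts of 3.66B each, 2 experts active per token) are assumptions taken from Snowflake's announcement linked above rather than from this page.

```python
def moe_param_counts(dense_b, expert_b, num_experts, top_k):
    """Return (total, active) parameter counts in billions.

    total  = dense part + every expert
    active = dense part + only the top_k experts used for one token
    """
    total = dense_b + num_experts * expert_b
    active = dense_b + top_k * expert_b
    return total, active


# Assumed figures (see lead-in): 10B dense, 128 experts x 3.66B, top-2 routing.
total_b, active_b = moe_param_counts(dense_b=10.0, expert_b=3.66, num_experts=128, top_k=2)
print(f"total: {total_b:.2f}B, active per token: {active_b:.2f}B")
```

Under these assumptions only a small fraction of the model's parameters take part in processing any single token, which is the source of the compute savings described above.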

