Ollama
Function: Ollama is a tool for running and managing large language models locally. It packages model weights and configuration together and provides a simple command-line interface and local server for downloading, customizing, and chatting with open-source models on your own machine.
Focus: Ollama is geared towards developers and enthusiasts who want to run open-source LLMs on their own hardware, with sensible defaults out of the box and the flexibility to customize model configurations.
Examples: Ollama is used to run models such as Llama 2, Mistral, and Code Llama locally, for interactive chat, prototyping, and building applications against a locally hosted model.
Get up and running with large language models locally.
macOS
Windows
Coming soon! For now, you can install Ollama on Windows via WSL2.
(WSL2, the Windows Subsystem for Linux 2, is a Windows 10 and 11 feature that runs a real Linux environment directly inside Windows, so you can use your favorite Linux tools and commands on a Windows machine without dual-booting or a virtual machine.)
Linux & WSL2
Docker
The official Ollama Docker image `ollama/ollama` is available on Docker Hub.
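As a sketch of typical usage, the container can be started once and then used to run models interactively. The volume name, port, and container name below are illustrative defaults; adjust them to your setup:

```
# Start the Ollama server, persisting downloaded models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Run a model inside the running container
docker exec -it ollama ollama run llama2
```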
Quickstart
To run and chat with Llama 2:
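The command (as listed for Llama 2 in the model library) is:

```
ollama run llama2
```

This downloads the model on first use and then drops you into an interactive chat session.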
Model library
Ollama supports a list of open-source models available at ollama.ai/library.
Here are some example open-source models that can be downloaded:

| Model              | Parameters | Size  | Download                       |
| ------------------ | ---------- | ----- | ------------------------------ |
| Llama 2            | 7B         | 3.8GB | `ollama run llama2`            |
| Mistral            | 7B         | 4.1GB | `ollama run mistral`           |
| Dolphin Phi        | 2.7B       | 1.6GB | `ollama run dolphin-phi`       |
| Phi-2              | 2.7B       | 1.7GB | `ollama run phi`               |
| Neural Chat        | 7B         | 4.1GB | `ollama run neural-chat`       |
| Starling           | 7B         | 4.1GB | `ollama run starling-lm`       |
| Code Llama         | 7B         | 3.8GB | `ollama run codellama`         |
| Llama 2 Uncensored | 7B         | 3.8GB | `ollama run llama2-uncensored` |
| Llama 2 13B        | 13B        | 7.3GB | `ollama run llama2:13b`        |
| Llama 2 70B        | 70B        | 39GB  | `ollama run llama2:70b`        |
| Orca Mini          | 3B         | 1.9GB | `ollama run orca-mini`         |
| Vicuna             | 7B         | 3.8GB | `ollama run vicuna`            |
| LLaVA              | 7B         | 4.5GB | `ollama run llava`             |
Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.