Ollama

Function: Ollama is a tool for downloading, managing, and running large language models (LLMs) on your own machine. It runs models locally rather than on a cloud service, and it is not a training platform.
Focus: Ollama is geared towards developers and other users who are comfortable with the command line and want to run open-source models locally. A model is downloaded and started with a single command.
Examples: Ollama is used to experiment with open-source models such as Llama 2 and Mistral, and to prototype applications against a model running on a local machine.

GitHub - jmorganca/ollama: Get up and running with Llama 2 and other large language models locally

Get up and running with large language models locally.

macOS

Windows

Coming soon! For now, you can install Ollama on Windows via WSL2.

(WSL2, which stands for Windows Subsystem for Linux 2, is a feature of Windows 10 and 11 that runs a real Linux kernel in a lightweight virtual machine managed by Windows. This lets you use Linux tools and commands on a Windows machine without dual-booting or setting up a virtual machine yourself.)

Linux & WSL2

curl https://ollama.ai/install.sh | sh

Manual install instructions

Docker

The official Ollama Docker image ollama/ollama is available on Docker Hub.
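As a minimal sketch of how the image might be used, assuming Docker is installed and a CPU-only setup is acceptable: the two function names below are hypothetical wrappers added for illustration, while the volume name, port, and container name follow the commonly documented invocation for this image.

```shell
# Hypothetical helper: start the Ollama server in a background container.
# The named volume "ollama" keeps downloaded models between restarts, and
# port 11434 is published so the server is reachable from the host.
start_ollama_container() {
  docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
}

# Hypothetical helper: open a chat with a model inside that container.
# Defaults to llama2 when no model name is given.
chat_in_container() {
  docker exec -it ollama ollama run "${1:-llama2}"
}
```

Calling start_ollama_container once and then chat_in_container mistral would start the server and open a Mistral chat inside the running container.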

Quickstart

To run and chat with Llama 2:

ollama run llama2
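The command above opens an interactive chat. For scripting, a prompt can also be passed as an argument so the model answers once and exits. The sketch below assumes Ollama is installed and on the PATH; ask_once is a hypothetical wrapper name, not part of Ollama.

```shell
# Hypothetical helper: ask a single question without entering the
# interactive chat. $1 is the model name, $2 is the prompt text.
ask_once() {
  model=$1
  prompt=$2
  ollama run "$model" "$prompt"
}
```

For example, ask_once llama2 "Why is the sky blue?" would print one answer and return to the shell.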

Model library

Ollama supports a list of open-source models, available at ollama.ai/library.

Here are some example open-source models that can be downloaded:

Model	Parameters	Size	Download
Llama 2	7B	3.8GB	`ollama run llama2`
Mistral	7B	4.1GB	`ollama run mistral`
Dolphin Phi	2.7B	1.6GB	`ollama run dolphin-phi`
Phi-2	2.7B	1.7GB	`ollama run phi`
Neural Chat	7B	4.1GB	`ollama run neural-chat`
Starling	7B	4.1GB	`ollama run starling-lm`
Code Llama	7B	3.8GB	`ollama run codellama`
Llama 2 Uncensored	7B	3.8GB	`ollama run llama2-uncensored`
Llama 2 13B	13B	7.3GB	`ollama run llama2:13b`
Llama 2 70B	70B	39GB	`ollama run llama2:70b`
Orca Mini	3B	1.9GB	`ollama run orca-mini`
Vicuna	7B	3.8GB	`ollama run vicuna`
LLaVA	7B	4.5GB	`ollama run llava`
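The names in the Download column, including tagged variants such as llama2:13b, can also be fetched ahead of time rather than on first run. This is a rough sketch assuming Ollama is installed; pull_models is a hypothetical helper name, while ollama pull and ollama list are the underlying commands.

```shell
# Hypothetical helper: download each named model in turn, stopping at the
# first failure, then list the models that are installed locally.
pull_models() {
  for model in "$@"; do
    ollama pull "$model" || return 1
  done
  ollama list
}
```

A call such as pull_models llama2:13b mistral would download both models and then show the local model list.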

Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
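The guidance in this note can be turned into a quick pre-flight check. The sketch below is illustrative only: both function names are hypothetical, the thresholds are copied from the note, and the amount of RAM is passed in by hand rather than detected.

```shell
# Hypothetical helper: print the minimum RAM in GB that the note recommends
# for a given parameter count. Sizes the note does not cover return an error.
min_ram_gb() {
  case "$1" in
    7B)  echo 8 ;;
    13B) echo 16 ;;
    33B) echo 32 ;;
    *)   return 1 ;;
  esac
}

# Hypothetical helper: succeed if the RAM given in $2 (whole GB) meets the
# recommendation for the parameter count in $1. Returns 2 for unknown sizes.
has_enough_ram() {
  need=$(min_ram_gb "$1") || return 2
  [ "$2" -ge "$need" ]
}
```

For instance, has_enough_ram 13B 16 && ollama run llama2:13b would start the 13B model only when the stated 16 GB meets the recommendation.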

