Quick start guide
- Prerequisites
- Starting a virtual machine with cudoctl
- Installing Ollama via SSH
- Using Docker to start an LLM API
Prerequisites
- Create a project and add an SSH key
- Download the CLI tool
Starting a virtual machine with cudoctl
Start a virtual machine with the base image you require; here we will use an image that already has NVIDIA drivers installed. You can use the web console to start a virtual machine with the Ubuntu 22.04 + NVIDIA drivers + Docker image, or alternatively use the command line tool cudoctl.
To use the command line tool you will need to get an API key from the web console; see: API key
Then run cudoctl init and enter your API key.
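For example:

```bash
# Authenticate cudoctl; you will be prompted to enter your API key
cudoctl init
```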
First we search for a virtual machine type to start. Having chosen the epyc-milan-rtx-a4000 machine type (16GB GPU) in the se-smedjebacken-1 data center and the ubuntu-2204-nvidia-535-docker-v20240214 image, we can start a virtual machine:
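A minimal sketch of the search and create commands; the exact subcommand spellings, flag names, and the VM id my-ollama-vm are assumptions here, so confirm the precise syntax with `cudoctl --help`:

```bash
# Search for a suitable GPU machine type (subcommand syntax is an assumption)
cudoctl search "A4000"

# Create the VM (flag names and the id "my-ollama-vm" are assumptions)
cudoctl vm create my-ollama-vm \
  --machine-type epyc-milan-rtx-a4000 \
  --data-center se-smedjebacken-1 \
  --image ubuntu-2204-nvidia-535-docker-v20240214 \
  --gpus 1
```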
Installing Ollama via SSH
Get the IP address of the virtual machine from the web console, then connect over SSH and install Ollama; a sketch of the commands follows the model table below. Here are some example models that can be downloaded:

| Model | Parameters | Size | Download |
|---|---|---|---|
| Llama 2 | 7B | 3.8GB | ollama run llama2 |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Dolphin Phi | 2.7B | 1.6GB | ollama run dolphin-phi |
| Phi-2 | 2.7B | 1.7GB | ollama run phi |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| Llama 2 13B | 13B | 7.3GB | ollama run llama2:13b |
| Llama 2 70B | 70B | 39GB | ollama run llama2:70b |
| Orca Mini | 3B | 1.9GB | ollama run orca-mini |
| Vicuna | 7B | 3.8GB | ollama run vicuna |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Gemma | 2B | 1.4GB | ollama run gemma:2b |
| Gemma | 7B | 4.8GB | ollama run gemma:7b |
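A minimal sketch of connecting and installing, assuming the image's default login user is root (substitute your VM's IP address and user) and using Ollama's official install script:

```bash
# Connect to the VM (replace <vm-ip> with the address from the console)
ssh root@<vm-ip>

# Install Ollama via its official install script
curl -fsSL https://ollama.com/install.sh | sh

# Pull and run a model from the table above, e.g. Llama 2 7B
ollama run llama2
```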
Using Docker to start an LLM API
If you created a VM in the previous step, delete it first; a sketch of the delete command is shown below.
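The `vm delete` subcommand spelling and the VM id are assumptions; verify the exact syntax with `cudoctl --help`:

```bash
# Delete the VM created in the previous step (id is an assumption)
cudoctl vm delete my-ollama-vm
```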
Create a start script file start-ollama.txt that runs an Ollama container and pulls the gemma:7b model, then pass it when creating the virtual machine with --start-script-file start-ollama.txt:
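A minimal sketch of start-ollama.txt, assuming Ollama's official Docker image (ollama/ollama) and its default API port 11434:

```bash
#!/bin/bash
# Run the Ollama server in Docker with GPU access
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Pull the gemma:7b model inside the container
docker exec ollama ollama pull gemma:7b
```

Then create the VM as before, adding the start script (the create subcommand and flag names are assumptions; check `cudoctl vm create --help`):

```bash
cudoctl vm create my-ollama-vm \
  --machine-type epyc-milan-rtx-a4000 \
  --data-center se-smedjebacken-1 \
  --image ubuntu-2204-nvidia-535-docker-v20240214 \
  --gpus 1 \
  --start-script-file start-ollama.txt
```

Once the VM is up, the Ollama HTTP API is available on port 11434 and can be queried directly:

```bash
# Generate a completion via Ollama's REST API (replace <vm-ip>)
curl http://<vm-ip>:11434/api/generate -d '{
  "model": "gemma:7b",
  "prompt": "Why is the sky blue?"
}'
```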