Running large language models (LLMs) locally has never been easier. Ollama provides a lightweight, fast, and privacy‑friendly way to run models like Llama 3, Mistral, Phi‑3, Gemma, and many others directly on your machine — without sending data to the cloud.
In this guide, you’ll learn:
- how to install Ollama
- how to verify the installation
- how to download and run your first model
- how to send your first chat message
- optional: how to use the local Ollama API
Let’s get started.
1. What Is Ollama?
Ollama is a local runtime for LLMs that focuses on simplicity and performance. It provides:
- one‑command model downloads
- automatic GPU acceleration (if available)
- a built‑in chat interface
- a local REST API
- support for many open‑source models
It’s ideal for developers, researchers, and anyone who wants to experiment with AI locally.
2. Installing Ollama
Ollama supports macOS, Windows, and Linux. Installation takes only a minute.
Windows Installation
- Download the Windows installer from the official website: https://ollama.com/download
- Run the .exe file
- Follow the setup wizard
- After installation, Ollama is available in PowerShell or Command Prompt
macOS Installation
- Download the macOS installer from the official website: https://ollama.com/download
- Open the .dmg file
- Drag Ollama into your Applications folder
- Launch Ollama once to initialize the background service
Linux Installation
Run the official install script:
curl -fsSL https://ollama.com/install.sh | sh
This installs:
- the Ollama daemon
- the command‑line interface
- system services
3. Verify That Ollama Is Installed
Open your terminal (macOS/Linux) or PowerShell (Windows) and run:
ollama --version
If you see a version number, everything is installed correctly.
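Beyond the version check, you can also confirm that the background server is listening. Here is a minimal Python sketch, assuming the default port 11434 (the hypothetical helper name `is_ollama_up` is just for illustration):

```python
from urllib.request import urlopen
from urllib.error import URLError

def is_ollama_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the local Ollama server answers on its default port."""
    try:
        with urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused or timed out: the server is not running.
        return False

if __name__ == "__main__":
    print("Ollama server reachable:", is_ollama_up())
```

If this prints False, make sure the Ollama app or daemon is actually running before moving on.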
4. Download and Run Your First Model
Ollama downloads models automatically when you run them for the first time.
For example, to run Llama 3:
ollama run llama3
What happens now:
- Ollama downloads the model
- The model starts running locally
- A chat prompt appears
5. Send Your First Chat Message
Once the model is running, you’ll see a prompt and can type anything, for example:
>>> Hello! How are you?
The model will respond immediately.
To exit the chat:
- type /bye
- or press Ctrl + C
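Besides the interactive chat, `ollama run` also accepts a prompt as a command-line argument and exits after printing the answer, which makes it easy to script. A small Python wrapper sketch (model name and prompt are examples; it checks that the `ollama` binary is on your PATH before calling it):

```python
import shutil
import subprocess

def ask_ollama(prompt: str, model: str = "llama3") -> str:
    """Run a one-shot prompt through the ollama CLI and return the reply text."""
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama CLI not found on PATH")
    result = subprocess.run(
        ["ollama", "run", model, prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    try:
        print(ask_ollama("Hello! How are you?"))
    except RuntimeError as err:
        print(err)
```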
6. List and Remove Installed Models
To see which models are currently installed:
ollama list
To remove a model and free up disk space:
ollama rm llama3
7. Using the Ollama API (Optional)
Ollama exposes a local API at:
http://localhost:11434
You can send requests using curl. By default the API streams its reply as a series of JSON objects, one per line; set "stream": false to receive a single JSON response instead:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Tell me something about Agentic AI.",
  "stream": false
}'
This is perfect for integrating Ollama into:
- Python scripts
- Web apps
- Backend services
- Automation workflows
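As a sketch of the Python-script case, here is a minimal client for the /api/generate endpoint using only the standard library. It assumes the default port and sets "stream": false so the server returns one JSON object; the model name, prompt, and helper names are examples:

```python
import json
from urllib.request import Request, urlopen

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> Request:
    """Build a non-streaming generate request for the local Ollama API."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the model's full reply."""
    with urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    try:
        print(generate("llama3", "Tell me something about Agentic AI."))
    except OSError:
        print("Ollama server not reachable at localhost:11434")
```

The same pattern works from any language with an HTTP client; only the request body shown above is Ollama-specific.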
Conclusion
Ollama makes it incredibly easy to run powerful AI models locally. With just a few commands, you can:
- install the runtime
- download models
- chat with them
- integrate them into your own applications
If you’re exploring AI, building prototypes, or experimenting with local LLMs, Ollama is one of the best tools to start with.