How to Install Ollama Locally and Run Your First Model

Running large language models (LLMs) locally has never been easier. Ollama provides a lightweight, fast, and privacy‑friendly way to run models like Llama 3, Mistral, Phi‑3, Gemma, and many others directly on your machine — without sending data to the cloud.

In this guide, you’ll learn:

  • how to install Ollama
  • how to verify the installation
  • how to download and run your first model
  • how to send your first chat message
  • optional: how to use the local Ollama API

Let’s get started.

1. What Is Ollama?

Ollama is a local runtime for LLMs that focuses on simplicity and performance. It provides:

  • one‑command model downloads
  • automatic GPU acceleration (if available)
  • a built‑in chat interface
  • a local REST API
  • support for many open‑source models

It’s ideal for developers, researchers, and anyone who wants to experiment with AI locally.

2. Installing Ollama

Ollama supports macOS, Windows, and Linux. Installation takes only a minute.

Windows Installation

  1. Download the Windows installer from the official website: https://ollama.com/download
  2. Run the .exe file
  3. Follow the setup wizard
  4. After installation, Ollama is available in PowerShell or Command Prompt

macOS Installation

  1. Download the macOS installer from the official website: https://ollama.com/download
  2. Open the .dmg file
  3. Drag Ollama into your Applications folder
  4. Launch Ollama once to initialize the background service

Linux Installation

Run the official install script:

curl -fsSL https://ollama.com/install.sh | sh

This installs:

  • the Ollama daemon
  • the command‑line interface
  • system services

3. Verify That Ollama Is Installed

Open your terminal (macOS/Linux) or PowerShell (Windows) and run:

ollama --version


If you see a version number, everything is installed correctly.

4. Download and Run Your First Model

Ollama downloads models automatically when you run them for the first time.

For example, to run Llama 3:

ollama run llama3

What happens now:

  • Ollama downloads the model
  • The model starts running locally
  • A chat prompt appears

5. Send Your First Chat Message

Once the model is running, you’ll see a prompt and can type anything, for example:

>>> Hello! How are you?

The model will respond immediately.

To exit the chat:

  • type /bye
  • or press Ctrl + C

6. List Installed Models and Remove a Model

To see which models are currently installed:

ollama list


To remove a model you no longer need and free up disk space:

ollama rm llama3

7. Using the Ollama API (Optional)

Ollama exposes a local API at:

http://localhost:11434


You can send requests using curl:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Tell me something about Agentic AI."
}'
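Note that by default the /api/generate endpoint streams its reply as a sequence of JSON lines; adding "stream": false to the request body returns a single JSON object instead. The same request can be sent from a Python script using only the standard library. A minimal sketch, assuming Ollama is running locally and llama3 has already been pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> bytes:
    # "stream": False asks Ollama to return one JSON object
    # instead of streaming the answer line by line
    return json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")

def generate(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With streaming disabled, the full answer sits in the "response" field
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server):
# print(generate("llama3", "Tell me something about Agentic AI."))
```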

This is perfect for integrating Ollama into:

  • Python scripts
  • Web apps
  • Backend services
  • Automation workflows
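For multi-turn conversations, Ollama also exposes a /api/chat endpoint that accepts a list of role/content messages rather than a single prompt string. Another minimal sketch, again assuming a local server with llama3 pulled:

```python
import json
import urllib.request

CHAT_URL = "http://localhost:11434/api/chat"

def build_chat_payload(model: str, messages: list) -> bytes:
    # messages is a list of {"role": ..., "content": ...} dicts,
    # so earlier turns can be replayed to give the model context
    return json.dumps(
        {"model": model, "messages": messages, "stream": False}
    ).encode("utf-8")

def chat(model: str, messages: list) -> str:
    req = urllib.request.Request(
        CHAT_URL,
        data=build_chat_payload(model, messages),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The assistant's reply is nested under "message" -> "content"
        return json.loads(resp.read())["message"]["content"]

# Example (requires a running Ollama server):
# history = [{"role": "user", "content": "Hello! How are you?"}]
# print(chat("llama3", history))
```

Appending each reply back onto the message list is what turns single requests into a running conversation.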

Conclusion

Ollama makes it incredibly easy to run powerful AI models locally. With just a few commands, you can:

  • install the runtime
  • download models
  • chat with them
  • integrate them into your own applications

If you’re exploring AI, building prototypes, or experimenting with local LLMs, Ollama is one of the best tools to start with.
