Building a Local AI Coding Environment with Ollama and OpenCode

2026.04.15

This page has been translated by machine translation.

AI vendors are currently competing fiercely to improve model accuracy. As a result, the operational costs of AI services are trending upward, and I expect subscription models will increasingly give way to usage-based pricing.

I use Claude Pro for personal work, and signs of cost pressure are already apparent. In March 2026, the consumption rate of session limits during peak hours (weekdays, 5:00-11:00 PT) was increased, and a weekly cap on computational resource usage was added on top of the existing 5-hour session limit. Subscription limits may tighten further, or plans may shift to usage-based billing entirely. Rather than scrambling for alternatives when that happens, I decided to explore local LLMs now.

In this article, I'll walk through setting up Ollama as a local LLM runtime and installing OpenCode as an AI coding agent, resulting in a completely local AI coding environment.

Test Environment

  • MacBook Pro (M1 Pro / 32GB)
  • macOS 26.4 (25E246)
  • Ollama 0.20.5
  • OpenCode 1.4.3
  • gemma4:26b (executed via Ollama)

What is Ollama

Ollama is an open-source tool for running LLMs in a local environment. It allows you to download and run various open-source models like Meta Llama and Google Gemma with a single command.

https://ollama.com/

What is OpenCode

OpenCode is an open-source terminal-based AI coding agent. Similar to Claude Code or GitHub Copilot CLI, it creates and edits code based on natural language instructions in the terminal.

The key feature of OpenCode is its compatibility with over 75 LLM providers. In addition to cloud models from OpenAI, Anthropic, and Google, it can also use local models running through Ollama. This means you can run an AI coding agent completely locally without depending on external APIs.

https://opencode.ai/

Step 1: Install Ollama

First, let's install Ollama. While it can be installed via Homebrew, the official GitHub README.md recommends using their installation script, so we'll use that method.

curl -fsSL https://ollama.com/install.sh | sh

After installation, launch Ollama.

ollama

The Ollama menu will appear in the terminal.

Ollama's first launch screen showing options like Chat with a model and Launch Claude Code

Selecting Chat with a model allows you to choose from several models. Since we want to try a local LLM, let's select gemma4.

Ollama's model selection screen with gemma4 highlighted

After the download completes, the chat will start automatically. If you receive a normal response, the Ollama installation was successful.

Screen showing chat with gemma4 on Ollama. The model introduces itself as Gemma 4

Enter /exit to exit Ollama for now.

Step 2: Download the Model

In Step 1, we downloaded a model through the interactive selection at first launch; for OpenCode, we'll explicitly pull a model with a specified parameter size.

We'll use gemma4:26b for this demonstration. gemma4 is an open model from Google based on Gemini, newly released in April 2026. As discussed later, it was selected for its fast response speed, which makes it comfortable to run locally.

ollama pull gemma4:26b

After the download is complete, you can confirm the model is available with ollama list. Let's test it with a Japanese prompt that asks "Hello. Which model are you?"

ollama run gemma4:26b "こんにちは。あなたはどのモデルですか?"

Result of running ollama run gemma4:26b in terminal. The response is returned properly in Japanese

I also tried qwen3.5:27b, but in my environment, the response was slow and difficult to use. In contrast, gemma4:26b responded quickly and seemed to perform reasonably accurate inferences. Performance will vary depending on available memory.

I've heard that qwen3.5 is highly rated for coding performance, so I plan to try it again later.
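Before moving on, it's worth confirming that Ollama's OpenAI-compatible HTTP endpoint, the interface OpenCode will use in later steps, can reach the model. A minimal sketch, assuming the Ollama server is running on its default port 11434 and using the gemma4:26b tag from this step:

```shell
# Send a one-shot chat request through Ollama's OpenAI-compatible API.
# Falls through to a message if the server is not running.
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma4:26b", "messages": [{"role": "user", "content": "Hello"}]}' \
  || echo "Ollama server is not running"
```

If the server is up, the response is an OpenAI-style chat completion in JSON, which is exactly what OpenCode will consume.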

Step 3: Install OpenCode

Next, let's install OpenCode.

curl -fsSL https://opencode.ai/install | bash

OpenCode installation completion screen showing version 1.4.3 and ASCII art logo

After installation, check the version.

opencode --version

1.4.3

Step 4: Configure OpenCode

To use Ollama models with OpenCode, you need a configuration file. Create opencode.json in your project's root directory, or place it as a global configuration at ~/.config/opencode/opencode.json, with the following content.

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "gemma4:26b": {
          "name": "Gemma 4 26B"
        }
      }
    }
  }
}

Key configuration points:

  • npm specifies the AI SDK provider package that OpenCode uses internally. For Ollama integration, use @ai-sdk/openai-compatible. OpenCode will automatically install this package, so manual installation is not required
  • baseURL specifies Ollama's OpenAI-compatible endpoint (http://localhost:11434/v1). Note that we use /v1, not Ollama's native /api endpoint
  • models specifies the model name downloaded with Ollama
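To verify that the baseURL above points at a working endpoint, you can query it directly; Ollama's compatibility layer exposes the standard /v1/models listing. A quick sanity check, assuming the server is on the default port:

```shell
# List installed models through the OpenAI-compatible API.
# gemma4:26b should appear if Step 2 completed; falls through to a
# message if the server is not running.
curl -s http://localhost:11434/v1/models || echo "Ollama server is not running"
```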

Ollama's default context window varies depending on available VRAM. In this test environment (32GB unified memory), the default is 32K tokens. If you want to change this, you can adjust Ollama's num_ctx parameter.
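One way to change it is to derive a model variant through a Modelfile. A minimal sketch, using the gemma4:26b tag from this article (65536 is an arbitrary example value):

```shell
# Write a Modelfile that derives a larger-context variant of the model.
# PARAMETER num_ctx is standard Modelfile syntax; the value is an example.
cat > Modelfile <<'EOF'
FROM gemma4:26b
PARAMETER num_ctx 65536
EOF
```

Register the variant with ollama create gemma4-26b-64k -f Modelfile, then reference the new tag under models in opencode.json.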

Step 5: Verification

After ensuring Ollama is running, start OpenCode.

opencode

OpenCode startup screen showing logo and prompt input field

After startup, enter /models and verify that Ollama's gemma4:26b appears in the options. If it does, your configuration is correctly applied.

OpenCode model selection screen showing Gemma 4 26B under the ollama provider

Enter a test prompt to confirm that the local model responds properly.

Screen showing OpenCode asking gemma4:26b a question and receiving a response

Conclusion

I've introduced the process of building a locally-running AI coding environment by combining Ollama and OpenCode.

While local LLMs are less accurate compared to cloud models, they don't require an internet connection and have no API usage fees. I believe they're a sufficient option for small code generation tasks in personal development or for projects where privacy is important.

The gemma4:26b model I used wasn't as capable for coding purposes as I had hoped, but I found it practical enough for normal Q&A and research tasks. I plan to use it for regular tasks going forward. I'm also interested in trying other models like qwen3.5:27b, which is highly rated for coding performance.

Setting up a local LLM environment to prepare for potential future price changes in cloud services is worthwhile. I hope this serves as a useful reference for those considering implementing local LLMs.
