In a previous article I covered getting GPTel running with LM Studio and a local Qwen model — a great setup when privacy and offline access matter. But the Emacs AI ecosystem has grown considerably, and there is now a whole family of packages targeting different parts of a coding workflow. This article picks up where that one left off: connecting GPTel to Claude's API for heavy reasoning work, introducing Ellama as a batteries-included alternative for local tasks, adding inline completions with copilot.el or codeium.el, and tying it all together into a layered workflow where each tool handles what it does best.
The Layered Approach
The key insight is that no single AI tool covers every use case well. Instead of picking one and compromising everywhere, think in layers:
- Inline completions — ghost-text suggestions as you type (copilot.el, codeium.el)
- Chat and reasoning — ask questions, explain code, generate functions (GPTel + Claude)
- Routine local tasks — summarise, review, translate, improve grammar (Ellama + Ollama)
- Multi-file agentic edits — large refactors across an entire codebase (aidermacs)
Each layer uses a different model tier. Inline completions need a fast, cheap model. Chat and reasoning work benefits from a capable cloud model. Local tasks can stay on-device. Agentic edits often use two models — one for reasoning, one for writing code. You only pay for cloud inference when it genuinely earns its cost.
GPTel with Claude (Anthropic API)
The GPTel package has supported Anthropic natively since version 0.9. If you have it installed, adding Claude as a backend takes a few lines of Elisp.
Storing the API Key Securely
Never hardcode an API key in your config. Store it in ~/.authinfo (or ~/.authinfo.gpg for GPG encryption):
machine api.anthropic.com login apikey password sk-ant-YOUR-KEY-HERE
GPTel reads this automatically when gptel-api-key is left at its default value. No key in your init file, no accidental commits to a public dotfiles repo.
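If you prefer to be explicit rather than rely on the default lookup, gptel-api-key also accepts a function. A minimal sketch using Emacs's built-in auth-source library:

```elisp
;; Resolve the Anthropic key from ~/.authinfo(.gpg) at request time.
(setq gptel-api-key
      (lambda ()
        (auth-source-pick-first-password :host "api.anthropic.com")))
```

The lambda runs on each request, so rotating the key in ~/.authinfo takes effect without restarting Emacs.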
Basic Configuration
(use-package gptel
  :ensure t
  :config
  ;; Register Claude as a backend and make it the default
  (setq gptel-backend (gptel-make-anthropic "Claude" :stream t)
        gptel-model 'claude-sonnet-4-5))

gptel-make-anthropic both registers the backend (so it shows up in gptel-menu) and returns it, so one call is enough to register it and set it as the default.
Extended Thinking Mode
Claude's extended thinking feature lets the model reason through a problem before answering. It is noticeably better on algorithmic questions, debugging, and architecture decisions. Enable it by registering a separate backend:
(gptel-make-anthropic "Claude-thinking"
  :stream t
  :models '(claude-sonnet-4-5)
  :request-params '(:thinking (:type "enabled" :budget_tokens 4096)
                    :max_tokens 8192))
Switch to it for hard problems with M-x gptel-menu (C-c RET) → Backend → Claude-thinking. Switch back to the standard model for quick questions where the extra latency is not worth it.
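Switching can also be scripted. Below is a sketch of a toggle command; it assumes both backends above are registered, that gptel-get-backend and the gptel-backend-name accessor are available (they are in recent gptel releases), and my/gptel-toggle-thinking is a name invented here:

```elisp
(defun my/gptel-toggle-thinking ()
  "Toggle gptel between the Claude and Claude-thinking backends."
  (interactive)
  (let ((next (if (equal (gptel-backend-name gptel-backend) "Claude")
                  "Claude-thinking"
                "Claude")))
    (setq gptel-backend (gptel-get-backend next))
    (message "GPTel backend: %s" next)))
```

Bind it next to the other gptel keys if you find yourself switching often.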
Core Keybindings
(use-package gptel
  :ensure t
  :bind (("C-c l" . gptel)          ; open/switch to chat buffer
         ("C-c C-<" . gptel-send)   ; send region or buffer
         ("C-c RET" . gptel-menu))) ; full options menu
The most useful interaction pattern: select a region of code, call gptel-send (C-c C-<), and the response is inserted directly into the buffer below your selection. No copy-paste, no context switching.
Using GPTel Inside Org Mode
GPTel has first-class Org mode support that makes it significantly more useful for long-running work:
;; Limit context to the current Org heading — stops the whole document
;; being sent as context for every message
M-x gptel-org-set-topic

;; Branch conversations per heading — each subtree gets its own thread
;; (this is a user option, not a command)
(setq gptel-org-branching-context t)

;; Save backend/model as Org properties so different headings
;; use different models automatically
M-x gptel-org-set-properties
A practical pattern: keep a dev-notes.org file with one heading per feature or bug. Each heading becomes its own Claude conversation with full branching context. Your conversation history lives in a plain text file, version-controlled alongside your code.
Ellama: Batteries Included for Local Tasks
Ellama takes a different philosophy from GPTel. Where GPTel is a minimal, composable chat layer you build workflows on top of, Ellama ships with dozens of ready-made commands for specific tasks. It is built on the llm package rather than GPTel's own backend code, and it defaults to Ollama for local inference.
Installation and Setup
;; Install Ollama first: https://ollama.com
;; Then pull a model: ollama pull llama3.2
(use-package ellama
  :ensure t
  :init
  (setopt ellama-language "English"
          ellama-keymap-prefix "C-c e")
  :config
  (require 'llm-ollama)
  (setopt ellama-provider
          (make-llm-ollama
           :chat-model "llama3.2"
           :embedding-model "nomic-embed-text")))
With C-c e as the prefix, you get a transient menu of commands. The most useful for coding:
| Command | Keybinding | What it does |
|---|---|---|
| ellama-code-review | C-c e c r | Reviews selected code and suggests improvements |
| ellama-code-complete | C-c e c c | Completes selected code in context |
| ellama-code-add | C-c e c a | Adds code based on a description you provide |
| ellama-code-edit | C-c e c e | Edits code based on an instruction |
| ellama-summarize | C-c e s | Summarises selected text or buffer |
| ellama-improve-grammar | C-c e i g | Fixes grammar in selected text |
Multiple Providers
One of Ellama's strengths is routing different tasks to different models. Configure separate providers and switch between them:
(require 'llm-ollama)
(require 'llm-openai)

;; Note the backquote and commas: the provider constructors must be
;; evaluated, not stored as literal lists.
(setopt ellama-providers
        `(("local-llama" . ,(make-llm-ollama :chat-model "llama3.2"))
          ("local-coder" . ,(make-llm-ollama :chat-model "qwen2.5-coder:7b"))
          ("claude" . ,(make-llm-openai-compatible
                        :url "https://api.anthropic.com/v1/"
                        :chat-model "claude-sonnet-4-5"
                        :key (auth-source-pick-first-password
                              :host "api.anthropic.com")))))
Switch providers with M-x ellama-provider-select. Use the local coding model for routine completions, switch to Claude when you need stronger reasoning.
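Ellama's code commands can also be routed to a dedicated model without touching ellama-provider or switching manually. A sketch, assuming the ellama-coding-provider option (present in recent Ellama versions) and a pulled qwen2.5-coder model:

```elisp
;; Code-related commands (review, complete, edit) use this provider;
;; everything else keeps using ellama-provider.
(setopt ellama-coding-provider
        (make-llm-ollama :chat-model "qwen2.5-coder:7b"))
```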
Session Persistence
Ellama auto-saves conversation sessions and lets you load, rename, or delete them:
M-x ellama-load-session ; resume a previous conversation
M-x ellama-rename-session ; give the current session a meaningful name
Sessions are stored as Org files in ellama-sessions-directory, so they are plain text and searchable.
Inline Completions: copilot.el vs codeium.el
Both GPTel and Ellama are on-demand tools — you invoke them explicitly. Inline completion packages work differently: they watch what you type and offer ghost-text suggestions in real time, similar to GitHub Copilot in VS Code.
copilot.el
copilot.el connects to GitHub's official Copilot language server. As of early 2025 it uses the @github/copilot-language-server Node package directly (dropping the previous copilot.vim dependency). It requires Node.js 22+ and a GitHub Copilot subscription (a free tier is available with limited completions).
(use-package copilot
  :ensure t
  :hook (prog-mode . copilot-mode)
  :bind (:map copilot-completion-map
              ("<tab>" . copilot-accept-completion)
              ("TAB" . copilot-accept-completion)
              ("C-TAB" . copilot-accept-completion-by-word)
              ("C-n" . copilot-next-completion)
              ("C-p" . copilot-previous-completion)))
After installing, run M-x copilot-install-server once, then M-x copilot-login to authenticate with GitHub. Ghost text appears as you type; TAB accepts the full suggestion, C-TAB accepts word by word.
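If you would rather have one global TAB that accepts a suggestion when visible and indents otherwise, the usual fallback pattern relies on copilot-accept-completion returning nil when there is nothing to accept. A sketch; my/copilot-tab is a made-up name:

```elisp
(defun my/copilot-tab ()
  "Accept the Copilot suggestion if one is shown, otherwise indent."
  (interactive)
  (or (copilot-accept-completion)
      (indent-for-tab-command)))

(with-eval-after-load 'copilot
  (define-key copilot-mode-map (kbd "<tab>") #'my/copilot-tab))
```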
The standout feature added in 2025 is Next Edit Suggestions (NES): Copilot predicts where you will need to make related edits elsewhere in the file based on your recent changes, and surfaces those suggestions proactively. It is genuinely useful during refactoring.
codeium.el
codeium.el is the free alternative. It uses Codeium's AI backend and integrates via completion-at-point-functions, plugging naturally into company-mode or corfu rather than using a ghost-text overlay.
(use-package codeium
  :ensure t
  :init
  (add-to-list 'completion-at-point-functions #'codeium-completion-at-point)
  :config
  ;; Run M-x codeium-install once to download the binary
  ;; Run M-x codeium-auth to authenticate
  (setq codeium-mode-line-enable
        (lambda (api)
          (not (memq api '(CancellableGetCompletions Heartbeat AcceptCompletion))))))
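Because codeium is an ordinary capf, it can be combined with other completion sources. A sketch assuming the cape package is installed (its cape-capf-super merges several capfs into one); my/codeium-capf-setup is a name invented here:

```elisp
(use-package cape :ensure t)

(defun my/codeium-capf-setup ()
  "Merge codeium with dabbrev so both contribute candidates."
  (setq-local completion-at-point-functions
              (list (cape-capf-super #'codeium-completion-at-point
                                     #'cape-dabbrev))))

(add-hook 'prog-mode-hook #'my/codeium-capf-setup)
```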
Which to choose: If you already pay for GitHub Copilot, copilot.el is the obvious pick — NES alone justifies it. If you want free inline completions with no subscription, codeium.el works well and integrates more naturally with Emacs's completion framework.
aidermacs: Agentic Multi-File Editing
GPTel and Ellama work well at the function or file level. For larger tasks — migrating an API, refactoring across a module, or making a change that touches a dozen files — aidermacs is the right tool. It wraps Aider, the terminal-based AI pair programmer, with an Emacs-native interface.
Setup
# Install Aider
pip install aider-chat
# Set your API key (or use ~/.authinfo as above)
export ANTHROPIC_API_KEY=sk-ant-YOUR-KEY-HERE
;; aidermacs is on NonGNU ELPA
(use-package aidermacs
  :ensure t
  :bind (("C-c a" . aidermacs-transient-menu))
  :config
  ;; Use Claude Sonnet for reasoning, Haiku for code generation (faster + cheaper)
  (setq aidermacs-extra-args
        '("--model" "claude-sonnet-4-5"
          "--editor-model" "claude-haiku-4-5")))
The --model / --editor-model split drives Aider's Architect Mode (turned on with Aider's --architect flag): a capable model handles reasoning and planning, while a faster, cheaper model does the actual code writing. This cuts costs significantly on large tasks without sacrificing quality.
Workflow
- Open any file in your project, call C-c a to open the aidermacs transient menu.
- Add files to the Aider context with aidermacs-add-current-file or aidermacs-add-files-in-dir.
- Describe what you want in plain English: "Extract the authentication logic from UserController into a separate AuthService class and update all call sites."
- Aider makes the changes. Magit's auto-revert-mode refreshes your buffers automatically.
- Review the diff in Magit or with aidermacs's built-in ediff integration. Accept, reject, or ask for revisions.
aidermacs also supports Architect Mode for TDD: describe the desired behaviour, let the reasoning model write tests first, then let the editor model write the implementation to pass them.
GPTel Tool Use
GPTel 0.9+ supports tool use (function calling), which lets Claude interact with your Emacs environment directly — reading buffers, running shell commands, creating files. This is the foundation for building lightweight agents without leaving Emacs.
;; A tool that reads an Emacs buffer and returns its contents
(gptel-make-tool
 :name "read_buffer"
 :description "Return the contents of an Emacs buffer"
 :function (lambda (buffer)
             (with-current-buffer buffer
               (buffer-substring-no-properties (point-min) (point-max))))
 :args (list '(:name "buffer"
               :type string
               :description "The name of the buffer to read")))
;; A tool that runs a shell command and returns stdout
(gptel-make-tool
 :name "run_command"
 :description "Run a shell command and return its output"
 :function (lambda (command)
             (shell-command-to-string command))
 :args (list '(:name "command"
               :type string
               :description "The shell command to run")))
With tools like these registered, you can ask Claude: "Look at the test output in the *compilation* buffer and suggest what is wrong with the failing test in src/auth.php" — and it will read both buffers itself rather than waiting for you to paste content.
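Note that defining a tool only registers it; tools still have to be selected for requests, either from gptel-menu's Tools entry or in Elisp. A sketch, assuming gptel-get-tool and the gptel-tools variable (present in gptel releases with tool support):

```elisp
;; Enable the two tools defined above for every request.
(setq gptel-tools (list (gptel-get-tool "read_buffer")
                        (gptel-get-tool "run_command")))
```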
GPTel also supports MCP (Model Context Protocol) via mcp.el, which lets you connect any MCP-compatible tool server:
(require 'gptel-integrations)
;; Then: M-x gptel-mcp-connect to register an MCP server
Putting It Together: A Practical Configuration
Here is a complete configuration that wires up all four layers:
;;; AI layer 1 — Inline completions (free, always on)
(use-package codeium
  :ensure t
  :init
  (add-to-list 'completion-at-point-functions #'codeium-completion-at-point))
;;; AI layer 2 — Chat and reasoning (GPTel + Claude)
(use-package gptel
  :ensure t
  :bind (("C-c l" . gptel)
         ("C-c C-<" . gptel-send)
         ("C-c RET" . gptel-menu))
  :config
  ;; Extended thinking backend for hard problems
  (gptel-make-anthropic "Claude-thinking"
    :stream t
    :request-params '(:thinking (:type "enabled" :budget_tokens 4096)
                      :max_tokens 8192))
  ;; Standard Claude backend, registered once and set as the default
  (setq gptel-backend (gptel-make-anthropic "Claude" :stream t)
        gptel-model 'claude-sonnet-4-5))
;;; AI layer 3 — Routine local tasks (Ellama + Ollama)
(use-package ellama
  :ensure t
  :init
  (setopt ellama-keymap-prefix "C-c e")
  :config
  (require 'llm-ollama)
  (setopt ellama-provider
          (make-llm-ollama
           :chat-model "llama3.2"
           :embedding-model "nomic-embed-text")))
;;; AI layer 4 — Agentic multi-file edits (aidermacs)
(use-package aidermacs
  :ensure t
  :bind ("C-c a" . aidermacs-transient-menu)
  :config
  (setq aidermacs-extra-args
        '("--model" "claude-sonnet-4-5"
          "--editor-model" "claude-haiku-4-5")))
With this setup, the decision of which tool to reach for becomes intuitive:
- Typing code → codeium inline suggestions appear automatically
- Need to understand or fix something → C-c C-< sends to Claude via GPTel
- Quick local task (summarise, review, grammar) → C-c e transient menu via Ellama
- Large refactor across multiple files → C-c a opens aidermacs
Tips for Effective AI Coding in Emacs
Give context explicitly
By default GPTel sends only what is in the current buffer up to point. Use gptel-add to attach the active region or another buffer as context before sending, and gptel-add-file for files on disk:

M-x gptel-add       ; add the region or current buffer to the context
M-x gptel-add-file  ; add a file to the context
Use Org mode as a scratchpad
Open a dedicated Org file for AI conversations. Use headings to separate topics, gptel-org-branching-context to keep each thread independent, and Org's folding to manage long conversations. The file is plain text and stays in your project's git history.
Keep prompts in your config
If you find yourself typing the same instruction repeatedly, turn it into a named gptel directive:
(setq gptel-directives
      '((default . "You are a helpful assistant.")
        (code-review . "Review the following code for bugs, performance issues, and style. Be concise.")
        (php . "You are an expert PHP developer. Suggest idiomatic PHP 8.3+ solutions.")))
Switch directive with C-c RET → System.
Always review AI diffs in Magit
Whether you use GPTel, Ellama, or aidermacs, establish the habit of reviewing changes in Magit before staging. AI-generated code can look correct but contain subtle errors — diffing keeps you in control.
Pick the right model for the task
Spending tokens on Claude Opus for a grammar correction wastes money. Use a local model via Ellama for routine tasks, reserve the capable cloud model for reasoning-heavy work. The layered setup above makes this the natural default rather than a conscious decision each time.
Summary
The Emacs AI toolkit in 2025 is mature and genuinely useful for professional development — but it works best when you treat it as a set of composable layers rather than a single tool:
- codeium.el or copilot.el for always-on inline completions as you type
- GPTel + Claude for on-demand chat, code generation, and extended reasoning — with Org mode integration for persistent conversation history
- Ellama + Ollama for fast, private, offline routine tasks with a menu of ready-made commands
- aidermacs for multi-file agentic edits and large refactors, with ediff review and Architect Mode
Each tool handles what it does best. You stay in Emacs throughout. The result is an AI-augmented workflow that actually fits how Emacs developers work — keyboard-driven, buffer-centric, and composable.