Continue.dev In VS Code And JetBrains, Open Source AI Coding Setup
Continue 0.9 with Claude Sonnet 4.6 and a local Ollama fallback, my dual-IDE walkthrough

Continue is the open-source coding-assistant extension that runs the same config across VS Code and JetBrains IDEs. I use VS Code for web work and IntelliJ IDEA for the rare Java/Kotlin job, and Continue is the only AI coding tool I have that gives me a single config file (config.json) that works in both. This walkthrough is the dual-IDE setup that I keep across my Linux box and my Mac.
What you'll build
Continue 0.9 installed in VS Code (or IntelliJ), wired to Claude Sonnet 4.6 for chat plus a local Ollama Qwen 2.5 model for autocomplete, with a single shared ~/.continue/config.json that both IDEs read. Roughly 25 minutes if Ollama is fresh.
Caption: Continue chat panel beside the config.json that drives both my IDEs.
Prerequisites
- VS Code 1.95+ or IntelliJ IDEA 2024.2+ (Community is fine, no Ultimate needed)
- Ollama installed for the local autocomplete model
- An Anthropic API key for the chat model
- 16GB RAM if you want autocomplete responsive while a chat runs
If you only want the cloud chat path, you can skip Ollama entirely. The only config change is in the Ollama-backed blocks (autocomplete and embeddings).
Step 1, install Continue in VS Code
In the Extensions panel, search for "Continue" and install the official one (continue.continue). After install, the Continue sidebar icon shows up in the activity bar.

Click the icon and the chat panel opens. The first prompt asks for a config; you can pick a starter or skip and edit the file directly.
Step 2, install Continue in JetBrains
In IntelliJ, open Settings → Plugins → Marketplace, search for "Continue", and install it. Restart the IDE when prompted.

The right-hand sidebar gets a Continue icon. Same chat panel as VS Code, same keyboard shortcuts.
Step 3, write the shared config.json
Continue reads ~/.continue/config.json in both IDEs. Mine, lightly anonymised:
{
  "models": [
    {
      "title": "Claude Sonnet 4.6",
      "provider": "anthropic",
      "model": "claude-sonnet-4-6",
      "apiKey": "sk-ant-api03-..."
    }
  ],
  "tabAutocompleteModel": {
    "title": "Local Qwen 2.5 7B",
    "provider": "ollama",
    "model": "qwen2.5:7b"
  },
  "embeddingsProvider": {
    "provider": "ollama",
    "model": "nomic-embed-text"
  },
  "contextProviders": [
    { "name": "code" },
    { "name": "diff" },
    { "name": "open" }
  ]
}

The chat model is Claude Sonnet for quality. The autocomplete model is local Qwen for cost and latency. Embeddings are local because I do not want to pay for context-aware retrieval.
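Because both IDEs read the same file, a typo in it breaks both at once. Here is a minimal sketch of a sanity check for the fields used above. The helper name checkConfig and the rules it enforces are my own assumptions for illustration, not part of Continue, and the interfaces cover only the fields shown in this config.

```typescript
// Only the fields used in the config above; other Continue fields are omitted.
interface ModelEntry {
  title: string;
  provider: string;
  model: string;
  apiKey?: string;
}

interface ContinueConfig {
  models?: ModelEntry[];
  tabAutocompleteModel?: ModelEntry;
  embeddingsProvider?: { provider: string; model: string };
}

// Hypothetical helper: returns human-readable problems, or an empty list.
function checkConfig(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    return [`config is not valid JSON: ${(err as Error).message}`];
  }
  if (typeof parsed !== "object" || parsed === null) {
    return ["config root is not a JSON object"];
  }
  const cfg = parsed as ContinueConfig;
  const problems: string[] = [];
  if (!cfg.models || cfg.models.length === 0) {
    problems.push('no chat model defined under "models"');
  }
  for (const m of cfg.models ?? []) {
    // Assumption of this sketch: an Anthropic entry needs an inline apiKey.
    if (m.provider === "anthropic" && !m.apiKey) {
      problems.push(`chat model "${m.title}" has no apiKey`);
    }
  }
  if (!cfg.tabAutocompleteModel) {
    // Flagged only because the setup in this walkthrough relies on it.
    problems.push("no tabAutocompleteModel block");
  }
  return problems;
}
```

Feed it the contents of ~/.continue/config.json as a string; an empty result means the blocks this walkthrough depends on are present.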
Step 4, test chat with a real task
Open the Continue panel in VS Code, type:
@code src/lib/news-loader.ts -- the loadNews function does not handle malformed frontmatter. Add try-catch around the parse and log a warning instead of throwing. Keep the signature.

Continue reads the file, proposes a diff, and lets you apply or copy it. Approval is per-file.
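For a sense of what that prompt is asking for, here is a rough sketch of the shape of the change. It is not the diff Continue produced and not my real news-loader.ts: the NewsItem type and the tiny parseFrontmatter stand-in are hypothetical, written only so the example runs on its own.

```typescript
interface NewsItem {
  meta: Record<string, string>;
  body: string;
}

// Hypothetical stand-in for a frontmatter parser: throws on malformed input.
function parseFrontmatter(raw: string): NewsItem {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(raw);
  if (!match) throw new Error("missing frontmatter fence");
  const meta: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) throw new Error(`malformed frontmatter line: ${line}`);
    meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { meta, body: match[2] };
}

// Signature unchanged; only the error handling is new.
function loadNews(raw: string): NewsItem {
  try {
    return parseFrontmatter(raw);
  } catch (err) {
    // Log a warning and fall back to the raw text instead of throwing.
    console.warn(`loadNews: ignoring frontmatter (${(err as Error).message})`);
    return { meta: {}, body: raw };
  }
}
```

The point of "keep the signature" is that callers do not change: a bad file now yields an item with empty metadata rather than an exception.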
Step 5, test autocomplete with local Qwen
Open any TypeScript file. Start typing a function. Continue triggers Qwen 2.5 inline after a short pause:
function calculateTotalPrice(items: CartItem[]) {
  return items.reduce(// <-- Qwen suggests the rest

Tab to accept, Esc to dismiss. The latency on my ThinkCentre with Qwen 2.5 7B running locally is 200-400ms; usable but a notch slower than GitHub Copilot.
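For reference, a finished version of that function might look like the sketch below. The CartItem fields (price, quantity) are my assumption since the snippet above does not define the type, and this is one plausible completion, not a recorded Qwen output.

```typescript
// Hypothetical cart item shape; the original snippet does not define CartItem.
interface CartItem {
  price: number; // unit price
  quantity: number; // units in the cart
}

function calculateTotalPrice(items: CartItem[]): number {
  // Sum price * quantity across all items, starting from zero.
  return items.reduce((total, item) => total + item.price * item.quantity, 0);
}
```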
First run
A normal session combining both:
[autocomplete kicks in inline as you type, runs against local Qwen]
[chat panel for refactor: type "@code src/components/SearchBox.tsx -- add a debounce of 300ms to onChange"]
Continue: [reads file, proposes diff with debounce import + ref hook]
You: Apply
Continue: [writes change to file]

You commit yourself; Continue does not auto-commit.
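The debounce request in that session boils down to "wait 300ms after the last keystroke before firing". Here is a framework-free sketch of the idea; it is not the React diff Continue proposed (that one used an import and a ref hook), and the debounce helper and its cancel method are my own illustration.

```typescript
interface Debounced<A extends unknown[]> {
  (...args: A): void;
  cancel(): void;
}

// Returns a wrapper that calls fn only after waitMs of silence.
function debounce<A extends unknown[]>(fn: (...args: A) => void, waitMs: number): Debounced<A> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const debounced = (...args: A): void => {
    // Every call resets the countdown, so only the last call in a burst fires.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
  // Drops any pending call, for example when the component unmounts.
  debounced.cancel = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
  };
  return debounced;
}

// Usage: wrap the handler once, then call the wrapper from onChange.
const onSearchChange = debounce((query: string) => {
  console.log(`searching for ${query}`);
}, 300);
```

In a React component the wrapper has to survive re-renders, which is why the proposed diff kept it in a ref.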
What broke for me
Two specifics from the IntelliJ side. First, the JetBrains plugin took ~30 seconds to start indexing on a 50k-file repo, and during that window the Continue panel showed "Loading..." with no progress hint. I assumed it was broken and restarted twice before realising it was working. The fix was to look at the IDE's bottom-right indexing indicator; once that finished, Continue lit up.
Second, the embeddings provider config silently fell back to OpenAI when Ollama was unreachable. I had Ollama bound to 127.0.0.1:11434 and IntelliJ was sandboxed in a way that broke the connection. The Continue logs in ~/.continue/logs/ showed the fallback. I fixed it by adding "providerOptions": { "host": "http://localhost:11434" } to the embeddings block, which forced the Ollama path explicitly.
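A quicker way to rule Ollama in or out is to ask it which models it has. The sketch below assumes Ollama's GET /api/tags endpoint, which lists pulled models, and the default 127.0.0.1:11434 address; the Fetcher type, hasModel, and checkOllama are names I made up for this example.

```typescript
// Minimal structural type so the sketch needs no DOM or Node typings.
type Fetcher = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// /api/tags answers with { models: [{ name: "qwen2.5:7b", ... }, ...] }.
function hasModel(tagsResponse: unknown, wanted: string): boolean {
  const models = (tagsResponse as { models?: { name?: string }[] } | null)?.models ?? [];
  // A bare name such as "nomic-embed-text" is listed with the ":latest" tag.
  const target = wanted.indexOf(":") !== -1 ? wanted : `${wanted}:latest`;
  return models.some((m) => m.name === target);
}

// Resolves to a list of problems; an empty list means Ollama is up and has every model.
function checkOllama(
  fetcher: Fetcher,
  wanted: string[],
  host = "http://127.0.0.1:11434"
): Promise<string[]> {
  return fetcher(`${host}/api/tags`).then(
    (res) => {
      if (!res.ok) return [`Ollama at ${host} answered with an error status`];
      return res
        .json()
        .then((body) =>
          wanted.filter((name) => !hasModel(body, name)).map((name) => `model not pulled: ${name}`)
        );
    },
    () => [`Ollama is not reachable at ${host}`]
  );
}
```

On Node 18 or later you can pass the built-in fetch as the first argument, with ["qwen2.5:7b", "nomic-embed-text"] as the model list. This only proves reachability from your shell, so if the check passes but the IDE still fails, the problem is on the IDE side.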
What it costs
| Item | Cost |
|---|---|
| VS Code | Free |
| IntelliJ Community | Free |
| Continue extension/plugin | Free |
| Claude Sonnet 4.6 API | $3/M input + $15/M output |
| Local Qwen 2.5 7B | Free |
| Embeddings (local) | Free |
With my own API key, my actual monthly Anthropic spend running Continue is Rs 400-700. The autocomplete is free because it runs locally. That is cheaper than Cursor Pro at Rs 1,660/mo, as long as you stay under 4 hours of chat use a day.
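To estimate your own bill from the per-token prices in the table, the arithmetic is short. A sketch, with the function names being mine and the rupee conversion rate left as a parameter because I am not quoting one here:

```typescript
// Prices from the table above, in USD per million tokens.
const INPUT_USD_PER_M = 3;
const OUTPUT_USD_PER_M = 15;

function estimateUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_M
  );
}

// inrPerUsd is a caller-supplied assumption, not a figure from this article.
function estimateInr(inputTokens: number, outputTokens: number, inrPerUsd: number): number {
  return estimateUsd(inputTokens, outputTokens) * inrPerUsd;
}
```

As a worked case, 1M input tokens plus 200k output tokens comes to $3 + $3 = $6 before conversion. Note that output tokens cost five times as much as input tokens, so long generated diffs move the bill faster than long prompts.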
When NOT to use this
Skip Continue if you want a single click-to-install workflow with no config file and no model decisions. Cursor and Copilot are easier to start with; Continue rewards the operator who wants to mix providers.
Skip if you do not run a JetBrains IDE. Continue's strength is the cross-IDE config; if you only ever use VS Code, Cursor or Cody is a smoother experience.
Indian operator angle
The cross-IDE story matters for Indian dev shops that mix tech stacks. A typical India-based shop has a Java backend in IntelliJ and a React frontend in VS Code; one Continue config covers both. There is no Indian provider option in Continue's stock model list yet, but Krutrim and Sarvam-1 work fine when added as provider: openai with a custom apiBase since they expose OpenAI-compatible endpoints.
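As a sketch of what "provider: openai with a custom apiBase" means in practice, the helper below builds such a model entry. The helper name is mine, and the title, model name, URL, and key in the usage are placeholders: substitute the values your provider documents.

```typescript
// Fields of a chat model entry as used in the config earlier, plus apiBase.
interface OpenAiCompatibleEntry {
  title: string;
  provider: "openai";
  model: string;
  apiBase: string;
  apiKey: string;
}

// Hypothetical helper: builds an entry for any OpenAI-compatible endpoint.
function openAiCompatibleModel(
  title: string,
  model: string,
  apiBase: string,
  apiKey: string
): OpenAiCompatibleEntry {
  return { title, provider: "openai", model, apiBase, apiKey };
}

// Every value below is a placeholder, not a real endpoint or model id.
const exampleEntry = openAiCompatibleModel(
  "Placeholder provider",
  "model-name-placeholder",
  "https://example.invalid/v1",
  "YOUR_API_KEY"
);
console.log(JSON.stringify(exampleEntry, null, 2));
```

The printed object is what you would add to the "models" array in config.json.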
Billing is the same Anthropic forex story as the other Claude paths. The local-Qwen autocomplete is the Indian-friendly bit, no API spend, no data leaving the country, no GST reverse-charge.