llm CLI Tool Guide 2026: Run Any LLM From Your Terminal
Signal Trigger
Why We're Covering This
The llm CLI tool by Simon Willison reached a heat score of 71, ranking #1 in the HookFlow AI dataset this week and outranking Claude, ChatGPT wrappers, and every IDE-integrated coding assistant currently tracked. An infrastructure CLI has topped every consumer AI product in the dataset. The pattern driving this: sustained practitioner discussions on Hacker News and GitHub centered on the tool's role as a provider-agnostic execution layer, a single interface for teams tired of rewriting LLM integrations each time a new model launches. This raises a question for builders: if senior engineers are standardizing on a terminal CLI over every GUI-first alternative, where do real production workflows actually live?
A.R.C. Analysis
Architecture · Reliability · Context
Architecture
llm is an open-source Python CLI distributed via pip. Its core is intentionally thin: a command-line interface for sending prompts to language models, with a plugin architecture handling provider diversity. The tool ships with OpenAI support built in; every other provider (Anthropic, Google Gemini, Mistral, Cohere, and local inference via Ollama) is added through installable plugins. The ecosystem currently includes 40+ community-maintained plugins.
Three architectural decisions matter most: (1) no cloud dependency in the core tool, (2) SQLite as the persistence layer for all prompt history, (3) API-first execution where each model backend handles its own auth. This means integration into production shell scripts, CI pipelines, or data processing workflows requires no SDK initialization, no session management, and no web server. The tool fits workflows where LLM calls need to be composable shell operations, not managed API sessions.
Reliability
Apache 2.0 licensed. Actively maintained by Simon Willison, co-creator of Django and creator of Datasette, with a 20+ year track record of long-horizon open-source maintenance. Heat score of 71 this week, #1 in the HookFlow dataset.
One data quality note from this week's synthesis: the broader dataset was affected by a social scout recovery artifact that inflated 7-day deltas across tools. The llm CLI's #1 ranking is best interpreted as a relative position signal (it sits at the top even in a noisy week) rather than a precise delta. The practitioner base (developers, researchers, data scientists) trends toward long-retention adoption patterns compared to consumer AI product audiences. No discontinuation risk, no pricing instability, and no rate-limit complaints surface in scout logs for the core tool. The only failure vector is upstream model API availability, which is provider-specific and outside the tool's control.
Context
Scout log analysis and community discussion point to four deployment patterns absent from the tool's marketing copy:
1. One-shot queries without a browser tab: replacing the habit of opening ChatGPT for quick lookups with a terminal command that's faster, logged, and scriptable.
2. Piping command output into llm: git diff | llm "write a commit message" treats the model as a Unix filter in an existing pipeline.
3. Provider comparison on identical prompts: running the same prompt against GPT-4o, Claude Sonnet, and a local Ollama model in sequence to evaluate output quality or cost tradeoffs before committing to an integration.
4. Shell script automation with LLM steps: embedding llm calls in bash scripts that process logs, classify data, or generate structured output as part of larger workflows.
The plugin ecosystem is the structural differentiator. Switching from OpenAI to a local model via Ollama requires changing one flag, not rewriting an integration. That portability, not prompt engineering features, is what practitioners are discussing.
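The provider-comparison pattern (item 3 above) can be sketched as a small shell function. This is a minimal illustration, not part of llm itself: compare_models is a hypothetical helper name, and the model IDs you pass must be ones that llm models lists on your machine.

```shell
# compare_models is a hypothetical helper, not an llm subcommand.
# Usage: compare_models "PROMPT" MODEL_ID [MODEL_ID...]
compare_models() (
    prompt=$1
    shift
    for model in "$@"; do
        printf '=== %s ===\n' "$model"
        # stdin is redirected so one call cannot swallow input meant for the script.
        # A failing provider is reported and the loop moves on to the next model.
        llm -m "$model" "$prompt" </dev/null || printf '(call failed for %s)\n' "$model"
        printf '\n'
    done
)
```

Called as compare_models "summarize this document" gpt-4o-mini claude-3-5-sonnet-latest, it prints each response under a header naming the model, which is usually enough for a quick side-by-side read before committing to an integration.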
Installation & Quick Start
pip install llm
Set your first API key:
llm keys set openai
# Paste your key when prompted
Run a prompt:
llm "explain this in one sentence: what is a vector database"
Check your interaction history:
llm logs
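No config files, no environment variable ceremony beyond the key. The SQLite log is created automatically at first run in a platform-appropriate data directory (~/.config/io.datasette.llm/ on Linux, ~/Library/Application Support/io.datasette.llm/ on macOS). Run llm logs path to print the exact location on your machine.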
For multi-turn conversations, the -c flag continues the last conversation:
llm "what are the tradeoffs of using SQLite in production"
llm -c "what about for read-heavy workloads specifically"
Plugin System: Adding More Models
The plugin system separates llm from a simple OpenAI wrapper. Install providers as Python packages:
llm install llm-claude-3 # Anthropic Claude models
llm install llm-gemini # Google Gemini
llm install llm-ollama # Local models via Ollama
llm install llm-mistral # Mistral AI
List all available models after installation:
llm models
Switch providers with a single flag:
llm -m claude-3-5-sonnet-latest "summarize this document"
llm -m gemini-1.5-pro "summarize this document"
llm -m llama3.2:latest "summarize this document"
Model aliases let you define shortcuts. Set a default model:
llm models default claude-3-5-sonnet-latest
Or create a named alias:
llm aliases set fast gpt-4o-mini
llm -m fast "quick check: is this valid JSON?"
The plugin registry at llm.datasette.io/plugins lists 40+ available providers, including Bedrock, Vertex AI, Groq, and specialized research model hosts.
Prompt Logging & History
Every interaction (prompt, response, model used, timestamp, token counts) is written to a local SQLite database automatically. No opt-in required. No data leaves your machine except to the model API.
llm logs # Recent interactions, formatted
llm logs --json # Machine-readable output
llm logs --model claude-3-5-sonnet-latest # Filter by model
llm logs -n 50 # Last 50 entries
For programmatic access, the SQLite file is queryable directly with any SQLite tool, including Datasette. Because the database location differs by platform, ask llm for the path rather than hard-coding it:
datasette "$(llm logs path)"
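For cost attribution, a per-model call count is often the first query people want. The sketch below assumes the sqlite3 command-line tool is installed and that the log database has a responses table with a model column; the schema can change between llm versions, so check it first with sqlite3 "$(llm logs path)" .schema. The function name llm_calls_per_model is hypothetical.

```shell
# llm_calls_per_model is a hypothetical helper, not an llm subcommand.
# Usage: llm_calls_per_model [PATH_TO_LOGS_DB]
# Assumes a `responses` table with a `model` column (verify with .schema).
llm_calls_per_model() (
    # Default to the database llm itself reports; an explicit path overrides it.
    db=${1:-$(llm logs path)}
    sqlite3 -header -column "$db" \
        'SELECT model, COUNT(*) AS calls FROM responses GROUP BY model ORDER BY calls DESC;'
)
```

Passing the path as an optional argument makes it possible to run the same query against a copy of the database, for example one collected from a CI runner.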
The community applies this capability to audit trails for automated pipelines, personal prompt knowledge bases, and cost attribution. When llm is embedded in a script processing production data, the log provides a complete record of every model call without additional instrumentation. This differentiates it from Claude Code and similar agentic tools: there is no vendor-managed conversation history, no data retention policy to audit, no account to log into.
Shell Scripting with llm
Three patterns from practitioner deployments, in order of complexity:
Commit message generation from git diff
git diff --staged | llm "write a concise git commit message for these changes"
Add it as a git alias in .gitconfig and it becomes a single command in your standard commit workflow.
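A shell function is one alternative to a git alias when you want a guard and a review step around the generated message. The following is a sketch under stated assumptions: aicommit is a hypothetical name, and the choice to open the editor before committing is a design preference, not something llm or git requires.

```shell
# aicommit is a hypothetical helper name.
# Drafts a commit message from the staged diff, then opens it in the
# editor (git commit -e) so a human reviews it before the commit is made.
aicommit() (
    # `git diff --staged --quiet` exits 0 when there are no staged changes.
    if git diff --staged --quiet; then
        echo "nothing staged" >&2
        exit 1
    fi
    # Abort if the model call fails instead of committing an empty message.
    msg=$(git diff --staged | llm "write a concise git commit message for these changes") || exit 1
    git commit -e -m "$msg"
)
```

Keeping the editor step means the model only drafts the message; the committer still approves or rewrites it.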
Error log diagnosis
tail -n 100 /var/log/app/error.log | llm "identify the root cause and suggest a fix"
This fits incident response workflows where a fast first-pass on log volume precedes manual review.
Structured output extraction with jq
llm --system "respond only with valid JSON" \
"extract: company name, funding round, amount from: $TEXT" \
| jq '.company_name'
Combined with --system for persona or format control and jq for extraction, this becomes a lightweight document parsing pipeline without a dedicated service.
For recurring tasks, wrap these in shell functions or Makefiles. Because llm runs as a short-lived Python process rather than a long-running server, there is nothing to start or keep alive between calls, which makes it practical to invoke from scripts.
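As one way to package the error log pattern for reuse, the sketch below wraps it in a function with a readable-file check and a configurable line count. The name triage_log and the default of 100 lines are illustrative choices for this example.

```shell
# triage_log is a hypothetical helper name.
# Usage: triage_log LOGFILE [LINES]   (LINES defaults to 100)
triage_log() (
    file=$1
    lines=${2:-100}
    # Fail early with a clear message rather than sending an empty prompt.
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        exit 1
    fi
    tail -n "$lines" "$file" | llm "identify the root cause and suggest a fix"
)
```

For example, triage_log /var/log/app/error.log 200 sends the last 200 lines instead of the default 100.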
FAQ
How does llm compare to the OpenAI CLI?
The OpenAI CLI is provider-specific: it exists to interact with OpenAI's API surface. llm treats OpenAI as one provider among 40+. If you build a workflow on the OpenAI CLI and then need to evaluate a cost-equivalent Mistral or Gemini model, you rewrite the integration. With llm, you install a plugin and change a flag. For teams making build-vs-buy decisions on model providers, that portability keeps switching costs low.
Can I use llm with local models, without any API keys?
Yes. Install the llm-ollama plugin and point it at a running Ollama instance on localhost. No API key required, no data leaves your machine. This fits workflows where data privacy, air-gapped environments, or inference cost elimination are constraints. The same llm commands work identically; only the -m flag changes to specify the local model.
What happens to my prompts? Is anything sent to a server besides the model API?
Only the model API endpoint receives your prompt data. The llm core tool has no telemetry, no analytics collection, and no cloud sync. The SQLite log is written locally. The only external network call is to a model provider. This can be verified in the open-source codebase, which is published under the Apache 2.0 license.
Is llm production-ready for automated pipelines, or is it a developer tool only?
The tool fits production shell script workflows where LLM calls are discrete, stateless operations. It's not designed for high-throughput API orchestration or stateful agent loops. The community deployment patterns (log analysis, commit message generation, document processing pipelines) are production uses that share a common shape: single-turn, input/output, scriptable. If your architecture requires retry logic, rate limit management, or structured output validation at scale, layer additional tooling on top of or instead of llm.
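As an example of the kind of layer you might add, here is a minimal retry wrapper with exponential backoff. It is a sketch, not a feature of llm: llm_retry, LLM_RETRIES, and LLM_BACKOFF are hypothetical names invented for this example. It is intended for prompts passed as arguments; piped stdin is consumed by the first attempt, so save piped input to a file first if you need retries.

```shell
# llm_retry, LLM_RETRIES and LLM_BACKOFF are hypothetical names for this
# sketch; they are not llm options or settings.
# Usage: llm_retry [llm arguments...]
llm_retry() (
    max=${LLM_RETRIES:-3}      # total attempts
    delay=${LLM_BACKOFF:-2}    # seconds before the first retry, doubled each time
    attempt=1
    while :; do
        # Capture output so a failed attempt never leaks partial text downstream.
        if out=$(llm "$@"); then
            printf '%s\n' "$out"
            exit 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "llm failed after $attempt attempts" >&2
            exit 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
)
```

This covers transient upstream failures only. Rate limit handling that respects provider headers, or schema validation of structured output, would still need dedicated tooling.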
Track the Live Heat Score
The llm CLI sits at heat score 71, #1 in the HookFlow dataset this week. That ranking shifts as new signals arrive from GitHub, Hacker News, PyPI download acceleration, and community platform monitoring.
Track the llm CLI heat score live at hookflow.ai
If the score sustains at the top of the distribution into a second cycle without the social scout recovery artifact, that confirms practitioner adoption is structural, not episodic. Worth watching before committing a build-vs-buy decision on your LLM integration layer.
Verdict: Build with it. For any team running LLM calls inside shell scripts, CI pipelines, or developer tooling, the plugin portability and zero vendor lock-in on prompt history make this the lowest-risk integration point in the command-line AI category.
Heat scores update daily across 300+ AI tools.