Hermes Agent (GitHub): Self-Host Guide, Setup & Daily-Use Notes

Hermes Agent is an open-source, self-improving AI agent built by Nous Research, with source code on GitHub at NousResearch/hermes-agent. It runs locally or on a small VPS, uses any LLM provider you choose, executes real tools (terminal, code, file, browser), creates and improves its own skills over time, and talks to you from Telegram, Discord, Slack, WhatsApp, Signal, the CLI — or, with Vylen, from a native mobile app built for it.

This is the guide we wished existed when we started self-hosting Hermes. It is written by a team that runs it every day to build Vylen, the calm command surface for user-owned Hermes instances.

On this page

What is Hermes Agent?
The Hermes Agent GitHub repo, at a glance
What it actually does, day to day
Installing Hermes Agent from GitHub
Configuration that matters
The approval-gate gotcha to know about
Hermes Agent vs. other open-source agents
Driving Hermes from your phone
FAQ
Further reading

What is Hermes Agent?

Hermes Agent is a local-first AI agent runtime. You install it on your own machine — laptop, home server, VPS, or serverless container — and it gives you a single process that can:

hold a conversation in a terminal, in chat apps, or through a third-party UI
call real tools to read files, run shell commands, execute Python, browse the web, or hit MCP servers
create new “skills” from successful runs and keep improving them
remember things across sessions through agent-curated memory
schedule itself with a built-in cron and report back to any channel you connect

The defining trait, and what makes it different from most agent frameworks, is the closed learning loop: Hermes is designed to get better the longer you use it. It builds a model of you through Honcho dialectic user modeling, indexes its own past conversations with FTS5 for cross-session recall, and writes skills it can re-use later.

It is MIT-licensed, vendor-agnostic on the model layer (Nous Portal, OpenRouter, NovitaAI, NVIDIA NIM, z.ai/GLM, Kimi, MiniMax, Hugging Face, OpenAI, or your own endpoint), and built by the Nous Research team who also fine-tunes the Hermes family of open-weights LLMs.

The Hermes Agent GitHub repo, at a glance

Repository	github.com/NousResearch/hermes-agent
License	MIT
Maintainer	Nous Research
Official docs	hermes-agent.nousresearch.com/docs/
Community	Nous Research Discord
Language	Python (3.11+)
Standard install	One-line curl on Linux, macOS, WSL2, Termux
Windows	Early-beta native PowerShell installer; WSL2 is the battle-tested path
License of the upstream	MIT (use it commercially, fork it, embed it)

If you only need the link, that is the one above. The rest of this page is what you do after you found the repo.

What it actually does, day to day

Most agent READMEs read like a feature list. Here is what the seven Hermes Agent capabilities map to in practice.

1. A real terminal interface

hermes drops you into a full TUI: multiline editing, slash-command autocomplete, history, interrupt-and-redirect, streaming tool output. This is the primary way you use it on a dev machine. It is, candidly, one of the nicest agent TUIs we have used.

2. Lives where you do

A single hermes gateway process bridges Telegram, Discord, Slack, WhatsApp, Signal, and the CLI. Voice memos get transcribed. Conversations continue cross-platform — you start a thread in Telegram on your phone and finish it in the CLI.

This is the part Vylen explicitly does not replace. If a chat app already works for you, keep it. We pick up where chat apps end: long-running runs, tool activity streams, artifact previews, approval cards. More on that below.

3. A closed learning loop

After a complex task succeeds, Hermes can autonomously create a skill — a reusable recipe. It then improves that skill the next time it is used. Combined with agentskills.io compatibility, Honcho user modeling, and FTS5-indexed session search, this is the feature that compounds: your agent is more useful in month three than in week one.

4. Scheduled automations

A built-in cron lets the agent run itself. Daily reports, nightly backups, weekly audits — defined in natural language, delivered through any gateway you have wired up. The agent does the work unattended; you read the result over coffee.

5. Delegates and parallelizes

Hermes can spawn isolated subagents for parallel workstreams. You can also drop into Python and call tools via RPC, which collapses multi-step pipelines into a single zero-context-cost turn. This is a force multiplier on long research tasks.

6. Runs anywhere

Seven terminal backends: local, Docker, SSH, Singularity, Modal, Daytona, and Vercel Sandbox. Daytona and Modal hibernate when idle — your agent’s environment costs nearly nothing between sessions and wakes on demand. A $5 VPS works. A GPU cluster works. Your laptop works.

7. Research-ready

Batch trajectory generation and trajectory compression — relevant if you are training the next generation of tool-calling models. Most users will not touch this; the people building those models will.

Installing Hermes Agent from GitHub

The official quick install is one line. On Linux, macOS, WSL2, or Termux:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

On native Windows (PowerShell) — note that this path is early beta and rough edges should be filed as issues:

iex (irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1)

The Windows installer ships a portable Git Bash (MinGit, ~45 MB, unpacked to %LOCALAPPDATA%\hermes\git, no admin needed). If you already have Git installed, it is detected and reused.

Our recommendation On a fresh Windows machine, run the Linux/macOS installer inside WSL2 unless you have a specific reason not to. The WSL2 path is more battle-tested than native Windows today, and you avoid the one feature that currently requires WSL2 anyway (the browser-based dashboard chat pane uses a POSIX PTY).

After install, reload your shell and run hermes:

source ~/.bashrc    # or: source ~/.zshrc
hermes

You will be walked through model and provider selection on first run. If you prefer the full wizard, hermes setup configures everything at once.

From-source install (if you want to hack on it)

Clone, install the dev extras, and run:

git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
uv venv && source .venv/bin/activate
uv pip install -e ".[all]"
hermes

On Termux specifically, use .[termux] instead of .[all] — the full extra currently pulls Android-incompatible voice dependencies.

Docker

There is a Dockerfile in the repo root. The simplest thing that works:

docker build -t hermes-agent .
docker run -it --rm \
  -v hermes-data:/root/.hermes \
  -e HERMES_MODEL_PROVIDER=openrouter \
  -e OPENROUTER_API_KEY=sk-... \
  hermes-agent

For long-running deployments, point the volume at a path you back up, and pass your gateway credentials through environment variables rather than baking them into the image.

Configuration that matters

After install, these are the settings most people get wrong on first run.

Pick a model intentionally

hermes model brings up the picker. The right answer depends on what you are doing:

Daily driver, mid-budget: a strong open-weights model on OpenRouter or Nous Portal. Hermes 3 70B / 405B variants if you can afford the tokens.
Heavy tool use: a model that handles tool-calling cleanly. Anthropic Claude or OpenAI GPT-class through their direct APIs if you need maximum reliability.
Local-only, no API: a quantized open-weights model through Ollama or vLLM at your own endpoint. Expect slower turns and rougher tool handling, but no data leaves your machine.

You can change this any time without code changes. That is one of the underrated parts of Hermes — model choice is not a deployment decision, it is a per-session decision.

Configure tools, not just the model

hermes tools lets you enable or disable individual tools. Defaults are sensible, but two recommendations:

Keep terminal and execute_code enabled, but understand the approval implications (next section).
Turn on browse (web browser tool) only if your model handles it well; some smaller models loop on it.

Set up a gateway before you need one

hermes gateway is one command. Wire it up early — even if you only point it at one channel — because it changes how you use the agent. A Hermes you can ping from Telegram while it works in the cloud is qualitatively different from a Hermes you have to SSH to.

Health-check with `hermes doctor`

When anything feels off, hermes doctor is the first move. It catches the dependency, PATH, and credential issues that account for ~80% of real-world install pain.

The approval-gate gotcha to know about

Hermes has a dangerous-command approval system around its terminal tool: before running a shell command, it pattern-matches the command string and asks the user to approve or deny when the pattern matches a flagged category (e.g. rm -f /tmp/... triggers a delete in root path warning).

The catch we documented in The Scary Part: Hermes Approval Depends on Which Tool the Model Picks is that the same user intent can route through a different tool — execute_code instead of terminal — and skip the approval entirely. The model’s tool choice becomes part of the security boundary.

If you self-host Hermes for anything that matters, read that post before you do. The TL;DR: approval is tool-scoped today, but you want to think about it as action-scoped. Treat execute_code as a privileged surface, not a free pass.

Hermes Agent vs. other open-source agents

The honest comparison, from someone who has tried all of these.

	Hermes Agent	Open Interpreter	AutoGPT-style	Cloud agents (Claude/GPT/etc.)
Runs locally	Yes, first-class	Yes	Yes	No
Multi-channel chat	Telegram, Discord, Slack, WhatsApp, Signal, CLI	CLI only	Web UI	App-specific
Self-improving skills	Yes, autonomous	No	Limited	N/A
Persistent memory	Cross-session, Honcho	Per-session	Per-task	App-managed
Scheduled / unattended	Built-in cron	No	Limited	No
Model choice	Any provider, any time	Any provider	Any provider	Vendor-locked
Best for	Long-running personal agent that compounds	One-off coding tasks in your shell	Autonomous task chains	Quick, hosted use

Hermes is the right choice if you want an agent that lives with you for months and gets better the whole time. It is overkill if you only need to “do this one thing in my repo” — Open Interpreter or a coding-specific tool is lighter weight for that.

Driving Hermes from your phone

Here is the honest pitch for why Vylen exists.

Hermes natively supports Telegram, Discord, Slack, WhatsApp, Signal, and the CLI. Those are excellent for messaging an agent. They are not built to operate one.

Operating an agent means:

watching a long run progress in real time, with structured tool activity, not a wall of text
inspecting artifacts the agent produced (files, diffs, screenshots) without leaving the surface
approving or denying a dangerous action with the full context visible
jumping between several Hermes instances (home, work, server) without re-pairing each time
resuming work from any device — phone on the train, laptop at the desk

Chat apps cannot do those things well, and existing third-party UIs are mostly browser-only or developer-first. That gap is what Vylen fills. It is a native Android and iOS app (plus a web client) built specifically for the Hermes runtime — calm, mobile-first, focused on runs and tool streams rather than chat transcripts. Your Hermes keeps running on your hardware; Vylen Cloud only brokers an outbound tunnel.

The Hermes-side half of that surface is an open-source gateway plugin: hermes-vylen-gateway. It runs alongside Hermes Agent, opens a single outbound WebSocket, and routes messages, runs, and approvals to the Vylen client. No inbound ports on the host. No API keys leave it.

The plugin also owns the chat database — a small SQLite store on your own machine — which is what makes the same chat work live across all your devices at once (see Same chat, every screen for the multi-device demo).

If you self-host Hermes and you ever want to check on it from your phone, pair an instance with Vylen — that is what we built it for.

FAQ

Is Hermes Agent free?

Yes. It is MIT-licensed, so you can use it commercially, fork it, embed it, or run it privately. You pay only for whatever LLM tokens your chosen provider charges — and you can run fully local with open-weights models if you want zero per-token cost.

Is Hermes Agent open source?

Yes. The full source is on GitHub at NousResearch/hermes-agent under the MIT license.

Can Hermes Agent run fully locally?

Yes. Point it at a local model endpoint (Ollama, vLLM, LM Studio, or your own server) and disable any gateway you do not want, and Hermes runs end-to-end on your hardware. The tool calls execute on your machine; the memory and skills store on your disk.

What is the difference between Hermes (the LLM) and Hermes Agent (the runtime)?

The Hermes family is a set of open-weights large language models fine-tuned by Nous Research. Hermes Agent is a Python runtime that uses any LLM (including but not limited to the Hermes models) to drive a tool-using agent with memory, skills, gateways, and scheduling. You can absolutely run Hermes Agent on a non-Hermes model — and most people do.

Which models work best with Hermes Agent?

In practice: anything that handles tool-calling reliably. The Hermes family of open-weights models, Claude Sonnet/Opus class, GPT-4/5 class, and the larger DeepSeek / Qwen variants all work well. Smaller models work for chat but loop on multi-step tool use.

How do I update Hermes Agent from GitHub?

hermes update from inside the CLI. Or, if you installed from source, git pull && uv pip install -e ".[all]" from the repo directory.

Is Hermes Agent production-ready?

For personal and team use, yes — it is the daily driver for a non-trivial community. For “five-nines, customer-facing, regulated workload” production, treat it like any other young agent runtime: pilot it, harden the approval policy (see the approval-gate gotcha), and avoid running it as root.

Does Hermes Agent work on a phone?

Hermes itself runs on Termux (Android) and over SSH from any device, but neither is a great mobile experience. For day-to-day phone use, the right pattern is to run Hermes on a server (or a $5 VPS) and reach it from a phone-native client. That is exactly the gap Vylen is built for.

Can I run multiple Hermes instances?

Yes. Many users run a home instance, a work instance, and a cloud instance. Each has its own memory, skills, and credentials. Vylen specifically supports switching between instances from one app — see the product overview.