How to Run a Self-Hosted AI Agent for 24 Cents an Hour: Hermes Setup on HPC.ai Step by Step
Hermes runs on a CPU — not a GPU — for $0.24/hour on HPC.ai. One-command install, News Portal for models, and you're running in minutes.
You Can Run a Self-Hosted AI Agent for Less Than a Coffee Per Hour
Most self-hosted AI setups assume you have a spare GPU, a Linux box you’re comfortable SSHing into, and an afternoon to debug dependency conflicts. Hermes Agent requires none of that. The entire stack runs on a CPU instance at $0.24/hour on HPC.ai, installs with a single command, and is ready to take tasks about ten minutes after you first open the platform. That’s the promise. Here’s exactly how to make it real.
The $0.24/hour figure is not a teaser rate. It’s the standard price for a CPU instance on HPC.ai’s US West region, and it’s what you’ll actually pay. Hermes doesn’t need a GPU because it’s not running a local model — it’s an orchestration layer that calls out to inference providers. The News Portal subscription at $20/month handles model access cleanly, and the one-command install handles everything else. The total cost to run this for a full month, 24/7, is around $173 in compute plus the $20 subscription. Most people will run it intermittently and pay far less.
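The arithmetic behind those figures is simple enough to check yourself. The sketch below uses only the two prices quoted above; the function names are illustrative, and the 30-day month is an assumption.

```python
# Figures quoted in this article; adjust them if pricing changes.
HOURLY_RATE_USD = 0.24      # CPU instance, US West
SUBSCRIPTION_USD = 20.00    # News Portal, per month


def monthly_cost(hours_per_day: float, days: int = 30) -> float:
    """Return compute plus subscription cost for one month, in USD."""
    compute = HOURLY_RATE_USD * hours_per_day * days
    return round(compute + SUBSCRIPTION_USD, 2)


def hours_for_credit(credit_usd: float) -> float:
    """Return how many instance-hours a prepaid credit balance buys."""
    return credit_usd / HOURLY_RATE_USD


if __name__ == "__main__":
    print(monthly_cost(24))       # always-on: compute plus subscription
    print(monthly_cost(2))        # two hours a day
    print(hours_for_credit(9))    # a $9 starter balance
```

Running it for your own usage pattern makes the trade-off concrete: the always-on case is dominated by compute, while light use is dominated by the subscription.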
What You’re Actually Getting
Before the setup steps, it helps to be precise about what Hermes is and isn’t.
Hermes is an autonomous agent framework with built-in skills: web scraping, email (Gmail), image generation, and messaging integrations for Telegram, Discord, and Slack. It can spawn sub-agents to parallelize work, schedule cron jobs from natural language, and chain those capabilities together without you writing any glue code.
What it is not is a local LLM runner. You’re not running Llama on this CPU. The CPU instance is running the Hermes orchestration process, which calls out to whatever inference provider you configure. That’s why the CPU is sufficient and why the cost is so low.
If you’ve read the OpenClaw overview on this blog, the contrast is instructive: OpenClaw requires you to install scraping tools, email integrations, and agent scaffolding separately. Hermes ships all of that out of the box.
For a deeper look at what makes Hermes distinct as a framework — including its skill-learning loop — the Hermes vs. OpenClaw comparison post covers the architecture in detail.
What You Need Before You Start
HPC.ai account. Sign up at hpcai.com. You’ll need to add a small credit balance — $9 is enough to run for 37 hours, which is plenty for initial setup and experimentation.
News Portal subscription. You can use OpenRouter or OpenAI API keys instead, but the News Portal’s $20/month subscription is the path of least resistance. It handles authentication cleanly, the models are cheap enough that you won’t burn through credits quickly, and it avoids the OAuth debugging that can eat an hour when you’re trying to get something working.
A terminal you’re comfortable with. You’ll be using JupyterLab’s built-in terminal, so no local SSH setup required.
Optional: a Telegram or Discord account. You can run Hermes entirely in the terminal, but the real value comes when it can message you asynchronously. Set one of these up if you want notifications.
Spinning Up the Instance
Step 1: Launch a CPU instance on HPC.ai.
Go to hpcai.com, navigate to Cloud GPUs, and click Launch Now. You’re not actually using a GPU here — select CPU from the instance type options, then choose US West as your region. Name the instance something memorable (hermes-agent works fine). The cost summary should show $0.24/hour. Confirm and launch.
Now you have: a queued instance that will be ready in 2–3 minutes.
Step 2: Select the CUDA image.
When prompted for an image, select the first CUDA option. This gives you a pre-configured environment that the Hermes install script expects. Don’t overthink this — just pick the first one.
Now you have: a booting instance with the right base environment.
Step 3: Open JupyterLab and get a terminal.
Once the instance shows as running, click the JupyterLab link. It’ll take a moment to load. Once it does, open a terminal. If you prefer dark mode (and you should; staring at a white terminal for an hour is its own kind of suffering), go to Settings → Theme → JupyterDark.
Now you have: a working terminal on your cloud instance.
Step 4: Run the one-command install.
The Hermes install script is a single command that handles the full installation from start to finish. The link to the exact command is in the description of the source video. Copy it, right-click to paste it into the terminal, and press Enter. Then don’t touch anything for five minutes.
The script installs all dependencies, sets up the Hermes agent process, and drops you at the configuration entry screen when it’s done.
Now you have: Hermes installed and waiting for configuration.
Connecting Your Inference Provider
When the install finishes, you’ll see a prompt asking for your inference provider.
Select News Portal. It will display an authentication code and open a browser link. Click the link, sign in with your News Portal credentials, and the agent will confirm the connection. If you’re on the free tier, you’ll get a limited model — something like a basic flash variant that’s genuinely not great for complex tasks. The $20/month subscription unlocks the models worth using.
The subscription also unlocks image generation. Hermes supports Flux 2, GPT Image 1, GPT Image 1.5, GPT Image 2, and Nano Banana Pro as generation models. You won’t need these immediately, but they’re there when you want them.
After the inference provider, you’ll be prompted to set up messaging. Your options are Telegram, Discord, and Slack. If you want to use Hermes for anything beyond manual terminal sessions — cron jobs, price alerts, weekly reports — pick one and configure it now. The cron job use cases below only deliver value if the agent can reach you asynchronously.
Now you have: a running Hermes agent connected to an inference provider, optionally connected to a messaging channel.
Five Things Worth Doing Immediately
The setup is done. Here’s where it gets interesting.
Content gap analysis. Ask Hermes to scrape a YouTube channel and compare what was uploaded last week against what was actually happening in the industry. A real test of this returned: “Four out of five videos uploaded within the past week are OpenAI-centric. Zero uploads on certain days. Big story gaps include the Claude Mythos preview, the Anthropic/Amazon deal, and Google Cloud Next.” That’s a genuinely useful editorial audit, produced in a few minutes, from a natural language prompt.
Cron job scheduling. After running the content gap analysis, you can say “make this a recurring thing every Sunday at 9pm UK time” and Hermes will set up the cron job. One caveat: it runs in the host’s local time zone, so you may need to adjust for UTC offset. But the scheduling itself works from plain English — no crontab syntax required.
Lead scraping. Ask it to “find Northwest London plumbing businesses without websites so I can sell them a website.” Hermes spawns sub-agents to search across multiple areas, then narrows the list down to leads with real addresses and a natural outreach angle. The output includes a specific company — Oliver Plumbers Limited was one result — with a pre-written pitch angle. It also flags caveats: check for WhatsApp Business profiles, verify the address, confirm the website absence isn’t just a bad search result. That’s the kind of output that saves you from embarrassing cold outreach.
Content idea generation. Ask it to research what’s happening in AI and surface three overlooked stories. One result from a live test: “Kimi’s 300 agent swarm is the real story, not the benchmark. Everyone is talking about Kimi K2 benchmark scores, but they’re missing what Kimi actually shipped — a system that orchestrates 300 sub-agents across 4,000 coordinated steps on 4x H100 GPUs. That’s not a better model, that’s an execution substrate, and it’s open source.” That’s a specific, defensible editorial angle that most coverage missed.
Price monitoring with cron. This is the use case that demonstrates the agent’s real utility. Ask it to find supercars in a given price range that are appreciating in value, then alert you when one appears undervalued. A live test returned a Mercedes SLS AMG listed at £125k on Autotrader — against a market rate of around £180k — flagged as potentially mispriced by £30–50k. The agent noted it could be a high-miler or a write-off, but recommended immediate inspection. It then set up a daily monitoring cron job automatically, without being asked. You can apply the same pattern to anything with public pricing data: electronics, property, equipment, domain names.
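To make the "undervalued" idea concrete, here is a minimal sketch of the kind of comparison such an alert rests on. It is not Hermes's internal logic, which this article does not document: the `Listing` class, the function names, and the 15% threshold are all hypothetical, and only the two prices come from the live test above.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    title: str
    asking_price: float   # listing price, in GBP
    market_rate: float    # estimated typical price, in GBP


def discount_fraction(listing: Listing) -> float:
    """Fraction by which the asking price sits below the market rate."""
    return (listing.market_rate - listing.asking_price) / listing.market_rate


def is_undervalued(listing: Listing, threshold: float = 0.15) -> bool:
    """Flag a listing whose discount meets a hypothetical threshold."""
    return discount_fraction(listing) >= threshold


if __name__ == "__main__":
    # Prices from the live test described above.
    sls = Listing("Mercedes SLS AMG", asking_price=125_000, market_rate=180_000)
    print(f"{discount_fraction(sls):.1%} below market, alert={is_undervalued(sls)}")
```

A bare threshold like this cannot tell a bargain from a high-miler or a write-off, which is why the agent's caveats and a manual inspection still matter.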
For teams thinking about how to build structured workflows around this kind of agent output — connecting it to CRMs, Slack, or databases — MindStudio’s visual builder handles that orchestration layer: 200+ models, 1,000+ integrations, and a drag-and-drop interface for chaining agents and tools without writing the glue code yourself.
When Things Go Wrong
The instance won’t boot. HPC.ai queues instances, and occasionally a region is at capacity. If it’s been more than five minutes, try relaunching in the same region. If that fails, US East is the fallback — the $0.24/hour price is the same.
The install script errors out. The most common cause is a network timeout during package installation. Re-run the same command. The script is idempotent enough that a second run usually succeeds. If it fails on a specific package, check whether the CUDA image was selected correctly — some base images are missing dependencies the script expects.
Model quality is poor. This is almost always the free tier. The default model on a free News Portal account is a basic flash variant that struggles with multi-step reasoning. The $20/month subscription is not optional if you want the agent to actually be useful. Think of it as the cost of the tool, not an upsell.
Cron jobs aren’t firing. The agent runs in the host time zone. If you asked for “9pm UK time” and the instance is in US West (UTC-7 or UTC-8 depending on DST), the cron is firing at the wrong local time. Ask Hermes to show you the current cron schedule and verify the UTC offset explicitly. You can tell it to hardcode the UTC time directly.
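If you want to check the offset yourself before telling Hermes what to hardcode, the conversion can be sketched with Python's standard `zoneinfo` module. The assumption here is that the host clock is set to `America/Los_Angeles`; a cloud instance may well be set to UTC instead, so confirm with `date` in the terminal first. The helper name `weekly_cron` is illustrative.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def weekly_cron(when: datetime, target_tz: str) -> str:
    """Return a weekly cron expression for `when`, expressed in `target_tz`."""
    local = when.astimezone(ZoneInfo(target_tz))
    cron_dow = (local.weekday() + 1) % 7   # cron: 0 = Sunday; Python: 0 = Monday
    return f"{local.minute} {local.hour} * * {cron_dow}"


if __name__ == "__main__":
    london = ZoneInfo("Europe/London")
    winter = datetime(2025, 1, 12, 21, 0, tzinfo=london)   # a Sunday, GMT
    summer = datetime(2025, 7, 13, 21, 0, tzinfo=london)   # a Sunday, BST
    for label, when in (("winter", winter), ("summer", summer)):
        print(label,
              "| host:", weekly_cron(when, "America/Los_Angeles"),
              "| UTC:", weekly_cron(when, "UTC"))
```

Note that the UTC hour for "9pm UK time" differs between winter and summer, so a hardcoded UTC schedule needs revisiting when the clocks change. The UK and the US also switch on different dates, so the host-time hour can be off by one for a few weeks a year.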
Messaging integration not working. Telegram tends to be the most reliable. Discord works but requires a bot token setup that can be fiddly. If you’re having trouble with Discord, switch to Telegram for initial testing, then come back to Discord once the agent is otherwise working.
Sub-agents timing out on large scraping tasks. The lead scraping and price monitoring use cases spawn multiple sub-agents in parallel. On complex queries, some sub-agents will time out or return empty results. This is normal — Hermes consolidates what it gets and flags gaps. If you’re getting consistently thin results, narrow the query scope (fewer geographic areas, tighter price range) and run again.
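The "consolidate what arrives, flag what didn't" behavior is a common pattern for parallel workers, and a small sketch shows why thin results are expected rather than a failure. This is an illustration using `asyncio`, not Hermes's actual implementation; `run_sub_agent`, `gather_with_gaps`, the area names, and the delays are all hypothetical stand-ins.

```python
import asyncio


async def run_sub_agent(area: str, delay: float) -> list[str]:
    """Stand-in for one sub-agent; a real one would search or scrape."""
    await asyncio.sleep(delay)
    return [f"lead found in {area}"]


async def gather_with_gaps(
    tasks: dict[str, float], timeout: float
) -> tuple[list[str], list[str]]:
    """Run all sub-agents in parallel; return (results, timed-out areas)."""

    async def guarded(area: str, delay: float):
        try:
            return area, await asyncio.wait_for(run_sub_agent(area, delay), timeout)
        except asyncio.TimeoutError:
            return area, None   # mark this area as a gap

    outcomes = await asyncio.gather(*(guarded(a, d) for a, d in tasks.items()))
    results = [item for _, found in outcomes if found for item in found]
    gaps = [area for area, found in outcomes if found is None]
    return results, gaps


if __name__ == "__main__":
    # Hypothetical areas with simulated response times in seconds.
    areas = {"Harrow": 0.01, "Wembley": 0.02, "Hendon": 0.5}
    results, gaps = asyncio.run(gather_with_gaps(areas, timeout=0.1))
    print(results, "| timed out:", gaps)
```

Narrowing the query scope corresponds to passing fewer or faster tasks here: fewer workers hit the timeout, so fewer gaps end up in the consolidated output.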
Where to Take This Further
The cron job capability is the multiplier here. Every use case above becomes more valuable when it runs on a schedule and delivers results to your phone.
A content creator running the YouTube gap analysis weekly gets a standing editorial brief every Sunday evening. A small agency running the lead scraping daily gets a fresh list of prospects every morning. Someone monitoring a specific category of used cars gets an alert the moment something mispriced appears — while they’re asleep.
The image generation models (Flux 2, GPT Image 1/1.5/2, Nano Banana Pro) are worth exploring once the core workflow is stable. The practical application is generating social media images on the same cron schedule as your content brief — the agent produces the analysis and the visual asset in one pass.
If you’re building something more structured on top of this — say, a dashboard that surfaces the agent’s weekly reports, or a lightweight CRM that tracks the leads Hermes finds — that’s where a spec-driven approach pays off. Remy compiles annotated markdown specs into full-stack TypeScript applications with a backend, SQLite database, auth, and deployment. The spec is the source of truth; the code is derived output. It’s a reasonable next step when you’ve validated a workflow in Hermes and want to productize it.
The Gmail integration is also worth setting up if you’re using this for lead outreach. Hermes can draft and send emails directly, which closes the loop from “found a lead” to “sent a pitch” without you touching it.
For autonomous agent patterns that go beyond single-instance setups — multi-agent teams, heartbeat scheduling, parallel task execution — the multi-agent company post using Paperclip and Claude Code covers the architecture in detail, and a lot of the patterns transfer directly to Hermes.
The $0.24/hour compute cost means that leaving this running 24/7 for a month costs about $173 in infrastructure, while running it for an hour a day comes to under $8. For intermittent use, the $20/month News Portal subscription is the real cost. Whether that’s worth it depends entirely on whether you give the agent something useful to do, and the use cases above suggest the answer is yes, provided you’re specific about what you want.