What is PocketHook Agent Server?#
The agent server turns PocketHook into a full AI assistant. Instead of writing response logic yourself, you connect an LLM (Claude, GPT, Gemini, etc.) that processes messages, calls tools, and returns structured PocketHook responses — including Shortcut triggers.
The server runs on your own machine. Your data stays with you.
This is a starting point. The server ships with a core set of tools and is designed to be extended by you. Add your own integrations — email, calendars, documents, APIs — and make it yours.
Features#
- Multi-provider LLM — Anthropic, OpenAI, GitHub Copilot, Google, Mistral, Groq, xAI, OpenRouter, Ollama (local), LM Studio (local)
- OAuth authentication — GitHub Copilot and OpenAI Codex via device code / browser flow
- Agent tools — Shell commands, file read/write, directory listing, web search, web scraping, dev server management
- Framework / user split — Framework files (`skills/`, `custom-tools/`, `config/`) stay read-only. Your customizations live under `data/user/` (skills, custom tools, instructions, typed prefs). Framework updates land cleanly without overwriting your work
- Typed user prefs — Store values like your preferred maps app or tunnel domain in `data/user/prefs.json`. Reference them in skills as `{{prefs.key}}` and the server substitutes them on load
- Programming tasks in one call — The `run_code_job` meta-tool creates a prompt-type background job (run by your configured LLM) and sends the user the ack in a single step, replacing the error-prone “respond + create-job” pattern
- Typed protocol tools — Six dedicated `respond_*` tools (`respond_text`, `respond_image`, `respond_buttons`, `respond_shortcut`, `respond_html`, `respond_sequence`), plus typed job tools (`create_once_job`, `create_cron_job`) and typed workspace tools (`create_project`, `list_projects`, `delete_project`). Schemas reject malformed URLs, button syntax, and `type`/`schedule` combinations before they reach the device
- Typed writers for customization — `create_user_skill` and `create_custom_tool` build the user-layer markdown with correct frontmatter, so the loader always parses them and the agent never hand-writes these files
- Background jobs — One-time or recurring tasks with cron expressions or simple intervals
- Dynamic skills — Define shortcuts and behavior rules as `.md` files. Only a compact index is loaded into the prompt; full content is fetched on demand via the `load_skill` tool
- Self-managing skills — The agent can create, edit, and delete skill definitions (writes always land in the user layer)
- Semantic memory — Vector-based search with embeddings (Ollama, LM Studio, or OpenAI). Memories are auto-classified into wing/room/hall/status dimensions by the LLM
- Knowledge graph — Temporal triple store for durable facts with auto-invalidation. Multi-value relationships coexist; single-value facts auto-replace
- PARA method with project-end cascade — Every memory is tagged with a status (Project, Area, Resource, Archive). When a project ends, a single `complete_project` call archives its vectors, invalidates every planning triple tied to its slug, and records the completion — one call instead of three
- Hybrid recall — Combines FTS5 keyword search with vector semantic search using reciprocal rank fusion
- Long-term memory — SQLite + FTS5 full-text search as fallback when semantic memory is disabled
- Dev server management with tunnel contract — Start, stop, and list dev servers. When `tunnel: true` is requested, the server enforces it pre-flight and post-spawn — an unreachable localhost server is never left running silently
- Automatic URL sanitization — If the agent leaves a `localhost` URL in a response, the `respond_*` tools rewrite it to the matching tunnel URL so your phone always gets a reachable link
- Custom tools — The agent can install CLI tools and register them as new capabilities
- Versioning — Automatic git versioning for workspace files; config backups for skills and permissions
- Web dashboard — Live overview of background jobs, customizable per user. `/dashboard` and `/api/jobs` are unauthenticated by design — restrict access at the network layer (Tailscale ACL, firewall, reverse proxy with basic auth) or set `DASHBOARD=false` if you don’t need it
- HTTPS tunneling — Built-in support for Tailscale, ngrok, and Cloudflare Tunnel
- System service — Install as a persistent service on macOS, Linux, or Windows
- Rate limiting — Per-token request limits with configurable thresholds
Requirements#
- Bun runtime
- An API key or OAuth credentials for your LLM provider
- (Optional) Tailscale, ngrok, or cloudflared for HTTPS tunneling
Quick Start#
git clone https://github.com/pockethook-app/pockethook-agent-server.git
cd pockethook-agent-server
bun install
# Interactive setup — choose provider, model, auth token, port
bun run setup
# Start server + HTTPS tunnel
bun run dev:tunnel
The setup wizard will guide you through choosing an LLM provider, configuring authentication, and setting up tool permissions.
Once running, copy the displayed URLs into PocketHook Settings:
| PocketHook Setting | URL |
|---|---|
| Server URL | https://your-host |
| Health Check URL | https://your-host/health |
| Polling URL | https://your-host/jobs |
How It Works#
- You send a message in PocketHook
- The server forwards it to your chosen LLM with conversation history, recalled memories, and available tools
- The LLM processes the message — it can run shell commands, read/write files, search the web, schedule background jobs, remember facts, or start dev servers
- The response is returned in PocketHook format (`msg` + `shortcut` + `data` + `url`)
- PocketHook displays the message and executes any Shortcuts on your device
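The response format named in the steps above can be pictured as a small typed object. This is only a sketch: the four field names (`msg`, `shortcut`, `data`, `url`) come from the text, while the optionality and value types are assumptions made for illustration and not the actual protocol definition.

```typescript
// Hypothetical shape of a PocketHook response. Only the field names
// (msg, shortcut, data, url) come from the documentation; which fields
// are optional and what types they carry are assumptions.
interface PocketHookResponse {
  msg: string;                    // text shown in the chat
  shortcut?: string;              // name of the Shortcut to run, if any
  data?: Record<string, unknown>; // input handed to that Shortcut
  url?: string;                   // link to open or display
}

// Example: a reply that also triggers the `newNote` shortcut.
const reply: PocketHookResponse = {
  msg: "Saved your meeting notes.",
  shortcut: "newNote",
  data: { title: "Meeting", content: "Action items: follow up on Friday" },
};

// Serialized form, as it might be sent in an HTTP response body.
const body: string = JSON.stringify(reply);
```

Keeping `shortcut`, `data`, and `url` optional in this sketch reflects that a plain text reply needs none of them.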
Supported LLM Providers#
| Provider | Auth | Default Model |
|---|---|---|
| Anthropic | API key | claude-sonnet-4-20250514 |
| OpenAI | API key | gpt-4.1-mini |
| OpenAI Codex | OAuth | gpt-5.1-codex-mini |
| GitHub Copilot | OAuth | claude-sonnet-4 |
| Google (Gemini) | API key | gemini-2.5-flash |
| Mistral | API key | mistral-medium-latest |
| Groq | API key | llama-3.3-70b-versatile |
| xAI (Grok) | API key | grok-3-mini-fast |
| OpenRouter | API key | anthropic/claude-sonnet-4 |
| Ollama (local) | None | llama3.2 |
| LM Studio (local) | None | qwen3.5-4b-mlx |
Switch providers anytime with bun run switch. Ollama and LM Studio run entirely on your machine — no API key needed, no data leaves your network.
Memory#
The memory system has three layers, each serving a different purpose.
The semantic memory design combines ideas from MemPalace (a memory palace architecture that organizes memories into wings, halls, and rooms) and Tiago Forte’s PARA method (Projects, Areas, Resources, Archive) for knowledge lifecycle management.
Conversation memory#
SQLite with FTS5 full-text search. All messages are stored with timestamps and session IDs.
- Short-term — Last `MAX_HISTORY` messages kept in memory per session
- Long-term — All messages persisted in SQLite, searchable via FTS5 keyword matching
- Recall per turn — When semantic memory is on, `MAX_RECALL` controls how many relevant memories are injected into the prompt each turn
- Sessions expire after `SESSION_TTL_MINUTES`, but long-term memory persists forever
Tune these interactively with bun run memory.
Semantic memory#
Requires VECTOR_MEMORY=true and an embedding provider (Ollama, LM Studio, or OpenAI).
Each memory is embedded as a vector and auto-classified by the LLM into four dimensions:
- Wing — The entity: `user`, `person:john`, `project:blog`, `place:london`
- Room — The type: `facts`, `preferences`, `events`, `decisions`, `requests`
- Hall — The topic: `personal`, `tech`, `health`, `travel`, `food`, `work`
- Status — PARA classification: `project`, `area`, `resource`, `archive`
When you ask a question, entity extraction focuses the vector search on the most relevant wings. Results are merged with FTS5 keyword results using reciprocal rank fusion — so you get the best of both keyword and semantic matching.
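Reciprocal rank fusion itself is a small, well-known formula: each result list contributes `1 / (k + rank)` to an item's score, and items are sorted by the summed score. The sketch below shows the idea with hypothetical memory IDs. The constant `k = 60` is a commonly used default and an assumption here, not a documented server setting.

```typescript
// Reciprocal rank fusion: every ranking adds 1 / (k + rank) to an item's
// score, with rank starting at 1. k = 60 is a common default and an
// assumption here, not a value taken from the server.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical memory IDs returned by each search.
const keywordHits = ["m1", "m2", "m3"]; // FTS5 keyword results
const vectorHits = ["m3", "m1", "m4"];  // vector similarity results
const fused = reciprocalRankFusion([keywordHits, vectorHits]);
// fused is ["m1", "m3", "m2", "m4"]: items found by both searches rise to the top
```

Because only ranks are used, the keyword and vector scores never need to be normalized onto a common scale.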
Knowledge graph#
A temporal triple store for structured, durable facts:
- Triples: `(subject, predicate, object)` with `valid_from` / `valid_until` timestamps
- Single-value predicates (`lives_in`, `partner`) auto-invalidate the old value on update
- Multi-value predicates (`child`, `friend`, `hobby`) coexist without invalidation
- Knowledge graph facts are injected alongside recalled memories in every conversation
When you tell the agent “I moved to Berlin”, it invalidates the old lives_in triple and creates a new one — automatically.
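To make the single-value versus multi-value behavior concrete, here is a minimal in-memory sketch. The `valid_from` / `valid_until` field names and the example predicates come from the text; the `TripleStore` class, its methods, and the use of `null` for "still active" are hypothetical and not the server's actual storage layer.

```typescript
// Minimal in-memory sketch of a temporal triple store. Field names follow
// the documentation; the class and its methods are hypothetical.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
  valid_from: string;         // ISO timestamp
  valid_until: string | null; // null while the fact is still active (assumption)
}

// Assumed set of single-value predicates (examples taken from the text).
const SINGLE_VALUE = new Set<string>(["lives_in", "partner"]);

class TripleStore {
  private triples: Triple[] = [];

  add(subject: string, predicate: string, object: string, now: string): void {
    if (SINGLE_VALUE.has(predicate)) {
      // Close every active triple with the same subject and predicate.
      for (const t of this.triples) {
        if (t.subject === subject && t.predicate === predicate && t.valid_until === null) {
          t.valid_until = now;
        }
      }
    }
    this.triples.push({ subject, predicate, object, valid_from: now, valid_until: null });
  }

  // Facts that are currently valid for a subject, in insertion order.
  active(subject: string): Triple[] {
    return this.triples.filter((t) => t.subject === subject && t.valid_until === null);
  }
}

const store = new TripleStore();
store.add("user", "lives_in", "Madrid", "2025-01-01T00:00:00Z");
store.add("user", "hobby", "chess", "2025-01-02T00:00:00Z");
store.add("user", "hobby", "cycling", "2025-01-03T00:00:00Z");
store.add("user", "lives_in", "Berlin", "2026-04-15T00:00:00Z"); // closes the Madrid triple
const facts = store.active("user").map((t) => `${t.predicate}=${t.object}`);
```

The old `lives_in` triple is kept with a `valid_until` timestamp rather than deleted, which is what makes the store temporal: past facts remain queryable.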
PARA lifecycle#
Every memory is tagged with a PARA status:
- Project — Active, time-bound work
- Area — Ongoing responsibilities
- Resource — Reference material (lists, recommendations, how-tos)
- Archive — Completed or cancelled projects
When a project completes, the agent uses semantic similarity to archive only that project’s memories while preserving reference material for future use.
Project-end cascade#
Say “I’m cancelling my trip to Barcelona” and a single tool call handles everything:
- Archives the project’s vectors (events, decisions, requests tied to Barcelona).
- Invalidates every active knowledge-graph triple whose predicate matches the project slug (`scheduled_visit_barcelona`, `planning_visit_barcelona`, `confirmed_visit_barcelona`).
- Records the completion as a new triple: `(user, "cancelled_visit_barcelona", "2026-04-15")`.
Matching is boundary-aware — a different project called revisit_barcelona stays untouched. The agent no longer has to orchestrate three separate calls in the right order, so smaller models get it right too.
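One plausible reading of "boundary-aware" is that the slug must appear as a whole suffix, preceded by an underscore, rather than as an arbitrary substring. The sketch below illustrates that reading with an assumed slug of `visit_barcelona`. The exact rule the server applies is not documented here, so treat both the function and the slug as illustrative.

```typescript
// Boundary-aware match: the predicate must equal the slug or end with
// "_" + slug. This is one plausible interpretation, not the server's
// documented rule; the function name is hypothetical.
function predicateMatchesSlug(predicate: string, slug: string): boolean {
  return predicate === slug || predicate.endsWith("_" + slug);
}

const slug = "visit_barcelona"; // assumed project slug
const predicates = [
  "scheduled_visit_barcelona",
  "planning_visit_barcelona",
  "confirmed_visit_barcelona",
  "scheduled_revisit_barcelona", // belongs to a different project
];

// Only the first three predicates are selected for invalidation.
const toInvalidate = predicates.filter((p) => predicateMatchesSlug(p, slug));
```

A plain substring check would also catch `scheduled_revisit_barcelona`, which is exactly the collision the boundary rule avoids.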
If VECTOR_MEMORY is disabled or the embedding provider is unreachable, the system falls back to FTS5-only with no errors.
Skills#
Skills are .md files in skills/ that define iOS Shortcuts the agent can trigger and/or behavior rules. They use dynamic loading: only a compact index (title, description, shortcut list) is injected into the system prompt. The agent loads full content on demand via the load_skill tool, keeping token usage low as you add more skills.
Each skill file uses YAML frontmatter:
---
title: Notes
description: Create notes on the user's device with a title and body
shortcuts: [newNote]
target: mac
sync_app: Notes
---
### New Note
Shortcut name: `newNote`
Creates a new note on the user's device.
Data fields:
- title (string, required): Note title
- content (string, required): Note body
Frontmatter fields#
| Field | Required | Description |
|---|---|---|
| `title` | Yes | Human-readable name |
| `description` | Yes | One sentence used in the skills index shown to the agent |
| `shortcuts` | Yes | Array of shortcut names defined in the file. Use [] for behavior-only skills |
| `target` | No | Where shortcuts execute: device (default, sent to iOS) or mac (run on the server) |
| `sync_app` | No | App to nudge in the background after server-side execution to trigger iCloud sync (e.g. Notes, Calendar, Reminders). Omit or use none to skip |
Skills can also be behavior rules without shortcuts (e.g., “how to plan a family trip”). Use shortcuts: [] for these.
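As a rough illustration of how a compact index entry could be derived from the frontmatter fields in the table above, the sketch below parses a skill file with a deliberately simplified reader. It handles only flat `key: value` lines and inline arrays. The real loader presumably uses a proper YAML parser, and the names `SkillIndexEntry` and `parseSkillIndex` are hypothetical.

```typescript
// Compact index entry built from a skill's frontmatter. Field names match
// the frontmatter table; the parser is a simplified stand-in, not the
// server's actual loader.
interface SkillIndexEntry {
  title: string;
  description: string;
  shortcuts: string[];
  target: "device" | "mac";
}

function parseSkillIndex(markdown: string): SkillIndexEntry {
  const match = /^---\n([\s\S]*?)\n---/.exec(markdown);
  if (!match) throw new Error("Missing frontmatter");

  // Collect flat `key: value` pairs, splitting on the first colon only.
  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split("\n")) {
    const sep = line.indexOf(":");
    if (sep > 0) fields.set(line.slice(0, sep).trim(), line.slice(sep + 1).trim());
  }

  const title = fields.get("title");
  const description = fields.get("description");
  const rawShortcuts = fields.get("shortcuts");
  if (!title || !description || rawShortcuts === undefined) {
    throw new Error("title, description and shortcuts are required");
  }

  // Handles only the inline array form, e.g. [newNote] or [].
  const shortcuts = rawShortcuts
    .replace(/^\[|\]$/g, "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  return {
    title,
    description,
    shortcuts,
    target: fields.get("target") === "mac" ? "mac" : "device", // device is the default
  };
}

const entry = parseSkillIndex(
  "---\ntitle: Notes\ndescription: Create notes on the user's device with a title and body\nshortcuts: [newNote]\ntarget: mac\nsync_app: Notes\n---\n### New Note\n"
);
```

Only the small `entry` object would need to live in the system prompt; the markdown body after the frontmatter stays on disk until the skill is loaded.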
The agent can create and manage skills when asked — ask it to “create a skill for controlling my lights” and it will write the .md file for you. New and edited skills always land in your user layer (data/user/skills/), so framework updates never overwrite them. See the Customizing your agent section below.
Executing shortcuts on the Mac server#
When a skill has target: mac, shortcuts run silently on the Mac server via the shortcuts run CLI instead of being sent to the iOS device. This is ideal for actions that create iCloud-synced content — notes, reminders, calendar events — because the result syncs to all your devices automatically without needing the PocketHook app to do anything.
How it works:
- The agent decides a shortcut should run (e.g. “create a note with today’s meeting notes”)
- The server invokes `shortcuts run "shortcutName"` with the data passed as JSON on stdin, using the same wrapper format PocketHook iOS uses
- If `sync_app` is set, the server briefly opens that app in the background (`open -gj -a Notes`) to force iCloud sync, then closes it after 5 seconds
- The user receives a confirmation message in the chat; the shortcut itself is not sent to the device
Requirements:
- The server must be running on macOS — `shortcuts run` is macOS-only. On other platforms, the server logs a warning and falls back to device execution
- The shortcut must be installed in Shortcuts.app on the server Mac
- The shortcut should expect a Dictionary as input (PocketHook wraps data in `{ context, timestamp, app, data }`)
When to use target: mac:
- iCloud-synced actions (Notes, Reminders, Calendar) — the result reaches every device anyway
- Long-running processing you want to keep off the iOS device
- Any shortcut that doesn’t need to interact with the iPhone’s UI
When to keep target: device (default):
- Shortcuts that need iPhone-only features (camera, precise location, local app automations)
- Shortcuts that prompt the user for interactive input
- Shortcuts that use App Intents from iOS-only apps
Background Jobs#
Ask the agent to schedule tasks and it will handle the rest:
- “Check the weather every morning at 8am and create a note”
- “Run this script every hour”
- “Remind me to check my email in 30 minutes”
Jobs support cron expressions (0 8 * * *) and simple intervals (30m, 1h, 2d). Results are delivered to PocketHook when it polls the /jobs endpoint.
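The simple interval syntax is easy to picture as a number followed by a unit letter. The sketch below converts the three units shown above (`m`, `h`, `d`) to milliseconds. Whether the server accepts other units, and how it distinguishes intervals from cron expressions internally, is not stated, so the function name and the `null` fallback are assumptions.

```typescript
// Convert a simple interval such as "30m", "1h" or "2d" to milliseconds.
// Only the three units shown in the documentation are handled; anything
// else (including cron expressions) returns null. Hypothetical helper.
function parseInterval(input: string): number | null {
  const match = /^(\d+)([mhd])$/.exec(input.trim());
  if (!match) return null; // not a simple interval, e.g. "0 8 * * *"

  const amount = Number(match[1]);
  const unitMs =
    match[2] === "m" ? 60_000 :
    match[2] === "h" ? 3_600_000 :
    86_400_000; // "d"

  return amount > 0 ? amount * unitMs : null;
}

const halfHour = parseInterval("30m");   // 1_800_000
const twoDays = parseInterval("2d");     // 172_800_000
const cron = parseInterval("0 8 * * *"); // null: treat as a cron expression instead
```

A scheduler could try this parser first and fall back to cron parsing when it returns `null`.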
Two execution types:
- Shell — Runs a bash command, captures output. Can trigger a Shortcut on completion
- Prompt — Processed by the AI agent with full tool access, stores the complete PocketHook response
Dev Servers#
When the agent creates a web project in the workspace (Hugo, Astro, Next.js, Flask, Go, etc.), it proactively offers to serve it:
- Preview — Starts a local dev server on an auto-assigned port for quick viewing
- Public — Starts the server and exposes it via HTTPS tunnel so it’s accessible from anywhere
The agent manages the lifecycle: start, stop, and list running servers. All servers are cleaned up when the main server stops.
Tunnel contract#
When the agent starts a server with tunnel exposure requested, the runtime enforces it: if no tunnel tool (Tailscale, ngrok, cloudflared) is installed, the server refuses to start. If tunnel setup fails after spawn, the orphan process is stopped and the agent is told explicitly — so it can fall back to preview mode or ask you to install a tunnel. The returned URL is always the tunnel URL when tunneling is on, with a note that the local URL is host-only.
As a safety net, every respond_* tool post-processes its message: any localhost or 127.0.0.1 URL that sneaks into a reply gets rewritten to the matching tunnel URL automatically when a managed server has one. When it can’t rewrite, you get a warning in the logs instead of a broken link on your phone.
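The rewrite step can be sketched as a regular-expression replacement keyed on the local port. In the example below, the `tunnels` map (local port to public base URL), the function name, and the `unresolved` list used for logging are all hypothetical. The sketch also assumes URLs carry an explicit port, which may not cover every case the real tools handle.

```typescript
// Rewrite localhost / 127.0.0.1 URLs to their tunnel equivalents.
// `tunnels` maps a local port to the public base URL of a managed server.
// The map, the function name, and the return shape are hypothetical.
function sanitizeLocalUrls(
  message: string,
  tunnels: Map<number, string>
): { text: string; unresolved: string[] } {
  const unresolved: string[] = [];
  const pattern = /https?:\/\/(?:localhost|127\.0\.0\.1):(\d+)/g;

  const text = message.replace(pattern, (original: string, port: string) => {
    const tunnel = tunnels.get(Number(port));
    if (tunnel === undefined) {
      unresolved.push(original); // caller can log a warning for these
      return original;
    }
    // Only scheme, host, and port are replaced; the path that follows stays as-is.
    return tunnel.replace(/\/$/, "");
  });

  return { text, unresolved };
}

const tunnels = new Map<number, string>([[5173, "https://my-host.ts.net"]]);
const result = sanitizeLocalUrls(
  "Preview: http://localhost:5173/blog and http://127.0.0.1:8080/",
  tunnels
);
// result.text       -> "Preview: https://my-host.ts.net/blog and http://127.0.0.1:8080/"
// result.unresolved -> ["http://127.0.0.1:8080"]
```

Returning the unresolved URLs separately lets the caller emit the log warning described above instead of silently sending a broken link.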
Dashboard#
The built-in web dashboard at /dashboard shows a live overview of background jobs.
Unauthenticated by design. Both `/dashboard` and `/api/jobs` are open `GET` endpoints — anyone who can reach the host can list jobs. Restrict access at the network layer (Tailscale ACL, firewall, reverse proxy with basic auth) or set `DASHBOARD=false` if you don’t need it. The PocketHook iOS app doesn’t use these endpoints.
It’s fully customizable:
- Quick edit — Place a `dashboard.html` in `workspace/dashboard/` for simple customizations
- Full project — Create a framework project (Svelte, React, Vue, etc.) in `workspace/dashboard/` with build output to `dist/`
Ask the agent to customize your dashboard and it will handle the rest — each user gets a unique, personalized dashboard.
Custom Tools#
The agent can install CLI tools and register them as new capabilities — extending itself without modifying the server code.
For example, say “install Playwright and use it to take screenshots”. The agent will:
- Install the dependency
- Create a tool definition (a simple `.md` file)
- Use the new tool in future conversations
Custom tools are hot-reloaded — no restart needed. Delete the .md file to remove a tool.
Versioning#
Workspace and config files are versioned automatically:
- Workspace files — Tracked with a local git repo inside `workspace/`. Every write creates an auto-commit. Ask the agent to “undo the last change” or use `git revert HEAD` manually
- Config files — `config/agent-instructions.md`, `config/personality.md`, `skills/`, and `permissions.json` are backed up before each modification. Up to 20 versions per file
Git is optional — if not installed, workspace changes are unversioned. Config backups always work.
Customizing your agent#
The agent server ships with a minimal framework base and expects you to layer your own customization on top. The runtime keeps the two apart so framework updates never clobber your work.
Framework vs user#
pockethook-agent-server/
├── skills/ # framework-shipped skills (read-only)
├── custom-tools/ # reserved for framework-shipped tools (read-only)
├── config/
│ ├── agent-instructions.md # framework agent instructions (read-only)
│ └── personality.md # framework personality (read-only)
└── data/user/ # YOUR customization lives here (git-ignored)
    ├── skills/ # your own skills (override base on filename)
    ├── custom-tools/ # your installed custom tools
    ├── instructions.md # your additions to agent instructions
    └── prefs.json # typed values referenced as {{prefs.key}}
User customization is written via dedicated typed tools (create_user_skill, create_custom_tool) so the resulting files always match the loader’s format. The write tool also rejects any path under skills/, custom-tools/, or config/ and redirects the agent to data/user/* — so even direct file edits end up in the user layer.
Note about the base `custom-tools/` directory. Today it only holds a template (`_example.md`) that the loader ignores — every tool the agent installs for you goes to `data/user/custom-tools/`. The directory is reserved so future framework releases can ship optional built-in tools without clobbering your installs. When that happens, your user-layer files still win on tool-name collision, so there’s nothing to migrate.
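The write-tool guard described in this section, which rejects paths under `skills/`, `custom-tools/`, or `config/`, can be sketched as a check on the first path segment after normalization. The three protected roots come from the text. The function names, the hint message, and the normalization rules are assumptions, and the sketch deliberately ignores other protections such as the working-directory boundary.

```typescript
// Decide whether a relative path points into a framework directory.
// The protected roots come from the documentation; everything else in
// this sketch (names, hint text, normalization) is hypothetical.
const FRAMEWORK_ROOTS = ["skills", "custom-tools", "config"];

// Resolve "." and ".." so that "data/../config/x.md" is seen as "config/x.md".
function normalizeSegments(path: string): string[] {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return out;
}

function checkWritePath(path: string): { allowed: boolean; hint?: string } {
  const root = normalizeSegments(path)[0];
  if (root !== undefined && FRAMEWORK_ROOTS.indexOf(root) !== -1) {
    return { allowed: false, hint: "Write under data/user/ instead" };
  }
  return { allowed: true };
}

const userSkill = checkWritePath("data/user/skills/workouts.md");   // allowed
const baseSkill = checkWritePath("./skills/notes.md");              // rejected
const sneaky = checkWritePath("data/../config/personality.md");     // rejected after normalization
```

Normalizing before checking matters: without it, a path that detours through `..` would slip past a naive prefix comparison.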
Four ways to customize#
| What you want to change | Where it goes | Example |
|---|---|---|
| A shortcut or behavior skill | data/user/skills/<name>.md | “Create a skill to log my workouts” |
| A CLI tool wrapped as an agent capability | data/user/custom-tools/<name>.md | “Install ffmpeg and let me use it for conversions” |
| A global rule (“always reply in English”, “never use tables”) | data/user/instructions.md | “From now on, always summarize articles in 3 bullets” |
| A typed default value referenced by skills | data/user/prefs.json | “My default route origin is Madrid” → {"routeOrigin": "Madrid"} |
You never have to write these files by hand. Just tell the agent what you want and it picks the right layer automatically.
Typed preferences with {{prefs.*}}#
Say you write a route-planner skill that needs to know your default starting point. Instead of hardcoding “Madrid” into the skill, reference the pref:
- **Starting point**: {{prefs.routeOrigin}}, unless the user specifies a different origin.
And store the value in data/user/prefs.json:
{
"routeOrigin": "Madrid, Spain",
"preferredMapsApp": "apple",
"tunnel": { "domain": "my-host.ts.net" }
}
The server substitutes placeholders when the skill is loaded. Nested keys ({{prefs.tunnel.domain}}) work too. Unknown keys are left untouched so typos stay visible.
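The substitution behavior described above (nested keys resolved, unknown keys left visible) can be sketched in a few lines. The `substitutePrefs` function and the choice to inline only scalar values are assumptions for illustration. The actual loader may treat non-string values differently.

```typescript
type Prefs = { [key: string]: unknown };

// Replace {{prefs.a.b}} placeholders with values from a prefs object.
// Unknown keys are left untouched so typos stay visible. Hypothetical
// helper, not the server's actual implementation.
function substitutePrefs(template: string, prefs: Prefs): string {
  return template.replace(
    /\{\{prefs\.([A-Za-z0-9_.]+)\}\}/g,
    (placeholder: string, path: string) => {
      let current: unknown = prefs;
      for (const key of path.split(".")) {
        if (typeof current !== "object" || current === null || !(key in current)) {
          return placeholder; // unknown key: keep the placeholder as written
        }
        current = (current as Prefs)[key];
      }
      // Only scalar values are inlined; objects stay as placeholders (assumption).
      return typeof current === "string" ||
        typeof current === "number" ||
        typeof current === "boolean"
        ? String(current)
        : placeholder;
    }
  );
}

const prefs: Prefs = {
  routeOrigin: "Madrid, Spain",
  preferredMapsApp: "apple",
  tunnel: { domain: "my-host.ts.net" },
};

const rendered = substitutePrefs(
  "Start at {{prefs.routeOrigin}} via {{prefs.tunnel.domain}}; app: {{prefs.mapsApp}}",
  prefs
);
// rendered -> "Start at Madrid, Spain via my-host.ts.net; app: {{prefs.mapsApp}}"
```

The misspelled `{{prefs.mapsApp}}` survives unchanged, which is how a typo in a skill file would show up in the loaded text.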
Editing the framework base directly#
If you’re self-hosting and want to tweak the framework itself, you can edit config/agent-instructions.md, config/personality.md, skills/, or custom-tools/ directly in a file editor; the server doesn’t stop you. The agent, however, won’t write to those paths from a conversation, and framework updates will overwrite your edits. Prefer the user layer for anything you want to keep.
Extending the Server#
- Custom tools — Ask the agent to install CLI tools; they land in `data/user/custom-tools/` automatically
- Add skills — Ask the agent to create a skill; the file goes in `data/user/skills/`
- Change behavior — Ask the agent to apply a global rule; it appends to `data/user/instructions.md`
- Configure permissions — Run `bun run permissions` to control which tools the agent can use
- Add built-in tools — Implement new tool functions in `src/tools.ts` for deeper integrations (requires forking the server)
Configuration#
All settings are stored in .env (created by bun run setup). Key options:
| Variable | Default | Description |
|---|---|---|
| `AUTH_TOKEN` | (required) | Shared secret with PocketHook |
| `LLM_API_KEY` | (required) | LLM provider API key |
| `LLM_PROVIDER` | anthropic | Provider name |
| `LLM_MODEL` | claude-sonnet-4-20250514 | Model ID |
| `LLM_REASONING` | off | Reasoning effort: off, minimal, low, medium, high, xhigh. Higher levels add hidden thinking tokens (slower + more expensive). Ignored by models that don’t support it |
| `PORT` | 3000 | Server port |
| `AGENT_NAME` | PocketHook Assistant | Agent display name |
| `MAX_HISTORY` | 50 | Messages in short-term memory |
| `MAX_RECALL` | 5 | Memories returned per turn by semantic recall (only when VECTOR_MEMORY=true) |
| `SESSION_TTL_MINUTES` | 60 | Session expiration |
| `VECTOR_MEMORY` | false | Enable semantic memory (requires an embedding provider) |
| `EMBEDDING_PROVIDER` | ollama | Embedding provider: ollama, lm-studio, or openai |
| `EMBEDDING_MODEL` | nomic-embed-text | Embedding model name |
| `EMBEDDING_URL` | (auto) | Embedding API URL |
| `EMBEDDING_API_KEY` | — | API key for OpenAI embeddings |
| `LOG_LEVEL` | info | Log level: debug, info, warn, error |
| `RATE_LIMIT_MAX` | 30 | Max requests per window |
| `DASHBOARD` | true | Enable web dashboard (/dashboard route) |
| `INSTANCE_NAME` | (project dir basename, with pockethook- stripped) | Suffix used for the system service label, log directory, and process matching. Set explicitly when running multiple checkouts on the same machine |
See the full configuration reference in the GitHub repository.
Running as a Service#
Install as a persistent service that starts automatically:
bun run service install
| Platform | Backend | Service location |
|---|---|---|
| macOS | launchd | ~/Library/LaunchAgents/com.pockethook.${INSTANCE_NAME}.plist |
| Linux | systemd (user) | ~/.config/systemd/user/pockethook-${INSTANCE_NAME}.service |
| Windows | NSSM | PocketHook-${PascalCase(INSTANCE_NAME)} in Windows Service Manager |
INSTANCE_NAME defaults to the project directory basename with the pockethook- prefix stripped (e.g., a checkout in pockethook-agent-server/ becomes agent-server). Set it explicitly to run several checkouts on the same machine without collisions — each instance keeps its own data/ and logs.
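The default-name rule and the per-platform service names in the table above can be sketched as two small helpers. The naming patterns are taken from the table. The function names and the exact PascalCase conversion (splitting on hyphens, underscores, and spaces) are assumptions.

```typescript
// Derive the default instance name from a checkout path: take the last
// path segment and strip a leading "pockethook-". Hypothetical helper.
function defaultInstanceName(projectDir: string): string {
  const segments = projectDir.split(/[\\/]/).filter((s) => s.length > 0);
  const base = segments.pop() ?? "";
  return base.replace(/^pockethook-/, "");
}

// Assumed PascalCase rule: split on "-", "_" or whitespace and capitalize each part.
function pascalCase(value: string): string {
  return value
    .split(/[-_\s]+/)
    .filter((w) => w.length > 0)
    .map((w) => w.charAt(0).toUpperCase() + w.slice(1))
    .join("");
}

// Service names per platform, following the patterns in the table.
function serviceNames(instanceName: string): { macos: string; linux: string; windows: string } {
  return {
    macos: `com.pockethook.${instanceName}.plist`,
    linux: `pockethook-${instanceName}.service`,
    windows: `PocketHook-${pascalCase(instanceName)}`,
  };
}

const instanceName = defaultInstanceName("/home/me/pockethook-agent-server"); // "agent-server"
const services = serviceNames(instanceName);
// services.macos   -> "com.pockethook.agent-server.plist"
// services.linux   -> "pockethook-agent-server.service"
// services.windows -> "PocketHook-AgentServer"
```

Two checkouts in directories with the same basename would derive the same name, which is why setting `INSTANCE_NAME` explicitly matters when running several instances.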
Manage with bun run service status, restart, stop, or uninstall.
Security#
- HTTPS required — PocketHook enforces HTTPS for all URLs
- Bearer token auth — Shared secret between app and server
- Rate limiting — Per-token limits prevent abuse
- Sandboxed tools — Shell commands and file access restricted by permissions
- Blocked patterns — Dangerous commands (`sudo`, `rm -rf /`) blocked by default
- Working directory boundary — Agent can’t escape its designated directory
- Sensitive files protected — `.env`, `.git`, `*.key`, `*.pem` blocked from agent access
- Automatic versioning — All workspace changes are git-tracked for easy rollback