How we built ShellMon: detecting dangerous shell commands in real time

The problem

A user can paste rm -rf / into a mobile SSH session as easily as on desktop. Maybe easier — autocorrect, AI suggestions, fat-fingered taps. We needed an interception layer that ran before the keystroke reached the server.

The naive answer is a blocklist of scary strings. But shells are a language, not a set of fixed phrases. rm -rf /, rm -fr /, rm --recursive --force /, and r''m -rf / are all the same intent wearing different clothes. Anything that only matches literal text loses to the first person who adds a quote.
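To make the failure mode concrete, here is a minimal sketch of a literal blocklist run against the four variants above. The names BLOCKLIST and naive_is_dangerous are made up for illustration and are not part of ShellMon.

```python
# Hypothetical literal blocklist, for illustration only.
BLOCKLIST = ["rm -rf /"]


def naive_is_dangerous(command: str) -> bool:
    """Flag a command only if it contains a blocklisted string verbatim."""
    return any(bad in command for bad in BLOCKLIST)


# The four spellings of the same intent from the paragraph above.
variants = [
    "rm -rf /",
    "rm -fr /",
    "rm --recursive --force /",
    "r''m -rf /",
]

results = {v: naive_is_dangerous(v) for v in variants}
for command, flagged in results.items():
    print(f"{command!r}: {'flagged' if flagged else 'missed'}")
```

Only the first spelling is flagged. The other three carry the same intent and pass straight through.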

Three approaches we considered

Client-side regex. Match dangerous patterns before sending. Fast, but trivially bypassed — r''m -rf / evades naive matching.
Server-side wrapper. Install a wrapper script on every server. Reliable, but requires modifying every host — a non-starter for a mobile-first client whose whole point is touching servers you don't pre-configure.
Hybrid: client-side parse + canonicalization. Tokenize the command, normalize it, then match against a curated rule set. This is what we shipped.

The detection pipeline

Detection runs in three stages on-device: canonicalize, tokenize, match. The canonicalizer is where most of the work lives: it strips quoting, resolves obvious escapes, and splits on command separators so each segment is judged on its own.

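import shlex

def is_dangerous(command: str) -> tuple[bool, str | None]:
    # Normalize first so quoting tricks like r''m collapse to rm, then tokenize.
    tokens = shlex.split(canonicalize(command))
    # First matching rule wins; its reason string is what the user sees.
    for rule in DANGEROUS_RULES:
        if rule.matches(tokens):
            return True, rule.reason
    return False, None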

canonicalize() strips matched quote pairs (r''m → rm), collapses backslash-escapes outside quotes, substitutes a small set of known-safe environment expansions, and splits compound commands on &&, ; and | so a dangerous segment can't hide mid-pipeline. Each segment goes through the rule set independently. Rules are data, not code — a YAML pack of token patterns with a human-readable reason string, which is what gets surfaced to the user when something is blocked.
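As a rough sketch of the canonicalization step (not the shipped implementation), the function below does quote stripping, backslash collapsing, and separator splitting in a single pass. The name canonical_segments is hypothetical, and it returns token lists per segment rather than a string. It also simplifies: environment expansion is left out, and backslashes inside double quotes are kept literally.

```python
def canonical_segments(command: str) -> list[list[str]]:
    """Split a command line into segments of quote-free, unescaped tokens."""
    segments: list[list[str]] = []
    tokens: list[str] = []
    word: list[str] = []
    in_word = False  # True once the current word has started (even if empty, e.g. '')
    quote = None     # The open quote character, or None when outside quotes

    def end_word() -> None:
        nonlocal in_word
        if in_word:
            tokens.append("".join(word))
            word.clear()
            in_word = False

    def end_segment() -> None:
        end_word()
        if tokens:
            segments.append(tokens.copy())
            tokens.clear()

    i = 0
    while i < len(command):
        ch = command[i]
        step = 1
        if quote is not None:
            # Inside quotes: everything is literal until the matching quote.
            if ch == quote:
                quote = None
            else:
                word.append(ch)
        elif ch in "'\"":
            # Opening quote: drop the quote character, keep building the same word.
            quote, in_word = ch, True
        elif ch == "\\" and i + 1 < len(command):
            # Backslash outside quotes: keep the escaped character, drop the backslash.
            word.append(command[i + 1])
            in_word = True
            step = 2
        elif command.startswith("&&", i):
            end_segment()
            step = 2
        elif ch in ";|":
            end_segment()
        elif ch.isspace():
            end_word()
        else:
            word.append(ch)
            in_word = True
        i += step
    end_segment()
    return segments


print(canonical_segments("echo ok && r''m -rf / | cat"))
```

Because separators are only recognized outside quotes, a quoted argument such as "a; b" stays a single token instead of being split into two segments.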

Why pattern matching beat LLMs here

We considered using an LLM to classify dangerous commands. It works — but it's slow (300–800ms latency per keystroke), expensive at that call volume, and overcautious: it blocks find . -delete because "delete" looks risky.

Pattern matching with hand-tuned rules gives sub-millisecond latency, runs fully offline, and is auditable — you can read exactly why a command was flagged. So we split the job: pattern matching for detection, LLMs for explanation. Once a command is flagged, the AI helper can explain why in plain language, but it never sits in the hot path of every keystroke.
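To illustrate the auditability point, here is one possible shape for a rule expressed as data. Everything in it is an assumption made for the example: the field names, the LONG_TO_SHORT table, and the single rm rule. The real pack is YAML; a plain list of dicts stands in for the parsed file so the sketch needs no third-party parser.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical long-option aliases, covering only the rm flags used below.
LONG_TO_SHORT = {"--recursive": "r", "--force": "f"}

# Stand-in for a parsed YAML rule pack (field names are illustrative).
RULE_PACK = [
    {
        "command": "rm",
        "required_flags": ["r", "f"],
        "targets": ["/", "/*"],
        "reason": "Recursively force-deletes the filesystem root",
    },
]


@dataclass(frozen=True)
class Rule:
    command: str
    required_flags: frozenset
    targets: frozenset
    reason: str

    def matches(self, tokens: list[str]) -> bool:
        """Match on command name, flag set, and target, ignoring flag order and spelling."""
        if not tokens or tokens[0] != self.command:
            return False
        flags: set[str] = set()
        args: list[str] = []
        for tok in tokens[1:]:
            if tok in LONG_TO_SHORT:
                flags.add(LONG_TO_SHORT[tok])
            elif tok.startswith("-") and not tok.startswith("--") and len(tok) > 1:
                flags.update(tok[1:])  # "-rf" and "-fr" both yield {"r", "f"}
            else:
                args.append(tok)
        return self.required_flags <= flags and any(a in self.targets for a in args)


DANGEROUS_RULES = [
    Rule(
        command=r["command"],
        required_flags=frozenset(r["required_flags"]),
        targets=frozenset(r["targets"]),
        reason=r["reason"],
    )
    for r in RULE_PACK
]


def explain(tokens: list[str]) -> Optional[str]:
    """Return the reason string of the first matching rule, or None."""
    for rule in DANGEROUS_RULES:
        if rule.matches(tokens):
            return rule.reason
    return None


print(explain(["rm", "--recursive", "--force", "/"]))
print(explain(["find", ".", "-delete"]))
```

The reason string travels with the rule, so the explanation shown to the user is exactly the text a reviewer sees in the pack. In this sketch find . -delete matches nothing and is left alone.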

The Y/n auto-response

A separate problem ShellMon solves: apt upgrade asks Do you want to continue? [Y/n] and users want it answered without keeping the screen awake. The hard part isn't typing Y — it's knowing the session is waiting for input versus still working.

# sentinel appended to each command we launch
cmd; __ec=$?; printf '\n__TERMAI_END_%s__%d__\n' "<hex>" "$__ec"

The sentinel does double duty. When it appears in the output stream, we know the command finished and can capture the exit code without parsing the prompt. When it hasn't appeared but output has gone quiet on a line ending in a known prompt pattern ([Y/n], (yes/no)), we're confident the session is blocked on input, and only then does an auto-response rule fire. The randomized hex makes an accidental collision with program output that happens to contain "END" extremely unlikely.
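The client-side half of that logic might look roughly like the sketch below. The function names, the state labels, and the 1.5 second quiet threshold are assumptions made for the example; only the marker format and the two prompt patterns come from the text above.

```python
import re
from typing import Optional

# Prompt patterns named in the article; a real list would likely be longer.
PROMPT_RE = re.compile(r"(\[Y/n\]|\(yes/no\))\s*$")


def find_exit_code(output: str, marker: str) -> Optional[int]:
    """Return the exit code if the sentinel line for this marker has appeared."""
    pattern = rf"^__TERMAI_END_{re.escape(marker)}__(\d+)__\r?$"
    match = re.search(pattern, output, re.MULTILINE)
    return int(match.group(1)) if match else None


def session_state(
    output: str,
    marker: str,
    quiet_seconds: float,
    quiet_threshold: float = 1.5,  # hypothetical value, not taken from ShellMon
) -> str:
    """Classify the session as 'finished', 'waiting_for_input', or 'running'."""
    if find_exit_code(output, marker) is not None:
        return "finished"
    # Only the last, still-open line can be a prompt waiting for an answer.
    last_line = output.rsplit("\n", 1)[-1]
    if quiet_seconds >= quiet_threshold and PROMPT_RE.search(last_line):
        return "waiting_for_input"
    return "running"


prompt_output = "Reading package lists...\nDo you want to continue? [Y/n] "
done_output = "All done.\n\n__TERMAI_END_ab12__0__\n"

print(session_state(prompt_output, "ab12", quiet_seconds=2.0))
print(session_state(done_output, "ab12", quiet_seconds=0.0))
```

Matching on the full marker, hex included, means a sentinel-shaped line printed by some other program is ignored unless it carries the same random value.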

What it can't catch

Honest limitations:

Custom binaries. We can't inspect what ./my-script.sh does. We only check the invocation, not the contents.
Pipe chains. Long pipelines with grep/awk/jq mid-chain can pass dangerous patterns through. We block the obvious cases but won't catch every adversarial example.
Determined misuse. A user who wants to wipe their server will find a way. ShellMon is a safety net, not an access control.

What we shipped, what's next

ShellMon went out in TermAI v0.9. Since then we've added customizable rule packs, Pro-tier push/email notifications when long jobs finish, and AI-helper integration for command explanations.

Next: snippet-aware mode — if you're running a vetted snippet from your own library, ShellMon trusts it more and stays quieter. And we're opening the rule set on GitHub so users can audit it and contribute patterns we missed.

Try TermAI

Free on iOS and Android. 3 SSH connections + 20 AI calls/day on the free tier.


Chen Chen — Founder of TermAI

Writes about mobile DevOps, terminal UX, and the surprising depth of "boring" infrastructure.
