slaveOftime

Official CLI + Open Relay: The Resilient Path After Third-Party Wrapper Bans

2026-04-11

When Anthropic started blocking third-party wrappers like OpenClaw and OpenCode in January 2026, it sent a clear signal: wrapping vendor APIs behind your own CLI is a structurally fragile business model.

Not because it's bad engineering. Because the wrapper depends on an API it doesn't control, a subscription token it doesn't own, and a ToS clause it didn't write.

There's a more resilient architecture: use each vendor's official CLI, paired with Open Relay (oly) for session supervision and cross-machine scheduling.

Why Wrappers Keep Getting Blocked

The wrapper pattern looks like this: intercept user requests → assemble prompts your way → call upstream model APIs → return results.
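
That middle "assemble prompts your way" step can be sketched in a few lines of shell. Everything here is hypothetical (the template name, the payload shape); the point is that every field of this payload is dictated by an upstream the wrapper doesn't control:

```shell
#!/bin/sh
# Sketch of the wrapper pattern's prompt-assembly step (all names hypothetical).
# The wrapper takes the user's request and embeds it in its own template
# before forwarding it to the vendor API.
build_request() {
  # "wrapper-template-v1" stands in for whatever system prompt the wrapper injects
  printf '{"system":"%s","user":"%s"}' "wrapper-template-v1" "$1"
}
# The forwarding step itself (curl to the vendor endpoint with the user's
# subscription token) is exactly the part the vendor can change or revoke.
```

Any vendor-side change to the expected payload, auth scheme, or signature lands in this function first, and the wrapper is broken until someone patches it.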

Three structural weaknesses:

  1. API dependency: The wrapper must constantly adapt to upstream API changes. Update the protocol, add signature validation, or change auth, and the wrapper breaks.
  2. ToS fragility: Most wrappers rely on users' subscription tokens, which the terms of service typically prohibit. Platform owners can reclassify that usage as a violation at any time.
  3. Replaceability: When vendors ship their own capable CLIs (Claude Code, Gemini CLI, Copilot CLI), the wrapper's reason for existing shrinks dramatically.

This isn't a "will they get blocked" question. It's "when."

The Alternative: Official CLI + Open Relay

If the wrapper's core weakness is "not official," the direct answer is: use the official CLI itself.

But official CLIs are designed for interactive terminal sessions. Close the terminal window, and the session dies. An AI agent might run for hours, need human approval mid-way, then continue. Nobody wants to sit in front of a screen waiting.

This is where Open Relay (oly) comes in.

What Open Relay Is

oly is a lightweight CLI session supervision layer written in Rust. The core idea is simple:

Let a background daemon own the PTY (pseudo-terminal) session lifecycle. Users issue commands, disconnect/reconnect at will, inject keystrokes, and stream logs.

Key capabilities:

  • Persistent detached sessions: Close your terminal window, CLI keeps running
  • Log streaming & prompt detection: oly logs --wait-for-prompt blocks until human input is needed
  • Remote injection: oly send to submit text or special keys without attaching
  • Checkpoint recovery: Reattach with buffered output replay
  • Full audit trail: All stdout/stderr and lifecycle events persisted to disk
  • Node federation: Cross-machine scheduling via oly join

Install: npm i -g @slaveoftime/oly or cargo install oly.

How It Works Together

Official CLI ──▶ Runs inside oly's managed PTY ──▶ Async supervision, logs, key injection, cross-machine scheduling

# Start daemon
oly daemon start --detach

# Launch official Claude Code inside oly
oly start --title my-coding-task claude

# Stream logs, wait for human approval prompt
oly logs --wait-for-prompt

# Inject approval
oly send <session-id> "y"

# Let it run, walk away

This pattern works with Claude Code, Gemini CLI, GitHub Copilot CLI, Codex CLI, Qwen Code—every one of them is "official," so none face ban risk.

Real Example: How Jarvis Runs

My AI assistant Jarvis is built on exactly this stack.

Jarvis is not a wrapper. It doesn't intercept, proxy, or relay any model API. Its core responsibility is supervision and orchestration:

  • Maintains a long-running main session for global state management
  • Spawns child worker sessions via oly when substantive execution is needed
  • Workers use official CLIs (Qwen Code, Copilot CLI) in their own PTYs for actual code work
  • The main session supervises via oly logs, oly send, injecting commands, judging when to stop or hand off
  • All worker state, logs, and lifecycle events persist to local SQLite—auditable and recoverable
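
The supervision step can be sketched as a small shell loop. This is a minimal sketch, not Jarvis's actual code, and it assumes `oly logs --wait-for-prompt` exits zero when the worker needs input and non-zero once the session ends; those exit-code semantics are an assumption:

```shell
#!/bin/sh
# Minimal worker-supervision loop (exit-code semantics assumed):
# block until the worker asks for input, inject an approval,
# and repeat until the session terminates.
approve_until_done() {
  sid="$1"
  while oly logs "$sid" --wait-for-prompt; do
    oly send "$sid" "y"   # inject approval keystroke
  done
}
```

In practice the main session's judgment replaces the unconditional "y": it reads the buffered output first and decides whether to approve, redirect, or stop the worker.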

The system's resilience comes from one simple fact: every layer is "official." Nobody needs to worry about upstream bans because nobody is borrowing someone else's tokens or APIs.

Structural Comparison

| Dimension | Third-party Wrapper | Official CLI Direct | Official CLI + Open Relay |
| --- | --- | --- | --- |
| API Dependency | High | Medium | Low |
| ToS Risk | High | Low | Low |
| Session Persistence | Self-implemented | None | Built into oly |
| Async Supervision | Partial | None | Native |
| Cross-machine Scheduling | Limited | None | Node federation |
| Upstream Ban Risk | High | None | None |
| Human Intervention Cost | Low | High | Low |

Who Should Care

If you use OpenClaw / OpenCode / similar wrappers

The bans already happened. Long-term sustainability is getting harder to bet on. Two migration paths:

  1. Switch providers: OpenCode can be configured for OpenAI, Google, or local Ollama. Solves single-point dependency but not the wrapper's structural risk.
  2. Change architecture: Switch to official CLI + session supervision layer. This eliminates the ban risk at the root.

If you use Claude Code / Gemini CLI / Copilot CLI directly

You've felt both the power and the limitation: close the terminal, and everything is gone. The agent ran for three hours, you stepped into a meeting, and you came back to a closed terminal and all context lost.

Open Relay fills exactly that gap.

If you're building AI agent infrastructure

Open Relay's architecture is worth studying:

  • PTY over subprocess, preserving full terminal interaction semantics
  • SQLite for lightweight, auditable persistence
  • Node federation over centralized scheduling, avoiding single points of failure
  • Simple heuristics like --wait-for-prompt over complex state machines—pragmatism first
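
The last point is worth unpacking. One plausible prompt-detection heuristic (an illustration only, not oly's actual implementation) is to pattern-match the session's most recent output line:

```shell
#!/bin/sh
# Hypothetical prompt heuristic: the session "looks idle at a prompt"
# when its last output line ends with a question mark, a colon, an
# angle bracket, or contains a [y/n]-style choice. Returns 0 on a match.
looks_like_prompt() {
  case "$1" in
    *\?|*:|*">"|*"[y/n]"*|*"[Y/n]"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Real output is noisier (ANSI escapes, partial lines), so a production heuristic would also wait for the stream to go quiet, but that is still pattern matching rather than a state machine: the "pragmatism first" trade-off this list describes.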

Honest Limitations

Open Relay is not a silver bullet. It currently:

  • Does not do model routing: You decide which CLI/model to use
  • Does not optimize prompts: CLI prompt quality depends on the vendor's implementation
  • Does not proxy commercial licenses: Each CLI's ToS and billing remains your responsibility
  • Is still early-stage: an active project whose interface is still changing quickly

Its positioning is clear: session supervision and orchestration layer, not an AI wrapper. It solves "make official CLIs run reliably in the background," not "replace official CLIs."

Why This Matters

AI coding agent competition is shifting from "whose model is stronger" to "whose engineering chain is more reliable." In that shift, architecture choices matter more for long-term resilience than model choices.

The wrapper route's decline isn't accidental—it's the inevitable result of platform owners tightening control. Official CLI + Open Relay isn't the only answer, but it's a path that structurally eliminates ban risk.

Jarvis has been running on this path for a while now. My experience: when you don't need to worry daily about upstream APIs breaking, tokens getting banned, or terms getting updated, you can actually focus on building something valuable.

Quick Start

# Install
npm i -g @slaveoftime/oly

# Start daemon
oly daemon start --detach

# Run your official CLI inside oly (choose any)
oly start --title coding claude          # Anthropic Claude Code
oly start --title coding gemini          # Google Gemini CLI
oly start --title coding copilot         # GitHub Copilot CLI
oly start --title coding qwen            # Qwen Code

# Stream logs
oly logs <session-id>

# Intervene when human approval is needed
oly send <session-id> "y"

# Stop when done
oly stop <session-id>

Star the project: https://github.com/slaveOftime/open-relay


This article was written using the Jarvis + Qwen Code + Open Relay workflow—the Qwen worker is managed by oly, and I intervened via oly send at key review points.