Run a private scraping mesh

Mesh v1 is a small, invite-only network of known operators running the same stack. Gateways provide the GraphQL API; workers do the fetching, extraction, and crawling; together you build a shared cache with clear controls and accountability.

Roles (who runs what)

Gateway operator

Hosts the GraphQL API and shared cache. Usually decides the allowed domains and risk posture.

Worker operator

Runs the machines that actually fetch pages and extract fields. You can run multiple workers to share load.

Validator operator (optional)

Runs a verifier that checks work before the gateway accepts it.

Query user

Sends GraphQL queries and benefits from the shared cache and safety controls.

Three knobs that matter most

1) Network isolation

Everyone in the mesh uses the same --network-id. Nodes started with different values can’t see each other’s gossip.

--network-id my-mesh

2) Membership allowlist (both sides can veto)

Gateways, workers, and validators can ignore non-members with --allowed-peer. This is the “both can veto” rule: the gateway can refuse to talk to non-members, and each operator can refuse offers from non-members.

--allowed-peer <endpoint_id_hex>

3) Approval gate for risky jobs

Require human review for risky @fresh work: browser jobs, cross-origin requests, and POST-like requests.

--require-offer-approval
--offer-approval-mode risky
--offer-approval-dir .local/approvals
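
The three knobs above can be collected into one flag list for review before launch. This is a minimal sketch, not the project's own tooling. The peer IDs are made-up placeholders, and it assumes that --allowed-peer can be passed once per member, which this guide does not confirm. It only prints the flags; append them to whatever command starts your gateway.

```shell
#!/bin/sh
# Sketch: assemble the three mesh knobs into one flag list.
set -eu

NETWORK_ID="my-mesh"
APPROVAL_DIR=".local/approvals"
# Placeholder values (hypothetical); replace with your members' endpoint IDs in hex.
ALLOWED_PEERS="REPLACE_WITH_ENDPOINT_ID_HEX_1 REPLACE_WITH_ENDPOINT_ID_HEX_2"

# 1) Network isolation
set -- --network-id "$NETWORK_ID"

# 2) Membership allowlist
# Assumption: the flag may be repeated, once per allowed member.
for peer in $ALLOWED_PEERS; do
  set -- "$@" --allowed-peer "$peer"
done

# 3) Approval gate for risky jobs
set -- "$@" --require-offer-approval \
  --offer-approval-mode risky \
  --offer-approval-dir "$APPROVAL_DIR"

# Print the flags so they can be reviewed and pasted into a launch command.
MESH_FLAGS="$*"
printf '%s\n' "$MESH_FLAGS"
```

Keeping the flags in one place makes it easier to confirm that every node in the mesh starts with the same network ID and the same member list.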

Suggested default: conservative mode

Start small, with strict limits and allowlists, until you’re confident operations are stable:
- strict domain allowlist
- approval gate on
- low concurrency caps

How to join the mesh (practical steps)

Approval workflow (using the CLI)

When approval is enabled, the gateway writes a pending intent file describing what will be touched and what permissions are requested. A human operator then approves or denies the offer by its offer id. Until it is approved, GraphQL returns OFFER_PENDING_APPROVAL.

# list pending offers
cargo run -p wq-offerctl -- list-pending --approval-dir .local/approvals

# inspect the intent (what will be hit + what permissions it requests)
cargo run -p wq-offerctl -- show <offer_id> --approval-dir .local/approvals

# approve or deny
cargo run -p wq-offerctl -- approve <offer_id> --approval-dir .local/approvals
cargo run -p wq-offerctl -- deny <offer_id> --approval-dir .local/approvals
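
Because every command above repeats the same cargo prefix and --approval-dir, a small wrapper can cut down on typing mistakes during review. The following is a sketch only: the offerctl function name, the DRY_RUN switch, and the OFFER_ID value are hypothetical and not part of wq-offerctl. By default it prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: wrap the wq-offerctl invocations shown above.
set -eu

APPROVAL_DIR=".local/approvals"
# DRY_RUN is a convenience of this sketch, not a wq-offerctl feature.
DRY_RUN="${DRY_RUN:-1}"

# offerctl SUBCOMMAND [OFFER_ID]
# Adds the shared cargo prefix and --approval-dir to every call.
offerctl() {
  if [ "$DRY_RUN" = "1" ]; then
    # Print the command for inspection instead of executing it.
    echo "cargo run -p wq-offerctl -- $* --approval-dir $APPROVAL_DIR"
  else
    cargo run -p wq-offerctl -- "$@" --approval-dir "$APPROVAL_DIR"
  fi
}

offerctl list-pending
offerctl show OFFER_ID      # OFFER_ID is a placeholder for a real offer id
offerctl approve OFFER_ID
```

Run it with DRY_RUN=0 to execute the commands for real once the printed output looks right.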

Who decides allowlists and “risk”?

There are two layers: what a query requests, and what each operator’s own policy allows.

Practically: even if a query requests a cross-origin allowlist, an operator’s egress policy can still block it. And in mesh mode, both the gateway and the operator can refuse to run the job.
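
One way to picture the two layers is as a set intersection: a domain is reachable only if the query asks for it and the operator's egress policy permits it. The sketch below is a simplified model for illustration, not how the stack actually evaluates policy. The domain lists and variable names are invented.

```shell
#!/bin/sh
# Sketch: effective reach = query-requested domains that the operator also allows.
set -eu

# Hypothetical inputs, one domain per line.
QUERY_REQUESTED="example.com
cdn.example.com
tracker.example.net"

OPERATOR_EGRESS="example.com
cdn.example.com"

# Keep only requested domains that exactly match a line in the operator's policy.
# -F: fixed strings, -x: whole-line match; "|| true" tolerates an empty result.
EFFECTIVE=$(printf '%s\n' "$QUERY_REQUESTED" | grep -Fx -e "$OPERATOR_EGRESS" || true)

printf 'Effective domains:\n%s\n' "$EFFECTIVE"
```

In this model tracker.example.net is dropped even though the query requested it, because the operator's policy never listed it.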

Tip: at launch, keep the gateway’s domain allowlist small and keep browser adapters isolated.