01 Context

What it is, why it matters, who it’s for.

What

Multi-tenant SaaS that crawls full domains, scores every page across 40+ signals, runs Lighthouse audits and TTFB probes, and streams findings to dashboards in real time. Teams, invites, billing, and credits all live under one membership model.

Why

Existing SEO crawlers either run as nightly batch jobs or burn through API budgets. Teams that work daily on technical SEO need feedback that arrives while they’re still on the page.

For

In-house SEO leads and agencies running weekly audits across dozens of client domains.

02 Problem

Sub-second feedback per page, fair across tenants, on one box.

Constraints
  • 01 Single bare-metal node — no autoscaling budget
  • 02 No cross-tenant data leaks under any failure mode
  • 03 Polite to upstream hosts — robots.txt, adaptive backoff
  • 04 Every analyzer must be hot-reloadable in dev
Difficulty

Crawlers degrade in three places at once: scheduling, analysis, and delivery. Solving one usually breaks another, and multi-tenant fairness amplifies every mistake.

03 Approach

Decisions, in order of stakes.

  1. Work-stealing, not round-robin

    A central scheduler hands work to whichever worker is idle. Per-host token buckets keep us polite without idling cores when a tenant’s domain is rate-limited.
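A per-host token bucket of this shape can be sketched in a few lines of Go. This is a minimal illustration, not the project's real API: the names (`hostBucket`, `TryTake`) and parameters are assumptions.

```go
package main

// Minimal per-host token bucket, as used to stay polite without
// idling cores. Names and fields are illustrative.

import (
	"fmt"
	"sync"
	"time"
)

type hostBucket struct {
	mu     sync.Mutex
	tokens float64
	max    float64 // burst size
	rate   float64 // tokens refilled per second
	last   time.Time
}

func newHostBucket(max, rate float64) *hostBucket {
	return &hostBucket{tokens: max, max: max, rate: rate, last: time.Now()}
}

// TryTake refills lazily, then takes one token if available. A worker
// that gets false does not block: it steals work for another host.
func (b *hostBucket) TryTake() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.max {
		b.tokens = b.max
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := newHostBucket(2, 1) // burst of 2, refill 1 token/sec
	fmt.Println(b.TryTake(), b.TryTake(), b.TryTake())
}
```

The non-blocking `TryTake` is what makes this a scheduling primitive rather than a quota: a rate-limited host costs the worker nothing but a requeue.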

  2. Analyzers as a registry

    Every check is an isolated module behind a typed contract. New analyzers register at startup, can be hot-reloaded in dev, and ship as a signed bundle in prod.
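The registry pattern can be sketched as below. The contract (`Analyzer`, `Finding`, `Register`) and the example check are illustrative assumptions, not the project's actual types.

```go
package main

// Sketch of a typed analyzer contract plus a startup registry.
// All names here are illustrative.

import (
	"fmt"
	"strings"
)

type Finding struct {
	Check   string
	Message string
}

// Analyzer is the contract every isolated check module implements.
type Analyzer interface {
	Name() string
	Analyze(html string) []Finding
}

var registry = map[string]Analyzer{}

// Register is called by each module at startup; in dev the registry
// can be rebuilt on reload without touching the rest of the pipeline.
func Register(a Analyzer) { registry[a.Name()] = a }

// titleCheck is one example analyzer.
type titleCheck struct{}

func (titleCheck) Name() string { return "title-present" }

func (titleCheck) Analyze(html string) []Finding {
	if !strings.Contains(html, "<title>") {
		return []Finding{{Check: "title-present", Message: "page has no <title> tag"}}
	}
	return nil
}

func main() {
	Register(titleCheck{})
	for _, a := range registry {
		for _, f := range a.Analyze("<html><body></body></html>") {
			fmt.Printf("%s: %s\n", f.Check, f.Message)
		}
	}
}
```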

  3. Stream, don’t poll

    Per-tenant SSE channels push findings the moment an analyzer returns. The UI patches an immutable store — no refetch, no flicker.
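A per-tenant SSE fan-out can be sketched like this, assuming illustrative names (`hub`, `publish`, `writeEvent`) rather than the project's real API.

```go
package main

// Sketch of per-tenant SSE fan-out. Names are illustrative.

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

type hub struct {
	subs map[string][]chan string // tenant ID -> subscriber channels
}

// publish pushes a finding to every subscriber of one tenant.
// The send is non-blocking so a slow client never stalls the crawler.
func (h *hub) publish(tenant, event string) {
	for _, ch := range h.subs[tenant] {
		select {
		case ch <- event:
		default: // drop for this subscriber; they catch up on reconnect
		}
	}
}

// writeEvent frames one finding in SSE wire format.
func writeEvent(w io.Writer, data string) error {
	_, err := fmt.Fprintf(w, "data: %s\n\n", data)
	return err
}

// serve streams one tenant's channel until the client disconnects.
func (h *hub) serve(w http.ResponseWriter, r *http.Request, tenant string) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	ch := make(chan string, 64)
	h.subs[tenant] = append(h.subs[tenant], ch)
	fl, _ := w.(http.Flusher)
	for {
		select {
		case ev := <-ch:
			writeEvent(w, ev)
			if fl != nil {
				fl.Flush()
			}
		case <-r.Context().Done():
			return
		}
	}
}

func main() {
	writeEvent(os.Stdout, `{"check":"title-present","url":"/"}`)
}
```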

  4. RLS as the only tenant boundary

    Postgres row-level security scopes every query. The API sets `app.tenant_id` per request and the database enforces the rest.
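The pattern looks roughly like the SQL below. Table, policy, and column names are illustrative; only `current_setting` and `SET LOCAL` are standard Postgres.

```sql
-- Sketch of the RLS pattern: enable row security, then scope every
-- query to the tenant GUC set by the API.
ALTER TABLE pages ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON pages
  USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- Per request, inside the same transaction as the query:
BEGIN;
SET LOCAL app.tenant_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';  -- from the session
SELECT url, score FROM pages;  -- only this tenant's rows are visible
COMMIT;
```

`SET LOCAL` scopes the setting to the transaction, so a pooled connection cannot leak one tenant's ID into the next request.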

  5. Lighthouse + TTFB as first-class signals

    Lab Lighthouse runs and synthetic TTFB probes are domains of their own — separate workers, separate quotas, regression alerts when a tracked URL drifts. SEO posture and performance posture share the same dashboard.

  6. One membership model for teams, invites, billing

    Teams, invites, permissions, subscriptions, and credit ledgers all hang off a single membership table. New surfaces (Lighthouse credits, exports) plug into the same accounting without a parallel auth path.
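The shape of that model can be sketched as schema; every table and column name below is illustrative, not the production schema.

```sql
-- One membership row per (user, team); everything else hangs off it.
CREATE TABLE memberships (
  id      uuid PRIMARY KEY,
  user_id uuid NOT NULL REFERENCES users(id),
  team_id uuid NOT NULL REFERENCES teams(id),
  role    text NOT NULL,  -- owner / admin / member
  UNIQUE (user_id, team_id)
);

-- Credits for Lighthouse runs, exports, etc. reference the same row,
-- so new surfaces reuse the accounting path instead of forking auth.
CREATE TABLE credit_ledger (
  membership_id uuid NOT NULL REFERENCES memberships(id),
  delta         integer NOT NULL,  -- positive purchase, negative spend
  created_at    timestamptz NOT NULL DEFAULT now()
);
```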

04 Execution
01

Each worker pulls a job from the shared queue, takes a token from the per-host bucket, fetches the page, and runs every analyzer registered for that tenant. If any analyzer returns a finding, it lands on the SSE channel within milliseconds.
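The loop above can be sketched as a single function. Types and names are illustrative, and the real pipeline adds retries and robots.txt checks.

```go
package main

// Sketch of the worker loop: pull, take a token, analyze, emit.

import "fmt"

type job struct {
	Tenant string
	Host   string
	URL    string
}

type finding struct {
	Tenant string
	URL    string
	Check  string
}

func worker(jobs <-chan job, takeToken func(host string) bool,
	analyze func(j job) []finding, out chan<- finding, requeue chan<- job) {
	for j := range jobs {
		if !takeToken(j.Host) {
			requeue <- j // host rate-limited: put it back, steal other work
			continue
		}
		for _, f := range analyze(j) { // fetch + every registered analyzer
			out <- f // lands on the tenant's SSE channel
		}
	}
}

func main() {
	jobs := make(chan job, 1)
	out := make(chan finding, 1)
	requeue := make(chan job, 1)
	jobs <- job{Tenant: "t1", Host: "example.com", URL: "https://example.com/"}
	close(jobs)
	worker(jobs,
		func(string) bool { return true },
		func(j job) []finding { return []finding{{Tenant: j.Tenant, URL: j.URL, Check: "title-present"}} },
		out, requeue)
	fmt.Println(<-out)
}
```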

02

Multi-tenant safety is enforced in one place. Every table has an RLS policy; the API sets the tenant ID inside the same transaction as the query. There is no second path.

03

Lighthouse and TTFB runs aren’t free. A credit ledger settles in the same transaction as the run insert — no double-spend, no orphaned jobs, and the billing surface reads from the same view that gates the queue.
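The settle-in-one-transaction shape looks roughly like this; table and column names are illustrative.

```sql
-- Debit and run insert commit or roll back together: no double-spend,
-- no orphaned jobs.
BEGIN;
UPDATE credit_ledger
   SET balance = balance - 1
 WHERE tenant_id = current_setting('app.tenant_id')::uuid
   AND balance >= 1;  -- guard against spending below zero
-- The application aborts here if no row was updated (insufficient credits).
INSERT INTO lighthouse_runs (tenant_id, url, status)
VALUES (current_setting('app.tenant_id')::uuid, 'https://example.com/', 'queued');
COMMIT;
```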

[Architecture diagram] Crawl pipeline · Tenant API (Chi, RLS) → Scheduler (work-stealing) → Workers (goroutines, per-host token buckets, 40+ analyzers) → SSE (per-tenant channels) · Analyzer registry (hot-reload)
05 Artefacts
Live crawl streaming via SSE.
Toggle analyzers without a redeploy.
Drill from domain to single signal.
07 Outcomes
Result

Real-time crawls. One node. Zero leaks.

Impact

10k pages crawled in under four minutes per node. First finding under 200ms. Tenant isolation enforced at the database — verified by red-team probes through launch.

  • Pages / minute: 2.5k+ (+340%) · Sustained on 8-core node
  • First insight: <200ms · From crawl start to first SSE event
  • Analyzers: 40+ · Modular, hot-reloadable
  • Lighthouse runs / day: 5k+ · Per-tenant quota, credit-settled
  • Regression alerts: <60s · From probe to dashboard + email
  • Tenant isolation: RLS · Postgres row-level, zero leaks
08 Takeaways
Learned
  • Fairness is a scheduling concern, not a quota concern. Token buckets per host beat per-tenant caps.

  • RLS keeps the leak surface inside Postgres. Application-layer isolation always finds a way to fail.

  • Streaming is a UX concern first. Sub-second feedback changes how teams use the tool.

Would change
  • Wire OpenTelemetry traces from API → scheduler → analyzer earlier — debugging fairness issues in prod was expensive.

  • Build a replay tool for analyzer development before adding the tenth analyzer, not the thirtieth.
