
Zyvan — Reliable Webhook Delivery Infrastructure
Fault-tolerant webhook delivery platform with durable acknowledgment, idempotent ingestion, exponential retries, and SSRF-safe outbound proxy.
Timeline
3–4 months
Role
Full Stack Developer
Team
Solo
Status
In Progress
Technology Stack
Overview
Zyvan is a production-oriented webhook delivery system designed to guarantee reliable event delivery to third-party endpoints.
Unlike naive webhook implementations that rely on best-effort HTTP calls, Zyvan introduces durable acknowledgment, queue-driven dispatching, idempotent ingestion, and failure-aware retry orchestration.
Primary focus: reliability, failure isolation, and secure outbound delivery.
Problem
Typical webhook systems suffer from a cluster of compounding failure modes: events lost during transient outages, duplicate deliveries caused by unsafe retries, tight coupling between ingestion and dispatch, no retry visibility, open SSRF attack surfaces, and poor delivery observability. At scale, each of these becomes operationally expensive and trust-eroding.
Architecture Overview
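High-Level Flow
An event enters through the ingestion API, is checked against its idempotency key, committed to the database, and acknowledged to the client. It is queued in BullMQ, picked up by a dispatch worker, and sent through the SSRF-safe proxy to the destination endpoint. Failed attempts are rescheduled with exponential backoff until the attempt cap is reached.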
Key Technical Decisions
1. Durable Acknowledgment Boundary
The reliability boundary is drawn at the database commit, not at the point where the webhook is delivered. The client receives a success response as soon as the event is persisted, and from then on Zyvan takes full ownership of the delivery lifecycle. This eliminates ghost failures, where a client cannot tell whether its event was accepted and is unsure whether to retry.
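The ordering described above can be sketched as follows. This is a minimal illustration, not Zyvan's actual handler: `EventStore`, `ingest`, the `202` status code, and the idea of a later sweep that re-enqueues pending rows are all assumptions made for the example. An in-memory `Map` stands in for the database, and the calls are synchronous only to keep the sketch short.

```typescript
// Hypothetical names throughout; the Map stands in for a real database table.
type DeliveryStatus = "pending" | "delivered" | "failed";

interface StoredEvent {
  id: string;
  payload: unknown;
  status: DeliveryStatus;
}

interface IngestResponse {
  httpStatus: number;
  eventId: string;
}

class EventStore {
  private rows = new Map<string, StoredEvent>();

  // Stand-in for an INSERT followed by a transaction commit.
  commit(event: StoredEvent): void {
    this.rows.set(event.id, event);
  }

  get(id: string): StoredEvent | undefined {
    return this.rows.get(id);
  }
}

function ingest(
  store: EventStore,
  enqueue: (eventId: string) => void,
  id: string,
  payload: unknown,
): IngestResponse {
  // 1. Persist first. If this throws, the client sees an error and can retry.
  store.commit({ id, payload, status: "pending" });

  // 2. Hand off to the queue. A failure here does not fail the request:
  //    the row is already durable and stays "pending" (assumed to be
  //    picked up later by a re-enqueue sweep).
  try {
    enqueue(id);
  } catch {
    // Swallowed in this sketch; a real handler would log the failure.
  }

  // 3. Acknowledge. Delivery to the endpoint has not happened yet.
  return { httpStatus: 202, eventId: id };
}
```

The point of the sketch is that the response depends only on step 1. A queue outage at step 2 degrades delivery latency but does not change what the client is told.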
2. Idempotent Ingestion
Every inbound event is keyed on a unique idempotency token enforced at the database level. Duplicate submissions from clients are detected and short-circuited cleanly, making retries safe and delivery semantics predictable.
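A minimal sketch of the deduplication behavior, assuming the client supplies the idempotency key with each submission. `IdempotentEventTable` and its `insert` method are hypothetical names, and the `Map` only imitates what a unique index would enforce in a real database.

```typescript
// Hypothetical in-memory stand-in for a table with a UNIQUE index
// on the idempotency key column.
interface IngestResult {
  eventId: string;
  duplicate: boolean;
}

class IdempotentEventTable {
  private byKey = new Map<string, string>(); // idempotency key -> event id
  private nextId = 1;

  insert(idempotencyKey: string): IngestResult {
    const existing = this.byKey.get(idempotencyKey);
    if (existing !== undefined) {
      // A real database would reject the INSERT with a unique-constraint
      // violation; the handler would then return the original event.
      return { eventId: existing, duplicate: true };
    }
    const eventId = `evt_${this.nextId++}`;
    this.byKey.set(idempotencyKey, eventId);
    return { eventId, duplicate: false };
  }
}

const table = new IdempotentEventTable();
const first = table.insert("order-123-created");
const retry = table.insert("order-123-created"); // client retried the same event
```

Here `retry` carries the same `eventId` as `first` with `duplicate` set to `true`, so the client gets a consistent answer and no second delivery is scheduled. Enforcing uniqueness in the database rather than in application code keeps the check correct when several ingestion instances race on the same key.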
3. Queue-Driven Retries with Exponential Backoff
BullMQ handles all retry scheduling with exponential backoff and capped attempt limits. This decouples ingestion from dispatch entirely — a slow or failing endpoint cannot block the ingestion pipeline — and prevents retry storms from overwhelming struggling customer endpoints.
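The backoff arithmetic can be illustrated with a standalone function. BullMQ exposes this behavior through the `attempts` and `backoff` job options; the code below is not BullMQ's implementation, only a sketch of the doubling schedule. The function names, the example values, and the `maxDelayMs` ceiling are assumptions made for the example.

```typescript
// Delay in milliseconds before the next attempt.
// attemptsMade counts attempts that have already failed (1 before the first retry).
function backoffDelayMs(
  attemptsMade: number,
  baseDelayMs: number,
  maxDelayMs: number,
): number {
  if (attemptsMade < 1) {
    throw new RangeError("attemptsMade must be >= 1");
  }
  // Doubles on every failure: base, 2*base, 4*base, ...
  const delay = baseDelayMs * Math.pow(2, attemptsMade - 1);
  // The ceiling is an assumption of this sketch, added so delays stay bounded.
  return Math.min(delay, maxDelayMs);
}

// A job is retried only while it is under the attempt cap.
function shouldRetry(attemptsMade: number, maxAttempts: number): boolean {
  return attemptsMade < maxAttempts;
}

// Every retry delay a job would see before it exhausts its attempts.
function retrySchedule(
  maxAttempts: number,
  baseDelayMs: number,
  maxDelayMs: number,
): number[] {
  const delays: number[] = [];
  for (let made = 1; shouldRetry(made, maxAttempts); made++) {
    delays.push(backoffDelayMs(made, baseDelayMs, maxDelayMs));
  }
  return delays;
}

const schedule = retrySchedule(5, 1000, 60000); // [1000, 2000, 4000, 8000]
```

With five attempts and a one-second base, a failing endpoint sees four retries spread over fifteen seconds instead of five back-to-back requests, which is what keeps a struggling endpoint from being hit harder while it recovers.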
4. SSRF-Safe Outbound Delivery
All outbound HTTP calls are routed through a hardened proxy that enforces IP allowlisting, DNS-level validation, and private IP range blocking. Webhook systems are a well-known SSRF vector; this proxy closes that surface by design.
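The private-range blocking step might look like the sketch below. It is an illustration under assumptions, not Zyvan's proxy: `parseIpv4` and `isBlockedIpv4` are hypothetical names, only IPv4 is handled, and the list covers common non-public ranges rather than every reserved block. The check is meant to run on the address a hostname resolves to, not on the hostname string, so that a public-looking domain cannot point at an internal address.

```typescript
// Parses a dotted-quad IPv4 string into four octets, or returns null.
// Leading zeros are rejected because some parsers read them as octal.
function parseIpv4(ip: string): number[] | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^(0|[1-9]\d{0,2})$/.test(part)) return null;
    const n = Number(part);
    if (n > 255) return null;
    octets.push(n);
  }
  return octets;
}

// Returns true when the address must not be contacted.
function isBlockedIpv4(ip: string): boolean {
  const octets = parseIpv4(ip);
  if (octets === null) return true; // fail closed on anything unparseable
  const a = octets[0];
  const b = octets[1];
  return (
    a === 0 ||                            // 0.0.0.0/8 "this network"
    a === 10 ||                           // 10.0.0.0/8 private
    a === 127 ||                          // 127.0.0.0/8 loopback
    (a === 100 && b >= 64 && b <= 127) || // 100.64.0.0/10 shared address space
    (a === 169 && b === 254) ||           // 169.254.0.0/16 link-local, cloud metadata
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12 private
    (a === 192 && b === 168) ||           // 192.168.0.0/16 private
    a >= 224                              // multicast and reserved
  );
}
```

Failing closed on unparseable input is a deliberate choice in this sketch: a webhook that cannot be validated is safer to reject than to send.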
Performance Targets
| Operation | Target |
|---|---|
| Ingestion acknowledgment | < 100 ms |
| Queue enqueue | < 50 ms |
| Worker dispatch | < 500 ms |
| Retry strategy | Exponential backoff |
These figures are design targets rather than measured results; load testing is currently in progress.
Challenges & Solutions
Retry storms — Solved with BullMQ's exponential backoff and capped maximum attempt counts per job.
Duplicate submissions — Solved with a unique idempotency index at the database layer, making all client retries safe by default.
Unsafe outbound requests — Solved with a hardened outgoing proxy enforcing IP filtering, DNS validation, and private range blocking.
My Role
As the sole developer, I designed and built the entire system end-to-end: the delivery architecture and state machine, idempotent ingestion pipeline, BullMQ retry orchestration, SSRF proxy strategy, and the full-stack Next.js dashboard for delivery observability.
What I Learned
Working on Zyvan deepened my understanding of reliable systems design — specifically around durability boundaries, idempotent API patterns, queue-first architecture, failure-aware retry logic, and SSRF mitigation at the infrastructure layer.
Roadmap
- OpenTelemetry tracing for full delivery observability
- Per-endpoint rate limiting
- Adaptive retry policies based on endpoint behavior
- Multi-region queue support
- Large-scale load testing and benchmarking
Key Takeaway
Zyvan demonstrates how to build fault-tolerant webhook infrastructure using queue-driven processing, strong idempotency guarantees, and security-first outbound delivery — turning an unreliable fire-and-forget pattern into a system you can actually trust.