    Designing resilient e-commerce API integrations

    February 5, 2026 · 9 min read · By Jean-Philippe Cormier

    If your checkout depends on a synchronous call to an ERP, your checkout is exactly as reliable as that ERP. On Black Friday, that's a problem. Here are the patterns I recommend on every integration project.

    1. Make every write idempotent

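    Every external write should carry an idempotency key: the order ID, a UUID generated once per order, or any other value that stays the same across retries. The receiving side can then recognize a repeated request and return the original result instead of writing twice. Retries become safe, duplicate orders disappear, and your support team gets their evenings back.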
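
    To make the idea concrete, here is a minimal sketch of the receiving side. `OrderService`, `create_order`, and the in-memory dictionary are hypothetical stand-ins; a real system would keep the keys in a durable store and make the check-and-insert atomic.

```python
import uuid


class OrderService:
    """Hypothetical stand-in for an external system that accepts order writes."""

    def __init__(self):
        self._results = {}  # idempotency key -> result of the first write
        self.created = 0    # number of orders actually written

    def create_order(self, idempotency_key: str, payload: dict) -> dict:
        # A repeated key returns the stored result instead of writing again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.created += 1
        result = {"order_id": str(uuid.uuid4()), "items": payload["items"]}
        self._results[idempotency_key] = result
        return result


service = OrderService()
key = "checkout-1001"  # stable per logical order, reused on every retry
first = service.create_order(key, {"items": ["sku-1"]})
retry = service.create_order(key, {"items": ["sku-1"]})
```

    Here `retry` is the same result as `first`, and only one order is created even though the call ran twice.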

    2. Put a queue between you and them

    Direct synchronous calls couple your uptime to theirs. A queue (SQS, Pub/Sub, even Redis Streams) lets you accept the order now and deliver it to the ERP when it's healthy. Customers never see the failure.
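
    The decoupling can be sketched with an in-memory queue. The `deque` below stands in for SQS, Pub/Sub, or Redis Streams, and `FlakyERP`, `accept_order`, and `drain` are illustrative names, not a real client API.

```python
from collections import deque


class FlakyERP:
    """Hypothetical ERP client that rejects calls until `healthy` is True."""

    def __init__(self):
        self.healthy = False
        self.received = []

    def submit(self, order: dict) -> None:
        if not self.healthy:
            raise ConnectionError("ERP unavailable")
        self.received.append(order)


pending = deque()  # in-memory stand-in for a durable message queue


def accept_order(order: dict) -> str:
    # Checkout path: enqueue and return at once, without calling the ERP.
    pending.append(order)
    return "accepted"


def drain(erp: FlakyERP) -> int:
    # Worker path: deliver in order and stop at the first failure, so the
    # undelivered message stays at the head of the queue.
    delivered = 0
    while pending:
        try:
            erp.submit(pending[0])
        except ConnectionError:
            break
        pending.popleft()
        delivered += 1
    return delivered


erp = FlakyERP()
status = accept_order({"id": "1001"})  # succeeds while the ERP is down
delivered_while_down = drain(erp)
erp.healthy = True
delivered_after_recovery = drain(erp)
```

    The checkout path only ever touches the queue. The order waits there while the ERP is down and is delivered on the first drain after it recovers.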

    3. Retry with exponential backoff and jitter

    When everyone retries on the same schedule, you create a thundering herd that keeps the downstream system down. Add jitter, cap retries, and route persistent failures to a dead-letter queue with alerting.
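
    A minimal sketch of that retry loop, assuming the "full jitter" variant, where each delay is drawn at random between zero and an exponentially growing ceiling. `TransientError`, `deliver_with_retries`, and the plain list used as a dead-letter queue are hypothetical, and alerting is left out.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 503s)."""


def backoff_delays(max_retries, base=0.5, cap=30.0):
    # Full jitter: retry i waits a random time in [0, min(cap, base * 2**i)].
    return [random.uniform(0, min(cap, base * 2 ** i)) for i in range(max_retries)]


def deliver_with_retries(send, message, dead_letters, max_retries=5, sleep=time.sleep):
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):  # first try plus max_retries retries
        try:
            return send(message)
        except TransientError:
            if attempt == max_retries:
                break
            sleep(delays[attempt])
    # Retries exhausted: park the message for inspection and alerting.
    dead_letters.append(message)
    return None
```

    Because each client draws its own random delay, retries spread out over time instead of arriving in synchronized waves.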

    4. Add circuit breakers around fragile partners

    If a tax API has been timing out for 90 seconds straight, stop calling it. Fall back to cached rates or a default, alert the team, and resume when health checks pass. A degraded checkout beats a broken one.
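
    One way to sketch a breaker is below. It is a simplified illustration: the thresholds are arbitrary, `CircuitBreaker` is a hypothetical class rather than a library API, and a single trial call after a cool-down stands in for a proper health check.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                return fallback()  # open: skip the partner entirely
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func()
        except Exception:  # broad on purpose in this sketch
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

    A call might look like `breaker.call(fetch_live_rate, use_cached_rate)`, where both function names are placeholders for your own code.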

    5. Instrument everything, then trust the dashboards

    • Request rate, error rate, p95/p99 latency per integration
    • Queue depth and oldest message age
    • Dead-letter queue size with paging alerts
    • Business KPIs (orders/min) overlaid on infra KPIs
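
    As a rough illustration of two of these signals, the sketch below computes p95/p99 latency from raw samples and derives queue depth and oldest message age from enqueue timestamps. The function names are hypothetical; in practice a metrics system would usually compute these for you.

```python
import statistics


def latency_percentiles(samples_ms):
    # quantiles(n=100) returns 99 cut points; index 94 is p95, index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p95": cuts[94], "p99": cuts[98]}


def queue_health(enqueued_at, now):
    # enqueued_at: enqueue timestamps (in seconds) of messages still waiting.
    return {
        "depth": len(enqueued_at),
        "oldest_age_s": now - min(enqueued_at) if enqueued_at else 0.0,
    }
```

    Oldest message age is often the more useful of the two queue signals: a shallow queue can still hide one message that has been stuck for an hour.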

    When something breaks at 2 a.m. on Cyber Monday, you don't want to be reading code. You want to be reading a dashboard.

    Integrations don't fail when you test them. They fail when traffic is 10x and your partner is having a bad day.