System Resilient to Retries

Duplicate prevention and auto-retry design for easy recovery from operational errors

Duplicate Prevention, Retry, Idempotency, Error Recovery, Exponential Backoff

About This Topic

The most troublesome aspect of business systems is recovering from operational errors.
Problems like "two identical invoices were created" or "an error occurred midway and I don't know how much was processed" create real confusion on the ground.

This project was designed to handle various "oops" moments gracefully.
That kind of safety net makes day-to-day operations much less stressful.

Sending the Same Invoice Twice is OK

A mechanism called "idempotency" prevents duplicate processing.
In simple terms, sending the same request more than once produces the same outcome as sending it once.

Duplicate Prevention Mechanism

First Request

  • Send Request: Send with key="invoice:001"
  • Cache Check: Not registered → Execute as new process
  • Save Result: key="invoice:001" → Success, ID: 12345

Second Request (Same Key)

  • Send Request: Send again with key="invoice:001"
  • Cache Check: Already registered → Check previous result
  • Return Previous Result: Skip processing, return ID: 12345
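
The two flows above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the names `run_once` and `create_invoice`, and the in-memory dictionary standing in for the cache, are assumptions made for the example.

```python
from typing import Callable, Dict

# Hypothetical in-memory stand-in for the result cache.
_results: Dict[str, dict] = {}


def run_once(key: str, create: Callable[[], dict]) -> dict:
    """Run create() only if this key has no saved result yet."""
    if key in _results:
        # Already registered: skip processing and return the previous result.
        return _results[key]
    # Not registered: execute as a new process, then save the result.
    result = create()
    _results[key] = result
    return result


calls = []


def create_invoice() -> dict:
    calls.append("created")  # records how many times the real work runs
    return {"status": "Success", "id": 12345}


first = run_once("invoice:001", create_invoice)
second = run_once("invoice:001", create_invoice)  # served from the cache
```

In this sketch the result is saved only after `create()` returns, so a call that raises leaves the key unregistered and can be sent again.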

Idempotency Key Design

Each request is assigned a unique key. If a second submission uses the same key, it's treated as "already created" and processing is skipped.
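
One way to make keys stable is to derive them from the data that identifies the document. The sketch below assumes a key made of a company code, a document type, and a document number, which is only inferred from the "comp1:invoice:001" value in the log example; the function name and the validation rule are hypothetical.

```python
def build_idempotency_key(company: str, doc_type: str, doc_number: str) -> str:
    """Join the parts with ':' so the same invoice always yields the same key."""
    parts = [company, doc_type, doc_number]
    for part in parts:
        # Assumed rule: reject empty parts and parts containing the separator,
        # so two different documents cannot collapse into one key.
        if not part or ":" in part:
            raise ValueError(f"invalid key part: {part!r}")
    return ":".join(parts)


key = build_idempotency_key("comp1", "invoice", "001")
```

Because the key depends only on its inputs, resubmitting the same row produces the same key and is recognized as "already created".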

7-Day Memory

Results are cached for 7 days. Within a week, you can verify "has this invoice been created or not."
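
The 7-day window can be modeled as an expiry time stored next to each result. The following is a sketch under assumptions: the class name `ResultCache`, the in-memory dictionary, and the injectable clock are illustrative choices, not the project's actual storage.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days


class ResultCache:
    """Keeps each result until its 7-day expiry time has passed."""

    def __init__(self, now: Callable[[], float] = time.time) -> None:
        self._now = now  # injectable clock, which makes expiry easy to test
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def put(self, key: str, result: dict) -> None:
        self._entries[key] = (self._now() + TTL_SECONDS, result)

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, result = entry
        if self._now() >= expires_at:
            # Expired: forget the entry so the key is treated as new again.
            del self._entries[key]
            return None
        return result
```

In this sketch, a key that is resent after its entry expires is no longer recognized and would be processed as a new request.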

Auto-Retry on Communication Errors

For temporary network issues or server congestion, the system automatically waits and retries.
So teams do not need to keep retrying manually.

Exponential Backoff with Jitter

"Exponential backoff" gradually lengthens retry intervals. "Jitter" adds random delay to prevent multiple clients from retrying simultaneously and causing congestion again.

Retry Flow

  • Send Request: Execute API call
  • Error Occurs: 429 (rate limit) or 5xx (server error)
  • Retry 1: Wait 300ms + random (0-199ms)
  • Retry 2: Wait 600ms + random
  • Retry 3: Wait 1200ms + random
  • Up to 5 Retries: Return error if limit reached
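
The retry flow above can be sketched as a small loop. The base delay, jitter range, and retry limit come from the flow; the function names, the `(status, body)` return shape of `send`, and the assumption that the delay keeps doubling for retries 4 and 5 are illustrative.

```python
import random
import time
from typing import Callable, Tuple

BASE_DELAY_MS = 300
JITTER_MS = 200   # random.randrange(200) yields 0-199
MAX_RETRIES = 5


def backoff_delay_ms(retry: int) -> int:
    """Delay before retry number `retry` (1-based): 300, 600, 1200, ... ms plus jitter."""
    return BASE_DELAY_MS * 2 ** (retry - 1) + random.randrange(JITTER_MS)


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send(); on 429 or 5xx, wait and retry up to MAX_RETRIES times."""
    status, body = send()
    for retry in range(1, MAX_RETRIES + 1):
        if not (status == 429 or 500 <= status <= 599):
            break  # success or a non-temporary error: stop retrying
        sleep(backoff_delay_ms(retry) / 1000)  # time.sleep expects seconds
        status, body = send()
    # If the limit was reached, this is the last error response.
    return status, body
```

Passing `sleep` as a parameter is a testing convenience: a test can record the requested delays instead of actually waiting.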

Retryable Errors

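Not all errors trigger retries. Only temporary errors are retried: 429 (rate limit) and 5xx (server error) responses. Errors caused by the data itself, such as a 422 for a missing partner code, are reported immediately because resending the same data would fail again.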
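
As a rough sketch, the decision can be reduced to a check on the HTTP status code. The function name `is_retryable` is hypothetical, and treating every 5xx code as temporary is an assumption based on the retry flow.

```python
def is_retryable(http_status: int) -> bool:
    """True for temporary errors: 429 (rate limit) or any 5xx (server error)."""
    return http_status == 429 or 500 <= http_status <= 599


# A data error such as 422 is not retried; it needs a fix in the source data.
statuses = {code: is_retryable(code) for code in (200, 422, 429, 503)}
```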

Detailed Logs for Problem Identification

Each log entry records when, by whom, and what was executed, along with the result. Even when errors occur, it's clear which row had what problem.

Structured Log Example

{
  "event": "row_processed_error",
  "docType": "invoice",
  "rowNumber": 5,
  "idempotencyKey": "comp1:invoice:001",
  "httpStatus": 422,
  "error": "partner_code not found",
  "timestamp": "2025-01-24T10:30:00.000Z"
}
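
An entry like the one above could be produced by a small helper that emits one JSON object per line. This is a sketch: the function name and parameters are hypothetical, and only the field names are taken from the example.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def format_row_error(
    doc_type: str,
    row_number: int,
    idempotency_key: str,
    http_status: int,
    error: str,
    now: Optional[datetime] = None,
) -> str:
    """Build one structured log line (a JSON string) for a failed row."""
    # Normalize to UTC so the trailing "Z" is accurate.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    timestamp = moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    entry = {
        "event": "row_processed_error",
        "docType": doc_type,
        "rowNumber": row_number,
        "idempotencyKey": idempotency_key,
        "httpStatus": http_status,
        "error": error,
        "timestamp": timestamp,
    }
    return json.dumps(entry)


line = format_row_error(
    "invoice", 5, "comp1:invoice:001", 422, "partner_code not found",
    now=datetime(2025, 1, 24, 10, 30, tzinfo=timezone.utc),
)
```

Keeping every entry as a single JSON object means the log can be filtered by field, for example by `rowNumber` or `httpStatus`, without parsing free text.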

What Logs Reveal

Using the example above, each entry answers the questions needed for recovery:

  • Which row failed: rowNumber
  • Which document and request: docType and idempotencyKey
  • Why it failed: httpStatus and error
  • When it happened: timestamp

Error Handling Flow

Error Response Flow

  • Error Occurs: Status column in the spreadsheet shows "Error"
  • Identify Cause: Check the error content in the logs and find the row using rowNumber
  • Fix Data: Correct the problematic data in the spreadsheet
  • Re-execute: Select the fixed row and resend. The resend uses a new idempotency key, so it is processed as a new request
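
The "Identify Cause" step can be sketched as a scan over structured log lines that collects the error message for each failed row. The function name is hypothetical, and the `row_processed_ok` event in the sample data is an invented placeholder for non-error entries.

```python
import json
from typing import Dict, Iterable


def errors_by_row(log_lines: Iterable[str]) -> Dict[int, str]:
    """Map each failed spreadsheet row number to its latest error message."""
    errors: Dict[int, str] = {}
    for line in log_lines:
        entry = json.loads(line)
        if entry.get("event") == "row_processed_error":
            errors[entry["rowNumber"]] = entry["error"]
    return errors


sample_log = [
    '{"event": "row_processed_ok", "rowNumber": 4}',  # hypothetical success event
    '{"event": "row_processed_error", "rowNumber": 5, '
    '"httpStatus": 422, "error": "partner_code not found"}',
]
failed = errors_by_row(sample_log)
```

The resulting mapping points directly at the rows to fix in the spreadsheet before resending.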

What This Design Achieves

For Operations

  • Prevent duplicate invoices: Resending with the same key within the 7-day window does not create a second invoice
  • Resilience to temporary failures: Auto-retry handles recovery
  • Easy cause identification: Structured logs pinpoint problem areas instantly

For Field Staff

  • Operate with confidence: Peace of mind knowing mistakes are OK
  • Immediate status awareness: Result confirmation via status display
  • Easy re-execution: Just fix and resend