Background
When building an integration app that supports multiple locations, system reliability is critical.
The error handling and duplicate prevention that the standard integration handled internally have to be designed by us in a custom implementation. We implemented several countermeasures to prevent the situation where "it stopped working without us noticing."
Design Challenges
The custom integration app needed to address the following issues:
- Authentication token expiration - NextEngine API tokens have expiration dates
- Duplicate processing - Risk of the same order being processed multiple times when Webhooks are resent
- Concurrent execution - Risk of scheduled processes running at the same time
- Unnoticed errors - Problems occur without anyone noticing
Automatic Authentication Token Renewal
Integrating with the NextEngine API requires authentication tokens. These tokens expire, and the system stops working once they do.
We implemented a mechanism that continuously monitors token expiration and automatically renews the token before it expires.
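The renewal flow runs in this order:
- Check the token expiration date
- Check the remaining days
- Skip the attempt if another renewal ran within the last 5 minutes (continuous renewal prevention)
- Send the renewal request to the NextEngine API
- Check the result of saving the renewed token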
Key Points of Token Management
- Renewal with margin - Start renewal attempts 30 days before expiration
- Prevent continuous renewal - 5-minute cooldown prevents wasteful requests
- Multiple fallback paths - Alternative measures even when renewal fails
- Alert display - Display alerts in admin panel on errors
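The renewal decision described above can be sketched as follows. This is a minimal illustration rather than the app's actual code: the function name `should_renew` and its parameters are hypothetical, and only the 30-day window and the 5-minute cooldown come from the text.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RENEWAL_WINDOW = timedelta(days=30)      # start renewing 30 days before expiry
RENEWAL_COOLDOWN = timedelta(minutes=5)  # minimum gap between renewal attempts


def should_renew(
    expires_at: datetime,
    last_attempt_at: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True when a renewal request should be sent now (hypothetical helper)."""
    # More than 30 days left: the token is still fresh, nothing to do.
    if expires_at - now > RENEWAL_WINDOW:
        return False
    # A renewal was attempted less than 5 minutes ago: skip to avoid wasteful requests.
    if last_attempt_at is not None and now - last_attempt_at < RENEWAL_COOLDOWN:
        return False
    return True


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(should_renew(now + timedelta(days=45), None, now))                        # False
print(should_renew(now + timedelta(days=10), None, now))                        # True
print(should_renew(now + timedelta(days=10), now - timedelta(minutes=2), now))  # False
```

Keeping the decision in a pure function that receives `now` as an argument makes the window and the cooldown easy to test without waiting for real time to pass.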
In-Progress Lock Mechanism
When the same process runs more than once at the same time, data inconsistencies and duplicate sends can occur. We adopted distributed locking to prevent this.
Lock Acquisition and Release
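The lock is handled in four steps:
- Acquire the lock with a SETNX operation
- Check the lock status, and skip the run if another process already holds the lock
- Run the main processing
- Release the lock in a finally block so that the release always executes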
Lock Safety Design
- TTL (Time To Live) on the lock - The lock expires on its own even if the process terminates abnormally
- 25-minute TTL - Set with margin over the normal processing time
- Guaranteed release - The lock is released even on error via the finally block
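The locking pattern above can be sketched as follows. To keep the example runnable without a server, it uses a small in-memory class as a stand-in for the real key-value store; `InMemoryStore`, `set_if_absent`, and `run_exclusively` are hypothetical names, and only the SETNX-style acquisition, the 25-minute TTL, and the release in `finally` come from the text.

```python
import time
import uuid
from typing import Callable, Dict, Optional, Tuple

LOCK_TTL_SECONDS = 25 * 60  # 25-minute TTL, as described in the text


class InMemoryStore:
    """Hypothetical stand-in for a Redis-like store, for illustration only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def set_if_absent(self, key: str, value: str, ttl: int) -> bool:
        """Mimic SETNX with a TTL: store the value only if no live key exists."""
        entry = self._data.get(key)
        if entry is not None and entry[1] > self._clock():
            return False  # a live lock is already held
        self._data[key] = (value, self._clock() + ttl)
        return True

    def get(self, key: str) -> Optional[str]:
        """Return the stored value, or None if the key is missing or expired."""
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            return None
        return entry[0]

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def run_exclusively(store: InMemoryStore, key: str, job: Callable[[], None]) -> bool:
    """Run job under the lock. Return False (skip) if the lock is already held."""
    token = uuid.uuid4().hex  # identifies this holder of the lock
    if not store.set_if_absent(key, token, LOCK_TTL_SECONDS):
        return False  # lock conflict: skip, the next schedule will pick it up
    try:
        job()
        return True
    finally:
        # Release only a lock we still own; an expired lock may belong to someone else.
        if store.get(key) == token:
            store.delete(key)


store = InMemoryStore()
nested_results = []
# The nested call runs while the outer call holds the lock, so it is skipped.
outer = run_exclusively(
    store,
    "lock:fulfillment_sync",
    lambda: nested_results.append(
        run_exclusively(store, "lock:fulfillment_sync", lambda: None)
    ),
)
print(outer, nested_results)  # True [False]
```

In this stand-in the ownership check and the delete are two separate steps. Against a real shared store they would need to happen atomically, for example in a server-side script, so that a lock taken over by another process after expiry is not deleted by mistake.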
Ensuring Idempotency
The property that running the same process multiple times produces the same result as running it once is called "idempotency." Because Webhooks may be resent, the integration depends on this property.
Idempotency Key Generation
- Key generation - SHA1 hash of Order ID + Ship datetime + Tracking number
- Duplicate check - Look up the key to determine whether the event was already processed
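A minimal sketch of the key generation and the already-processed check might look like this. The function names, the `|` separator between fields, and the in-memory `processed` set are assumptions made for illustration; the text only states that the key is a SHA1 hash of the order ID, ship datetime, and tracking number.

```python
import hashlib
from typing import Set


def idempotency_key(order_id: str, shipped_at: str, tracking_number: str) -> str:
    """Build a SHA1 key from the three fields named in the text."""
    # Assumption: join with a separator so ("12", "3") and ("1", "23") cannot collide.
    raw = "|".join([order_id, shipped_at, tracking_number])
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


# Assumption: a set stands in for whatever persistent store holds processed keys.
processed: Set[str] = set()


def process_once(order_id: str, shipped_at: str, tracking_number: str) -> bool:
    """Return True if the event was processed now, False if it was a duplicate."""
    key = idempotency_key(order_id, shipped_at, tracking_number)
    if key in processed:
        return False  # already handled: skip the resent Webhook
    # ... the actual fulfillment processing would run here ...
    processed.add(key)
    return True


print(process_once("1001", "2024-01-01T09:00:00+09:00", "TRK123"))  # True
print(process_once("1001", "2024-01-01T09:00:00+09:00", "TRK123"))  # False
```

Because the key is derived only from the content of the event, a resent Webhook maps to the same key no matter when or how often it arrives.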
Why Idempotency Matters
- Network failures - Resends occur when transmission succeeded but response didn't arrive
- Timeouts - Processing completed but treated as failure due to timeout
- Retries - Same request arrives multiple times from automatic retries on error
In all cases, idempotency prevents duplicate processing.
Log Hierarchy Structure
Appropriate log levels are set according to purpose:
| Level | Purpose | Output Condition |
|---|---|---|
| DEBUG | Detailed diagnostic info | Development only |
| INFO | Major operation records | Production & Development |
| WARN | Recoverable issues | Production & Development |
| ERROR | Critical errors | Production & Development |
Information Included in Logs
- Timestamp - Recorded in ISO 8601 format
- Environment name - production/staging/development
- Process type - order_sync, fulfillment_sync, etc.
- Process result - Success/failure, processed count, etc.
- Error details - Message and stack trace on errors
Sensitive Information Protection
- No token output - Authentication tokens not output to logs
- Personal info exclusion - Customer personal info excluded from logs
- Preview display - When needed, show only first 10 characters
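The logging rules above can be sketched as a small set of helpers. All function names and the JSON field names are hypothetical; the level rules, the ISO 8601 timestamp, the listed fields, and the 10-character preview come from the text.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional

PREVIEW_LENGTH = 10  # show only the first 10 characters of a sensitive value


def preview(secret: str) -> str:
    """Return a truncated preview so the full value never reaches the logs."""
    return secret[:PREVIEW_LENGTH] + "..."


def should_emit(level: str, environment: str) -> bool:
    """DEBUG is written in development only; other levels are always written."""
    return level != "DEBUG" or environment == "development"


def build_log_entry(
    level: str,
    environment: str,
    process_type: str,
    result: Dict[str, Any],
    error: Optional[Dict[str, str]] = None,
    now: Optional[datetime] = None,
) -> str:
    """Serialize one log record as a JSON line (field names are assumptions)."""
    entry: Dict[str, Any] = {
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),  # ISO 8601
        "level": level,
        "environment": environment,
        "process_type": process_type,
        "result": result,
    }
    if error is not None:
        entry["error"] = error  # message and stack trace on errors
    return json.dumps(entry)


if should_emit("INFO", "production"):
    print(build_log_entry(
        "INFO", "production", "order_sync",
        {"status": "success", "processed": 3},
    ))
print(preview("abcdefghijklmnopqrstuvwxyz"))  # abcdefghij...
```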
Multi-Layer Data Persistence Structure
Data storage is redundant across multiple layers, each holding the following:
- Tokens (encrypted) · Store settings · Sync state · Locks
- Initial tokens · Encryption keys · Single-store settings (backwards compatibility)
- Webhook logs · Order sync logs · Error logs (7-day retention)
Error Recovery Patterns
Appropriate recovery processing is designed for various errors:
| Error Type | Response | Result |
|---|---|---|
| Token expired | Auto-renew → Retry | Processing continues |
| Temporary API failure | Retry on next schedule | Auto recovery |
| Tracking number not entered | Save as PENDING, reprocess later | No data loss |
| Duplicate Webhook | Skip via idempotency check | Prevent double processing |
| Lock conflict | Skip, process next time | Maintain data consistency |
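One way to keep the recovery policy in the table explicit in code is a lookup from error type to response. The enum and function names below are hypothetical; the pairs themselves mirror the table rows.

```python
from enum import Enum
from typing import Dict


class ErrorType(Enum):
    TOKEN_EXPIRED = "token_expired"
    TEMPORARY_API_FAILURE = "temporary_api_failure"
    TRACKING_NUMBER_MISSING = "tracking_number_missing"
    DUPLICATE_WEBHOOK = "duplicate_webhook"
    LOCK_CONFLICT = "lock_conflict"


class Action(Enum):
    RENEW_AND_RETRY = "renew_and_retry"          # auto-renew the token, then retry
    RETRY_NEXT_SCHEDULE = "retry_next_schedule"  # leave it for the next scheduled run
    SAVE_AS_PENDING = "save_as_pending"          # keep the record and reprocess later
    SKIP = "skip"                                # nothing to do for this event


# Each entry corresponds to one row of the table above.
RECOVERY_ACTIONS: Dict[ErrorType, Action] = {
    ErrorType.TOKEN_EXPIRED: Action.RENEW_AND_RETRY,
    ErrorType.TEMPORARY_API_FAILURE: Action.RETRY_NEXT_SCHEDULE,
    ErrorType.TRACKING_NUMBER_MISSING: Action.SAVE_AS_PENDING,
    ErrorType.DUPLICATE_WEBHOOK: Action.SKIP,
    ErrorType.LOCK_CONFLICT: Action.SKIP,
}


def recovery_action(error: ErrorType) -> Action:
    """Look up the planned response for a known error type."""
    return RECOVERY_ACTIONS[error]


print(recovery_action(ErrorType.TRACKING_NUMBER_MISSING).value)  # save_as_pending
```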
Monitoring and Alerts
The following monitoring is in place for early problem detection:
Regular Check Items
- Token expiration - Warning when fewer than 30 days remain, alert when fewer than 7 days remain
- Failure rate - Notify if failures continue above threshold
- Unprocessed orders - Warning if orders remain unprocessed for long periods
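The token expiration thresholds can be expressed as a small classification function. The function name and the returned labels are hypothetical; the 30-day and 7-day boundaries come from the list above.

```python
WARNING_DAYS = 30  # warn when fewer than 30 days remain
ALERT_DAYS = 7     # alert when fewer than 7 days remain


def token_expiry_status(days_remaining: float) -> str:
    """Classify the remaining token lifetime as "ok", "warning", or "alert"."""
    if days_remaining < ALERT_DAYS:
        return "alert"
    if days_remaining < WARNING_DAYS:
        return "warning"
    return "ok"


for days in (45, 20, 3):
    print(days, token_expiry_status(days))
```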
Alert Notifications
When problems occur, alerts are displayed in the admin panel. Alerts include:
- Occurrence datetime
- Problem type
- Impact scope
- Recommended actions
Comparison with Standard Integration
| Item | Standard Integration | This System |
|---|---|---|
| Error handling | Black box | Explicitly designed & implemented |
| Duplicate prevention | Internal processing | Managed with idempotency keys |
| Logs | Hard to check | Detailed records, searchable |
| Token management | Automatic (details unknown) | Auto-renewal with fallback |
Benefits
This design provides the following benefits:
- Auto recovery - Many problems recover automatically
- Visibility - Immediately aware when problems occur
- Data preservation - Prevent duplicate processing and data loss
- Peace of mind - No "stopped without noticing" situations
Operational Tips
Regular Log Review
Even when no problems occur, we recommend checking the logs about once a week to verify that warning-level issues aren't accumulating.
Manual Token Renewal
If automatic renewal keeps failing for some reason, you can also reissue the tokens manually and configure them yourself.
Error Response
When an error alert appears, first check the details in the logs. In many cases the issue is temporary and recovers automatically, but if it persists, investigate the root cause.