Smooth Handling of Large Data Volumes

Design for efficiently processing hundreds of records during month-end aggregation

Tags: Paging, Large Data, Performance, Concurrency Control, Efficiency

About This Topic

The app is built to run smoothly even when a single operation handles hundreds of records, as happens during month-end aggregation. It processes data efficiently while staying within the invoicing service's API limits.

Working with API Limits

Most cloud services have API call limits. The invoicing service is no exception, with a "maximum 100 items per request" limit.

Limit        Details                        Impact
Item limit   100 items per request          Large data requires multiple requests
Rate limit   Limited requests per minute    Continuous access may cause errors
Timeout      About 30 seconds per request   Heavy processing may be disconnected

Automatic Paging

Request "500 items" from GAS, and the app automatically splits the work into 5 round-trips of 100 items each.

Auto-Paging Flow (500 items)

  1. Request from GAS: "Please retrieve 500 items"
  2. Page 1: offset=0, limit=100 → get 100 items
  3. Wait 2 seconds: rate limit countermeasure
  4. Pages 2-5: get 100 items each, with a 2-second wait between pages
  5. Return combined results: return all 500 items to GAS

Why Wait 2 Seconds?

Inserting a short wait between API calls keeps the app from overloading the service. This heads off "too many requests" errors before they occur.
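The paging flow can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `fetch_all`, `fetch_page`, and the injectable `sleep` parameter are hypothetical names, and only the 100-item page size and the 2-second wait come from the description above.

```python
import time
from typing import Callable

PAGE_SIZE = 100     # per-request item limit described above
WAIT_SECONDS = 2.0  # pause between pages described above


def fetch_all(fetch_page: Callable[[int, int], list], total: int,
              sleep: Callable[[float], None] = time.sleep) -> list:
    """Collect up to `total` items by paging with offset/limit.

    `fetch_page(offset, limit)` is a hypothetical stand-in for one API call.
    """
    items = []
    offset = 0
    while offset < total:
        limit = min(PAGE_SIZE, total - offset)
        if offset > 0:
            # Wait only between pages, never before the first request.
            sleep(WAIT_SECONDS)
        items.extend(fetch_page(offset, limit))
        offset += limit
    return items
```

With `total=500` this issues five requests (offsets 0, 100, 200, 300, 400) and four waits, so the waits alone add about 8 seconds to the overall time.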

Retrieve Only What's Needed (Two-Stage Retrieval)

Even with large numbers of invoices, we first retrieve "just the list" quickly, then get details only for items that need them.

Two-Stage Retrieval Structure

  1. Stage 1: List Retrieval (fast, takes seconds). Set skip_details: true to skip line items and get only ID, number, partner, and date.
  2. Filtering (only what's needed). Filter by partner name or date, for example reducing 1000 items to 50.
  3. Stage 2: Detail Retrieval. Get details only for the filtered 50 items.

Effectiveness Comparison

Method          With 1000 items                       Estimated time
Bulk retrieval  Get all details for all items         2-3+ minutes
Two-stage       Details for only 50 after filtering   About 30 seconds
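The two-stage approach might look like the sketch below. The `skip_details` flag is taken from the description above; the function names `two_stage`, `list_invoices`, and `get_detail`, and the `"id"` field on each summary, are assumptions made for illustration.

```python
from typing import Callable


def two_stage(list_invoices: Callable[..., list],
              get_detail: Callable[[str], dict],
              keep: Callable[[dict], bool]) -> list:
    """Fetch a lightweight list, filter it, then fetch details for survivors.

    `list_invoices` and `get_detail` are hypothetical stand-ins for API calls.
    """
    # Stage 1: list only (no line items), which is fast.
    summaries = list_invoices(skip_details=True)
    # Filtering: keep only the invoices that actually need details.
    wanted = [s for s in summaries if keep(s)]
    # Stage 2: one detail call per remaining invoice.
    return [get_detail(s["id"]) for s in wanted]
```

The saving comes from the number of detail calls: filtering 1000 summaries down to 50 means 50 detail requests instead of 1000.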

Concurrency Control

To efficiently make multiple API calls, we implemented concurrency control.

Concurrency Pool Operation

  1. Submit tasks: add 10 detail retrieval requests to the queue
  2. Parallel execution: process a maximum of 2 concurrently
  3. Next on completion: start the next task as each one finishes
  4. All complete: return combined results for all 10
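As an illustration of the pool behavior above, the following sketch uses Python's standard `ThreadPoolExecutor` as a stand-in. The app's own pool may be implemented differently; `fetch_details` and `get_detail` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable

MAX_CONCURRENT = 2  # concurrency limit described above


def fetch_details(ids: Iterable[str], get_detail: Callable[[str], dict],
                  max_concurrent: int = MAX_CONCURRENT) -> list:
    """Run one detail request per ID with at most `max_concurrent` in flight.

    `get_detail` is a hypothetical stand-in for a single API call.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # map() queues every task, lets a worker pick up the next one as soon
        # as it finishes, and yields results in the same order as `ids`.
        return list(pool.map(get_detail, ids))
```

Passing 10 IDs queues 10 tasks, keeps no more than 2 running at any moment, and returns the 10 results in input order.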

Why 2 Concurrent?

Raising parallelism too far risks hitting the service's rate limits. A limit of 2 concurrent requests balances speed and stability.

Concurrency  Speed            Stability  Assessment
1            Slow             Very high  Safe but inefficient
2            Reasonably fast  High       Good balance
5+           Fast             Low        High rate limit risk

Maximum Item Limit

Even if a user requests "10000 items," the app caps retrieval at 500 items.
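A cap like this could be applied before any paging starts. The sketch below is illustrative only: the constant and function names are hypothetical, and treating negative requests as zero is an assumption not stated in the description.

```python
import math

MAX_ITEMS = 500  # overall cap described above
PAGE_SIZE = 100  # per-request item limit


def effective_limit(requested: int) -> int:
    """Clamp the requested count to the range 0..MAX_ITEMS."""
    return max(0, min(requested, MAX_ITEMS))


def pages_needed(requested: int) -> int:
    """Number of API round-trips after the cap is applied."""
    return math.ceil(effective_limit(requested) / PAGE_SIZE)
```

A request for 10000 items is therefore treated as 500 items, which is 5 pages at most.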

When You Need More Data

For data beyond 500 items, we recommend splitting the retrieval into smaller requests, typically by period.

Split method      Example                               Benefit
Split by period   Retrieve monthly                      Stable retrieval
Split by partner  Retrieve per partner                  Analysis-ready data
Split by type     Separate invoices and delivery slips  Lighter processing
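For the split-by-period option, one way to break a date range into monthly requests is sketched below. `month_ranges` is a hypothetical helper, not part of the app; each returned pair could be passed as the date filter of a separate retrieval.

```python
from datetime import date, timedelta


def month_ranges(start: date, end: date) -> list:
    """Split [start, end] into (first_day, last_day) pairs, one per month."""
    ranges = []
    cursor = start
    while cursor <= end:
        # Compute the first day of the following month.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        # The chunk ends at month end, or at `end` if that comes first.
        last_day = min(next_month - timedelta(days=1), end)
        ranges.append((cursor, last_day))
        cursor = next_month
    return ranges
```

For example, a range from 2024-01-15 to 2024-03-10 yields three chunks: 15-31 January, 1-29 February, and 1-10 March.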

Paging Termination Detection

When repeatedly fetching data, we need to correctly determine "there's no more data to fetch."

Termination Check Flow

  1. Retrieve data: request 100 items
  2. Check result: check the count retrieved
       • 100 items: possibly more data → continue to the next page
       • Fewer than 100: last page → end
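The termination check can be sketched as a loop that stops on the first short page. The names `fetch_until_short_page` and `fetch_page` are hypothetical, and the wait between pages is left out here to keep the focus on the stopping rule.

```python
from typing import Callable

PAGE_SIZE = 100  # per-request item limit


def fetch_until_short_page(fetch_page: Callable[[int, int], list],
                           max_items: int = 500) -> list:
    """Page through results until a page comes back with fewer than PAGE_SIZE.

    `fetch_page(offset, limit)` is a hypothetical stand-in for one API call.
    """
    items = []
    offset = 0
    while offset < max_items:
        page = fetch_page(offset, PAGE_SIZE)
        items.extend(page)
        if len(page) < PAGE_SIZE:
            break  # short or empty page: nothing more to fetch
        offset += PAGE_SIZE
    return items[:max_items]
```

One consequence of this rule: when the total is an exact multiple of 100 (say 200 items), the second page is full, so the loop makes one extra request that returns 0 items before it stops.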

What This Design Achieves

For Operations

  • Stable bulk processing: Design respecting API limits
  • Efficient resource usage: Retrieve only what's needed
  • Timeout prevention: Appropriate item limits

For Field Staff

  • Stress-free operation: Short wait times even with large data
  • Single operation completion: No need to think about paging
  • Reliable results: No mid-process stops
