This content originally appeared on DEV Community and was authored by Narednra Reddy Yadama
The Scenario
- The company runs analytics.
- They need frequent access to the latest data subsets.
- The older data is rarely used.
- They want a solution that provides low latency for recent data without storing the entire dataset locally.
AWS Storage Gateway Options
AWS Storage Gateway offers two Volume Gateway modes:
1. Stored Volumes
Keep the entire dataset on-premises.
The gateway asynchronously backs up point-in-time snapshots of the data to Amazon S3 as Amazon EBS snapshots.
Best when: you need low-latency access to all of your data locally.
2. Cached Volumes
Keep the entire dataset in Amazon S3.
Only frequently accessed data subsets are cached locally.
Best when: you want to minimize on-premises storage but still get low-latency access to hot data.
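The comparison between the two modes can be written down as a small lookup. This is only an illustrative sketch: the `VolumeMode` class, the `choose_mode` helper, and its single boolean input are hypothetical names made up for this example, not part of any AWS SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeMode:
    name: str
    primary_copy: str  # where the full dataset lives
    local_copy: str    # what is kept on-premises


# The two Volume Gateway modes as described above.
STORED = VolumeMode("stored", primary_copy="on-premises",
                    local_copy="entire dataset")
CACHED = VolumeMode("cached", primary_copy="Amazon S3",
                    local_copy="frequently accessed subsets")


def choose_mode(needs_low_latency_for_all_data: bool) -> VolumeMode:
    """Pick a mode from one (simplified) requirement."""
    return STORED if needs_low_latency_for_all_data else CACHED


# The scenario only needs fast access to recent subsets.
print(choose_mode(needs_low_latency_for_all_data=False).name)
```

A real decision would also weigh bandwidth, recovery requirements, and cost, which this sketch deliberately leaves out.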
Why Cached Volumes Fit Here
The company doesn’t need the older data locally, only the latest subsets.
Cached Volumes:
- Store all data in Amazon S3.
- Provide local cache for recently accessed data.
- Give applications low-latency access to hot data.
This saves money and storage space compared to Stored Volumes, which would force the company to keep everything on-premises.
Key Cached Volume Facts
Volume size: 1 GiB to 32 TiB (must be a whole number of GiB).
Per gateway: up to 32 volumes.
Max total size per gateway: 1 PiB (1,024 TiB).
Access: via iSCSI devices attached to on-premises servers.
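The limits above fit together arithmetically: 32 volumes of 32 TiB each give 1,024 TiB, which is 1 PiB. A minimal sketch that checks a planned set of volumes against those numbers might look like this. The function names are hypothetical, and the limits are simply the ones quoted in this article, so verify them against current AWS quotas before relying on them.

```python
from typing import Sequence

GIB = 1024 ** 3
TIB = 1024 ** 4

# Limits as quoted in this article for cached volumes.
MIN_VOLUME_BYTES = 1 * GIB
MAX_VOLUME_BYTES = 32 * TIB
MAX_VOLUMES_PER_GATEWAY = 32


def validate_cached_volumes(sizes_in_bytes: Sequence[int]) -> None:
    """Raise ValueError if the planned volumes break a quoted limit."""
    if len(sizes_in_bytes) > MAX_VOLUMES_PER_GATEWAY:
        raise ValueError("too many volumes for one gateway")
    for size in sizes_in_bytes:
        if size % GIB != 0:
            raise ValueError("volume size must be a whole number of GiB")
        if not MIN_VOLUME_BYTES <= size <= MAX_VOLUME_BYTES:
            raise ValueError("volume size must be between 1 GiB and 32 TiB")


def max_gateway_capacity_tib() -> int:
    """Largest total capacity per gateway, in TiB."""
    return MAX_VOLUMES_PER_GATEWAY * MAX_VOLUME_BYTES // TIB


validate_cached_volumes([500 * GIB, 2 * TIB])  # passes silently
print(max_gateway_capacity_tib())
```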
Why Stored Volumes Don’t Work
Stored Volumes keep the entire dataset locally.
That means on-premises storage has to scale as the dataset grows.
This contradicts the requirement: only the latest subsets need frequent, low-latency access, not the full dataset.
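To make the scaling difference concrete, here is a rough sketch of how much local capacity each mode would need as the dataset grows. The dataset sizes and the 10 TiB hot subset are invented for illustration, `local_storage_tib` is a hypothetical helper, and the model ignores the upload buffer a cached gateway also needs.

```python
def local_storage_tib(dataset_tib: int, hot_subset_tib: int, mode: str) -> int:
    """Approximate on-premises capacity needed for the data itself."""
    if mode == "stored":
        # Stored mode keeps the whole dataset locally.
        return dataset_tib
    if mode == "cached":
        # Cached mode only needs room for the frequently accessed subset.
        return min(hot_subset_tib, dataset_tib)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical growth: the dataset doubles, the hot subset stays at 10 TiB.
for dataset_tib in (50, 100, 200):
    stored = local_storage_tib(dataset_tib, 10, "stored")
    cached = local_storage_tib(dataset_tib, 10, "cached")
    print(f"dataset={dataset_tib} TiB stored={stored} TiB cached={cached} TiB")
```

Under these assumptions the stored-mode footprint tracks the dataset, while the cached-mode footprint stays tied to the hot subset.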
Summary
Requirement: low latency for latest subsets, not the whole dataset.
Best match: Volume Gateway in Cached Mode.
Why not Stored Mode? Stored Mode keeps the whole dataset local, which is unnecessary and costly here.