This content originally appeared on DEV Community and was authored by Michael Flanagan
Introduction
I’ve been hanging out in the AWS ecosystem for a while, and one of the things I’ve enjoyed is how Bedrock Knowledge Bases give you a relatively turnkey way to stand up RAG pipelines without duct-taping half a dozen services together. The catch has always been vector storage. Until now, your out-of-the-box choices were either OpenSearch (simple but pricey) or Aurora Serverless (flexible but configuration-heavy, even if it scales to zero).
In July ’25, AWS dropped something new into preview: Amazon S3 Vectors. It’s a native vector store designed to be dirt cheap (AWS claims up to 90% lower cost than other options) at the expense of some performance. Latencies are higher, but for many workloads, especially experimentation or pre-production, it should be “good enough.” Right now S3 Vectors shows up as an option when you create new Knowledge Bases in the console; there isn’t an official L2 CDK construct yet.
Repository Link: Building Bedrock Knowledge Base w/S3 Vectors
Purpose
That’s where this CDK construct comes in. I’ve put together a relatively simple but configurable stack that wires up:
- An ingestion S3 bucket
- A Bedrock Knowledge Base
- An S3 Vectors bucket and index
- The appropriate IAM scaffolding
- A Lambda-based cleanup finalizer (so your teardown doesn’t fail on dangling vector stores).
It’s not fully L2 yet, because most of the S3 Vectors pieces are so fresh they still require AwsCustomResource with installLatestAwsSdk: true, and cleanup had to be loaded into a Lambda because CloudFormation deletes in the wrong order otherwise. Hopefully this all gets abstracted in a future CDK release, but for now this construct should give you a reproducible, low-cost way to deploy Knowledge Bases with S3 Vectors.
Value
Who’s this for? Pretty much anyone who wants to explore Bedrock KBs without burning cash on infra you don’t need yet:
- Hobbyists and students: Can build and test RAG pipelines on your own dime without becoming an Aurora expert or burning $100+ a month on OpenSearch.
- Indie devs and small teams: Can prototype workflows, test integrations, or demo features without a runaway AWS bill.
- Enterprise teams: Can spin up a pre-production configuration for experimentation, then move later environments to the more performant OpenSearch, Pinecone, or Aurora. Because Bedrock’s KB API abstracts the vector store, swapping later is always on the table without changes to your application code.
One early caveat: if you expect to move beyond this pattern, it’s better to bring your own S3 ingestion bucket. Knowledge Bases are bound to their vector stores, so changing later can be destructive.
Hit the ground running (Quickstart)
Beyond the usual AWS account + CDK bootstrap, there are two things you need squared away before deploying:
- Region: As of preview (July ’25), S3 Vectors is only available in us-east-1, us-east-2, us-west-2, eu-central-1, and ap-southeast-2
- Embedding model: You must have access to an embedding model supported by Bedrock KBs. At time of writing, that means:
- Amazon Titan Text Embedding v2: supports 256, 512, or 1024 dimensions
- Amazon Titan Text Embedding v1: requires 1536 dimensions
If you try another model, ingestion will fail because the dimensions won’t line up.
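Since a mismatched model/dimension pair only surfaces at ingestion time, it’s worth guarding for it up front. Here’s a minimal sketch of such a check; the model IDs are the current Bedrock identifiers for the Titan embedding models, but the helper itself is mine, not part of the construct:

```typescript
// Hypothetical guard: verify the embedding model supports the vector
// dimension you plan to configure the index with. Model IDs are the
// Bedrock identifiers for Titan Text Embeddings v1/v2.
const SUPPORTED_DIMENSIONS: Record<string, number[]> = {
  "amazon.titan-embed-text-v2:0": [256, 512, 1024],
  "amazon.titan-embed-text-v1": [1536],
};

function assertDimensionSupported(modelId: string, dimension: number): void {
  const allowed = SUPPORTED_DIMENSIONS[modelId];
  if (!allowed) {
    throw new Error(`Unsupported embedding model: ${modelId}`);
  }
  if (!allowed.includes(dimension)) {
    throw new Error(
      `Model ${modelId} supports dimensions ${allowed.join(", ")}, got ${dimension}`
    );
  }
}
```

Failing fast here is much cheaper than discovering the mismatch after a deploy and a failed ingestion job.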
Deploy the Construct
Once you’ve cloned the repo and reviewed the props you might want to tweak (prefixes, parsing model, deletion behavior, etc.), you can go ahead and synth/deploy:
npm install # or yarn/pnpm
cdk synth # optional, but on a first pass it's good to review the CloudFormation you're generating
cdk deploy
On success you’ll get stack outputs that look like:
- KnowledgeBaseId
- KnowledgeBaseArn
- IngestionBucketName (upload docs here; defaults to docs/)
- VectorBucketName / VectorIndexName
Add Documents
Now drop some content into the ingestion bucket. By default, that means s3://&lt;your-ingestion-bucket&gt;/docs/.
For my first test, I uploaded a handful of short Wikipedia pages about sauces (pipian, mole, béchamel) just to have something fun to query against.
Kick Off an Ingestion Job
Next, tell Bedrock to ingest what you just uploaded:
aws bedrock-agent start-ingestion-job \
--knowledge-base-id $KB_ID \
--data-source-id $DS_ID
Then poll until you see it complete:
aws bedrock-agent get-ingestion-job \
--knowledge-base-id $KB_ID \
--data-source-id $DS_ID \
--ingestion-job-id $JOB_ID
You can also check the status in the AWS Console under Bedrock -> Knowledge Bases -> Ingestion Jobs.
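If you’d rather poll from code than re-run the CLI by hand, the loop is simple. This sketch injects the status lookup so it stays testable without AWS calls; in real use you’d back it with GetIngestionJobCommand from @aws-sdk/client-bedrock-agent:

```typescript
// Poll an ingestion job until it reaches a terminal state. `getStatus`
// is injected so the loop can run without AWS; in practice it would
// wrap GetIngestionJobCommand from @aws-sdk/client-bedrock-agent.
type JobStatus = "STARTING" | "IN_PROGRESS" | "COMPLETE" | "FAILED";

async function waitForIngestion(
  getStatus: () => Promise<JobStatus>,
  intervalMs = 5000,
  maxAttempts = 60
): Promise<JobStatus> {
  for (let i = 0; i < maxAttempts; i++) {
    const status = await getStatus();
    if (status === "COMPLETE" || status === "FAILED") return status;
    // Wait before the next poll; S3 Vectors ingestion is slower than
    // OpenSearch-backed KBs, so don't set this too aggressively.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Ingestion job did not finish in time");
}
```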
Try a Simple Retrieval
Once the job is complete, you can run a retrieval-only query (no LLM wrapping):
aws bedrock-agent-runtime retrieve \
--knowledge-base-id $KB_ID \
--retrieval-query '{"text":"What is pipian sauce?"}' \
--retrieval-configuration '{"vectorSearchConfiguration": {"numberOfResults": 3}}'
This will return raw document chunks with their source references.
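The useful bits of that response are nested a few levels deep, so a small extractor helps. The shape below follows the documented RetrievalResult structure from the retrieve API; the helper itself is illustrative, not from the repo:

```typescript
// Flatten a bedrock-agent-runtime Retrieve response into simple
// (text, source, score) records. Shape follows the documented
// RetrievalResult structure.
interface RetrieveResponse {
  retrievalResults?: {
    content?: { text?: string };
    location?: { s3Location?: { uri?: string } };
    score?: number;
  }[];
}

function extractChunks(resp: RetrieveResponse) {
  return (resp.retrievalResults ?? []).map((r) => ({
    text: r.content?.text ?? "",
    source: r.location?.s3Location?.uri ?? "unknown",
    score: r.score ?? 0,
  }));
}
```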
Step Up to RAG (Retrieve-and-Generate)
Now let’s try a full RAG cycle with an LLM on top:
aws bedrock-agent-runtime retrieve-and-generate \
--input '{"text":"Summarize pipian sauce in two sentences"}' \
--retrieve-and-generate-configuration "{
\"type\": \"KNOWLEDGE_BASE\",
\"knowledgeBaseConfiguration\": {
\"knowledgeBaseId\": \"$KB_ID\",
\"modelArn\": \"arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0\",
\"retrievalConfiguration\": {
\"vectorSearchConfiguration\": { \"numberOfResults\": 3 }
}
}
}"
If all goes well, you should see a short, generated answer based on your docs.
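If the escaped JSON in that CLI call gets unwieldy, the same configuration is easier to build in code. This is a hypothetical convenience helper; the field names match the RetrieveAndGenerate API used above:

```typescript
// Build the retrieve-and-generate configuration payload shown in the
// CLI example. Field names follow the bedrock-agent-runtime
// RetrieveAndGenerate API.
function buildRagConfig(
  knowledgeBaseId: string,
  modelArn: string,
  numberOfResults = 3
) {
  return {
    type: "KNOWLEDGE_BASE",
    knowledgeBaseConfiguration: {
      knowledgeBaseId,
      modelArn,
      retrievalConfiguration: {
        vectorSearchConfiguration: { numberOfResults },
      },
    },
  };
}
```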
And just like that, you’ve got a working S3 -> S3 Vectors Knowledge Base on AWS. You can now hook it into workflows, wrap it with agents, or query it from your own code. If you stick around, the next sections cover some caveats and sharp edges that will save you pain down the line.
Considerations & Watchouts (Before You’re Committed)
Vector Store Lock-In
Your Knowledge Base is glued to its vector store. You don’t get to swap later, so if you decide to move to OpenSearch, Pinecone, Aurora Serverless, etc., you’re rebuilding and re-ingesting. Because of that, I’d strongly suggest bringing your own S3 ingestion bucket up front. That way if the KB has to go, your docs don’t go with it.
Model + Dimension Choice
Same deal with models and dimensions: once you’ve set it, that’s it. Titan v2 is flexible (256/512/1024), Titan v1 is locked at 1536. Change your mind later? You’re tearing down the KB and index and starting over. The construct defaults to Titan v2 at 1024 if you don’t care enough to pick.
Price vs. Performance
S3 Vectors is dirt cheap, and right now it’s the cheapest vector store option AWS has, but the latency is slower. Both ingestion and retrieval take longer compared to OpenSearch or Aurora. For tinkering, prototyping, and low-intensity apps, it’s more than fine. But if you’re building something latency-critical, you’ll probably outgrow it.
- S3 Vectors: sub-second retrieval
- OpenSearch Serverless: low tens of ms retrieval
- Aurora pgvector: low tens of ms retrieval
Preview Warnings
Don’t forget this is still in preview. The APIs aren’t final. Permissions are broad, SDK calls may shift, and if you start seeing weird errors about malformed API calls popping up out of nowhere, that’s probably AWS changing things under the hood. I’m expecting to have to refactor this construct a bit before general release, so you’d likely need to do the same if you pull it into your ecosystem.
With all that in mind, you should be good to get off to the races! If you want the deeper details of why the construct is wired the way it is, hang around for the deep dive.
Deep Dive (Why the Construct Is Wired the Way It Is)
Custom Resources and Fresh APIs
The first thing you’ll notice when you crack open the code: a bunch of AwsCustomResource blocks with installLatestAwsSdk: true. This is because S3 Vectors is so new the APIs aren’t in the default Lambda runtime SDK yet. This is a preview-only issue, and it’ll go away once S3 Vectors lands in the managed SDK.
Small aside: running with installLatestAwsSdk also plays funny with how roles get wired in. Instead of relying on CDK to attach roles cleanly, I had to drop in a couple of explicit inline policies to make sure those Lambdas could actually talk to the services. It’s not pretty, but it keeps things working until AWS smooths this out.
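To make that concrete, here’s roughly what one of those blocks looks like. This is a sketch, not the repo’s exact code: the parameter names mirror the S3 Vectors CreateIndex API, but check the repository for the real values and wiring:

```typescript
import { Construct } from "constructs";
import * as cr from "aws-cdk-lib/custom-resources";
import * as iam from "aws-cdk-lib/aws-iam";

// Sketch: create an S3 Vectors index via AwsCustomResource while the
// API is missing from both CloudFormation and the bundled Lambda SDK.
// Parameter names follow the S3 Vectors CreateIndex API; treat the
// specifics as illustrative and defer to the repo.
function addVectorIndex(scope: Construct, bucketName: string, indexName: string) {
  return new cr.AwsCustomResource(scope, "VectorIndex", {
    installLatestAwsSdk: true, // preview API isn't in the default runtime SDK yet
    onCreate: {
      service: "S3Vectors",
      action: "createIndex",
      parameters: {
        vectorBucketName: bucketName,
        indexName,
        dataType: "float32",
        dimension: 1024, // must match your embedding model
        distanceMetric: "cosine",
      },
      physicalResourceId: cr.PhysicalResourceId.of(indexName),
    },
    policy: cr.AwsCustomResourcePolicy.fromStatements([
      new iam.PolicyStatement({
        // Preview limitation: data-plane actions only support * scoping
        actions: ["s3vectors:*"],
        resources: ["*"],
      }),
    ]),
  });
}
```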
Cleanup Lambda
Normal S3 buckets can’t be deleted if they still have objects, and S3 Vector buckets/indexes are no different. CloudFormation doesn’t know that - it just tries to rip things out in reverse order and fails. To fix this, the construct wires in custom resources to delete the index before the bucket, and adds a cleanup Lambda to handle the Bedrock bits (DataSource + Knowledge Base) in the right sequence. On cdk destroy, the two work together so teardown finishes cleanly instead of leaving you with dangling vector stores or broken stacks.
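The ordering is the whole trick, so it’s worth spelling out as a pure function. This isn’t the repo’s code, just a sketch of the sequence the cleanup logic has to enforce:

```typescript
// Teardown order that avoids dangling references: Bedrock data source
// first, then the knowledge base, then the vector index, and only
// then the (now-empty) vector bucket.
type ResourceKind = "dataSource" | "knowledgeBase" | "vectorIndex" | "vectorBucket";

const DELETION_ORDER: ResourceKind[] = [
  "dataSource",
  "knowledgeBase",
  "vectorIndex",
  "vectorBucket",
];

function orderForDeletion(resources: ResourceKind[]): ResourceKind[] {
  return [...resources].sort(
    (a, b) => DELETION_ORDER.indexOf(a) - DELETION_ORDER.indexOf(b)
  );
}
```

CloudFormation’s default reverse-creation-order delete doesn’t respect this, which is exactly why the construct needs the custom resources and cleanup Lambda.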
Broad Permissions for Now
You’ll also see some wide-open IAM statements, especially around s3vectors:*. That’s because the preview only supports wildcard (*) scoping for data-plane actions; ARN-level targeting can’t be relied on while the feature is still in preview. Expect this to tighten up after general release, but in the meantime it’s “better working than broken.”
Parsing Options
By default, ingestion just chunks and indexes your docs. But I wired in optional foundation model parsing: run the content through Claude Sonnet (or another model) with a customizable prompt before vectorization. This is off by default because it can rack up costs fast depending on the model. If you flip it on, make sure the model is enabled in your account/region and double-check permissions + spend expectations.
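As a rough picture of the opt-in, here’s a hypothetical props sketch. The exact prop names below are illustrative, not the construct’s actual interface; check the repo’s README for the real ones:

```typescript
// Hypothetical props shape for opting into foundation model parsing.
// These names are illustrative; the real construct's interface may differ.
interface ParsingProps {
  enableFoundationModelParsing: boolean;
  parsingModelArn?: string;      // must be enabled in your account/region
  parsingPromptOverride?: string; // customizable pre-vectorization prompt
}

// Off by default: plain chunk-and-index ingestion, no extra model spend.
const defaults: ParsingProps = { enableFoundationModelParsing: false };

// Opted in: every document passes through the parsing model first,
// which can rack up costs fast on large corpora.
const withParsing: ParsingProps = {
  enableFoundationModelParsing: true,
  parsingModelArn:
    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
  parsingPromptOverride: "Extract the key facts from each page before chunking.",
};
```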
Preview Caveat
All of the above (the custom resources, cleanup Lambda, broad IAM, parsing options) are wired this way because we’re in preview. Expect SDK upgrades, CDK L2 support (for S3 Vectors/indexes and the datasource), and tighter IAM to land over time. Again, I’ll be watching and updating over time as viable, but if I miss an update feel free to keep me honest in the repo!
Conclusion (Where This Fits, Where It Doesn’t)
At the end of the day, I’m putting out this construct to help folks out while full support is pending. It’s here to give you the cheapest, cleanest path into experimenting with Bedrock Knowledge Bases while AWS does what they do to produce the release version of S3 Vectors in the SDKs & CDK. If you’re a hobbyist, student, or small team, this is your way to get hands-on without torching a cloud budget. If you’re an enterprise shop, this is the kind of thing you point at for prototyping, internal demos, and pre-production environments, knowing you’ll likely jump to OpenSearch, Pinecone, or Aurora for production once latency and compliance start mattering more.
More to the point, it’s not the right fit if you’re building latency-critical apps where sub-100ms is non-negotiable, or if you need stable, compliance-ready APIs today (S3 Vectors is still in preview).
But it is the right fit if you want to understand what Bedrock KBs can do, tinker with your own docs, and start building muscle memory around RAG on AWS without burning dollars unnecessarily.
Feel free to take this for a spin, let me know how it goes. I’ll be iterating this repo as the preview evolves, but real-world feedback is always best!
References
AWS Direct Docs
- Amazon S3 Vectors Preview Announcement
- Working with S3 Vectors and vector buckets
- Amazon Bedrock Knowledge Bases
Community / Blog Posts
- AWS Serverless Advocate Lee Gilmore - Amazon Bedrock Knowledge Bases with Private Data
- ranthebuilder.cloud – Amazon CloudFormation Custom Resources Best Practices with CDK and Python Examples
That’s it! You’ve got the quickstart, the caveats, the guts, and the guardrails. Give it a whirl, and if you hit something weird (or find a sharper way to solve the cleanup/permissions pain points), toss an issue or PR in the repo. I’d love to keep this one honest as S3 Vectors matures.