Forget Vector Databases: RAG with Just SQL + LLM



This content originally appeared on Level Up Coding – Medium and was authored by RisingWave Labs

Turning Documentation into Intelligence: Build a RAG System with RisingWave

Forget Vector Databases: RAG with Just SQL + LLM. Image Source: Freepik

Documentation is essential, but finding the right answer at the right time is harder than it should be. What if your documentation could answer questions like a human expert?

Today I want to share the simplest way to make that happen: with an LLM API + RisingWave (an open-source streaming database), you can turn static documentation into an intelligent assistant that actually understands your content — and delivers contextual, accurate answers in seconds.

The architecture: from raw text to smart answers

Building a RAG system typically requires piecing together multiple disparate tools, such as a vector database, an LLM, and custom orchestration logic. That’s powerful — but also complex and brittle.

RisingWave simplifies this complex architecture by consolidating these functions into its streaming SQL engine. It natively handles vector embeddings and similarity search, manages materialized views for rapid data retrieval, and supports both SQL and Python UDFs for flexible custom logic. Furthermore, its direct integration with the OpenAI API streamlines the entire workflow, creating a unified and more efficient platform for your RAG applications.

Here’s how it works in practice at a high level. In the next section, we’ll walk through how to actually build it yourself in just three steps.

  • Ingest documentation into a RisingWave table.
  • Automatically generate vector embeddings for each document using OpenAI.
  • When a user asks a question, generate an embedding for the query.
  • Retrieve the most semantically similar docs using a custom cosine_similarity UDF.
  • Feed the results to an LLM to generate a concise, context-aware answer.
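The similarity step above is plain vector math. Here is a minimal, illustrative Python sketch of what a `cosine_similarity` UDF computes (the name comes from the article; this standalone implementation is an assumption, not RisingWave's actual UDF code):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score ~1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # ~1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Registered as a Python UDF in RisingWave (which supports both SQL and Python UDFs), this lets the ranking run entirely inside the database.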

You get semantic search and AI-powered answers, without standing up a separate vector database or custom backend.

Building it yourself in just three steps

Set up the pipeline

Create a documents table and a document_embeddings materialized view backed by a UDF that calls OpenAI's embedding API. The SQL UDF below, text_embedding, is a thin wrapper around openai_embedding, a built-in function introduced in RisingWave 2.5.0. See the openai_embedding function documentation for more details.

CREATE FUNCTION text_embedding(t VARCHAR) RETURNS REAL[] LANGUAGE sql AS $$
SELECT openai_embedding('your-openai-api-key', 'text-embedding-3-small', t)
$$;
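With the wrapper in place, the table and materialized view might look like the following. This is a hedged sketch: the column names (id, title, content) are assumptions, so adjust them to your documentation's shape.

```sql
-- Hypothetical schema; adapt the columns to your docs.
CREATE TABLE documents (
    id INT PRIMARY KEY,
    title VARCHAR,
    content VARCHAR
);

-- The materialized view embeds each document as it arrives,
-- so newly inserted docs are indexed automatically.
CREATE MATERIALIZED VIEW document_embeddings AS
SELECT id, title, content, text_embedding(content) AS embedding
FROM documents;
```

Because document_embeddings is a materialized view, RisingWave keeps it incrementally up to date as rows land in documents.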

Load your docs

Clone the RisingWave documentation repo and insert the content into the documents table.
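Inserting content is ordinary SQL. A sketch, assuming the hypothetical documents(id, title, content) schema (in practice you would script this over the cloned markdown files, one row per file or per section):

```sql
-- Illustrative rows only; generate the real inserts from the repo contents.
INSERT INTO documents (id, title, content) VALUES
    (1, 'overview.md',  'RisingWave is a streaming database that ...'),
    (2, 'ingestion.md', 'To ingest data, create a source with CREATE SOURCE ...');
```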

Ask questions

Generate a query embedding, compare it with stored embeddings using the cosine_similarity UDF, and retrieve the top matches.
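Put together, the retrieval query might look like this. It is a sketch assuming the cosine_similarity UDF from the tutorial and a document_embeddings view with title, content, and embedding columns (hypothetical names):

```sql
-- Embed the question once, then rank stored docs by similarity.
SELECT
    title,
    content,
    cosine_similarity(
        embedding,
        text_embedding('How do I create a source in RisingWave?')
    ) AS score
FROM document_embeddings
ORDER BY score DESC
LIMIT 3;
```

The top rows become the context passed to the LLM for the final answer.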

For detailed step-by-step guides, see Building a RAG system on RisingWave.

Beyond search

This new capability opens the door to several powerful applications:

  • Self-service developer support: Answer onboarding and usage questions automatically.
  • AI assistants for internal tools: Embed natural-language Q&A into your dev dashboards or CLIs.
  • Product search that understands meaning: Power help centers with real intelligence, not just full-text search.

And best of all, it’s all done using standard SQL and built-in features in RisingWave — so you can focus on building, not wiring systems together.

Try it out

Ready to move from endless searching to real finding? The full tutorial is available here: Building a RAG system with Just SQL + LLM.

RisingWave is a stream processing and management platform designed to offer the simplest and most cost-effective way to process, analyze, and manage real-time event data — with built-in support for the Apache Iceberg™ open table format. It provides both a Postgres-compatible SQL interface and a DataFrame-style Python interface.

RisingWave can ingest millions of events per second, continuously join and analyze live streams with historical data, serve ad-hoc queries at low latency, and persist fresh, consistent results to Apache Iceberg™ or any other downstream system.

End-to-End Real-Time Data Stack. Image created by the author.

