Data Processing & Analysis | Advanced
November 13, 2025
6 min read
1 hour
Build a Smart AI Chatbot That Actually Knows Your Documents (Using n8n RAG Workflow)
Turn Google Drive into an AI-powered knowledge base. This n8n RAG workflow answers questions from your docs with real-time citations and smart context.
By Nayma Sultana

You've got dozens of documents scattered across Google Drive. Your team keeps asking the same questions. You find yourself copy-pasting from the same PDFs over and over. Sound familiar?
Here's the thing: most chatbots give you generic answers pulled from their training data. But what if your chatbot could answer questions specifically from YOUR documents, cite its sources, and keep the conversation flowing naturally?
That's exactly what this n8n RAG (Retrieval-Augmented Generation) workflow does. It turns your Google Drive into a smart knowledge base that your AI can search through and reference in real time. No more hunting through folders. No more outdated information. Just accurate answers with clickable source links.
What You'll Need to Get Started
Before diving into the workflow, let's gather the tools. You'll need accounts and API access for a few services. Don't worry, most offer free tiers that are enough for testing.
Required APIs and Accounts
- Google Drive API: This connects n8n to your Google Drive folder where documents live
- Supabase Account: Your vector database that stores document embeddings and metadata
- OpenAI API: Generates embeddings that help the AI understand document content
- Google Gemini API: Powers the conversational AI that chats with users
- Cohere API: Reranks search results to surface the most relevant information
- PostgreSQL Database: Stores chat history so conversations have context
Key Components in This Workflow
This workflow uses several specialized n8n nodes working in harmony:
- Chat Trigger: Creates the chat interface users interact with
- RAG Agent: The brain that decides when to search documents or just chat
- Google Drive Trigger: Watches for new or updated files automatically
- Supabase Vector Store: Stores and searches through document embeddings
- Text Splitter: Breaks documents into digestible 750-character chunks
- Embeddings Nodes: Convert text into mathematical representations
- Reranker: Improves search accuracy by reordering results
- Memory Node: Remembers conversation history for context
Building Your RAG System: Step by Step
Step 1: Set Up the Chat Interface
Start with the fun part. The Chat Trigger node creates a web interface where users can ask questions. Connect it to the RAG Agent, which acts as the coordinator. The agent has a custom system prompt that gives it personality and instructions.
The prompt does something clever: it analyzes whether someone is just saying hello or actually asking for information. If it's casual chat, the bot responds naturally without digging through documents. If it's a real question, the bot switches into research mode and searches your knowledge base.
Hook up Google Gemini as your language model and add PostgreSQL Chat Memory so the bot remembers what you talked about earlier in the conversation. This makes interactions feel natural rather than robotic.
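To make the memory idea concrete, here is a minimal sketch of session-keyed chat history with a sliding window. It uses an in-memory SQLite database as a stand-in for PostgreSQL so it runs anywhere. The table name chat_history, its columns, and the window size are assumptions for illustration, not the schema the n8n memory node creates.

```python
import sqlite3


def open_memory() -> sqlite3.Connection:
    """Create an in-memory store (stand-in for the PostgreSQL database)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per message, grouped by session.
    conn.execute(
        "CREATE TABLE chat_history ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, "
        "role TEXT NOT NULL, "
        "content TEXT NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    """Append one message to a session's history."""
    conn.execute(
        "INSERT INTO chat_history (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )


def recall(conn: sqlite3.Connection, session_id: str, window: int = 10) -> list[tuple[str, str]]:
    """Return the last `window` messages of a session, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM chat_history "
        "WHERE session_id = ? ORDER BY id DESC LIMIT ?",
        (session_id, window),
    ).fetchall()
    return list(reversed(rows))
```

Keying by session keeps separate conversations from leaking into each other, and the window caps how much history is sent to the model on each turn.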
Step 2: Connect Your Document Pipeline
Now for the behind-the-scenes magic. Add two Google Drive Trigger nodes: one watches for new files, another watches for updates. Point both at your chosen Google Drive folder.
When a file appears or changes, the workflow springs into action. A Loop Over Items node processes files one by one (because batch processing can get messy). The Set File ID node extracts important metadata: the file's ID, type, title, and shareable link. You'll need these later for citations.
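As a rough illustration of what the Set File ID step collects, the sketch below pulls the four fields out of a Google Drive file record. The input keys (id, name, mimeType, webViewLink) are field names from the Drive API v3 file resource. The output key names are hypothetical and may differ from the ones the workflow uses.

```python
def extract_file_metadata(drive_file: dict) -> dict:
    """Pick out the fields needed later for routing and citations."""
    return {
        "file_id": drive_file["id"],
        "file_type": drive_file["mimeType"],
        "file_title": drive_file["name"],
        # webViewLink is only present if it was requested from the API.
        "file_url": drive_file.get("webViewLink", ""),
    }


# Example record with made-up values.
example = {
    "id": "abc123",
    "name": "Onboarding Guide",
    "mimeType": "application/pdf",
    "webViewLink": "https://example.com/file/abc123",
}
metadata = extract_file_metadata(example)
```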
Step 3: Process Documents Intelligently
Different file types need different handling. A Switch node checks whether you're dealing with a PDF or a Google Doc, then routes accordingly.
PDFs go through Extract PDF Text. Google Docs get downloaded and converted to plain text through Extract Document Text. Both paths merge back together because the next steps are identical.
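The routing logic is simple enough to sketch in a few lines. This version branches on the file's MIME type; the two MIME strings are the standard ones for PDFs and Google Docs, while the branch labels are placeholder names, and the actual Switch node may be configured differently.

```python
PDF_MIME = "application/pdf"
GOOGLE_DOC_MIME = "application/vnd.google-apps.document"


def route_by_type(mime_type: str) -> str:
    """Return the name of the extraction branch for a given MIME type."""
    if mime_type == PDF_MIME:
        return "extract_pdf_text"
    if mime_type == GOOGLE_DOC_MIME:
        return "extract_document_text"
    # Anything else is unsupported in this sketch.
    raise ValueError(f"Unsupported file type: {mime_type}")
```

Raising on unknown types is one option; silently skipping unsupported files would also be a reasonable choice.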
Here's where it gets technical but important: the Character Text Splitter breaks documents into 750-character chunks with 200 characters of overlap. Why overlap? A sentence that gets cut at a chunk boundary still appears intact in the neighboring chunk, so context carries between chunks. Think of it like a flip book where each page shares a bit with the previous one.
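To see what 750 characters with 200 of overlap means in practice, here is a simplified sliding-window splitter. It is a sketch of the idea only: the real Character Text Splitter node also splits on a separator, which this version ignores.

```python
def split_text(text: str, chunk_size: int = 750, overlap: int = 200) -> list[str]:
    """Cut text into fixed-size chunks where each chunk repeats the
    last `overlap` characters of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the text
    return chunks


# A 2000-character sample document.
sample = "".join(chr(97 + i % 26) for i in range(2000))
chunks = split_text(sample)
```

With these settings the window moves forward 550 characters at a time, so a 2000-character document yields four chunks, and the last 200 characters of each chunk are the first 200 of the next.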
Step 4: Create Searchable Embeddings
Plain keyword search misses questions that are worded differently from the document. You need embeddings, which are mathematical representations that capture meaning. The Embeddings OpenAI node converts each chunk into a vector (a list of numbers), and chunks with similar meaning end up with similar vectors.
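Vector stores typically compare embeddings with a measure such as cosine similarity. The sketch below implements it in plain Python with tiny made-up vectors; real embeddings have hundreds or thousands of dimensions, and the metric your Supabase setup uses depends on how the table and match function are defined.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


# Toy three-dimensional "embeddings" (illustrative values only).
question = [0.9, 0.1, 0.0]
related_chunk = [0.8, 0.2, 0.1]
unrelated_chunk = [0.0, 0.1, 0.9]
```

Here the question vector scores much higher against related_chunk than against unrelated_chunk, which is the property similarity search relies on.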
These vectors get stored in Supabase Vector Store along with metadata: which file they came from, the title, and the URL. This metadata becomes crucial later when the chatbot needs to cite sources.
Before inserting new chunks, the Delete Old Doc Rows node removes previous versions of the document. This prevents duplicates when files get updated. The Insert Document Metadata node keeps a separate table tracking which files exist in your system.
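The delete-then-insert pattern can be sketched with a plain list standing in for the Supabase table. The row shape and the metadata keys (file_id, title, url) are assumptions for illustration.

```python
def replace_document_chunks(
    rows: list[dict], file_id: str, chunks: list[str], title: str, url: str
) -> list[dict]:
    """Drop every stored chunk for file_id, then add the fresh chunks."""
    # Step 1: delete old rows that belong to this file.
    kept = [row for row in rows if row["metadata"]["file_id"] != file_id]
    # Step 2: insert the new chunks with the same citation metadata.
    fresh = [
        {"content": chunk, "metadata": {"file_id": file_id, "title": title, "url": url}}
        for chunk in chunks
    ]
    return kept + fresh
```

Because the delete is keyed on the file ID, re-processing an updated file swaps out only that file's chunks and leaves every other document untouched.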
Step 5: Make Search Results Smarter
Vector search is good, but it's not perfect. Sometimes relevant results get buried. That's where the Reranker Cohere node shines.
When someone asks a question, the Supabase Vector Store retrieves the top 20 most similar chunks. Then Cohere reranks them, pushing the truly relevant ones to the top. It's like having a librarian who not only finds books on your topic but also knows which chapters actually answer your question.
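The two-stage pattern looks roughly like this: a cheap first pass picks a shortlist, then a more careful scorer reorders it. Both scoring functions below are toy stand-ins (word overlap instead of vector similarity, overlap density instead of Cohere's model), and the top_n cutoff is a hypothetical parameter, since the workflow description only specifies the top 20.

```python
from typing import Callable

Scorer = Callable[[str, str], float]


def keyword_overlap(query: str, chunk: str) -> float:
    """Stand-in for vector similarity: number of shared lowercase words."""
    return float(len(set(query.lower().split()) & set(chunk.lower().split())))


def overlap_density(query: str, chunk: str) -> float:
    """Stand-in for a reranker: shared words divided by chunk length."""
    words = chunk.lower().split()
    if not words:
        return 0.0
    return keyword_overlap(query, chunk) / len(words)


def retrieve_then_rerank(
    query: str,
    chunks: list[str],
    retrieve_score: Scorer,
    rerank_score: Scorer,
    top_k: int = 20,
    top_n: int = 5,
) -> list[str]:
    """Shortlist top_k chunks with the first scorer, then reorder the
    shortlist with the second scorer and keep the best top_n."""
    shortlist = sorted(chunks, key=lambda c: retrieve_score(query, c), reverse=True)[:top_k]
    return sorted(shortlist, key=lambda c: rerank_score(query, c), reverse=True)[:top_n]
```

The design point is that the reranker only sees the shortlist, so it can afford to be slower and more precise than the first pass.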
The RAG Agent receives these reranked results and uses them to formulate an answer. Crucially, the system prompt instructs it to cite sources using the metadata. Every claim gets wrapped in a citation with a clickable link back to the original document.
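One way to picture the citation step: the retrieved chunks are handed to the model together with their titles and URLs, and the model's claims are written as links. The sketch below shows both halves. The Markdown link format and the metadata keys (title, url) are assumptions; the workflow's actual system prompt may ask for a different citation style.

```python
def build_context(chunks: list[dict]) -> str:
    """Number each retrieved chunk and attach its source title and URL."""
    blocks = []
    for number, chunk in enumerate(chunks, start=1):
        meta = chunk["metadata"]
        blocks.append(f"[{number}] {meta['title']} ({meta['url']})\n{chunk['content']}")
    return "\n\n".join(blocks)


def cite(claim: str, metadata: dict) -> str:
    """Append a clickable Markdown link to a claim."""
    return f"{claim} [{metadata['title']}]({metadata['url']})"
```

Keeping the title and URL next to each chunk in the context is what lets the model point back to the right document.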
Step 6: Keep Everything Clean
When documents are deleted from Google Drive, their embeddings can linger in your database like ghosts. The workflow includes two cleanup jobs (currently disabled but ready when you need them).
These jobs compare files in Google Drive against records in Supabase. Any orphaned records get deleted from both the documents table and the document_metadata table. The Code nodes handle edge cases, like what happens if your entire Google Drive folder gets emptied.
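At its core, the comparison is a set difference. The sketch below uses plain lists in place of the two Supabase tables; the row shapes and key names are assumptions. It is not a copy of the workflow's Code nodes.

```python
def find_orphans(drive_file_ids: list[str], stored_file_ids: list[str]) -> set[str]:
    """File IDs that exist in the database but no longer in Google Drive."""
    return set(stored_file_ids) - set(drive_file_ids)


def purge(
    documents: list[dict], document_metadata: list[dict], orphans: set[str]
) -> tuple[list[dict], list[dict]]:
    """Remove orphaned rows from both tables and return the cleaned copies."""
    clean_documents = [
        row for row in documents if row["metadata"]["file_id"] not in orphans
    ]
    clean_metadata = [row for row in document_metadata if row["id"] not in orphans]
    return clean_documents, clean_metadata
```

Note the empty-folder case: if Google Drive returns no files at all, every stored ID counts as an orphan, so it is worth confirming the folder is really empty (and not just unreachable) before running the delete.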
Why This Workflow Changes the Game
This isn't just a chatbot. It's a knowledge management system that scales with you.
Your support team can answer customer questions faster because the bot pulls from your latest documentation. Your sales team can quickly reference product specs without digging through shared drives. Your onboarding process becomes self-service because new hires can ask questions and get accurate, cited answers.
The real power is in the citations. Users don't just get answers; they get links to verify information. Trust goes up. Fact-checking is built in. And when something changes in your documents, the bot's knowledge updates automatically, typically within minutes depending on how often the Google Drive triggers poll for changes.
Companies use workflows like this for internal wikis, customer support systems, research databases, compliance documentation, and product knowledge bases. Anywhere you have documents and people asking questions, this workflow fits.
Start Building Your Smart Knowledge Base
The beauty of n8n is that you can start small. Drop a few documents in Google Drive, set up the basic pipeline, and watch it work. As you get comfortable, add the cleanup jobs. Fine-tune the chunk size. Experiment with different reranking settings.
Your documents already contain valuable knowledge. This workflow just makes that knowledge accessible, searchable, and conversational. And unlike traditional search, it understands context, maintains conversation history, and presents information in a human-friendly way.
Stop letting valuable information hide in documents. Build a system that brings knowledge to your fingertips, one chat message at a time.