RAG Engine (formerly Vertex AI RAG Engine) - Managed RAG Pipelines

What is RAG Engine?

RAG Engine is a fully managed orchestration runtime for retrieval-augmented generation (RAG), now part of Google’s Gemini Enterprise Agent Platform (previously marketed under the Vertex AI brand). The service builds and operates the complete RAG pipeline for you and augments the responses of large language models with your own private data. As a result, models like Gemini answer more accurately and hallucinate less, because generation builds on verifiable sources rather than on the model’s training knowledge alone.

RAG Engine follows a six-step pipeline: data ingestion, transformation with chunking, embedding, indexing into a corpus, retrieval, and grounded generation. Chunk size and overlap are configurable, so you can tune retrieval quality for your use case. The service is natively integrated with the Gemini API as a retrieval tool and can be combined with generation models from the platform’s Model Garden, including Gemini, Claude, and Llama.
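To make the chunk size and overlap settings concrete, here is a minimal sketch of overlapping chunking in plain Python. It is not RAG Engine code: the function name chunk_words is hypothetical, and it counts whitespace-separated words as a stand-in for whatever unit the managed service actually uses.

```python
def chunk_words(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into chunks of `chunk_size` words; neighbours share `overlap` words.

    Illustrative only: words stand in for the unit the managed service uses.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1):
        print(chunk)
```

A larger overlap keeps more shared context between neighbouring chunks at the price of more chunks to embed and store, which is the trade-off the two settings let you tune.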

Core Features

Six-step RAG pipeline: Data ingestion, transformation and chunking, embedding, corpus indexing, retrieval, and grounded generation as a managed, end-to-end flow with configurable chunk size and overlap.
Native Gemini integration: Available as a retrieval tool in the Gemini API, with access to generation models from the Model Garden such as Gemini, Claude, and Llama.
Pluggable vector databases: Choose between RagManagedDb (default), Vector Search, Feature Store, and third parties such as Pinecone and Weaviate.
Broad data-source connectivity: Ingest from Cloud Storage, Google Drive, BigQuery datasets, local files, and websites, as well as through additional connectors.
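The idea behind a pluggable vector database can be sketched as a small interface that different backends implement. Everything below is an assumption made for illustration: VectorStore, InMemoryStore, and retrieve_ids are hypothetical names and not part of the RAG Engine API, and the in-memory backend ranks by cosine similarity only as a toy stand-in for a real store.

```python
import math
from typing import Protocol


class VectorStore(Protocol):
    """Hypothetical minimal interface a vector backend would satisfy."""

    def upsert(self, chunk_id: str, vector: list[float]) -> None: ...

    def query(self, vector: list[float], top_k: int) -> list[tuple[str, float]]: ...


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # zero vectors match nothing


class InMemoryStore:
    """Toy backend: keeps vectors in a dict and scans all of them per query."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, chunk_id: str, vector: list[float]) -> None:
        self._vectors[chunk_id] = vector

    def query(self, vector: list[float], top_k: int) -> list[tuple[str, float]]:
        scored = [
            (chunk_id, cosine_similarity(vector, stored))
            for chunk_id, stored in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # best match first
        return scored[:top_k]


def retrieve_ids(store: VectorStore, query_vector: list[float], top_k: int = 2) -> list[str]:
    """Retrieval code depends only on the interface, not on a concrete backend."""
    return [chunk_id for chunk_id, _ in store.query(query_vector, top_k)]
```

Because retrieve_ids only relies on the interface, swapping the in-memory store for another backend would not change the calling code, which is the property that makes a choice of vector database possible.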

Typical Use Cases

Knowledge-based chatbots and assistants: RAG Engine grounds responses in internal documents, manuals, and knowledge bases. This lets assistants answer questions about company-specific content that a general-purpose model has never seen.

Question-answering systems with source citations: Because responses are grounded in a corpus, they can be traced back to concrete sources. This makes answers easier to verify in areas such as support, legal, or compliance.
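One way to picture how grounding enables source citations is to number the retrieved chunks in the prompt and ask the model to cite those numbers. The sketch below assumes retrieval has already happened. RetrievedChunk and build_grounded_prompt are hypothetical names, the prompt wording is only an example, and the file names in the usage lines are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedChunk:
    source: str  # where the chunk came from, e.g. a file name
    text: str    # the chunk content returned by retrieval


def build_grounded_prompt(question: str, chunks: list[RetrievedChunk]) -> str:
    """Assemble a prompt whose context entries are numbered so answers can cite them."""
    context_lines = [
        f"[{index}] ({chunk.source}) {chunk.text}"
        for index, chunk in enumerate(chunks, start=1)
    ]
    return "\n".join([
        "Answer using only the numbered context below.",
        "Cite supporting entries like [1]; say so if the context is insufficient.",
        "",
        "Context:",
        *context_lines,
        "",
        f"Question: {question}",
    ])


if __name__ == "__main__":
    # Invented example data, for illustration only.
    chunks = [
        RetrievedChunk("handbook.pdf", "Employees receive 30 vacation days."),
        RetrievedChunk("faq.md", "Requests go through the HR portal."),
    ]
    print(build_grounded_prompt("How many vacation days do I get?", chunks))
```

Keeping the source next to each numbered entry means a citation such as [1] in the answer can be mapped back to the originating document.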

RAG backends for agents and search: RAG Engine serves as the retrieval layer for agents (e.g. in Agent Studio/Agent Runtime) and search applications, delivering the relevant context that agents need to complete tasks.

Benefits

Fully managed pipeline with no need to operate your own embedding, index, and retrieval infrastructure.
Flexible choice of vector database and data sources, so you avoid lock-in to a single storage solution.
Pay-per-use pricing built from individual components: you pay only for the components you use.

Integration with innFactory

As a certified Google Cloud partner, innFactory supports you with the adoption and operation of RAG Engine as part of the Gemini Enterprise Agent Platform.

Frequently Asked Questions

What is RAG Engine (formerly Vertex AI RAG Engine)?

RAG Engine is a managed orchestration runtime for retrieval-augmented generation, now part of the Gemini Enterprise Agent Platform (formerly Vertex AI). The service runs the complete pipeline: data ingestion, chunking, embedding, indexing into a corpus, retrieval, and grounded generation. It enriches LLM responses with your own data to reduce hallucinations.

When should I use RAG Engine?

Use RAG Engine when you want to ground responses from Gemini or other models in internal documents, knowledge bases, or structured data. Typical scenarios are knowledge-based chatbots, question-answering systems with source citations, and RAG backends for agents where you do not want to operate the pipeline yourself.

How much does RAG Engine cost?

RAG Engine is billed on a pay-per-use basis, and the total is made up of separate components: data-source access through the default parser, LLM parser calls, vector storage, embedding, and generation model usage are each billed according to usage. There is no flat base fee; details are available in the official pricing list.

Which vector databases and data sources does RAG Engine support?

Supported vector stores include RagManagedDb (default), Vector Search (formerly Vertex AI Vector Search), and Feature Store, as well as third parties such as Pinecone and Weaviate. Supported data sources include Cloud Storage, Google Drive, BigQuery datasets, local files, websites, and additional connectors.

How does RAG Engine relate to Vertex AI and the Gemini Enterprise Agent Platform?

RAG Engine was originally part of Vertex AI and, in 2026, moved together with the entire Vertex AI portfolio into the Gemini Enterprise Agent Platform. The product name RAG Engine itself was retained; only the platform name and some neighboring components (e.g. Agent Engine to Agent Runtime) were renamed.
