Amazon OpenSearch Service

Amazon OpenSearch Serverless

Power agentic AI and dynamic workloads with instant scaling, scale to zero, no infrastructure to manage, and no idle compute costs.

Why Amazon OpenSearch Serverless

Create resources in seconds and scale up, down, or to zero seamlessly. Decoupled storage and compute with usage-based pricing offer up to 60% cost savings compared to provisioning OpenSearch Service clusters for peak capacity, and scale to zero means no idle compute costs. Native integrations with leading AI development platforms, including Vercel and Kiro, streamline AI development so you can build faster.

Benefits of Amazon OpenSearch Serverless

    Get real-time context retrieval that scales automatically with dynamic agentic workloads, with no infrastructure management required. Create resources in seconds. Rapid autoscaling keeps performance consistent as workloads grow or shrink.

    Pay for storage and compute independently, and only for what you use. Automatic scaling eliminates idle capacity and overprovisioning by matching resources to demand in real time. Save up to 60% compared to provisioning OpenSearch Service clusters for peak capacity.

    Go from idea to production with no infrastructure to manage. Build from your preferred AI development platforms such as Vercel and Kiro, and create complete agent-driven applications in minutes.

Use cases

    Run critical search and vector-powered applications with dynamic workloads. Scale instantly to meet demand across financial services, e-commerce, and media platforms without provisioning for peak capacity.

    Power RAG applications, semantic search, and agentic workflows that generate unpredictable traffic. Scale instantly when agents trigger tasks and pay no compute costs when idle, so developers can focus on building instead of capacity planning.

    Serve search and vector workloads across multiple tenants without paying for always-on capacity. Scale dynamically for high-usage tenants with no compute costs for idle tenants, eliminating the overprovisioning tax of traditional architectures.

Autodesk

"Autodesk's Search platform for Design & Make Data on Amazon OpenSearch Service powers search and information retrieval across our multi-billion-dollar industry products, handling billions of objects and versions per model and millions of documents across multiple languages and geographies. OpenSearch Serverless is the natural next step: decoupled compute and storage fits versioning requirements at petabyte-scale, scale-to-zero removes idle costs, and provisioning in seconds means we can scale from thousands to millions of users in weeks, not months, without touching infrastructure. We're excited to build the next-generation search platform with OpenSearch Serverless."

Prasanth Nair, Sr. Director, Software Engineering & Architecture, Autodesk

Freshworks

"We're injecting agentic AI capabilities across our products to help 75,000+ customers resolve issues faster and work smarter. We've trusted OpenSearch Service as the backbone of our Search and Observability platform for years. With Amazon OpenSearch Serverless, we can take that foundation further — scaling instantly with unpredictable agent workloads while maintaining cost efficiency and letting our engineers focus on building AI features instead of managing infrastructure."

Sreedhar Gade, VP of Engineering, Freshworks

Sumo Logic

“Amazon OpenSearch Serverless is the vector search foundation behind Sumo Logic Dojo AI, our agentic security and operations platform that detects, investigates, and responds to threats in real time. It powers RAG retrieval across thousands of Sumo Logic knowledge sources that Mobot and our SOC Analyst Agent depend on. OpenSearch Serverless helps us provide real-time contextual evidence retrieval for security and reliability investigations. We're looking forward to the new serverless architecture and agentic features to deliver agent-native knowledge retrieval for RAG and elastic scalability without manual provisioning.”

Kui Jia, VP AI Engineering, Sumo Logic

Serverless FAQs

    Amazon OpenSearch Serverless is a fully managed search and vector database that automatically scales capacity up or down based on the demands of your workload, so you pay only for the capacity you use.

    Benefits include simplified capacity management, automatic scaling, support for multi-tenant workloads, and up to 60% cost savings versus provisioning OpenSearch Service managed clusters for peak capacity.

    You can get started with OpenSearch Serverless in minutes. In the AWS Management Console, navigate to the Amazon OpenSearch Service console and select the type of collection you need: search or vector. Within seconds, you get a fully managed endpoint for indexing and searching your data. OpenSearch Serverless is also available through the AWS CLI and AWS SDKs for teams that prefer infrastructure as code.
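
For teams taking the SDK route, the following is a minimal sketch of preparing a request for the boto3 `opensearchserverless` client's `create_collection` call. The helper names and the example collection name are hypothetical. The mapping from the console's "search" and "vector" choices to the API values `SEARCH` and `VECTORSEARCH`, and the local name rule (3 to 32 characters, lowercase letters, digits, and hyphens, starting with a letter), are assumptions to check against the API reference. The sketch also assumes boto3 is installed, credentials are configured, and an encryption policy covering the collection already exists. It does not show how a collection generation is selected.

```python
import re

# Assumed mapping from the console's collection choices to API type values.
COLLECTION_TYPES = {"search": "SEARCH", "vector": "VECTORSEARCH"}

# Assumed name rule: 3-32 characters, a lowercase letter first,
# then lowercase letters, digits, or hyphens.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9-]{2,31}$")


def build_create_collection_request(name, kind, description=""):
    """Return keyword arguments for create_collection after local validation."""
    if kind not in COLLECTION_TYPES:
        raise ValueError(f"kind must be one of {sorted(COLLECTION_TYPES)}")
    if not NAME_PATTERN.match(name):
        raise ValueError("invalid collection name")
    request = {"name": name, "type": COLLECTION_TYPES[kind]}
    if description:
        request["description"] = description
    return request


def create_collection(name, kind, description=""):
    """Send the request with boto3 (needs AWS credentials; not called below)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("opensearchserverless")
    response = client.create_collection(
        **build_create_collection_request(name, kind, description)
    )
    return response["createCollectionDetail"]


if __name__ == "__main__":
    # Hypothetical collection name, used only for illustration.
    print(build_create_collection_request("agent-memory", "vector"))
```

Keeping the request builder separate from the network call lets you validate inputs in tests or CI without touching an AWS account.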

    Amazon OpenSearch Serverless now delivers up to 20x faster autoscaling than its previous generation, along with scale-to-zero. Infrastructure scales on demand from a single request to thousands of requests and back to zero when idle, handling the unpredictable traffic spikes generated by agentic workflows. Customers pay only for what they consume, with no idle capacity charges, eliminating the need to over-provision for peak loads.

    When an OpenSearch Serverless collection has no active requests, it automatically scales compute down to zero OCUs after 10 minutes of inactivity, so customers are charged only for managed storage while idle. When a new request arrives, OpenSearch Serverless provisions the necessary compute resources and begins serving traffic with a cold start delay of approximately 10 seconds. This true scale-to-zero capability means customers never pay for idle compute, making it ideal for development environments, bursty agentic workloads, and long-tail collections that see intermittent traffic.
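
To make the idle behavior concrete, here is a simplified model of how long compute stays up for a given request pattern. It takes only the two figures above (a 10-minute idle timeout and a roughly 10-second cold start) and assumes everything else: the function names are hypothetical, time is tracked in whole seconds, and actual billing granularity and scaling internals are not described in the text and are not modeled.

```python
IDLE_TIMEOUT_S = 10 * 60  # from the text: scale to zero after 10 idle minutes
COLD_START_S = 10         # from the text: approximate cold start delay


def compute_active_seconds(request_times):
    """Seconds that compute stays up for requests at the given times (in seconds).

    Simplified model: compute runs from the first request of a burst until
    IDLE_TIMEOUT_S after the last request of that burst.
    """
    total = 0
    window_start = window_end = None
    for t in sorted(request_times):
        if window_end is None or t > window_end:
            # Compute was at zero: close the previous window, open a new one.
            if window_end is not None:
                total += window_end - window_start
            window_start = t
        window_end = t + IDLE_TIMEOUT_S
    if window_end is not None:
        total += window_end - window_start
    return total


def cold_starts(request_times):
    """Number of requests that arrive while compute is scaled to zero."""
    count = 0
    window_end = None
    for t in sorted(request_times):
        if window_end is None or t > window_end:
            count += 1
        window_end = t + IDLE_TIMEOUT_S
    return count


if __name__ == "__main__":
    # Two requests five minutes apart, then one more an hour in.
    times = [0, 300, 3600]
    print(compute_active_seconds(times))      # 1500
    print(cold_starts(times) * COLD_START_S)  # 20 seconds of added latency
```

In this example, compute is up for 25 minutes out of a 70-minute span, and two requests each see the approximate cold start delay.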

    Decoupled storage and compute means OpenSearch Serverless manages your data on a shared storage layer independently from the compute resources that process your queries and indexing. This allows compute to scale up, down, or to zero without moving or recopying data, enabling up to 20x faster autoscaling and scale-to-zero.

    OpenSearch Serverless charges across three dimensions: Indexing OCUs (compute for indexing data), billed in OCUs per hour; Search OCUs (compute for serving queries), billed in OCUs per hour; and Storage-Hot, billed in GB per month. When your collection scales to zero, you stop paying for compute entirely and are charged only for storage. This pay-per-use model means your bill directly reflects actual consumption rather than provisioned capacity, delivering up to 60% cost savings compared to provisioning for peak loads. Refer to the pricing page for more details.
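
The three billing dimensions combine as a simple sum, sketched below. The function name and every price in the example are placeholders chosen for easy arithmetic, not actual AWS rates; take real rates from the pricing page.

```python
def estimate_monthly_cost(indexing_ocu_hours, search_ocu_hours, storage_gb,
                          indexing_price, search_price, storage_price):
    """Estimate one month's bill from the three billing dimensions.

    Prices are per OCU-hour (indexing, search) and per GB-month (storage).
    """
    indexing = indexing_ocu_hours * indexing_price
    search = search_ocu_hours * search_price
    storage = storage_gb * storage_price
    return {
        "indexing": round(indexing, 2),
        "search": round(search, 2),
        "storage": round(storage, 2),
        "total": round(indexing + search + storage, 2),
    }


if __name__ == "__main__":
    # Placeholder prices, NOT actual AWS rates.
    prices = dict(indexing_price=0.25, search_price=0.25, storage_price=0.02)

    # A bursty month: 20 indexing OCU-hours, 50 search OCU-hours, 100 GB stored.
    print(estimate_monthly_cost(20, 50, 100, **prices))

    # A fully idle month: compute scaled to zero, storage still billed.
    print(estimate_monthly_cost(0, 0, 100, **prices))
```

With these placeholder rates, the bursty month totals 19.5 and the idle month totals 2.0, which shows that only the storage charge remains once compute scales to zero.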

    Amazon OpenSearch Serverless offers two collection generations. Classic collections are the previous generation. NextGen collections are the new generation, purpose-built for agentic and unpredictable workloads: they scale to zero so you pay for compute only when your collection is in use, autoscale up to 20x faster, and fully decouple storage from compute so indexing and search scale independently. NextGen collections reduce infrastructure costs by up to 60% and eliminate the need for capacity planning. We recommend all customers create NextGen collections.

    Yes, you will get the same security features in both collection generations.

    Yes, you can run both collection generations in the same account.