    Vespa Cloud Subscription

    Vespa is an AI search platform for real-time retrieval, ranking, and inference on AWS, powering customer-facing applications such as RAG, search, recommendations, and personalization. It unifies structured, unstructured, vector, and tensor data in a single system to deliver low-latency, high-throughput performance. With hybrid search and machine-learned ranking executed directly in the engine, Vespa enables accurate, scalable AI applications that respond instantly to user interactions.

    Ratings and reviews

    4.4
    11 ratings
    5 star: 64%
    4 star: 27%
    3 star: 9%
    2 star: 0%
    1 star: 0%
    3 AWS reviews | 8 external reviews
    External reviews are from G2.

    Reviews (11)
    Ganaraj Amakrishna

    Vector search has improved e‑commerce relevance but setup and learning curve still need work

    Reviewed on Jun 06, 2026
    Review from a verified AWS customer

    What is our primary use case?

    My main use case for Vespa is implementing it as the back-end search engine for an e-commerce site, where we sell about six million products (SKUs). I implemented Vespa as an alternative to Elasticsearch.

    Using Vespa for the e-commerce site involved utilizing it as the backend search engine to replace Elasticsearch, which we felt was not doing us justice. The very first thing I did was convince my CIO to try out Vespa. We did a quick proof of concept, engaged with the right people through the Vespa Slack channels, and then we did the actual implementation, including A/B testing it against the previously running fully optimized Elasticsearch pipeline.

    What is most valuable?

    One of the best features that Vespa offers is natively handling embeddings or vectors, along with its capability for really fast searches. The powerful DSL provided by Vespa allows you to define search calculations, which I used extensively over the period of one year.

    Vector search is definitely the biggest selling point for Vespa. Even though Elasticsearch has vector search capabilities, it is not as powerful as what Vespa offers. Vespa's native support for sparse embeddings was an advantage, but in the end I opted for dense embeddings using Google's Gemma embeddings, since I was retraining that model on our dataset.

    Vespa's DSL includes many built-in functions, which is quite powerful. Even without embedding features, the DSL shines. However, indexing takes considerable time: initial runs took almost a whole day to create the embeddings and push them into Vespa, and it required adequate resources to run. Another issue was its inability to handle synonyms in the same manner as Elasticsearch.

    What needs improvement?

    Vespa definitely had its own set of challenges. It was really hard to get into initially, especially when I started implementing it in 2024 along with one junior employee, and the lack of documentation made it difficult. I aimed for an implementation with ColBERT, a sparse embedding mechanism, which I believed would fit well for e-commerce. We went through iterations during A/B testing because the initial set did not work as expected, which extended the process to about one and a half years.

    Vespa has a considerable learning curve, making it challenging for most people to get into, and it is also expensive, which can deter startups or those with smaller budgets from using it. Community support was decent, and we turned to it for clarifications. However, substantial improvements in documentation are necessary, especially more examples for handling DSL effectively. Having a runtime testing feature would greatly facilitate quick iterations.

    For how long have I used the solution?

    I have been using Vespa for about a year and a half.

    What do I think about the stability of the solution?

    In terms of stability and scalability, Vespa performed well. While it took some attempts to stabilize, I managed to scale effectively with the traffic we experienced and the servers we operated.

    How are customer service and support?

    The customer support I received was pretty good, mainly through interactions in the Slack community, where I typically got responses within hours or by the next day, leading me to rate them an eight or maybe even nine.

    Which solution did I use previously and why did I switch?

    Before choosing Vespa, I explored a few other search engine solutions, starting with Orama, a Node.js and TypeScript-based search engine that struggled to handle six million SKUs, and then Typesense, which aimed for instant searches but failed to accommodate the numerous attributes I needed for sparse data. That led me to Vespa, which met my expectations.

    What about the implementation team?

    I have very little experience with Vespa's governance and security, but I found it generally robust, despite lacking extensive engagement in that area. We deploy Vespa on AWS, which qualifies it as a public cloud solution.

    We did not purchase Vespa through the AWS Marketplace. Instead, we deployed the free version of Vespa that was available.

    What was our ROI?

    I cannot claim a return on investment, since we had not fully deployed Vespa into production. Only two people worked on it for approximately a year and a half, so it did not require a large team. However, we spent about two thousand to three thousand dollars per month on AWS while using Vespa, compared with around one thousand to one thousand five hundred dollars per month for Elasticsearch, although we saw some slight improvements in key metrics before stopping the A/B test.

    What's my experience with pricing, setup cost, and licensing?

    The setup cost is definitely huge, and pricing is also steep. In terms of licensing, it seems generous for those who do not want to engage with Vespa's hosted services.

    What other advice do I have?

    I would rate Vespa a six, as it is a powerful tool with great potential in terms of search engine capabilities, but the steep learning curve and initial setup costs are significant downsides.

    I chose six because of the steep learning curve and the substantial initial costs involved with setting up Vespa. If it were feasible for people with limited budgets, even as low as fifty dollars a month, it would be more appealing.

    While conducting A/B testing, Vespa seemed to be performing slightly better than Elasticsearch, especially in search relevancy within live production systems, and its performance was decent. Comparing raw Elasticsearch text-based search against Vespa's vector-based and text-based search, we were already recommending Vespa to several peer companies.

    During A/B testing, looking at conversion rates, search-to-basket ratios, and add-to-basket ratios showed improvement until we shut it down. It took several iterations to get the results, particularly after switching to Embedding Gemma, emphasizing that the quality of embedding used heavily influenced the outcome.

    Nothing else comes to mind regarding improvements needed for Vespa.

    I would not suggest Vespa unless you are an enterprise due to the steep learning curve and significant infrastructure costs involved. My overall rating for Vespa is six out of ten.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Shubhang Dutta

    Hybrid search has improved document retrieval and now supports high-volume conversational queries

    Reviewed on Jun 06, 2026
    Review from a verified AWS customer

    What is our primary use case?

    My name is Shubhank and I work at Redblink Technologies in Mohali, where I have been doing AI integration work. I implemented RAG.

    Last year, we were using Qdrant as a vector store and creating many collections there. After discussing it internally, our company decided to move to Vespa about six to eight months ago. We are using Vespa in our RAG pipeline.

    We have implemented a RAG pipeline where we have document retrieval. Users can chat with their documents. We are breaking down our documents into meaningful chunks using LangChain4j and feeding that directly into Vespa as a vector store. Later, while the user chats or starts a chat with the document, we can retrieve according to the user's prompt.

    We have more such use cases. We have a client, CPA Pilot, with many text documents, so we chunk those documents directly. The documents are very large, so in Qdrant the collections were almost full. Vespa has no system of collections, which also helped us. We use self-hosted Vespa for that particular client: we chunk the long documents using LangChain4j and send them to Vespa for storage. During retrieval, we get good results and proper relevance scores based on the user's query.

    What is most valuable?

    Earlier we used Qdrant, where we only had vector embeddings and searched on that basis. Vespa gave us a highly scalable and more reliable platform. Vespa also provides both BM25 text search and embedding search. The main reason we moved to Vespa is hybrid search.

    We have explored BM25 and hybrid search. We have implemented direct search and also embedding search by creating embeddings and storing them in Vespa. We can also use direct text search based on the user's query. This way we have implemented hybrid search and get the user's response.
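
    As an illustration of the hybrid search described above, the sketch below blends a BM25 text score with an embedding similarity score in plain Python. It does not use Vespa's API or ranking expressions. The function names (`bm25_scores`, `cosine`, `hybrid_rank`), the `alpha` weight, the max-normalization step, and the sample documents are all assumptions made for the example, not details from the review.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with plain BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: number of documents each term appears in.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(query_terms, query_vec, docs, doc_vecs, alpha=0.5):
    """Blend max-normalized BM25 with cosine similarity; return doc indices, best first."""
    text = bm25_scores(query_terms, docs)
    top = max(text) or 1.0  # avoid dividing by zero when no term matches
    blended = [
        alpha * (t / top) + (1 - alpha) * cosine(query_vec, v)
        for t, v in zip(text, doc_vecs)
    ]
    return sorted(range(len(docs)), key=lambda i: blended[i], reverse=True)


# Hypothetical tokenized chunks and toy 2-dimensional embeddings.
docs = [
    ["tax", "filing", "deadline"],
    ["quarterly", "tax", "estimate"],
    ["office", "party", "menu"],
]
doc_vecs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]

ranking = hybrid_rank(["tax", "deadline"], [1.0, 0.0], docs, doc_vecs)
print(ranking)
```

    With `alpha=1.0` the ranking depends on text match only, and with `alpha=0.0` on embedding similarity only. A weighted sum is just one simple way to combine the two signals.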

    It works very well because we have many documents, plus a single document is also very long. Vespa is very good with retrieval and high-volume queries.

    Earlier we used Qdrant and now we are moving to Vespa. In Qdrant, there is a concept of collections: for every assistant, we create a new collection, and every document in that assistant goes to that particular collection. Vespa has no concept of collections, so we have to separate documents by assistant ourselves. That makes it unique. We were used to having a single collection for each specific assistant. With Vespa, we have everything in one place and separate it by assistant and by the environment we are using.

    What needs improvement?

    We want Vespa to implement some UI features so that we can visualize how our data goes and what embeddings it stores. The main thing Vespa has to implement is the UI. Right now, we are hitting the API and getting the results in Postman.

    One more improvement we want is an option for getting suggestions from Vespa. When you store a document in the vector store, Vespa could suggest what information to store for that particular document. I also suggest implementing some features related to agentic AI. We have a couple of agents, so Vespa could decide which agent is best suited to a given user query. That would be helpful.

    For how long have I used the solution?

    I have been working here since last year.

    What do I think about the stability of the solution?

    We have not specifically calculated the metrics, but until now we do not feel any issues with Vespa.

    What do I think about the scalability of the solution?

    Vespa is stable and it is also scalable. We have many documents, and a single document also has a lot of content inside it. Vespa stores it in a well-optimized way.

    How are customer service and support?

    I do not have that much involvement with the customer support. I have raised some questions on Slack for the Vespa community and received responses in 24 hours. I have discussed my concerns and questions. The community support is very good.

    Which solution did I use previously and why did I switch?

    We were initially trying to use Pinecone, but after a lot of discussion and research, we decided to go with Vespa.

    How was the initial setup?

    AWS provides us with more analytics for Vespa, such as how it is performing on the servers. It is easy because of the thorough documentation.

    We are using self-hosted Vespa in our AWS servers.

    The setup process is fine. It helps us save money and we got very good responses from the users.

    What's my experience with pricing, setup cost, and licensing?

    The cost part is not at a high point or at a low point. It is somewhere in the middle. That also helps us to sustain it.

    Which other solutions did I evaluate?

    We were initially trying to use Pinecone, but after a lot of discussion and research, we decided to go with Vespa.

    What other advice do I have?

    Anyone who wants to use a vector store should do their own research, and if nothing else comes up after discussion and research, I recommend using Vespa because it has good reliability. The main thing is the speed: the retrieval speed is very good. I recommend Vespa as a system to integrate with.

    Vespa is very good and it has improved our product, and we have gained more clients. We got very good results and very good relevance. This mainly depends on how you design the Vespa document schemas. The document schema design determines your relevance and how your retrieval is done. The feedback on how Vespa responds is good, and it is also fast. We are using Amazon Web Services (AWS), and it is easy because of the thorough documentation. I give Vespa a rating of nine out of ten.

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Kelyn Ukiru

    Advanced ranking has improved candidate matching and now simplifies end‑to‑end hiring workflows

    Reviewed on Jun 04, 2026
    Review from a verified AWS customer

    What is our primary use case?

    I use Vespa as a vector database for ranking and matching. I have jobs and candidates indexed in Vespa, which is a vector database. When I have a job and need to get the first 50 candidates who match that job, a normal vector search would not retrieve the first 50 because I may need to filter or rank based on some features and fields. Vespa helps by allowing me to first select the first 200, and then within the 200, I rank the first 50 based on certain criteria.
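
    The "select the first 200, then rank the first 50" flow described above is a two-phase ranking pattern. The sketch below shows the idea in plain Python and does not use Vespa's API. The function name `two_phase_rank`, the candidate fields (`similarity`, `years`), and both scoring functions are hypothetical and chosen only for illustration.

```python
import heapq


def two_phase_rank(candidates, first_phase, second_phase, rerank_count=200, keep=50):
    """Shortlist with a cheap score, then re-rank the shortlist with a costlier one."""
    # Phase 1: cheap score over every candidate, keep the best `rerank_count`.
    shortlist = heapq.nlargest(rerank_count, candidates, key=first_phase)
    # Phase 2: costlier score over the shortlist only, keep the best `keep`.
    return heapq.nlargest(keep, shortlist, key=second_phase)


# Hypothetical candidates: a vector similarity to the job plus one extra field.
candidates = [
    {"id": i, "similarity": i / 1000, "years": i % 7}
    for i in range(1000)
]


def by_similarity(candidate):
    """First-phase score: vector similarity only."""
    return candidate["similarity"]


def by_fit(candidate):
    """Second-phase score: similarity blended with normalized experience."""
    return 0.7 * candidate["similarity"] + 0.3 * candidate["years"] / 6


top_50 = two_phase_rank(candidates, by_similarity, by_fit)
print(len(top_50), top_50[0]["id"])
```

    Running the second, more expensive score only on the shortlist is what keeps the cost bounded: the first phase touches every candidate, the second only `rerank_count` of them.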

    What is most valuable?

    The best feature to me is the LTR feature, the ranking feature to be specific. With most other vector databases, you perform the query and then apply your own LTR logic outside the database, but with Vespa, both operations can be done within the database. This reduces latency and bottlenecks. In effect, your filter and sort operations run within the database itself. I also appreciate the filter part that Vespa offers.

    The results have been better with Vespa. The matching has been far better with Vespa. Before, I was using other solutions such as PGVector and Pinecone and Weaviate, but they were harder to integrate, meaning more work for the developer. With Vespa, the integration on the development side has been easier.

    I cannot say that the improved results are directly tied to Vespa alone because many iterations have been made, and this includes the general architecture. The fact that Vespa does its own indexing, and I can just receive a string of text, index it and store it as a vector, is remarkable.

    What needs improvement?

    The integration is actually a pain. If something could be done to make it easier, I would really appreciate it. The reason I am saying this is that if I have a migration script that I need to run in the database, it is difficult to run migration scripts on Vespa because each time I run a migration script, I have to restart Vespa. This makes the CI process a little bit difficult.

    A UI would be nice to have. It is not something I thought of earlier, but now that I am thinking of it, I believe a UI would be beneficial, similar to how Neo4j offers it. Of course, I have not looked into it extensively, so I do not know if it offers a UI because for my use case, I did not need the UI.

    The documentation could also be improved, although the documentation was quite easy to follow for me. I do not know if it is a skill issue or not. For beginners, for someone just getting into vector databases and they just discover Vespa, I believe it will be somewhat harder for them. If the documentation can be made more beginner-friendly, it would be better.

    The effort involved in migration scripts and the amount of resources Vespa requires are both significant.

    The embedding is good. I have actually used it; the only problem is that if I need some more context passed into the embedding, then with Vespa, it is difficult, meaning I have to pass the text through an external LLM to extract the context and then pass it into Vespa for embedding. If there is a way I can improve the embedding context, to pass some context into Vespa's embedding so that I can just pass a string and let it handle the embedding by itself, that would be beneficial.

    I noticed Vespa is meant to be deployed only within your own environment. On an internal network, since Vespa does not support usernames and passwords, it is difficult to control what level of access a user has to data. This raises some questions regarding integrity. If two different services are using the same Vespa instance, then data protection is at risk. If Vespa introduced usernames and passwords, this would solve many things. This structure also means that Vespa cannot be exposed to the public. If I want to buy more resources from a different vendor and run my services there, exposing Vespa to the public means additional setup, such as IP mapping or something similar. If the IPs are changing, then I run into a different problem. Therefore usernames and passwords, although basic, or at least roles, would really help.

    For how long have I used the solution?

    I have used Vespa for roughly two years.

    What do I think about the scalability of the solution?

    I have yet to experience scalability issues. I do not know how it handles traffic, but I will have an answer for that soon. I also have not tried scaling it, so I do not know how it scales. These are answers I have yet to find.

    As I mentioned, I have not tried scaling it and I do not know what problems I might run into while scaling Vespa. I still need some more time, maybe as I get more users. At the moment, it is not a bottleneck for us; the one instance of Vespa is working well. Perhaps soon I will scale it and can have a better answer for this.

    Which solution did I use previously and why did I switch?

    I was using PGVector before switching to Vespa. The reason that made me switch to Vespa is because I needed more functionality, such as the ranking feature. There were some other options on the table; Weaviate was one of them.

    What other advice do I have?

    Up to now, I am still in the building phase. I have not gone commercial with my product, and so I cannot give a relevant answer about that. I am still trying out Vespa to see if it actually meets my business need. I would tell others that the product is actually good if they have some resources on their side because it is resource-intensive. It actually requires someone who knows what they are doing to reap most of the benefits out of Vespa because you do not have to implement most of the features in the code layer; you can just do it at the database layer. I would rate this product an 8 out of 10.

    Which deployment model are you using for this solution?

    Private Cloud

    Automotive

    Powerful backend for vector and hybrid search with many bells and whistles.

    Reviewed on Dec 18, 2024
    Review provided by G2
    What do you like best about the product?
    We purchased the Enclave product, which was really well-suited for us because it let us run the hosts in our own Google Cloud account (at our pricing with Google) and thus didn't require us to transfer any data out, which was well-aligned with our security stance. It provided light-touch deployment and observability services that we lacked and helped us bootstrap quickly and with minimal investment.

    The Vespa search backend itself provided a good match to our requirements of near-real time hybrid search, combining nearest neighbor embedding search with attribute filters, in a distributed and highly scalable way. Our target installation comprised >12TB of memory across 24 hosts and held O(1B) vector embeddings.
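
    To make "nearest neighbor embedding search with attribute filters" concrete, here is a minimal brute-force sketch in plain Python. It is an exact scan over a toy list, not the approximate, distributed search a production engine would run at this scale, and it does not use Vespa's API. The function name `filtered_nearest`, the item layout, and the `license` attribute are assumptions made for the example.

```python
import math


def filtered_nearest(query_vec, items, predicate, k=3):
    """Exact nearest-neighbor search restricted to items that pass an attribute filter."""
    def distance(vec):
        # Euclidean distance between the query and a stored vector.
        return math.sqrt(sum((q - v) ** 2 for q, v in zip(query_vec, vec)))

    # Apply the attribute filter first, then rank the survivors by distance.
    matches = [item for item in items if predicate(item["attrs"])]
    return sorted(matches, key=lambda item: distance(item["vec"]))[:k]


# Hypothetical image records with toy 2-dimensional embeddings.
images = [
    {"id": "a", "vec": [0.0, 0.0], "attrs": {"license": "internal"}},
    {"id": "b", "vec": [0.1, 0.0], "attrs": {"license": "public"}},
    {"id": "c", "vec": [0.5, 0.5], "attrs": {"license": "internal"}},
    {"id": "d", "vec": [2.0, 2.0], "attrs": {"license": "internal"}},
]

hits = filtered_nearest(
    [0.0, 0.0],
    images,
    predicate=lambda attrs: attrs["license"] == "internal",
    k=2,
)
print([hit["id"] for hit in hits])
```

    Item "b" is the second-closest vector overall but is excluded by the filter, which is the behavior that distinguishes filtered search from a plain nearest-neighbor lookup followed by post-filtering of a fixed-size result.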
    What do you dislike about the product?
    Vespa, in a scalable deployment, presents a fairly complex architecture with a lot of tuning knobs and bells and whistles. It took several months to get familiar with them. The Vespa consultant was very instrumental in this. Feeding Vespa from BigQuery was harder than expected.
    Native extensions can only be written in Java which, without a native Java toolchain at our company, proved too challenging to pursue. The documentation is vast but could be better organized and have more contextual examples in places.
    What problems is the product solving and how is that benefiting you?
    We used the Vespa search backend for hybrid search, consisting of nearest neighbor search over indexed embedding vectors combined with attribute filters. This powered a natural-language image search product for our internal users.
    Satwik L.

    My go-to-tool for my research on my e-commerce data

    Reviewed on Sep 11, 2024
    Review provided by G2
    What do you like best about the product?
    I like that it is open source and offers 300 dollars in free cloud credits for hosting live applications.
    What do you dislike about the product?
    I feel more documentation is needed and that this work is still pending, as I am still exploring the AI and vector database part.

    Anyway, I am happy to contribute to open source as a contributor.
    What problems is the product solving and how is that benefiting you?
    I worked with my e-commerce client to highlight the products that generate more sales, using ranking and recommendations to improve stock efficiency.
    Michele S.

    Connect data to AI capabilities

    Reviewed on Aug 13, 2024
    Review provided by G2
    What do you like best about the product?
    I can create recommendation applications and deploy real-time machine learning inference using this stack. Such a level of functionality is what we need for our large scale search applications.
    What do you dislike about the product?
    Initializing and then operating Vespa requires a significant amount of system configuration. It can be a little obscure at times, and troubleshooting issues requires a real understanding of the underlying environment.
    What problems is the product solving and how is that benefiting you?
    Vespa solves the problem of managing and processing large amounts of data and integrating it with artificial intelligence for web applications. It enables me to build outstanding search capabilities, and I use its real-time data processing.
    Vignesh H.

    Best Gen AI software to build your own infrastructure

    Reviewed on Jul 30, 2024
    Review provided by G2
    What do you like best about the product?
    The most helpful thing is the open-source big data engine, which helps to process and serve large-scale data in real time with very low latency. Its content recommendations are very useful for modern real-time analysis. It is also flexible and scalable, with advanced query techniques that make it easier to use.
    What do you dislike about the product?
    Integrating Vespa with existing systems and workflows can be challenging, particularly if those systems are based on different technologies. Documentation and customer support for the open-source product are not top notch when compared to the real-time products. Since it is highly specialized, it may be overkill for simpler applications or less demanding requirements.
    What problems is the product solving and how is that benefiting you?
    Vespa helps with real-time updates: we use it as a search engine that provides many recommendations based on our search results. It has the scalability and flexibility to process large volumes of data for real-time analysis and, in turn, produces intelligent responses based on the latest data.
    Marketing and Advertising

    Vespa decreased costs, latency, and management for billions of searches per month

    Reviewed on Jun 12, 2024
    Review provided by G2
    What do you like best about the product?
    For our use case in advertising, Vespa leaves Apache Lucene-based products in the dust:
    - High indexing throughput while searching
    - Very, very technical team
    - Best of the best technical support and guidance
    - Multiple times, discussions were had and the next day the idea was implemented
    What do you dislike about the product?
    - Search is still costly
    - Improving ANN capabilities with ideas like DiskANN
    - Simplify schema configuration and testing
    - Lean in on more cloud native technologies
    What problems is the product solving and how is that benefiting you?
    We do web-scale advertising. This means we process billions of queries a month concurrently with hundreds of millions of feed requests. Vespa Cloud and their team provided us great technical guidance, saving us hundreds of thousands of dollars by optimizing and implementing fixes for our deployment. Although the road to utilizing Vespa was a long, hard journey, we are in a much better place than we were with our previous solution, a Lucene-based product.
    Eddie N.

    We moved our inhouse recommendations system to Vespa

    Reviewed on Jun 10, 2024
    Review provided by G2
    What do you like best about the product?
    Vespa provides a comprehensive set of features you would look for in a search engine, in particular more ranking capabilities (e.g. leveraging ML models) and better performance than what Elasticsearch offers out of the box. They're also constantly adding new capabilities, so that they offer a nice hybrid between vector databases and a conventional search engine. Particularly for our business problem at OkCupid of recommending potential matches to millions of other users based on a myriad of factors and ranking algorithms, Vespa was a great fit, not only to meet those use cases but also to improve our team's development and iteration workflows in our recs system.

    The Vespa team is also very active on Slack: https://vespatalk.slack.com/ssb/redirect and genuinely collaborative. In my case, we worked together with an engineer from their team who helped raise improvement changes into the engine to help us meet our use cases.
    What do you dislike about the product?
    One of the challenges in the past was around documentation and general community knowledge and expertise. Their documentation has since gone through a substantial revamp.
    What problems is the product solving and how is that benefiting you?
    Vespa provides vector database capabilities as well as typical search engine capabilities, so that we can apply other filters rather than only constraining on similar vectors. Additionally, Vespa provides a strong set of ranking capabilities out of the box via ONNX, TensorFlow, LightGBM, and other models.
    Gabe V.

    The best search infrastructure

    Reviewed on Jun 10, 2024
    Review provided by G2
    What do you like best about the product?
    Powerful Search Capabilities: Vespa.ai's search engine delivers lightning-fast and highly relevant results, even for complex queries over vast datasets. Their advanced linguistics capabilities ensure accurate understanding of query intent.

    Scalable Architecture: I never have to worry about scaling with the Vespa cloud offering

    Rich Filtering and Ranking: Vespa provides extensive capabilities for filtering, ranking, and blending results based on multiple criteria and machine learning models. We leverage their HNSW and BM25 rankings

    Machine Learning Integration: Their tight integration with advanced machine learning frameworks like TensorFlow and PyTorch allows easy deployment of custom ML models for ranking, recommendations, and other use cases.

    Top Tier Customer Support: The Vespa team has been exceedingly responsive to my questions regarding how to implement certain features.
    What do you dislike about the product?
    There can be a steep learning curve when onboarding to the product, though it is well worth the investment of time
    What problems is the product solving and how is that benefiting you?
    Finding relevant information for my end users