
Overview

WhyLabs is the essential AI Observability Platform for model and data health. It is the only machine learning monitoring and observability platform that doesn't operate on raw data, which enables a no-configuration solution, privacy preservation, and massive scale.
Machine learning engineers and data scientists rely on the platform to monitor ML applications and data pipelines by surfacing and resolving data quality issues, data bias, and concept drift. These capabilities help AI builders reduce model failures, avoid downtime, and ensure customers are getting the best user experience. With out-of-the-box anomaly detection and purpose-built visualizations, WhyLabs eliminates the need for manual troubleshooting and reduces operational costs.
The platform can monitor tabular, image, and text data. It integrates with many popular ML and data tools, including Pandas, Apache Spark, Amazon SageMaker, MLflow, Flask, Ray, RAPIDS, Apache Kafka, and more. To learn more about what data types WhyLabs can work with and which tools we integrate with, check out the whylogs GitHub page: https://github.com/whylabs/whylogs
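To picture what monitoring without raw data can look like, here is a minimal sketch in plain Python. It is not the whylogs API: `ColumnProfile` and `profile_rows` are hypothetical names, and the sample rows are made up for illustration. The assumption is that only per-column summary statistics would be sent to a monitoring service, while the rows themselves stay where they are.

```python
import math
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Running summary of one numeric column (hypothetical, for illustration)."""
    count: int = 0
    nulls: int = 0
    minimum: float = math.inf
    maximum: float = -math.inf
    total: float = 0.0

    def update(self, value):
        # Count every value, but only fold non-null values into the statistics.
        self.count += 1
        if value is None:
            self.nulls += 1
            return
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        self.total += value

    @property
    def mean(self):
        non_null = self.count - self.nulls
        return self.total / non_null if non_null else None


def profile_rows(rows):
    """Build one ColumnProfile per column from an iterable of dict rows."""
    profiles = {}
    for row in rows:
        for name, value in row.items():
            profiles.setdefault(name, ColumnProfile()).update(value)
    return profiles


# Made-up sample data; only `profiles` would leave the environment.
rows = [
    {"age": 34, "income": 52000.0},
    {"age": 29, "income": None},
    {"age": 41, "income": 61000.0},
]
profiles = profile_rows(rows)
```

A monitor could then compare profiles from different days, for example to flag a jump in the null count, without ever reading an individual record.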
WhyLabs was created at the Allen Institute for Artificial Intelligence (AI2) by Amazon Machine Learning alums and is backed by Andrew Ng's AI Fund.
For custom pricing, EULA, or a private contract, please contact AWSMarketplace@whylabs.ai for a private offer.
Highlights
- Enable data and model monitoring quickly and securely: Automated monitoring and alerting across dozens of "data vitals" with out-of-the-box configurations and lightweight integrations. Cloud agnostic, built with AWS-grade privacy and security. Integration takes less than an hour.
- Deliver the impact models were designed for: Improve model performance, resilience, and auditability with alerting and reporting tools. Monitor model inputs, outputs, and performance, as well as upstream data quality, in one platform.
- Achieve AI Governance across the organization: Track all relevant metrics associated with the data that flows through AI applications. Enabling observability in AI applications is key for achieving AI Governance best practices.
Details
Pricing
| Dimension | Description | Cost/month |
|---|---|---|
| 1 Model (free tier) | Monitoring for one model | $0.00 |
| 2 Models | Monitoring for two models | $100.00 |
| 3 Models | Monitoring for three models | $200.00 |
| 4 Models | Monitoring for four models | $300.00 |
| 5 Models | Monitoring for five models | $400.00 |
| WhyLabs Enterprise | Enterprise contract with model monitoring at scale | $8,333.33 |
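The arithmetic behind the table can be sketched as a small lookup. The function names `monthly_cost` and `annual_cost` are hypothetical, and the annual figure assumes twelve equal monthly charges with no discount, which the listing does not state. Model counts that are not listed raise an error rather than guessing a price.

```python
from decimal import Decimal

# Monthly cost per listed dimension, copied from the pricing table.
MONTHLY_COST = {
    1: Decimal("0.00"),
    2: Decimal("100.00"),
    3: Decimal("200.00"),
    4: Decimal("300.00"),
    5: Decimal("400.00"),
}
ENTERPRISE_MONTHLY = Decimal("8333.33")


def monthly_cost(models: int) -> Decimal:
    """Return the listed monthly cost for 1 to 5 models."""
    try:
        return MONTHLY_COST[models]
    except KeyError:
        # The table lists no price for other counts; do not invent one.
        raise ValueError(f"no listed price for {models} models") from None


def annual_cost(monthly: Decimal) -> Decimal:
    """Assumption: twelve equal monthly charges, no annual discount."""
    return monthly * 12


three_models_per_year = annual_cost(monthly_cost(3))
enterprise_per_year = annual_cost(ENTERPRISE_MONTHLY)
```

Under that assumption, the Enterprise dimension comes to just under $100,000 per year.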
Vendor refund policy
No refunds
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

Standard contract
Customer reviews
Monitoring multi-agent LLM workflows has become reliable and protects PII in real time
What is our primary use case?
My main use case for WhyLabs was for LLM monitoring and observability. At that time, I had an AI application that I deployed on Vertex AI, and I used WhyLabs for the observability, logging, and monitoring of that application and the model.
I can provide a specific example of how I used WhyLabs for monitoring my LLM application. It was a multi-agent system with around four agents involved, and each agent had around seven or eight tools that it could use or invoke. Whenever a user sent a query to the main agent, its responsibility was to delegate the request among the other sub-agents. The sub-agents could communicate with each other using the A2A protocol and call their tools. I monitored how the request progressed through the system. For instance, a user might send a request to one agent, which then transferred it to a third agent; the third agent used a tool, and then the request went to the seventh agent. In WhyLabs, I could easily monitor all this communication between the agents, the logging time, the request, the response, any errors, and any guardrails I wanted in my application.
This was my only use case, and then WhyLabs got discontinued. WhyLabs was acquired by Apple in January or February 2025. The company then open-sourced their software so that anyone can use it. It is now open-source software available on GitHub where you can set it up yourself and use it.
What is most valuable?
WhyLabs's best features are real-time guardrails, PII (personally identifiable information) detection, hallucination mitigation, and monitoring. It has a centralized dashboard, so I can create a project and see an overall summary, and I can check health metrics for WhyLabs or for the application on specific dates or at specific times. Additionally, it provides an alerting system. If there is an error or the system is down, it generates an alert via email.
Out of all those features, I find the PII detection and the monitoring most valuable in my day-to-day work because it is very hard to monitor an LLM application. As I mentioned earlier, it was a multi-agent system in which a query can easily pass from one agent to another, which made it hard to debug how a request progressed and how the data flowed. Monitoring, PII detection, and guardrails are the three features most useful to me. The guardrails and PII detection are particularly useful when I do not want PII passed to the agents or to any LLM.
WhyLabs has positively impacted my organization by reducing the time spent on errors and debugging. It has also improved the user experience. When the application is down, I receive alerts, which has saved my team a significant amount of time.
What needs improvement?
Regarding how WhyLabs can be improved, since it is not available in the market as of now, improvements cannot be made to the product itself. However, there is an open-source version that anyone can set up on their machine and try to accomplish the same things.
I do not think there is anything else needed for improvement.
For how long have I used the solution?
I was using it in 2024 for around 1.5 years.
What was our ROI?
WhyLabs has saved my team 30 to 40% of its time.
What other advice do I have?
Regarding WhyLabs's AI capabilities, I believe its governance and security are solid. It was deployed in our on-premises infrastructure, so all the data remained within our infrastructure. The guardrails and the PII detection worked perfectly. I never saw a case where it failed to generate an alert for PII data or where the guardrails did not work, so it performed very well.
In terms of WhyLabs's AI capabilities, I believe it is highly accurate. I used it for around 1.5 years, and it was the best software available, but it was discontinued. It was very good software.
My advice to others considering WhyLabs is that as of now it is open-source, and you can set it up on your own machine for free and use it. It has very good features. I would rate this product a 10 out of 10.
I built a monitoring solution with WhyLabs for multiple ML models for a client of mine.
Model performance monitoring
WhyLabs helped us set up end-to-end monitoring of our ML projects
* The tool allows for easy ingestion of a large number of features and for setting up initial monitoring on them
* We can use it to monitor both input quality and model performance
* Alerts can be raised to specific groups of users via specific channels (email/Slack), which is helpful
* Some actions are not possible via UI and require specific API calls
* Documentation can be hard to navigate