Databricks Data Intelligence Platform
Databricks, Inc. · External reviews
763 reviews
External reviews are not included in the AWS star rating for the product.
Unified ML Platform That Removes Infrastructure Friction
What do you like best about the product?
The unified platform experience is genuinely hard to beat — having MLflow for experiment tracking, Unity Catalog for governance, vector search, and serverless endpoints all in one place removes so much infrastructure friction. Feature engineering pipelines and model deployment feel cohesive rather than stitched together. The SQL warehouse + notebook hybrid workflow also makes it easy to hand off between data engineering and ML work without context switching tools.
What do you dislike about the product?
Serverless endpoints have some sharp edges — Spark context initialization behaves differently than in interactive clusters, which can cause silent failures if you're not careful about where you initialize things. Cold start latency on serverless is also noticeable for low-traffic production endpoints. Documentation around some of the newer features (like vector search index configs) tends to lag behind the actual product behavior, so you end up doing a lot of trial and error.
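The initialization pitfall described above is usually worked around by deferring expensive setup until the first request instead of doing it at import time. A minimal sketch of that lazy-singleton pattern in plain Python; the Spark-specific calls are deliberately omitted, and `factory` stands in for whatever heavy resource the endpoint builds:

```python
def lazy_singleton(factory):
    """Wrap a zero-argument factory so the resource is built on first use only.

    In a serving endpoint this keeps heavy setup (e.g. a Spark session) out of
    module import time, where it may behave differently than in a notebook.
    """
    cache = {}

    def get():
        if "value" not in cache:
            cache["value"] = factory()  # runs exactly once
        return cache["value"]

    return get


calls = []
get_resource = lazy_singleton(lambda: calls.append(1) or "resource")

print(calls)           # [] -- nothing is built at definition time
print(get_resource())  # first call triggers the factory
print(get_resource())  # second call reuses the cached value
print(calls)           # [1] -- the factory ran exactly once
```

The same shape works for model artifacts, database connections, or anything else that should not be constructed while the serving container is still importing code.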
What problems is the product solving and how is that benefiting you?
We use Databricks to consolidate ML model development, feature engineering, and deployment for a cards and payments platform — work that previously required juggling separate tools for data processing, training, and serving. The unified environment means our ML engineers can go from raw transaction data to a deployed churn prediction model without leaving the platform. MLflow tracking keeps experiments reproducible, and Unity Catalog gives us the data governance story our banking client needs. It's cut down a significant amount of the coordination overhead that comes with multi-tool ML pipelines.
Unifies Data Processing with Delta Lake's Reliability
What do you like best about the product?
I use Databricks in my enterprise environment and projects to ingest data from multiple sources, transform and clean it at scale, and prepare reliable datasets for analytics and reporting. It allows me to build and manage data pipelines efficiently using Spark, SQL, and notebooks. I love having data ingestion, large-scale processing, analytics, and collaboration all in one place, making my workflow much more streamlined and efficient. I really value the reliability and confidence I get from features like Delta Lake, which make data versioning, recovery, and handling changes much safer, cheaper, and easier in my projects. Delta Lake is one of the main reasons Databricks is so valuable to me because it directly addresses reliability and trust, which are constant challenges in real data projects. The ability to rollback to a previous version if something goes wrong makes me much more confident when developing, testing, or deploying changes to production pipelines. Additionally, the initial setup was relatively straightforward because Databricks integrates well with our existing cloud infrastructure.
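The versioning and rollback the reviewer relies on can be pictured as a table addressed by version number, where every commit appends a new snapshot. A toy plain-Python sketch of the idea follows; it is not Delta Lake's implementation, only an illustration, with the real Delta SQL operations noted in the comments:

```python
class VersionedTable:
    """Toy model of Delta-style versioning: every write creates a new version."""

    def __init__(self, rows):
        self.versions = [list(rows)]  # version 0

    def write(self, rows):
        self.versions.append(list(rows))  # each commit is a new version

    def read(self, version_as_of=None):
        # Delta equivalent: SELECT * FROM t VERSION AS OF <n>
        idx = -1 if version_as_of is None else version_as_of
        return self.versions[idx]

    def restore(self, version):
        # Delta equivalent: RESTORE TABLE t TO VERSION AS OF <n>
        self.write(self.versions[version])


t = VersionedTable(["good row"])
t.write(["bad row"])      # a faulty deployment overwrites the data
print(t.read())           # ['bad row']
t.restore(0)              # roll back to the known-good version
print(t.read())           # ['good row']
print(len(t.versions))    # 3 -- the restore is itself a commit, history survives
```

The last line captures why rollback feels safe: restoring does not erase history, it adds a new version, so even the rollback itself is auditable.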
What do you dislike about the product?
The learning curve can be quite steep at the beginning, especially for users who are new to Spark or large-scale data processing concepts. Debugging complex pipelines or job failures can sometimes be time-consuming when error messages are not very intuitive. As workflows and environments grow, governance and environment management can require extra effort to keep everything well-organized and consistent. Cost management is another challenge, as resource usage can increase quickly if clusters and jobs are not configured or monitored carefully.
What problems is the product solving and how is that benefiting you?
I use Databricks to solve fragmentation and inefficiency in my data flow, handling ingestion, transformation, analytics, and collaboration on one platform. It reduces operational overhead, ensures data quality, and offers scalability, improving large data processing without infrastructure worries.
Databricks Genie and AgentBricks Make “Talk to Data” Easy
What do you like best about the product?
In Databricks I like Genie and AgentBricks, which help me solve business processes by letting people talk to the data.
What do you dislike about the product?
I think all the functionality works as expected for me.
What problems is the product solving and how is that benefiting you?
It’s mainly about giving my business users more flexibility to talk directly to the data and run their own analysis without needing to know SQL.
Databricks Saves Time with Smooth, High-Performance Data Pipelines
What do you like best about the product?
Databricks saves time by automating data pipelines, improving performance, and reducing infrastructure management.
Overall, it provides a smooth experience for building, analyzing, and deploying data solutions.
What do you dislike about the product?
Databricks provides strong capabilities for large‑scale data processing and collaboration, but there are areas for improvement.
What problems is the product solving and how is that benefiting you?
We use Databricks for building and managing large‑scale data pipelines and analytics workloads.
It helps us process high‑volume data faster by using scalable Spark clusters and automated workflows.
Databricks’ Unified Platform: Fast SQL, Streamlined Pipelines, and Context-Aware AI
What do you like best about the product?
The unified platform experience is what keeps me on Databricks. Having notebooks, pipelines, SQL warehouses, ML, and governance all in one place under Unity Catalog means I’m not constantly stitching together five different tools just to get work done.
Lakeflow Pipelines (formerly DLT) makes it straightforward to build medallion-architecture pipelines, and the Photon engine delivers real performance gains on SQL workloads without requiring any code changes. Recent additions like Genie Code and background agents also show they’re serious about agentic AI—it doesn’t feel like a bolt-on copilot, because it can actually understand your data context through Unity Catalog. Serverless compute has been another big quality-of-life improvement as well, since I no longer have to wait for cluster spin-up when I just want to run quick, ad hoc queries.
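The medallion architecture mentioned above layers raw (bronze), cleaned (silver), and aggregated (gold) tables. A minimal plain-Python sketch of that shape follows; the field names are invented for illustration, and in Lakeflow each stage would be a declarative table definition rather than a function:

```python
# Bronze: raw events, kept exactly as ingested (including bad records).
bronze = [
    {"user": "a", "amount": "10"},
    {"user": "b", "amount": "oops"},   # malformed record
    {"user": "a", "amount": "5"},
]

def silver(rows):
    """Clean and type the raw records; drop those that fail expectations."""
    out = []
    for r in rows:
        try:
            out.append({"user": r["user"], "amount": float(r["amount"])})
        except ValueError:
            pass  # a real pipeline would quarantine, not silently drop
    return out

def gold(rows):
    """Aggregate the cleaned records into a reporting-ready table."""
    totals = {}
    for r in rows:
        totals[r["user"]] = totals.get(r["user"], 0.0) + r["amount"]
    return totals

print(gold(silver(bronze)))   # {'a': 15.0} -- b's only record was malformed
```

The value of the layering is that each stage has one job: bronze preserves evidence, silver enforces quality, and gold serves consumers, so a bad record never has to be chased through a monolithic transformation.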
What do you dislike about the product?
Cost management can be tricky—DBUs add up quickly if you’re not careful with cluster sizing and warehouse auto-scaling. The pricing model also isn’t always transparent, especially when you’re mixing serverless and classic compute.
Unity Catalog is powerful, but the initial setup and the migration from legacy HMS can be painful, particularly for large orgs with years of existing Hive metastore objects. The documentation is generally good, yet it sometimes lags behind new feature releases. On top of that, the workspace UI can feel sluggish at times, especially when you’re working with a large number of assets.
What problems is the product solving and how is that benefiting you?
Before Databricks, our data stack was fragmented — separate tools for ETL, analytics, ML, and governance. That meant constant context-switching, duplicated data, and governance gaps. Databricks consolidates all of that into one lakehouse platform. Delta Lake gives us reliable ACID transactions on the data lake, Unity Catalog handles lineage and access control across the board, and SQL warehouses let our analysts self-serve without needing a separate data warehouse product. It's cut our pipeline development time significantly and made data governance something we can actually enforce consistently instead of hoping for the best.
Databricks Genie A/BI and Genie Code: Amazing Features on My Favorite Platform
What do you like best about the product?
I like almost all the features; as a Databricks MVP, it is always my favourite platform. If I had to pick one thing, it would be Databricks Genie A/BI and Genie Code: Genie really is an amazing name and an amazing feature.
What do you dislike about the product?
Only that the Databricks team is not hiring me. I am one of the greatest fans of Databricks.
What problems is the product solving and how is that benefiting you?
More visibility and simpler workflows: data access is no longer a challenge for anyone.
Databricks Unifies Data Engineering, Science, and Analytics Exceptionally Well
What do you like best about the product?
The ability to converge data engineering, data science, and analytics on a single platform without compromising on governance, performance, or flexibility is still rare in the industry. Databricks executes this exceptionally well.
What do you dislike about the product?
The spin-up time of all-purpose clusters and job clusters could be reduced. It would be more useful and helpful if they started as quickly as serverless compute does.
What problems is the product solving and how is that benefiting you?
In enterprise banking, where regulatory compliance, data accuracy, and operational resilience are non-negotiable, Databricks is solving some of our most critical challenges. As a Lead Data Engineer managing end-to-end ETL pipelines, dashboard delivery, monitoring alerts, and data governance for a major banking client, the platform has become the backbone of our modern data architecture.

Databricks unifies our fragmented data landscape through Delta Lake and Unity Catalog, giving us ACID-compliant transactions for reliable ETL, automated lineage for audit-ready governance, and fine-grained access controls to protect sensitive PII and financial data—all while enabling seamless schema evolution to handle the constant changes in source systems. This directly translates to faster, more trustworthy reporting: our dashboards in Power BI and Tableau now pull from a single source of truth, eliminating metric disputes between Risk, Finance, and Compliance teams. On the operational side, native alerting integrated with Slack and PagerDuty, combined with Databricks System Tables for observability, lets us proactively catch data quality issues or SLA breaches before they impact business decisions—reducing incident resolution time by over 60%.

Ultimately, Databricks isn't just improving our engineering efficiency; it's enabling us to innovate responsibly in a highly regulated environment, delivering trusted insights at scale while keeping auditors confident and stakeholders aligned.
Versatile Platform, But Needs Faster Analysis
What do you like best about the product?
I like Databricks because it allows me to perform multiple tasks on a single platform, a capability I haven't found in other cloud service platforms. This is particularly useful for managing database tasks efficiently.
What do you dislike about the product?
I have often faced situations where something goes wrong in Databricks and it takes a while to analyze. It would be better if the platform gave faster and more accurate answers when diagnosing problems.
What problems is the product solving and how is that benefiting you?
I use Databricks for migration projects. It allows me to perform multiple tasks on a single platform, which I can't do on other cloud platforms.
Unified Data Engineering, Analytics, and ML on a Scalable Databricks Platform
What do you like best about the product?
What I like most about Databricks is how it brings data engineering, analytics, and machine learning together in one platform. It streamlines the entire data pipeline—from ingestion and transformation through to serving—so I don’t have to rely on multiple separate tools to get end-to-end workflows done.
Its integration with Spark and Delta Lake is another big plus, making it both scalable and dependable when working with large datasets.
What do you dislike about the product?
One challenge with Databricks is cost management and visibility. Since compute is abstracted through clusters and jobs, it can sometimes be difficult to track and optimize costs without additional monitoring or governance in place.
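A common mitigation for the visibility problem described above is tagging compute by team or project and aggregating usage by tag. A toy sketch in plain Python; the record fields and the $0.50/DBU price are invented for illustration, and on Databricks the real records would come from system billing tables rather than a hard-coded list:

```python
# Hypothetical usage records tagged by team; on Databricks these would come
# from system billing tables rather than a hard-coded list.
usage = [
    {"team": "etl",       "dbus": 120.0},
    {"team": "analytics", "dbus": 40.0},
    {"team": "etl",       "dbus": 60.0},
]

def cost_by_team(records, dbu_price):
    """Aggregate DBU usage per team tag and convert it to a dollar estimate."""
    totals = {}
    for r in records:
        totals[r["team"]] = totals.get(r["team"], 0.0) + r["dbus"]
    return {team: dbus * dbu_price for team, dbus in totals.items()}

print(cost_by_team(usage, dbu_price=0.50))
# {'etl': 90.0, 'analytics': 20.0}  (at an assumed $0.50/DBU)
```

Even a simple rollup like this turns "compute is abstracted away" into per-team numbers that can be reviewed before the monthly bill arrives.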
What problems is the product solving and how is that benefiting you?
Solves the problem of fragmented data ecosystems, where data engineering, analytics, and machine learning are handled in separate tools.
Databricks Brings Spark, Delta, and ML Together with Effortless Auto-Scaling
What do you like best about the product?
Databricks is hands down my favorite platform for data engineering because it brings everything together in one place: Spark processing, Delta Lake, and ML tools all play nicely together without the usual headaches. The auto-scaling clusters save tons of time on big ETL jobs, like the SAP integrations I've done, letting me focus on logic instead of babysitting resources. Unity Catalog has been a game changer for governance in our lakehouse setups too.
What do you dislike about the product?
Costs can sneak up fast if you're not watching usage closely, especially with premium features on large pipelines. The notebooks are great for prototyping but get messy in production without strict discipline. Setup for advanced stuff like custom Unity Catalog policies sometimes feels overly complex for what it delivers.
What problems is the product solving and how is that benefiting you?
Databricks tackles key data engineering headaches like scaling massive Spark jobs, data quality issues, and siloed teams by providing a unified lakehouse platform with Delta Lake for ACID transactions and reliable pipelines. When I have a large number of files or tables to process, like in supply-chain ETL from SAP systems, it shines with optimized Delta processing, serverless compute, and the Photon engine, slashing run times from days to hours while cutting costs through auto-scaling. This benefits me directly by speeding up project delivery, reducing debugging time on failures, and enabling seamless collaboration with analysts in notebooks without tool switches.