Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
763 reviews
External reviews are not included in the AWS star rating for the product.
Databricks’ Unified Platform: Fast SQL, Streamlined Pipelines, and Context-Aware AI
What do you like best about the product?
The unified platform experience is what keeps me on Databricks. Having notebooks, pipelines, SQL warehouses, ML, and governance all in one place under Unity Catalog means I’m not constantly stitching together five different tools just to get work done.
Lakeflow Pipelines (formerly DLT) makes it straightforward to build medallion-architecture pipelines, and the Photon engine delivers real performance gains on SQL workloads without requiring any code changes. Recent additions like Genie Code and background agents also show they’re serious about agentic AI—it doesn’t feel like a bolt-on copilot, because it can actually understand your data context through Unity Catalog. Serverless compute has been another big quality-of-life improvement as well, since I no longer have to wait for cluster spin-up when I just want to run quick, ad hoc queries.
What do you dislike about the product?
Cost management can be tricky—DBUs add up quickly if you’re not careful with cluster sizing and warehouse auto-scaling. The pricing model also isn’t always transparent, especially when you’re mixing serverless and classic compute.
Unity Catalog is powerful, but the initial setup and the migration from legacy HMS can be painful, particularly for large orgs with years of existing Hive metastore objects. The documentation is generally good, yet it sometimes lags behind new feature releases. On top of that, the workspace UI can feel sluggish at times, especially when you’re working with a large number of assets.
What problems is the product solving and how is that benefiting you?
Before Databricks, our data stack was fragmented — separate tools for ETL, analytics, ML, and governance. That meant constant context-switching, duplicated data, and governance gaps. Databricks consolidates all of that into one lakehouse platform. Delta Lake gives us reliable ACID transactions on the data lake, Unity Catalog handles lineage and access control across the board, and SQL warehouses let our analysts self-serve without needing a separate data warehouse product. It's cut our pipeline development time significantly and made data governance something we can actually enforce consistently instead of hoping for the best.
Databricks Genie AI/BI and Genie Code: Amazing Features on My Favorite Platform
What do you like best about the product?
As a Databricks MVP, I like almost all of the features, and it has always been my favourite platform. If I had to pick one, it would be Databricks Genie AI/BI and Genie Code. Genie... really an amazing name and an amazing feature.
What do you dislike about the product?
The Databricks team is not hiring me. I am one of the biggest fans of Databricks.
What problems is the product solving and how is that benefiting you?
It gives more visibility and keeps making things easier, and data access is no longer a challenge for anyone.
Unified Lakehouse Powerhouse: Fast, Scalable Analytics in One Databricks Workspace
What do you like best about the product?
What I like best about Databricks is the unified lakehouse platform. Everything—ingestion with Auto Loader/Lakeflow, Delta Live Tables pipelines, Spark transformations, SQL analytics, MLflow experiments, and governance via Unity Catalog—lives in one workspace. No more tool sprawl. Delta Lake delivers reliable ACID transactions, time travel, and schema evolution on massive datasets, while Photon makes queries fly. Serverless compute simplifies scaling, and collaboration in notebooks/repos is seamless for data teams.
What do you dislike about the product?
What I dislike most is the cost. It can spike quickly with poorly tuned jobs, forgotten clusters, or over-provisioning—DBU pricing adds up fast even with optimizations. Cold starts on interactive clusters slow quick prototyping, and it's overkill (and expensive) for tiny datasets or simple queries. The Spark/Delta learning curve is steep for newcomers, and heavy use creates some vendor-specific lock-in.
What problems is the product solving and how is that benefiting you?
Databricks solves data silos, unreliable lakes, and fragmented tooling by providing a governed lakehouse where raw data becomes clean, queryable assets for BI and AI. This benefits me by cutting infra firefighting so I can focus on pipelines and quality; for the business, it means faster insights, better data reliability, easier AI adoption, and less tool sprawl, delivering real value from petabyte-scale data without constant re-architecture.
Databricks Unifies Data Engineering, Science, and Analytics Exceptionally Well
What do you like best about the product?
The ability to converge data engineering, data science, and analytics on a single platform without compromising on governance, performance, or flexibility is still rare in the industry. Databricks executes this exceptionally well.
What do you dislike about the product?
The spin-up time of all-purpose clusters and job clusters needs to be reduced. It would be more useful and helpful if they started as quickly as serverless.
What problems is the product solving and how is that benefiting you?
In enterprise banking, where regulatory compliance, data accuracy, and operational resilience are non-negotiable, Databricks is solving some of our most critical challenges. As a Lead Data Engineer managing end-to-end ETL pipelines, dashboard delivery, monitoring alerts, and data governance for a major banking client, the platform has become the backbone of our modern data architecture. Databricks unifies our fragmented data landscape through Delta Lake and Unity Catalog, giving us ACID-compliant transactions for reliable ETL, automated lineage for audit-ready governance, and fine-grained access controls to protect sensitive PII and financial data—all while enabling seamless schema evolution to handle the constant changes in source systems. This directly translates to faster, more trustworthy reporting: our dashboards in Power BI and Tableau now pull from a single source of truth, eliminating metric disputes between Risk, Finance, and Compliance teams. On the operational side, native alerting integrated with Slack and PagerDuty, combined with Databricks System Tables for observability, lets us proactively catch data quality issues or SLA breaches before they impact business decisions—reducing incident resolution time by over 60%. Ultimately, Databricks isn't just improving our engineering efficiency; it's enabling us to innovate responsibly in a highly regulated environment, delivering trusted insights at scale while keeping auditors confident and stakeholders aligned.
Great Governance and UI—Databricks Fits Our ETL Workflow Perfectly
What do you like best about the product?
I like the overall environment, especially the governance features and the way the UI is handled. I primarily use Databricks as my ETL platform, and it fits well with how I work. The SDP job management, governance, and lineage capabilities are also helpful.
What do you dislike about the product?
Sometimes there are glitches in the UI. For example, if I cancel something, it takes a bit longer for that change to be reflected in the UI.
What problems is the product solving and how is that benefiting you?
It addresses centralized database and lakehouse management through Unity Catalog. It has also helped solve governance needs and improved lineage tracking.
Versatile Platform, But Needs Faster Analysis
What do you like best about the product?
I like Databricks because it allows me to perform multiple tasks on a single platform, which isn't possible with some other cloud service platforms. This functionality is particularly useful for managing database tasks efficiently and is a capability I can't find in other platforms.
What do you dislike about the product?
I have often run into situations where something goes wrong in Databricks and it takes some time to analyze. It would be better if they gave faster and more accurate answers.
What problems is the product solving and how is that benefiting you?
I use Databricks for migration projects. It allows me to perform multiple tasks on a single platform, which I can't do on other cloud platforms.
Comprehensive Platform with Room for Improvement
What do you like best about the product?
I find Databricks to be a one-stop solution because it incorporates various functionalities such as orchestrating pipelines. It also has an inbuilt AI called Genie, which helps with building jobs and other AI-related tasks. I appreciate that, compared to other providers like AWS and Azure, Databricks offers specific features that they lack, allowing me to use the database simply and access everything in one place. The initial setup was quite easy because I could implement and update tables directly in the data lakehouse from a single place, which is easier than with other platforms.
What do you dislike about the product?
I think Databricks could improve on the orchestration part. Even though it has orchestration capabilities for pipelines and jobs, it lacks the ease of access that something like Airflow provides, which is specifically designed for orchestration. It would be helpful if Databricks adopted a pattern similar to Airflow's for better orchestration and job linking. I also feel Genie could be improved. While Genie works well, responses can be slow, usually taking five to ten minutes or more to perform specific tasks. So, I would like to see improvements in that area as well.
What problems is the product solving and how is that benefiting you?
I use Databricks as a one-stop solution for various tasks. It orchestrates pipelines and utilizes an inbuilt AI, making it more feature-rich than alternatives like AWS or Azure. This allows me to streamline workflows without relying on multiple providers.
Unified Data Engineering, Analytics, and ML on a Scalable Databricks Platform
What do you like best about the product?
What I like most about Databricks is how it brings data engineering, analytics, and machine learning together in one platform. It streamlines the entire data pipeline—from ingestion and transformation through to serving—so I don’t have to rely on multiple separate tools to get end-to-end workflows done.
Its integration with Spark and Delta Lake is another big plus, making it both scalable and dependable when working with large datasets.
What do you dislike about the product?
One challenge with Databricks is cost management and visibility. Since compute is abstracted through clusters and jobs, it can sometimes be difficult to track and optimize costs without additional monitoring or governance in place.
What problems is the product solving and how is that benefiting you?
Solves the problem of fragmented data ecosystems, where data engineering, analytics, and machine learning are handled in separate tools.
Databricks Brings Spark, Delta, and ML Together with Effortless Auto-Scaling
What do you like best about the product?
Databricks is hands down my favorite platform for data engineering because it brings everything together in one place: Spark processing, Delta Lake, and ML tools all play nice without the usual headaches. The auto-scaling clusters save tons of time on big ETL jobs, like the SAP integrations I've done, letting me focus on logic instead of babysitting resources. Unity Catalog has been a game changer for governance in our lakehouse setups too.
What do you dislike about the product?
Costs can sneak up fast if you're not watching usage closely, especially with premium features on large pipelines. The notebooks are great for prototyping but get messy in production without strict discipline. Setup for advanced stuff like custom Unity Catalog policies sometimes feels overly complex for what it delivers.
What problems is the product solving and how is that benefiting you?
Databricks tackles key data engineering headaches like scaling massive Spark jobs, data quality issues, and siloed teams by providing a unified lakehouse platform with Delta Lake for ACID transactions and reliable pipelines. When I have a large number of files or tables to process, as in supply chain ETL from SAP systems, it shines with optimized Delta processing, serverless compute, and the Photon engine, slashing run times from days to hours while cutting costs through auto-scaling. This benefits me directly by speeding up project delivery, reducing debugging time on failures, and enabling seamless collaboration with analysts on notebooks without tool switches.
Seamless Integration, Needs Performance Tuning
What do you like best about the product?
I think the most useful part of Databricks is its single architecture where you can have everything, like a database and dashboard, all in one. Compared to other providers like Azure or AWS, where I would need multiple services, Databricks offers everything in a single service. This simplifies my work because I don't have to manage integration or network level details across different services. The convenience of having everything inside Databricks means I can avoid multiple network updates when connecting with tools like Power BI, which makes it a standout feature for me. Additionally, the initial setup after migrating from Snowflake was pretty easy since Databricks allows us to manage access and security within a single service.
What do you dislike about the product?
One thing that needs to be improved is Genie Code. It is helpful for generating code, but it consumes a lot of memory while it runs in the background. For example, when I open Databricks in Chrome, it takes at least 1 to 2 GB of memory, and it also takes a long time to generate a response. It would be great if that could be reduced. Also, on the pipeline side, Airflow is specifically designed for orchestration. We use Airflow, and if I have thousands of jobs, I can see each and every job and what's happening. With Databricks, it's tough for me to see the successes and failures and to manage the charts. There are multiple options for monitoring in Databricks, but it's harder than with Airflow.
What problems is the product solving and how is that benefiting you?
Databricks helps us consolidate data from different locations into a single database, simplifying master data management and making data access easier with integrated dashboards, improving our AI-powered sales and prospect tracking.