My usual use case for Databricks as an end-user mostly involves exporting data. This typically means writing directly in the web interface to get the data out, most likely with Python.
Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
External reviews are not included in the AWS star rating for the product.
Databricks Streamlined Our ETL Migration with Delta Lake and Unified Analytics
The most helpful features for me have been Delta Lake’s ACID transactions and schema evolution, which handle my sparse shipment loads really well. Unity Catalog has also been a big win because it eliminates the back-and-forth of RDS access tickets by enabling governed table sharing. On top of that, Genie turns natural-language requests into production-ready Spark SQL almost instantly.
On the upside, autoscaling clusters have cut costs by about 70% compared with ADF’s always-on pipelines. I also like being able to combine PySpark and SQL in a single notebook, which makes complex joins and subqueries much easier to manage. And I don’t miss the old NOLOCK hint debates—built-in optimizations take care of that.
If you’re migrating ETL pipelines, Databricks removes a lot of the SQL-to-cloud friction while still scaling to enterprise volumes without breaking the bank.
Fast, Governed Self-Service Data Exploration with Databricks Genie
My laptop can hang or become noticeably sluggish when I’m working with multiple Genie tabs and dashboards at the same time, especially during heavier queries or more demanding visualizations. This hurts the overall user experience and can slow down iterative development and analysis.
Latency with complex data models
With very wide schemas or more complex semantic models, Genie sometimes selects suboptimal joins or an overly broad/narrow level of granularity. As a result, I still need to review the generated SQL and optimize it myself. In that sense, it remains a helpful assistant rather than a fully autonomous query engine.
Overall, Genie helps me talk with my data in natural language, improves how quickly we uncover insights, and supports better data‑quality practices—though working across many Genie‑backed tabs can strain local hardware and sometimes slow down the workflow.
Revolutionized HR Analytics with Genie, Minor Cost Concerns
Improved data governance has enabled sensitive data tracking but cost management still needs work
What is our primary use case?
What is most valuable?
The most significant benefit Databricks has brought to my company is the Unity Catalog. Previously, with our data warehouse, we weren't able to track where sensitive data was. The Unity Catalog has been a big improvement, even though we haven't gotten the rest right.
The user interface is very useful, especially for writing directly in the web interface.
From my perspective, the ability to export data effectively and use Python within Databricks are key valuable features.
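As a rough illustration of the kind of Python export I mean, here is a minimal sketch using only the standard library. The function name `rows_to_csv`, the column names, and the sample rows are all hypothetical; in a notebook the rows would come from a query result rather than a hard-coded list.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Serialize a list of dict rows to CSV text, keeping only the given columns."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not in the export column list.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows standing in for a query result.
sample_rows = [
    {"id": 1, "region": "North", "internal_note": "not exported"},
    {"id": 2, "region": "South", "internal_note": "not exported"},
]
csv_text = rows_to_csv(sample_rows, ["id", "region"])
```

The resulting text can then be written to a file or downloaded, depending on how the workspace is set up.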
What needs improvement?
I believe we could improve Databricks integration with cloud service providers. Our current integration has not worked particularly well, and it's becoming very expensive for us. Inefficiencies in our implementation, such as not shutting down warehouses when they're not in use and not reserving the right number of credits, have led to increased costs.
We made several beginner mistakes, such as not taking advantage of incremental loading and running overly complicated queries all the time. We should be using ETL tools to help us instead of doing it directly in Databricks. We need more experienced professionals to manage Databricks effectively, as it's not as forgiving as other platforms such as Snowflake.
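To make the incremental-loading point concrete, here is a minimal sketch of a high-watermark load in plain Python. It is not Databricks-specific and not our actual pipeline; the function name `incremental_load`, the `updated_at` field, and the sample rows are assumptions chosen for illustration. The idea is that each run picks up only rows changed since the previous run instead of reprocessing everything.

```python
from datetime import datetime


def incremental_load(source_rows, target_rows, last_watermark):
    """Append only rows modified after last_watermark and return the new watermark."""
    new_rows = [row for row in source_rows if row["updated_at"] > last_watermark]
    target_rows.extend(new_rows)
    # Advance the watermark to the newest timestamp just loaded, if anything was loaded.
    if new_rows:
        last_watermark = max(row["updated_at"] for row in new_rows)
    return last_watermark


# Hypothetical source table with a last-modified timestamp on each row.
source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 2)},
]
target = []

# First run: nothing loaded yet, so both rows are picked up.
watermark = incremental_load(source, target, datetime.min)

# A new row arrives; the second run loads only that row.
source.append({"id": 3, "updated_at": datetime(2024, 1, 3)})
watermark = incremental_load(source, target, watermark)
```

A full reload would have processed three rows on the second run; the watermark version processes one, which is where the cost saving comes from.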
I think introducing customer reps would facilitate easier implementation with Databricks.
For how long have I used the solution?
I have been working with Databricks for the last six months.
What do I think about the stability of the solution?
As a platform, Databricks is fine. However, our implementation isn't particularly reliable.
We've suffered from the lack of professionals with previous experience, which makes it difficult to dig ourselves out of the situation we've found ourselves in.
What do I think about the scalability of the solution?
The scalability level of Databricks at the moment exceeds our needs. It's not a problem for us.
The sky's the limit with Databricks.
How are customer service and support?
Technical support has been contacted about our issue with Databricks, although it was the team that engaged with them rather than me. I believe our development teams also reach out for support, though I'm not sure what level of support they get.
Previously, when using Snowflake, we had customer reps who were really knowledgeable and helped us to avoid beginner mistakes. With Databricks, it seems we could have benefited from similar support. Our implementation team had no experience and made obvious mistakes. It may be that we opted not to have that support, but I believe we should have.
Which solution did I use previously and why did I switch?
Before Databricks, I used SQL Server.
The decision to switch from SQL Server to Databricks was motivated by the lack of auditing, lineage, and sensitive-data tracking in SQL Server, along with a need for more flexibility.
How was the initial setup?
I did not participate in the initial setup of Databricks.
What about the implementation team?
We use a consultancy, Avanade, for our Databricks implementation. They had previously done a Databricks implementation for another part of our organization. Our implementation team lacked experience, which resulted in several beginner mistakes.
What was our ROI?
So far, we're not measuring any return on investment, such as saving time, money, or resources with Databricks. We're still in the phase where our old system and the new system are running simultaneously, so everything is twice as expensive and much effort is doubled. We haven't progressed far enough yet to realize any ROI.
What's my experience with pricing, setup cost, and licensing?
I believe that in terms of credits for Databricks, we're spending between £15,000 and £20,000 a month.
I think Databricks is priced correctly. If we managed our resources better, we wouldn't be paying anywhere near that amount. The issue is with our management of resources.
Which other solutions did I evaluate?
No other options were considered because we used the consultancy Avanade, which had done a previous Databricks implementation for another part of our organization. We used them to recreate that implementation for us.
What other advice do I have?
I'm probably not the best person to discuss certain aspects of Databricks since I haven't explored it deeply and am not part of the team developing it.
We haven't utilized Databricks' machine learning capabilities.
At my company, data ingestion and transformation are done with Databricks, though I don't do it directly.
I don't use Databricks' features for managing data, such as data lake and warehouse operations.
Most of our current work with Databricks isn't really live yet, so measuring savings in time and money or identifying any return on investment isn't applicable right now.
I would rate Databricks a 7 overall.