Unified data from diverse sources has created consistent client views and reshaped data strategy
What is our primary use case?
My main use case for Starburst Galaxy is to use it as a data federation tool, collect data from various data sources, and have a unified view of the data.
As a specific example from my daily work, suppose I need data for a client profile from five different data sources, each on a different database platform. I can pull data from all five sources and get a consolidated view of the client.
Those are the main use cases for Starburst Galaxy; basically, we are trying to build data products.
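A consolidated client view of this kind might be sketched as follows. Trino-based engines address tables as catalog.schema.table, which is what allows a single query to join data held on different platforms. Every catalog, schema, table, and column name below is hypothetical, and the helper only builds the SQL text; it does not connect to Starburst Galaxy.

```python
# Sketch only: every catalog, schema, table, and column name is hypothetical.
# Tables are addressed as catalog.schema.table, so one query can span sources.

SOURCES = {
    "crm": "postgres_crm.public.clients",
    "billing": "oracle_billing.finance.accounts",
    "support": "sqlserver_support.dbo.tickets",
}


def build_client_profile_query(sources: dict[str, str], key: str = "client_id") -> str:
    """Return one SELECT that left-joins every source onto the first one."""
    aliases = list(sources)
    base = aliases[0]
    lines = [
        f"SELECT {base}.{key}, " + ", ".join(f"{a}.*" for a in aliases[1:]),
        f"FROM {sources[base]} AS {base}",
    ]
    for alias in aliases[1:]:
        lines.append(
            f"LEFT JOIN {sources[alias]} AS {alias} ON {alias}.{key} = {base}.{key}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_client_profile_query(SOURCES))
```

The sketch uses three sources to stay short; adding two more entries to SOURCES would cover the five-source case.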
What is most valuable?
Starburst Galaxy is very SQL friendly, which stands out for me because I have used SQL in other platforms such as SQL Server, Teradata, and Oracle, so it is very portable with minor changes.
Another feature I appreciate in Starburst Galaxy is that it supports object storage with the Iceberg table format, which helps optimize data storage and also enables columnar search, which speeds up queries.
Starburst Galaxy has positively impacted my organization by allowing us to rethink our data strategy and architect data differently; instead of maintaining multiple siloed data marts, we now work toward a unified vision.
What needs improvement?
One way Starburst Galaxy can be improved is through AI enablement. I have not seen how the user interface is going to function or how users can interact with the data products on Starburst Galaxy using AI, so I am curious to know about that.
I chose a rating of eight because it has many good features, including data federation and the ability to write queries easily. I see room for improvement in AI adaptability and, more generally, in the number of connectors for working with other tools.
For how long have I used the solution?
I have been using Starburst Galaxy for 18 months.
What do I think about the stability of the solution?
Starburst Galaxy is stable in my experience so far.
What do I think about the scalability of the solution?
I do not have enough visibility into the scalability of Starburst Galaxy, but we are adding more and more data sources to it, so I believe it will scale, though results are still pending.
How are customer service and support?
Starburst Galaxy customer support is good.
Which solution did I use previously and why did I switch?
Earlier, we were using traditional databases.
What was our ROI?
I am yet to see the hard numbers regarding return on investment, but I believe it will probably result in money saved and time saved.
What other advice do I have?
My advice to others looking into using Starburst Galaxy would be to first understand your current data environment and make sure that Starburst Galaxy has the right connectors for it. Have a dedicated team from Starburst who can help you through installation and onboarding, and ensure that everyone who will be working in that environment receives good training with proper use cases. I would also recommend setting up a sandbox in your environment and putting Starburst Galaxy in it so that you can get a taste of how it works with your data. I gave this product a rating of eight.
Outstanding Performance and Savings with Robust Governance
What do you like best about the product?
The query performance, governance features, and cost savings stand out when compared to other solutions such as AWS Athena.
What do you dislike about the product?
The user interface could use a more modern design and enhancements to make it easier to use. Additionally, the existing documentation is too basic for enterprise needs; users would benefit from more advanced, production-level examples instead of just beginner tutorials. Finally, the consumption-based pricing model makes it difficult to predict monthly expenses with accuracy.
What problems is the product solving and how is that benefiting you?
This product offers capabilities in data analytics, cataloguing, and data governance. It provides tools that help manage and organize data efficiently, supporting both analysis and oversight. Overall, it addresses key needs in handling and governing data within an organization.
Outstanding Support Team Makes All the Difference
What do you like best about the product?
The support team is very good. They are very helpful and informative.
What do you dislike about the product?
I just feel that the documentation could be better. That's all.
What problems is the product solving and how is that benefiting you?
Data querying, retrieval, enhancing decision making, ability to leverage AI and vector search tools.
Streamlined Data Analytics with Excellent Support
What do you like best about the product?
I like how Starburst gives us more insight into the query execution plan. It's really helpful in understanding how a query is going to execute. It also has lineage and audit features. I find it user-friendly for managing data security with RBAC-based security controls, which are necessary for data governance. Starburst is especially useful for running complex queries, particularly those involving multiple joins and data source systems. It supports multiple languages, which I find very valuable. Setting up Starburst wasn't complex, and I had good support from the solution architect, which made the process successful.
What do you dislike about the product?
Sometimes I face challenges with the query editor. When we work with a large block of source code, the editor is not big enough to troubleshoot and run the query as quickly as in a traditional SQL editor. I would definitely give feedback to the internal team to make it more user-friendly.
What problems is the product solving and how is that benefiting you?
Starburst solves data duplication issues, allowing access to any source without copying data for analytics. It's user-friendly for managing data security with RBAC and facilitates complex queries across multiple data sources, supporting multilingual analytics.
Fast Data Queries and Robust Access Controls
What do you like best about the product?
Starburst simplifies querying data quickly, while also offering useful controls for managing access.
What do you dislike about the product?
If Starburst offered more comprehensive documentation, particularly when new releases come out, it would make it much easier to understand the new features.
What problems is the product solving and how is that benefiting you?
Starburst makes it simple for us to give our customers convenient access to the data they require.
Effortless Data Federation and Granular Governance Made Easy
What do you like best about the product?
Effortlessly federating data is a standout feature, and the ability to apply data governance with a high level of granularity is impressive. I also appreciate how easy it is to use overall.
What do you dislike about the product?
One drawback is the absence of a built-in data processing and orchestration mechanism.
What problems is the product solving and how is that benefiting you?
The platform enables rapid data integration from a variety of sources, spanning different regions and cloud environments. It simplifies feature engineering for AI datasets, making the process more efficient. With a single platform, it delivers governed data to multiple stakeholders, ensuring consistency and control. Automated user access is streamlined through OKTA SAML setup, enhancing security and ease of use. Additionally, curated data can be provided directly to client storage, eliminating the need to store or transfer data locally.
Effortless AI Agent Creation with Robust Features
What do you like best about the product?
Starburst is very easy to use when creating AI agents; it already has broad support and a good number of features. If you want to let AI agents use your data, there are built-in plugins that handle governance and make the setup easy.
What do you dislike about the product?
Lack of community support and reliability. Most companies prefer to go to Azure and have their own AI solution, because reliability is the main concern.
What problems is the product solving and how is that benefiting you?
Creating data pipelines and ease of connecting AI agents with organisational data.
Unified data access improves analytics and simplifies complex processes
What is our primary use case?
I use Starburst Galaxy on AWS as a federated query engine to access our S3-based Iceberg data lake, Snowflake, and Redshift without duplicating data. This enables secure, high-performance analytics and machine learning workloads with consistent governance across all data sources.
How has it helped my organization?
Starburst Galaxy has improved our organization by unifying access to all major data sources, reducing the need for complex ETL processes. In addition to our original use case, it has proven fast and reliable for Iceberg table maintenance, and it has enabled ingestion of Kafka feeds into our AWS S3 data lake, further increasing its value to our data platform.
What is most valuable?
The features I value most are federated querying across S3 Iceberg, Snowflake, and Redshift; native Iceberg table management tools that make maintenance operations simple and performant; and the ability to connect directly to Kafka for streaming ingestion. The federated query capability has also enabled me to build a Sigma Computing dashboard that pulls data from Postgres, BigQuery, and Snowflake through a single Starburst Galaxy connection, greatly simplifying data access and integration.
What needs improvement?
I would like to see better alerting integrations for failures and errors in scheduled tasks and maintenance jobs. I also want support for more connectors such as Kinesis and Firehose, support for more file types such as Avro and JSON, and message queue integration for object storage. A single view of query execution and optimization details, rather than needing to toggle between the Galaxy and Trino UIs, would be helpful. Additionally, enhanced control over account and environment variables, as would be available in the Enterprise edition, would be beneficial.
For how long have I used the solution?
I have used the solution for 1.5 years.
Which solution did I use previously and why did I switch?
I previously used several query engines, including Athena, EMR, Redshift, Snowflake, and BigQuery. Starburst Galaxy’s federated query capabilities allowed me to join data across clouds and platforms, reducing complexity.
What's my experience with pricing, setup cost, and licensing?
I recommend tracking usage metrics from the start, focusing on data scanned and query concurrency, so you can right-size spend. If workloads are steady, you should explore commitment-based pricing for better rates and factor in the operational savings from not having to manage and scale your own Trino or query infrastructure.
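As a rough illustration of tracking those two metrics, the sketch below computes total data scanned and peak query concurrency from a list of query records. The QueryRecord shape is a hypothetical stand-in for whatever query history you export; it is not a Starburst Galaxy API.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    # Hypothetical shape for one row of exported query history.
    start_s: float      # query start, in seconds since some epoch
    end_s: float        # query end, in seconds since the same epoch
    bytes_scanned: int  # data read by the query


def total_bytes_scanned(records: list[QueryRecord]) -> int:
    """Sum the data scanned across all queries."""
    return sum(r.bytes_scanned for r in records)


def peak_concurrency(records: list[QueryRecord]) -> int:
    """Largest number of queries running at the same instant (sweep line)."""
    events = []
    for r in records:
        events.append((r.start_s, 1))
        events.append((r.end_s, -1))
    # At equal timestamps, ends (-1) sort before starts (+1), so
    # back-to-back queries are not counted as overlapping.
    events.sort()
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak
```

Running these over a week or a month of history gives a baseline to compare against cluster size and any commitment-based pricing tier under consideration.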
Which other solutions did I evaluate?
I reviewed several options including Databricks and Dremio. I was an early adopter of Snowflake and still use it as well. Starburst Galaxy was a better fit for my technology stack and developers.
What other advice do I have?
I have found that Starburst Galaxy’s flexibility makes it worth experimenting beyond the initial deployment plan. Features I originally viewed as secondary, such as Iceberg maintenance and Kafka ingestion, have become everyday tools. Building a strong relationship with the Starburst team has also helped me optimize configurations and discover new capabilities faster.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Platform reduces management overhead by deploying multiple clusters and tracking costs efficiently while enhancing performance with low-latency responses
What is our primary use case?
Starburst Galaxy serves as our primary SQL-based data processing engine, a strategic decision driven by its seamless integration with our AWS cloud infrastructure and its ability to deliver high performance with low-latency responses.
The platform provides a comprehensive suite of functionalities that significantly enhance the daily operations of our data engineers and data analysts.
How has it helped my organization?
Starburst Galaxy has been instrumental in reducing the maintenance effort and management overhead of our Trino cluster, which is particularly valuable given our lean platform team responsible for Kovi's data infrastructure.
The platform has enabled us to deploy multiple clusters for different purposes while providing clear cost tracking and utilization monitoring capabilities.
What is most valuable?
The most relevant functionalities today are cluster autoscaling for intensive load periods and automated metadata management through cleaning, compression, and orphaned file deletion in Iceberg.
These capabilities significantly reduce reading costs, storage expenses, and query processing overhead.
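For readers unfamiliar with these maintenance operations, the sketch below lists the table procedures that the open-source Trino Iceberg connector documents for compaction, snapshot expiry, and orphaned file removal. This is an assumption about what the equivalent manual steps look like; Starburst Galaxy's automated maintenance may not issue these exact statements, and the table name is hypothetical.

```python
# Sketch only: the table name is hypothetical, and the automated maintenance in
# Starburst Galaxy may work differently. These are the manual table procedures
# documented for the open-source Trino Iceberg connector.

def iceberg_maintenance_statements(table: str, retention: str = "7d") -> list[str]:
    """Return SQL statements for routine Iceberg table maintenance."""
    return [
        # Compaction: rewrite many small data files into fewer, larger ones.
        f"ALTER TABLE {table} EXECUTE optimize",
        # Remove snapshots older than the retention threshold.
        f"ALTER TABLE {table} EXECUTE expire_snapshots(retention_threshold => '{retention}')",
        # Delete files in the table location that no snapshot references.
        f"ALTER TABLE {table} EXECUTE remove_orphan_files(retention_threshold => '{retention}')",
    ]


if __name__ == "__main__":
    for statement in iceberg_maintenance_statements("lake.sales.orders"):
        print(statement + ";")
```

Fewer, larger files and fewer retained snapshots mean less to read and store, which is consistent with the cost reductions described above.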
What needs improvement?
I maintain weekly conversations with Starburst's development and support teams, which provides me with visibility into the product roadmap and evolution.
Currently, my primary need is the impersonation functionality for BI solutions within Starburst clusters, which would enable enhanced access control and data governance capabilities.
For how long have I used the solution?
I have used the solution for almost 2 years.
Which solution did I use previously and why did I switch?
Previously, I utilized the AWS stack with Redshift and Athena.
I chose to migrate to Starburst Galaxy due to their expertise with Trino, superior aggregate cost structure compared to my previous solutions, and the rapid product evolution with new functionalities, problem corrections, and performance improvements.
What's my experience with pricing, setup cost, and licensing?
Since Starburst Galaxy's pricing model is simple to understand and easy to predict, there are no major secrets.
Everything is transparent and accessible through the product console.
The only point of attention is the S3 and transfer costs that should also be included when calculating the total cost.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Guaranteed performance transforms complex queries and empowers focus on feature delivery
What is our primary use case?
I use the solution for processing large simulation datasets into aggregated datasets that can either be used for real-time data analysis or stored for later analysis.
How has it helped my organization?
Starburst has provided us with virtually guaranteed performance on complex queries: queries across datasets in the tens of gigabytes complete in seconds. This allows me to concentrate on the features I want to deliver to our end users rather than diagnosing performance issues.
What is most valuable?
The most valuable features include taking care of the minutiae of Trino management so that it is well-optimized for our use case out of the box. Additionally, the ability to write to Apache Iceberg tables enables complex queries to be written to S3, avoiding the need for them to be re-run repeatedly.
I also find attribute-based access control valuable, as it allows end users to access only their data in a multi-tenant environment.
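The idea behind attribute-based access in a multi-tenant setup can be illustrated with a small sketch in plain Python. This is a conceptual model only, assuming a hypothetical tenant attribute on both users and rows; it is not how policies are configured in Starburst Galaxy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    tenant: str  # hypothetical attribute attached to the user


def visible_rows(user: User, rows: list[dict]) -> list[dict]:
    """Return only the rows whose tenant attribute matches the user's."""
    return [row for row in rows if row.get("tenant") == user.tenant]


if __name__ == "__main__":
    rows = [
        {"tenant": "acme", "run_id": 1},
        {"tenant": "globex", "run_id": 2},
        {"tenant": "acme", "run_id": 3},
    ]
    print(visible_rows(User("alice", "acme"), rows))
```

The access decision depends on attributes of the user and the data rather than on a per-user grant, so onboarding a new tenant does not require a new set of rules.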
What needs improvement?
Multi-tenancy could be improved. In order to have multiple environments for SSO, we maintain multiple tenants that are connected to different AWS accounts via the Marketplace. On the AWS side this setup works because all accounts belong to the same organization. However, on the Starburst side these tenants are disconnected from each other, and it would be great if they could be connected and managed centrally.
Which solution did I use previously and why did I switch?
I previously used Amazon Athena. I switched because the performance offered by Starburst was significantly better than that provided by Athena. Additionally, Starburst allowed for integrations with BI tools, which was difficult to achieve with the necessary level of security in Athena.
What's my experience with pricing, setup cost, and licensing?
I recommend experimenting with different cluster sizes to determine what works best for your particular use case.
Which other solutions did I evaluate?
I considered Amazon Athena and Firebolt as alternative solutions.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)