Real-time monitoring has improved performance tracking and has simplified analyzing complex metrics
What is our primary use case?
I work in data analytics with experience in monitoring systems and working with large-scale data. I have used Splunk Observability Cloud in the context of real-time monitoring and performance tracking.
Splunk Observability Cloud works well alongside Splunk Enterprise for logs and integrates with cloud platforms and monitoring tools. It is often used together with other observability solutions. The tracking metrics such as latency, error, and throughput are easily visible. I can also build dashboards for real-time visibility.
We use Splunk Observability Cloud to track latency metrics and identify where slowdowns are happening. We have visualized response time trends and quickly detected performance degradation. We have also used it for infrastructure monitoring. Over the past six months, we have been monitoring metrics such as CPU usage and memory. If there is unusual usage, we identify it quickly using this tool and take action before it impacts our performance.
What is most valuable?
Splunk Observability Cloud has optimized our solutions and helped us understand the metrics. The AI-powered guidance in Splunk Observability Cloud helps us identify patterns and anomalies in system performance data. Instead of manually going through a large volume of metrics, it highlights unusual behavior and potential issues automatically. This makes it easier to detect problems early and understand where to focus, especially in complex systems.
There is definitely log analysis and dashboards. Log monitoring and dashboards have been better using Splunk. Splunk Observability Cloud is the best tool for log monitoring and dashboards. Splunk Observability Cloud feels more focused on real-time metrics and performance tracking compared to some other traditional log-based tools.
What needs improvement?
The learning curve for understanding all features should be improved, and the cost can increase. Splunk Observability Cloud is very costly. Cost is one of the drawbacks.
Sometimes too many alerts, if not configured properly, is a major drawback that could be improved.
The prices are quite high. As I have mentioned earlier, we are Splunk partners, so this has been handled by my other team. However, for other companies and small startups, the prices are very high for them to use Splunk Observability Cloud. Price is a concern.
For how long have I used the solution?
I have been working with Splunk Observability Cloud for the past six to eight months.
What do I think about the scalability of the solution?
We have expanded our team and usage. We are scaling up right now from ten people to twenty-five or thirty. Over time, I expanded my usage by going through basic monitoring and exploring things like setting up custom dashboards. We have gradually expanded our usage from setting up dashboards and alerts.
How are customer service and support?
For customer service, I would rate them eight out of ten because whenever we raise a support case, they are always available for us.
For Splunk real user monitoring, implementation took time because our engineers tried very hard. In case of support, there should be more engineers specifically for this case.
Which solution did I use previously and why did I switch?
We have used different products like Palo Alto and Cribl before moving to Splunk Observability Cloud. As we got a partnership, we have shifted to Splunk Observability Cloud.
What was our ROI?
The information is confidential and I cannot share specific details. However, I can tell you in percentage that fifty to sixty percent of our work has been easy to identify in terms of performance metrics and performance using Splunk Observability Cloud.
It has saved us thirty to forty percent in cost because we used some other tools before that were more costly. As we are Splunk partners, we obtained Splunk Observability Cloud, and our costs have been reduced by thirty to forty percent using this solution.
What other advice do I have?
My overall impression of using Splunk Observability Cloud is that it is a strong tool for real-time monitoring. It does take some time to get fully comfortable with all the features. We have not explored everything right now, but in the future, we are looking forward to using more features.
A part of the implementation has been handled by my other team. I have explored using custom metrics to enrich observability data, mainly by adding application layer or business-related metrics alongside system metrics. I have used custom metrics in a limited way to add more context to monitoring, such as tracking application-specific metrics alongside system data.
Dashboard customization in Splunk Observability Cloud is quite flexible. We care about metrics in different types of visualization, and it helps us organize them in a way that makes sense for monitoring. It allows us to build dashboards tailored to specific use cases. This makes it easier to monitor system performance and quickly identify issues without going through unnecessary data.
The integration in real user monitoring from Splunk Observability Cloud is actually better than from some other tools. If you are looking for the best SIM tool, then Splunk Observability Cloud is for you. If you have funds and capability for the cost, then Splunk Observability Cloud is definitely the best tool you can use.
I have given this review an overall rating of nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Observability has exposed tracing gaps and inconsistent metrics while still mapping complex services
What is our primary use case?
In my organization, we have 150 to 160 applications yearly with different frameworks including .NET, Java, and Python based applications. All of them are hosted on different types of servers such as Windows, Linux, ECS, and EKS. With respect to deployments, we integrated Splunk Observability Cloud. Previously, we used Prometheus and Grafana. My organization considered Splunk Observability Cloud to be a premium side of observability, so they switched from our previous solution.
We use the tracing feature in Splunk Observability Cloud.
What is most valuable?
I appreciate the service map and APM in Splunk Observability Cloud the most. This is the main feature I value. The interface is completely UI based, so I can see the complete service map, observe the latency present, and view complete metadata for a particular service or any database-related service. The service map enables a 3D view of the complete application architecture.
With respect to the effectiveness of Splunk Observability Cloud in improving digital resilience within the organization, it was quite similar to other third-party tools. The main distinction is that it has some improved security. We use SignalFlow queries, and with respect to those queries, we work with alerts and the dashboarding part. I can say it provides efficiency with improved security compared to other third-party tools, but in terms of usage, it is quite similar to Prometheus and Grafana.
What needs improvement?
I want to address a disadvantage regarding the service map showing misinformation with respect to latency, which relates to data reliability pulled from AWS cloud or on-premise servers. We saw issues with latency because Splunk APM app shows different data than Prometheus and Grafana. We tried to get premium support and on-call support with Splunk, and they were helpful in troubleshooting, but they ended up with no solution.
Performance with Splunk Observability Cloud is acceptable to me, but the modifications required by users are problematic. I had to build the complete alerting system and monitoring system, which had to be changed. The way they designed this is not optimal. If I compare with Prometheus, we can import and export dashboards, but here we face errors with dialogue boxes. We tried with technical support calls about this, but they were unable to solve it, so I do not understand why export and imports are not functioning.
The overall impression of the no-sample tracing feature in Splunk Observability Cloud, specifically in terms of eliminating blind spots in data collection, is that it needs improvement because the data is not adequate compared to other third parties. We get disturbance in the dashboards and charts while trying to correlate data. The mechanism functions differently manually than it does with a SignalFlow query, and both should be equal. We are unable to replicate from manual processes to the automation method, which is the issue.
The SignalFlow query feature in Splunk Observability Cloud needs improvement because it should function the same as manual processes. When we configure manual queries and then configure them via SignalFlow, they give different outputs. We tried with on-call support about this, but they were unable to address it, indicating there is a bug with the queries that needs improvement.
For enhancements, I would like to see improvements in the OTEL agents, OTEL collectors, and other features in Splunk Observability Cloud. The guidelines in the official documentation are not working at all. We have to deploy processes in our own way, and the documentation works only in 60 percent of the conditions, leaving the remaining 40 percent as problematic and needing improvement.
For how long have I used the solution?
I have used Splunk Observability Cloud for nearly one to one and a half years.
What do I think about the stability of the solution?
I experienced a downtime with Splunk Observability Cloud one time. We were unable to access it for nearly one day, which took a lot of time to resolve. Normally, other tools do not take as much time, and I do not understand why Splunk took so long. From the vendor's end, they should address such issues in a much shorter timeframe. When downtime occurs, it raises concerns about how we measure and receive alerts, as everything needs to be in place.
What do I think about the scalability of the solution?
In terms of lowering the cost of unplanned digital downtime using Splunk Observability Cloud, I found that many users report it is expensive, especially at a large scale, which can be a concern for organizations with tight budgets. At a large scale it is good, but for start-ups and some medium-range companies, it is expensive and they cannot afford it, especially as the cost increases with respect to data volume and retention needs.
How are customer service and support?
Support wise, there are two kinds of support for Splunk Observability Cloud: bi-weekly support and on-call support, with one more being premium support. They need to decrease the price of premium on-call support because as an employee, we require credits to get premium support, and our organization does not have many credits. That is a point where it lagged, but with respect to the bi-weekly calls and on-call support, it was acceptable. Out of five, I can give three for normal support, and four for premium call support.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
Previously, we used Prometheus and Grafana.
Which other solutions did I evaluate?
In comparing Splunk Observability Cloud to other observability platforms I have worked with, I find no key differences in both pros and cons. The integration process is the same across the board, and I feel there is not a real differentiator, as everything is similar in terms of custom dashboards and APM features.
What other advice do I have?
We miss the synthetic monitoring and AI-related features in Splunk Observability Cloud, which I think means front-end monitoring. We touch only the main AWS monitoring and service map, APM, and that is what we are using.
Regarding the ability to enrich data with custom metrics in Splunk Observability Cloud, we configured our breaches based on application performance only. Every application has different SLAs and SLOs, and according to each application, we have configured alerts using baselines that got triggered. We correlate this with multiple factors, such as Java-based memory leaks or garbage collections, and we generate custom metrics with alerts for notification purposes, employing the Webhook URL of Microsoft Teams and Outlook.
The out-of-the-box customizable dashboards provided by Splunk Observability Cloud are effective in showcasing IT performance to business leaders. It offers a nice point, as when we correlate different charts, I get so many x-axis and y-axis options, and we can correlate with other metrics. We have formulas there to find ratios and averages, which was a nice experience offering so many options. We are using the f(x) functions with respect to maximum, minimum, and averages, which are quite good.
On a scale of one to ten where ten is the best, I would rate Splunk Observability Cloud differently. For the UI part, I would rate it an eight, but for the configuration part, I would rate it three to four, as the configuration and integration aspects are not good at all. Overall, I would rate Splunk Observability Cloud a three out of ten.
Which deployment model are you using for this solution?
On-premises
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Monitoring has improved operational visibility and supports fast, customizable alert dashboards
What is our primary use case?
I work for a managed service provider, so I have different clients that require help in assessing various tools. I work with Splunk, ScienceLogic, and Nagios most frequently because I have small clients as well.
We have Splunk Observability Cloud for some customers. The dashboards are good, and everything is nice, but unfortunately, it doesn't have long-term storage of the logs. So you need to use a data lake to store the logs.
I would like to see agentless deployment and better integration with ticketing systems like ServiceNow, which is the biggest.
We utilize the ability to enrich data with custom metrics in Splunk Observability Cloud to create tickets in ServiceNow. It is integrated with ServiceNow, but we enrich the tickets by putting the logs in the tickets and things of that nature, so it helps us. However, even that is a mixed approach. From Splunk Observability Cloud, you cannot put the logs directly in the tickets. Instead, it will create a ticket and send you an email with the logs. That integration could be improved.
What is most valuable?
Splunk Observability Cloud has helped me improve my operational performance and my customer's operational performance because we use alerting, so we find when things are not working.
I think it is very good for evaluating the effectiveness of Splunk Observability Cloud in improving digital resilience within my customer's environment.
It does provide some return on investment. It is beneficial in terms of finance to use it.
The dashboards in Splunk Observability Cloud are amazing. If you configure them correctly, they are amazing, and it is quite fast as well.
That is a very good feature of Splunk Observability Cloud because it helps us and it gives more trust in the alerts.
What needs improvement?
There are not complexities with the installation of Splunk Observability Cloud, but with the configuration of alerts and everything because Splunk has its own language in the background. You need to know Splunk in order to configure everything that you want.
It requires some in-depth knowledge of the product. It should be more plug-and-play, similar to ScienceLogic. ScienceLogic uses whatever it finds. You can use PowerShell, you can use scripts that you make. Splunk is more on the old style. It uses agents, and you have to deploy the agents.
The out-of-the-box customizable dashboards provided by Splunk are okay, but usually, I have to create new dashboards because every user wants to see something else. The out-of-the-box dashboards help to get started faster, but in the end, I will have to redo them.
I would like to see agentless deployment and better integration with ticketing systems such as ServiceNow, which is the biggest.
We utilize the ability to enrich data with custom metrics in Splunk Observability Cloud to create tickets in ServiceNow. It is integrated with ServiceNow, but we enrich the tickets by putting the logs in the tickets and things of that nature, so it helps us. However, even that is a mixed approach. From Splunk Observability Cloud, you cannot put the logs directly in the tickets. Instead, it will create a ticket and send you an email with the logs. That integration could be improved.
For how long have I used the solution?
I have been working with Splunk Observability Cloud for about two years.
What do I think about the stability of the solution?
I cannot speak to lowering the cost of unplanned digital downtime using Splunk Observability Cloud because the client will get the bills. However, it reduces the downtime for systems. It improved visibility when you do changes and you do patching and you do emergency changes, so you can see if they were applied correctly or not, if the servers are still down.
What do I think about the scalability of the solution?
If it is a new deployment and you have a medium client with about 2,000 users or computers or servers, it will take about six months just to install and configure.
How are customer service and support?
The technical support is very good with Splunk.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
I worked with ScienceLogic before actually working with Splunk.
How was the initial setup?
There are not complexities with the installation of Splunk Observability Cloud, but with the configuration of alerts and everything because Splunk has its own language in the background. You need to know Splunk in order to configure everything that you want.
What about the implementation team?
I do not spend any time personally because I have a team that does it. I have 27 people in my team.
What was our ROI?
It does provide some return on investment. It is beneficial in terms of finance to use it.
What's my experience with pricing, setup cost, and licensing?
I think the pricing for Splunk Observability Cloud is still at a good price. If you are looking at Dynatrace, it is way higher.
Which other solutions did I evaluate?
I am familiar with the Dynatrace operator but I am not actually working with them. I am just looking into differences and tooling and what will benefit my clients better.
What other advice do I have?
You need to know Splunk in order to configure everything that you want.
The out-of-the-box customizable dashboards provided by Splunk are okay, but usually, I have to create new dashboards because every user wants to see something else. The out-of-the-box dashboards help to get started faster, but in the end, I will have to redo them.
We utilize the ability to enrich data with custom metrics in Splunk Observability Cloud to create tickets in ServiceNow. It is integrated with ServiceNow, but we enrich the tickets by putting the logs in the tickets and things of that nature, so it helps us. However, even that is a mixed approach. From Splunk Observability Cloud, you cannot put the logs directly in the tickets. Instead, it will create a ticket and send you an email with the logs. That integration could be improved.
I would rate this product an 8 overall.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?