I’ve been home from GrafanaCon EU for almost a week and I’m still dreaming of poffertjes. With a sold-out crowd of over 350 attendees and the launch of Grafana 5.0, the Grafana team knocked it out of the park. Everything about the venue, food, after-party, video production, and overall design of the event was on point.
The week got off to a typically cold start in Amsterdam, but things quickly warmed up with talks from Booking.com, eBay, InfluxData, and of course Grafana. Booth business was booming, with conversations about how VictorOps works with microservices, Graphite, and performance metrics and monitoring. One piece of feedback was especially flattering: people really appreciated the design of our product and how it could provide observability for other members of their team.
I was excited to hear this, of course, but it made me wonder: why is design so important to data-driven teams, and what does it mean for observability? Lucky for me, I was at GrafanaCon and could go straight to the experts.
Where to Start?
Hmm… I guess the CEO and co-founder of Grafana might have an opinion about design. So, Raj, what does design mean for observability?
Too often observability, at least in the open source world, is focused on the backend tech and backend systems, but usability and design are not always the things or areas that people pay much attention to. At Grafana, we believe in democratizing metrics and making it easy for people who aren’t necessarily technical or might not be able to write a query to still ask a question and get an answer. Design and usability are really important but often underdeveloped and underappreciated.
Well, That’s It!
Question answered. Right? Well, no. Not necessarily. This makes perfect sense as an explanation of why Grafana is a great tool for visualizing metrics and giving teams visibility, but I guess the better question would have been: what the heck is observability in the first place? Paul, the CTO of InfluxData, had his own opinion:
In general, observability is a property that you need to engineer in a system from day one. The thing most experts can agree on is that to achieve basic observability, you need to have metrics, logs, and tracing. Essentially, if you can combine those three things, then you have a system that is observable. In my mind, you can’t do monitoring without observability. As far as design goes, people who are using CI/CD systems would ideally have dashboards or visibility as soon as they deploy new code and would be able to look at it and agree that everything is okay.
Well said, Paul. If you’re a fan of Influx, make sure to check out his talk on “The Design of IFQL, the New Influx Functional Query Language”. Of course, monitoring and observability are the basis for incident management, and monitoring is, well, hard. So how does design matter to VictorOps as we think about observability? Our VP of engineering, Dan Hopkins, and senior UX designer, Drew McKinney, laid it out very nicely:
I think it starts with the fact that this industry is really good at creating data and is not so good at turning that data into information. And where design comes [in handy] is for helping people help themselves, because they can so easily overwhelm themselves with data. You just look at that and say there is so much to look at, but I don’t know what that means and I don’t know how to hand this off to anyone else on my team because they won’t have context. If you can get the design team to come in and make it so that you can share this data with others outside the team, they’ve done their job. - Dan Hopkins
This is a classic UX problem. There are two things that we are doing to support observability within our platform. Observability within our platform means one of two things. One, bubbling up that information and two, making it relevant in the context that you need it. To us, taking the data and making it relevant is more about answering the question, what are they going to be doing next? - Drew McKinney
So What Does This All Mean?
In the world of monitoring and alerting, there are a lot of unknowns. People look for understanding in confusing data and are tasked with communicating that data throughout the entire organization. Democratizing data and design means making observability available to anyone. From back-end engineers to front-end engineers, from technical people to non-technical people, everyone must be able to access and consume the applicable metrics, logs, and traces that a system is producing. Design in observability enhances how metric-gathering systems work and focuses on giving everyone in the organization valuable context about those systems. We, as an industry, could always use more of this.
That’s why pairing tools like Grafana with VictorOps can help provide a simple, lightweight representation of what may seem like an insurmountable amount of data. These tools help to highlight changes to production while mobilizing humans through awareness, transparency, and observability from deployment to final retro.