SLO Monitoring
A strategic product expansion that brought Service Level Objective tracking to our unified monitoring platform, closing a competitive gap while giving teams the reliability metrics they need.
Close the competitive gap
SLO tracking was becoming table stakes in observability. Competitors had shipped; our customers tracked reliability in spreadsheets.
The goal was to deliver native Service Level Objective (SLO) monitoring that would meet and exceed competitor offerings while leveraging Sumo Logic's existing strengths in alerting and data visualization. The solution needed to integrate seamlessly with our unified monitoring ecosystem and give teams the reliability metrics they needed to track service health over time.
SLO tracking was becoming table stakes in the observability market. Competitors like Splunk, New Relic, and Datadog had already shipped SLO products, and customers were starting to ask why we hadn't.
Beyond competitive pressure, our customers had a real need. Engineering teams tracked reliability in spreadsheets and tribal knowledge—never connected to the real-time monitoring data already flowing through our platform. Error budgets existed in theory but not in practice.
SLO dashboard with platform navigation
System flow: SLOs integrated with monitors and alerts
Leading design, growing talent
Led design as Staff Designer with end-to-end ownership. Mentored a junior designer who earned a promotion through this project.
I led design as a recently promoted Staff Designer with end-to-end ownership of the design direction. The core leadership team included a Senior Director of Product Management, a Principal Engineer, and a Senior Engineering Manager.
This was also a mentorship opportunity. I worked with a junior designer on her first major project, having her handle production work while including her in all leadership-level discussions. The goal was to grow her project management capabilities alongside the design work. By project end, she had earned a promotion.
SLO list views with status indicators and filtering
Making SLO concepts understandable
The hardest design problem was conceptual, not technical. Industry terminology described "good" and "bad" events, a framing that user research showed to be confusing.
The hardest design problem wasn't technical — it was conceptual. Industry terminology described service level indicators as tracking "good" or "bad" events. This framing felt wrong from the start: these aren't moral judgments about system behavior.
During research interviews with enterprise customers, we validated this concern. One participant articulated what we'd been sensing: these are markers for desirable or undesirable behavior, not moral categories. We reframed the terminology as "successful" and "unsuccessful" events — a small change that made the concept significantly more intuitive for users new to SLO practices.
This insight came from our quarterly research cadence: initial needs interviews, progress reviews with working prototypes, and acceptance testing during development. The team had agreed to take the time needed to get this right the first time.
SLO configuration with reframed terminology
Query builder with progressive disclosure
Native SLO monitoring
SLOs as first-class citizens: platform-integrated, progressively disclosed, with glanceable dashboards and burn rate alerting.
Rather than building a standalone SLO tool, we designed SLOs as a first-class citizen of our monitoring platform. We built on the guided configuration flow already established in Monitor Configuration, with SLO setup accessible from anywhere in the product. Log searches and metrics could flow to either traditional alerts or long-term SLO tracking.
Design approach
- Platform integration. SLOs built on existing monitors and alerts — no duplicate configuration, no data silos.
- Progressive disclosure. Simple defaults for teams just starting out, with full power available for advanced users.
- Glanceable dashboards. Status for every SLO at a glance: are we healthy, burning fast, or already in trouble?
- Burn rate alerting. Proactive warnings when error budget consumption suggests a breach is coming.
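To make the burn rate idea concrete, here is a minimal sketch of the math behind that kind of proactive warning. This is an illustration only, not the shipped implementation; the function names and thresholds are assumptions.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of events allowed to be unsuccessful,
    e.g. 0.001 for a 99.9% SLO target."""
    return 1.0 - slo_target

def burn_rate(unsuccessful: int, total: int, slo_target: float) -> float:
    """How fast the error budget is being consumed.
    1.0 means the budget lasts exactly the SLO window; higher is faster."""
    observed_error_rate = unsuccessful / total
    return observed_error_rate / error_budget(slo_target)

# A 99.9% SLO allows a 0.1% error budget.
# 50 unsuccessful events out of 10,000 is a 0.5% error rate,
# so the budget is being consumed about 5x faster than sustainable.
rate = burn_rate(unsuccessful=50, total=10_000, slo_target=0.999)
print(round(rate, 6))  # ~5.0: a fast burn worth an alert before the SLO is breached
```

A burn-rate alert fires when this ratio stays above a threshold for some window, which is what lets teams act before the error budget is actually exhausted.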
Dashboard views: overview and detail with error budget charts
Burn rate alert configuration
Competitive and customer success
Closed a key competitive gap. Sales reported stronger positioning; adoption grew steadily as teams expanded from initial experiments.
The launch closed a key competitive gap. Sales teams reported stronger positioning against Splunk and Datadog in enterprise deals, citing the integrated, native approach.
Customer adoption followed a gratifying pattern: teams often started with just one or two SLOs to test the waters, then quickly expanded once they recognized the value. Adoption grew steadily throughout my remaining time at Sumo Logic. Some enterprise customers with existing SLO practices integrated seamlessly; others discovered a practice they hadn't known they needed.
Beyond the product success, this project delivered meaningful personal growth. Leading design while mentoring a junior designer stretched my skills in cross-functional alignment and stakeholder management. The junior designer I mentored was promoted to Senior Product Designer based on her work on this project — an outcome that meant as much to me as the product launch itself.
Final dashboard: reliability at a glance
Configuration edit view
Taking time to get it right
Terminology matters as much as interface design. The shift from "good/bad" to "successful/unsuccessful" came from research and fundamentally improved comprehension.
This project reinforced something I've come to believe deeply: terminology matters as much as interface design. The shift from "good/bad events" to "successful/unsuccessful events" came directly from user research and fundamentally improved how teams talked about their SLOs. It's a small change that made a big difference in comprehension.
Balancing timeline pressure with quality required strong alignment across the team. We had agreed to take the time needed to get this right the first time, and we did. That trust enabled us to make careful, considered decisions rather than rushing to ship.
The mentorship aspect of this project was equally rewarding. Watching a junior designer grow through the process, from production work to leadership-level discussions, reminded me that how we build is as important as what we build. Her promotion was validation that investing in people pays dividends.
Simplified configuration entry point
SLO list: status at a glance across all objectives