Sumo Logic • 2020

Unified Monitoring & Alerting

Building an extensible monitoring platform that replaced fragmented legacy tools and became the foundation for Sumo Logic's alerting ecosystem.

Role: Lead Product Designer
Timeline: 18 months
Company: Sumo Logic

Building for the future

Replace fragmented legacy tools with an extensible monitoring foundation for Sumo Logic's alerting ecosystem.

Sumo Logic needed to transform its monitoring capabilities to retain customers and compete effectively against DataDog, Splunk, and New Relic. The goal wasn't just to build a better monitoring tool — it was to create an extensible foundation that would support the company's alerting ecosystem for years to come.

We set out to replace fragmented legacy tools with a unified platform that could gracefully accommodate future capabilities: outlier alerting, smart alerts, SLO monitoring, and eventually the Alert Response Platform.

Monitor definition: the extensible framework

Ad-hoc tools, expert-level requirements

Legacy scheduled search required deep query expertise and complete upfront configuration, and it offered no central management.

Sumo Logic had built ad-hoc reporting tools that customers were using as makeshift monitoring and alerting solutions. These tools required deep understanding of Sumo's query language to fully exploit — and even experienced users often needed professional services interventions to achieve their desired outcomes.

The legacy scheduled search feature exemplified these problems. It was only available from the log search page, required complete configuration before deployment, hid settings across separate pages, and was notoriously difficult to find once set up. Users couldn't quickly create a simple monitor — they had to configure everything upfront or nothing at all.

Legacy Scheduled Search

  • Only accessible from log search page
  • Required complete configuration before deployment
  • Settings hidden across separate pages
  • Difficult to locate and manage after creation
  • Deep query language expertise required

What Customers Needed

  • Create monitors from anywhere in the app
  • Start simple, add complexity incrementally
  • All settings visible and accessible
  • Central location for monitor management
  • Visual configuration without query expertise

Monitor status page: the management landscape

Leading the design vision

Primary UX lead for a newly formed team, owning end-to-end design and mentoring a junior designer.

A new team was formed specifically for this initiative. The core team included a Product Manager, a Senior Staff Engineer, three front-end engineers, and more than a dozen back-end engineers. I served as the primary UX lead and mentored a junior designer, who reported to me, on the output message design component.

What I owned

  • Competitive research leadership — Led research efforts with the PM, Sr. Staff Engineer, and other product leaders, analyzing DataDog, Splunk, and New Relic's monitoring solutions
  • End-to-end design — Responsible for the entire design from concept through implementation, including new components (icons, modal design, and interactive elements)
  • Framework architecture — Created the end-to-end UX framework that would guide future product development
  • Design governance — Conducted regular UX review meetings with team leads and the extended team, with periodic reviews across the broader UX organization

Central monitor list: the team's core deliverable

Research, iteration, refinement

Competitive analysis, customer interviews, and iterative design cycles with Sumo's UX research team.

We began by understanding current monitoring practices — both in the field and among our competitors. Through research reviews, customer interviews, and conversations with internal support teams, we learned what our customers were looking for and where we needed to focus.

Competitive analysis

Deep dive into DataDog, Splunk, and New Relic monitoring solutions. Reviewed these products with stakeholders to understand industry standards and identify differentiation opportunities.

Customer & support interviews

Conducted interviews with customers to understand pain points and desired outcomes. Spoke with internal support teams to identify common struggles and professional services escalations.

Design iteration

Multiple rounds of proposals focusing initially on log search and metrics tracking. Practiced a regular pattern of design, research, and refactoring with support from Sumo's UX research team.

Key research findings

  • Customers wanted visual configuration — not just query-based setup
  • Alert fatigue was a universal pain point across all platforms
  • Integration with existing workflows (PagerDuty, Slack) was essential
  • The system needed to scale from simple thresholds to complex ML-based detection

Monitor configuration: iterating on the trigger interface

Designed for extensibility

Application-wide monitor access, severable configuration areas, and reimagined notifications.

Three key design decisions shaped the monitoring system's success and established patterns that would guide future products.

Application-wide accessibility

Unlike previous feature implementations that siloed configuration to feature-specific pages, we made monitor configuration available across the entire Sumo Logic application.

Users could create and edit monitors from any search result, dashboard panel, or metrics exploration — and eventually from security dashboards and reports as well. This framework made it easy to extend the system to accommodate different monitor types as the platform evolved.

Severable configuration areas

We partitioned the modal into independent sections, enabling users to configure base monitor settings quickly and add complexity incrementally. A functioning, near-real-time monitor required only three things: a query, one trigger condition, and a name.

This was a stark contrast to legacy scheduled search, which hid settings across separate pages and required every field to be configured before deployment.
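The "three required things" idea can be illustrated with a small data model. This is a minimal sketch, not Sumo Logic's actual schema: the class names, field names, and the threshold-style trigger are all assumptions made for illustration, and the query string is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TriggerCondition:
    # Hypothetical trigger shape: compare a query result to a threshold.
    operator: str      # one of ">", ">=", "<", "<="
    threshold: float

    def is_met(self, value: float) -> bool:
        checks = {
            ">": value > self.threshold,
            ">=": value >= self.threshold,
            "<": value < self.threshold,
            "<=": value <= self.threshold,
        }
        return checks[self.operator]


@dataclass
class Monitor:
    # The three required pieces named in the text: name, query, one trigger.
    name: str
    query: str
    trigger: TriggerCondition
    # Optional, independent sections (assumed names) that can be filled in later.
    description: Optional[str] = None
    notifications: list = field(default_factory=list)

    def is_deployable(self) -> bool:
        # The trigger is enforced by the constructor; here we only check
        # that the name and query are not blank.
        return bool(self.name.strip()) and bool(self.query.strip())


# A monitor built from only the required fields.
monitor = Monitor(
    name="High error rate",
    query="example query text",  # placeholder, not real query syntax
    trigger=TriggerCondition(operator=">", threshold=100),
)
```

Because every optional section has a default, a monitor is usable as soon as the three required fields are set, and the remaining sections can be added later without touching the base definition.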

Simplified notifications

Notifications were completely reimagined. They became optional rather than required, easy to configure, and supported multiple notification channels simultaneously. Users could enable or disable individual notifications without affecting the rest of their configuration.

This flexibility meant teams could iterate on their alerting strategy without rebuilding monitors from scratch — a common frustration with the legacy system.
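One way to picture the notification behavior described above is a list of independently toggled entries. This is a hedged sketch under assumed names: `Notification`, `MonitorNotifications`, the channel labels, and the targets are hypothetical and do not reflect the product's real API or integrations.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    channel: str          # illustrative label such as "slack" or "pagerduty"
    target: str           # hypothetical destination for the alert
    enabled: bool = True


@dataclass
class MonitorNotifications:
    # Starts empty: notifications are optional, not required.
    notifications: list = field(default_factory=list)

    def add(self, channel: str, target: str) -> Notification:
        notification = Notification(channel, target)
        self.notifications.append(notification)
        return notification

    def set_enabled(self, channel: str, enabled: bool) -> None:
        # Toggle one channel without touching any other entry.
        for notification in self.notifications:
            if notification.channel == channel:
                notification.enabled = enabled

    def active_channels(self) -> list:
        return [n.channel for n in self.notifications if n.enabled]


section = MonitorNotifications()
section.add("slack", "#ops-alerts")
section.add("pagerduty", "on-call")
section.set_enabled("pagerduty", False)
print(section.active_channels())  # prints ['slack']
```

Disabling an entry keeps its configuration in place, so a team can switch a channel back on later instead of rebuilding the monitor.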

Independent configuration sections: start simple, add complexity over time

A platform transformed

Steady enterprise adoption, legacy sunset by 2022, and an architecture that scaled to smart alerts and anomaly detection.

Upon release, unified monitoring was an immediate success. We measured steady growth and adoption from introduction, onboarding several large-scale enterprise customers who were able to effectively translate existing monitors from third-party solutions.

Customer feedback was generally positive, and the UI was universally well received. We extended feature capabilities significantly beyond what the original scheduled searches offered, without changing the foundational patterns.

2020 — Platform launch

Unified monitoring released with log search and metrics tracking. Immediate adoption and positive reception from enterprise customers.

2021 — Alert Response Platform

Built the Alert Response Platform on top of unified monitoring, providing incident context, related alerts, and automated playbooks.

2022 — Feature parity & legacy sunset

Reached feature parity with legacy scheduled search. Notification capabilities — initially an area where we struggled — matched and then exceeded the original tools. Successfully sunset the scheduled search real-time reporting feature in favor of the unified monitoring solution.

2023 — Smart Alerts & Anomaly Detection

The extensible architecture paid off as we added advanced analytics capabilities. The framework accommodated these new monitor types without architectural changes.

Central monitor list: view, manage, and organize all monitors

Lessons learned

A concession on trigger configuration and hard-won lessons in navigating stakeholder dynamics.

If I could revisit this project, I would push harder for a simplified trigger configuration design. At the time, I conceded to a natural language interface approach that others on the team advocated for. The NLP-based trigger configuration ended up being difficult to maintain and was the only part of the design that proved problematic over time.

This project also gave me a better understanding of stakeholder dynamics. I learned to identify which leaders could be relied upon to pursue better UX versus those for whom convenience and shipping velocity were paramount. That awareness has shaped how I navigate design decisions and build consensus in subsequent projects.

The most successful enterprise tools feel like consumer products — even in high-stakes environments, every interaction should feel intuitive and deliberate.

The configuration modal: accessible from anywhere in the app
