Turning observability into a team strength without a big overhaul

By addressing observability pain points one at a time, we built systems and practices that support rapid troubleshooting and collaboration.
Gabriel Côté-Carrier | August 19, 2025

Introduction: observability has a human dimension

We don't always need to start with a complicated plan to make things better. Sometimes all it takes is having a clear direction, taking the first step and seeing where it leads you.

Developers at my previous client had a hard time finding out what was going wrong in their services. Investigations took a long time and the results weren't always conclusive. Observability was a pain point for us.

Observability is the ability to see what's happening inside software systems. It's typically achieved through querying and visualizing logs, metrics and traces.

I felt that pain alongside them and raised my hand when the team wanted to do something about it. It became a much bigger project than I could have envisioned at the start, and solving that technical problem turned out to have a big human component as well.

The trigger: time-consuming bugs

We had a couple of recurring bugs in different services in a short period of time. Attempts to fix them took time and could fail if our investigations into the error logs didn't yield the right root cause. At that point, leadership made the decision to invest engineering time to make logs more robust in these services.

The Double Agents at the client made the case that it'd be a good idea to approach the issue holistically instead of only addressing the problematic services. The ball was then in our court to show what that could look like.

From that point on, I worked on and off toward making observability better for several months. I repeatedly addressed the next most painful parts of our observability and adjusted course based on the team's feedback.

Technical wins: making things easier

The first thing I did was build a path to structured logging. It enabled logs to carry arbitrary key-value pairs, making them much easier to query and visualize across new dimensions like unique identifiers, route names, status codes, and downstream services.
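
The article doesn't say which language or logging library the services used, so here is a minimal sketch in Python's standard library of what structured logging looks like in practice: each log becomes one JSON object whose fields the log platform can index. The service name and fields (request_id, route, status_code) are hypothetical stand-ins for the dimensions listed above.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line so every field is queryable."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            # Arbitrary key-value pairs attached by callers via `extra`.
            **getattr(record, "context", {}),
        }
        return json.dumps(payload)


logger = logging.getLogger("checkout-service")  # hypothetical service name
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Each key in `context` becomes a new dimension to query and graph.
logger.info(
    "payment declined",
    extra={"context": {"request_id": "abc-123", "route": "/payments", "status_code": 402}},
)
```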

After that, I integrated our logging with our HTTP server framework and our HTTP request library. The team could now send context-rich logs with just a few lines of code. The best part was that it standardized most logs across our different services: querying behavior for a single web application, or across several, could now be done without cross-referencing application code.
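
The HTTP framework and request library aren't named in the article, so this is a framework-agnostic sketch of the same idea: a small wrapper that emits one standardized, context-rich log per request. The request and response shapes are made up for illustration, and the logger is the JSON logger from the previous sketch.

```python
import logging
import time
from functools import wraps

logger = logging.getLogger("checkout-service")  # the JSON logger configured above


def with_request_logging(route_name, handler):
    """Wrap an HTTP handler so every request emits one standardized structured log."""
    @wraps(handler)
    def wrapped(request):
        started = time.monotonic()
        response = handler(request)
        logger.info(
            "request completed",
            extra={"context": {
                "route": route_name,
                "method": request["method"],        # hypothetical request shape
                "status_code": response["status"],  # hypothetical response shape
                "duration_ms": round((time.monotonic() - started) * 1000),
            }},
        )
        return response
    return wrapped
```

Because every service emits the same field names, the same query works across all of them.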

The next biggest pain point was how complex our log queries were. We couldn't jump on the log platform and query something without finding an example somewhere to copy and tweak. I built a log query function that abstracted away the querying complexity. Queries could now be written from scratch in seconds.
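
The client's log platform (and therefore its query syntax) isn't named, so this sketch only shows the shape such a helper might take; the Lucene-like output here is a made-up placeholder for the real platform's query language.

```python
def build_log_query(service, level="ERROR", route=None, minutes=60, **fields):
    """Assemble a log-platform query from plain arguments.

    The output syntax is a placeholder; the real helper produced queries
    for the client's actual log platform.
    """
    clauses = [
        f'service:"{service}"',
        f'level:"{level}"',
        f"@timestamp:[now-{minutes}m TO now]",
    ]
    if route:
        clauses.append(f'route:"{route}"')
    clauses += [f'{key}:"{value}"' for key, value in fields.items()]
    return " AND ".join(clauses)


# "Recent declined payments" becomes a one-liner instead of a copied template.
print(build_log_query("checkout-service", route="/payments", status_code=402))
```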

After that, I made it easy to provision new dashboards to all of our environments. We could now build and refine graphs displaying all of the new rich logs and metrics we had. Dashboards made it much easier to grasp problematic patterns in our services at a glance.
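
Again purely as a hypothetical sketch: treating dashboards as code means defining them once and pushing the same definition to every environment. The dashboard schema, environment names, and publish hook below are placeholders, and build_log_query is the helper from the previous sketch.

```python
import json

ENVIRONMENTS = ["dev", "staging", "production"]  # hypothetical environment names


def dashboard_definition(env):
    """Build one dashboard definition per environment; the schema is a placeholder."""
    return {
        "title": f"checkout-service overview ({env})",
        "widgets": [
            {"type": "timeseries", "query": build_log_query("checkout-service")},
            {"type": "toplist", "query": build_log_query("checkout-service", level="INFO", route="/payments")},
        ],
    }


def provision_dashboards(publish):
    """`publish` stands in for whatever API or infrastructure tool pushes dashboards."""
    for env in ENVIRONMENTS:
        publish(env, json.dumps(dashboard_definition(env), indent=2))
```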

Beyond the technical: paving a road for others

While building these technical capabilities, I realized that they would only be part of the solution.

One of the Test Double values I resonate with most is leaving our clients better than we found them. As a consultant, I knew I'd eventually leave, so keeping all the knowledge of operating our tools in my head would be doing my client a disservice. I repeatedly asked myself how I could pass on that know-how.

The result? I wrote extensive documentation with code samples, tutorials and upgrade paths. Every new capability came with release notes and demos to the team. I jumped on calls to share knowledge and help debug issues.

This took a lot more effort than I expected, but the work was rewarding because it was essential to achieving the best outcome for the team. I knew deep down that elevating the team's practices mattered much more than elevating their tooling. I was elated the first time I heard another senior engineer mention they'd deployed a new dashboard without my involvement. I knew at that point that I'd accomplished that personal goal.

New outcomes: reframing observability

With all the new technical capabilities in place, I realized that our observability platform could be used for much more than assisting developers. As an internal service team, we often had to collaborate with other teams, and I planted the idea that we could be building dashboards for them.

The first application of that idea was a new project with a fair amount of risk involved. We knew we'd have a lot of back-and-forth to fine-tune the result, so we built a dashboard that answered most of our collaborators' questions.

Shifting our observability platform from internal team tooling to something we could offer others was hugely helpful for the project. Our team was no longer a bottleneck for our collaborators: they could make progress on their work and fix bugs while rarely needing to wait for our responses or book meetings with us.

Stakeholders especially enjoyed having access to these dashboards because it gave them accurate metrics on-demand. That transparency strengthened our stakeholders' trust in our systems.

The end result: find and fix issues quickly

I don't have an exact date, but observability stopped being a source of complaints within a few months. I still kept building more capabilities past that point because I saw a lot more potential and the leadership at the client trusted me to keep delivering value in that space. We wouldn't have achieved these new outcomes without that trust and I'm grateful for it.

I ended up staying at the client roughly two years after finishing that initiative, and it was clear that observability was no longer a weakness. If anything, it had become a strength. Our stakeholders could trust that we'd find and fix issues quickly. We often moved so quickly that we'd let them know about a fix before they had even raised the issue to us.

Looking back: what I'd do differently

If I had to do it all over again, I would build hands-on workshops on the new observability capabilities. Demos are helpful, but there's a big qualitative difference between watching someone and trying something yourself. It would have sped up adoption of the new practices and made the collective expertise more consistent.

I would also deliberately ask internal collaborators to keep championing this work. I regret not telling a few developers that I wouldn't be around forever and that I'd love to give them more ownership. It would have made this observability initiative a collective one sooner.

Getting started: observability resources

If you're feeling observability pains in your systems, my first suggestion is to take a serious look at your observability platform's documentation. Building a deeper understanding of our platform's features helped me identify all sorts of wins that kept us moving in the right direction. Sometimes it's as easy as knowing what you have access to and figuring out how to connect your systems to it.

If you're looking for a soup-to-nuts understanding of observability, I'd recommend reading the book Observability Engineering. I read it after my own observability initiative, but having that information earlier would have made things easier. It not only goes in depth on the technical aspects and reasoning behind observability, but also covers the human challenges involved in adopting it.

Closing out: focusing on the direction

I wasn't an expert in structured logging at the time, nor did I have a big plan. I did, however, have a rough north star: observability should be powerful and easy to use. From there, I built more and more expertise and leaned on the people around me for feedback and suggestions. Every step forward gave me enough visibility to see where to go next.

I rotated away from this client earlier this year, and I have every confidence that they'll be able to sustain and grow that expertise without me. That outcome came from keeping in mind that software problems are also human problems; tackling only the technical part wouldn't have taken us all the way there.

Resource

  • The Observability Engineering book