Lights, camera, action - it’s the eBPF story!

Welcome!

Welcome to the third edition of the Observability 360 newsletter!

Lights, camera, action - it’s the eBPF story!

The rise of eBPF has been little short of meteoric and now the buzz has reached the silver screen, with an upcoming documentary on the packet filtering game-changer. In this mailing we also feature a new profiling solution from Elastic, another ‘State Of’ report, the $1bn Observability startup you may not have heard of and some cool How-Tos.

The Costs Conundrum

One of the major challenges facing Observability engineers is keeping a lid on costs when telemetry volumes are increasing exponentially. We list a fascinating article on a potentially expensive gotcha in OTel ingestion as well as a New Stack piece on taming real-time analytics costs.

Feedback

As practitioners in the field, you will know that every good observability system needs a feedback loop. Let us know how we are doing at:

NEWS

Unlocking The Kernel - the eBPF documentary

Speakeasy Productions describes itself as a film company “on a mission to empower technology companies and open source communities to showcase their stories”. For their latest project, “Unlocking The Kernel”, they have turned their attention to the story of the rise and rise of eBPF. As one commentator says in the trailer: "It's a revolution not an evolution". What next? Raiders of the Lost SPARK? Reservoir Logs?

Elastic Universal Profiling is now GA

Profiling is increasingly being regarded as the fourth fundamental pillar of observability practice. Elastic have now released what they describe as “a whole-system, always-on, continuous profiling solution”.

By leveraging eBPF, Universal Profiling is apparently able to profile every line of code running on a machine. Elastic are hailing it not only as a tool for increasing performance but also a means of achieving greater sustainability by reducing computational waste.

Chronosphere - a New Enterprise Observability Contender

Two of the primary concerns for many businesses are controlling costs and preventing data overload. Chronosphere, a startup founded by two former Uber engineers, is aiming to stake out its place in the enterprise observability market by ticking these two boxes. It is a relatively unknown player but has some serious financial backing: it recently raised $200m in funding, valuing the company at over $1bn.

If you are interested in digging deeper, you can download a Forrester report evaluating the product's potential RoI.

You Gotta Scroll With It…

Observability is evolving rapidly and, increasingly, enterprises are concerned with customer journeys and SLOs just as much as they are concerned with SLAs and outages. Datadog have responded to this trend with their new Scroll Map feature. As the name suggests, it provides metrics on how far down a particular page users have scrolled, so product and marketing teams can gain rich and granular insights into user behaviour.

EVENTS

📣 Next month’s Grafana ObservabilityCon in London has already sold out. Congratulations if you managed to snap up a ticket!

Unfortunately, we only have space to highlight a few of the many upcoming events relating to Observability. See the Observability 360 calendar for a fuller listing!

DevOpsCon, 4-7 December, Munich

DevOpsCon is actually a global series of conferences which will also be landing in London, New York and Singapore in the near future. The Munich edition features an Observability and Monitoring track with speakers from Netflix, ING Bank and yCrash.

ElasticOn Tour, 7 November, Frankfurt

World Tours are not just for U2 and Taylor Swift. The Elastic team will be packing their suitcases and taking their show on the road next month. First they will touch down in Frankfurt and then head to Amsterdam. You will hear keynote speeches, see the latest Elastic solutions and get the chance to talk to engineers IRL.

From the Blogosphere

How Canva implemented end-to-end tracing

Canva is an online graphic design platform with over 100m monthly active users. In this article, Ian Slesser, an Observability Engineering Manager at Canva, describes how his team built an OTel-based tracing stack and even rolled their own front-end SDK to minimise the telemetry footprint for end users.

How SALT slashed resolution time with Helios

Setting up error alerts and notifications is mostly pretty trivial. Digging into the root cause of those errors can be much more painstaking and time-consuming - even with sophisticated observability platforms. SALT is a company specialising in API security and their telemetry generates some 50bn spans per month. Here, their Director of Platform Engineering discusses how Helios supercharged their diagnostics processes.

Cost Management

Compressing the Cost of Real Time Analytics

Real-time analytics are in demand - but the storage costs can induce a serious case of sticker shock. This article from The New Stack looks at how InfluxDB meets the dual challenge of querying at vast scale whilst storing large volumes of telemetry economically. They claim they can achieve savings of up to 90% compared to other products.

Spanning out of Control

This article has raised a few eyebrows, especially as it involves OpenTelemetry, which prides itself on being lean and efficient. A forensic analysis by Nikolay Sivko of Coroot shows how just a few verbose OTel span attributes can potentially ramp up your ingestion fees.

VIDEOS & TUTORIALS

PromCon23 Recap

If you were not lucky enough to be at this year’s PromCon, then you can catch Prometheus maintainer Augustin Husson reflecting on the highlights - including the unveiling of the Perses visualisation tool, and some key inside info on the eagerly awaited release of Prometheus 3.0.

InfluxDB Performance and Query Tuning

InfluxDB is a popular choice for managing time series data at scale. This is a video aimed at experienced users looking to optimise database performance and follow best practice in query and schema design.

PODCASTS

Cloudcast

Cloudcast has been running since 2011 and has built up a formidable catalogue of podcasts to browse through. As the name suggests they are cloud-focused and cover a wide range of big topics including Kubernetes, AI, Machine Learning, Big Data and more.

Recent podcasts have included:

They broadcast on all the usual channels and social media platforms.

Docs

Database Performance At Scale - A Practical Guide

All too often, books on this topic can be very heavy going. Fortunately this volume is highly accessible and readable and packed with great advice from seasoned practitioners - and it’s free!

Splunk State of Observability Report

Last time out, we mentioned the New Relic State Of Observability report. Now Splunk have produced a similar survey of their own, with a similar sample size of around 1,700 Observability managers. It paints a picture of rapid growth as well as an increasing recognition of the importance of Observability amongst senior management. Some of the interesting takeaways:

  • toolsets are becoming more complex *

  • the observability and security domains are increasingly converging

  • 64% report that ROI on their AIOps tools has exceeded expectations

*Interestingly, the New Relic survey covered in our last newsletter seemed to point to the opposite trend - i.e. that toolsets were becoming more consolidated.

Community

📣 If you would like to publicise an Observability-related meetup/standup etc then please let us know and we will list it here.

OpenTelemetry End User Discussion Groups

These are online discussion groups where community members can share experiences on how they are using OTel. Outcomes are fed back to relevant project maintainers. There are monthly meetups across three geographical areas. November’s meetings are:

US: Nov 16 2023, 5pm GMT

Europe: Nov 21 2023, 11am GMT

APAC (Asia Pacific): Nov 22 2023, 6:30am GMT

The meetings are facilitated by the meetup.com platform.

Dapr Community Call

Dapr is a multi-platform open source abstraction layer with built-in OpenTelemetry support. Join in here (top hat and tie are not required). The Dapr community meetup takes place every second Wednesday at 9am PST. The next two meetings are October 18th and November 1st. You can find full details at:

That’s a wrap!

That’s all for this fortnight’s edition. See you in two weeks!