
Telemetry

A simple yet functional library for capturing runtime analytic events from embedded devices.

Build & run

A Dockerfile and two helper scripts were added to the project's root folder. By doing so, two useful goals were achieved:

  • Infrastructure as code: All project dependencies, as well as all installation / configuration steps, are documented as groups of handy scripts inside the Dockerfile.
  • Containerization: Almost instant ability to jump into app development, testing, and/or deployment with zero footprint (pollution) on the main (host) PC.

IDE

VS Code was used for in-container development. There are two different ways one might approach this task:

Design goals

  • agnostic to payload format
  • low runtime overhead
  • non-intrusive / easy to use

Protobuf v3

Generally speaking, it is impossible to predict what kind of information will possess the most value in the future. Thus communication protocols tend to evolve over time. Not all customers are willing to update their devices on demand. As a result of such forces of nature, it is unavoidable that even devices of the same model will send Telemetry messages of different formats. Endpoint servers (collectors) shall be capable of handling such situations effectively. Protobuf is a well-known industry solution for such problems.

A known limitation of the Protobuf library is its inability to store multiple messages in a single file serially, i.e. one by one. To overcome this limitation, our solution uses a simple length-delimited file encoding (a simplified version of the TLV format), where the first two bytes of each serialized message store the message length.

```mermaid
flowchart TB
	Length1
	Message1
	Length2
	Message2
	Length3
	Message3
```

Architecture

Datapoint, the analytic event representation, is a handy wrapper / utility class aimed at easing the usage of the somewhat bloated autogenerated protobuf classes.

```mermaid
flowchart LR
    subgraph .proto
	    AnalyticsEvent --- M{Message}
	    TemperatureReading --- M  
	    ShutdownReason --- M  
	    etc.. --- M 
    end

	D(Datapoint) -. parse .-> M
	M -. make .-> D
```

Then, a Sink instance shall be used to establish the flow of captured Datapoints from runtime memory to their serialized form on disk. To capture() a Datapoint, an instance of the Writer class shall be used. Each Writer instance is linked to its parent Sink. Writer is a movable class that implements a simple, polymorphic, buffered, and thread-safe API for capturing Datapoints (events).

```mermaid
flowchart TB
	subgraph "runtime (orbiter)"
		DP1(Datapoint 1) -.capture.- W1(Writer 1)
		DP2(Datapoint 2) -.capture.- W1
		DP3(Datapoint 3) -.capture.- W2(Writer 2)
		DP4(Datapoint 4) -.capture.- W2
	    W1 --- S(Sink)
	    W3(Writer n) --- S
	    W2 --- S

	end
	S(Sink) ---> DB[(File)]
	DB[(File)] ---> R(Reader)
	subgraph "runtime (server)"
		R(Reader) -.parse.-> D1(Datapoint 1)
		R(Reader) -.parse.-> D2(Datapoint 2)
		R(Reader) -.parse.-> D3(Datapoint 3)
		R(Reader) -.parse.-> D4(Datapoint 4)
	end
```

The Reader class shall be used to deserialize Datapoints from a file. All Datapoints are read in one-by-one fashion.

Tests and discussion

Tests for the project are designed so that they can serve both as a case study of API usage and as a way to ensure code quality.

Datapoint

Shows the basic API use case, as well as a way to reliably discriminate between multiple types of events.

Serial IO

Covers the case when all events are written and then read back in serial fashion:

write - write - write - read - read - read

Also shows a way to check whether there is data left to read from the file.

Mixed IO

A slightly more complex case, where writes to and reads from the file are done in mixed (interleaved) order:

write - read - write - write - read - read

Buffering

Writing data to disk (even an SSD) is a notoriously slow operation. Storing several messages in RAM and then writing all of them to disk as a single batch is a way to speed things up.

> [!NOTE]
> Trade-off: in case of sudden power loss, all cached data (i.e. not yet stored to disk) will be irretrievably lost.

Multithreading

Yes, all Writer instances are thread-safe. Although, some further improvement can be made here. See comments in telemetry\sink.cpp Sink::Writer::capture().

Shared

RAII and std::shared_ptr are exactly the kind of magic that provides peace of mind. No need to worry about dangling pointers, memory leaks, and similar problems.