Logging Structured Data From an Application

What is structured data?

Structured data is data formatted in a structured manner so that the sender can clearly communicate to the receiver each field/property/part of each message without confusion or ambiguity about where the message starts/stops, and what each field represents and it’s value.

We usually represent structured data with fields or key value pairs. The data can even be represented in a tree such as in JSON or XML.

Logging Flavours

RFC 3164 doesn’t know about structured data:

<34>Oct 11 22:14:15 mymachine MyApplication: This is the message part

RFC 5424 supports structured data:

<74>1 2017-07-11T22:14:15.003Z myhostname evntslog - ID47 [exampleSDID@12345 x="y" eventSource="MyApplication" eventID="6789"] This is the message part

systemd journal supports structured data

Logging Methods

syslog(3) designed for RFC 3164, doesn’t know about structured data.
logger(1) can log with RFC 3164 or RFC 5242, and supports structured data, but just adds it as part of the message field unless you send it to a socket
liblogging mentions structured data in the readme, but it hasn’t been implemented
sd_journal_send/sd_journal_sendv logs to the systemd journal, supports structured data

…And the most common method of getting structured data into logs

Adding key/value pairs or JSON to the message part of any logging system. This works in any logging protocol, here is an example of RFC 3164 with JSON in the message part:

<34>Oct 11 22:14:15 mymachine MyApplication: This message has JSON in it { "x": "y", "age": 123, "flag": true }

This works universally for RFC 3164, RFC 5242 and the systemd journal.

With CEE:

<34>Oct 11 22:14:15 mymachine MyApplication: @cee: { "x": "y", "age": 123, "flag": true }

Adding JSON to the message part is the lingua franca of logs, it works everywhere, even with old APIs, adding the CEE signature makes the JSON payload easy to identify.

See Also


Stop Watch and Time Out Classes

I have a passing interest in stop watch and time out type classes. Keeping track of elapsed time, remaining time. They are very basic classes, but I find them interesting.

GetTimeMS function

The stop watch and timer classes use this function to get the system time, we actually return the system time minus the application start time to get a time relative to the application start time, starting at 0. This is not necessary, return the system time is fine, returning the time the application is running is usually just easier to think about than the time since 1970, which is kind of arbitrary.
This function uses milliseconds, if this is not enough, we can easily switch to nanoseconds.

typedef uint64_t durationms_t;
// Get the time since epoch in milliseconds
durationms_t GetTimeMS()
  static const std::chrono::system_clock::time_point start = std::chrono::system_clock::now();
  const std::chrono::system_clock::time_point now = std::chrono::system_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(now - start).count();

Stop watch class

Provides start, stop, reset and we can get the total duration.

class cStopWatch
  void Start();
  void Stop();
  void Reset();
  durationms_t GetTotalDurationMS() const;
  bool running;
  durationms_t started;
  durationms_t totalDuration;
cStopWatch::cStopWatch() :
void cStopWatch::Start()
  // Started our stop watch
  started = GetTimeMS();
  running = true;
void cStopWatch::Stop()
  // Get the time now
  const durationms_t now = GetTimeMS();
  // Add the duration for this period
  totalDuration += now - started;
  // Reset our start time
  started = 0;
  running = false;
void cStopWatch::Reset()
  started = 0;
  totalDuration = 0;
  running = false;
durationms_t cStopWatch::GetTotalDurationMS() const
  if (running) {
    // Get the time now
    const durationms_t now = GetTimeMS();
    // Return the previous duration plus the duration of the current period
    return totalDuration + (now - started);
  return totalDuration;

Example usage

cStopWatch stopWatch;
// Start the stop watch
// Sleep for 1 second
const uint64_t timeout_ms = 1000;
::usleep(1000 * timeout_ms);
// Stop the stop watch
// We have now waited for at least 1 second
std::cout<<"Stop watch time: "<<stopWatch.GetTotalDurationMS()<<" ms"<<std::endl;

Time out class

Allows us to get the time remaining, check if the time out has expired and reset the time out.

class cTimeOut {
  explicit cTimeOut(durationms_t timeout);
  void Reset();
  bool IsExpired() const;
  durationms_t GetRemainingMS() const;
  const durationms_t timeout;
  durationms_t startTime;
cTimeOut::cTimeOut(durationms_t _timeout) :
void cTimeOut::Reset()
  startTime = GetTimeMS();
bool cTimeOut::IsExpired() const
  return ((GetTimeMS() - startTime) > timeout);
durationms_t cTimeOut::GetRemainingMS() const
  // Get the total time this timeout uhas been running for so far
  const int64_t duration = int64_t(GetTimeMS()) - int64_t(startTime);
  // Calculate the remaining time
  const int64_t remaining = (int64_t(timeout) - duration);
  // Return the remaining time if there is any left
  return (remaining >= 0) ? remaining : 0;

Example usage

// Create a 5 second time out
cTimeOut timeout(5000);
// Wait until the time out is expired
while (!timeout.IsExpired()) {
  // Sleep for 1 second
  const uint64_t timeout_ms = 1000;
  ::usleep(1000 * timeout_ms);
std::cout<<"Time out reached"<<std::endl;

Java Gotchas


  • Everything is a pointer that initially points to null. Every non-POD type must be newed (I forget this one a lot)
  • The POD types (int, float, enum, etc.) behave just like their C++ counter parts.
    The boxed versions (Integer, Float) and String in Java are more like a pointer to the POD value.
    For example:

    String a = "chris";
    String b = "chris";
    if (a == b) System.out.println("The strings are the same");

    This test will always fail because operator== for pointers (Instances of anything derived from java.lang.Object) will test if the pointers point to the same memory.
    To compare boxed types:

    if (a.equals(b)) System.out.println("The strings are the same");
  • Strings are immutable in Java, this doesn’t work as expected:
    String a = "a";
    a.replace('a', 'b');
    System.out.print(a); // Prints "a" instead of "b"

    A copy must be made:

    String a = "a";
    String b = a.replace('a', 'b');
    System.out.print(a); // Prints "b"

    String toUpperCase and toLowerCase have the same problem.

  • Java finalize() is not a C++ destructor.  It is not guaranteed to be called.  If a finalize() function is called it is up to the child class to call super.finalize().
  • If you specify no access modifier in C++ the default behaviour is private to the class, in Java it is private to the package. If you are bitten by this bug it kind of serves you right because you should always specify an access modifier anyway, but it may bite you when porting old code from C++.
  • There is no way to choose what lives on the stack and what lives on the heap. It doesn’t really matter most of the time, but it would be nice to have the option.
  • There is no passing by reference, everything is passes as a pointer.
  • There are no function pointers, anonymous interfaces are probably the closest thing to it.
  • Copy vs Reference Confusion
    Car a = new Car();
    Car b = a; // Takes a reference to a, does not create a copy
    System.out.print(a.GetColour()); // Prints "blue"
  • Java final is not the same as C++ const
    class PaintShop {
      public void PaintCarBlue(final Car car) {
        car.SetColour("blue"); // Changes the colour of the original car
    PaintShop paintShop = new PaintShop();
    Car a = new Car();
    System.out.print(a.GetColour()); // Prints "blue"

Debugging 101

Tips to help you find bugs.

Turn the warning level up and turn on warnings as errors
I know, it can be a pain, but it works. It helps you produce more rhobust, forward compatible, portable code that is syntactically correct on more compilers.
Use assert …
Assert will help you find errors before you even knew there was a problem. It is 100x better if your application tells you exactly where problems are and breaks into the debugger than for your application to crash for the customer. It goes hand in hand with compile time type safety, compiler warnings and static_assert.
… and use if () return
Think of assert as flagging the problem for you to fix and if () return as guarding the application from crashing for the customer.
Print to your errors/trace file
It can be easier to read what the application did rather than stepping through it, in some cases it is not possible to step through the code at all; full screen games, embedded devices and drivers for example. Print out variables, function entrance/exit and “I am here” lines.
Comment out or remove as much code as possible. Slowly remove lines until you find the offending line. Try different ways of doing the same thing. Read your errors/trace file. If I had a dollar for every time a coworker tells me there is a bug in the library/standard library/compiler/Operating System and it is actually an error in their code and they have ignored a printed error, I would buy a Ferrari, and then I would buy them a pair of spectacles and I would say, “Use these to read these”, and point at the error message.
Write data to files and validate your files
For example XML, HTML, JSON, RSS, KML, PNG, WAV and MP3, can all be written to files and run through third party validation software. You can even setup a unit test to write these to a file and run the validator for you.
Use your debugger
Put break points on suspect lines, step through functions, find out the value of variables, look at what each thread is doing, look at the call stack.
Read the documentation for the library you are using
You may be using a library incorrectly, it may have conditions or prerequisites you didn’t know about. If it is too complex wrap it into something easier to digest. If its usage is error prone wrap it into something simpler and type safe, that only uses what you need. An optimising compiler will factor out your wrapper and it will have no run time impact.
It is there for a reason. Millions of developers have used it for almost two decades, hundreds of thousands of eyes have pored over the source code to it. Chances are that it is more capable, less bug ridden and more portable than your hodge podge collection of classes. A large percentage of bugs I have seen are in classes that could be replaced by the STL. Plus developers will “just know” what your code does because they already have experience with STL. Just use it.

Pimp My Code Part 2


char szPassword[255];
GenerateRandomPassword(szPassword, 8);
char szText[255];
sprintf(szText, "User: %s, ID: %d, Password: %s", szUser, int(userid++), szPassword);

Don’t increment and use a variable on the same line. We know, you’re very tricky, you saved a line. You also made sure that beginners to C++ don’t know what the result will be. Keep it simple stupid. Create the simplest most readable code possible, it makes skimming over code and debugging code much easier. Fixed length arrays are very prone to buffer overruns, in this example szPassword is probably only 8 characters long after calling GenerateRandomPassword, but szUser could be any length and could definitely overrun 255 characters. The best way to mitigate this problem is to use a real string class such as std::string. We can also avoid using sprintf by using a type safe string writing class, std::ostringstream. Code using std::ostringstream is also slightly more human readable.

const std::string sPassword = GenerateRandomPassword(8);
std::ostringstream oText;
oText<<"User: "<<sUser<<", ID: "<<userid<<", Password: "<<sPassword;
std::string sText = oText.str();

There, type safe, buffer overflow safe, future proof and slightly more readable, what’s not to like?

Pimp My Code Part 1 Redux


Not exactly a redux, but very similar to last time:

bIsNotEmpty = false;
if (vNests.GetSize() != 0) {
  if (vNests[0]->vEggs.GetSize() != 0) bIsNotEmpty = true;

First of all we don’t actually care about the size, we just care that we have (Or don’t have) a nest with eggs in it. Depending on the container GetSize may or may not be a variable look up. IsEmpty is always a variable lookup:

bIsNotEmpty = false;
if (!vNests.IsEmpty()) {
  if (!vNests[0]->vEggs.IsEmpty()) bIsNotEmpty = true;

We can combine this into a single line:

bIsNotEmpty = (!vNests.IsEmpty() && !vNests[0]->vEggs.IsEmpty());

I would use more descriptive variable naming change my logic to use bIsEmpty/!bIsEmpty. Using bIsNot variables is often confusing and leads to harder to read code:

bIsNestWithEggs = (!vNests.IsEmpty() && !vNests[0]->vEggs.IsEmpty());
bIsEmpty = !bIsNestWithEggs;
// Use bIsEmpty and !bIsEmpty from now on

Pimp My Code Part 1


inline bool IsSpecial(const char* szValue)
 // Returns true if this is a special value
 if (stricmp(szValue, "MySpecialValue") == 0) {
   return true;
 return false;

I see this sort of thing all the time. For boolean functions that call boolean functions the if and return statements are usually superflous, we can use the return value of (stricmp(szValue, “MySpecialValue”) == 0) itself:

// Returns true if this is a special value
inline bool IsSpecial(const char* szValue)
 return (stricmp(szValue, "MySpecialValue") == 0);

If it were up to me I would also use a string class and keep as much of the code as possible in the string “realm” (This makes the code a lot simpler and easier to read):

// Returns true if this is a special value
inline bool IsSpecial(const string& sValue)
 return (sValue == "MySpecialValue");