Document whitespace insertion in sensor ID naming in Cassandra DB

Querying the Cassandra DB directly for DCDB-aggregated data shows unexpected padding in sensor ID names. With a dcdbpusher configured with only the procfs plugin, using the stock plugin configuration from procfs.conf, the following sensor IDs are inserted into the database:

cassandra@cqlsh> select sid,ws, count(*) from dcdb.sensordata group by sid,ws;

 sid                          | ws   | count
------------------------------+------+-------
             /test/ctxt       | 2821 |    18
      /test/meminfo/anonpages | 2821 |    12
   /test/vmstat/nr-file-pages | 2821 |     6
             /test/col-user   | 2821 |    12
 /test/vmstat/nr-dirty-thresh | 2821 |     9
         /test/cpu36/col-idle | 2821 |    18
             /test/col-idle   | 2821 |    18
        /test/meminfo/memfree | 2821 |     6

The sid values are right-aligned in this output, so the gap after the shorter names is part of the stored value. For example, sensor ID /test/col-idle is 14 characters long and is stored with two trailing spaces, i.e. '/test/col-idle  '. This is deliberate behaviour, as stated in a comment in function SensorId::mqttTopicConvert:

> lib/src/sensorid.cpp:L48-59
   /* Fill string with trailing whitespace to 128bits so Cassandra's ByteOrder
      Partitioner creates proper numerically sorted tokens */
   if (data.size() < 16) {
       data.append(16 - data.size(), ' ');
   }
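To make the effect concrete, here is a minimal standalone sketch of the padding step and of how a client could strip the padding again after reading a sid back. The helper names padSid and stripSidPadding and the constant kPaddedSidLength are hypothetical and not part of DCDB; only the "pad to 16 bytes with spaces" rule is taken from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only; not part of the DCDB API.

// Minimum sid length in bytes (16 bytes = 128 bits), as in the quoted check.
constexpr std::size_t kPaddedSidLength = 16;

// Mirrors the quoted padding step: IDs shorter than 16 bytes get trailing
// spaces appended, longer IDs are returned unchanged.
std::string padSid(std::string sid) {
    if (sid.size() < kPaddedSidLength) {
        sid.append(kPaddedSidLength - sid.size(), ' ');
    }
    assert(sid.size() >= kPaddedSidLength);
    return sid;
}

// Removes trailing spaces, e.g. before displaying a sid read from the DB.
std::string stripSidPadding(std::string sid) {
    const std::size_t last = sid.find_last_not_of(' ');
    sid.erase(last == std::string::npos ? 0 : last + 1);
    return sid;
}
```

Under this sketch, padSid("/test/col-idle") yields the 16-byte value seen in the query output, while names of 16 bytes or more, such as /test/meminfo/anonpages, pass through untouched. I assume, without having verified it, that a query filtering on sid would need the padded form.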

I only found this after digging through the code for some time in search of an explanation for the trailing whitespace. It would be helpful to document this behaviour so that users unfamiliar with these intricacies are not surprised.
