Timeseries Data Schema

Different code uses different formats for timeseries data. When query results are converted to, say, Python lists and dicts, the structure can depend on the query. Because the query can be dynamic, the structure of the returned data can be dynamic too.

For clients receiving timeseries data (e.g. a charting app), it’s a pain in the ass to handle different formats, as each one needs different loops.

This page is an effort to define a few canonical types.

We list “All the types” (below) and then squish them into “Canonical Types”, thereby reducing the number of types to handle down to A, B, C instead of 1, 2, 3, 4, 5, 6.

The tradeoff is inefficiency. For example, using “Type A: table” for a Type 2 is pretty verbose, since every single value gets wrapped in its own list.

Canonical Types

Type A: table
-------------

- Returned as a dictionary of lists, to cover Types 1, 2, 3
- Somewhat verbose
- Leaves list length undefined

Table:
k1      v1      v2
k2      v3      v4
k3      v5      v6

Example:

{
    'Net' : [ 101.1, 10.1 ],
    'Oven': [ 11, 11 ]
}

Covers:

- Type 1 : list
    via a dictionary with a single 'values' item
    { 'values' : [ 1, 2, 3 ] }

- Type 2 : simple dictionary
    via single-valued lists
    { k1: [ v1 ], k2: [ v2 ], k3: [ v3 ] }

- Type 3 : moment dictionary
    { k1 : [ v1, d1 ], k2: [v2, d3], k3: [v3, d1] }
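
A minimal sketch of how Types 1, 2 and 3 could be squished into a Type A table. The function name `to_table` is made up for illustration, and the sketch assumes moments arrive as tuples or lists and everything else in a dictionary is a plain scalar.

```python
def to_table(data):
    """Normalise a Type 1, 2 or 3 value into a Type A table (dict of lists)."""
    if isinstance(data, list):
        # Type 1: bare list -> dictionary with a single 'values' item
        return {'values': list(data)}
    table = {}
    for key, value in data.items():
        if isinstance(value, (tuple, list)):
            # Type 3: (value, time) moment -> [value, time]
            table[key] = list(value)
        else:
            # Type 2: scalar -> single-valued list
            table[key] = [value]
    return table
```

A client then only needs one loop, `for key, values in table.items()`, whichever of the three types was originally returned.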


Type C : (compact) series
-----------------------              

The most compact form, but doesn't cover sparse data well

Covers:
   Types 4,5,6


    k1        k2        k3
d1  v1        v2        v3
d2  v4        v5        v6


{
    '_times' : [ d1, d2, d3 ],
    k1 : [ v1, v2, v3 ],
    k2 : [ v1, None, v3 ],
    k3 : [ v1, v2, v3 ]
}
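
As a rough sketch of how a Type 5 structure (sparse series by time) might be turned into this compact form: missing readings are filled with None so that every list lines up with '_times'. The function name `by_time_to_compact` is hypothetical, and the sketch assumes the time keys are sortable.

```python
def by_time_to_compact(by_time):
    """Convert Type 5 ({time: {key: value}}) into a Type C compact series."""
    times = sorted(by_time)
    # Collect every key seen at any time, in first-seen order
    keys = []
    for t in times:
        for k in by_time[t]:
            if k not in keys:
                keys.append(k)
    compact = {'_times': times}
    for k in keys:
        # dict.get returns None where a key has no reading at time t
        compact[k] = [by_time[t].get(k) for t in times]
    return compact
```

The None padding is where the "doesn't cover sparse data well" cost shows up: the sparser the input, the more of each list is filler.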

Type B : (sparse) series
------------------------

Potentially easy for (e.g.) Chart.js to ingest?

Covers:
   Types 4,5,6

    k1        k2        k3
d1            v1
d2                      v2
d3  v3
d4            v4        v5


{
        'Net' : [(DateTime1, 10.1), (DateTime2, 10.3), ...],
        'Oven' : [(DateTime1, 0.1), (DateTime2, 1.1), ...]
}
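
Going from a compact series (with a '_times' list and None for gaps) to this sparse form could look something like the sketch below. The name `compact_to_sparse` and its `times_key` parameter are assumptions for illustration, not an established API.

```python
def compact_to_sparse(compact, times_key='_times'):
    """Convert a compact series into {key: [(time, value), ...]}, dropping gaps."""
    times = compact[times_key]
    return {
        key: [(t, v) for t, v in zip(times, values) if v is not None]
        for key, values in compact.items()
        if key != times_key
    }
```

Each key ends up with its own list of (time, value) pairs, so a series with gaps simply has fewer points instead of None placeholders.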

All the types

These are all the main types I find useful in timeseries data analysis. I assign my own names for reference.

Type 1 :  list
-----------------

v1      v2      v3

[1,2,3]

Type 2 : simple dictionary
-------------------

k1  v1
k2  v2
k3  v3

{
    'Net' : 101.1,
    'Oven': 121.1
}

Type 3 : moment dictionary
----------------------------

k1 ( v1, d1 )
k2 ( v3, d2 )
k3 ( v5, d3 )


{
        'Net' : (10.1, DateTime1),     # max of Net
        'Oven' : (10.2, DateTime3)     # max of Oven
}

{
        'Net' : (10.1, None),     # mean of Net
        'Oven' : (10.2, None)     # mean of Oven
}
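
To illustrate where a "max" moment dictionary might come from, here is a small sketch that derives one from a series shaped as {key: [(time, value), ...]}. The function name `max_moments` is hypothetical, and skipping keys with no data points is an assumption about the desired behaviour.

```python
def max_moments(series):
    """Build a moment dictionary {key: (max_value, time_of_max)}."""
    moments = {}
    for key, points in series.items():
        if not points:
            continue  # assumption: keys without data are left out
        t, v = max(points, key=lambda p: p[1])  # compare on the value
        moments[key] = (v, t)
    return moments
```

For a mean there is no single matching time, which is why the second example above carries None in the time slot.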


Type 4 : sparse series by key
----------------------------


    k1        k2        k3
d1            v1        
d2                      v2
d3  v3
d4            v4        v5


{
        'Net' : [(DateTime1, 10.1), (DateTime2, 10.3), ...],
        'Oven' : [(DateTime1, 0.1), (DateTime2, 1.1), ...]
}

Type 5 : sparse series by time
------------------------------

    k1        k2        k3
d1  v1                  v3
d2  v4        v5        v6

{
    datetime1: { "sensor1": value1, "sensor2": value1 },
    datetime2: { "sensor1": value2, "sensor2": value2 }
}
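
The by-key and by-time layouts hold the same information, just pivoted. A sketch of the pivot from {key: [(time, value), ...]} to {time: {key: value}}, with the name `by_key_to_by_time` invented for this example:

```python
def by_key_to_by_time(by_key):
    """Pivot {key: [(time, value), ...]} into {time: {key: value}}."""
    by_time = {}
    for key, points in by_key.items():
        for t, v in points:
            # Create the inner dict the first time a timestamp is seen
            by_time.setdefault(t, {})[key] = v
    return by_time
```

Only keys that actually have a reading at a given time appear in that time's inner dictionary, so the result stays sparse.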


Type 6 : compact series 
-----------------------

    k1        k2        k3
d1  v1        v2        v3
d2  v4        v5        v6


{
    '_times' : [ d1, d2, d3 ],
    k1 : [ v1, v2, v3 ],
    k2 : [ v1, v2, v3 ],
    k3 : [ v1, v2, v3 ]
}
