TimescaleDB

Why not Influx

I’ve been using Influx (v1 and v2). Flux is great but now (2025) unsupported. Influx has bugs with won’t be fixed. The worst is duplicate points which means you can’t rely on your database to return the correct result! I also had problems with memory and cpu overload.

The mess of versions, orphaned versions and late versions caused me to look elsewhere. Let’s try TimescaleDB!

Install

  • Follow install docs
    • I had to upgrade psql to v17:
      • sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
      • sudo apt update
      • sudo apt upgrade
      • sudo systemctl restart postgresql.service
      • pg_lsclusters # and pg_drop v14 when you’re ready
    • Couldn’t install dev (try later if required)
      • sudo apt install postgresql-server-dev-17
    • Had problems with my Ubuntu 22 install:
    • sudo timescaledb-tune
      • … and used defaults

Get Started

  • Make a testdb in postgres
  • Remember sudo su postgres if you want to get postgres user (not psql>) shell.
  • Load the extension, etc as per docs
  • Try tutorial 1 (PV data)
    • When creating day CQ with CREATE MATERIALIZED VIEW kwh_day_by_day(time, value) the result has timestamps of 08:00:00+10 which needs checking – refer to timezones, “origin” concept in TSDB, etc.
    • Had trouble making an new testdb and testuser (Google/ChatGPT fixed) in order to allow Grafana to connect.

Thoughts so Far

  • Docs pretty good.
  • IoT data is a clear use-case, which is reassuring.
  • The aggregate queries and support look good and easy to manage
  • More security options (if required) like RLS
  • Standard python library as standard PSQL

Schemas for IoT in SQL

Tables

Since it’s SQL, we need a fixed schema. Lots of good info (start here). There is the pseudo structure from Influx

REFSENSORTIMEVALUE UNIT
hfs02aNet2010…+10.1W
hfs02aNet2010…+10.2PF
hfs02aFridge2010…+10.3W
hfs02aFridge2010…+10W

Narrow

Having one table for each measurement (e.g. “power” table (average Watts for period), or equivalently “energy” (total kWh for period) would be convenient for querying.

We could add a “PF” or “Current” or “Voltage” table if required later.

timerefsensormetricvalueunit
2010-01-01 00:10:00hfs02aNetpower0.1W
2010-01-01 00:10:00hfs02aNetpf0.2PF
2010-01-01 00:10:00hfs02aFridgepower0.3W

Wide

We want to easily write from IotaWatt. This code combines measurements (e.g. power, current, etc) into one line and writes. That’s nice and simple. This would suit WIDE table.

timerefsensorpowerpfunit_powerunit_pf
2010-01-01 00:10:00hfs02aNet0.10.2WPF
2010-01-01 00:10:00hfs02aFridge0.3NULLWNULL

Middle

For this application, all the data is float, so this “middle” idea (See link above) is not useful.

Compare

The WIDE format looks good to me as we have a known set of metrics. But often I don’t need to report PF. Usually it’s just power/energy. We’ll “store” a lot of nulls (e.g. voltage, pf) but my reading of TSDB says this is not a big issue.

Power or Energy?

“Power” is less ‘correct’ but more simple to understand. “Energy” in kWh is dependant on the time-period (i.e. 1kWh over 1h is small 1kW but 1kWh over 5 minutes is big 60kW load).

Prefer ‘power’ in Watts

Materialized Tables

MATERIALIZED Tables are refreshable snapshots. We will use them to downsample data (5 minutes, 1 hour, etc).

Summary

WIDE table! We know all the metrics in advance.

It’s simpler to write from IotaWatt and simpler to query.

One big table with materialized tables to downsample to 5min, 1h

Indexes and PKs

Make a PK on (time, device, sensor). This will imply an index too, I think.

Check out more info on composite index and use EXPLAIN SQL to see if indexes are being used.

Results

I created the SQL to construct the hypertable, wrote a influx script to export to csv and then imported with \copy to psql:

testdb=# select * from iotawatt;
  timestamp        | device |    sensor    |   power   | pf | current | voltage
-------------------+--------+--------------+-----------+----+---------+--------
 2024-01-01 00:00:00+10 | hfs02a | Aircon          |     8.104 |    |         |        
 2024-01-01 00:05:00+10 | hfs02a | Aircon          |     8.066 |    |         |        
 2024-01-01 00:10:00+10 | hfs02a | Aircon          |     8.036 |    |         |        
 2024-01-01 00:15:00+10 | hfs02a | Aircon          |     8.012 |    |         |        
 2024-01-01 00:20

Further Bits to Check

  • Performance with lotsa data
  • Resource use at scale
  • UI: Tried PopSQL desktop… seems ok so far. pgadmin would be the other option.
  • Timezones… seems like good support with time_bucket. How to manage different devices in different timezones?

Leave a Reply

Your email address will not be published. Required fields are marked *