Why not Influx
I’ve been using Influx (v1 and v2). Flux is great but now (2025) unsupported. Influx has bugs with won’t be fixed. The worst is duplicate points which means you can’t rely on your database to return the correct result! I also had problems with memory and cpu overload.
The mess of versions, orphaned versions and late versions caused me to look elsewhere. Let’s try TimescaleDB!
Install
- Follow install docs
- I had to upgrade psql to v17:
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
sudo apt update
sudo apt upgrade
- sudo systemctl restart postgresql.service
pg_lsclusters
# and pg_drop v14 when you’re ready
- Couldn’t install dev (try later if required)
sudo apt install postgresql-server-dev-17
- Had problems with my Ubuntu 22 install:
- Python3.12 error: remove python3.12-minimal
sudo timescaledb-tune
- … and used defaults
- I had to upgrade psql to v17:
Get Started
- Make a testdb in postgres
- Remember
sudo su postgres
if you want to get postgres user (not psql>) shell. - Load the extension, etc as per docs
- Try tutorial 1 (PV data)
- When creating day CQ with
CREATE MATERIALIZED VIEW kwh_day_by_day(time, value)
the result has timestamps of 08:00:00+10 which needs checking – refer to timezones, “origin” concept in TSDB, etc. - Had trouble making an new testdb and testuser (Google/ChatGPT fixed) in order to allow Grafana to connect.
- When creating day CQ with
Thoughts so Far
- Docs pretty good.
- IoT data is a clear use-case, which is reassuring.
- The aggregate queries and support look good and easy to manage
- More security options (if required) like RLS
- Standard python library as standard PSQL
Schemas for IoT in SQL
Tables
Since it’s SQL, we need a fixed schema. Lots of good info (start here). There is the pseudo structure from Influx
REF | SENSOR | TIME | VALUE | UNIT |
hfs02a | Net | 2010…+10 | .1 | W |
hfs02a | Net | 2010…+10 | .2 | PF |
hfs02a | Fridge | 2010…+10 | .3 | W |
hfs02a | Fridge | 2010…+10 | W |
Narrow
Having one table for each measurement (e.g. “power” table (average Watts for period), or equivalently “energy” (total kWh for period) would be convenient for querying.
We could add a “PF” or “Current” or “Voltage” table if required later.
time | ref | sensor | metric | value | unit |
---|---|---|---|---|---|
2010-01-01 00:10:00 | hfs02a | Net | power | 0.1 | W |
2010-01-01 00:10:00 | hfs02a | Net | pf | 0.2 | PF |
2010-01-01 00:10:00 | hfs02a | Fridge | power | 0.3 | W |
Wide
We want to easily write from IotaWatt. This code combines measurements (e.g. power, current, etc) into one line and writes. That’s nice and simple. This would suit WIDE table.
time | ref | sensor | power | pf | unit_power | unit_pf |
---|
2010-01-01 00:10:00 | hfs02a | Net | 0.1 | 0.2 | W | PF |
2010-01-01 00:10:00 | hfs02a | Fridge | 0.3 | NULL | W | NULL |
Middle
For this application, all the data is float, so this “middle” idea (See link above) is not useful.
Compare
The WIDE format looks good to me as we have a known set of metrics. But often I don’t need to report PF. Usually it’s just power/energy. We’ll “store” a lot of nulls (e.g. voltage, pf) but my reading of TSDB says this is not a big issue.
Power or Energy?
“Power” is less ‘correct’ but more simple to understand. “Energy” in kWh is dependant on the time-period (i.e. 1kWh over 1h is small 1kW but 1kWh over 5 minutes is big 60kW load).
Prefer ‘power’ in Watts
Materialized Tables
MATERIALIZED
Tables are refreshable snapshots. We will use them to downsample data (5 minutes, 1 hour, etc).
Summary
WIDE table! We know all the metrics in advance.
It’s simpler to write from IotaWatt and simpler to query.
One big table with materialized tables to downsample to 5min, 1h
Indexes and PKs
Make a PK on (time, device, sensor). This will imply an index too, I think.
Check out more info on composite index and use EXPLAIN
SQL to see if indexes are being used.
Results
I created the SQL to construct the hypertable, wrote a influx script to export to csv and then imported with \copy to psql:
testdb=# select * from iotawatt;
timestamp | device | sensor | power | pf | current | voltage
-------------------+--------+--------------+-----------+----+---------+--------
2024-01-01 00:00:00+10 | hfs02a | Aircon | 8.104 | | |
2024-01-01 00:05:00+10 | hfs02a | Aircon | 8.066 | | |
2024-01-01 00:10:00+10 | hfs02a | Aircon | 8.036 | | |
2024-01-01 00:15:00+10 | hfs02a | Aircon | 8.012 | | |
2024-01-01 00:20
Further Bits to Check
- Performance with lotsa data
- Resource use at scale
- UI: Tried PopSQL desktop… seems ok so far. pgadmin would be the other option.
- Timezones… seems like good support with time_bucket. How to manage different devices in different timezones?