While I want to monitor what's going on inside the house, I'd also like to know how that relates to what's happening outside. I also thought this would be a good chance to start writing the scripts to deal with SQLite, without at the same time having to deal with anything more complicated than using the Requests module.
The hardest part about getting local weather data is finding a suitable source: you can set up your own weather station, or depend on someone else to supply the data. Perhaps in the future I'll go for the former option, but as a starting point I'm going to use data from someone else.
There are a few options, and they vary depending on where in the world you are. In Europe, things like Netatmo look interesting, but in the end I decided to use data from the Deutscher Wetterdienst (DWD), Germany's weather service.
The main advantage of using a national weather service is the reliability, both in that it can be expected to keep existing and in the data itself: as it's run by professionals, you can probably assume that the data collected is of good quality, from maintained and calibrated instruments. It's also free, or perhaps better described as 'pre-paid', as it's funded through the taxes you're already paying. For the UK this would be the Met Office, and in the US it's the National Weather Service.
The main downside of the DWD is that actually finding where you can download the current weather data wasn't straightforward. Historic and current data is available via the DWD's OpenData server, which has a huge structure that isn't really self-explanatory to navigate. There is a readme [pdf] that explains roughly how it works, but I feel it could have been made easier to figure out that, if you want recent temperature data, you need to navigate to climate_environment/CDC/observations_germany/climate/10_minutes/air_temperature/now/ and then download a .zip file, having picked the most appropriate station available from this list. The good news is that once you've made it that far, and are willing to dig around the structure and files a bit, you can find clear and precise descriptions of the measurements, abbreviations etc.
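A minimal sketch of that download step, under a few assumptions that are not taken from the text above: that the server lives at https://opendata.dwd.de/, that the files are named 10minutenwerte_TU_<station id>_now.zip with a five-digit zero-padded station ID, and that the archive contains a single semicolon-separated text file. The function names are made up for illustration and are not necessarily how the script does it.

```python
import csv
import io
import zipfile

# Host and file name pattern are assumptions; the path is the one given above.
BASE_URL = (
    "https://opendata.dwd.de/climate_environment/CDC/observations_germany/"
    "climate/10_minutes/air_temperature/now/"
)


def build_url(station_id: int) -> str:
    """Build the download URL for one station (ID zero-padded to 5 digits)."""
    return f"{BASE_URL}10minutenwerte_TU_{station_id:05d}_now.zip"


def read_rows(zip_bytes: bytes) -> list[dict[str, str]]:
    """Unpack the first file in the archive and parse it as ';'-separated text."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        name = archive.namelist()[0]
        text = archive.read(name).decode("utf-8")
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    # Headers and values may be padded with spaces, so strip both.
    return [
        {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
        for row in reader
    ]


def download_rows(station_id: int) -> list[dict[str, str]]:
    """Fetch the current 10-minute file for a station and return its rows."""
    # Imported here so the helpers above work without Requests installed.
    import requests

    response = requests.get(build_url(station_id), timeout=30)
    response.raise_for_status()
    return read_rows(response.content)
```

Keeping the URL building and the parsing separate from the network call means both can be checked against a small hand-made zip file, without hitting the DWD server.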
This data source is also very complete; you can get data on not just air temperature, but also rainfall, sunshine, wind etc., and normally in several different measures as well, so you know exactly what you're getting.
This is almost overwhelming, and really what I had hoped for was a simpler API that could be called, perhaps something that feeds this kind of simple overview page of major cities in Germany. This particular page doesn't appear to be fed by an API, and I don't feel that scraping an HTML page is a very robust way of gathering data: it's too likely to change, start failing and then need fixing.
There does appear to be an official API coming from Germany's Federal government digitisation initiative at https://dwd.api.bund.dev/, but this is focused on extreme weather warnings (for which there is also an app) rather than current weather observations. There is a GET endpoint, /stationOverviewExtended, with a schema description in the API documents, but it appears to return forecasts and not current conditions, even if the forecast for 0 hours from now might be the current conditions.
A sample schema is given, but it doesn't do a good job of explaining the contents. For example, the 'temperature' section has values in the range 30-150 'somethings', numbers that don't make sense in Celsius, Fahrenheit or even Kelvin if they're meant to refer to temperatures in Germany at the end of April. The schema describes the shape of the content (can I expect an integer, or a string), but not its meaning - a problem painfully familiar to any Data Engineer.
The last joker is the link to 'more extensive documentation' (their words, not mine), which leads to a rather random blog containing just two posts: the linked one and a 'Hello World' post. In it the author complains about the lack of a good DWD API (I know that feeling), and then lists their experiments with the above weather warning API, circa 2019.
At this point I'm back to downloading the zipped CSV files from the open data server, which isn't too difficult, and is basically all that the get DWD weather data Python script I wrote does, followed by stuffing the data into an SQLite database. In more detail, these are the steps for getting this up and running, all of which are automated via the accompanying Ansible role:
- Install the dependencies needed: only Python 3, git and SQLite3 for this application.
- Clone the get-dwd-weather-data script into a folder.
- Create a virtual Python environment. This might be overkill for this application, as in most cases there are very simple requirements, but giving each script/Python app its own venv means all the separate items can be installed independently. I do this using the Ansible command module instead of the pip module, as it doesn't seem to be possible to pass the --upgrade-deps flag to venv that way. The flag is useful for making sure that your venv has the latest version of all the setup tools, instead of whatever version was shipped with your distribution's Python 3 package. It's probably possible to do this with the pip module somehow, but I couldn't figure it out.
- Install the config file for the weather data, setting the target weather station from which you want the data. The options for which meteorological data is downloaded are hard coded into the Python script, and would need to be adjusted there if you want other details. This is decided in the get-dwd-weather.py file, as part of the function download_current_obs().
- Run the weather download once, just to check that it's working. If it does fail you'll see the failure in Ansible, and know that something hasn't worked.
- Install the cron job, so that it runs every ten minutes, adding any new weather data that hasn't already been downloaded into the database.
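The last step relies on each run only adding rows that aren't already stored. A minimal sketch of one way to get that behaviour from SQLite is a primary key on station and timestamp combined with INSERT OR IGNORE. The table and column names here are hypothetical, and this is not necessarily how the script does it.

```python
import sqlite3


def create_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) observations table if it doesn't exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS air_temperature (
            station_id  INTEGER NOT NULL,
            measured_at TEXT    NOT NULL,  -- timestamp as given in the file
            temp_c      REAL,
            PRIMARY KEY (station_id, measured_at)
        )
        """
    )


def insert_new(conn: sqlite3.Connection, rows: list[tuple]) -> int:
    """Insert rows, skipping any already stored; return how many were added."""
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO air_temperature VALUES (?, ?, ?)", rows
        )
    return conn.total_changes - before
```

With this in place the cron job can feed the whole downloaded file into insert_new() every ten minutes, and only the measurements that arrived since the last run end up as new rows.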
(In the /templates folder of the Ansible role there is a simple Bash backup script, backup-db.sh, that can be used to back up the database, but it's better to use the backup-sqlite-db Python script and role, as it doesn't block the database while backing it up, even if another process is using it.)
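For illustration, Python's sqlite3 module exposes SQLite's online backup API, which can copy a database that is in use a few pages at a time. This is a sketch of that general technique, assuming nothing about how the backup-sqlite-db script is actually written; the function name, batch size and sleep time are arbitrary choices.

```python
import sqlite3


def backup_database(source_path: str, target_path: str) -> None:
    """Copy a live SQLite database to target_path using the online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(target_path)
    try:
        # Copy in small batches with a short pause between them, so other
        # processes get a chance to use the database during the backup.
        source.backup(target, pages=100, sleep=0.1)
    finally:
        target.close()
        source.close()
```

A call such as backup_database("weather.db", "weather-backup.db") (both file names are placeholders) could then be run from cron alongside the download job.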
The biggest remaining annoyance with this is that while the weather data has a resolution of ten minutes, there is often a delay of at least 30 minutes between the most recent temperature in the file and the current time. I would prefer a much quicker update time[1], but this isn't critical, and in the spirit of not getting stuck trying to make something perfect, I'll call this job done, and move on.
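To put a number on that delay, the newest timestamp in the file can be compared with the current time. This is a small sketch, assuming the timestamps are written as YYYYMMDDHHMM and are in UTC; the function name is made up for illustration.

```python
from datetime import datetime, timezone


def measurement_lag_minutes(latest: str, now: datetime) -> float:
    """Minutes between the newest timestamp in the file and `now`.

    Assumes timestamps look like 202404281210 (YYYYMMDDHHMM) and are in UTC.
    """
    measured = datetime.strptime(latest, "%Y%m%d%H%M").replace(tzinfo=timezone.utc)
    return (now - measured).total_seconds() / 60


# Example with made-up values: newest row at 12:10, checked at 12:47 UTC.
now = datetime(2024, 4, 28, 12, 47, tzinfo=timezone.utc)
print(measurement_lag_minutes("202404281210", now))  # 37.0
```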
[1] This may just be a case of suddenly being able to see the delay between the measurement being taken and the result being shown, which you normally can't. If I open the weather app on my phone, or look at a website, I assume the temperature shown is very recent (less than 5 minutes old), but I don't have anything to base that on. In most cases I probably wouldn't be able to feel the difference physically, unless I lived somewhere with very rapid changes in temperature, and even then local conditions mean that I would not expect the temperature in my garden to match any official weather station located many kilometres from my house to better than a couple of degrees.