Great! Our clickbait title worked! But seriously, stick around: this tutorial will get you a Prometheus + Grafana monitoring setup for resin devices in no time.
So a couple of weeks ago we kicked the tires of Prometheus to see if its monitoring capabilities could be extended beyond stock servers to the real world of dispersed IoT devices. We created a resin application that ran the node exporter (collects Linux stats), the Prometheus server (scrapes node exporter data) and the Alert Manager (sends notifications when rules are broken).
Our previous attempt was great because all the monitoring logic ran on the device, so there was no need to set up a central server/database. But it was limiting for the same reason: you couldn't see your entire fleet's statistics in one place.
For a detailed explanation of what we did last time, read the blog post.
Revisiting the project, I made some significant changes, with much better outcomes.
What are we building?
This time, we are building a machine metrics monitoring mechanism for resin.io devices. Each device will host its machine metrics via its resin-enabled web URL.
A central Prometheus server will then use the resin API to discover these devices and scrape the metrics. The data on the Prometheus server will then be queried by a Grafana frontend, which displays two dashboards: a fleet view and a single device view.
The best part is that this is all containerized so setup will be (almost) automatic.
The Device / Server Split
The app is split into two parts, a device portion and a server portion. The device portion is only responsible for running the node_exporter and exposing it via its resin web URL. The server portion is responsible for running the Prometheus server and the Alert Manager, as well as two new services.
The device portion is deployed via resin.io, while the server portion can be run on your local machine or hosted on a remote server. For full instructions on deployment, skip ahead.
Discovery
Discovery is a custom Node.js service that uses the resin API to keep the Prometheus server aware of all of your resin devices.
It works by using the resin Node.js SDK to poll the resin API for devices belonging to the configured application, then formatting the results and writing them to a JSON file that the Prometheus server can read.
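resin.auth.login(credentials, function(error) {
  if (error != null) {
    throw error;
  }
  console.log("Successfully authenticated with resin API");
  // Poll the resin API on a fixed interval and rewrite the targets file
  setInterval(function() {
    resin.models.device.getAllByApplication(process.env.RESINAPP_NAME).then(function(devices) {
      // _ is lodash/underscore; format and saveJson are defined elsewhere in the service
      saveJson(_.map(devices, format));
    }).catch(function(error) {
      // Log failed polls instead of silently dropping them
      console.error(error);
    });
  }, process.env.DISCOVERY_INTERVAL);
});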
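The format and saveJson helpers are not shown in the snippet above. As a rough sketch of what they might look like (not the project's actual code), the example below turns a device into a target group in the JSON layout that Prometheus file-based discovery reads, and writes the file through a temporary file so Prometheus never sees a half-written one. The device fields (uuid, name), the <uuid>.resindevice.io hostname pattern, the label names and the file path argument are all assumptions made for illustration.

```javascript
const fs = require('fs');

// Hypothetical formatter: one file-discovery target group per device.
// Assumes each device object has `uuid` and `name` fields and that its
// resin web URL follows the <uuid>.resindevice.io pattern.
function format(device) {
  return {
    targets: [device.uuid + '.resindevice.io:80'],
    labels: {
      device_uuid: device.uuid,
      device_name: device.name
    }
  };
}

// Hypothetical writer: serialise the target groups and swap the file in
// with a rename, so a reader never picks up a partially written file.
function saveJson(targetGroups, filePath) {
  const tmpPath = filePath + '.tmp';
  fs.writeFileSync(tmpPath, JSON.stringify(targetGroups, null, 2));
  fs.renameSync(tmpPath, filePath);
}

// Example usage (the path is an assumption):
// saveJson(devices.map(format), '/etc/prometheus/targets.json');
```

Writing to a temporary file and renaming it is a design choice rather than a requirement; it just avoids Prometheus reloading the file in the middle of a write.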
The Prometheus server watches this JSON file and updates its targets accordingly. Prometheus and resin.io therefore stay in sync to within the DISCOVERY_INTERVAL, which is set to 30000ms by default.
All the discovered targets are viewable by visiting <prometheus-server-ip>:80/targets.
Grafana
We also added Grafana, a popular graphing and dashboard tool with a handy Prometheus plugin. The great thing about Grafana is that it is very easily configurable. Below we automatically load the Prometheus data source on startup via the API:
curl 'http://admin:admin@127.0.0.1:3000/api/datasources' -X POST -H 'Content-Type: application/json;charset=UTF-8' --data-binary '{"name":"Prometheus","type":"prometheus","url":"http://localhost:80","access":"proxy","isDefault":true}'
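If you would rather make the same call from Node.js (for example from a startup script next to the discovery service), a minimal sketch could look like the following. It assumes Grafana is served over plain HTTP with the default admin credentials; the function names buildDatasourceRequest and addDatasource are made up for this example and are not part of the project.

```javascript
const http = require('http');

// Build the request options and JSON body for Grafana's
// POST /api/datasources call, mirroring the curl command above.
function buildDatasourceRequest(grafanaHost, grafanaPort, prometheusUrl) {
  const body = JSON.stringify({
    name: 'Prometheus',
    type: 'prometheus',
    url: prometheusUrl,
    access: 'proxy',
    isDefault: true
  });
  return {
    options: {
      host: grafanaHost,
      port: grafanaPort,
      path: '/api/datasources',
      method: 'POST',
      auth: 'admin:admin', // default Grafana login, assumed unchanged
      headers: {
        'Content-Type': 'application/json;charset=UTF-8',
        'Content-Length': Buffer.byteLength(body)
      }
    },
    body: body
  };
}

// Send the request and report the HTTP status code to the callback.
function addDatasource(req, callback) {
  const request = http.request(req.options, function (res) {
    res.resume(); // discard the response body so 'end' fires
    res.on('end', function () {
      callback(null, res.statusCode);
    });
  });
  request.on('error', callback);
  request.end(req.body);
}

// Example usage (host, port and Prometheus URL are assumptions):
// addDatasource(
//   buildDatasourceRequest('127.0.0.1', 3000, 'http://localhost:80'),
//   function (err, status) { console.log(err || status); }
// );
```

Keeping the request construction separate from the network call makes it easy to check the payload without a running Grafana instance.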
We then point Grafana at a directory of JSON files with pre-made dashboards in grafana.ini:
[dashboards.json]
enabled = true
path = /var/lib/grafana/dashboards
I've created two basic dashboards, all_devices and single_device. The all_devices dashboard gives you a quick summary of the entire fleet for common metrics like CPU, memory and disk usage.
The single_device dashboard gives you a more detailed view of the same metrics for one device, using Grafana's templating feature.
Running the App
Deploying the device portion
- Provision your device(s) with resin.io
git clone git@github.com:resin-io-projects/resin-prometheus-device.git && cd resin-prometheus-device
git remote add resin <your-resin-app-endpoint>
git push resin master
- Enable your device's resin web URL
Of course, you'd typically run an actual application alongside the node_exporter. To do this, just add your app's logic to start.sh:
echo "Your application code should go here!"
cd /etc/node_exporter-$NODE_EXPORTER_VERSION.$DIST_ARCH \
&& ./node_exporter -web.listen-address ":80"
Running the server portion
As I mentioned, this can be run locally or on a remote server.
git clone git@github.com:resin-io-projects/resin-prometheus-server.git
- Add the required environment variables in the Dockerfile or at runtime.
- Optional: If you'd like persistent Grafana storage, run:
docker run -d -v /var/lib/grafana --name grafana-storage busybox:latest
docker build -t prometheus .
docker run -t -i -p 80:80 -p 3000:3000 --name resinMonitor prometheus
No login is required to view the dashboards, but if you'd like to edit them you can use the default Grafana login:
user: admin
password: admin
Once you have made your dashboard changes, export them as JSON and save them to the /dashboards folder alongside the existing ones, then rebuild the container.
To Summarise
We have set up a basic machine metrics monitoring solution for IoT devices in just a few commands. Resin allowed us to deploy the same code (the node_exporter) to multiple devices and gave us a web-accessible URL for each device. At the same time, the resin API allowed us to easily integrate those devices with third-party services like Prometheus and Grafana.
Prometheus gave us some super extendable services (check out all their others) and Grafana allowed us to automatically set up and customise visualisations. Safe to say these three work really well together.
I hope to keep improving this project with time. If you'd like to help, please peruse the issues and submit a PR if you get the feels.