Local Profile Server for Graceful Shutdown with a UPS: A Use Case

Introduction

A common requirement for edge computing is to use an Uninterruptible Power Supply (UPS) to prevent data corruption during a power outage. This example use case demonstrates how to build and deploy an application that monitors a connected UPS and uses the Local Profile Server (LPS) API to gracefully shut down the edge node when power is lost. 

UPS vendors typically provide software packages that interact with their UPS devices and can shut a computer down immediately when it switches to battery power, or only after a configured threshold is reached. Because the software packages and configuration required to interact with UPS infrastructure differ between vendors, EVE-OS does not include these packages as part of the base OS. 

This article is part of a series. You will likely follow the articles in this order:

  1. Manage App Instances with the Local Profile Server
  2. Local Profile Server for Graceful Shutdown with a UPS: A Use Case - You are here!
  3. Local Profile Server for Managing Profiles: A Use Case

Prerequisites

  • You must have at least the SysManager role in your ZEDEDA Cloud enterprise.
  • You must have an edge node onboarded.
  • Your edge node must be running EVE-OS version 12.0.1 or greater.
  • You must build your own application using the Offline Profile Server API doc.
  • This article assumes that you have Linux knowledge. 

Architecture

The example solution consists of a single containerized application that contains two main components:

  1. A UPS Daemon: This example uses apcupsd, a common open-source package for managing APC UPS devices. The daemon communicates with the UPS hardware over USB.
  2. A Local Profile Server: A web server (for example, a Python Flask application) that implements the EVE-OS LPS API.

Workflow

  1. The apcupsd daemon constantly monitors the status of the UPS.
  2. When main power is lost, the UPS switches to battery power. The apcupsd daemon detects this state change.
  3. The daemon triggers a pre-configured script (for example, onbattery or doshutdown).
  4. The script sends an HTTP request (for example, using curl) to an endpoint on the LPS application running in the same container.
  5. The LPS application receives the request and prepares a COMMAND_SHUTDOWN or COMMAND_SHUTDOWN_POWEROFF command.
  6. The next time EVE-OS sends its periodic status update to the /api/v1/devinfo endpoint, the LPS responds with the prepared shutdown command.
  7. EVE-OS receives the command and begins a graceful shutdown of all application instances, followed by the edge node itself.
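Steps 4 through 6 describe a simple hand-off: the UPS script deposits a command, and the next devinfo poll from EVE-OS collects it. The pattern can be sketched as a one-slot mailbox (the class and method names below are illustrative, not part of any EVE-OS API):

```python
class CommandMailbox:
    """One-slot mailbox: the UPS event script deposits a command,
    and the next devinfo poll from EVE-OS collects it."""

    def __init__(self):
        self._pending = None

    def deposit(self, command):
        # Step 4: the onbattery script POSTs a command to the LPS
        self._pending = command

    def collect(self):
        # Step 6: the LPS answers the next devinfo poll; once a
        # command is collected, the slot is cleared so it is not
        # delivered twice.
        cmd, self._pending = self._pending, None
        return cmd or "COMMAND_UNSPECIFIED"
```

In the full example later in this article, the same role is played by a global variable plus a timestamp, so a command is delivered only until EVE-OS acknowledges it.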

Configuration and Deployment Overview

  1. Build the application container. See Example App for details. The following is an overview:
    1. Create a container image (for example, using Dockerfile) based on an OS like Alpine.
    2. Install the necessary software: a web server framework (python3, flask), the UPS management package (apcupsd), and any other tools (curl, kmod).
    3. The container should be configured to run the UPS daemon and the LPS web server at startup.
    4. Configure the apcupsd service with a script that sends a curl command to your LPS when a power failure event occurs. For example, the /etc/apcupsd/onbattery script could contain:
      Shell
      ..
      #!/bin/sh
      curl -X POST localhost:8888/lps/devcmd -H 'Content-Type: application/json' -d '{"command":"COMMAND_SHUTDOWN"}'
  2. Add the app to ZEDEDA Marketplace:
    1. Push your container image to a registry.
    2. In ZEDEDA Cloud, create a new application, referencing your container image.
    3. Configure the application with a network interface.
    4. Configure a "Direct Attach" I/O adapter for the USB port that the UPS will be connected to.
  3. Deploy the app instance:
    1. Deploy the application to your edge node.
    2. In the deployment configuration, map the logical USB label from your application definition to the physical USB port on the device.
  4. Enable the LPS on the edge node by adding tags to your edge node:
    1. Tag key: $ztag.local.profile.server.host
      Tag value: IP_OF_YOUR_UPS_APP
    2. Tag key: $ztag.local.profile.server.token
      Tag value: YOUR_SECRET

With this configuration, the UPS application will have direct control over the edge node's power state, ensuring a safe shutdown during a power outage.

Example App 

The following is an example of building the LPS for the UPS use case. Relevant snippets are included to show how the example application works. 

Disclaimer: The code snippets provided are for demonstration purposes only and are not production-ready code. You must add additional error checking, security hardening, and log handling for any production deployment.

Proxy Server for Development

EVE-OS requires that the Local Profile Server be accessible at an IP address assigned to an application running on the edge node. This can complicate development, as code changes would require rebuilding and redeploying the application container to the edge node for each iteration.

To streamline development, a proxy server can be deployed on the edge node. This proxy forwards LPS API requests from EVE-OS to an external development machine, allowing for rapid testing of the LPS application code without repeated deployments.

The following is an example of a simple NGINX proxy server packaged in a container. It forwards traffic to a host and port specified by the $HOST and $PORT environment variables.

Dockerfile for proxy server

FROM yobasystems/alpine-nginx:stable
RUN apk add --update --no-cache openssh sudo
RUN echo 'PasswordAuthentication yes' >> /etc/ssh/sshd_config
RUN adduser -h /home/pocuser -s /bin/sh -D pocuser
RUN echo -n 'pocuser:pocuser' | chpasswd
RUN adduser pocuser wheel
RUN echo '%wheel ALL=(ALL) ALL' > /etc/sudoers.d/wheel
ENTRYPOINT ["/entrypoint.sh"]
EXPOSE 22
COPY entrypoint.sh /

entrypoint.sh for proxy server

#!/bin/sh
ssh-keygen -A
echo "REDIRECT HOST ENV=$HOST"
echo "REDIRECT PORT ENV=$PORT"

read -r -d '' VAR << EOM
daemon on;
events {
    worker_connections 10;
}
http {
    server {
        listen $PORT;
        location / {
            access_log off;
            proxy_pass http://$HOST:$PORT;
        }
    }
}
EOM

echo "$VAR" > /etc/nginx/nginx.conf
nginx
exec /usr/sbin/sshd -D

When this proxy application is running on the edge node, EVE-OS is configured to point to the proxy's local IP address. The proxy then redirects the LPS traffic to the developer's PC, which is running the main LPS application.

Protobuf Messages 

EVE-OS sends and receives messages encoded with Protocol Buffers (protobuf). To parse these messages, the LPS application needs the corresponding language-specific bindings generated from the EVE-API .proto files.

The EVE-API repository and its pre-generated Python bindings can be found on GitHub:

  • EVE-API Repo: https://github.com/lf-edge/eve-api
  • Python Bindings: https://github.com/lf-edge/eve-api/tree/main/python

Profile Server

A local profile server (LPS) is effectively a web server, so any programming language or framework will do. The following example is a Python Flask-based application. Starting with a very basic server that catches all requests and returns a 404, you can observe EVE-OS performing HTTP POST and GET requests. 

This snippet shows a minimal Flask server that logs incoming requests.

Python
..
import logging
from flask import Flask, jsonify

app = Flask(__name__)

@app.errorhandler(404)
def not_found(e):
    return jsonify({'message': 'not implemented'}), 404

if __name__ == '__main__':
    log = logging.getLogger('werkzeug')
    # Set logging level to INFO to see console messages for each request
    log.setLevel(logging.INFO)
    app.run(host='0.0.0.0', debug=True, port=8888)

When EVE-OS is directed to this server, the server's console will show a series of POST and GET requests to various API endpoints, such as /api/v1/radio, /api/v1/local_profile, /api/v1/appinfo, and /api/v1/devinfo.

Example Log Output:

192.168.1.142 - - [4/Oct/2023 19:53:41] "POST /api/v1/radio HTTP/1.0" 404 -
192.168.1.142 - - [4/Oct/2023 19:53:51] "GET /api/v1/local_profile HTTP/1.0" 404 -
192.168.1.142 - - [4/Oct/2023 19:53:58] "POST /api/v1/appinfo HTTP/1.0" 404 -
192.168.1.142 - - [4/Oct/2023 19:54:12] "POST /api/v1/devinfo HTTP/1.0" 404 -

API Endpoints

To create a functional LPS, the application must implement handlers for the API endpoints that EVE-OS uses. The primary endpoints for monitoring are /api/v1/devinfo and /api/v1/appinfo.

Serving /api/v1/devinfo

This endpoint receives information about the edge node, such as its UUID and current state. The LPS can respond with a command for EVE-OS to execute. In this initial example, it returns a COMMAND_UNSPECIFIED, which is a null command.


Python
..
@app.route('/api/v1/devinfo', methods=['POST'])
def devinfo():
    log_message(f"Received {request.url} {request.method} request")
    if request.method == 'POST':
        # Process the devinfo message and display device UUID and State
        infoObj=local_profile_pb2.LocalDevInfo()
        infoObj.ParseFromString(request.data)
        device_uuid = infoObj.device_uuid
        device_state = info_pb2._ZDEVICESTATE.values_by_number[infoObj.state].name
        log_message(f" Device-UUID: {device_uuid}, Device-State: {device_state}")

        # Respond with a 'COMMAND_UNSPECIFIED' (null command)
        localDevCmdObj=local_profile_pb2.LocalDevCmd()
        localDevCmdObj.server_token = app.config['SECRET_KEY']
        localDevCmdObj.timestamp = int(datetime.timestamp(datetime.now()))
        command = "COMMAND_UNSPECIFIED"
        localDevCmdObj.command = local_profile_pb2._LOCALDEVCMD_COMMAND.values_by_name[command].number
        r = Response(response=localDevCmdObj.SerializeToString(), status=200, mimetype="application/x-proto-binary")
        log_message(f" Returned command: '{command}' response. Received status {r.status_code}")
        return r

Serving /api/v1/appinfo

This endpoint receives a list of all applications running on the edge node and their current states.

Python
..
@app.route('/api/v1/appinfo', methods=['POST'])
def appinfo():
    log_message(f"Received {request.url} {request.method} request")
    if request.method == 'POST':
        infoObj=local_profile_pb2.LocalAppInfoList()
        infoObj.ParseFromString(request.data)
        appsList = MessageToDict(infoObj)
        if 'appsInfo' in appsList.keys() and len(appsList['appsInfo']) > 0:
            appsInfo = appsList['appsInfo']
            # Pretty-print the app-info
            header = appsInfo[0].keys()
            rows = [x.values() for x in appsInfo]
            print(tabulate(rows, header, tablefmt='simple_outline'))
        return "", 200

With these handlers implemented, the application will log the node state and display a table of running applications.

Example Log Output with appinfo:

2023-10-05 07:56:38 - 192.168.1.142: Received http://192.168.1.115:8888/api/v1/devinfo POST request 
2023-10-05 07:56:38 - 192.168.1.142: Device-UUID: 4bcb8cc9-f9de-4e3c-bba0-f7a42bcced66, Device-State: ZDEVICE_STATE_ONLINE 
2023-10-05 07:56:38 - 192.168.1.142: Returned command: 'COMMAND_UNSPECIFIED' response. Received status 200 
2023-10-05 07:57:32 - 192.168.1.142: Received http://192.168.1.115:8888/api/v1/appinfo POST request 
┌──────────────────────────────────────┬───────────┬─────────────────────────────┬─────────┐ 
│ id                                   │ version   │ name                        │ state   │ 
├──────────────────────────────────────┼───────────┼─────────────────────────────┼─────────┤ 
│ 0c19b501-bdd6-4cf7-8703-f101d121d78c │ 1         │ dd_app1                     │ RUNNING │ 
│ 0d2ba8bc-9ab0-4286-8e59-348beb911fb2 │ 1         │ alpine1_multipv             │ RUNNING │ 
│ 4b069f6a-3efb-46a3-bb4e-771a93cce28d │ 1         │ dd-cluster-site1-agent1     │ RUNNING │ 
│ 613a649e-e301-49b6-b418-5f744a2e467d │ 2         │ dd_alpine1                  │ RUNNING │ 
│ 6865dded-a19d-4594-8c91-3e2ccdf71061 │ 1         │ nuc11_reverse_proxy         │ RUNNING │ 
│ 71ccf0ef-48be-42a1-9ec8-34275674faa1 │ 1         │ dd-cluster-site1-seedserver │ RUNNING │ 
└──────────────────────────────────────┴───────────┴─────────────────────────────┴─────────┘

In addition to the reverse_proxy application, the example edge node is already running several other applications. These serve as example workloads to shut down when power fails later in this article.

Shutdown vs. Shutdown and Power Off

The LPS can send specific commands to EVE-OS. For the UPS use case, the relevant commands are COMMAND_SHUTDOWN and COMMAND_SHUTDOWN_POWEROFF.

  • COMMAND_SHUTDOWN: EVE-OS initiates a graceful shutdown of all application instances. The edge node remains powered on in a Prepared Power Off state.
  • COMMAND_SHUTDOWN_POWEROFF: EVE-OS gracefully shuts down all applications and then powers off the edge node hardware.

For more details on available commands, refer to the local_profile.proto definition.

To trigger these commands, the LPS application can implement its own API endpoint. In this example, an endpoint /lps/devcmd is created to receive a command via a curl request. The received command is stored in a global variable and sent to EVE-OS in the next response to a /api/v1/devinfo request.

The devinfo handler is updated to check for a pending command and include it in its response. It uses timestamps to ensure a command is sent only once, by comparing the timestamp of the last command EVE-OS processed (last_cmd_timestamp) with the timestamp of the new command. For more details, see the EVE-OS Profile API documentation.

When a COMMAND_SHUTDOWN is sent, the logs will show the edge node state transitioning to ZDEVICE_STATE_PREPARING_POWEROFF, and the application states changing from RUNNING to HALTING and finally HALTED.

Python
..
@app.route('/lps/devcmd', methods=['POST'])
def process_eve_cmd():
    global dev_cmd
    global dev_cmd_ts
    log_message(f"Received {request.url} {request.method} request")
    # We test for localhost url; bit of extra safety
    if request.method == 'POST' and 'http://localhost:8888' in request.url:
        content = request.json
        if 'command' in content:
            dev_cmd = content['command']
            dev_cmd_ts = int(datetime.timestamp(datetime.now()))
            log_message(f"Command '{dev_cmd}' marked with timestamp '{date_str(dev_cmd_ts)}' will be propagated to EVE node on next 'devinfo' POST request.")
            return "", 200
        else:
            log_message("Command not implemented")
            ic(content)
            return "", 204
    else:
        return "", 204 

Packaging the UPS Solution

The previous sections demonstrated an LPS that accepts a shutdown command via curl. This section explains how to build the target solution by integrating the LPS with UPS software. The specific implementation details depend on the UPS vendor and software.

UPS Daemon and Scripts

The apcupsd daemon can be configured to execute scripts based on UPS events. For this use case, a script is triggered when the UPS switches to battery power. This script contains a curl command that sends the COMMAND_SHUTDOWN to the LPS application's /lps/devcmd endpoint.

Example onbattery script:

Bash
..
#!/bin/sh
curl -X POST localhost:8888/lps/devcmd -H 'Content-Type: application/json' -d '{"command":"COMMAND_SHUTDOWN"}'

In a production scenario, one might use the doshutdown script instead, which triggers after a set time on battery or when the remaining charge is low.

See http://www.apcupsd.org/ for many more potential scripts and script parameters. The project no longer hosts the HTML manual as a single page, but if you download the source .tar.gz, the manual is located in the doc/ directory.
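For reference, the apcupsd.conf directives that control when doshutdown fires look roughly like this (a sketch; the values are illustrative and must be adjusted to your UPS model and site policy):

```text
UPSCABLE usb
UPSTYPE usb
DEVICE
# doshutdown runs when ANY of these limits is reached:
BATTERYLEVEL 50   # battery charge drops below 50 percent
MINUTES 10        # estimated runtime drops below 10 minutes
TIMEOUT 0         # 0 disables the fixed time-on-battery limit
```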

USB Passthrough

For the apcupsd daemon to communicate with the UPS, it needs access to the physical USB device. This can be achieved by enabling USB passthrough for the application. When deploying the application, the logical USB label defined in the device model is mapped to the physical USB port where the UPS is connected.

The UPS application in this example uses the HIDDEV driver module (usbhid), which is available in EVE-OS version 10.9.0 and later. The application's entrypoint script must run modprobe usbhid to load the module.

Final Dockerfile

The Dockerfile combines the Python/Flask LPS, apcupsd, scripts, and other dependencies into a single image.

Dockerfile
..
FROM alpine:latest
ENV HOST_NAME=""

# Add python + packages
ENV PYTHONUNBUFFERED=1
RUN apk add --update --no-cache python3 && ln -sf python3 /usr/bin/python
RUN python3 -m ensurepip
RUN pip3 install --no-cache --upgrade pip setuptools
RUN python3 -m pip install flask tabulate protobuf

# Add UPS daemon + config files
RUN apk add --update --no-cache openssh apcupsd curl sudo kmod
COPY apcupsd.conf /etc/apcupsd/apcupsd.conf
COPY onbattery /etc/apcupsd/onbattery
RUN chmod +x /etc/apcupsd/onbattery

# Add the profile server + entrypoint script
COPY local_profile_server/ /
COPY entrypoint.sh /

# Setup ssh access
RUN echo 'PasswordAuthentication yes' >> /etc/ssh/sshd_config
RUN adduser -h /home/pocuser -s /bin/sh -D pocuser
RUN echo -n 'pocuser:pocuser' | chpasswd
RUN adduser pocuser wheel
RUN echo '%wheel ALL=(ALL) ALL' > /etc/sudoers.d/wheel
ENTRYPOINT ["/entrypoint.sh"]
EXPOSE 22

When the container starts, it will execute the following entrypoint.sh script:

Bash
..
#!/bin/sh 
ssh-keygen -A 
echo "#################################### USBs #################################" 
lsusb 
if [ "${HOST_NAME}" ]; then 
  hostname "$HOST_NAME" 
  echo "$HOST_NAME"  /etc/hostname 
fi 
modprobe usbhid 
echo "########################## Starting servers ###############################" 
echo "Starting SSHD" 
sh -c "/usr/sbin/sshd" 
echo "Starting LPS server" 
sh -c "cd / ; python3 local_profile_server.py | tee /tmp/app.log"

Note: For production use cases, add log rotation functionality. The use of tee /tmp/app.log in this example will cause the log file to grow indefinitely.
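As one option, the tee pipeline can be replaced by a rotating log handler inside the LPS itself, using only the Python standard library (a sketch; the logger name and size limits are illustrative):

```python
import logging
from logging.handlers import RotatingFileHandler

# Rotate at ~1 MiB and keep 3 old files: app.log, app.log.1, app.log.2
handler = RotatingFileHandler('/tmp/app.log', maxBytes=1_000_000, backupCount=3)
handler.setFormatter(logging.Formatter('%(asctime)s - %(message)s'))

logger = logging.getLogger('lps')
logger.setLevel(logging.INFO)
logger.addHandler(handler)

def log_message(msg):
    """Drop-in replacement for the example's log_message helper."""
    logger.info(msg)
```

With this in place, the entrypoint could start the server without the `| tee /tmp/app.log` pipeline, and the log file size stays bounded.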

Results

After the app is deployed, connect to the app container to see the different servers running. The output of the LPS server is redirected to /tmp/app.log.

Bash
..
ups_lps_app:~$ ps 
PID  USER   TIME   COMMAND 
  1  root   0:00   {entrypoint.sh} /bin/sh /entrypoint.sh 
  8  root   0:00   sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups 
  9  root   0:00   /sbin/apcupsd 
 10  root   0:00   sh -c cd / ; python3 local_profile_server.py | tee /tmp/app.log 
 12  root   0:00   python3 local_profile_server.py 
 14  root   0:00   tee /tmp/app.log 
 23  root   0:00   sshd: pocuser [priv] 
 26  pocuser 0:00   sshd: pocuser@pts/0 
 27  pocuser 0:00   -sh 
 28  pocuser 0:00   ps

The apcaccess command queries the apcupsd service for the current status of the UPS.

Bash
..
ups_lps_app:~$ apcaccess status | egrep "APC|UPSNAME|STATUS|BCHARGE|LEFT" 
APC      : 001,036,0861 
UPSNAME  : UPS 
STATUS   : ONLINE 
BCHARGE  : 100.0 Percent 
TIMELEFT : 55.3 Minutes 
END APC  : 2023-09-19 13:49:40 +0000

This test uses the onbattery script to trigger the shutdown, which allows you to test the app by unplugging the UPS from its power source.

Bash
..
ups_lps_app:~$ cat /etc/apcupsd/onbattery 
#!/bin/sh 
echo "## EXEC 'onbattery' script"  /tmp/app.log 
curl -X POST localhost:8888/lps/devcmd -H 'Content-Type: application/json' -d '{"command":"COMMAND_SHUTDOWN"}'

The following log shows a test run where power is removed from the UPS, triggering the shutdown sequence.

Bash
..
ups_lps_app:~$ tail -f /tmp/app.log 
...

# Pulling the power on the UPS 
Power failure on UPS UPS. Running on batteries. 

# This triggers the onbattery script to POST to the LPS server 
2023-10-05 14:18:05 - 127.0.0.1: Received http://localhost:8888/lps/devcmd POST request 
2023-10-05 14:18:05 - 127.0.0.1: Command 'COMMAND_SHUTDOWN' marked with timestamp '2023-10-05 14:18:05' will be propagated to EVE node on next 'devinfo' POST request. 

# On the next devinfo POST, the LPS returns the COMMAND_SHUTDOWN to the edge node 
2023-10-05 14:18:20 - 192.168.1.142: Received http://192.168.1.115:8888/api/v1/devinfo POST request 
2023-10-05 14:18:20 - 192.168.1.142: Device-UUID: 4bcb8cc9-f9de-4e3c-bba0-f7a42bcced66, Device-State: ZDEVICE_STATE_ONLINE, Last Processed Command: 2023-10-05 11:20:34 
2023-10-05 14:18:20 - 192.168.1.142: Returned command: 'COMMAND_SHUTDOWN' response with timestamp: 2023-10-05 14:18:05 

# Confirming the shutdown 
2023-10-05 14:18:20 - 192.168.1.142: Received http://192.168.1.115:8888/api/v1/devinfo POST request 
2023-10-05 14:18:20 - 192.168.1.142: Device-UUID: 4bcb8cc9-f9de-4e3c-bba0-f7a42bcced66, Device-State: ZDEVICE_STATE_PREPARING_POWEROFF, Last Processed Command: 2023-10-05 14:18:05 

# Apps begin to halt
2023-10-05 14:18:21 - 192.168.1.142: Received http://192.168.1.115:8888/api/v1/appinfo POST request 
┌──────────────────────────────────────┬───────────┬─────────────────────────────┬─────────┐ 
│ id                                   │ version   │ name                        │ state   │ 
├──────────────────────────────────────┼───────────┼─────────────────────────────┼─────────┤ 
│ 613a649e-e301-49b6-b418-5f744a2e467d │ 2         │ dd_alpine1                  │ HALTING │ 
│ 6865dded-a19d-4594-8c91-3e2ccdf71061 │ 1         │ nuc11_reverse_proxy         │ RUNNING │ 
│ 71ccf0ef-48be-42a1-9ec8-34275674faa1 │ 1         │ dd-cluster-site1-seedserver │ HALTING │ 
│ 0c19b501-bdd6-4cf7-8703-f101d121d78c │ 1         │ dd_app1                     │ RUNNING │ 
│ 0d2ba8bc-9ab0-4286-8e59-348beb911fb2 │ 1         │ alpine1_multipv             │ RUNNING │ 
│ 4b069f6a-3efb-46a3-bb4e-771a93cce28d │ 1         │ dd-cluster-site1-agent1     │ RUNNING │ 
└──────────────────────────────────────┴───────────┴─────────────────────────────┴─────────┘ 
...

 

Summary

The EVE-OS LPS provides an API for applications to monitor and control the state of an edge node. This example demonstrated how to integrate third-party UPS management software with the LPS to ensure a clean shutdown of applications (and optionally the edge node itself) to prevent data loss during a power outage.

By separating the UPS client software and its associated logic into a dedicated application (for example, a container), you can flexibly handle different UPS vendors, software packages, and custom configurations without creating new dependencies for EVE-OS itself.
