
Home Lab Maintenance

Maintenance Plan

I love my home lab; at this point it has basically everything I could want: my own streaming media player, password manager, home automation, game servers, local AI, and even some websites. With my recent backup plan in place, I feel confident using these services at a production level. The only real struggle has been keeping up with all of them. They are intentionally kept isolated for security, so how do I make sure they stay safe, secure, and up to date?

Service Structure

I have a Proxmox LXC template that I use as the base for most services. The template has Docker installed, and I run most services inside Docker containers. This makes updating easy and lets me isolate services from each other using Docker networks.

Structure

  • Proxmox
    • LXC Instance
      • Docker
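
As a rough illustration of the network isolation mentioned above, here is a minimal docker-compose.yml sketch for one LXC instance. The service names, images, and password are placeholders rather than my actual stack; the idea is that only the front-end container is reachable from outside, while the database sits on an internal-only network.

```yaml
# Hypothetical example: service names, images, and credentials are placeholders.
services:
  app:
    image: nginx:stable
    ports:
      - "8080:80"          # only this service is published to the LXC host
    networks:
      - frontend
      - backend
    restart: unless-stopped

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me   # placeholder, use a real secret
    networks:
      - backend            # reachable from "app", not from outside
    restart: unless-stopped

networks:
  frontend:
  backend:
    internal: true         # containers on this network get no external connectivity
```

Keeping one compose project per LXC instance means a compromised service would first have to escape its Docker network and then the container itself before it could reach anything else.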

Maintenance Plan

Docker

Most of my services run in Docker, so I use Watchtower to update them during a scheduled overnight maintenance window. This is the only fully automated part of the process: the containers are meant to be torn down and spun up again, and these updates do not touch any persistent data.

An example of the Watchtower docker-compose.yml file, with a cron schedule set to run updates at 4:00 AM Eastern time (Watchtower's cron expressions have six fields, starting with seconds):

services:
  watchtower:
    image: containrrr/watchtower
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      TZ: America/New_York
      WATCHTOWER_SCHEDULE: "0 0 4 * * *"
    restart: unless-stopped
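
To show why recreating containers is safe, here is a small sketch of how persistent data can be kept outside the container. The service names, images, and paths are assumptions for illustration only: anything stored in a named volume or bind mount survives when Watchtower replaces the container with a newer image.

```yaml
# Hypothetical example: names, images, and paths are placeholders.
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me                # placeholder, use a real secret
    volumes:
      - db_data:/var/lib/postgresql/data          # named volume, kept across container recreation
    restart: unless-stopped

  web:
    image: nginx:stable
    volumes:
      - ./site:/usr/share/nginx/html:ro           # bind mount from the LXC filesystem, read-only
    restart: unless-stopped

volumes:
  db_data:
```

Data written anywhere else inside a container is lost on every update, so it is worth checking each image's documentation for the paths it expects to be persistent.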

LXC Instance

This is where my new process comes into play. I have installed Ansible on a new instance, which runs playbooks to update all the LXC instances at once. An example of a playbook used to update the servers:

---
- name: Update and Upgrade Ubuntu Servers
  hosts: servergroup # This refers to a group defined in your Ansible inventory
  become: yes           # This ensures tasks are run with sudo privileges
  gather_facts: no      # No need to gather facts for these simple tasks, speeds things up

  tasks:
    - name: Update apt cache
      ansible.builtin.apt:
        update_cache: yes
        force_apt_get: yes # Use apt-get instead of apt for specific scenarios if preferred

    - name: Upgrade all packages
      ansible.builtin.apt:
        upgrade: dist # Use 'dist' for a distribution upgrade (handles dependency changes)
                      # or 'full' (same as dist-upgrade)
                      # or 'yes' (simple upgrade, may hold back some packages)
        autoremove: yes # Remove obsolete packages automatically
        force_apt_get: yes # Use apt-get instead of apt for specific scenarios if preferred

    - name: Check whether a reboot is required
      ansible.builtin.stat:
        path: /var/run/reboot-required # Ubuntu creates this file when an installed update needs a reboot
      register: reboot_required_file

    - name: Perform reboot if required
      ansible.builtin.reboot:
        reboot_timeout: 600 # Wait up to 600 seconds (10 minutes) for the server to come back up
      when: reboot_required_file.stat.exists

    - name: Clean up apt cache
      ansible.builtin.apt:
        autoclean: yes # Remove downloaded package files no longer available and in versions no longer needed
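
The playbook targets a group called servergroup, so the inventory needs to define it. A minimal YAML inventory sketch might look like the following; the host names, IP addresses, and the root login are placeholders and assumptions, not my real setup.

```yaml
# inventory.yml (hypothetical example: hosts, addresses, and user are placeholders)
all:
  children:
    servergroup:
      hosts:
        media-lxc:
          ansible_host: 192.0.2.10
        passwords-lxc:
          ansible_host: 192.0.2.11
        automation-lxc:
          ansible_host: 192.0.2.12
      vars:
        ansible_user: root   # assumption: SSH key login as root on each LXC instance
```

Assuming the playbook is saved as update-servers.yml (a name chosen here for illustration), it can be run with `ansible-playbook -i inventory.yml update-servers.yml`. Adding `--check` first does a dry run that reports what would change without applying it.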