HOME OPERATIONS REPOSITORY

Managed with Flux, Renovate, and GitHub Actions

Talos  Kubernetes  Flux  Renovate

Age-Days  Uptime-Days  Node-Count  Pod-Count  CPU-Usage  Memory-Usage


Table of Contents (click to expand)
  1. Overview
  2. Kubernetes
  3. Cloud Dependencies
  4. DNS
  5. Hardware
  6. Future Plans
  7. Gratitude and Thanks
  8. Stargazers
  9. License

💡 Overview

This is a mono repository for my wildly over-engineered home infrastructure and Kubernetes cluster, because apparently I hate free time. I try to follow Infrastructure as Code (IaC) and GitOps practices using enterprise-grade tools like Ansible, Kubernetes, Flux, Renovate and GitHub Actions—you know, the same stack Netflix uses, except mine just runs my Plex server and some smart lightbulbs. Ok, I also use some trusty bash scripts held together by duct tape and prayer.


🌱 Kubernetes

My Kubernetes cluster is deployed on a three-node Proxmox VE cluster, with a Talos virtual machine on every node. This is a semi-hyper-converged cluster: workloads and block storage share the same available resources on my nodes, while a separate virtualized TrueNAS server with multiple ZFS pools handles NFS/SMB shares, bulk file storage and backups.

There is a template available at onedr0p/cluster-template if you want to try and follow along with some of the practices I use here.

Core Components

  • actions-runner-controller – Self-hosted GitHub runners.
  • cert-manager – Creates SSL certificates for services in my cluster.
  • cilium – eBPF-based networking for my workloads.
  • cloudflared – Enables Cloudflare secure access to my routes.
  • external-dns – Automatically syncs ingress DNS records to a DNS provider (see DNS below).
  • external-secrets – Kubernetes secrets injection using 1Password Connect.
  • flux – Syncs Kubernetes configuration in Git to the cluster.
  • kube-prometheus-stack – Kubernetes cluster monitoring and alerting.
  • openebs – Local container-attached storage for caching.
  • rook – Distributed block storage with Ceph for persistent storage.
  • sops – Manages secrets for Kubernetes and Ansible using Age encryption, which are committed to Git (see the sketch after this list).
  • spegel – Stateless local OCI registry mirror.
  • volsync – Backup and recovery of persistent volume claims.
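
As a sketch of how those encrypted secrets can live safely in Git, a minimal .sops.yaml creation rule might look something like the following (the Age recipient is a placeholder, not my actual key):

creation_rules:
  - path_regex: kubernetes/.*\.sops\.ya?ml
    encrypted_regex: ^(data|stringData)$
    # Placeholder Age public key; only the matching private key can decrypt
    age: age1qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq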

GitOps

Flux watches the kubernetes folder in this repository (see Folder Structure below) and makes changes to my cluster based on the state of my Git repository.

Flux recursively searches the kubernetes/apps folder until it finds the top-most kustomization.yaml in each directory, then applies all the resources listed in it. That kustomization.yaml will generally only contain a namespace resource and one or more Flux kustomizations (ks.yaml). Those Flux kustomizations, in turn, apply a HelmRelease or other resources related to the application.
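
For a concrete sense of that layout, a hypothetical app directory could contain a top-level kustomization.yaml plus a Flux ks.yaml along these lines (the app name, paths and GitRepository name are illustrative assumptions, not copied from this repo):

# kubernetes/apps/media/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml
  - ./plex/ks.yaml
---
# kubernetes/apps/media/plex/ks.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: plex
  namespace: flux-system
spec:
  interval: 30m
  path: ./kubernetes/apps/media/plex/app
  prune: true
  sourceRef:
    kind: GitRepository
    name: home-ops

Flux then reconciles everything under that path, so adding or removing an app is mostly a matter of editing these files and pushing to Git.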

Renovate watches my entire repository for dependency updates; when it finds one, it automatically opens a PR. When a PR is merged, Flux applies the changes to my cluster.

Folder Structure

This Git repository contains the following directories:

📁 /
├── 📁 kubernetes/
│   ├── 📁 apps/        # Application deployments (organized by namespace)
│   ├── 📁 components/  # Re-useable kustomize components
│   └── 📁 flux/        # Flux system configuration
├── 📁 talos/           # Talos cluster configuration
├── 📁 bootstrap/       # Initial cluster bootstrap (Helmfile)
└── 📁 scripts/         # Utility scripts

Flux Workflow

This is a high-level look at how Flux deploys my applications with dependencies. In most cases a HelmRelease will depend on other HelmReleases; in other cases a Kustomization will depend on other Kustomizations; and in rare situations an app can depend on both a HelmRelease and a Kustomization. The example below shows that plex won't be deployed or upgraded until the rook-ceph-cluster Helm release is installed and in a healthy state (the dependency is declared with dependsOn, as sketched after the diagram).

graph TD
    A>Kustomization: rook-ceph] -->|Creates| B[HelmRelease: rook-ceph]
    A>Kustomization: rook-ceph] -->|Creates| C[HelmRelease: rook-ceph-cluster]
    C>HelmRelease: rook-ceph-cluster] -->|Depends on| B>HelmRelease: rook-ceph]
    D>Kustomization: plex] -->|Creates| E(HelmRelease: plex)
    E>HelmRelease: plex] -->|Depends on| C>HelmRelease: rook-ceph-cluster]
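Expressed in manifests, that chain is declared with dependsOn on the dependent release; a trimmed-down sketch (chart and repository names are assumptions, and most fields are omitted) might look like:

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
spec:
  interval: 1h
  # Don't install or upgrade until the operator release is Ready
  dependsOn:
    - name: rook-ceph
  chart:
    spec:
      chart: rook-ceph-cluster
      sourceRef:
        kind: HelmRepository
        name: rook-ceph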

😶 Cloud Dependencies

While most of my infrastructure and workloads are self-hosted, I do rely on the cloud for certain key parts:

  • 1Password – Password management and Kubernetes secrets injection with External Secrets.
  • Cloudflare – Public DNS, Zero Trust tunnel and hosting Kubernetes schemas.
  • Fastmail – Email hosting.
  • GitHub – Hosting this repository and continuous integration/deployments.
  • Pushover – Kubernetes alerts and application notifications.
  • Storj – S3 object storage for applications and backups.

This helps me avoid three major headaches:

  1. Chicken-and-egg scenarios – Dependencies that prevent initial system bootstrapping.
  2. Critical service availability – Services I need whether my cluster is up or not.
  3. The "hit by a bus" factor – Making sure critical apps like email, password management, and photo storage stay accessible to my family and friends when I'm no longer around.

I could tackle the first two problems by spinning up another Kubernetes cluster in the cloud and deploying alternative apps like HCVault, Vaultwarden, ntfy, and Gatus. But honestly, maintaining another cluster and babysitting more workloads would be way more work and cost. Something about free time.


🌎 DNS

My cluster implements a split-horizon DNS configuration using two ExternalDNS instances, each handling different DNS zones. This setup allows me to maintain separate private and public DNS records while orchestrating them through distinct ingress classes.

The first ExternalDNS instance manages private DNS records, syncing them to my UniFi UDM gateway via the ExternalDNS webhook provider for UniFi. The second instance handles public DNS records, syncing them directly to Cloudflare. Each instance monitors only its designated ingress class—internal for private DNS management and external for public DNS synchronization—ensuring precise control over which DNS platform receives updates.

To complete the setup, I've configured a third (internal) ingress class called services that serves as a reverse proxy for external services running outside the cluster but within my private network.
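
To illustrate how an app chooses a side of that split horizon, a hypothetical ingress (hostnames and the tunnel target are made-up examples) could look like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo
  annotations:
    # Picked up by the public external-dns instance and synced to Cloudflare
    external-dns.alpha.kubernetes.io/target: external.example.com
spec:
  ingressClassName: external   # "internal" would sync the record to the UniFi gateway instead
  rules:
    - host: echo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: echo
                port:
                  number: 8080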


⚙ Hardware

| Device | Num | Disks | RAM | Network | Function |
|---|---|---|---|---|---|
| Lenovo M920q, i5-8500T | 2 | 1TB NVMe | 64GB | 10Gb | Proxmox VE Host |
| Self-built 3U, i7-6700K | 1 | 512GB SSD, 1TB NVMe, 5x14TB SATA (ZFS), 5x4TB SAS (ZFS) | 64GB | 10Gb | Proxmox VE Host, SMB/NFS + Backup Server |
| UniFi UDM Pro Max | 1 | 8TB SATA | - | 10Gb | Router & NVR |
| UniFi USW Pro HD 24 PoE | 1 | - | - | 2.5Gb/10Gb PoE | Core Switch |
| UniFi USW Flex 2.5G 5 | 1 | - | - | 2.5Gb | Switch |
| Home Assistant Yellow | 1 | 8GB eMMC, 256GB NVMe | 4GB | 1Gb | Home Automation |
| PiKVM V4 Plus | 1 | 32GB eMMC | 8GB | 1Gb | KVM |
| JetKVM | 3 | 8GB eMMC | - | 100Mb | KVM |
| Eaton Ellipse Pro 650 2U | 1 | - | - | - | UPS |

🔮 Future Plans

  • Upgrading to more powerful hardware – I'm planning to replace my current Lenovo M920q units and self-built server with three Minisforum MS-01 units as Proxmox VE hosts.
  • Building a distributed storage foundation – The new hardware will enable me to implement Ceph distributed block storage directly on my Proxmox VE cluster, creating true high availability. My Kubernetes cluster can then leverage this same storage layer using only the rook-ceph-operator as an entry point, eliminating the need for separate storage components within Kubernetes.
  • Expanding network capacity – I'll add an aggregation switch (most likely the UniFi USW-Aggregation) since my current 10Gb SFP+ ports are at capacity. This also aligns with networking best practices.
  • Optimizing inter-node connectivity – I'm implementing 20Gb Thunderbolt networking between cluster nodes, plus dedicated 10Gb SFP+ connections for virtualized Kubernetes nodes to the aggregation switch.
  • Dedicated NAS hardware – TrueNAS will move from its current virtualized setup with hardware passthrough to running bare-metal on my existing 3U server.
  • Better power management – I'll upgrade to a more powerful UPS and add a managed PDU for improved power distribution and management.

🙏 Gratitude and Thanks

A lot of inspiration for my cluster comes from the people that have shared their clusters using the k8s-at-home GitHub topic. Be sure to check out the Kubesearch tool for ideas on how to deploy applications or get ideas on what you can deploy.

For learning the basics of running and maintaining a Kubernetes cluster, particularly K3s, I highly recommend starting with Jim's Garage's excellent Kubernetes at Home series. Once you're comfortable with the basics and ready to automate your deployments, Techno Tim's K3s Ansible guide provides a great foundation for automated cluster rollouts. Thanks to both @JamesTurland and @timothystewart6 for these great resources!

And of course, shoutout to @QNimbus for his bash scripts that are more engineered than a Swiss watch—but hey, they actually work!


🌟 Stargazers


🔒 License

See LICENSE. TL;DR: Do with it as you please, but if it becomes sentient, you're responsible for teaching it manners.
