Cost-effective Kubernetes-based development environment on Hetzner
Hetzner Cloud offers robust VPS and dedicated server solutions at a fraction of the cost of major cloud providers such as AWS, GCP, or Azure. This guide outlines the steps to configure a fully functional development environment on Hetzner Cloud, incorporating the following components:
1) Secure Virtual Private Cloud (VPC) using Hetzner Cloud Networks for isolated networking.
2) WireGuard VPN for secure access to the VPC.
3) Hetzner Cloud Load Balancers (public and internal) to manage access to services.
4) Kubernetes cluster to orchestrate and run containerized applications.
5) Hetzner Cloud Controller Manager to enable Kubernetes to provision and manage Hetzner Cloud Load Balancers.
6) Hetzner CSI Driver for Kubernetes to dynamically provision and manage Hetzner Cloud Volumes.
7) Kubernetes Cluster Autoscaler for Hetzner to scale cluster capacity with workload demand.
8) Ingress NGINX Controller to route traffic to services.
9) cert-manager with Cloudflare integration to automate valid TLS certificates for public and internal services.
10) Gitea git hosting service with Gitea Actions for version control and CI/CD workflows.
11) ArgoCD for GitOps-driven deployments, ensuring continuous delivery and infrastructure consistency.
Read More →
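The first two building blocks might be sketched roughly as follows, assuming the `hcloud` CLI is installed and `HCLOUD_TOKEN` is exported; the network name, IP ranges, server type, and zone are illustrative, not taken from the guide:

```shell
# Create an isolated private network (the "VPC") and a subnet in it.
hcloud network create --name dev-net --ip-range 10.0.0.0/16
hcloud network add-subnet dev-net --type cloud \
  --network-zone eu-central --ip-range 10.0.1.0/24

# Create a small server attached to the private network to act as
# the WireGuard gateway into the VPC.
hcloud server create --name wg-gateway --type cx22 \
  --image ubuntu-24.04 --network dev-net
```

All other components (load balancers, cluster nodes) would then attach to the same `dev-net` network.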
Basic LLM performance testing of A100, RTX A6000, H100, H200 Spot GPU instances from DataCrunch
It looks like DataCrunch currently offers one of the most cost-effective pricing plans for running GPU-powered virtual machines for LLM applications. In this guide we will provision a Spot GPU-powered VM at DataCrunch, create a Kubernetes cluster, install the NVIDIA Helm chart to unlock GPU capabilities for applications, install the Ollama Helm chart to run a self-hosted LLM in Kubernetes, configure OpenWebUI to use the remote self-hosted LLM, and test LLM performance for different configurations (A100, RTX A6000, H100, H200). Read More →
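The Ollama installation step could look roughly like this, assuming the community otwld/ollama-helm chart; the value keys shown here (`ollama.gpu.*`) follow that chart's layout and may differ between chart versions:

```shell
# Add the community Ollama chart repository and install it with
# NVIDIA GPU support enabled, requesting one GPU for the pod.
helm repo add ollama-helm https://otwld.github.io/ollama-helm/
helm repo update
helm install ollama ollama-helm/ollama \
  --namespace ollama --create-namespace \
  --set ollama.gpu.enabled=true \
  --set ollama.gpu.type=nvidia \
  --set ollama.gpu.number=1
```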
Basic LLM performance testing of GPU-powered Kubernetes nodes from Rackspace Spot
It looks like Rackspace Spot currently offers attractive pricing plans for running GPU-powered Kubernetes nodes for LLM applications. In this guide we will create a managed Kubernetes cluster in Rackspace Spot, install the NVIDIA Helm chart to unlock GPU capabilities for applications, install the Ollama Helm chart to run a self-hosted LLM in Kubernetes, configure OpenWebUI to use the remote self-hosted LLM, and test LLM performance for different configurations (A30, H100). Read More →
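The "NVIDIA Helm chart" step typically means deploying the NVIDIA device plugin so the kubelet can advertise GPUs as schedulable `nvidia.com/gpu` resources; a minimal sketch (namespace name is illustrative):

```shell
# Deploy the NVIDIA device plugin DaemonSet via its official chart.
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm install nvdp nvdp/nvidia-device-plugin \
  --namespace nvidia-device-plugin --create-namespace
```

After this, a pod requesting `nvidia.com/gpu: 1` in its resource limits will be scheduled onto a GPU node.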
LLM context size support in Ollama Helm-chart
The Ollama Helm chart simplifies the deployment of self-hosted LLMs on Kubernetes. However, it originally didn't support model creation — for example, building a model with a custom context size from a Modelfile. To bridge this gap, I extended the chart in my Helm repository and proposed the functionality to the main Ollama Helm chart. Today, that proposal was implemented by the chart maintainers. Let's look at how it works. Read More →
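As a rough sketch of the feature in use: the chart's `ollama.models.create` values accept a Modelfile template, which can set `num_ctx` to raise the context window. The exact value keys follow the otwld/ollama-helm chart and may differ between versions; the model name and context size below are illustrative:

```shell
# Create a derived model with a 32k context window via chart values.
helm upgrade --install ollama ollama-helm/ollama \
  --namespace ollama --create-namespace -f - <<'EOF'
ollama:
  models:
    create:
      - name: llama3-32k
        template: |
          FROM llama3
          PARAMETER num_ctx 32768
EOF
```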
Self-hosted Wireguard VPN on VPS
Some AI services have geographic restrictions. To bypass these, you need a VPN with an endpoint in a permitted location. While services like NordVPN or ProtonVPN are popular, self-hosting a WireGuard VPN on a VPS offers a budget-friendly and private alternative. This guide walks you through setting up a cost-effective WireGuard VPN server and client. Read More →
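The server side of such a setup can be sketched as follows, assuming `wireguard-tools` is installed on the VPS; the interface name, port, and VPN subnet are illustrative, and the client public key is a placeholder you would substitute:

```shell
# Generate the server key pair with restrictive file permissions.
umask 077
wg genkey | tee server.key | wg pubkey > server.pub

# Write a minimal server configuration for interface wg0.
cat > /etc/wireguard/wg0.conf <<EOF
[Interface]
Address = 10.8.0.1/24
ListenPort = 51820
PrivateKey = $(cat server.key)

[Peer]
# Client's public key and its assigned VPN address.
PublicKey = <client-public-key>
AllowedIPs = 10.8.0.2/32
EOF

# Bring the tunnel up now and on every boot.
systemctl enable --now wg-quick@wg0
```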