ken

The Zero-Waste Cloud Minecraft Server

Automating AWS Spot Instances for Cost Savings + User-Friendly Discord Bot Interface

Intro

The challenge began with a classic performance-vs-cost trade-off. While studying abroad in Ireland, I needed a high-spec server to host a heavy modded Minecraft pack for friends across different timezones. Hosting locally was impossible due to security risks and the need for 24/7 uptime.

My initial, "lazy" cloud setup, leaving a high-spec On-Demand EC2 instance running nearly 24/7, resulted in a single-month bill exceeding $120. This expensive experience highlighted the necessity of cost optimization and became the catalyst for building a smarter, automated system.

Overview

The goal was to architect a production-grade system providing "on-demand" high-performance gaming with near-zero idle costs. I needed a solution where:

01

Persistent Data

The game state must persist even though the hardware is ephemeral.

02

User Controlled Wake

Friends must be able to wake the server without my attention or AWS Console access.

03

Automatic Shutdown

The infrastructure must "self-terminate" when idle to prevent runaway billing.
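The write-up does not say how idleness is detected, so the following is only one possible sketch of the auto-shutdown check. It assumes the player count is read from the reply to the server's `list` command and that the server should stop after a grace period with nobody online. The names `parse_player_count` and `IdleTracker`, the 15-minute default, and the reply format are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed reply formats of the vanilla `list` command, for example
# "There are 0 of a max of 20 players online:" or "There are 0/20 players online:".
# Modded servers may word this differently.
_PLAYER_COUNT = re.compile(r"There are (\d+)(?: of a max|/)")


def parse_player_count(list_reply: str) -> Optional[int]:
    """Return the online player count, or None if the reply is not recognized."""
    match = _PLAYER_COUNT.search(list_reply)
    return int(match.group(1)) if match else None


@dataclass
class IdleTracker:
    """Tracks how long the server has been empty (hypothetical helper)."""

    idle_limit_seconds: float = 900.0  # assumed 15-minute grace period
    empty_since: Optional[float] = None

    def should_shut_down(self, player_count: int, now: float) -> bool:
        if player_count > 0:
            self.empty_since = None  # someone is online: reset the timer
            return False
        if self.empty_since is None:
            self.empty_since = now  # first empty sample starts the timer
        return now - self.empty_since >= self.idle_limit_seconds
```

A periodic job could feed this tracker one sample per minute and scale the fleet back to zero once `should_shut_down` returns True.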

Technical Implementation

The architecture follows a "Cattle, Not Pets" philosophy. Instead of a single persistent instance, the server lives on an ephemeral Spot Instance that provides up to a 90% discount compared to standard rates.

[Figure: AWS System Structure]

To manage state, I decoupled storage from compute using a standalone Amazon EBS volume tagged server-files. On every boot, a master User Data Script programmatically finds the volume, mounts the filesystem, associates a static Elastic IP, and fetches secrets from the SSM Parameter Store.
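The boot sequence described above could look roughly like the sketch below. It is not the author's script: it assumes boto3-style `ec2` and `ssm` clients are passed in, that the volume's tag key is `Name`, and that the instance ID, Elastic IP allocation ID, and parameter name are supplied by the caller. The function name and the `/dev/sdf` device are hypothetical.

```python
from typing import Any, Dict


def attach_persistent_state(
    ec2: Any,
    ssm: Any,
    instance_id: str,
    allocation_id: str,
    secret_name: str,
    device: str = "/dev/sdf",  # assumed device name
) -> Dict[str, str]:
    """Attach the data volume, bind the Elastic IP, and fetch one secret."""
    # Find the standalone data volume by tag (assumption: tag key is "Name").
    volumes = ec2.describe_volumes(
        Filters=[{"Name": "tag:Name", "Values": ["server-files"]}]
    )["Volumes"]
    if not volumes:
        raise RuntimeError("no volume tagged server-files was found")
    volume_id = volumes[0]["VolumeId"]

    # Attach the volume to this instance. A real script would wait for the
    # attachment to finish and then mount the filesystem at the OS level.
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)

    # Point the static Elastic IP at the fresh instance.
    ec2.associate_address(AllocationId=allocation_id, InstanceId=instance_id)

    # Read a secret (for example an RCON password) from SSM Parameter Store.
    secret = ssm.get_parameter(Name=secret_name, WithDecryption=True)
    return {"volume_id": volume_id, "secret": secret["Parameter"]["Value"]}
```

Passing the clients in as arguments keeps the function easy to exercise with stand-ins; on the instance they would typically come from `boto3.client("ec2")` and `boto3.client("ssm")`.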

Compute: EC2 Fleet + Spot (ARM)
Control: Lambda + API Gateway
Interface: Discord Slash Commands

The user interface is a Serverless Discord Bot. When a user triggers /start, a Lambda function verifies the request signature using PyNaCl and modifies the Spot Fleet’s target capacity from 0 to 1, initiating a "cold start" of the entire stack.
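A minimal sketch of that request flow follows, under several assumptions: the Lambda sits behind API Gateway and receives `headers` and `body` in the event, the fleet is an EC2 Fleet scaled with boto3's `modify_fleet` (a Spot Fleet request would use a different call), and the handler and helper names are hypothetical. The signature check uses PyNaCl's `VerifyKey` over the timestamp concatenated with the raw body, which is the scheme Discord documents for interaction endpoints.

```python
import json
from typing import Any, Callable, Dict


def verify_discord_signature(
    public_key_hex: str, signature_hex: str, timestamp: str, body: str
) -> bool:
    """Check the Ed25519 signature Discord attaches to each interaction."""
    # Imported lazily so the rest of the module loads without PyNaCl installed.
    from nacl.exceptions import BadSignatureError
    from nacl.signing import VerifyKey

    try:
        VerifyKey(bytes.fromhex(public_key_hex)).verify(
            (timestamp + body).encode(), bytes.fromhex(signature_hex)
        )
        return True
    except (BadSignatureError, ValueError):
        return False


def scale_fleet(ec2: Any, fleet_id: str, capacity: int) -> None:
    """Set the fleet's target capacity (assumes an EC2 Fleet, not a Spot Fleet)."""
    ec2.modify_fleet(
        FleetId=fleet_id,
        TargetCapacitySpecification={"TotalTargetCapacity": capacity},
    )


def handle_interaction(
    event: Dict[str, Any],
    public_key_hex: str,
    verify: Callable[[str, str, str, str], bool],
    start_server: Callable[[], None],
) -> Dict[str, Any]:
    """Validate a Discord interaction and react to the /start command."""
    headers = {k.lower(): v for k, v in event.get("headers", {}).items()}
    body = event.get("body") or ""
    signature = headers.get("x-signature-ed25519", "")
    timestamp = headers.get("x-signature-timestamp", "")
    if not verify(public_key_hex, signature, timestamp, body):
        return {"statusCode": 401, "body": "invalid request signature"}

    interaction = json.loads(body)
    if interaction.get("type") == 1:  # PING: Discord's endpoint health check
        return {"statusCode": 200, "body": json.dumps({"type": 1})}

    is_command = interaction.get("type") == 2  # application (slash) command
    if is_command and interaction.get("data", {}).get("name") == "start":
        start_server()  # e.g. lambda: scale_fleet(ec2, fleet_id, 1)
        reply = "Server is starting..."
    else:
        reply = "Unknown command."
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"type": 4, "data": {"content": reply}}),
    }
```

Injecting `verify` and `start_server` keeps the handler free of AWS and PyNaCl imports, so the routing logic can be exercised without either dependency.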

[Figure: Discord Bot with '/status' command]

Key Outcomes

01

95% Cost Savings

Reduced monthly hosting fees from $120+ to ~$6 for typical use patterns.

02

Sub-Minute Boots

Developed a Custom AMI with Java 21 and mcrcon pre-baked to ensure the server is playable almost immediately after a command.

03

Self-Healing State

Implemented an orchestration layer where the World Data survives instance termination and automatically re-attaches to fresh hardware.

04

Secure Public Access

Authorized non-technical users to manage complex AWS infrastructure safely through a familiar chat interface.

Challenges & Solutions

Problem: Ephemeral Compute Termination

Spot Instances can be reclaimed by AWS with only a 2-minute notice, which would normally wipe out all unsaved game progress.

Solution:

Configured a systemd shutdown service that triggers on instance termination, running an RCON 'save-all' and 'stop' command to flush data to the persistent EBS volume.
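The stop hook could be a small script like the sketch below, which a systemd unit might call from its `ExecStop=` line. It assumes the `mcrcon` command-line client is on the PATH and that RCON listens on localhost at the default port 25575. The function names and the 60-second timeout are hypothetical.

```python
import subprocess
from typing import List, Sequence


def build_mcrcon_command(
    password: str,
    commands: Sequence[str] = ("save-all", "stop"),
    host: str = "127.0.0.1",  # assumed: RCON is only reachable locally
    port: int = 25575,        # default Minecraft RCON port
) -> List[str]:
    """Build the mcrcon invocation that flushes the world and stops the server."""
    return ["mcrcon", "-H", host, "-P", str(port), "-p", password, *commands]


def graceful_stop(password: str, timeout: float = 60.0) -> bool:
    """Run the save-and-stop sequence; return True if mcrcon exited cleanly."""
    result = subprocess.run(
        build_mcrcon_command(password), timeout=timeout, check=False
    )
    return result.returncode == 0
```

Keeping the timeout well under the 2-minute Spot notice leaves room for the filesystem to sync before the instance disappears.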

Problem: Dynamic Fleet ID Management

Because Fleets are occasionally recreated, the Discord bot needed a way to know which Fleet ID to scale without updating the Lambda code manually.

Solution:

Integrated SSM Parameter Store as a dynamic configuration layer; the bot reads and writes the current 'active' Fleet ID to SSM to maintain a persistent link between the UI and the hardware.
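Reading and writing the active Fleet ID might look like the sketch below. It assumes a boto3-style SSM client is passed in; the parameter name `/minecraft/active-fleet-id` and both function names are hypothetical.

```python
from typing import Any

# Hypothetical parameter name; the article does not state the real one.
FLEET_PARAM = "/minecraft/active-fleet-id"


def get_active_fleet_id(ssm: Any, name: str = FLEET_PARAM) -> str:
    """Look up which fleet the bot should scale."""
    return ssm.get_parameter(Name=name)["Parameter"]["Value"]


def set_active_fleet_id(ssm: Any, fleet_id: str, name: str = FLEET_PARAM) -> None:
    """Record a newly created fleet so later commands target it."""
    ssm.put_parameter(Name=name, Value=fleet_id, Type="String", Overwrite=True)
```

Because the ID lives in Parameter Store rather than in the Lambda's code or environment, recreating the fleet only requires one `set_active_fleet_id` call, not a redeploy.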

Problem: Resource-Heavy Library Packaging

Packaging the PyNaCl security library for Lambda is difficult because it relies on C extensions that must be compiled for the specific Lambda Linux environment.

Solution:

Utilized Docker-based builds to compile the library, ensuring binary compatibility for the production environment.

Cost Analysis: From "Always-On" to "Pay-as-you-Play"

The primary driver of the original $120 bill was idle time: paying for high-spec compute while friends were asleep or at work. This new architecture shifts the majority of the cost from variable compute to a small set of predictable "flat fees."

The "Flat Fee"

Even when no one is playing, the system incurs minor costs to maintain the server's identity and data:

Elastic IP (EIP): ~$3.60/mo to reserve a consistent entry point.
Amazon EBS: ~$0.80/mo to store the persistent world data.
Total Maintenance: ~$4.40 / month

Compute Efficiency

For a standard t4g.small (ARM-based) instance, the Spot market price is roughly $0.01 per hour. At that rate, a 6-hour session costs only about $0.06 to $0.07 in compute, a small fraction of the monthly flat fees.

Component         Lazy System (24/7)   This System (6hr Session)
Compute (EC2)     >$120.00 /mo         $0.07
Static IP (EIP)   $3.60 /mo            $3.60 /mo (idle)
Storage (EBS)     $0.80 /mo            $0.80 /mo
Total Monthly     >$120                ~$4.40 + $0.01/hr
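The "Total Monthly" row can be worked through as simple arithmetic. The sketch below uses the flat fees and the rounded $0.01/hr Spot rate quoted in this section, so results may differ by a cent or so from the table. The function names and the 60 hours of play per month are illustrative assumptions, not measured usage.

```python
# Figures quoted in this section (USD).
FLAT_FEES = {"Elastic IP": 3.60, "EBS storage": 0.80}
SPOT_RATE_PER_HOUR = 0.01   # rounded t4g.small Spot price
ALWAYS_ON_BASELINE = 120.0  # lower bound of the original monthly bill


def monthly_cost(hours_played: float, rate_per_hour: float = SPOT_RATE_PER_HOUR) -> float:
    """Flat monthly fees plus compute for the hours actually played."""
    return round(sum(FLAT_FEES.values()) + hours_played * rate_per_hour, 2)


def savings_vs_always_on(hours_played: float) -> float:
    """Fraction saved compared with the always-on baseline."""
    return round(1 - monthly_cost(hours_played) / ALWAYS_ON_BASELINE, 3)


if __name__ == "__main__":
    hours = 60  # hypothetical: ten 6-hour sessions in a month
    print(f"{hours} h of play: ${monthly_cost(hours):.2f} per month")
    print(f"savings vs. always-on: {savings_vs_always_on(hours):.1%}")
```

Even generous play time barely moves the total, because the flat fees dominate the bill at this instance size.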

Scalability and Modpacks

While this comparison uses a vanilla server as a baseline, the logic remains consistent for high-performance modded servers. Even if a heavy modpack requires a t4g.large or m8g.large (5-7x the cost), Spot pricing and the auto-shutdown engine mean I still save 70-95% compared to a traditional "Always-On" cloud setup.

At the time of writing, a t4g.small Spot Instance running 24/7 for a month costs $5.20, while an m8g.large would cost $19.30 in the us-east-1 region. Once the savings from eliminating idle time are added on top, the system is clearly a cost-efficient upgrade over the lazy system.

Future Work