The Zero-Waste Cloud Minecraft Server
Automating AWS Spot Instances for Cost Savings + User-Friendly Discord Bot Interface
Intro
The challenge began with a classic performance-vs-cost trade-off. While studying abroad in Ireland, I needed a high-spec server to host a heavy modded Minecraft pack for friends across different timezones. Hosting locally was impossible due to security risks and the need for 24/7 uptime.
My initial, "lazy" cloud setup, leaving a high-spec On-Demand EC2 instance running nearly 24/7, resulted in a single-month bill exceeding $120. This expensive experience highlighted the necessity of cost optimization and became the catalyst for building a smarter, automated system.
Overview
The goal was to architect a production-grade system providing "on-demand" high-performance gaming with near-zero idle costs. I needed a solution where:
Persistent Data
The game state must persist despite running on ephemeral hardware.
User Controlled Wake
Friends must be able to wake the server without my attention or AWS Console access.
Automatic Shutdown
The infrastructure must shut itself down when idle to prevent runaway billing.
Technical Implementation
The architecture follows a "Cattle, Not Pets" philosophy. Instead of a single persistent instance, the server lives on an ephemeral Spot Instance, which can be up to 90% cheaper than the equivalent On-Demand rate.

To manage state, I decoupled storage from compute using a standalone Amazon EBS volume tagged server-files. On every boot, a master User Data Script programmatically finds the volume, mounts the filesystem, associates a static Elastic IP, and fetches secrets from the SSM Parameter Store.
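The boot sequence above can be sketched roughly as follows. This is a minimal illustration using boto3-style clients passed in as arguments, not the actual User Data script: the tag key `Name`, the device name `/dev/sdf`, and the function and parameter names are all assumptions made for the example.

```python
def find_volume_id(ec2, tag_value="server-files"):
    """Return the ID of the first unattached EBS volume with a matching tag.

    Assumes the volume is tagged Name=server-files; the tag key is a guess.
    """
    resp = ec2.describe_volumes(
        Filters=[
            {"Name": "tag:Name", "Values": [tag_value]},
            {"Name": "status", "Values": ["available"]},
        ]
    )
    volumes = resp["Volumes"]
    if not volumes:
        raise RuntimeError(f"no available volume tagged {tag_value!r}")
    return volumes[0]["VolumeId"]


def bootstrap(ec2, ssm, instance_id, allocation_id, secret_name, device="/dev/sdf"):
    """Attach the world volume, claim the Elastic IP, and fetch one secret."""
    volume_id = find_volume_id(ec2)
    # Attach the persistent volume to this freshly launched instance.
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    # Point the static Elastic IP at this instance.
    ec2.associate_address(InstanceId=instance_id, AllocationId=allocation_id)
    # Read a (possibly encrypted) secret from SSM Parameter Store.
    secret = ssm.get_parameter(Name=secret_name, WithDecryption=True)
    return volume_id, secret["Parameter"]["Value"]
```

In practice `ec2` and `ssm` would be `boto3.client("ec2")` and `boto3.client("ssm")`. A real script would also wait for the attachment to complete and then mount the filesystem, which is omitted here.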
The user interface is a Serverless Discord Bot. When a user triggers /start, a Lambda function verifies the request signature using PyNaCl and modifies the Spot Fleet’s target capacity from 0 to 1, initiating a "cold start" of the entire stack.
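A hedged sketch of what such a handler might look like is shown below. It assumes an API Gateway style `event` with `headers` and a string `body`, and the function names, reply text, and the way the Fleet ID and public key are passed in are illustrative rather than taken from the actual bot. The signature check follows Discord's Ed25519 scheme (timestamp concatenated with the raw body), and the scaling step uses the EC2 `modify_spot_fleet_request` call.

```python
import json


def verify_discord_signature(public_key_hex, signature_hex, timestamp, body):
    """Return True if the Ed25519 signature over timestamp + body is valid."""
    # Imported lazily so the module still loads where PyNaCl is not installed.
    from nacl.exceptions import BadSignatureError
    from nacl.signing import VerifyKey

    try:
        VerifyKey(bytes.fromhex(public_key_hex)).verify(
            (timestamp + body).encode(), bytes.fromhex(signature_hex)
        )
        return True
    except (BadSignatureError, ValueError):
        return False


def start_server(ec2, fleet_id):
    """Scale the Spot Fleet to one instance, triggering a cold start."""
    ec2.modify_spot_fleet_request(SpotFleetRequestId=fleet_id, TargetCapacity=1)


def handle_interaction(event, ec2, fleet_id, public_key_hex,
                       verify=verify_discord_signature):
    """Verify the request, answer Discord's PING, and handle /start."""
    headers = {k.lower(): v for k, v in event.get("headers", {}).items()}
    body = event.get("body", "")
    signature = headers.get("x-signature-ed25519", "")
    timestamp = headers.get("x-signature-timestamp", "")
    if not verify(public_key_hex, signature, timestamp, body):
        return {"statusCode": 401, "body": "invalid request signature"}

    payload = json.loads(body)
    if payload.get("type") == 1:  # PING from Discord, reply with PONG
        return {"statusCode": 200, "body": json.dumps({"type": 1})}

    if payload.get("type") == 2 and payload.get("data", {}).get("name") == "start":
        start_server(ec2, fleet_id)
        reply = "Server is starting..."
    else:
        reply = "Unknown command."
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        # Response type 4 posts a message in the channel.
        "body": json.dumps({"type": 4, "data": {"content": reply}}),
    }
```

Passing the EC2 client and the verifier in as arguments keeps the handler easy to exercise locally with stand-ins before deploying it.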

Key Outcomes
95% Cost Savings
Reduced monthly hosting fees from $120+ to ~$6 for typical use patterns.
Sub-Minute Boots
Developed a Custom AMI with Java 21 and mcrcon pre-baked to ensure the server is playable almost immediately after a command.
Self-Healing State
Implemented an orchestration layer where the World Data survives instance termination and automatically re-attaches to fresh hardware.
Secure Public Access
Authorized non-technical users to manage complex AWS infrastructure safely through a familiar chat interface.
Challenges & Solutions
Problem: Ephemeral Compute Termination
Spot Instances can be reclaimed by AWS with only a 2-minute notice, which would normally mean losing all unsaved game progress.
Solution:
Configured a systemd shutdown service that triggers on instance termination, running an RCON 'save-all' and 'stop' command to flush data to the persistent EBS volume.
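As an illustration, the script that such a shutdown service runs could look something like the sketch below. It shells out to the `mcrcon` CLI; the host, port, 5-second wait between commands, timeout, and function names are assumptions for the example, and the RCON password would come from wherever the boot script stored it.

```python
import subprocess


def build_mcrcon_command(password, host="127.0.0.1", port=25575):
    """Build an mcrcon invocation that saves the world, then stops the server."""
    return [
        "mcrcon",
        "-H", host,        # RCON host
        "-P", str(port),   # RCON port (25575 is the Minecraft default)
        "-p", password,    # RCON password
        "-w", "5",         # wait 5 seconds between the two commands
        "save-all",
        "stop",
    ]


def graceful_stop(password, run=subprocess.run):
    """Run the save-and-stop sequence, raising if mcrcon exits non-zero."""
    return run(build_mcrcon_command(password), check=True, timeout=90)
```

A systemd unit could call a script like this from its stop hook so the flush happens before the instance powers off.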
Problem: Dynamic Fleet ID Management
Because Fleets are occasionally recreated, the Discord bot needed a way to know which Fleet ID to scale without updating the Lambda code manually.
Solution:
Integrated SSM Parameter Store as a dynamic configuration layer; the bot reads and writes the current 'active' Fleet ID to SSM to maintain a persistent link between the UI and the hardware.
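A small sketch of this lookup, assuming a boto3 SSM client and a hypothetical parameter name (`/minecraft/active-fleet-id` is made up for the example):

```python
# Hypothetical parameter name; the real one is not given in this write-up.
FLEET_ID_PARAM = "/minecraft/active-fleet-id"


def get_active_fleet_id(ssm, name=FLEET_ID_PARAM):
    """Read the Fleet ID the bot should currently scale."""
    return ssm.get_parameter(Name=name)["Parameter"]["Value"]


def set_active_fleet_id(ssm, fleet_id, name=FLEET_ID_PARAM):
    """Record a new Fleet ID, replacing any previous value."""
    ssm.put_parameter(Name=name, Value=fleet_id, Type="String", Overwrite=True)
```

With this in place, recreating the Fleet only requires writing the new ID once; the Lambda code itself stays unchanged.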
Problem: Resource-Heavy Library Packaging
Packaging the PyNaCl security library for Lambda is difficult because it requires C-extensions compiled for the specific Lambda Linux environment.
Solution:
Utilized Docker-based builds to compile the library, ensuring binary compatibility for the production environment.
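One way to do this, sketched under assumptions, is to run `pip install` inside a Lambda-compatible build container and write the result into a local `package` folder. The image name (an AWS SAM build image), the Python version, and the output path below are illustrative choices, not necessarily the ones used for this project.

```python
import subprocess
from pathlib import Path


def build_package_command(project_dir,
                          image="public.ecr.aws/sam/build-python3.12",
                          package="pynacl"):
    """Build a docker command that installs a package for the Lambda runtime.

    The image is assumed to match the Lambda Python runtime in use.
    """
    project_dir = Path(project_dir).resolve()
    return [
        "docker", "run", "--rm",
        # Mount the project so the compiled files land on the host.
        "-v", f"{project_dir}:/var/task",
        image,
        "pip", "install", package, "--target", "/var/task/package",
    ]


def build_package(project_dir, run=subprocess.run):
    """Run the containerized install, raising if docker exits non-zero."""
    return run(build_package_command(project_dir), check=True)
```

The contents of `package` can then be zipped together with the handler code for upload. The container's CPU architecture should match the Lambda function's architecture.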
Cost Analysis: From "Always-On" to "Pay-as-you-Play"
The primary driver of the original $120 bill was idle time: paying for high-spec compute while friends were asleep or at work. This new architecture shifts the majority of the cost from variable compute to a small set of predictable "flat fees."
The "Flat Fee"
Even when no one is playing, the system incurs minor costs to maintain the server's identity and data: about $3.60 per month for the Elastic IP and about $0.80 per month for the EBS volume.
Compute Efficiency
For a standard t4g.small (ARM-based) instance, the Spot market price is roughly $0.01 per hour, so a 6-hour session costs only about $0.07 in compute. That is a small fraction of the monthly flat fees.
| Component | Lazy System (24/7) | This System |
|---|---|---|
| Compute (EC2) | >$120.00 /mo | ~$0.07 per 6-hour session |
| Static IP (EIP) | $3.60 /mo | $3.60 /mo (idle) |
| Storage (EBS) | $0.80 /mo | $0.80 /mo |
| Total Monthly | >$120 /mo | ~$4.40 /mo + $0.01/hr |
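The totals in the table reduce to a simple formula: flat fees plus the hourly Spot rate times hours played. The sketch below uses the rounded figures from the table; the function names and the 160-hour example are illustrative assumptions.

```python
# Rounded figures taken from the table above (USD).
EIP_MONTHLY = 3.60
EBS_MONTHLY = 0.80
SPOT_HOURLY = 0.01
ALWAYS_ON_MONTHLY = 120.00


def monthly_cost(hours_played):
    """Flat fees plus Spot compute for the hours the server was actually up."""
    return EIP_MONTHLY + EBS_MONTHLY + SPOT_HOURLY * hours_played


def savings_vs_always_on(hours_played):
    """Fraction saved relative to the always-on baseline."""
    return 1 - monthly_cost(hours_played) / ALWAYS_ON_MONTHLY


if __name__ == "__main__":
    hours = 160  # hypothetical month of play
    print(f"cost: ${monthly_cost(hours):.2f}")
    print(f"savings: {savings_vs_always_on(hours):.0%}")
```

At 160 hours of play in a month, this works out to about $6, or roughly 95% less than the always-on baseline, which lines up with the headline figures in Key Outcomes.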
Scalability and Modpacks
While this comparison uses a vanilla server as a baseline, the logic remains consistent for high-performance modded servers. Even if a heavy modpack requires a t4g.large or m8g.large (several times the hourly cost), Spot pricing and the auto-shutdown engine ensure that I am still saving 70-95% compared to a traditional "Always-On" cloud setup.
At the time of writing, a t4g.small Spot Instance running 24/7 for a month costs $5.20, while an m8g.large would cost $19.30 in the us-east-1 region. Once the savings from eliminating idle time are factored in, the system is clearly a cost-efficient upgrade over the lazy system.
Future Work
Infrastructure as Code
Migrating the manual console setup to Terraform to allow for one-click replication of the entire VPC and server stack.
Automated S3 Backups
Implementing a nightly cron job to snapshot the EBS volume and upload compressed world data to Amazon S3 for disaster recovery.
Admin UI
Creating an intuitive UI for modifying server settings, EC2 instance types, and more.