Local AI Hosting
Docker · Ollama · Open WebUI
Deployed a fully self-contained AI stack running large language models entirely offline. Sending sensitive security research to a cloud API creates a data exfiltration risk by design. This project eliminates that risk at the infrastructure level: no prompts leave the machine, no credentials are shared with a third party, and no data is retained by an external provider.
Skills Applied
What This Project Demonstrates
Project Context
Why Host AI Locally?
This project started with a single question: what does it actually take to run AI models privately, without relying on any external service?
Eliminating the Exfiltration Risk
Every prompt sent to a cloud AI provider leaves the machine. For security research, that means malware samples, log excerpts, configuration details, and vulnerability notes are transmitted to and potentially retained by a third party. The decision to self-host was a deliberate security control: close the data exfiltration risk at the infrastructure level rather than relying on a provider's privacy policy.
No API Fees
Third-party AI APIs charge per token. Self-hosting eliminates that cost entirely, making it viable to run queries at any volume without watching a billing meter. The only recurring cost is electricity; the compute is hardware already on hand.
AI Without Abstraction
Using hosted AI tools is easy. Understanding how models are loaded, served, and managed is a different skill entirely. This project forced engagement with the underlying stack: model weights, runtime behavior, resource constraints, and interface configuration.
Always On, No Rate Limits
A local AI instance has no API quotas, no external provider outages, and no throttling. Once running, it's available whenever you need it, which matters when integrating AI-assisted workflows into daily security work.
Build Process
How It Was Built
Docker as a Security Boundary
Packaged the entire AI stack inside Docker to enforce container isolation. Ollama and Open WebUI communicate over a shared internal Docker network rather than through the host's network stack. Neither container has access to the host filesystem or other running services. This network segmentation reduces the attack surface: even if a container were compromised, lateral movement to the host or other services is contained. Containers can be torn down and rebuilt without touching the host environment.
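The layout can be sketched as a Compose file. This is an illustrative sketch, not the project's exact configuration: service names, the network name, and the volume name are placeholders, while the image names, ports, and the OLLAMA_BASE_URL variable are standard for these projects.

```yaml
# docker-compose.yml — illustrative sketch, not the project's exact file
services:
  ollama:
    image: ollama/ollama
    networks: [ai_internal]
    volumes:
      - ollama_models:/root/.ollama   # named volume, not a host path

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434  # reach Ollama by service name
    ports:
      - "3000:8080"    # only the browser UI is published; Ollama stays internal
    networks: [ai_internal]
    depends_on: [ollama]

networks:
  ai_internal: {}      # containers talk here; nothing else joins this network

volumes:
  ollama_models:
```

Only Open WebUI publishes a port; Ollama's API is reachable solely over the internal network, which is what keeps the inference engine off the host and off the LAN.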
Ollama: Local Model Inference
Ollama serves as the runtime engine for large language models. It handles model loading, memory management, and inference requests, all locally. Pulled and tested multiple models (Llama 3, Mistral, Phi-3) to evaluate performance versus resource consumption on available hardware. Learned that model selection is as much about resource constraints as capability.
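Pulling and running models goes through the Ollama CLI. The commands below are standard Ollama usage; the model tags correspond to the models evaluated here.

```shell
# Download model weights into Ollama's local store
ollama pull llama3
ollama pull mistral
ollama pull phi3

# Start an interactive session with a model (loads it into memory)
ollama run phi3

# List downloaded models and their on-disk sizes — useful when
# weighing capability against available RAM and storage
ollama list
```

Smaller models like Phi-3 load faster and fit in less RAM; larger ones trade resource headroom for capability, which is the evaluation described above.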
Open WebUI: Browser Interface
Deployed Open WebUI alongside Ollama within the same Docker setup. Open WebUI provides a ChatGPT-style browser interface that talks to Ollama's local API, making model interaction accessible without command-line prompting. Configured the service to bind to the local network for multi-device access while keeping it off the public internet.
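One way to achieve that binding is to publish the UI's port only on the host's LAN address. A sketch with `docker run` (the IP is a placeholder for the host's LAN address, and the network and container names match the illustrative setup, not necessarily this project's exact names):

```shell
# Publish Open WebUI on the host's LAN interface only (placeholder IP),
# pointing it at the Ollama container over the shared Docker network.
# Binding to a specific address keeps it off any internet-facing interface.
docker run -d --name open-webui \
  --network ai_internal \
  -e OLLAMA_BASE_URL=http://ollama:11434 \
  -p 192.168.1.50:3000:8080 \
  ghcr.io/open-webui/open-webui:main
```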
Resource Management
Running LLMs locally is resource-intensive. Monitored CPU, RAM, and temperature under load to find the ceiling of what the host hardware could handle. Adjusted model parameters and Docker resource limits to prevent instability. Takeaway: infrastructure decisions directly impact AI capability. This is as much a systems problem as a software one.
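Docker resource limits can be applied to a running container without rebuilding it. The values below are illustrative, sized to the host rather than universal defaults, and the container name is from the sketch above:

```shell
# Cap the inference container so a large model can't starve the host.
# --memory-swap equal to --memory disables swap use for the container.
docker update --memory 12g --memory-swap 12g --cpus 6 ollama

# Confirm the limits took effect and watch usage under load
docker stats ollama
```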
Log Analysis & Troubleshooting
Used Docker logs and Ollama's output streams to diagnose startup failures, model loading errors, and connectivity issues between containers. This reinforced the same log analysis skills used in SOC work: reading structured output, identifying failure points, and tracing issues back to root causes without external tooling.
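The diagnostic loop leaned on standard Docker commands; a representative set (container names follow the sketch above):

```shell
# Timestamped recent output — the first stop for a startup failure
docker logs --since 15m --timestamps open-webui

# Follow Ollama's stream live while reproducing a model-load error
docker logs -f ollama

# Check container-to-container connectivity on the internal network
# (assumes curl is present in the Open WebUI image; /api/version is
# Ollama's lightweight health/version endpoint)
docker exec open-webui curl -s http://ollama:11434/api/version
```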
AI as a Research Tool
Once the stack was running, used the local AI instance as a research and documentation aid, drafting queries, summarizing documentation, and exploring edge cases in configurations. This created a feedback loop: AI helped build the infrastructure, and the infrastructure became a better tool for ongoing work. Prompt engineering became a practical skill, not a theoretical one.
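Beyond the browser UI, the stack can be queried directly over Ollama's REST API, which is how it slots into scripted workflows. This assumes Ollama's port 11434 is published to the host (it is internal-only in the sketch above); the prompt is a placeholder:

```shell
# Send a one-shot prompt to the local model — nothing leaves the machine.
# "stream": false returns a single JSON response instead of a token stream.
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Summarize the security implications of this log excerpt: ...",
  "stream": false
}'
```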
Lessons Learned
Key Takeaways
Interested in what Eduardo built?
Let's talk about how this applies to your open roles.