Infrastructure · AI / LLMs · Privacy · Docker

Local AI Hosting
Docker · Ollama · Open WebUI

Deployed a fully self-contained AI stack running large language models entirely offline. Sending sensitive security research to a cloud API creates a data exfiltration risk by design. This project eliminates that risk at the infrastructure level: no prompts leave the machine, no credentials are shared with a third party, and no data is retained by an external provider.

Stack Docker · Ollama · Open WebUI · Linux
Security Focus Data Sovereignty · Container Isolation · Attack Surface Reduction
Environment Fully Offline · Self-Contained · Zero API Exposure

Skills Applied

What This Project Demonstrates

Linux CLI
Docker & Containerization
Ollama (LLM Inference Engine)
Open WebUI
Model Management
Resource Management
Data Privacy Principles
Interface Configuration
Log Analysis
Troubleshooting
LLMs & AI Tooling
Research

Project Context

Why Host AI Locally?

This project started with a single question: what does it actually take to run AI models privately, without relying on any external service?

Data Sovereignty

Eliminating the Exfiltration Risk

Every prompt sent to a cloud AI provider leaves the machine. For security research, that means malware samples, log excerpts, configuration details, and vulnerability notes are transmitted to and potentially retained by a third party. The decision to self-host was a deliberate security control: close the data exfiltration risk at the infrastructure level rather than relying on a provider's privacy policy.

Cost Control

No API Fees

Third-party AI APIs charge per token. Self-hosting eliminates that cost entirely, making it viable to run queries at any volume without watching a billing meter. The only ongoing cost is electricity, on hardware you already own.

Hands-On Learning

AI Without Abstraction

Using hosted AI tools is easy. Understanding how models are loaded, served, and managed is a different skill entirely. This project forced engagement with the underlying stack: model weights, runtime behavior, resource constraints, and interface configuration.

Availability

Always On, No Rate Limits

A local AI instance has no API quotas, no throttling, and no dependence on an external provider's uptime. Once running, it's available whenever you need it, which matters when integrating AI-assisted workflows into daily security work.

Build Process

How It Was Built

01

Docker as a Security Boundary

Packaged the entire AI stack inside Docker to enforce container isolation. Ollama and Open WebUI communicate over a shared internal Docker network, not through the host system. Neither container can reach the host filesystem beyond its own named volumes, or touch other running services. This network segmentation reduces the attack surface: even if a container were compromised, lateral movement to the host or other services is contained. Containers can be torn down and rebuilt without touching the host environment.
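The setup described above can be sketched as a pair of `docker run` commands. This is an illustrative recreation, not the project's actual deployment: container names, the network name, and the published port are placeholders, while `OLLAMA_BASE_URL`, port `11434` (Ollama's default), and port `8080` (Open WebUI's container port) are the real defaults for these images.

```shell
# Internal bridge network shared only by the two containers
docker network create ai-stack

# Ollama: model runtime, reachable from other containers on
# ai-stack by name; models persist in a named volume, not on
# the host filesystem directly
docker run -d --name ollama \
  --network ai-stack \
  -v ollama-models:/root/.ollama \
  ollama/ollama

# Open WebUI: talks to Ollama over the internal network by
# container name; only the browser port is published to the host
docker run -d --name open-webui \
  --network ai-stack \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://ollama:11434 \
  -v open-webui-data:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```

Because both containers sit on `ai-stack`, Ollama never needs a published port at all: its API is reachable only from inside the Docker network.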

02

Ollama: Local Model Inference

Ollama serves as the runtime engine for large language models. It handles model loading, memory management, and inference requests, all locally. Pulled and tested multiple models (Llama 3, Mistral, Phi-3) to evaluate performance versus resource consumption on available hardware. Learned that model selection is as much about resource constraints as capability.
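Model management in this kind of setup runs through Ollama's CLI inside the container. A sketch, assuming the container is named `ollama` as in a typical deployment (the model tags are real Ollama library names; which quantized builds were actually pulled is not stated in the writeup):

```shell
# Pull candidate models for side-by-side evaluation
docker exec -it ollama ollama pull llama3
docker exec -it ollama ollama pull mistral
docker exec -it ollama ollama pull phi3

# List installed models with their on-disk sizes
docker exec -it ollama ollama list

# Quick interactive smoke test of the smallest model
docker exec -it ollama ollama run phi3
```

Comparing `ollama list` sizes against available RAM is the fastest way to see the capability-versus-resources trade-off described above.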

03

Open WebUI: Browser Interface

Deployed Open WebUI alongside Ollama within the same Docker setup. Open WebUI provides a ChatGPT-style browser interface that talks to Ollama's local API, making model interaction accessible without command-line prompting. Configured the service to bind to the local network for multi-device access while keeping it off the public internet.
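The LAN-only binding can be verified from the host. A minimal check, assuming the WebUI was published on port 3000 and that `192.168.1.50` stands in for the host's private address (both placeholders, not values from the project):

```shell
# Confirm what interface the published port listens on
ss -tlnp | grep ':3000'

# From another device on the LAN: expect an HTTP response line
curl -sI http://192.168.1.50:3000 | head -n 1
```

The complementary check is from outside: the router/firewall should have no port-forward for 3000, so the service stays unreachable from the public internet.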

04

Resource Management

Running LLMs locally is resource-intensive. Monitored CPU, RAM, and temperature under load to find the ceiling of what the host hardware could handle. Adjusted model parameters and Docker resource limits to prevent instability. Takeaway: infrastructure decisions directly impact AI capability. This is as much a systems problem as a software one.
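Docker's resource controls make those limits explicit. An illustrative sketch; the numbers are placeholders to tune against your own hardware, not values from the project:

```shell
# Cap the inference container so a large model can't starve the host
docker update --memory 12g --memory-swap 12g --cpus 6 ollama

# Watch live CPU and RAM usage per container while running prompts
docker stats ollama open-webui
```

Watching `docker stats` during a long generation shows exactly when a model's working set approaches the cap, which is the signal to drop to a smaller or more aggressively quantized model.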

05

Log Analysis & Troubleshooting

Used Docker logs and Ollama's output streams to diagnose startup failures, model loading errors, and connectivity issues between containers. This reinforced the same log analysis skills used in SOC work: reading structured output, identifying failure points, and tracing issues back to root causes without external tooling.
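The diagnostic workflow described above boils down to a few commands. Container names match the earlier sketch and are assumptions; `/api/tags` is Ollama's real model-listing endpoint, and the last step assumes `curl` is present inside the Open WebUI image:

```shell
# Follow Ollama's output stream while reproducing a failure
docker logs -f ollama

# Last 100 Open WebUI lines, timestamped, to correlate a browser
# error with a backend event
docker logs --tail 100 -t open-webui

# Confirm the WebUI container can actually resolve and reach
# Ollama over the internal network
docker exec -it open-webui curl -s http://ollama:11434/api/tags
```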

06

AI as a Research Tool

Once the stack was running, used the local AI instance as a research and documentation aid, drafting queries, summarizing documentation, and exploring edge cases in configurations. This created a feedback loop: AI helped build the infrastructure, and the infrastructure became a better tool for ongoing work. Prompt engineering became a practical skill, not a theoretical one.
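Beyond the browser interface, the same instance is scriptable over Ollama's HTTP API, which is what makes it usable inside automated workflows. A hedged sketch assuming the stack from the earlier steps is running on this host (`/api/generate` is the real endpoint; the model name and prompt are illustrative):

```shell
# One-shot, non-streaming query against the local inference API
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Summarize the security implications of exposing the Docker socket to a container.",
  "stream": false
}'
```

Because the endpoint is local, this kind of query can be dropped into shell scripts and research notes without any API key or outbound traffic.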

Lessons Learned

Key Takeaways

Docker networking between containers: Ollama and Open WebUI communicate via an internal Docker network, not the host system.
Model selection is a resource constraint problem. Larger models produce better output but require significantly more RAM and CPU.
Data privacy is an infrastructure decision, not just a policy. Running locally eliminates the data-sharing risk at the source.
Log analysis applies everywhere. The same skills used to read firewall logs transfer directly to diagnosing container failures.
AI as a productivity multiplier: using a local LLM to navigate sparse documentation reduced research time significantly.
Self-hosted AI stacks are viable on consumer hardware, but require intentional resource management to stay stable.

Interested in what Eduardo built?

Let's talk about how this applies to your open roles.