Self-Hosted AI Coding Platforms: On-Prem, Customer-Cloud, and What “Self-Hosted” Really Means (2026)
“Self-hosted” gets used as marketing shorthand for two genuinely different deployment models: on-premises container installations (Docker, Kubernetes on customer hardware) and customer-cloud infrastructure-as-code installations (Terraform applying into the customer’s AWS / GCP / Azure account). Both keep operational ownership with the customer. They have different operational characteristics, different cost profiles, and different fit patterns. Treating them as one category obscures the choice that matters.
This post separates the two patterns, compares which AI coding platforms actually deliver each, and explains how the choice maps to real organizational constraints — existing cloud footprint, regulatory posture, on-prem investment, model-weight management appetite.
Two Self-Hosted Patterns
Pattern A: On-prem container deployment
The platform ships as Docker images or Helm charts. The customer deploys to their own Kubernetes cluster, to on-premises VMs, or to a managed Kubernetes service (EKS, AKS, GKE); the last option runs in a cloud account but keeps the same container-operations model. In every case the customer manages persistence (Postgres, OpenSearch), networking, observability, and TLS.
Fit: organizations with significant on-prem investment, classified-program engineering teams who can’t use any cloud, or regulated industries with strict on-prem-only mandates. Tabnine Enterprise and Sourcegraph Cody self-hosted both follow this pattern.
Pattern B: Customer-cloud Terraform deployment
The platform ships as a Terraform module. The customer applies it to their own AWS account. The platform runs as Lambda + DynamoDB + S3 + Bedrock — standard cloud-native services the customer already uses for other workloads. No new infrastructure paradigm to manage.
Fit: organizations already running production AWS, regulated industries with cloud-deployed compliance posture (HIPAA-ready AWS, SOC 2-scoped AWS, FedRAMP GovCloud), and engineering teams that prefer serverless to container-managed footprint. OutcomeOps follows this pattern by default.
Comparison: Self-Hosted Patterns Across AI Coding Tools
Cells marked ⚠ reflect partial support, claimed-but-not-verified availability, or capabilities limited to higher tiers.
| Tool | On-prem container | Customer-cloud Terraform | Customer manages model weights |
|---|---|---|---|
| OutcomeOps | No (cloud-native by design) | Yes — AWS Bedrock | No — Bedrock-managed |
| Tabnine Enterprise | Yes (Docker) | No | Yes — on-prem model |
| Sourcegraph Cody Enterprise | Yes | ⚠ | ⚠ Configurable |
| Augment Code | No | ⚠ VPC tier claimed | No |
| GitHub Copilot | No | No | No |
| Cursor | No | No | No |
Status as of May 2026. Verify on vendor docs.
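For teams that want to keep this comparison in a reviewable form, here is a minimal Python sketch that transcribes the first two capability columns of the table and filters by pattern. The `Tool` class, the `shortlist` helper, and the field names are illustrative assumptions, not part of any vendor's tooling, and the data is only as current as the table itself.

```python
from dataclasses import dataclass

# Support levels transcribed from the table above (status as of May 2026).
YES, NO, PARTIAL = "yes", "no", "partial"


@dataclass(frozen=True)
class Tool:
    name: str
    on_prem_container: str
    customer_cloud_terraform: str


TOOLS = [
    Tool("OutcomeOps", NO, YES),
    Tool("Tabnine Enterprise", YES, NO),
    Tool("Sourcegraph Cody Enterprise", YES, PARTIAL),
    Tool("Augment Code", NO, PARTIAL),
    Tool("GitHub Copilot", NO, NO),
    Tool("Cursor", NO, NO),
]


def shortlist(pattern: str, include_partial: bool = False) -> list[str]:
    """Return the names of tools that support the given pattern column."""
    accepted = {YES, PARTIAL} if include_partial else {YES}
    return [t.name for t in TOOLS if getattr(t, pattern) in accepted]


if __name__ == "__main__":
    print(shortlist("on_prem_container"))
    # ['Tabnine Enterprise', 'Sourcegraph Cody Enterprise']
    print(shortlist("customer_cloud_terraform", include_partial=True))
    # ['OutcomeOps', 'Sourcegraph Cody Enterprise', 'Augment Code']
```

Cells marked ⚠ are excluded by default, so a partial or unverified claim has to be opted into explicitly.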
At Liberty Mutual we built Fusion, a Jenkins, Chef, and Docker Datacenter platform that ran side by side in Liberty's AWS account and on-prem. Engineering teams maintained their own Fusionfiles in their own repos: declarative configs that defined sidecars (IBM DataPower, Nginx Enterprise), the data layer (MongoDB, Redis, RDS, ElastiCache), and pre- and post-deployment steps. The platform shipped as code; the customer applied it. By 2017, Liberty had over 330 services running in containers and hundreds of deployments a day, and Liberty Mutual became one of Docker Inc.’s flagship case studies, not because of what we sold them but because of what they ran themselves.
That is what self-hosted actually means. The platform is the customer’s. The vendor’s job is to ship a clean enough pattern that the customer can run it without phoning home. The same logic decides which AI coding platforms compound value past year one.
Operational Trade-offs: On-Prem vs. Customer-Cloud
On-prem container path
Strengths: Full control of physical infrastructure. No cloud egress cost. Compatible with classified programs that have no cloud option. Existing on-prem ops tooling (Prometheus, ELK, on-prem Kubernetes) applies.
Costs: Customer manages model weights and GPU infrastructure (or accepts smaller / older models). Updates require pulling new container images and applying through the customer’s release process. Scaling AI workloads on-prem is operationally heavier than scaling on a cloud-managed model service.
Customer-cloud Terraform path
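Strengths: Cloud-native services the customer already uses. AWS Bedrock provides managed model invocation, with no GPU management and no weight downloads. Existing CloudWatch, GuardDuty, and AWS Backup tooling applies. The deployment runs inside the customer's existing AWS compliance boundary (HIPAA, SOC 2, FedRAMP), so it inherits that boundary's controls once the customer confirms each service it uses is in scope for the relevant program.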
Costs: Cloud egress and model invocation costs (paid directly to AWS, no vendor markup). Requires the customer to operate in AWS — not a fit for organizations without an AWS footprint.
For organizations already running on AWS, the customer-cloud path has materially lower operational overhead than the on-prem container path. For classified programs that cannot use any cloud, on-prem container is the only option. Most enterprises in 2026 land on customer-cloud because their AWS footprint is already mature.
When On-Prem Is the Right Answer
Specific situations favor the on-prem container path:
- No cloud allowed. Classified-program work where any commercial cloud is off-limits. Tabnine on-prem or Cody self-hosted on customer hardware are the realistic options.
- Existing on-prem investment. Organizations that just stood up large on-prem GPU clusters for ML/AI workloads. AI coding tooling can amortize the same infrastructure.
- Network-isolated regulated environments. Pharma manufacturing networks, OT/ICS environments, certain healthcare clinical networks. The customer’s existing on-prem Kubernetes cluster is the deployment target.
- Customer wants full model weight control. Some organizations want to fine-tune the model on internal data and host the resulting weights themselves — an on-prem path enables that.
For air-gapped defense work specifically (where on-prem is one of several valid answers), see Air-Gapped AI Coding for Defense and Aerospace.
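The situations listed above, combined with the AWS-footprint rule from the trade-offs section, can be written down as a small decision helper. This is a sketch under stated assumptions: the `Constraints` fields and the `recommend` function are hypothetical names, and treating any single on-prem factor as decisive is a simplification of what is usually a weighted judgment.

```python
from dataclasses import dataclass


@dataclass
class Constraints:
    # Hypothetical field names for the factors discussed in this post.
    cloud_allowed: bool = True
    network_isolated: bool = False
    has_onprem_gpu_cluster: bool = False
    wants_model_weight_control: bool = False
    has_aws_footprint: bool = False


def recommend(c: Constraints) -> tuple[str, list[str]]:
    """Return (pattern, reasons) for a set of organizational constraints."""
    reasons = []
    if not c.cloud_allowed:
        reasons.append("no commercial cloud allowed")
    if c.network_isolated:
        reasons.append("network-isolated environment")
    if c.has_onprem_gpu_cluster:
        reasons.append("existing on-prem GPU investment")
    if c.wants_model_weight_control:
        reasons.append("wants to host its own model weights")
    if reasons:
        # Assumption: any one on-prem factor is treated as decisive.
        return "on-prem container", reasons
    if c.has_aws_footprint:
        return "customer-cloud Terraform", ["existing AWS footprint"]
    # Neither rule applies, so the sketch declines to pick a pattern.
    return "undecided", ["no listed constraint applies; evaluate both"]


if __name__ == "__main__":
    print(recommend(Constraints(cloud_allowed=False, has_aws_footprint=True)))
    print(recommend(Constraints(has_aws_footprint=True)))
```

Returning the reasons alongside the pattern keeps the recommendation auditable, which matters when the decision has to be defended in an architecture review.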
When Customer-Cloud Is the Right Answer
The customer-cloud Terraform path fits the majority of enterprise scenarios:
- Existing AWS footprint at scale. The platform deploys into infrastructure the team already operates. No new ops paradigm.
- HIPAA-ready, SOC 2-scoped, or FedRAMP-authorized AWS. The existing compliance posture covers the deployment because it runs inside that posture.
- Bedrock model access. Anthropic Claude (and other Bedrock models) is the target generation engine; Bedrock’s managed inference avoids GPU management.
- Predictable cost economics. Direct AWS Bedrock charges, paid by the customer to AWS, no vendor markup. OutcomeOps generates features at $2–$4 each.
For more on the customer-cloud architecture specifically, see AI Coding Tool That Deploys in Your AWS Account.
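To turn the per-feature figure above into a budget line, the arithmetic can be sketched as follows. The function name and the 150-features-per-month volume are illustrative assumptions; the $2 to $4 range is the one quoted above, and the estimate covers generation cost only, not other AWS charges such as Lambda, DynamoDB, S3, or egress.

```python
def monthly_generation_cost(
    features_per_month: int,
    low_per_feature: float = 2.0,   # lower bound quoted above, in USD
    high_per_feature: float = 4.0,  # upper bound quoted above, in USD
) -> tuple[float, float]:
    """Return the (low, high) monthly generation spend in USD."""
    if features_per_month < 0:
        raise ValueError("features_per_month must be non-negative")
    return (
        features_per_month * low_per_feature,
        features_per_month * high_per_feature,
    )


if __name__ == "__main__":
    # 150 features per month is an illustrative volume, not a benchmark.
    low, high = monthly_generation_cost(150)
    print(f"${low:,.0f} to ${high:,.0f} per month")  # $300 to $600 per month
```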
What the Customer Owns Operationally
Either pattern shifts operational ownership to the customer. Using the AWS services from the customer-cloud path as the example, the customer is now responsible for:
- Capacity. Scaling Lambda concurrency (or container replicas) with usage. Standard cloud / orchestration patterns apply.
- Observability. CloudWatch metrics, X-Ray traces, custom dashboards. The existing observability stack covers the platform.
- Security monitoring. CloudTrail anomaly detection, GuardDuty findings, Security Hub aggregation. The platform shows up in existing tooling.
- Backup and DR. AWS Backup for DynamoDB and S3, cross-region replication where required. Same patterns the customer uses for other production workloads.
- Updates. Pull new Terraform module versions on the customer’s schedule, apply through normal change-control. No vendor-controlled auto-update.
None of this is unfamiliar territory for an enterprise team running production AWS. The trade-off is operational ownership in exchange for full control over data residency, audit boundary, cost economics, and upgrade timeline. For most enterprise teams in 2026, the trade favors self-hosted strongly.
How to Evaluate
The two-week PoC for self-hosted deployment varies by pattern. For the customer-cloud Terraform path:
- Days 1–3: Apply the Terraform into a non-production AWS account. Verify that the deployed resources (the architectural bill of materials) match patterns the team already runs.
- Rest of week 1: Connect representative repositories, generate code, and inspect audit logs in the customer's DynamoDB tables.
- Week 2: Compliance and ops review. Verify that existing operational tooling covers the deployment, with no surprises in CloudWatch and no unexpected egress in VPC Flow Logs.
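The week 2 egress check can be partly automated. The sketch below assumes flow logs in the default (version 2) space-separated format and a customer-supplied allowlist of CIDR ranges; the function name, the sample records, and the allowlist are hypothetical. It flags accepted flows whose destination falls outside the allowlist, which is a starting point for review and not a complete egress audit.

```python
import ipaddress

# Field order of the default (version 2) VPC Flow Log record.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def unexpected_egress(lines, allowed_cidrs):
    """Yield (dstaddr, dstport, bytes) for accepted flows outside the allowlist."""
    allowed = [ipaddress.ip_network(c) for c in allowed_cidrs]
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip malformed or custom-format records
        rec = dict(zip(FIELDS, parts))
        if rec["log_status"] != "OK" or rec["action"] != "ACCEPT":
            continue  # NODATA/SKIPDATA records carry "-" placeholders
        dst = ipaddress.ip_address(rec["dstaddr"])
        if not any(dst in net for net in allowed):
            yield rec["dstaddr"], int(rec["dstport"]), int(rec["bytes"])


# Hypothetical sample records: one in-VPC flow, one external flow, one NODATA.
SAMPLE = [
    "2 123456789012 eni-0abc 10.0.1.5 10.0.2.9 44321 443 6 10 8400 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.5 203.0.113.7 44322 443 6 12 9100 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc - - - - - - - 1700000000 1700000060 - NODATA",
]

if __name__ == "__main__":
    for hit in unexpected_egress(SAMPLE, ["10.0.0.0/16"]):
        print(hit)  # ('203.0.113.7', 443, 9100)
```

In practice the allowlist would include the VPC CIDR plus any address ranges the team has approved. Because flow logs record both directions, reply traffic to external clients will also show up and needs to be triaged.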
Book an enterprise briefing to start the PoC.
Related reading
- AI Coding Tool That Deploys in Your AWS Account — the customer-cloud architecture.
- Enterprise AI Coding That Stays in Your Infrastructure — the three-layer framing.
- Air-Gapped AI Coding for Defense and Aerospace — on-prem and air-gap.
- AI Coding Tools for Regulated Industries — the compliance lens.