Testing Infrastructure

Real Infrastructure.
Real Results.

No mocked integrations. No fake environments. EnGenAI deploys code to real GKE clusters for testing, so a failure that would hit production surfaces in the sandbox first.

The Mock Testing Problem

Mocked tests pass. Production fails. The classic scenario: your mock returns what you told it to return — not what the real system does.

Race Conditions

Mocks execute synchronously. Real systems have network latency, concurrent writes, and transaction lock contention. Your mock never sees any of it.
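
To make the lost-update case concrete, here is a minimal sketch. It assumes a hypothetical in-process `Counter` standing in for a shared database row; none of these names come from the EnGenAI codebase. A barrier forces every client to read before any client writes, which is the interleaving a sequential mock never produces.

```python
import threading


class Counter:
    """Hypothetical stand-in for a shared row that several clients update."""

    def __init__(self) -> None:
        self.value = 0


def increment_sequentially(counter: Counter, clients: int) -> int:
    # What a synchronous mock exercises: one read-modify-write at a time.
    for _ in range(clients):
        current = counter.value
        counter.value = current + 1
    return counter.value


def increment_concurrently(counter: Counter, clients: int) -> int:
    # The barrier guarantees every client has read before any client writes,
    # so each write is based on a stale value.
    barrier = threading.Barrier(clients)

    def worker() -> None:
        current = counter.value
        barrier.wait()
        counter.value = current + 1

    threads = [threading.Thread(target=worker) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


print(increment_sequentially(Counter(), 2))  # 2
print(increment_concurrently(Counter(), 2))  # 1: one update was lost
```

The sequential path passes every assertion you would write against a mock. The concurrent path loses an update with the same business logic.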

Auth Edge Cases

Mocking auth means mocking your own assumptions. Token expiry during long requests, RBAC enforcement at the database level — invisible to mocked tests.
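
A small sketch of the token-expiry case, under stated assumptions: `Token`, `is_valid`, and `run_long_request` are hypothetical names for illustration, and time is passed in as plain numbers so the example stays deterministic. It is not the platform's auth code.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical access token with an absolute expiry timestamp (seconds)."""

    expires_at: float


def is_valid(token: Token, now: float) -> bool:
    return now < token.expires_at


def run_long_request(token: Token, start: float, duration: float) -> str:
    # A mocked auth layer usually checks the token once, at the start.
    if not is_valid(token, start):
        return "rejected at start"
    # A real backend may check again when it commits or calls a downstream
    # service, by which time the token can have expired.
    finish = start + duration
    if not is_valid(token, finish):
        return "expired mid-request"
    return "ok"


token = Token(expires_at=100.0)
print(run_long_request(token, start=90.0, duration=5.0))   # ok
print(run_long_request(token, start=90.0, duration=30.0))  # expired mid-request
```

A test that only mocks the first check sees "ok" in both cases.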

Real Network Latency

Timeout thresholds that work in mocks fail under real load. Resource limits that look fine locally hit hard in Kubernetes. Only real infrastructure reveals them.

"A mock that passes is only as good as the assumptions you built it on."

How Live Testing Works

Push code → Docker build → GCR push → ArgoCD sync → Live endpoint ready. Under 5 minutes. Every time.

Developer / Agent: pushes feature branch
→ GitHub: feature branch triggers CI
→ GitHub Actions: runs CI pipeline, builds Docker image, pushes to GCR (Artifact Registry)
→ ArgoCD: syncs manifest, deploys
→ GKE Sandbox Cluster: Pod engenai-app (test version), Pod engenai-backend (test version), Pod PostgreSQL (ephemeral test DB)
→ Live Endpoint Ready: https://sandbox-{id}.dev.engenai.app
→ Run Integration Tests: against real endpoint
→ Results → Agent → Canvas: pass/fail surfaced in real time

<5 minutes from commit to live endpoint
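
Between the deploy and the test run, something has to wait for the endpoint to come up. The sketch below shows one way that wait could look. `wait_until_ready` and its parameters are hypothetical, and the 300-second default simply mirrors the five-minute figure above; the page does not describe how EnGenAI actually detects readiness.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Usage with a fake probe that becomes ready on the third attempt.
attempts = iter([False, False, True])
ready = wait_until_ready(lambda: next(attempts), timeout_s=10.0,
                         interval_s=0.0, sleep=lambda _: None)
print(ready)  # True
```

In a real pipeline the probe would issue an HTTP request against the sandbox URL. It is injected here so the sketch runs without a cluster.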

Test Types That Run

924 tests across the platform. Every PR runs the full suite — no skipping, no selective execution in CI.

Unit Tests
pytest / Jest

Individual function testing with mocks for external dependencies. Validates business logic in isolation — no network, no database.

Test count: ~200 tests
Runtime: <30s

Integration Tests
real PostgreSQL

API endpoint testing against a real PostgreSQL database. Catches constraint violations, transaction rollbacks, and auth edge cases invisible to unit tests.

Test count: ~150 tests
Runtime: ~2 min

Type Checking
mypy / TypeScript

Full static analysis across Python (mypy) and TypeScript. Zero errors are enforced in CI: a single type error blocks the merge. One of these checks caught a real production bug.

Error tolerance: 0 errors
Enforced: CI gate

Linting
ruff / ESLint

Code style enforcement on every pull request. Ruff for Python, ESLint for TypeScript and React. Violations block the PR — no exceptions, no overrides.

On violation: PR blocked
Runs on: every PR

Bugs That Only Real Infrastructure Catches

We've caught bugs in production-equivalent environments that no mock would have found.

Real Databases

Catch constraint violations, unique key conflicts, and transaction isolation problems that in-memory stores silently swallow.
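
A minimal sketch of the unique-key case. One loud assumption: it uses Python's built-in `sqlite3` as a stand-in SQL engine so the example is self-contained, whereas the platform tests against PostgreSQL. The `users` table and helper names are hypothetical.

```python
import sqlite3


def insert_into_fake(store: dict, email: str, name: str) -> None:
    # A dict-backed fake has no constraints: a duplicate key just overwrites.
    store[email] = name


def insert_into_sql(conn: sqlite3.Connection, email: str, name: str) -> None:
    conn.execute("INSERT INTO users (email, name) VALUES (?, ?)", (email, name))


def demo() -> tuple:
    fake: dict = {}
    insert_into_fake(fake, "a@example.com", "Ada")
    insert_into_fake(fake, "a@example.com", "Alan")  # silently swallowed

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT NOT NULL)")
    insert_into_sql(conn, "a@example.com", "Ada")
    try:
        insert_into_sql(conn, "a@example.com", "Alan")
        conflict_raised = False
    except sqlite3.IntegrityError:
        # The engine enforces the primary key and rejects the duplicate.
        conflict_raised = True
    finally:
        conn.close()
    return len(fake), conflict_raised


print(demo())  # (1, True)
```

The fake ends up with one row and no error. The SQL engine raises, which is the behavior your API has to handle.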

Real Networks

Catch timeout assumptions that fail under real network conditions. Real retries, real circuit breakers, real latency spikes.
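
For reference, this is roughly the kind of retry logic such tests exercise. It is a hedged sketch of retry with exponential backoff, not EnGenAI's implementation: `call_with_retries` and its defaults are hypothetical, and the sleep function is injectable so the example runs instantly.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `operation` on TimeoutError, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the timeout to the caller
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")


# Usage: an operation that times out twice, then succeeds.
calls = []

def flaky() -> str:
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("simulated slow network")
    return "ok"

print(call_with_retries(flaky, sleep=lambda _: None))  # ok
```

Against a mock, the `except` branch never runs. Against a real network, it is the code path that decides whether a latency spike becomes an outage.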

Real Kubernetes

Catch resource limit errors, pod evictions, and readiness probe failures that only appear when containers run under real K8s scheduling.

Real Load

Catch concurrency issues that only emerge under realistic request patterns. Connection pool exhaustion. Lock contention. Race conditions.
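
Connection pool exhaustion is easy to picture with a toy pool. The sketch below assumes a hypothetical `ConnectionPool` built on a semaphore; real database drivers and poolers behave similarly in spirit, but their APIs differ.

```python
import threading


class ConnectionPool:
    """Hypothetical fixed-size pool: each slot represents one DB connection."""

    def __init__(self, size: int) -> None:
        self._slots = threading.BoundedSemaphore(size)

    def acquire(self, timeout_s: float) -> bool:
        # Returns False if no connection frees up within the timeout.
        return self._slots.acquire(timeout=timeout_s)

    def release(self) -> None:
        self._slots.release()


pool = ConnectionPool(size=2)
print(pool.acquire(0.01))  # True
print(pool.acquire(0.01))  # True
print(pool.acquire(0.01))  # False: pool exhausted, the third caller times out
pool.release()
print(pool.acquire(0.01))  # True again once a connection is returned
```

A single-request test never reaches the third `acquire`. Realistic request patterns do.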

<5 min from commit to live test endpoint
1,642+ tests in the current suite
0 mocks used in integration tests
