IaC Principles

Infrastructure as Code is not a tool or a technology — it is a discipline and a philosophy. It means applying the same practices used to build reliable software (version control, testing, code review, automation) to the infrastructure that software runs on.

The shift from manually managed servers to code-defined infrastructure is one of the most consequential changes in how modern systems are built and operated.


Traditional “Iron Age” infrastructure was static. Servers were physical machines, provisioned by hand, configured manually, and treated as long-lived, precious resources - the “pet” model. Changing them was risky. Replacing them was expensive.

Cloud-age infrastructure is dynamic. Resources are API-driven, provisioned in minutes, and designed to be disposable - the “cattle” model.

| Iron Age | Cloud Age |
| --- | --- |
| Physical servers, manual provisioning | API-driven, programmatic provisioning |
| Servers as pets - named, cherished, maintained | Servers as cattle - numbered, replaced on failure |
| Infrastructure changes are risky, infrequent | Infrastructure changes are routine, automated |
| Snowflake configurations (unique, fragile) | Reproducible, identical environments |
| Change management is a bottleneck | Change management is embedded in code review |

Why IaC Exists: The Core Problem It Solves

Manual infrastructure management fails at scale for three reasons:

  1. No repeatability - Two engineers provisioning the same environment by hand produce different results. Debugging becomes archaeology.
  2. No shareability - Tribal knowledge locked in one engineer’s head becomes a bus-factor risk.
  3. Drift - Environments configured by hand diverge over time. Production works; staging doesn’t. The cause is never documented.

IaC replaces all three failure modes with code:

| Problem | IaC Solution |
| --- | --- |
| Repeatability | Same code produces identical infrastructure every time |
| Shareability | Code lives in Git: reviewable, forkable, discoverable |
| Drift | Desired state is declared; any deviation can be detected and corrected automatically |
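The drift row can be illustrated with a minimal sketch: a declared desired state is compared with the observed state, and every deviation is reported. The function name `detect_drift` and the setting names are hypothetical, chosen only for illustration; real tools compare far richer resource models.

```python
from typing import Any, Dict, Tuple


def detect_drift(
    desired: Dict[str, Any], actual: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (desired_value, actual_value)} for each difference.

    A setting missing on one side is reported with None on that side.
    """
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical example: the declared state of a VM versus what is running.
desired_state = {"machine_type": "e2-medium", "disk_gb": 50}
actual_state = {"machine_type": "e2-medium", "disk_gb": 100, "public_ip": True}

drift = detect_drift(desired_state, actual_state)
for setting, (want, have) in drift.items():
    print(f"{setting}: declared {want!r}, found {have!r}")
```

Because the desired state is data, correcting drift reduces to re-applying the declaration rather than investigating who changed what by hand.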

IaC rests on three practices that every team must adopt before anything else:

1. Define Everything as Code

If it’s not in code, it doesn’t exist. This includes:

  • Compute resources (VMs, GKE node pools, Cloud Run services)
  • Networking (VPCs, subnets, firewall rules, load balancers)
  • IAM (service accounts, bindings, organization policies)
  • Storage (buckets, databases, Pub/Sub topics)
  • Configuration (environment variables, feature flags)

2. Continually Test and Deliver All Work in Progress

IaC that is only applied once a week is not continuous delivery — it’s batch infrastructure changes. IaC should be held to the same standard as application code: every commit triggers validation, every PR triggers a plan review, and merging to main triggers automated deployment.

3. Build Small, Simple Pieces That Can Change Independently

Monolithic infrastructure stacks — one giant Terraform configuration that manages everything — are the infrastructure equivalent of a monolithic application. When something changes, everything is at risk. Small, composable stacks with clear interfaces fail in isolation and change safely.
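One way to reason about "fail in isolation" is to compute which stacks a change can reach through declared dependencies. The sketch below is an illustration under assumptions: the stack names and the `blast_radius` helper are hypothetical, and dependencies are modeled as a simple mapping from each stack to the stacks it consumes.

```python
from typing import Dict, Set


def blast_radius(changed: str, depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return the changed stack plus every stack that depends on it,
    directly or transitively."""
    affected = {changed}
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for stack, dependencies in depends_on.items():
            if current in dependencies and stack not in affected:
                affected.add(stack)
                frontier.append(stack)
    return affected


# Hypothetical layout: each stack lists the stacks it depends on.
stacks = {
    "network": set(),
    "cluster": {"network"},
    "database": {"network"},
    "app": {"cluster", "database"},
    "monitoring": {"cluster"},
}

print(sorted(blast_radius("database", stacks)))  # only database and app
print(sorted(blast_radius("network", stacks)))   # every stack
```

In a monolithic configuration every change behaves like the `network` case: everything is in scope. Splitting stacks along clear interfaces keeps most changes closer to the `database` case.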


The Four Key Metrics (DORA for Infrastructure)

The same DORA metrics used to measure software delivery performance apply directly to infrastructure:

| Metric | Infrastructure meaning |
| --- | --- |
| Deployment frequency | How often infrastructure changes are applied to production |
| Lead time for changes | Time from an infrastructure change commit to it running in production |
| Change failure rate | Percentage of infrastructure changes that cause incidents |
| MTTR | How quickly infrastructure failures are recovered from |
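As a rough sketch, the four metrics can be computed from a log of infrastructure changes. The `Change` record and its field names are assumptions made for this example, and it further assumes an incident starts at the moment the failing change is applied.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Change:
    committed_at: datetime
    applied_at: datetime
    caused_incident: bool = False
    recovered_at: Optional[datetime] = None


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def four_key_metrics(changes: List[Change], period_days: int) -> Dict[str, float]:
    failures = [c for c in changes if c.caused_incident]
    lead_times = [hours(c.applied_at - c.committed_at) for c in changes]
    # Assumption: recovery time is measured from when the change was applied.
    recoveries = [
        hours(c.recovered_at - c.applied_at)
        for c in failures
        if c.recovered_at is not None
    ]
    return {
        "deployments_per_day": len(changes) / period_days,
        "mean_lead_time_hours": sum(lead_times) / len(lead_times) if lead_times else 0.0,
        "change_failure_rate": len(failures) / len(changes) if changes else 0.0,
        "mttr_hours": sum(recoveries) / len(recoveries) if recoveries else 0.0,
    }


# Hypothetical four-day log: four changes, one of which caused an incident.
t0 = datetime(2024, 1, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
log = [
    Change(t0, t0 + 2 * hour),
    Change(t0 + day, t0 + day + 4 * hour, caused_incident=True,
           recovered_at=t0 + day + 5 * hour),
    Change(t0 + 2 * day, t0 + 2 * day + hour),
    Change(t0 + 3 * day, t0 + 3 * day + hour),
]
metrics = four_key_metrics(log, period_days=4)
print(metrics)
```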

Beyond the core practices, well-designed cloud infrastructure should have the following seven properties:

| Principle | What it means in practice |
| --- | --- |
| Assume systems are unreliable | Design for failure: don’t assume a VM, network call, or managed service will always be available |
| Make everything reproducible | Any resource can be destroyed and recreated from code; no manual steps required |
| Avoid snowflake systems | If it took a specific person to build it, it’s a snowflake. Replace it with code. |
| Create disposable things | Infrastructure components are replaced, not repaired. Immutability is the default. |
| Minimize variation | Dev, staging, and production environments should be as identical as possible, defined by the same code with only parameterized differences |
| Ensure any procedure can be repeated | Runbooks become pipelines. No step should depend on a human performing it by hand, where it could fail differently each time. |
| Apply software design principles | Separation of concerns, single responsibility, DRY: these apply to infrastructure code too |
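The "minimize variation" principle can be sketched as one definition that every environment shares, with a short allow-list of parameters that may differ. The function `build_environment`, the override names, and the default values are all hypothetical and stand in for whatever a real IaC tool would template.

```python
from typing import Any, Dict, Set

# Assumption: only these settings are allowed to differ between environments.
ALLOWED_OVERRIDES = {"node_count", "machine_type"}


def build_environment(name: str, **overrides: Any) -> Dict[str, Any]:
    """Build an environment definition from one shared template."""
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"unsupported overrides: {sorted(unknown)}")
    config = {
        "name": name,
        "region": "europe-west1",
        "network": f"{name}-vpc",
        "node_count": 1,
        "machine_type": "e2-small",
        "private_nodes": True,
    }
    config.update(overrides)
    return config


def variation(a: Dict[str, Any], b: Dict[str, Any]) -> Set[str]:
    """Return the settings on which two environments differ."""
    return {key for key in a if a[key] != b[key]}


staging = build_environment("staging")
production = build_environment(
    "production", node_count=3, machine_type="e2-standard-4"
)
print(sorted(variation(staging, production)))
```

Rejecting unknown overrides is a deliberate choice in this sketch: an environment cannot quietly become a snowflake, because any new difference has to be added to the shared definition first.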

Aligning Infrastructure with Organizational Strategy

The Strategic Hierarchy

The alignment of infrastructure with an organization’s broader goals is fundamentally driven by customer value. This creates a top-down strategic flow where organizational strategy drives product strategy, which drives technology strategy, and ultimately dictates infrastructure strategy. In return, each foundational technical layer must be designed to explicitly support the strategic business layers above it.

The Disconnect Between Leadership and Engineering

A significant challenge for many organizations is the communication gap between the people making strategic commercial decisions and the engineers building the foundational systems.

  • Leadership blind spots: Organizational leaders often dismiss the need for detailed infrastructure planning, mistakenly assuming that simply selecting a cloud vendor is the end of the process. When architectural problems eventually limit growth, security, or stability, these leaders tend to demand quick fixes rather than addressing the structural root causes.
  • Engineering blind spots: Conversely, engineering teams frequently focus on implementing obvious technical solutions without thoroughly understanding the commercial context or the end-user requirements. For example, an engineering team once built a highly segregated multiregion cloud architecture to strictly comply with privacy regulations. Because they did not communicate closely with the commercial strategy team, they missed a critical business requirement: users needed international roaming access while traveling abroad. This strategic misalignment resulted in massive delays, immense expense, and a necessary total rework of the system architecture.

Mapping Business Goals to Infrastructure Capabilities

To prevent costly misalignments, it is essential that everyone, from boardroom executives to software developers, understands how technical architecture either enables or hinders strategic success. Specific business goals directly require specific infrastructure capabilities:

  • Delivering continuous customer value: To release new products and features quickly and reliably, an organization requires infrastructure that easily supports developing, testing, and hosting services. Success in this area is measured by strong performance on the four key metrics (delivery lead time, deployment frequency, change fail percentage, and MTTR) and a low dependency on platform teams for routine software delivery tasks.
  • Growing revenue and expanding into new markets: Expanding into new geographic regions or launching new product lines demands the ability to rapidly deploy new hosting environments and system capacity. The effectiveness of this infrastructure is measured by the speed at which new hosting can be added and the incremental cost of each new region or product instance.
  • Providing highly reliable services: To maintain customer trust and satisfaction, systems must possess robust scaling, disaster recovery, and monitoring capabilities. Success here is tracked through standard availability and performance metrics.

The Role of Infrastructure as Code in Achieving Strategic Alignment

Broad organizational objectives ultimately filter down into specific, actionable goals for infrastructure architecture. Key strategic infrastructure requirements usually include environment consistency, self-service provisioning, automated recovery testing, and standardized platform products.

Adopting Infrastructure as Code (IaC) is an effective way to achieve these goals. For instance, IaC helps enforce environment consistency across the entire development lifecycle, so that test environments closely mirror live production environments. This one foundational infrastructure capability ripples upward to support multiple high-level business goals by:

  • Improving software delivery effectiveness and speed.
  • Minimizing the manual customization and effort required when expanding into new global regions or launching new products.
  • Making it significantly easier to automate overarching operational necessities like system security, regulatory compliance, and disaster recovery.
  • Allowing the organization to consolidate, simplify, and rationalize its overall system architecture.

IaC and CI/CD are the same discipline applied to different artifacts. The pipeline that runs terraform apply is constructed the same way as the pipeline that deploys application code — it validates, tests, stages, and promotes. The difference is the artifact at the center:

| Software CI/CD | Infrastructure CI/CD |
| --- | --- |
| Build artifact from source | Generate plan from Terraform config |
| Run unit + integration tests | Run terraform validate, tflint, Checkov |
| Deploy to staging | Apply to a test environment |
| Deploy to production | Apply to production |
| Monitor and roll back | Detect drift, run terraform apply to reconcile |
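The infrastructure column can be sketched as an ordered list of stages that stops at the first failing command. This is an illustration rather than a reference pipeline: the stage grouping, the `run_pipeline` helper, and the `tfplan` file name are assumptions, and a real setup would run the same stages once per environment from a CI system.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[str, List[List[str]]]

PIPELINE: List[Stage] = [
    ("validate", [
        ["terraform", "init", "-input=false"],
        ["terraform", "validate"],
        ["tflint"],
        ["checkov", "-d", "."],
    ]),
    ("plan", [["terraform", "plan", "-out=tfplan"]]),
    ("apply", [["terraform", "apply", "tfplan"]]),
]

# Drift check: with -detailed-exitcode, terraform plan exits with 2 when
# the live infrastructure differs from the configuration.
DRIFT_CHECK = ["terraform", "plan", "-detailed-exitcode"]


def run_pipeline(
    stages: List[Stage], run: Callable[[Sequence[str]], int]
) -> List[str]:
    """Run stages in order and return the names of the completed stages.

    Stops at the first command that returns a non-zero exit code.
    """
    completed = []
    for name, commands in stages:
        for command in commands:
            if run(command) != 0:
                return completed
        completed.append(name)
    return completed


def shell(command: Sequence[str]) -> int:
    """Runner that executes the command for real (needs the tools installed)."""
    return subprocess.run(list(command), check=False).returncode


# Dry run: record the commands instead of executing them.
recorded: List[str] = []


def dry_run(command: Sequence[str]) -> int:
    recorded.append(" ".join(command))
    return 0


completed = run_pipeline(PIPELINE, dry_run)
print(completed)
print("\n".join(recorded))
```

Passing the runner in as a parameter keeps the stage logic testable without Terraform installed; swapping `dry_run` for `shell` executes the real commands.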

Infrastructure “artifacts” (a GKE cluster, a Cloud SQL instance) take minutes to provision and are expensive to test in isolation. This shapes the testing strategy significantly; testing is covered in Testing IaC, and the delivery pipeline in IaC & CI/CD.