Terraform: Do's and Don'ts
Terraform makes infrastructure reproducible, but bad patterns make it fragile. Here are the practices we enforce at MajorLinkx and the anti-patterns we have seen destroy production environments.
Terraform Done Right
Do's and Don'ts for Production Infrastructure
State Is Sacred:
The single most destructive Terraform mistake is mismanaging state. We have seen teams commit terraform.tfstate to git repositories, run apply from local machines without locking, and share state files over Slack. Every one of these patterns leads to the same outcome: someone runs terraform apply against a stale copy, clobbers another engineer's changes, and production infrastructure drifts into an unrecoverable mess. State is not a config file. It is Terraform's record of which real resources it manages in your cloud account, it can contain secrets in plain text, and treating it casually will cost you.
At MajorLinkx, every project uses an S3 backend with DynamoDB locking from day one. The backend configuration lives in each environment's backend.tf, and we never allow local state under any circumstances. DynamoDB locking prevents concurrent applies: if two engineers run terraform apply at the same time, the second run cannot acquire the lock. By default it fails immediately with a lock error, and with -lock-timeout set it waits for the first run to finish. This is not overhead. This is the difference between a controlled infrastructure pipeline and a game of chicken with your production VPC.
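As a minimal sketch of the kind of backend.tf described above, the block below configures an S3 backend with a DynamoDB lock table. The bucket, key, region, and table names are hypothetical placeholders, not values from a real MajorLinkx project.

```hcl
# environments/production/backend.tf (path and all names are illustrative assumptions)
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"            # hypothetical state bucket
    key            = "production/terraform.tfstate"       # one state object per environment
    region         = "us-east-1"                          # assumed region
    dynamodb_table = "example-terraform-locks"            # hypothetical lock table
    encrypt        = true                                 # encrypt the state object at rest
  }
}
```

The lock table needs a string partition key named LockID. Newer Terraform releases also offer S3-native lock files as an alternative to DynamoDB, so it is worth checking the S3 backend documentation for the version you run.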
Module Composition Over Monoliths:
A single main.tf with 800 lines of resource definitions is not Terraform. It is a liability. We structure every project with a modules/ directory containing reusable, versioned modules and an environments/ directory containing per-environment root configurations (dev, staging, production). Each module handles one concern: networking, compute, database, DNS. Modules accept variables, expose outputs, and never hard-code values. This means our staging environment uses the exact same module as production, just with different input variables. When we fix a security group rule in the networking module, the fix reaches each environment the next time that environment picks up the new module version and applies.
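To make the layout concrete, here is a rough sketch of what a production root configuration could look like under this structure. The module paths follow the modules/ and environments/ layout above, but every variable and output name (vpc_cidr, vpc_id, private_subnet_ids, and so on) is a hypothetical interface chosen for illustration.

```hcl
# environments/production/main.tf (illustrative; input and output names are assumptions)

module "networking" {
  # Local path source; a registry or git source would be used to pin a module version.
  source = "../../modules/networking"

  environment = "production"
  vpc_cidr    = "10.0.0.0/16" # hypothetical CIDR, supplied per environment
}

module "database" {
  source = "../../modules/database"

  environment = "production"

  # Wire modules together through outputs rather than hard-coded IDs.
  vpc_id     = module.networking.vpc_id
  subnet_ids = module.networking.private_subnet_ids
}
```

A staging root configuration would call the same two modules and differ only in the input values it passes.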
The anti-pattern is copying and pasting resource blocks between environments. We have inherited projects where the dev VPC configuration had drifted so far from production that deploying a fix in dev told you nothing about what would happen in prod. Shared modules remove that class of drift. We also enforce default tags on every resource using a shared locals block: project name, environment, managed-by terraform, and a cost-center tag. If a resource is missing these tags, our CI pipeline rejects the plan. Version constraints use the ~> operator with a full three-part version (for example, ~> 1.2.0 rather than ~> 1.2), so patch updates apply automatically but minor version bumps require explicit approval. The two-part form would allow minor version bumps as well.
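The tagging and version-pinning rules above might look roughly like the following. This is a sketch under assumptions: the variable names and version numbers are placeholders, and it applies the shared locals block through the AWS provider's default_tags setting, which is one possible mechanism and not necessarily the one used in our projects.

```hcl
# versions.tf / providers.tf (illustrative; version numbers are placeholders)
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40.0" # allows >= 5.40.0 and < 5.41.0 (patch updates only)
      # "~> 5.40" would allow >= 5.40 and < 6.0 (minor updates too)
    }
  }
}

variable "project" { type = string }     # hypothetical inputs
variable "environment" { type = string }
variable "cost_center" { type = string }
variable "region" { type = string }

locals {
  # The four tags named in the text; the key spellings are assumptions.
  default_tags = {
    Project     = var.project
    Environment = var.environment
    ManagedBy   = "terraform"
    CostCenter  = var.cost_center
  }
}

provider "aws" {
  region = var.region

  # Applied to every taggable resource created through this provider.
  default_tags {
    tags = local.default_tags
  }
}
```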
Plan Before You Apply:
Running terraform apply without reviewing the plan output is reckless. We enforce a strict plan/review/apply workflow in CI. Every pull request triggers terraform plan, the output is posted as a PR comment, and a senior engineer reviews the diff before apply runs. This catches destructive changes like accidental resource replacements, security group modifications that would lock out SSH access, and IAM policy changes that could escalate privileges. Terraform's plan output tells you what it intends to create, change, and destroy, apart from values that are only known after apply; ignoring it is like ignoring compiler warnings.
The other critical rule: never run terraform apply -auto-approve against production by hand. That flag exists for automated pipelines with proper guardrails, not for engineers running commands on their laptops. We use -auto-approve only in our CI/CD pipeline, and only after a human has reviewed and approved the plan. For destructive operations like replacing an RDS instance, we require manual confirmation even in CI. The extra thirty seconds of review have saved us from outages more times than we can count.
The Patterns That Actually Matter:
Beyond state and modules, a few patterns separate professional Terraform from hobbyist Terraform. Use data sources to reference existing resources instead of hard-coding ARNs. Use terraform workspace only if your team has explicitly agreed on how workspaces map to environments; otherwise, use directory-based separation. Use moved blocks when refactoring resource addresses instead of deleting and recreating infrastructure. Pin your provider versions. Use terraform fmt and terraform validate in pre-commit hooks. Store sensitive values in AWS Secrets Manager or SSM Parameter Store and reference them with data sources, keeping in mind that values read this way are still written to state, which is one more reason to lock down the state backend.
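Three of these patterns are easier to see in code. The sketch below shows a data source lookup, a secret read from SSM Parameter Store, and a moved block for a rename. All resource names, tag values, and parameter paths are hypothetical, and moved blocks assume Terraform 1.1 or later.

```hcl
# Look up an existing VPC by tag instead of hard-coding its ID or ARN.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-vpc" # hypothetical tag value
  }
}

# Read a secret at plan/apply time instead of committing it to the repository.
# Reference it elsewhere as data.aws_ssm_parameter.db_password.value.
data "aws_ssm_parameter" "db_password" {
  name            = "/example/production/db_password" # hypothetical parameter path
  with_decryption = true
}

# Record a rename so Terraform updates the address in state
# instead of planning a destroy and a create.
moved {
  from = aws_security_group.web # old address (hypothetical)
  to   = aws_security_group.app # new address
}

resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.aws_vpc.shared.id
}
```

After the rename has been applied in every environment, the moved block can be removed or kept as a record of the refactor.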
The goal of all these patterns is the same: anyone on the team should be able to run terraform plan on any environment and get a clean, predictable diff. If your plan output is full of changes you did not expect, your Terraform codebase has a trust problem. We treat unexpected plan diffs the same way we treat failing tests. They block the pipeline until someone investigates. Infrastructure as code only works if you can trust the code.