DEVOPS · Contributor

I Used to Fear Infrastructure. Terraform Changed That.

A computer engineering student documents their hands-on journey learning Terraform and Infrastructure as Code — from Googling basic definitions to building a full three-tier AWS architecture with a CI/CD pipeline. The piece covers real concepts (remote state, security group bugs, credential management, configuration drift) through the lens of someone who learned by breaking things, fixing them, and being honest about the confusion along the way.

By Safalta Khanal
Published June 17, 2026
Issue 02 · June 2026
Submitted by Safalta Khanal · Build With Her Magazine

Let me be honest with you.

The first time someone said “just write some Terraform to spin that up,” I nodded confidently — and then immediately opened a new tab to Google what Terraform actually was.

That feeling of faking it in a room full of people who seem to already know everything? I think a lot of us have been there. The imposter syndrome in cloud engineering is real, and it doesn’t go away just because you land a job title or finish a course. It lingers — especially when the gap between “I understand this conceptually” and “I can actually build this” feels enormous.

But after going deep on Terraform and Infrastructure as Code over the past few weeks — hands-on, breaking things, fixing them, breaking them again — I finally feel like I get it. Not just the commands, not just the syntax, but the why behind all of it. And I want to share what that journey looked like, because I think it might help someone who’s standing exactly where I was standing not long ago.

So what even is Terraform? And why does it matter?
Let’s start from zero, because I wish someone had explained this to me clearly from the beginning.

Here’s the one-line version: Terraform lets you describe your infrastructure in code, and then builds it for you.

No clicking around the AWS console. No praying that you remember every security group rule you configured three months ago. No “works on my machine” infrastructure that nobody else can recreate. No more frantically screenshotting console settings before you accidentally delete something.

You write what you want. Terraform figures out how to make it real.

What makes this genuinely powerful — not just convenient — is the declarative model. You’re not writing a script that says “first create this, then attach that, then configure this.” You’re writing a description of your desired end state, and Terraform works backwards from there. It compares what you want with what currently exists, calculates the difference, and applies only the necessary changes.

That’s a fundamentally different way of thinking about infrastructure, and once it clicks, you can’t un-see it.
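
To make "describe the end state" concrete, here is a minimal sketch of a declarative configuration. It is not the code from our sessions: the region, the provider version constraint, the `web_ami_id` variable, and the tag value are all assumptions chosen for illustration.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed version constraint
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumed region
}

# Hypothetical input: the AMI to boot. No real AMI ID is hardcoded here.
variable "web_ami_id" {
  type        = string
  description = "AMI ID for the web server (placeholder input)"
}

# The desired end state: one small instance tagged "web".
# Nothing here says *how* to create it, only what should exist.
resource "aws_instance" "web" {
  ami           = var.web_ami_id
  instance_type = "t3.micro"

  tags = {
    Name = "web"
  }
}
```

If you later change only the Name tag and run terraform plan again, the plan should show an in-place update to that one tag rather than a brand-new instance. That difference calculation is the declarative model at work.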

The core workflow is beautifully simple:

terraform init — Initializes the working directory, downloads providers and modules. Think of it as npm install but for your infrastructure dependencies.
terraform plan — Creates an execution plan. Shows you exactly what will be created, updated, or destroyed before you touch anything. No surprises.
terraform apply — Applies the planned changes. Your infrastructure gets built.
terraform destroy — Tears everything down cleanly. Great for ephemeral environments, terrible for forgetting to run it and getting a surprise AWS bill.
Four commands. Between them, they cover the full lifecycle of an environment, from an empty directory to running infrastructure and back again.

The moment declarative infrastructure actually clicked
I’ll admit — the word “declarative” bounced around in my head for a while before it actually landed.

I kept mentally modeling Terraform like a fancy shell script. Something that runs top to bottom, does stuff, and exits. But that mental model breaks down the moment you make a second change.

Here’s what actually helped me understand it: imagine you have a team of 10 engineers, all touching the same AWS account. Without IaC, you have 10 people clicking around the console, each making slightly different decisions about naming, tagging, security rules. Over time, the environment becomes a museum of decisions nobody remembers making. Debugging becomes archaeology.

Terraform solves this because the code is the source of truth. If someone wants to add an inbound rule to a security group, they don’t go to the console — they open a .tf file, make the change, open a pull request, get it reviewed, and merge. The deployment happens automatically. The history of that change lives in Git forever.

That’s not just convenient. That’s auditable, reproducible, and team-scalable in a way that clicking through a UI never will be.

State management — the thing nobody explains well enough
If there’s one concept that separates Terraform beginners from Terraform practitioners, it’s understanding state.

Terraform keeps a state file — typically named terraform.tfstate — that maps your configuration to real-world resources. It's how Terraform knows that the aws_instance.web in your code corresponds to a specific EC2 instance with ID i-0abc123def456 in your AWS account.

When you run terraform plan, Terraform reads this file, refreshes it against the real resources, compares the result to your configuration, and figures out what needs to change. Destroy your resources, and the state file gets updated to reflect that. Simple enough locally.

But here’s where it gets important: local state files are a team anti-pattern.

If your state file lives on your laptop and a teammate runs terraform apply on their machine, you now have two divergent state files. Chaos ensues. Resources get duplicated or destroyed unexpectedly. The state no longer reflects reality.

The solution is remote state storage, and we implemented this using an Amazon S3 bucket as our Terraform backend, with a DynamoDB table for state locking.

Here’s why each piece matters:

S3 stores the state file remotely, so every team member pulls from the same source of truth
DynamoDB locking ensures that only one terraform apply can run at a time — no two people accidentally overwriting each other's changes simultaneously
Versioning on the S3 bucket gives you a rollback history of your state file, which is invaluable when things go sideways
Once we set this up, the workflow transformed. Engineers could work in parallel, confident that their changes wouldn’t stomp on each other. Every apply was tracked. The state was always consistent.
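
The remote state setup described above can be sketched roughly as follows. The bucket name, table name, state key, and region are hypothetical placeholders, and the two parts would normally live in separate configurations, because the bucket and table have to exist before any backend can point at them.

```hcl
# --- Part 1: bootstrap configuration (applied once, with local state) ---

# Bucket that will hold the shared state file. Name is a placeholder.
resource "aws_s3_bucket" "tf_state" {
  bucket = "example-terraform-state"
}

# Versioning keeps a history of state files to roll back to.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Lock table. The S3 backend expects a string partition key named "LockID".
resource "aws_dynamodb_table" "tf_locks" {
  name         = "example-terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

# --- Part 2: backend block in the main project configuration ---

terraform {
  backend "s3" {
    bucket         = "example-terraform-state"      # placeholder
    key            = "three-tier/terraform.tfstate" # placeholder path
    region         = "us-east-1"                    # assumed region
    dynamodb_table = "example-terraform-locks"      # enables state locking
    encrypt        = true
  }
}
```

After adding the backend block, terraform init offers to migrate an existing local state file into the bucket. Newer Terraform releases also offer S3-native locking; this sketch follows the DynamoDB approach described here.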

If you’re still keeping terraform.tfstate locally, please — do yourself a favor and move to remote state before you learn this lesson the hard way.

What we actually built — real infrastructure, not toy examples
One of my frustrations with a lot of Terraform tutorials is that they stop at “here’s how to create an EC2 instance.” Congratulations, you provisioned a single server. That’s not production. That’s not even close.

We went further. Here’s what we actually built across these sessions:

Full networking stack: we started from scratch with a custom VPC with a defined CIDR block, public and private subnets deployed across multiple availability zones for fault tolerance, an internet gateway to route traffic between the public subnets and the internet, route tables with appropriate associations, and security groups with granular ingress and egress rules. Every piece of the networking puzzle, in code, reproducible in minutes.
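
As a rough illustration of how those pieces connect, here is a trimmed-down sketch with one public and one private subnet in a single availability zone. The CIDR ranges, AZ name, and resource names are assumptions. The real setup repeated the subnets across several AZs.

```hcl
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16" # assumed range
  enable_dns_hostnames = true
}

# Public subnet: instances launched here receive a public IP.
resource "aws_subnet" "public_a" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-1a" # assumed AZ
  map_public_ip_on_launch = true
}

# Private subnet: no route to the internet gateway, no public IPs.
resource "aws_subnet" "private_a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.11.0/24"
  availability_zone = "us-east-1a"
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

# Route table that sends all non-local traffic to the internet gateway.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

# Only the public subnet is associated with that route table.
# The private subnet falls back to the VPC's main route table.
resource "aws_route_table_association" "public_a" {
  subnet_id      = aws_subnet.public_a.id
  route_table_id = aws_route_table.public.id
}
```

What makes a subnet "public" is nothing more than this association: a route to the internet gateway. The private subnet has no such route, which is why instances there are unreachable from outside.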

Compute layer with real access patterns — We launched EC2 instances running NGINX and verified public accessibility using their IP addresses. But more importantly, we implemented a jump host (bastion host) pattern — a public-facing EC2 instance that serves as the single secure entry point into the private network. Private instances have no direct internet access; you go through the bastion. This mirrors how production environments actually work.

We also discussed alternative access methods — specifically AWS Systems Manager Session Manager with IAM roles, which eliminates the need for a bastion host entirely by tunneling through the SSM service. No open SSH port, no key management headache. This is increasingly the production-preferred approach.

Secure database layer — We deployed Amazon RDS PostgreSQL inside private subnets, completely isolated from the public internet. Security groups were configured to allow database connections only from the application layer — not from the internet, not from the bastion, only from resources that explicitly needed access. Principle of least privilege, applied at the network level.
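
The "only from the application layer" rule can be expressed by referencing another security group instead of an IP range. This is a minimal sketch under assumed names (`app-tier`, `db-tier`) with the VPC passed in as a hypothetical `vpc_id` variable. It is not our exact configuration.

```hcl
# Hypothetical input: the VPC both groups belong to.
variable "vpc_id" {
  type        = string
  description = "ID of the VPC that hosts the application and database tiers"
}

# Security group attached to the application instances.
resource "aws_security_group" "app" {
  name        = "app-tier"
  description = "Application tier instances"
  vpc_id      = var.vpc_id
}

# Security group attached to the RDS instance.
resource "aws_security_group" "db" {
  name        = "db-tier"
  description = "PostgreSQL, reachable from the application tier only"
  vpc_id      = var.vpc_id

  ingress {
    description     = "PostgreSQL from the application tier"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id] # a group reference, not a CIDR
  }
}
```

Because the rule names a security group rather than an address range, any instance that later joins the app tier gains database access automatically, and nothing else does.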

Three-tier architecture — Putting it all together: a public-facing web layer, a private application layer, and a private database layer, each communicating only with the layers they need to and nothing else. This is the architecture pattern that underpins a huge percentage of production web applications, and we built the whole thing in code.

All of it reproducible. All of it destroyable and rebuildable in under 10 minutes. That’s the promise of IaC, and we delivered on it.

Secure credential management — the part everyone gets wrong first
Before we talk about automation, we need to talk about credentials. Because this is where a lot of people make mistakes that have real security consequences.

When you’re working with Terraform and AWS, you need credentials to authenticate your operations. The wrong way to handle this — and yes, people do this — is to hardcode access keys directly in your Terraform files or GitHub Actions workflows. Please don’t. Those files end up in Git history. Git history is forever. Leaked AWS keys are a very bad day.

The right way: environment variables for local development, repository secrets for CI/CD.

For local Terraform runs, we configured credentials through environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN when using temporary credentials). The AWS provider picks these up automatically. No keys in code.
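
In practice, "no keys in code" means the provider block carries nothing secret. A small sketch, with the `aws_region` variable and its default being assumptions:

```hcl
# Hypothetical input so the region is not hardcoded either.
variable "aws_region" {
  type    = string
  default = "us-east-1" # assumed default
}

provider "aws" {
  region = var.aws_region

  # Deliberately no access_key or secret_key arguments here.
  # The provider reads AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and
  # AWS_SESSION_TOKEN from the environment at run time.
}
```

The same file then works unchanged on a laptop and in a pipeline. Only the environment around it differs.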

For GitHub Actions, the same credentials were stored as encrypted repository secrets: managed through GitHub’s secrets interface, masked in workflow logs, and referenced in workflow YAML as ${{ secrets.AWS_ACCESS_KEY_ID }}. The workflow can authenticate to AWS without the credentials ever appearing in plain text anywhere in the repository.

This is standard practice, and it’s not complicated once you understand the pattern. The important thing is building the habit early.

GitHub Actions + Terraform — this is how real teams deploy
Once I understood Terraform locally, connecting it to GitHub Actions was what made everything feel production-real.

The architecture we implemented had separate workflow files for three operations: plan, apply, and destroy. Here’s the typical flow:

An engineer makes a change to the Terraform configuration and opens a pull request
The plan workflow triggers automatically, runs terraform plan, and posts the output as a comment on the PR
The team reviews the plan — exactly what will be created, modified, or destroyed — before merging
Once merged to the main branch, the apply workflow triggers and executes terraform apply automatically
Infrastructure is updated. No manual steps. Full audit trail in GitHub.
This pattern is often called GitOps: using Git as the single source of truth for the desired infrastructure configuration, with automated pipelines enforcing the changes. It means infrastructure changes go through the same review process as application code. That’s a significant shift from “whoever has console access makes the call.”

One critical lesson we emphasized: only use verified, official GitHub Actions in your workflows, especially anything that handles AWS credentials or checks out source code. The Actions marketplace is large and not everything in it is trustworthy. A malicious action with access to your AWS credentials is a catastrophic security event. Stick to actions with the verified badge and high usage counts.

We also implemented OIDC-based AWS authentication as a more advanced option — this eliminates the need for long-lived access keys entirely by using short-lived tokens that are issued per-workflow-run. Fewer permanent credentials in your environment means a smaller attack surface.
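
On the AWS side, the OIDC approach comes down to an IAM role whose trust policy accepts GitHub's tokens. The sketch below assumes the GitHub OIDC identity provider is already registered in the account (it is looked up with a data source). The role name and the `example-org/example-repo` repository are hypothetical placeholders.

```hcl
# Look up the existing GitHub OIDC identity provider in this account.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

# Trust policy: who may assume the role, and under which conditions.
data "aws_iam_policy_document" "github_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    # The token must be issued for AWS STS.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Only workflows from this repository's main branch qualify (placeholder repo).
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/example-repo:ref:refs/heads/main"]
    }
  }
}

# The role the workflow assumes. Permissions policies are attached separately.
resource "aws_iam_role" "github_actions" {
  name               = "github-actions-terraform"
  assume_role_policy = data.aws_iam_policy_document.github_trust.json
}
```

The `sub` condition is the part worth getting right. Without it, any repository on GitHub that can obtain a token for this audience could attempt to assume the role.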

The mistake that taught me the most
No learning journey is complete without a good failure story, and this one is mine.

During one of our CI/CD automation exercises, a Terraform deployment failed mid-run. The workflow errored out, the apply didn’t complete, and for a moment everything felt broken.

We dug into the logs. The culprit? A single incorrect protocol definition in a security group rule. We had used an invalid value for the HTTPS traffic protocol — one character off from what Terraform expected. The configuration looked fine to the human eye. Terraform disagreed.

Here’s what struck me: the error message was clear. Terraform told us exactly what was wrong, on which resource, with which argument. There was no guessing, no grepping through logs hoping to find a clue. We fixed the value, ran terraform validate to catch any remaining issues before applying, committed the fix, pushed it, and watched the workflow re-execute cleanly.

From failure to resolution: maybe 15 minutes.

Compare that to debugging a manually configured resource in the console where the misconfiguration might not surface for hours, and when it does, there’s no record of what changed or when.

That’s the real power of Infrastructure as Code: your mistakes are visible, traceable, and correctable. They don’t hide in the console. They don’t get lost in verbal handoffs. They live in code, and code can be reviewed, fixed, and improved.
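
For reference, this is roughly what a valid HTTPS rule looks like. I am not reproducing the exact typo we made. The sketch only shows the shape Terraform expects, with the security group name and the `vpc_id` variable as assumptions.

```hcl
# Hypothetical input: the VPC the web tier lives in.
variable "vpc_id" {
  type        = string
  description = "ID of the VPC for the web tier"
}

resource "aws_security_group" "web" {
  name        = "web-tier"
  description = "Public HTTPS access to the web tier"
  vpc_id      = var.vpc_id

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    # HTTPS is not a protocol value here. It is TCP on port 443.
    # Accepted values include "tcp", "udp", "icmp", or "-1" for all protocols.
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "Allow all outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

The application-level name (HTTPS) never appears in the rule itself, which is an easy thing to trip over when writing security group rules for the first time.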

Variables, outputs, and modules — the trio that makes infra actually reusable
Understanding Terraform is one thing. Writing good Terraform is another. And these three concepts are the foundation of infrastructure that scales beyond a single project or a single engineer.

Variables are how you make configurations flexible. Instead of hardcoding a VPC CIDR block or an instance type, you define them in variables.tf and set actual values in terraform.tfvars (or pass them as TF_VAR_-prefixed environment variables in CI/CD). The same Terraform code can deploy to dev with a t3.micro, staging with a t3.medium, and production with a t3.large, just by changing a variables file. No touching the actual resource definitions.
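
A small sketch of that split, with hypothetical variable names and values. The per-environment files are shown as comments because they live outside variables.tf.

```hcl
# variables.tf: declare the inputs once, with types and descriptions.

variable "instance_type" {
  type        = string
  description = "EC2 instance type for the application servers"
  default     = "t3.micro"
}

variable "vpc_cidr" {
  type        = string
  description = "CIDR block for the VPC"
  default     = "10.0.0.0/16"
}

# dev.tfvars (separate file, hypothetical):
#   instance_type = "t3.micro"
#
# prod.tfvars (separate file, hypothetical):
#   instance_type = "t3.large"
#
# Select one at run time:
#   terraform apply -var-file=prod.tfvars
#
# Or override a single value from the environment, e.g. in CI:
#   TF_VAR_instance_type=t3.medium terraform plan
```

Resources then refer to var.instance_type and var.vpc_cidr, so the resource definitions stay identical across environments.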

Outputs are how Terraform communicates back to you — and to other systems — after a successful apply. The public IP of your EC2 instance, the endpoint of your RDS database, the ARN of a newly created IAM role. Outputs make these values accessible without having to go hunting through the console. They’re also how modules expose information to the root configuration that uses them.

Modules are where infrastructure becomes genuinely reusable. Instead of defining your VPC, subnets, and route tables directly in main.tf every time, you create a VPC module that encapsulates all of that logic. Now you have a versioned, testable, shareable piece of infrastructure that any project can consume with a few lines of code.
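
Here is a deliberately tiny sketch of how a module, its outputs, and the root configuration fit together. The directory layout and names are assumptions. Each comment header marks a separate file, so this is a multi-file layout shown in one listing rather than a single file to paste as-is.

```hcl
# ---- modules/vpc/variables.tf ----
variable "cidr_block" {
  type        = string
  description = "CIDR block for the VPC this module creates"
}

# ---- modules/vpc/main.tf ----
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
}

# ---- modules/vpc/outputs.tf ----
# The module exposes only what callers need.
output "vpc_id" {
  value       = aws_vpc.this.id
  description = "ID of the VPC created by this module"
}

# ---- main.tf (root configuration) ----
module "vpc" {
  source     = "./modules/vpc"
  cidr_block = "10.0.0.0/16"
}

# Root-level output: printed after apply and readable by other tooling.
output "vpc_id" {
  value = module.vpc.vpc_id
}
```

The root configuration never touches aws_vpc directly. It passes inputs in and reads outputs back, which is what lets the module change internally without breaking its callers.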

We also explored the count meta-argument for dynamic resource creation. Rather than writing three separate subnet resource blocks for three availability zones, you write one and set count = 3. Terraform creates all three, and each subnet derives its own CIDR block and availability zone from count.index (for example with the built-in cidrsubnet function). The first time I saw this work correctly (three subnets, three AZs, one resource block) it genuinely felt like a superpower.
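
A minimal sketch of that pattern, assuming a /16 VPC range and three hypothetical availability zone names. The VPC ID is passed in as a variable to keep the example short.

```hcl
variable "vpc_id" {
  type        = string
  description = "ID of the VPC that will contain the subnets"
}

variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16" # assumed range
}

variable "azs" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1b", "us-east-1c"] # assumed AZs
}

resource "aws_subnet" "private" {
  count = length(var.azs) # 3 with the default list

  vpc_id            = var.vpc_id
  availability_zone = var.azs[count.index]

  # cidrsubnet adds 8 bits to the /16 prefix and picks the Nth /24:
  #   index 0 -> 10.0.0.0/24
  #   index 1 -> 10.0.1.0/24
  #   index 2 -> 10.0.2.0/24
  cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index)

  tags = {
    Name = "private-${count.index}"
  }
}
```

Individual subnets are then addressed as aws_subnet.private[0], aws_subnet.private[1], and so on, or all at once with aws_subnet.private[*].id.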

The mental shift is from “infrastructure as a set of resources” to “infrastructure as a library of reusable, composable components.” That’s what separates a Terraform file that works from a Terraform codebase that a team can maintain for years.

Configuration drift — the silent killer of infrastructure consistency
There’s a concept in IaC that doesn’t get enough airtime in beginner tutorials, and it’s one of the most important things to internalize early: configuration drift.

Configuration drift happens when someone modifies a resource outside of Terraform — a quick security group tweak in the console because the deployment was taking too long, an EC2 instance type change via the CLI because “it’s just temporary,” a DNS record added manually because “I’ll add it to Terraform later” (spoiler: later never comes).

Each of these changes creates a gap between what your Terraform code and state say exists and what actually exists in AWS. The next time someone runs terraform plan, they might see unexpected changes: Terraform proposing to revert the manual modification back to the declared state. Or worse, a resource created entirely by hand never shows up in the plan at all, because Terraform only tracks what is recorded in its state.

The discipline of IaC isn’t just about the tooling. It’s about the team agreement: if it’s not in the Terraform code, it doesn’t exist. Every change goes through the repository. No exceptions, no “just this once.”

This is harder than it sounds, especially in fast-moving environments where the pressure to make a quick manual fix is real. But the cost of drift compounds over time. An environment that’s been manually modified a hundred times is an environment that nobody fully understands anymore. Terraform can’t help you with what it doesn’t know about.

AWS key management and secure instance access
One area where we went deeper than most beginner content covers is key management for EC2 access.

The traditional approach (manually create an AWS key pair in the console, download the .pem file, hope you don't lose it) doesn't scale and doesn't fit an IaC workflow well. We used the Terraform TLS provider to generate SSH key pairs dynamically as part of the infrastructure provisioning process. The private key is captured as a Terraform output, and the public key is registered with AWS automatically. One caveat: the generated private key is also stored in the Terraform state, so the output should be marked sensitive and access to the state locked down.

This means key pairs are:

Generated fresh for each environment
Never manually managed
Traceable to a specific Terraform run
Destroyable along with the rest of the infrastructure
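
A sketch of the key generation flow, assuming RSA keys and a placeholder key name. The provider version constraints are omitted for brevity.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
    tls = {
      source = "hashicorp/tls"
    }
  }
}

# Generate a fresh key pair as part of the Terraform run.
resource "tls_private_key" "ssh" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

# Register only the public half with AWS.
resource "aws_key_pair" "generated" {
  key_name   = "example-env-key" # placeholder name
  public_key = tls_private_key.ssh.public_key_openssh
}

# Expose the private key as an output. "sensitive" hides it from normal
# CLI output, but the value is still stored in the state file.
output "ssh_private_key_pem" {
  value     = tls_private_key.ssh.private_key_pem
  sensitive = true
}
```

An instance opts in with key_name = aws_key_pair.generated.key_name, and terraform output -raw ssh_private_key_pem retrieves the private key when someone actually needs to log in.
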
For teams moving toward zero-trust access models, the better answer is AWS Systems Manager Session Manager with IAM roles — which removes SSH from the equation entirely. No open port 22, no key distribution, access controlled entirely through IAM policies. It’s more complex to set up initially, but it’s the direction the industry is moving.
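
The IAM side of the Session Manager approach is small. This sketch uses assumed names (`ec2-ssm-access`) and shows only the role and instance profile, not the instances themselves.

```hcl
# Trust policy: allow EC2 instances to assume the role.
data "aws_iam_policy_document" "ec2_trust" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "ssm" {
  name               = "ec2-ssm-access" # placeholder name
  assume_role_policy = data.aws_iam_policy_document.ec2_trust.json
}

# AWS-managed policy that grants the permissions the SSM agent needs.
resource "aws_iam_role_policy_attachment" "ssm_core" {
  role       = aws_iam_role.ssm.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# Instance profile: the wrapper that attaches the role to an EC2 instance.
resource "aws_iam_instance_profile" "ssm" {
  name = "ec2-ssm-access"
  role = aws_iam_role.ssm.name
}
```

An instance picks this up with iam_instance_profile = aws_iam_instance_profile.ssm.name. It also needs the SSM agent running and a network path to the Systems Manager endpoints, for example through a NAT gateway or VPC endpoints, since private instances have no direct internet access.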

What’s next — Terraform meets Ansible
We’re wrapping up with a capstone that I’m genuinely excited about: automating a full WordPress deployment using Terraform for infrastructure provisioning and Ansible for application configuration management.

The split makes sense once you think about it. Terraform is excellent at creating infrastructure — VPCs, EC2 instances, databases, security groups. But it’s not really designed to go inside those instances and configure the operating system, install packages, manage service files. That’s where configuration management tools like Ansible shine.

Combined, the workflow looks like:

Terraform provisions the infrastructure — network, compute, database
Ansible connects to the provisioned instances and installs/configures WordPress, PHP, NGINX, and everything else the application needs
The result is a fully running WordPress site, deployed from zero, with zero manual steps
And because both the infrastructure and the application configuration are in code, the entire environment is ephemeral — you can tear it down completely and rebuild it identically in minutes.
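
One common way to hand off from Terraform to Ansible is to have Terraform write the inventory file. This is a sketch of that idea, not our capstone code. It assumes the hashicorp/local provider, a hypothetical `web_public_ips` variable standing in for the instance addresses, and a `[wordpress]` group name that is my own choice.

```hcl
terraform {
  required_providers {
    local = {
      source = "hashicorp/local"
    }
  }
}

# Hypothetical input. In a real configuration this would typically be
# something like aws_instance.web[*].public_ip.
variable "web_public_ips" {
  type        = list(string)
  description = "Public IPs of the instances Ansible should configure"
  default     = []
}

# Write a simple INI-style inventory next to the Terraform code:
# a group header followed by one host per line.
resource "local_file" "ansible_inventory" {
  filename = "${path.module}/inventory.ini"
  content  = join("\n", concat(["[wordpress]"], var.web_public_ips, [""]))
}
```

After terraform apply, a run such as ansible-playbook -i inventory.ini against a WordPress playbook (name hypothetical) targets exactly the hosts Terraform just created.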

Build it. Use it. Destroy it. Rebuild it. That’s the mental model that IaC demands, and it’s also the mental model that unlocks things like on-demand staging environments, blue-green deployments, and disaster recovery that actually works when you need it.

If I had to summarize everything in one paragraph
Terraform is not just a tool. It’s a philosophy about how infrastructure should be managed — with the same rigor, review processes, and version control that we apply to application code. The four-command workflow is simple on the surface, but underneath it is a complete rethinking of how teams build and maintain cloud environments. Remote state, modules, CI/CD pipelines, least-privilege security, configuration drift discipline — all of it points toward the same goal: infrastructure that is reproducible, auditable, and owned by the whole team rather than living in one person’s head or one person’s console session.

If you’re on the fence about learning Terraform, let this be your signal. The learning curve is real but it’s not steep, and the payoff — in terms of confidence, career trajectory, and actual engineering capability — is absolutely worth the investment.

I’m happy to connect with anyone else on the same journey. Whether you’re just starting out with terraform init or already managing complex multi-region architectures, there's always more to learn and more to share.

What’s been your biggest Terraform “aha” moment? Drop it in the comments — I’d genuinely love to hear it.

#Terraform #InfrastructureAsCode #AWS #DevOps #CloudEngineering #GitHubActions #Ansible #SiteReliabilityEngineering #CloudComputing #LearningInPublic #TechCommunity

About the contributor
Safalta Khanal
Contributor · Build With Her Magazine

I write about the real side of learning cloud and DevOps — the bugs that stumped me, the concepts that finally clicked, and the mistakes I made so you don't have to. Currently building with AWS, Terraform, Docker and GitHub Actions — and documenting every step of it publicly.
