Day 5: Resource Dependencies & Data Sources

Welcome to Day 5! Today we’ll explore how Terraform manages relationships between resources and how to query existing infrastructure using data sources. Understanding dependencies is crucial for building complex, reliable infrastructure.
🎯 Today’s Goals
Understand implicit vs explicit dependencies
Master the depends_on meta-argument
Learn about data sources and their uses
Query existing AWS resources
Build infrastructure that references external resources
Understand the resource graph
🔗 Resource Dependencies
When building infrastructure, resources often depend on each other. Terraform needs to know the order in which to create and destroy them.
Example Dependency Chain
VPC
 │
 ├─► Subnet
 │     │
 │     └─► EC2 Instance
 │
 └─► Internet Gateway
       │
       └─► Route Table
🤝 Implicit Dependencies
Implicit dependencies are automatically detected when one resource references another’s attributes.
# VPC is created first (no dependencies)
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Subnet depends on VPC (implicit dependency)
resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id # ← This creates implicit dependency
  cidr_block = "10.0.1.0/24"
}

# Instance depends on Subnet (implicit dependency)
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"
  subnet_id     = aws_subnet.public.id # ← Implicit dependency
}
Terraform’s creation order:
1. VPC
2. Subnet (waits for VPC)
3. Instance (waits for Subnet)
Destruction order (reverse):
1. Instance
2. Subnet
3. VPC
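The chain above is strictly linear, but the graph can also fan out. As a rough sketch (the resource names `demo` and `a` are made up for illustration), two resources that both reference the VPC but not each other have no ordering between them, so Terraform is free to create them in parallel once the VPC exists:

```hcl
resource "aws_vpc" "demo" {
  cidr_block = "10.1.0.0/16"
}

# Both blocks below reference aws_vpc.demo.id, so both wait for the VPC...
resource "aws_subnet" "a" {
  vpc_id     = aws_vpc.demo.id
  cidr_block = "10.1.1.0/24"
}

# ...but neither references the other, so there is no edge between them.
resource "aws_internet_gateway" "demo" {
  vpc_id = aws_vpc.demo.id
}
```

By default Terraform works on up to 10 graph nodes at a time; the -parallelism flag on plan and apply changes that limit.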
📌 Explicit Dependencies (depends_on)
Sometimes dependencies exist that Terraform can’t detect automatically. Use depends_on for explicit dependencies.
When to Use depends_on
# IAM role must exist before instance profile
resource "aws_iam_role" "instance_role" {
  name = "instance-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "ec2.amazonaws.com"
      }
    }]
  })
}

# IAM policy attachment
resource "aws_iam_role_policy_attachment" "instance_policy" {
  role       = aws_iam_role.instance_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# Instance profile needs the policy to be attached
resource "aws_iam_instance_profile" "instance_profile" {
  name = "instance-profile"
  role = aws_iam_role.instance_role.name

  # Explicit dependency - ensure policy is attached first
  depends_on = [aws_iam_role_policy_attachment.instance_policy]
}
Multiple Dependencies
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"

  depends_on = [
    aws_iam_instance_profile.instance_profile,
    aws_security_group.web,
    aws_subnet.public
  ]
}
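The list form above shows the syntax, but most of those entries can usually be expressed as ordinary references instead. A minimal sketch, assuming the subnet, security group, and instance profile are declared elsewhere in the same configuration (the resource name `web_implicit` is hypothetical):

```hcl
# Assumes aws_subnet.public, aws_security_group.web and
# aws_iam_instance_profile.instance_profile exist in this configuration.
resource "aws_instance" "web_implicit" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"

  # Each reference below is an implicit dependency, so no depends_on is needed.
  subnet_id              = aws_subnet.public.id
  vpc_security_group_ids = [aws_security_group.web.id]
  iam_instance_profile   = aws_iam_instance_profile.instance_profile.name
}
```

Because the arguments carry the dependency, the ordering stays correct even if one of the referenced resources is later renamed or replaced.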
🔍 Data Sources
Data sources allow Terraform to query existing infrastructure or external information. They don’t create resources - they only read data.
Data Source Syntax
data "provider_resource" "name" {
  # Filter criteria
}

# Reference with: data.provider_resource.name.attribute
Example: Query Existing VPC
# Query existing VPC by tag
data "aws_vpc" "existing" {
  tags = {
    Name = "production-vpc"
  }
}

# Use the VPC ID in a new subnet
resource "aws_subnet" "new_subnet" {
  vpc_id     = data.aws_vpc.existing.id
  cidr_block = "10.0.10.0/24"
}
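One data source can also feed another. The following sketch, which assumes a VPC tagged production-vpc really exists in your account, looks up that VPC and then lists the IDs of all subnets inside it. The names `lookup` and `in_vpc` are placeholders chosen for this example:

```hcl
# Same lookup as above, under a different name so the snippet stands alone.
data "aws_vpc" "lookup" {
  tags = {
    Name = "production-vpc"
  }
}

# List the IDs of every subnet in that VPC.
data "aws_subnets" "in_vpc" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.lookup.id]
  }
}

output "existing_subnet_ids" {
  value = data.aws_subnets.in_vpc.ids
}
```

Note that the aws_vpc data source expects exactly one match; if no VPC or more than one VPC carries the tag, the lookup fails with an error instead of returning a list.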
📚 Common AWS Data Sources
1. AWS AMI (Amazon Machine Image)
# Get latest Amazon Linux 2 AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t2.micro"
}
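One side effect of most_recent = true is worth knowing: when AWS publishes a newer image, the data source returns a new ID, and a changed ami forces the instance to be replaced on the next apply. If that is not what you want, one possible approach is to ignore AMI changes after launch. This is a sketch, assuming the data source above is declared in the same configuration; the resource name `pinned` is hypothetical:

```hcl
# Assumes data.aws_ami.amazon_linux is declared as in the example above.
resource "aws_instance" "pinned" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t2.micro"

  lifecycle {
    # Keep the AMI the instance was launched with; a newer match from the
    # data source no longer shows up as a change in the plan.
    ignore_changes = [ami]
  }
}
```

The trade-off is that the instance will not pick up new images on its own; you would replace it deliberately when you want the update.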
2. AWS Availability Zones
# Get all available AZs in current region
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "public" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
}
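The string interpolation in cidr_block works, but it hardcodes the VPC range a second time. As an alternative sketch, the built-in cidrsubnet() function can derive each subnet range from the VPC's own CIDR block. All names ending in `demo` or `per_az` are made up for this example:

```hcl
resource "aws_vpc" "demo" {
  cidr_block = "10.0.0.0/16"
}

data "aws_availability_zones" "demo" {
  state = "available"
}

resource "aws_subnet" "per_az" {
  count  = 2
  vpc_id = aws_vpc.demo.id

  # Adds 8 bits to the /16 prefix, producing /24 networks:
  # index 0 gives 10.0.0.0/24, index 1 gives 10.0.1.0/24.
  cidr_block        = cidrsubnet(aws_vpc.demo.cidr_block, 8, count.index)
  availability_zone = data.aws_availability_zones.demo.names[count.index]
}
```

Changing the VPC range later then updates the subnet ranges automatically, at the cost of the CIDRs being a little less obvious when reading the file.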
3. AWS Account Information
data "aws_caller_identity" "current" {}

output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

output "caller_arn" {
  value = data.aws_caller_identity.current.arn
}
4. AWS Region
data "aws_region" "current" {}

output "current_region" {
  value = data.aws_region.current.name
}
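A common reason to query the account ID and region is to assemble ARNs without hardcoding either value. Here is a small sketch, assuming both data sources above are declared; the log group path /app/web is a hypothetical name used only to show the ARN layout:

```hcl
# Assumes data.aws_caller_identity.current and data.aws_region.current are
# declared as in the two examples above.
locals {
  # Hypothetical log group path, used only to illustrate the ARN format.
  app_log_group_arn = "arn:aws:logs:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:log-group:/app/web"
}

output "app_log_group_arn" {
  value = local.app_log_group_arn
}
```

The same configuration then produces correct ARNs in any account or region it is applied to.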
5. Existing Security Group
data "aws_security_group" "default" {
  name   = "default"
  vpc_id = aws_vpc.main.id
}
6. Existing Subnet
data "aws_subnet" "selected" {
  filter {
    name   = "tag:Name"
    values = ["production-subnet-1"]
  }
}
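The security group lookup in example 5 takes its vpc_id from a managed resource, which changes when the query runs. If the VPC does not exist yet, Terraform cannot know its ID during plan, so it postpones the read until apply. The sketch below illustrates this; the names `fresh` and `fresh_default` are invented for the example:

```hcl
resource "aws_vpc" "fresh" {
  cidr_block = "10.2.0.0/16"
}

# vpc_id is unknown until the VPC above is created, so on the first run this
# lookup is deferred to the apply phase.
data "aws_security_group" "fresh_default" {
  name   = "default"
  vpc_id = aws_vpc.fresh.id
}

output "default_sg_id" {
  # Shown as "(known after apply)" in the first plan.
  value = data.aws_security_group.fresh_default.id
}
```

Once the VPC exists, later plans can read the security group right away because the ID is already known from state.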
🧪 Hands-On Lab: Dependencies & Data Sources
Let’s build a complete infrastructure that uses implicit dependencies, explicit dependencies, and data sources!
Step 1: Create Project Directory
mkdir terraform-dependencies-lab
cd terraform-dependencies-lab
Step 2: Create data-sources.tf
# data-sources.tf

# Get current AWS region
data "aws_region" "current" {}

# Get current AWS account
data "aws_caller_identity" "current" {}

# Get available availability zones
data "aws_availability_zones" "available" {
  state = "available"
}

# Get latest Amazon Linux 2 AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "root-device-type"
    values = ["ebs"]
  }
}
Step 3: Create main.tf
# main.tf
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

# VPC
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "dependencies-lab-vpc"
  }
}

# Internet Gateway (implicit dependency on VPC)
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "dependencies-lab-igw"
  }
}

# Public Subnets using data source for AZs
resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.${count.index + 1}.0/24"
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "public-subnet-${count.index + 1}"
    AZ   = data.aws_availability_zones.available.names[count.index]
  }
}

# Route Table
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "public-route-table"
  }
}

# Route Table Association
resource "aws_route_table_association" "public" {
  count          = 2
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Security Group
resource "aws_security_group" "web" {
  name        = "web-security-group"
  description = "Allow HTTP and SSH"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "web-sg"
  }
}

# IAM Role for EC2
resource "aws_iam_role" "ec2_role" {
  name = "ec2-ssm-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "ec2.amazonaws.com"
      }
    }]
  })

  tags = {
    Name = "ec2-ssm-role"
  }
}

# Attach SSM policy to role
resource "aws_iam_role_policy_attachment" "ssm_policy" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# Instance Profile (explicit dependency on policy attachment)
resource "aws_iam_instance_profile" "ec2_profile" {
  name = "ec2-instance-profile"
  role = aws_iam_role.ec2_role.name

  # Explicit dependency to ensure policy is attached first
  depends_on = [aws_iam_role_policy_attachment.ssm_policy]
}

# EC2 Instance using AMI from data source
resource "aws_instance" "web" {
  ami                    = data.aws_ami.amazon_linux.id
  instance_type          = "t2.micro"
  subnet_id              = aws_subnet.public[0].id
  vpc_security_group_ids = [aws_security_group.web.id]
  iam_instance_profile   = aws_iam_instance_profile.ec2_profile.name

  user_data = <<-EOF
    #!/bin/bash
    yum update -y
    yum install -y httpd
    systemctl start httpd
    systemctl enable httpd
    echo "<h1>Hello from Terraform!</h1>" > /var/www/html/index.html
    echo "<p>Instance in ${data.aws_availability_zones.available.names[0]}</p>" >> /var/www/html/index.html
    echo "<p>AMI: ${data.aws_ami.amazon_linux.id}</p>" >> /var/www/html/index.html
  EOF

  tags = {
    Name = "web-server"
  }

  # Explicit dependency on route table association
  depends_on = [aws_route_table_association.public]
}
Step 4: Create outputs.tf
# outputs.tf
output "account_id" {
  description = "AWS Account ID"
  value       = data.aws_caller_identity.current.account_id
}

output "region" {
  description = "AWS Region"
  value       = data.aws_region.current.name
}

output "availability_zones" {
  description = "Available AZs"
  value       = data.aws_availability_zones.available.names
}

output "ami_id" {
  description = "AMI ID used for instance"
  value       = data.aws_ami.amazon_linux.id
}

output "ami_name" {
  description = "AMI name"
  value       = data.aws_ami.amazon_linux.name
}

output "vpc_id" {
  description = "VPC ID"
  value       = aws_vpc.main.id
}

output "subnet_ids" {
  description = "Subnet IDs"
  value       = aws_subnet.public[*].id
}

output "instance_id" {
  description = "EC2 Instance ID"
  value       = aws_instance.web.id
}

output "instance_public_ip" {
  description = "EC2 Instance Public IP"
  value       = aws_instance.web.public_ip
}

output "website_url" {
  description = "Website URL"
  value       = "http://${aws_instance.web.public_ip}"
}
Step 5: Initialize and Plan
terraform init
terraform plan
Notice in the plan:
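Data sources are read during planning, so values such as the AMI ID are already resolved
Every resource to be created is listed; the dependency order is enforced when you apply
The plan output does not draw dependencies; the graph in Step 6 shows them as arrows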
Step 6: Visualize Dependencies
# Requires Graphviz, which provides the dot command
terraform graph | dot -Tpng > dependencies.png
Open dependencies.png to see the dependency graph!
Step 7: Apply Configuration
terraform apply
Type yes to confirm.
Step 8: Test the Website
After apply completes:
# Get the website URL
terraform output website_url
# Test with curl (user data may need a minute or two to finish installing httpd)
curl "$(terraform output -raw website_url)"
You should see the HTML page with instance details!
Step 9: Examine Data Source Values
terraform output ami_id
terraform output ami_name
terraform output availability_zones
These values were queried from AWS, not hardcoded!
Step 10: Understand the Dependency Chain
Data Sources (Read First)
├── aws_region.current
├── aws_caller_identity.current
├── aws_availability_zones.available
└── aws_ami.amazon_linux
Resources (one valid creation order; resources with no dependency between them can be created in parallel)
├── 1. aws_vpc.main
├── 2. aws_internet_gateway.main (depends on VPC)
├── 3. aws_subnet.public[0,1] (depends on VPC, uses AZ data)
├── 4. aws_security_group.web (depends on VPC)
├── 5. aws_iam_role.ec2_role
├── 6. aws_iam_role_policy_attachment.ssm_policy (depends on role)
├── 7. aws_iam_instance_profile.ec2_profile (explicit depends_on policy)
├── 8. aws_route_table.public (depends on VPC and IGW)
├── 9. aws_route_table_association.public[0,1] (depends on subnet and RT)
└── 10. aws_instance.web (depends on subnet, SG, profile, explicit depends_on route table association, uses AMI data)
Step 11: Clean Up
terraform destroy
Terraform destroys in reverse dependency order!
🎨 Resource Graph
Terraform builds a dependency graph to determine execution order:
# Generate graph in DOT format
terraform graph
# Graph for a plan operation
terraform graph -type=plan
# For destroy operations
terraform graph -type=plan-destroy
📊 Data Sources vs Resources
| Data Source | Resource |
| Reads existing infrastructure | Creates new infrastructure |
| data "aws_vpc" "main" | resource "aws_vpc" "main" |
| Referenced with data.aws_vpc.main | Referenced with aws_vpc.main |
| Read-only | Create/Update/Delete |
| No state changes | Manages state |
🔑 Key Concepts
Implicit Dependencies
Created automatically when referencing attributes
Most common type
Terraform detects them automatically
Explicit Dependencies
Use depends_on meta-argument
For non-obvious dependencies
Accepts list of resources
Data Sources
Query existing infrastructure
Read external information
Don’t create resources
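Usually read during plan, before resources are created (deferred to apply when their arguments depend on values not yet known)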
📝 Best Practices
✅ DO:
Prefer implicit dependencies
subnet_id = aws_subnet.main.id # ✅ Implicit
Use depends_on sparingly
# Only when necessary
depends_on = [aws_iam_role_policy_attachment.policy]
Use data sources for existing resources
data "aws_ami" "latest" { most_recent = true }
Document why depends_on is needed
depends_on = [aws_route_table.main] # Ensure route exists before instance tries to access internet
❌ DON’T:
Don’t use depends_on when implicit dependencies work
Don’t create circular dependencies
Don’t hardcode AMI IDs - use data sources
Don’t assume resource creation order without dependencies
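Circular dependencies often sneak in through security groups that reference each other in inline rules, which makes Terraform stop with a Cycle error. One way around it, sketched here under the assumption that aws_vpc.main from the lab exists, is to declare the groups without cross-references and add the rules as separate resources. The group names and ports are illustrative only:

```hcl
# Assumes aws_vpc.main is declared as in the lab.
# If each group's inline ingress block referenced the other group, the two
# resources would depend on each other and form a cycle.
resource "aws_security_group" "app" {
  name   = "app-sg"
  vpc_id = aws_vpc.main.id
}

resource "aws_security_group" "db" {
  name   = "db-sg"
  vpc_id = aws_vpc.main.id
}

# db accepts PostgreSQL traffic from app.
resource "aws_security_group_rule" "db_from_app" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.db.id
  source_security_group_id = aws_security_group.app.id
}

# app accepts traffic from db on a hypothetical callback port.
resource "aws_security_group_rule" "app_from_db" {
  type                     = "ingress"
  from_port                = 8080
  to_port                  = 8080
  protocol                 = "tcp"
  security_group_id        = aws_security_group.app.id
  source_security_group_id = aws_security_group.db.id
}
```

Both rules depend on both groups, but the groups depend on nothing except the VPC, so the graph stays acyclic.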
📝 Summary
Today you learned:
✅ Implicit vs explicit dependencies
✅ When and how to use depends_on
✅ Data sources and their purpose
✅ Common AWS data sources
✅ How Terraform builds the resource graph
✅ Best practices for managing dependencies
🚀 Tomorrow’s Preview
Day 6: State Management Fundamentals
Tomorrow we’ll:
Deep dive into Terraform state
Understand state file structure
Learn about state locking
Configure remote state backends
Master state commands
Implement S3 backend for state storage
💭 Challenge Exercise
Modify today’s lab to:
Use a data source to find an existing S3 bucket
Add a resource that depends on both the VPC and the bucket
Create an explicit dependency between two resources
Add a data source for AWS SSM parameters
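If you want a starting point for the two data sources, the sketch below may help. The bucket name is a placeholder you would replace with a bucket that exists in your account, and the SSM parameter shown is the public one AWS publishes for the latest Amazon Linux 2 AMI. The rest of the challenge is left to you:

```hcl
# Hypothetical bucket name; replace with a bucket that exists in your account.
data "aws_s3_bucket" "existing" {
  bucket = "my-existing-bucket"
}

# Public parameter published by AWS holding the latest Amazon Linux 2 AMI ID.
data "aws_ssm_parameter" "al2_ami" {
  name = "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"
}

output "bucket_arn" {
  value = data.aws_s3_bucket.existing.arn
}

output "ssm_ami_id" {
  # The provider marks SSM parameter values as sensitive; this one is public,
  # so it is unwrapped here for display.
  value = nonsensitive(data.aws_ssm_parameter.al2_ami.value)
}
```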
Remember: Understanding dependencies is crucial for building reliable, complex infrastructure!



