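Terraform: From Basics to Advanced Use
1. Concepts
Terraform describes cloud resources in HCL. Running terraform apply calls the cloud provider's API to create, update, or delete them, and Terraform records every resource it manages in a state file.
.tf files (declaration) → terraform plan (diff) → terraform apply (execute)
                                                        ↓
                                                    state file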
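2. Installation and project structure
# Install from HashiCorp's tap; the homebrew-core formula is no longer updated
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
terraform version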
A typical project layout:
infra/
├── main.tf              # main resources
├── variables.tf         # input variables
├── outputs.tf           # outputs
├── providers.tf         # provider configuration
├── versions.tf          # version constraints
├── terraform.tfvars     # variable values (gitignored; contains sensitive data)
└── modules/
    └── vpc/
        ├── main.tf
        ├── variables.tf
        └── outputs.tf
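As a rough illustration of what terraform.tfvars might hold, here is a minimal sketch. It assumes the variable names used by the example in section 3 (region, env, ecs_password); the values are placeholders, not real settings.

```hcl
# terraform.tfvars (hypothetical values; keep this file out of version control)
region       = "cn-hangzhou"
env          = "prod"
ecs_password = "change-me-Example1!" # placeholder, replace with a real secret
```

Terraform loads a file with this exact name automatically, so no -var-file flag is needed.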
3. Complete example: Alibaba Cloud ECS + OSS
# providers.tf
terraform {
  required_version = ">= 1.6"
  required_providers {
    alicloud = {
      source  = "aliyun/alicloud"
      version = "~> 1.220"
    }
  }
  backend "oss" {
    bucket  = "my-tf-state"
    prefix  = "infra/prod"
    region  = "cn-hangzhou"
    encrypt = true
  }
}
provider "alicloud" {
  region = var.region
}
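# variables.tf
variable "region" { default = "cn-hangzhou" }
variable "env" { default = "prod" }
# Referenced by alicloud_instance.web; supply it via terraform.tfvars or TF_VAR_ecs_password
variable "ecs_password" {
  type      = string
  sensitive = true
}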
# main.tf
resource "alicloud_vpc" "main" {
  vpc_name   = "${var.env}-vpc"
  cidr_block = "10.0.0.0/16"
}
resource "alicloud_vswitch" "main" {
  vpc_id     = alicloud_vpc.main.id
  cidr_block = "10.0.1.0/24"
  zone_id    = "cn-hangzhou-i"
}
resource "alicloud_security_group" "web" {
  name   = "${var.env}-web"
  vpc_id = alicloud_vpc.main.id
}
resource "alicloud_security_group_rule" "http" {
  type              = "ingress"
  ip_protocol       = "tcp"
  port_range        = "80/80"
  security_group_id = alicloud_security_group.web.id
  cidr_ip           = "0.0.0.0/0"
}
resource "alicloud_instance" "web" {
  count                      = 2
  instance_name              = "${var.env}-web-${count.index}"
  instance_type              = "ecs.c6.large"
  image_id                   = "ubuntu_22_04_x64_20G_alibase_20240101.vhd"
  vswitch_id                 = alicloud_vswitch.main.id
  security_groups            = [alicloud_security_group.web.id]
  internet_max_bandwidth_out = 5
  password                   = var.ecs_password
  tags = {
    Env  = var.env
    Role = "web"
  }
}
resource "alicloud_oss_bucket" "static" {
  bucket = "${var.env}-frontend-static"
  acl    = "public-read"
  website {
    index_document = "index.html"
    error_document = "index.html"
  }
  cors_rule {
    allowed_origins = ["https://app.example.com"]
    allowed_methods = ["GET"]
    allowed_headers = ["*"]
    max_age_seconds = 3600
  }
}
# outputs.tf
output "ecs_public_ips" {
  value = alicloud_instance.web[*].public_ip
}
output "oss_endpoint" {
  # Public endpoint; intranet_endpoint is reachable only from inside Alibaba Cloud
  value = alicloud_oss_bucket.static.extranet_endpoint
}
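The example hard-codes the zone "cn-hangzhou-i". One possible variation, sketched here rather than taken from the original, looks the zone up with the provider's alicloud_zones data source. The data source label "available" is an illustrative name, and this block would replace the alicloud_vswitch.main resource above rather than sit beside it.

```hcl
# Ask the provider which zones in the current region can host a VSwitch.
data "alicloud_zones" "available" {
  available_resource_creation = "VSwitch"
}

# Same VSwitch as in the example, but with the zone resolved at plan time.
resource "alicloud_vswitch" "main" {
  vpc_id     = alicloud_vpc.main.id
  cidr_block = "10.0.1.0/24"
  zone_id    = data.alicloud_zones.available.zones[0].id
}
```

The first zone returned is not guaranteed to offer the ecs.c6.large instance type, so a real configuration may need a stricter filter or an explicit zone variable.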
4. Workflow
# Initialize (download providers, configure the backend)
terraform init
# Preview changes
terraform plan
terraform plan -out=plan.tfplan
# Apply
terraform apply
terraform apply plan.tfplan # apply the previously saved plan
terraform apply -auto-approve # skip the interactive prompt
# Destroy
terraform destroy
# Inspect state (quote addresses with brackets so the shell does not expand them)
terraform state list
terraform state show 'alicloud_instance.web[0]'
# Modify state (use with care)
terraform state mv old_name new_name
terraform state rm resource.name
terraform import 'alicloud_instance.web[0]' i-xxx
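The state mv and import commands above change state imperatively. Terraform 1.1+ and 1.5+ also accept declarative equivalents that show up in terraform plan and can be code-reviewed. The sketch below assumes a hypothetical old resource name, alicloud_instance.web_old, and reuses the placeholder instance ID i-xxx from above.

```hcl
# Declarative alternative to `terraform state mv`:
# tells Terraform the resource was renamed, not destroyed and recreated.
moved {
  from = alicloud_instance.web_old # hypothetical previous address
  to   = alicloud_instance.web
}

# Declarative alternative to `terraform import`:
# adopts an existing instance into state on the next apply.
import {
  to = alicloud_instance.web[0]
  id = "i-xxx" # placeholder ID, as in the CLI example
}
```

Both blocks can be deleted once the change has been applied in every environment.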
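5. State management
State is Terraform's source of truth for what it manages. For any shared or production setup, use a remote backend:
# AWS S3 backend (goes inside the terraform { } block)
backend "s3" {
  bucket         = "my-tf-state"
  key            = "prod/terraform.tfstate"
  region         = "us-east-1"
  encrypt        = true
  dynamodb_table = "tf-state-lock" # state locking
}
# Alibaba Cloud OSS backend
backend "oss" {
  bucket  = "my-tf-state"
  prefix  = "prod"
  region  = "cn-hangzhou"
  encrypt = true
}
Locking prevents concurrent applies from conflicting. The OSS backend only locks state when tablestore_endpoint and tablestore_table are also set; the block above omits them.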
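A minimal sketch of an OSS backend with locking turned on follows. The Tablestore instance name (tf-lock) and table name (statelock) are assumptions made for illustration; the table has to exist beforehand with a string primary key named LockID.

```hcl
terraform {
  backend "oss" {
    bucket  = "my-tf-state"
    prefix  = "prod"
    region  = "cn-hangzhou"
    encrypt = true

    # Locking via Tablestore; instance and table names are hypothetical.
    tablestore_endpoint = "https://tf-lock.cn-hangzhou.ots.aliyuncs.com"
    tablestore_table    = "statelock"
  }
}
```

With these two arguments set, a second terraform apply started while one is running should fail to acquire the lock instead of writing over the same state.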
6. Reusing modules
# modules/vpc/main.tf
variable "name" {}
variable "cidr_block" { default = "10.0.0.0/16" }
resource "alicloud_vpc" "this" {
  vpc_name   = var.name
  cidr_block = var.cidr_block
}
output "vpc_id" {
  value = alicloud_vpc.this.id
}
# Calling the module
module "prod_vpc" {
  source     = "./modules/vpc"
  name       = "prod-vpc"
  cidr_block = "10.0.0.0/16"
}
module "staging_vpc" {
  source     = "./modules/vpc"
  name       = "staging-vpc"
  cidr_block = "10.1.0.0/16"
}
Community modules are published on the Terraform Registry.
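The module above exports vpc_id, but nothing consumes it yet. As a small sketch of how a caller might use that output, the resource below attaches a VSwitch to the VPC created by module.prod_vpc. The resource name prod_app and the zone are illustrative assumptions.

```hcl
# Consume the module's output through module.<name>.<output>.
resource "alicloud_vswitch" "prod_app" {
  vpc_id     = module.prod_vpc.vpc_id
  cidr_block = "10.0.1.0/24"   # must fall inside the module's 10.0.0.0/16
  zone_id    = "cn-hangzhou-i" # assumed zone, as in section 3
}
```

Outputs are the only way for a caller to read values from inside a module, so anything other resources need to reference has to be exported explicitly.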
7. Multiple environments (workspaces)
terraform workspace new prod
terraform workspace new staging
terraform workspace select prod
terraform workspace list
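Inside the configuration, the selected workspace is available as terraform.workspace. The sketch below shows one way it might drive naming and sizing; the local names env and web_count are illustrative and not part of the original example.

```hcl
locals {
  # Name of the currently selected workspace, e.g. "prod" or "staging".
  env = terraform.workspace

  # Hypothetical sizing rule: two web servers in prod, one elsewhere.
  # Could be used as `count = local.web_count` on alicloud_instance.web.
  web_count = terraform.workspace == "prod" ? 2 : 1
}

resource "alicloud_vpc" "main" {
  vpc_name   = "${local.env}-vpc"
  cidr_block = "10.0.0.0/16"
}
```

All workspaces share the same code and backend, which is part of why the directory-based layout is often preferred for production.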
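The generally preferred alternative is one directory per environment:
infra/
├── modules/
└── environments/
    ├── prod/
    │   └── main.tf
    ├── staging/
    └── dev/
Each environment has its own state, which guards against operating on the wrong one.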
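A sketch of what environments/prod/main.tf could contain under this layout, assuming the vpc module from section 6 and the OSS bucket name used earlier. Each environment directory gets its own backend prefix and therefore its own state.

```hcl
# environments/prod/main.tf (illustrative)
terraform {
  required_providers {
    alicloud = {
      source  = "aliyun/alicloud"
      version = "~> 1.220"
    }
  }
  backend "oss" {
    bucket = "my-tf-state"
    prefix = "prod" # staging and dev would use their own prefixes
    region = "cn-hangzhou"
  }
}

provider "alicloud" {
  region = "cn-hangzhou"
}

# Shared module, environment-specific inputs.
module "vpc" {
  source     = "../../modules/vpc"
  name       = "prod-vpc"
  cidr_block = "10.0.0.0/16"
}
```

Running terraform from inside environments/prod can then only touch prod state, whichever workspace is selected elsewhere.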
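8. Variables and secrets
variable "db_password" {
  type      = string
  sensitive = true # redacted in plan/apply output; still stored in plain text in state
}
# Pass the value in through an environment variable
export TF_VAR_db_password="xxx"
terraform apply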
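Two related details are sketched below: a validation rule on a sensitive variable, and the requirement that any output derived from a sensitive value be marked sensitive itself. The 12-character minimum is an arbitrary assumption, and this declaration is a variant of the one above rather than an addition to it.

```hcl
variable "db_password" {
  type      = string
  sensitive = true

  # Hypothetical policy: reject obviously weak passwords at plan time.
  validation {
    condition     = length(var.db_password) >= 12
    error_message = "db_password must be at least 12 characters."
  }
}

# Terraform refuses to plan an output that exposes a sensitive value
# unless the output is also marked sensitive.
output "db_password" {
  value     = var.db_password
  sensitive = true
}
```

The value is hidden in normal CLI output but can still be read with terraform output -raw db_password, so access to state and CI logs still needs to be restricted.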
Alternatively, fetch it dynamically with the Vault provider:
data "vault_kv_secret_v2" "db" {
  mount = "secret"
  name  = "db"
}
resource "alicloud_db_instance" "main" {
  # other required arguments omitted
  account_password = data.vault_kv_secret_v2.db.data["password"]
}
9. CI/CD integration
# .github/workflows/terraform.yml
on:
  pull_request:
    paths: ['infra/**']
jobs:
  plan:
    runs-on: ubuntu-latest
    # Job-level env so that both init (OSS backend) and plan can authenticate
    env:
      ALICLOUD_ACCESS_KEY: ${{ secrets.ALICLOUD_ACCESS_KEY }}
      ALICLOUD_SECRET_KEY: ${{ secrets.ALICLOUD_SECRET_KEY }}
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
        working-directory: infra
      - run: terraform plan -no-color
        working-directory: infra
Merging into main then triggers apply, gated by a manual approval.
10. terraform-docs / tflint / checkov
# Generate documentation automatically
terraform-docs markdown table . > README.md
# Static analysis
tflint
checkov -d . # security and compliance scanning
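tflint reads its settings from a .tflint.hcl file in the working directory. The sketch below enables the bundled Terraform ruleset with its recommended preset and switches on two extra rules; which rules to enable is a matter of team preference, not something the original prescribes.

```hcl
# .tflint.hcl (illustrative configuration)
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

# Optional rules that are not part of the recommended preset.
rule "terraform_naming_convention" {
  enabled = true
}

rule "terraform_documented_variables" {
  enabled = true
}
```

With the second rule enabled, tflint would flag variables declared without a description, such as variable "name" {} in section 6.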
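11. Common anti-patterns
- Local state: team collaboration breaks down, and a lost file means a painful rebuild
- State without locking: concurrent applies can corrupt it
- Secrets in state: they are stored in plain text, so the backend must be encrypted and access-controlled
- Applying straight to prod without separate environments: skipping that extra plan elsewhere is a gamble
- Manual changes in the cloud console: state drifts, and the next apply reverts edits to managed resources, while resources created by hand stay invisible to Terraform
- Unpinned module versions: upstream changes land without your noticing
- A main.tf that runs to thousands of lines: split it into modules
- State files committed to git: leaks secrets
- Running apply without a plan: skips review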
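For the unpinned-module item, a minimal sketch of pinning is shown below. The Git URL and the v1.2.0 tag are hypothetical; the inputs match the vpc module from section 6.

```hcl
# Pin a Git-hosted module to an immutable tag with ?ref=.
module "prod_vpc" {
  source     = "git::https://example.com/acme/terraform-modules.git//vpc?ref=v1.2.0" # hypothetical repo and tag
  name       = "prod-vpc"
  cidr_block = "10.0.0.0/16"
}
```

Modules pulled from the Terraform Registry take a separate version argument instead of ?ref=, and committing .terraform.lock.hcl pins provider versions in the same spirit.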
12. Further reading
- Terraform official documentation
- Terraform Best Practices
- Terraform Registry
- Terragrunt: a wrapper for managing Terraform at scale
- tflint / checkov