Securely Accessing Private EC2 Instances with Session Manager and VPC Endpoints

Posted October 18, 2025 by Trevor Roberts Jr ‐ 8 min read

EC2 instances in private subnets still need management access, but they have no route to the internet, and SSH keys are hard to rotate. Session Manager with VPC endpoints gives you a secure, auditable way to reach private instances without exposing them to the internet. Here's how to set it up.

Introduction

I covered a related approach, EC2 Instance Connect, in an earlier post: https://www.trevorrobertsjr.com/blog/ec2-instance-connect-automate/

I've seen many teams still using bastion hosts or SSH keys for accessing private EC2 instances. Both approaches work, but they come with operational headaches: bastion hosts are resources you have to manage, and SSH keys are painful to rotate at scale.

AWS Systems Manager Session Manager removes both problems. It provides browser-based or CLI access to instances without SSH keys. Every session start is recorded in CloudTrail, and session output can optionally be sent to CloudWatch Logs or S3. When you combine Session Manager with VPC endpoints, traffic between your instances and AWS services stays on private connections instead of traversing the internet.

In this post, I'll walk through the architecture and show you how to automate it all with Terraform.

The Architecture

Here's what we're building:

┌─────────────────────────────────────────────────────┐
│                     VPC                             │
├─────────────────────────────────────────────────────┤
│  Private Subnet                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │ EC2 Instance                                  │  │
│  │ Security Group: Allow 443 outbound to VPC     │  │
│  │ IAM Role: SSM permissions                     │  │
│  └───────────────────────────────────────────────┘  │
│                                                     │
│  VPC Interface Endpoints (in same subnet)           │
│  ┌───────────────────────────────────────────────┐  │
│  │ ssm (com.amazonaws.region.ssm)                │  │
│  │ ec2messages (com.amazonaws.region.ec2m...)    │  │
│  │ ssmmessages (com.amazonaws.region.ssm...)     │  │
│  │ ec2 (com.amazonaws.region.ec2)                │  │
│  │ s3 (gateway endpoint)                         │  │
│  │ kms (optional)                                │  │
│  │ logs (optional)                               │  │
│  └───────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────┘
           ↓
    AWS Services (private)

The key insight: traffic from your instances never leaves the AWS network. All communication with AWS services goes through private VPC endpoints, Session Manager brokers the connection, and the instance needs no inbound port open while every session remains auditable.

Why This Matters

Security Benefits:

  • No SSH keys to manage or rotate
  • No bastion host attack surface
  • All access is logged for compliance and auditing
  • Can restrict by IAM policies and instance tags
  • No inbound security group rules needed
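
The IAM-and-tag restriction in the list above can be sketched in Terraform. This is a minimal, illustrative policy for the people who start sessions (not for the instance role). The Environment = dev tag and the resource names are assumptions of mine, not part of this build.

```hcl
# Sketch: allow starting sessions only on instances tagged Environment = dev.
# The tag key/value and the policy name are hypothetical.
data "aws_iam_policy_document" "start_session_by_tag" {
  statement {
    sid       = "StartSessionOnTaggedInstances"
    effect    = "Allow"
    actions   = ["ssm:StartSession"]
    resources = ["arn:aws:ec2:*:*:instance/*"]

    condition {
      test     = "StringEquals"
      variable = "ssm:resourceTag/Environment"
      values   = ["dev"]
    }
  }

  # Let users end or resume only the sessions they started themselves.
  # "$${...}" escapes Terraform interpolation so IAM receives ${aws:username}.
  statement {
    sid       = "ManageOwnSessions"
    effect    = "Allow"
    actions   = ["ssm:TerminateSession", "ssm:ResumeSession"]
    resources = ["arn:aws:ssm:*:*:session/$${aws:username}-*"]
  }
}

resource "aws_iam_policy" "start_session_by_tag" {
  name_prefix = "ssm-start-session-dev-"
  policy      = data.aws_iam_policy_document.start_session_by_tag.json
}
```

Attach the resulting policy to the IAM group or role your operators use. Instances without the matching tag are then refused at StartSession time.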

Operational Benefits:

  • Access instances from anywhere (via AWS console or CLI)
  • No infrastructure to manage (no bastions)
  • Session logs stored in S3 or CloudWatch Logs
  • Multi-factor authentication support through IAM
  • Can run commands across fleets with Run Command, a sibling Systems Manager feature

Cost Benefits:

  • Eliminate bastion host infrastructure
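  • Pay only for VPC endpoints (roughly $7-8/month per interface endpoint per Availability Zone; the S3 gateway endpoint has no hourly charge)
  • No NAT gateway or internet data transfer charges for AWS API traffic, though interface endpoints do bill a small per-GB data processing fee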
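
As a back-of-the-envelope check on the per-endpoint figure above, the fixed monthly cost can be worked out in a few Terraform locals. The hourly rate, the hours per month, and the local names are assumptions for illustration. Check current pricing for your region before relying on the number.

```hcl
# Sketch: estimate the fixed (hourly) part of interface endpoint pricing.
# All rates are assumed values, not quoted prices.
locals {
  endpoint_hourly_usd = 0.01 # assumed rate per interface endpoint, per AZ
  hours_per_month     = 730
  interface_endpoints = 4 # ssm, ec2messages, ssmmessages, ec2
  availability_zones  = 1 # this post deploys into a single subnet

  endpoint_monthly_usd = (
    local.endpoint_hourly_usd * local.hours_per_month *
    local.interface_endpoints * local.availability_zones
  )
}

output "endpoint_monthly_usd" {
  description = "Estimated fixed monthly cost of the interface endpoints (USD)"
  value       = local.endpoint_monthly_usd
}
```

With these assumed inputs the estimate is about $29 per month, before per-GB processing and before the optional kms and logs endpoints. Spreading endpoints across more Availability Zones multiplies the figure accordingly.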

The Required VPC Endpoints

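Session Manager itself needs three interface endpoints (ssm, ec2messages, ssmmessages). This build adds a fourth interface endpoint (ec2) and one S3 gateway endpoint for the supporting features described below. Let me break down what each does: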

Required Interface Endpoints

1. SSM Endpoint (com.amazonaws.region.ssm)

  • Initiates Session Manager sessions
  • Core Systems Manager API calls

2. EC2 Messages Endpoint (com.amazonaws.region.ec2messages)

  • SSM Agent uses this to receive commands from Systems Manager
  • Bidirectional communication channel

3. SSM Messages Endpoint (com.amazonaws.region.ssmmessages)

  • Required for Session Manager sessions, so it is not optional in this setup
  • Handles the interactive session protocol

4. EC2 Endpoint (com.amazonaws.region.ec2)

  • Only required if you create VSS-enabled snapshots (Windows instances)
  • Used for the EBS volume operations behind those snapshots

Required Gateway Endpoint

S3 Endpoint (com.amazonaws.region.s3)

  • SSM Agent downloads patches and updates from S3
  • Returns logs to S3 buckets
  • Retrieves scripts and files from S3

Optional Endpoints

KMS Endpoint (com.amazonaws.region.kms)

  • If encrypting Session Manager sessions with KMS
  • If encrypting Parameter Store parameters

CloudWatch Logs Endpoint (com.amazonaws.region.logs)

  • If sending Session Manager session logs to CloudWatch Logs
  • If sending SSM Agent logs to CloudWatch

Terraform Implementation

Let me show you how to build this stack. I'll break it into modular pieces for reusability.

Step 1: VPC and Subnet Setup

# vpc.tf
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "ssm-vpc"
  }
}

resource "aws_subnet" "private" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = data.aws_availability_zones.available.names[0]

  tags = {
    Name = "private-subnet"
  }
}

# Get available AZs for the region
data "aws_availability_zones" "available" {
  state = "available"
}

Step 2: Security Groups

# security_groups.tf

# Security group for EC2 instances
resource "aws_security_group" "instance" {
  name_prefix = "ssm-instance-"
  description = "Security group for instances using Session Manager"
  vpc_id      = aws_vpc.main.id

  # Allow outbound HTTPS to VPC endpoints
  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.main.cidr_block]
    description = "HTTPS to VPC endpoints"
  }

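  # Allow outbound HTTPS to S3 through the gateway endpoint.
  # Gateway endpoint traffic targets S3's own IP ranges (not the VPC CIDR),
  # so the rule references the endpoint's managed prefix list.
  egress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    prefix_list_ids = [aws_vpc_endpoint.s3.prefix_list_id]
    description     = "HTTPS to S3 via gateway endpoint"
  }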

  tags = {
    Name = "ssm-instance-sg"
  }
}

# Security group for VPC endpoints
resource "aws_security_group" "vpc_endpoints" {
  name_prefix = "vpc-endpoints-"
  description = "Security group for VPC endpoints"
  vpc_id      = aws_vpc.main.id

  # Allow HTTPS from instance security group
  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.instance.id]
    description     = "HTTPS from instances"
  }

  tags = {
    Name = "vpc-endpoints-sg"
  }
}

Step 3: VPC Endpoints

# vpc_endpoints.tf

# SSM Interface Endpoint
resource "aws_vpc_endpoint" "ssm" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.ssm"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "ssm-endpoint"
  }
}

# EC2 Messages Interface Endpoint
resource "aws_vpc_endpoint" "ec2_messages" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.ec2messages"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "ec2messages-endpoint"
  }
}

# SSM Messages Interface Endpoint (required for Session Manager)
resource "aws_vpc_endpoint" "ssm_messages" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.ssmmessages"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "ssmmessages-endpoint"
  }
}

# EC2 Interface Endpoint (for EBS snapshots)
resource "aws_vpc_endpoint" "ec2" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.ec2"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "ec2-endpoint"
  }
}

# S3 Gateway Endpoint
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${data.aws_region.current.name}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = [aws_route_table.private.id]

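  # Restrict the endpoint to the AWS-owned buckets SSM Agent reads from.
  # Any other bucket the instance must reach through this endpoint
  # (for example, a session-log bucket) needs its own statement here.
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Principal = "*"
        # Object-level reads only: s3:ListBucket is a bucket-level action
        # and would have no effect on the object ARNs below
        Action = ["s3:GetObject"]
        Resource = [
          "arn:aws:s3:::patch-baseline-snapshot-${data.aws_region.current.name}/*",
          "arn:aws:s3:::aws-ssm-${data.aws_region.current.name}/*",
          "arn:aws:s3:::aws-windows-downloads-${data.aws_region.current.name}/*",
          "arn:aws:s3:::amazoncloudwatch-agent-${data.aws_region.current.name}/*"
        ]
      }
    ]
  })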

  tags = {
    Name = "s3-endpoint"
  }
}

# Optional: KMS Endpoint
resource "aws_vpc_endpoint" "kms" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.kms"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "kms-endpoint"
  }
}

# Optional: CloudWatch Logs Endpoint
resource "aws_vpc_endpoint" "logs" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.logs"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "logs-endpoint"
  }
}

# Data source for current region
# (AWS provider v6 deprecates `name` in favor of `region`; `name` still works with a warning)
data "aws_region" "current" {}

# Route table for S3 gateway endpoint
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "private-rt"
  }
}

resource "aws_route_table_association" "private" {
  subnet_id      = aws_subnet.private.id
  route_table_id = aws_route_table.private.id
}
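
The interface endpoint resources above differ only in their service name, so they could be collapsed with for_each. Here is a sketch of that alternative, assuming the same aws_vpc.main, aws_subnet.private, aws_security_group.vpc_endpoints, and data.aws_region.current objects. The local name interface_services and the resource name interface are placeholders of mine. It would replace the individual interface endpoint resources rather than sit alongside them, because two endpoints for the same service cannot both enable private DNS in one VPC.

```hcl
# Sketch: one resource block for all interface endpoints.
locals {
  # Services from this post; drop "kms" and "logs" if you do not need them.
  interface_services = ["ssm", "ec2messages", "ssmmessages", "ec2", "kms", "logs"]
}

resource "aws_vpc_endpoint" "interface" {
  for_each = toset(local.interface_services)

  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.${each.key}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "${each.key}-endpoint"
  }
}
```

The trade-off is readability versus repetition. Separate blocks make each endpoint easy to find and comment on, while for_each keeps the list of services in one place.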

Step 4: IAM Role for EC2 Instances

# iam.tf

# IAM role for EC2 instances
resource "aws_iam_role" "ssm_instance_role" {
  name_prefix = "ssm-instance-role-"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
        Action = "sts:AssumeRole"
      }
    ]
  })

  tags = {
    Name = "ssm-instance-role"
  }
}

# Attach the managed policy for Session Manager
resource "aws_iam_role_policy_attachment" "ssm_managed_instance_core" {
  role       = aws_iam_role.ssm_instance_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# Instance profile
resource "aws_iam_instance_profile" "ssm_instance_profile" {
  name_prefix = "ssm-instance-profile-"
  role        = aws_iam_role.ssm_instance_role.name
}

# Optional: Add inline policy for specific S3 bucket access
resource "aws_iam_role_policy" "s3_access" {
  name_prefix = "ssm-s3-access-"
  role        = aws_iam_role.ssm_instance_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
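        # s3:ListBucket is a bucket-level action and would not match the
        # object ARNs below, so only s3:GetObject is granted
        Action = [
          "s3:GetObject"
        ]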
        Resource = [
          "arn:aws:s3:::patch-baseline-snapshot-${data.aws_region.current.name}/*",
          "arn:aws:s3:::aws-ssm-${data.aws_region.current.name}/*"
        ]
      }
    ]
  })
}

Step 5: EC2 Instance

# ec2.tf

# Get the latest Amazon Linux 2 AMI
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "root-device-type"
    values = ["ebs"]
  }
}

# EC2 instance in private subnet
resource "aws_instance" "private" {
  ami                         = data.aws_ami.amazon_linux_2.id
  instance_type               = "t3.micro"
  subnet_id                   = aws_subnet.private.id
  iam_instance_profile        = aws_iam_instance_profile.ssm_instance_profile.name
  associate_public_ip_address = false
  vpc_security_group_ids      = [aws_security_group.instance.id]

  # SSM Agent is pre-installed on Amazon Linux 2
  # No user data script needed

  tags = {
    Name = "private-instance"
  }
}
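
To avoid looking up the instance ID in the console, you could expose it as a Terraform output. This is a small sketch. The output name private_instance_id is my own choice.

```hcl
# Sketch: surface the instance ID so it can be passed to start-session.
output "private_instance_id" {
  description = "Instance ID to use as the Session Manager target"
  value       = aws_instance.private.id
}
```

After terraform apply, the value can be fed straight to the CLI, for example aws ssm start-session --target "$(terraform output -raw private_instance_id)".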

Using Session Manager

Once everything is deployed, connecting to your instance is simple:

Via AWS Console

  1. Go to Systems Manager → Session Manager
  2. Click "Start Session"
  3. Select your instance
  4. Click "Start Session"
  5. You get an interactive shell in your browser

Via AWS CLI

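# Start an interactive session
# (requires the Session Manager plugin for the AWS CLI on your workstation)
aws ssm start-session --target i-1234567890abcdef0

# Run a single command without opening a shell.
# AWS-RunShellScript is a Run Command document, so it goes through send-command.
aws ssm send-command \
  --instance-ids i-1234567890abcdef0 \
  --document-name "AWS-RunShellScript" \
  --parameters 'commands=["echo Hello from Session Manager"]'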

# Start session in a specific region
aws ssm start-session \
  --target i-1234567890abcdef0 \
  --region us-east-1

Enabling Session Logging

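For audit purposes, log sessions to S3 or CloudWatch Logs. The instance role needs write access to whichever destination you choose, because AmazonSSMManagedInstanceCore does not grant it: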

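# Enable Session Manager logging.
# Session Manager reads its default preferences from the document named
# "SSM-SessionManagerRunShell" (hyphen, not underscore). If that document
# already exists in this account and region, import it instead of creating it.
resource "aws_ssm_document" "session_manager_config" {
  name            = "SSM-SessionManagerRunShell"
  document_type   = "Session"
  document_format = "JSON"

  content = jsonencode({
    schemaVersion = "1.0"
    description   = "Session Manager configuration"
    sessionType   = "Standard_Stream"
    inputs = {
      s3BucketName           = aws_s3_bucket.session_logs.id
      s3KeyPrefix            = "session-logs"
      s3EncryptionEnabled    = true
      cloudWatchLogGroupName = aws_cloudwatch_log_group.session_logs.name
      # The log group below has no KMS key; set this to true once it does
      cloudWatchEncryptionEnabled = false
      kmsKeyId                    = aws_kms_key.ssm.id
    }
  })
}

# KMS key for session encryption (referenced by kmsKeyId above)
resource "aws_kms_key" "ssm" {
  description         = "Session Manager session encryption"
  enable_key_rotation = true
}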

# S3 bucket for session logs
resource "aws_s3_bucket" "session_logs" {
  bucket_prefix = "ssm-session-logs-"
}

resource "aws_s3_bucket_versioning" "session_logs" {
  bucket = aws_s3_bucket.session_logs.id

  versioning_configuration {
    status = "Enabled"
  }
}

# CloudWatch log group for session logs
resource "aws_cloudwatch_log_group" "session_logs" {
  name_prefix       = "/aws/ssm/session-logs/"
  retention_in_days = 30
}
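
Session logs are written by SSM Agent on the instance, so the instance role needs permission to reach the log destinations. Below is a sketch of an inline policy for that, assuming the aws_s3_bucket.session_logs, aws_cloudwatch_log_group.session_logs, and aws_iam_role.ssm_instance_role resources from this post plus the aws_kms_key.ssm key that the preferences document references. The policy and statement names are placeholders, and the action list is a starting point rather than a verified minimum.

```hcl
# Sketch: let the instance role deliver session logs and use the session key.
data "aws_iam_policy_document" "session_logging" {
  statement {
    sid       = "WriteSessionLogsToS3"
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.session_logs.arn}/session-logs/*"]
  }

  statement {
    sid       = "CheckBucketEncryption"
    actions   = ["s3:GetEncryptionConfiguration"]
    resources = [aws_s3_bucket.session_logs.arn]
  }

  statement {
    sid       = "WriteSessionLogsToCloudWatch"
    actions   = ["logs:CreateLogStream", "logs:PutLogEvents", "logs:DescribeLogStreams"]
    resources = ["${aws_cloudwatch_log_group.session_logs.arn}:*"]
  }

  statement {
    sid       = "DescribeLogGroups"
    actions   = ["logs:DescribeLogGroups"]
    resources = ["*"]
  }

  statement {
    sid       = "UseSessionEncryptionKey"
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.ssm.arn]
  }
}

resource "aws_iam_role_policy" "session_logging" {
  name_prefix = "ssm-session-logging-"
  role        = aws_iam_role.ssm_instance_role.id
  policy      = data.aws_iam_policy_document.session_logging.json
}
```

Two network details go with this. The S3 gateway endpoint policy from Step 3 lists only AWS-owned buckets, so it would also need a statement for the session-log bucket. CloudWatch Logs and KMS delivery rely on the optional logs and kms interface endpoints.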

Troubleshooting Common Issues

Issue: "EC2 instance is not available"

  • Verify the EC2 instance has the IAM role with AmazonSSMManagedInstanceCore policy
  • Ensure SSM Agent is running: sudo systemctl status amazon-ssm-agent

Issue: "ServiceUnavailable: Connection failed"

  • Check security group on VPC endpoints allows port 443 from instance security group
  • Verify all required endpoints exist (ssm, ec2messages, ssmmessages, s3)
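  • Ensure private DNS is working: from inside the VPC, nslookup ssm.<region>.amazonaws.com should return addresses within the VPC CIDR (this depends on enable_dns_support and enable_dns_hostnames on the VPC and private_dns_enabled on the endpoints)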

Issue: "Unable to reach Systems Manager service"

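  • Verify the instance can reach the VPC endpoints, for example with curl -sv https://ssm.<region>.amazonaws.com (any HTTP response means the network path works)
  • Check that the private route table contains the S3 prefix-list route added by the gateway endpoint; interface endpoints need no extra routes because they are reached over the VPC's local route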
  • Confirm instance is in private subnet (no internet)

Wrapping Things Up...

Session Manager with VPC endpoints provides a secure, auditable way to access private EC2 instances without SSH keys or bastion hosts. By using only AWS-managed services and VPC endpoints, you eliminate infrastructure management and reduce your attack surface.

The initial setup feels like overhead, but it pays dividends immediately: your security team is happy, your system is auditable, you can rotate credentials without touching instances, and access is completely logged.

If you're still using SSH keys or bastion hosts, I'd strongly encourage you to try this approach on new infrastructure. Your future self will thank you.

If you found this article useful, let me know on BlueSky or on LinkedIn!