VPC Lattice vs PrivateLink Endpoint Services: Choosing the Right Connectivity Pattern for Resilience

Posted January 12, 2026 by Trevor Roberts Jr ‐ 5 min read

You need to connect applications to backend resources securely across VPCs or accounts. Do you use VPC Lattice or PrivateLink Endpoint Services? This decision matters more than you might think, especially when your backend's IP addresses can change, as they do during an Aurora failover.

Introduction

AWS's network connectivity options have evolved significantly over the years. We went from VPC peering to Transit Gateway to PrivateLink to the newer VPC Lattice. Each has its place, but I'm seeing teams make poor choices in their backend connectivity layer that bite them later.

The question that keeps coming up: Should I use VPC Lattice or PrivateLink Endpoint Services to talk to my backend resources?

The answer, like many things in architecture, is "it depends." But I'll give you a framework to decide, and more importantly, I'll highlight a critical issue that I don't see discussed enough: how these services handle dynamic backend IPs like Aurora master nodes during failover.

Understanding the Architectures

PrivateLink Endpoint Services

PrivateLink Endpoint Services are the traditional approach. Here's how they work:

Consumer VPC (Application)
    ↓
Interface VPC Endpoint
    ↓
AWS PrivateLink (no VPC peering or Transit Gateway required)
    ↓
Service Provider's Network Load Balancer (NLB)
    ↓
Backend Resources (EC2, RDS, etc.)

The key components:

  • Provider side: a Network Load Balancer routes traffic to your backend resources
  • Consumer side: an interface VPC endpoint connects the consumer VPC to the provider's NLB
  • Transport: PrivateLink carries the traffic over AWS's private network
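
The provider and consumer halves described above can be sketched with boto3's EC2 client. This is a minimal sketch, not a complete setup: the function names and arguments are my own, the `ec2` parameter is assumed to be a `boto3.client("ec2")` created in the relevant account, and acceptance of the connection request is left out.

```python
def create_endpoint_service(ec2, nlb_arn):
    """Provider side: publish an NLB as a PrivateLink endpoint service."""
    resp = ec2.create_vpc_endpoint_service_configuration(
        NetworkLoadBalancerArns=[nlb_arn],
        AcceptanceRequired=True,  # provider approves each consumer connection
    )
    return resp["ServiceConfiguration"]["ServiceName"]


def create_interface_endpoint(ec2, vpc_id, service_name, subnet_ids, sg_ids):
    """Consumer side: create an interface endpoint to the provider's service."""
    resp = ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId=vpc_id,
        ServiceName=service_name,
        SubnetIds=subnet_ids,
        SecurityGroupIds=sg_ids,
    )
    return resp["VpcEndpoint"]["VpcEndpointId"]
```

Passing the client in keeps the two sides separate, which mirrors reality: the provider and consumer calls usually run under different accounts and credentials.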

VPC Lattice

VPC Lattice is the newer, purpose-built approach:

Consumer VPC (Application)
    ↓
VPC Lattice Service Network
    ↓
Lattice's managed routing and security
    ↓
Backend resources (target groups with specific selection mechanisms)

VPC Lattice centralizes network management with a service-oriented architecture and built-in observability.
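
As a rough illustration of that service-network model, the sketch below creates a service network and associates a consumer VPC with it. It assumes `lattice` is a `boto3.client("vpc-lattice")`; the function name and its arguments are hypothetical, and auth policies, listeners, and targets are omitted.

```python
def attach_vpc_to_service_network(lattice, network_name, vpc_id, sg_ids):
    """Create a service network and associate a consumer VPC with it."""
    network = lattice.create_service_network(name=network_name)
    association = lattice.create_service_network_vpc_association(
        serviceNetworkIdentifier=network["id"],
        vpcIdentifier=vpc_id,
        securityGroupIds=sg_ids,
    )
    # Both IDs are returned so the caller can poll the association status later
    return network["id"], association["id"]
```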

The Aurora Problem: Where Things Get Tricky

Here's the scenario that exposes fundamental differences in these architectures:

You're connecting an application to an Aurora cluster with a master node. The cluster endpoint's DNS name stays the same during failover, but the IP address behind it changes when Aurora promotes a replica to become the new master.
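
To make the IP change concrete, here is a small sketch that compares what a hostname resolves to now against what was seen last time. The helper names are hypothetical, and the `resolve` parameter exists only so the check can be exercised without real DNS.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses a hostname currently resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def writer_ip_changed(hostname, last_seen, resolve=resolve_ipv4):
    """Compare the current resolution with the last one observed.

    Returns a (changed, current_ips) tuple.
    """
    current = resolve(hostname)
    return current != sorted(last_seen), current
```

Run against the cluster endpoint before and after a test failover, a check like this would report a change even though the hostname itself is identical.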

With PrivateLink, your NLB needs to be able to route traffic to the Aurora endpoints. Here's where the problem emerges:

# PrivateLink: Using NLB with Aurora
resource "aws_lb_target_group" "aurora_targets" {
  name        = "aurora-endpoint-group"
  port        = 3306
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "ip"

  # Option 1: By IP Address
  # Problem: Aurora failover changes the master IP
  # You need automatic target deregistration and re-registration

  # Option 2: By DNS Name
  # Not available: NLB target groups accept instance, IP, or ALB targets,
  # not DNS names
}

# This is the catch-22:
# - Target by IP? Breaks on Aurora failover
# - Target by DNS? NLB doesn't support it

This is a fundamental limitation of using NLB as the PrivateLink endpoint for dynamic backends. Your NLB can't automatically update its targets when Aurora's underlying IP changes.

Workaround (it's not ideal):

# Auto-detect Aurora failover and update NLB targets
import socket

import boto3

elb = boto3.client('elbv2')
rds = boto3.client('rds')

TARGET_GROUP_ARN = 'arn:aws:elasticloadbalancing:...'
DB_PORT = 3306


def update_nlb_targets_for_aurora():
    """Point the NLB target group at the current Aurora master IP."""

    # The cluster endpoint always resolves to the current master
    clusters = rds.describe_db_clusters(DBClusterIdentifier='my-cluster')
    master_endpoint = clusters['DBClusters'][0]['Endpoint']

    # Resolve DNS to IP; run this inside the VPC so the name resolves
    # to the master's private IP
    master_ip = socket.gethostbyname(master_endpoint)

    # Find registered targets that no longer match the master
    health = elb.describe_target_health(TargetGroupArn=TARGET_GROUP_ARN)
    stale = [
        {'Id': d['Target']['Id'], 'Port': d['Target']['Port']}
        for d in health['TargetHealthDescriptions']
        if d['Target']['Id'] != master_ip
    ]

    # Register the new master first, then drop the stale targets
    response = elb.register_targets(
        TargetGroupArn=TARGET_GROUP_ARN,
        Targets=[{'Id': master_ip, 'Port': DB_PORT}],
    )
    if stale:
        elb.deregister_targets(TargetGroupArn=TARGET_GROUP_ARN, Targets=stale)

    return response

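This requires a Lambda function triggered by failover events (or on a schedule), additional monitoring, and ongoing operational overhead. Connections also keep failing until the targets have been updated.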
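
The decision logic inside such a Lambda can be kept separate from the AWS calls, which makes it easy to test. The sketch below is an assumption-laden illustration: the function names are mine, and the event fields (`source`, `detail.EventCategories`, `detail.SourceIdentifier`) reflect my understanding of how RDS events arrive through EventBridge, so verify them against a real event before relying on this.

```python
def is_cluster_failover(event, cluster_id):
    """True if an EventBridge event looks like a failover for our cluster."""
    detail = event.get("detail", {})
    return (
        event.get("source") == "aws.rds"
        and "failover" in detail.get("EventCategories", [])
        and detail.get("SourceIdentifier") == cluster_id
    )


def plan_target_changes(registered_ips, master_ip):
    """Return (to_register, to_deregister) so only the master IP remains."""
    registered = set(registered_ips)
    to_register = [] if master_ip in registered else [master_ip]
    to_deregister = sorted(registered - {master_ip})
    return to_register, to_deregister
```

A handler would call `is_cluster_failover` first, resolve the cluster endpoint, and then feed the result of `plan_target_changes` into the `register_targets` and `deregister_targets` calls.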

VPC Lattice Advantage

VPC Lattice handles this scenario more elegantly. Its service target groups, like an NLB's, do not accept DNS names, so the feature that matters here is the resource configuration: it can point at a DNS name and is reached through a resource gateway in the provider VPC.

# VPC Lattice: expose Aurora as a DNS-based resource
resource "aws_vpclattice_resource_gateway" "aurora" {
  name       = "aurora-rgw"
  vpc_id     = var.vpc_id
  subnet_ids = var.subnet_ids  # assumed variable: subnets that can reach Aurora
}

resource "aws_vpclattice_resource_configuration" "aurora" {
  name                        = "aurora-rc"
  resource_gateway_identifier = aws_vpclattice_resource_gateway.aurora.id
  protocol                    = "TCP"
  port_ranges                 = ["3306"]

  # Use a DNS target - VPC Lattice supports this natively
  resource_configuration_definition {
    dns_resource {
      domain_name     = aws_rds_cluster.main.endpoint  # DNS name, not IP
      ip_address_type = "IPV4"
    }
  }
}

An Application Load Balancer is not a usable abstraction layer in front of Aurora: ALB listeners handle only HTTP and HTTPS, and ALB target groups do not accept DNS names either. Instead, share the resource configuration with consumers through a service network:

# Make the Aurora resource configuration reachable from a service network
resource "aws_vpclattice_service_network_resource_association" "aurora" {
  resource_configuration_identifier = aws_vpclattice_resource_configuration.aurora.id
  service_network_identifier        = var.service_network_id  # assumed variable
}

Comparison Matrix

Feature                      PrivateLink              VPC Lattice
Dynamic IP handling          Poor (NLB limitation)    Better (DNS support)
Cross-account connectivity   Excellent                Excellent
Observability                CloudWatch only          Built-in metrics & logs
Operational simplicity       Moderate                 Simpler
Cost                         Lower                    Higher
Maturity                     Production-ready         Newer but solid
Multi-region support         Via Transit Gateway      Native support

My Recommendation

Use VPC Lattice when:

  • You need clean service-to-service connectivity
  • Your backends have dynamic IPs (databases, managed services)
  • You want built-in observability
  • You're willing to invest in a newer service

Use PrivateLink when:

  • You have static backend infrastructure (EC2 instances with fixed IPs)
  • You need the lowest operational overhead
  • Cost is a critical factor
  • You're integrating with third-party services already using PrivateLink

Special case - Aurora backends:

  • If using PrivateLink with Aurora, automate NLB target re-registration on failover (an ALB cannot front Aurora, since it only handles HTTP/HTTPS)
  • If using VPC Lattice with Aurora, leverage DNS targeting directly
  • Monitor and test your failover procedures regardless
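
One way to exercise a failover from the client side is a connection probe that retries and looks the hostname up again on each attempt. This is a minimal sketch under my own assumptions: the function name, the linear backoff, and the 3-second timeout are illustrative, and OS or resolver caching can still delay how quickly a new IP is picked up.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, delay=1.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Try to open a TCP connection, looking the host up again on every attempt."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # create_connection performs a fresh name lookup on each call
            return connect((host, port), timeout=3)
        except OSError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay * attempt)  # linear backoff between attempts
    raise ConnectionError(
        f"could not reach {host}:{port} after {attempts} attempts"
    ) from last_error
```

Timing how long this takes to succeed during a forced failover gives a rough measure of the outage your consumers would actually see through each connectivity pattern.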

Wrapping Things Up...

VPC Lattice and PrivateLink Endpoint Services are both solid solutions, but they excel in different scenarios. The Aurora failover scenario I highlighted is just one example where understanding the nuances matters.

As your AWS infrastructure grows more sophisticated, investing time in understanding these connectivity patterns will save you headaches during failure scenarios. Test your failovers. Understand your target group behavior. Monitor your endpoints.

The difference between handling a backend failure gracefully and having a cascading outage often comes down to these architectural decisions made months earlier.

If you found this article useful, let me know on BlueSky or on LinkedIn!