Adding AWS VPC Endpoint for Snowflake PrivateLink using Terraform

Until recently, creating an AWS VPC Endpoint for Snowflake PrivateLink using pure Terraform was not possible, so at Crimson Macaw we submitted a pull request to the Snowflake provider to add this feature.

Snowflake PrivateLink is a feature that allows direct, secure connectivity between Snowflake and your cloud environment, ensuring your network traffic does not traverse the public internet. On AWS this takes the form of an AWS VPC Endpoint, enabling a highly secure network between Snowflake and your VPC, fully protected from unauthorised external access.

Snowflake PrivateLink requires the Business Critical edition (or higher), and to enable the feature you will need to raise a support case with Snowflake.

Initial Setup

The majority of the setup is achievable with the AWS Provider. I'll not walk you through creating those resources here; the assumption is that you have already configured a VPC and its subnets.

For the purpose of this blog, the assumption is that these resources are addressable as aws_vpc.this and aws_subnet.this.*
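As a minimal sketch of that assumed baseline, the network resources might look something like the following; the CIDR ranges and availability zones are placeholders, so adjust them to your environment:

```hcl
# Hypothetical baseline network; CIDR ranges and AZs are illustrative only.
resource "aws_vpc" "this" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "this" {
  count             = 2
  vpc_id            = aws_vpc.this.id
  cidr_block        = cidrsubnet(aws_vpc.this.cidr_block, 8, count.index)
  availability_zone = ["eu-west-1a", "eu-west-1b"][count.index]
}
```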

Creating the VPC Endpoint

Previously, the creation of the VPC Endpoint would have been achieved using variables. The information to supply to the AWS provider was obtainable by executing the SYSTEM$GET_PRIVATELINK_CONFIG function on Snowflake:

SELECT SYSTEM$GET_PRIVATELINK_CONFIG();

This would result in a JSON payload with the following properties.

{
    "privatelink-account-name": "<account_name>",
    "privatelink-account-url": "<privatelink_account_url>",
    "privatelink-ocsp-url": "<privatelink_ocsp_url>",
    "privatelink-vpce-id": "<aws_vpce_id>"
}

At the time of writing this blog post, the function actually returns privatelink_ocsp-url rather than privatelink-ocsp-url.

To complete the setup on AWS you will need the following resources:

variable "snowflake_private_link_vpce_id" {
  description = "The snowflake privatelink vpce id; property privatelink-vpce-id from calling SYSTEM$GET_PRIVATELINK_CONFIG"
  type        = string
}

resource "aws_security_group" "snowflake_private_link" {
  vpc_id = aws_vpc.this.id

  ingress {
    from_port   = 80
    to_port     = 80
    cidr_blocks = [aws_vpc.this.cidr_block]
    protocol    = "tcp"
    description = "Allow communication to Snowflake OCSP URL"
  }

  ingress {
    from_port   = 443
    to_port     = 443
    cidr_blocks = [aws_vpc.this.cidr_block]
    protocol    = "tcp"
    description = "Allow communication to Snowflake Account URL"
  }
}

resource "aws_vpc_endpoint" "snowflake_private_link" {
  vpc_id              = aws_vpc.this.id
  subnet_ids          = [for subnet in aws_subnet.this: subnet.id]
  service_name        = var.snowflake_private_link_vpce_id
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.snowflake_private_link.id]
  private_dns_enabled = false
}

At this point you will have a VPC Endpoint with an AWS Elastic Network Interface in each of the subnets.
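The endpoint's regional DNS name is available via the dns_entry attribute. If you want it available elsewhere, one option (a sketch, with an output name of my own choosing) is to surface it as an output:

```hcl
# Expose the endpoint's regional DNS name for reference or for other modules.
output "snowflake_vpce_dns_name" {
  description = "Regional DNS name of the Snowflake PrivateLink VPC endpoint"
  value       = aws_vpc_endpoint.snowflake_private_link.dns_entry[0]["dns_name"]
}
```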

However, once set up, you will need to address Snowflake on a subdomain of privatelink.snowflakecomputing.com. This is not publicly resolvable, so you will need to add the Account URL and OCSP URL to your own hosted zone for this domain.

If you use Route53, then you can do this with a private zone associated with your VPC:

variable "snowflake_private_link_account_url" {
  description = "The snowflake privatelink account url; property privatelink-account-url from calling SYSTEM$GET_PRIVATELINK_CONFIG"
  type        = string
}

variable "snowflake_private_link_ocsp_url" {
  description = "The snowflake privatelink ocsp url; property privatelink-ocsp-url from calling SYSTEM$GET_PRIVATELINK_CONFIG"
  type        = string
}

resource "aws_route53_zone" "snowflake_private_link" {
  name = "privatelink.snowflakecomputing.com"

  vpc {
    vpc_id = aws_vpc.this.id
  }
}

resource "aws_route53_record" "snowflake_private_link_account_url" {
  zone_id = aws_route53_zone.snowflake_private_link.zone_id
  name    = var.snowflake_private_link_account_url
  type    = "CNAME"
  ttl     = "300"
  records = [aws_vpc_endpoint.snowflake_private_link.dns_entry[0]["dns_name"]]
}

resource "aws_route53_record" "snowflake_private_link_ocsp_url" {
  zone_id = aws_route53_zone.snowflake_private_link.zone_id
  name    = var.snowflake_private_link_ocsp_url
  type    = "CNAME"
  ttl     = "300"
  records = [aws_vpc_endpoint.snowflake_private_link.dns_entry[0]["dns_name"]]
}

Making this dynamic

Our change to the provider wraps up the call to the SYSTEM$GET_PRIVATELINK_CONFIG system function for you; it also handles both privatelink-ocsp-url and privatelink_ocsp-url as returned keys, just in case ;-)

Provider Setup

If you have not done so already, you will need to include the provider in your Terraform code.

terraform {
  required_providers {
    snowflake = {
      source  = "chanzuckerberg/snowflake"
      version = ">= 0.25.4"
    }
  }
}

provider "snowflake" {
  # use environment variables to control access
}

Using snowflake_system_get_privatelink_config

You can make use of the new data source snowflake_system_get_privatelink_config to pull this information in dynamically and map it straight into the AWS setup, so you no longer need the variables declared previously to provide it.

data "snowflake_system_get_privatelink_config" "snowflake_private_link" {}

# ...

resource "aws_vpc_endpoint" "snowflake_private_link" {
  vpc_id              = aws_vpc.this.id
  subnet_ids          = aws_subnet.this.*.id
  service_name        = data.snowflake_system_get_privatelink_config.snowflake_private_link.aws_vpce_id
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.snowflake_private_link.id]
  private_dns_enabled = false
}

# ...

resource "aws_route53_record" "snowflake_private_link_account_url" {
  zone_id = aws_route53_zone.snowflake_private_link.zone_id
  name    = data.snowflake_system_get_privatelink_config.snowflake_private_link.account_url
  type    = "CNAME"
  ttl     = "300"
  records = [aws_vpc_endpoint.snowflake_private_link.dns_entry[0]["dns_name"]]
}

resource "aws_route53_record" "snowflake_private_link_ocsp_url" {
  zone_id = aws_route53_zone.snowflake_private_link.zone_id
  name    = data.snowflake_system_get_privatelink_config.snowflake_private_link.ocsp_url
  type    = "CNAME"
  ttl     = "300"
  records = [aws_vpc_endpoint.snowflake_private_link.dns_entry[0]["dns_name"]]
}

Putting it all together

A complete example of the above:

resource "aws_security_group" "snowflake_private_link" {
  vpc_id = aws_vpc.this.id

  ingress {
    from_port   = 80
    to_port     = 80
    cidr_blocks = [aws_vpc.this.cidr_block]
    protocol    = "tcp"
    description = "Allow communication to Snowflake OCSP URL"
  }

  ingress {
    from_port   = 443
    to_port     = 443
    cidr_blocks = [aws_vpc.this.cidr_block]
    protocol    = "tcp"
    description = "Allow communication to Snowflake Account URL"
  }
}

data "snowflake_system_get_privatelink_config" "snowflake_private_link" {}

resource "aws_vpc_endpoint" "snowflake_private_link" {
  vpc_id              = aws_vpc.this.id
  subnet_ids          = [for subnet in aws_subnet.this: subnet.id]
  service_name        = data.snowflake_system_get_privatelink_config.snowflake_private_link.aws_vpce_id
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.snowflake_private_link.id]
  private_dns_enabled = false
}

resource "aws_route53_zone" "snowflake_private_link" {
  name = "privatelink.snowflakecomputing.com"

  vpc {
    vpc_id = aws_vpc.this.id
  }
}

resource "aws_route53_record" "snowflake_private_link_account_url" {
  zone_id = aws_route53_zone.snowflake_private_link.zone_id
  name    = data.snowflake_system_get_privatelink_config.snowflake_private_link.account_url
  type    = "CNAME"
  ttl     = "300"
  records = [aws_vpc_endpoint.snowflake_private_link.dns_entry[0]["dns_name"]]
}

resource "aws_route53_record" "snowflake_private_link_ocsp_url" {
  zone_id = aws_route53_zone.snowflake_private_link.zone_id
  name    = data.snowflake_system_get_privatelink_config.snowflake_private_link.ocsp_url
  type    = "CNAME"
  ttl     = "300"
  records = [aws_vpc_endpoint.snowflake_private_link.dns_entry[0]["dns_name"]]
}
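As an optional sanity check, you could expose the values pulled from the data source as outputs (the output names here are my own choosing) and compare them against a manual run of SYSTEM$GET_PRIVATELINK_CONFIG:

```hcl
# Illustrative outputs for eyeballing the PrivateLink configuration values.
output "snowflake_privatelink_account_url" {
  value = data.snowflake_system_get_privatelink_config.snowflake_private_link.account_url
}

output "snowflake_privatelink_ocsp_url" {
  value = data.snowflake_system_get_privatelink_config.snowflake_private_link.ocsp_url
}
```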

Summary

Adding changes to any Terraform provider requires coding knowledge of Go and, of course, of the cloud or system being managed by Terraform. Our change to the Snowflake provider is available from release 0.25.4.

As Crimson Macaw is a partner of both AWS and Snowflake, this is a typical example of the type of work that we do. If you are interested in working for us, please refer to our careers section for available opportunities. You can also contact us if you would like Crimson Macaw to work with you on your next Data/Cloud Analytics project.