
Issues faced while launching ECS Tasks (pulling image from an ECR repo) from a private subnet

I created a private subnet. I created an ECR repo with Private visibility and pushed an image into it.

Then, I created an ECS Cluster.

I added a Task Definition with no Task Role and with a Task Execution Role (which ecs-tasks.amazonaws.com can assume) that has the AmazonECSTaskExecutionRolePolicy permission policy attached. The container in the task definition has private repository authentication enabled.
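
For reference, a Task Execution Role like the one described above could be created from the CLI roughly as follows. This is a minimal sketch, not the exact steps I used: the role name ecsTaskExecutionRole is a placeholder, and the script only prints the aws commands unless RUN=1 is set.

```shell
# Placeholder role name (assumption); replace with your own.
ROLE_NAME="${ROLE_NAME:-ecsTaskExecutionRole}"
TRUST_POLICY_FILE="$(mktemp)"

# Trust policy that lets the ECS tasks service assume the role.
cat > "$TRUST_POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Print the command; execute it only when RUN=1.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

run aws iam create-role \
  --role-name "$ROLE_NAME" \
  --assume-role-policy-document "file://${TRUST_POLICY_FILE}"

# Attach the AWS managed policy named in the text above.
run aws iam attach-role-policy \
  --role-name "$ROLE_NAME" \
  --policy-arn "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
```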

Then, I created a task as follows:

aws ecs run-task --task-definition <task-definition-name> --cluster <ecs-cluster-name> --network-configuration '{"awsvpcConfiguration": {"subnets":["<subnet-id>"], "securityGroups": ["<sg-id>"], "assignPublicIp": "DISABLED" }}' --count 1 --launch-type FARGATE

The task did not start and stopped with the following error:

ERROR: ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 3 time(s): RequestError: send request failed caused by: Post "https://api.ecr.<region>.amazonaws.com/": dial tcp <Public-IP>:443: i/o timeout

FIX: I added the following VPC Interface endpoints and attached a Security Group to them:

com.amazonaws.<region>.ecr.api

com.amazonaws.<region>.ecr.dkr

The Security Group associated with these VPC endpoints has the following rules:

Inbound - HTTPS from VPC CIDR

Outbound - HTTPS to Anywhere
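
The two interface endpoints above could be created from the CLI roughly like this. It is a minimal sketch under assumptions: REGION, VPC_ID, SUBNET_ID and SG_ID are placeholder values (not from my setup), and the script only prints the commands unless RUN=1 is set.

```shell
# Placeholder values (assumptions); replace with your own IDs.
REGION="${REGION:-us-east-1}"
VPC_ID="${VPC_ID:-vpc-0123456789abcdef0}"
SUBNET_ID="${SUBNET_ID:-subnet-0123456789abcdef0}"
SG_ID="${SG_ID:-sg-0123456789abcdef0}"

# Print the command; execute it only when RUN=1.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Create one interface endpoint for the service suffix given in $1
# (for example ecr.api), attached to the subnet and security group above.
create_interface_endpoint() {
  run aws ec2 create-vpc-endpoint \
    --region "$REGION" \
    --vpc-id "$VPC_ID" \
    --vpc-endpoint-type Interface \
    --service-name "com.amazonaws.${REGION}.$1" \
    --subnet-ids "$SUBNET_ID" \
    --security-group-ids "$SG_ID" \
    --private-dns-enabled
}

for svc in ecr.api ecr.dkr; do
  create_interface_endpoint "$svc"
done
```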

Running the task again produced the following error:

CannotPullContainerError: inspect image has been retried 5 time(s): failed to resolve ref "<ECR-Image-URI>": failed to do request: Head "<ECR-URI>/v2/<name>/manifests/latest": dial tcp: lookup <ECR-URI> on <Private-IP>:53: no such host

FIX: I associated these VPC endpoints with the same private subnet where the ECS task was launched.
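
If an endpoint already exists but is not attached to the task's subnet, one way to add the subnet from the CLI is aws ec2 modify-vpc-endpoint. A minimal sketch, assuming placeholder values for REGION, ENDPOINT_ID and SUBNET_ID; it only prints the command unless RUN=1 is set. Private DNS names for interface endpoints also depend on DNS resolution and DNS hostnames being enabled on the VPC, which this sketch assumes.

```shell
# Placeholder values (assumptions); replace with your own IDs.
REGION="${REGION:-us-east-1}"
ENDPOINT_ID="${ENDPOINT_ID:-vpce-0123456789abcdef0}"
SUBNET_ID="${SUBNET_ID:-subnet-0123456789abcdef0}"

# Print the command; execute it only when RUN=1.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Attach the task's subnet to the interface endpoint whose ID is given in $1.
add_subnet_to_endpoint() {
  run aws ec2 modify-vpc-endpoint \
    --region "$REGION" \
    --vpc-endpoint-id "$1" \
    --add-subnet-ids "$SUBNET_ID"
}

add_subnet_to_endpoint "$ENDPOINT_ID"
```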

ERROR: CannotPullContainerError: ref pull has been retried 5 time(s): failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://prod-<region>-starport-layer-bucket.s3.<region>.amazonaws.com/

FIX: Add the following VPC Gateway endpoint:

com.amazonaws.<region>.s3

Associate it with the Route Table used by the private subnet (my Route Table has no NAT Gateway or Internet Gateway).
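
The gateway endpoint and its Route Table association could be created in one CLI call, sketched below. REGION, VPC_ID and ROUTE_TABLE_ID are placeholder values (assumptions), and the command is only printed unless RUN=1 is set.

```shell
# Placeholder values (assumptions); replace with your own IDs.
REGION="${REGION:-us-east-1}"
VPC_ID="${VPC_ID:-vpc-0123456789abcdef0}"
ROUTE_TABLE_ID="${ROUTE_TABLE_ID:-rtb-0123456789abcdef0}"

# Print the command; execute it only when RUN=1.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Create the S3 gateway endpoint and add its route to the given Route Table.
create_s3_gateway_endpoint() {
  run aws ec2 create-vpc-endpoint \
    --region "$REGION" \
    --vpc-id "$VPC_ID" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.${REGION}.s3" \
    --route-table-ids "$ROUTE_TABLE_ID"
}

create_s3_gateway_endpoint
```

A gateway endpoint takes no Security Group; traffic reaches it through the route added to the Route Table.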

ERROR: ResourceInitializationError: failed to validate logger args: : signal: killed

FIX: I added the following VPC Interface endpoint and attached the same Security Group as above:

com.amazonaws.<region>.logs

The ECS Task started running after this.
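
To recap, the task needed four endpoints: ecr.api, ecr.dkr, s3 and logs. The sketch below is one way to check which of them a VPC is still missing. REGION and VPC_ID are placeholder values, the helper function names are my own, and the live lookup only runs when RUN=1 is set.

```shell
# Placeholder values (assumptions); replace with your own.
REGION="${REGION:-us-east-1}"
VPC_ID="${VPC_ID:-vpc-0123456789abcdef0}"

# Service names of the four endpoints used in this post.
required_endpoints() {
  for svc in ecr.api ecr.dkr s3 logs; do
    echo "com.amazonaws.${REGION}.${svc}"
  done
}

# Print every required service name that is absent from the
# space-separated list passed in $1.
missing_endpoints() {
  for name in $(required_endpoints); do
    case " $1 " in
      *" $name "*) ;;
      *) echo "$name" ;;
    esac
  done
}

# Live check against the VPC; runs only when RUN=1.
if [ "${RUN:-0}" = "1" ]; then
  present="$(aws ec2 describe-vpc-endpoints \
    --region "$REGION" \
    --filters "Name=vpc-id,Values=${VPC_ID}" \
    --query 'VpcEndpoints[].ServiceName' \
    --output text | tr '\t\n' '  ')"
  missing_endpoints "$present"
fi
```

An empty result means all four endpoints exist in the VPC; it does not verify subnet, Security Group or Route Table associations.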

