I created a private subnet. I created an ECR repo with Private visibility and pushed an image into it.
Then, I created an ECS Cluster.
I added a Task Definition with no Task Role and a Task Execution Role (ecs-tasks.amazonaws.com can assume this role) that has the AmazonECSTaskExecutionRolePolicy permission policy attached. The container in the task definition has private repository authentication enabled.
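For reference, here is a minimal sketch of how a task execution role like the one described above could be created from the CLI. The function name create_execution_role and the role name passed to it are hypothetical placeholders; the trust policy simply lets ecs-tasks.amazonaws.com assume the role, and the attached policy is the AWS managed AmazonECSTaskExecutionRolePolicy.

```shell
# Hypothetical helper: create a task execution role that ECS tasks can assume
# and attach the AWS managed execution policy to it.
create_execution_role() {
  local role_name="$1"   # placeholder, e.g. my-ecs-task-execution-role
  local trust_policy='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ecs-tasks.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

  # Create the role with the trust policy for the ECS tasks service principal.
  aws iam create-role \
    --role-name "$role_name" \
    --assume-role-policy-document "$trust_policy"

  # Attach the managed policy that allows pulling from ECR and writing logs.
  aws iam attach-role-policy \
    --role-name "$role_name" \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
}
```

Usage would look like: create_execution_role <role-name>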
Then, I created a task as follows:
aws ecs run-task --task-definition <task-definition-name> --cluster <ecs-cluster-name> --network-configuration '{"awsvpcConfiguration": {"subnets":["<subnet-id>"], "securityGroups": ["<sg-id>"], "assignPublicIp": "DISABLED" }}' --count 1 --launch-type FARGATE
The task did not start and stopped with the following error:
ERROR: ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 3 time(s): RequestError: send request failed caused by: Post "https://api.ecr.<region>.amazonaws.com/": dial tcp <Public-IP>:443: i/o timeout
FIX: I added the following VPC Interface endpoints and attached a Security Group to them:
com.amazonaws.<region>.ecr.api
com.amazonaws.<region>.ecr.dkr
The Security Group associated with these VPC endpoints has the following rules:
Inbound - HTTPS from VPC CIDR
Outbound - HTTPS to Anywhere
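The endpoint and security group setup above could be scripted roughly as follows. This is only a sketch: the function names and all arguments (region, VPC ID, subnet ID, security group ID, VPC CIDR) are hypothetical placeholders, and the subnet passed in is assumed to be the one the task runs in.

```shell
# Hypothetical helper: allow inbound HTTPS from the VPC CIDR on the
# security group attached to the endpoints.
allow_https_from_vpc() {
  local sg_id="$1"
  local vpc_cidr="$2"
  aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" --protocol tcp --port 443 --cidr "$vpc_cidr"
}

# Hypothetical helper: create the two ECR interface endpoints
# (ecr.api and ecr.dkr) in the given VPC and subnet.
create_ecr_endpoints() {
  local region="$1"
  local vpc_id="$2"
  local subnet_id="$3"
  local sg_id="$4"
  local svc
  for svc in ecr.api ecr.dkr; do
    aws ec2 create-vpc-endpoint \
      --vpc-id "$vpc_id" \
      --vpc-endpoint-type Interface \
      --service-name "com.amazonaws.${region}.${svc}" \
      --subnet-ids "$subnet_id" \
      --security-group-ids "$sg_id"
  done
}
```

Usage would look like: allow_https_from_vpc <sg-id> <vpc-cidr> followed by create_ecr_endpoints <region> <vpc-id> <subnet-id> <sg-id>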
Running the task again produced the following error:
CannotPullContainerError: inspect image has been retried 5 time(s): failed to resolve ref "<ECR-Image-URI>": failed to do request: Head "<ECR-URI>/v2/<name>/manifests/latest": dial tcp: lookup <ECR-URI> on <Private-IP>:53: no such host
FIX: I associated the same private subnet (where the ECS task was launched) with these VPC endpoints.
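If an interface endpoint already exists, one way to associate the task's subnet with it from the CLI is modify-vpc-endpoint. A minimal sketch, where the function name and both IDs are hypothetical placeholders:

```shell
# Hypothetical helper: add a subnet to an existing interface endpoint.
attach_subnet_to_endpoint() {
  local endpoint_id="$1"   # placeholder, the ID of the existing VPC endpoint
  local subnet_id="$2"     # placeholder, the subnet the ECS task runs in
  aws ec2 modify-vpc-endpoint \
    --vpc-endpoint-id "$endpoint_id" \
    --add-subnet-ids "$subnet_id"
}
```

This would be run once per endpoint (ecr.api and ecr.dkr), for example: attach_subnet_to_endpoint <vpc-endpoint-id> <subnet-id>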
ERROR: CannotPullContainerError: ref pull has been retried 5 time(s): failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://prod-<region>-starport-layer-bucket.s3.<region>.amazonaws.com/
FIX: Add the following VPC Gateway endpoint:
com.amazonaws.<region>.s3
Associate it with a Route Table (my route table has no NAT or Internet Gateway).
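The S3 gateway endpoint and its route table association could be created in one call, roughly like this. It is a sketch under the assumption that the route table passed in is the one used by the private subnet; the function name and arguments are hypothetical placeholders.

```shell
# Hypothetical helper: create an S3 gateway endpoint and associate it
# with a route table in the same VPC.
create_s3_gateway_endpoint() {
  local region="$1"
  local vpc_id="$2"
  local route_table_id="$3"
  aws ec2 create-vpc-endpoint \
    --vpc-id "$vpc_id" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.${region}.s3" \
    --route-table-ids "$route_table_id"
}
```

Usage would look like: create_s3_gateway_endpoint <region> <vpc-id> <route-table-id>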
ERROR: ResourceInitializationError: failed to validate logger args: : signal: killed
FIX: I added the following VPC Interface endpoint and attached the same Security Group as above:
com.amazonaws.<region>.logs
The ECS Task started running after this.
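To recap, the task needed four endpoints: ecr.api, ecr.dkr, s3 and logs. A quick way to check which of them a VPC still lacks might look like the sketch below. The function name missing_endpoints and its arguments are hypothetical placeholders; it only compares service names and does not check subnets, security groups or route tables.

```shell
# Hypothetical helper: print the service names of the four endpoints
# used in this post that are not yet present in the given VPC.
missing_endpoints() {
  local region="$1"
  local vpc_id="$2"
  local present svc name

  # Service names of all endpoints in the VPC, whitespace-separated.
  present="$(aws ec2 describe-vpc-endpoints \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'VpcEndpoints[].ServiceName' \
    --output text)"

  for svc in ecr.api ecr.dkr s3 logs; do
    name="com.amazonaws.${region}.${svc}"
    # Unquoted $present collapses tabs and newlines into single spaces.
    case " $(echo $present) " in
      *" $name "*) ;;        # endpoint exists, nothing to report
      *) echo "$name" ;;     # endpoint is missing
    esac
  done
}
```

Usage would look like: missing_endpoints <region> <vpc-id>. Empty output means all four service names were found.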