AWS Auto Scaling Group – CodeDeploy Challenges

Posted on February 20, 2015

First, here is my setup:

A single development / test server in the AWS cloud, backed by a separate Git repository.
When code is completed in the development environment, it is committed to the development branch (using whichever branching scheme best fits the project).
At the same time, the code is merged into the test branch, where it is available for client testing on the ‘test_stage’ site if they would like.
Then, on an as-needed basis, the code in the test branch (on the test_stage server) is deployed to AWS using the CodeDeploy API:
- git archive test -> deploy.zip
- upload the file to an S3 bucket (s3cmd)
- register the zip file as a revision using the AWS RegisterApplicationRevision API call
This creates a revision that can be deployed to any deployment group.
I set up two deployment groups in my AWS account, test and live.
When the client is ready, I run a script which deploys the zipped revision to the Test server, where they are able to look at it and approve it.
Then I use the same method, but deploy to the www deployment group instead.
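
The package, upload, register and deploy steps described above could be sketched as a small shell script. This is only an illustration, not the contents of deploy.php: the application name, bucket name and the DRY_RUN switch are hypothetical placeholders, and it assumes git, s3cmd and the AWS CLI are installed and configured. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the package / upload / register / deploy flow.
# APP_NAME and BUCKET are hypothetical placeholders, not values from the real setup.
set -eu

APP_NAME="my-app"
BUCKET="my-deploy-bucket"
BRANCH="test"
KEY="deploy.zip"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Zip the branch, upload it to S3, and register it as a CodeDeploy revision.
package_and_register() {
  run git archive --format=zip --output="$KEY" "$BRANCH"
  run s3cmd put "$KEY" "s3://$BUCKET/$KEY"
  run aws deploy register-application-revision \
    --application-name "$APP_NAME" \
    --s3-location "bucket=$BUCKET,key=$KEY,bundleType=zip"
}

# Deploy the registered revision to the deployment group named in $1.
deploy_to_group() {
  run aws deploy create-deployment \
    --application-name "$APP_NAME" \
    --deployment-group-name "$1" \
    --s3-location "bucket=$BUCKET,key=$KEY,bundleType=zip"
}
```

With the functions loaded, calling package_and_register followed by deploy_to_group test would cover the client review step, and a later deploy_to_group call with the other group name would push the same revision onward. Set DRY_RUN=0 only once the printed commands look right.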

(The complexities of setting this up go deeper than I will cover in this article, but for future reference, all of this knowledge is captured in our deploy.php file.)

A couple of tricks “they” don’t tell you.

Errors can be difficult to debug. Switching the CodeDeploy agent to verbose logging can help you determine what went wrong.
- update /etc/codedeploy-agent/conf/codedeployagent.yml, set :verbose: to true.
- restart the service: /etc/init.d/codedeploy-agent restart (it can take several minutes to restart, this is normal)
- tail the log file to watch a deployment in real time, or investigate it after the fact (tail -f /var/log/aws/codedeploy-agent/codedeploy-agent.log)
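
As a rough sketch, the verbose-logging change could be scripted like this. The config path, service name and log path are the agent's Linux defaults as I understand them, so treat them as assumptions and adjust if your install differs. The enable_verbose helper is a hypothetical name introduced for this example.

```shell
#!/bin/sh
# Sketch: switch the CodeDeploy agent to verbose logging.
set -eu

# Assumed default config location; override with CONF=... if yours differs.
CONF="${CONF:-/etc/codedeploy-agent/conf/codedeployagent.yml}"

# Rewrite the ":verbose:" line of the config file given in $1 to "true".
# Uses a temp file instead of "sed -i" so it behaves the same on GNU and BSD sed.
enable_verbose() {
  tmp="$(mktemp)"
  sed 's/^:verbose:.*/:verbose: true/' "$1" > "$tmp"
  cat "$tmp" > "$1"   # overwrite contents, keeping the file's owner and mode
  rm -f "$tmp"
}

# Typical use, as root:
#   enable_verbose "$CONF"
#   service codedeploy-agent restart
#   tail -f /var/log/aws/codedeploy-agent/codedeploy-agent.log
```

The helper only touches the :verbose: line, so the rest of the agent configuration is left as it was.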
Deploying a revision to servers while they are going through termination instability is likely to cause your deployment to fail when one of your servers terminates.
- To prevent this, update the Auto Scaling group so that its minimum and maximum are set to the same number of servers, and avoid putting it under load while the deployment runs. Any scaling activity during that window (typically 10 to 15 minutes, but up to 2 hours) will cause errors.
- Depending on the load on your servers, your deployment could take a lot of CPU, which could trigger an autoscaling alarm and either spin up new servers or send you an email. There is no single correct way to deal with this, but it is a good idea to know about it before you deploy.
- Finally, the item that prompted this post: it appears that deploying a revision to an Auto Scaling group can itself cause some failures.
  - The obvious one is that the deployment will fail if it is attempted while a server is shutting down.
  - However, it seems that if you have upgraded your AMI and your Launch Configuration, a deployment will also fail. For me, it actually caused a key failure on login as well (this could have been because of multiple server terminations, with another server taking over the IPs within a few minutes). Anyway, use much caution with these things.
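
One way to hold the group at a fixed size around a deployment, as suggested above, is to set its minimum and maximum to the same value beforehand and restore the range afterwards. The sketch below assumes the AWS CLI is configured. The group name and sizes are placeholders, and the function names and DRY_RUN switch are hypothetical helpers for this example. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: pin an Auto Scaling group to a fixed size during a deployment.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# pin_group_size <group-name> <count>: hold the group at exactly <count> instances.
pin_group_size() {
  run aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$1" \
    --min-size "$2" --max-size "$2" --desired-capacity "$2"
}

# restore_group_size <group-name> <min> <max>: put the normal range back.
restore_group_size() {
  run aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$1" \
    --min-size "$2" --max-size "$3"
}
```

For example, pin_group_size my-asg 2 before the deployment and restore_group_size my-asg 2 6 once it has finished. This does not help with the Launch Configuration problem, only with instances being added or removed mid-deployment.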

UPDATE:

Well, the problem was actually that my ‘afterinstall.sh’ script was cleaning up the /opt/codedeploy-agent/deployment-root/ directory (so we didn’t run out of space after a couple dozen deployments), but in doing so it was also removing the appspec.yml file.

So I updated the command that runs in the afterinstall script to be:

 /usr/bin/find /opt/codedeploy-agent/deployment-root/ -mindepth 2 -mtime +1 -not -path '*deployment-instruction*' -delete
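
To see what those filters do before pointing them at a real server, the same find expression can be tried on a scratch directory. The sketch below assumes GNU find and GNU touch (for the -d '3 days ago' form). The directory and file names are made up to imitate the agent's layout. Note that -mtime +1 only matches entries whose age, counted in whole days, is greater than 1, so anything from the last two days is left alone.

```shell
#!/bin/sh
# Sketch: try the cleanup filters on a throwaway tree instead of the real deployment root.
set -eu

root="$(mktemp -d)"

# Made-up layout: one old deployment, one fresh deployment, and the instructions directory.
mkdir -p "$root/group/d-OLD/deployment-archive" \
         "$root/group/d-NEW/deployment-archive" \
         "$root/deployment-instructions"

touch -d '3 days ago' "$root/group/d-OLD/deployment-archive/old.txt"
touch -d '3 days ago' "$root/deployment-instructions/keep.txt"
touch "$root/group/d-NEW/deployment-archive/appspec.yml"

# Same expression as the afterinstall command, pointed at the scratch tree.
find "$root" -mindepth 2 -mtime +1 -not -path '*deployment-instruction*' -delete

# List the files that survived. Remove the tree with: rm -rf "$root"
find "$root" -type f
```

In this run the old file under d-OLD is removed, while the fresh appspec.yml and the old file under the instructions directory both survive.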