diff --git a/.gitignore b/.gitignore
index f248c36..3bcf9c4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -5,3 +5,4 @@
 # Module directory
 .terraform/
 *.zip
+.idea/
diff --git a/README.md b/README.md
index c1f35be..25ddde1 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,5 @@
 # Terraform module for automatic AMI creation
 
-**WARNING!** AMI cleanup works not yet.
-
 This repo contains a terraform module that creates two lambda functions
 that will create AMI automatically at regular intervals. It is based on
 the code at
@@ -12,18 +10,18 @@ the code at
 
 Include this repository as a module in your existing terraform code:
 
-Notes:
-* `ami_owner` is an AWS account id.
 
 ```
 module "lambda_ami_backup" {
-  source = "git::https://github.com/cloudposse/tf_lambda_ami_backup.git?ref=master"
+  source = "git::https://github.com/cloudposse/tf_ami_backup.git?ref=tags/0.1.0"
 
   name = "${var.name}"
   stage = "${var.stage}"
   namespace = "${var.namespace}"
   region = "${var.region}"
   ami_owner = "${var.ami_owner}"
+  instance_id = "${var.instance_id}"
+  retention_days = "14"
 }
 ```
 
@@ -37,14 +35,9 @@ module "lambda_ami_backup" {
 | name | `` | Name (e.g. `bastion` or `db`) | Yes |
 | region | `` | AWS Region where module should operate (e.g. `us-east-1`)| Yes |
 | ami_owner | `` | AWS Account ID which is used as a filter for AMI list (e.g. `123456789012`)| Yes |
+| instance_id | `` | AWS Instance ID which is used for creating the AMI (e.g. `i-1234567890abcdef0`)| Yes |
+| retention_days | `14` | Number of days to keep the backups (e.g. `14`)| No |
 | backup_schedule | `cron(00 19 * * ? *)` | The scheduling expression. (e.g. cron(0 20 * * ? *) or rate(5 minutes) | No |
 | cleanup_schedule | `cron(05 19 * * ? *)` | The scheduling expression. (e.g. cron(0 20 * * ? *) or rate(5 minutes) | No |
 
 
-## Configuring your instances to be backed up
-
-Tag any instances you want to be backed up with `Snapshot = true`.
-
-By default, old backups will be removed after 7 days, to keep them longer, set
-another tag: `Retention = 14`, where 14 is the number of days you want to keep
-the backups for.
diff --git a/ami_backup.py b/ami_backup.py
index ca1e483..1f52766 100644
--- a/ami_backup.py
+++ b/ami_backup.py
@@ -2,98 +2,50 @@
 #
 # @author Robert Kozora
 #
-# This script will search for all instances having a tag with "Backup" or "backup"
-# on it. As soon as we have the instances list, we loop through each instance
-# and create an AMI of it. Also, it will look for a "Retention" tag key which
-# will be used as a retention policy number in days. If there is no tag with
-# that name, it will use a 7 days default value for each AMI.
+# "retention" is an environment variable used as the retention period in days. If it is missing or is not an
+# integer, a default of 14 days is used for each AMI.
 #
-# After creating the AMI it creates a "DeleteOn" tag on the AMI indicating when
+# After creating the AMI it creates an "AMIDeleteOn" tag on the AMI indicating when
 # it will be deleted using the Retention value and another Lambda function
 
+from __future__ import print_function
 import boto3
 import collections
 import datetime
 import sys
 import pprint
+import os
 
 ec = boto3.client('ec2')
-#image = ec.Image('id')
+ec2_instance_id = os.environ['instance_id']
+label_id = os.environ['label_id']
+
 
 def lambda_handler(event, context):
-    
-    reservations = ec.describe_instances(
-        Filters=[
-            {'Name': 'tag-key', 'Values': ['backup', 'Backup', 'Snapshot']},
+    try:
+        retention_days = int(os.environ['retention'])
+    except (KeyError, ValueError):
+        retention_days = 14
+    create_time = datetime.datetime.now()
+    create_fmt = create_time.strftime('%Y-%m-%d')
+
+    AMIid = ec.create_image(InstanceId=ec2_instance_id,
+                            Name=label_id + "-" + ec2_instance_id + "-" + create_fmt,
+                            Description=label_id + "-" + ec2_instance_id + "-" + create_fmt,
+                            NoReboot=True, DryRun=False)
+
+    print("Retaining AMI %s of instance %s for %d days" % (
+        AMIid['ImageId'],
+        ec2_instance_id,
+        retention_days,
+    ))
+
+    delete_date = datetime.date.today() + datetime.timedelta(days=retention_days)
+    delete_fmt = delete_date.strftime('%m-%d-%Y')
+
+    ec.create_tags(
+        Resources=[ec2_instance_id, AMIid['ImageId']],
+        Tags=[
+            {'Key': 'AMIDeleteOn', 'Value': delete_fmt},
         ]
-    ).get(
-        'Reservations', []
     )
-
-    instances = sum(
-        [
-            [i for i in r['Instances']]
-            for r in reservations
-        ], [])
-
-    print "Found %d instances that need backing up" % len(instances)
-
-    to_tag = collections.defaultdict(list)
-
-    for instance in instances:
-        try:
-            retention_days = [
-                int(t.get('Value')) for t in instance['Tags']
-                if t['Key'] == 'Retention'][0]
-        except IndexError:
-            retention_days = 7
-
-        #for dev in instance['BlockDeviceMappings']:
-        #    if dev.get('Ebs', None) is None:
-        #        continue
-        #    vol_id = dev['Ebs']['VolumeId']
-        #    print "Found EBS volume %s on instance %s" % (
-        #        vol_id, instance['InstanceId'])
-
-        #snap = ec.create_snapshot(
-        #    VolumeId=vol_id,
-        #)
-
-        #create_image(instance_id, name, description=None, no_reboot=False, block_device_mapping=None, dry_run=False)
-        # DryRun, InstanceId, Name, Description, NoReboot, BlockDeviceMappings
-        create_time = datetime.datetime.now()
-        create_fmt = create_time.strftime('%Y-%m-%d-%H-%M-%S')
-
-        AMIid = ec.create_image(InstanceId=instance['InstanceId'], Name="Lambda - " + instance['InstanceId'] + " from " + create_fmt, Description="Lambda created AMI of instance " + instance['InstanceId'] + " from " + create_fmt, NoReboot=True, DryRun=False)
-
-
-        pprint.pprint(instance)
-        #sys.exit()
-        #break
-
-        #to_tag[retention_days].append(AMIid)
-
-        to_tag[retention_days].append(AMIid['ImageId'])
-
-        print "Retaining AMI %s of instance %s for %d days" % (
-            AMIid['ImageId'],
-            instance['InstanceId'],
-            retention_days,
-        )
-
-    print to_tag.keys()
-
-    for retention_days in to_tag.keys():
-        delete_date = datetime.date.today() + datetime.timedelta(days=retention_days)
-        delete_fmt = delete_date.strftime('%m-%d-%Y')
-        print "Will delete %d AMIs on %s" % (len(to_tag[retention_days]), delete_fmt)
-
-        #break
-
-        ec.create_tags(
-            Resources=to_tag[retention_days],
-            Tags=[
-                {'Key': 'DeleteOn', 'Value': delete_fmt},
-            ]
-        )
-    
\ No newline at end of file
diff --git a/ami_cleanup.py b/ami_cleanup.py
index f6ba6bb..0bffaa6 100644
--- a/ami_cleanup.py
+++ b/ami_cleanup.py
@@ -2,13 +2,14 @@
 #
 # @author Robert Kozora
 #
-# This script will search for all instances having a tag with "Backup" or "backup"
-# on it. As soon as we have the instances list, we loop through each instance
-# and reference the AMIs of that instance. We check that the latest daily backup
+# This script will search for all AMIs having a tag with "AMIDeleteOn"
+# on it. Once we have the list of AMIs, we loop through each image
+# and read its tags. We check that the latest daily backup
 # succeeded then we store every image that's reached its DeleteOn tag's date for
-# deletion. We then loop through the AMIs, deregister them and remove all the
+# deletion. We loop through the AMIs, deregister them and remove all the
 # snapshots associated with that AMI.
 
+from __future__ import print_function
 import boto3
 import collections
 import datetime
@@ -18,26 +19,13 @@ ec = boto3.client('ec2', os.environ['region'])
 
 ec2 = boto3.resource('ec2', os.environ['region'])
 
-images = ec2.images.filter(Owners=[os.environ['ami_owner']])
+images = ec2.images.filter(Owners=[os.environ['ami_owner']],
+                           Filters=[{'Name': 'tag-key', 'Values': ['AMIDeleteOn']}])
 
-def lambda_handler(event, context):
-
-    reservations = ec.describe_instances(
-        Filters=[
-            {'Name': 'tag-key', 'Values': ['backup', 'Backup', 'Snapshot']},
-        ]
-    ).get(
-        'Reservations', []
-    )
-
-    instances = sum(
-        [
-            [i for i in r['Instances']]
-            for r in reservations
-        ], [])
-
-    print "Found %d instances that need evaluated" % len(instances)
+label_id = os.environ['label_id']
+instance_id = os.environ['instance_id']
 
+def lambda_handler(event, context):
     to_tag = collections.defaultdict(list)
 
     date = datetime.datetime.now()
@@ -48,53 +36,50 @@ def lambda_handler(event, context):
     # Set to true once we confirm we have a backup taken today
     backupSuccess = False
 
-    # Loop through all of our instances with a tag named "Backup"
-    for instance in instances:
-        imagecount = 0
-
-        # Loop through each image of our current instance
-        for image in images:
-
-            # Our other Lambda Function names its AMIs Lambda - i-instancenumber.
-            # We now know these images are auto created
-            if image.name.startswith('Lambda - ' + instance['InstanceId']):
-
-                # print "FOUND IMAGE " + image.id + " FOR INSTANCE " + instance['InstanceId']
-
-                # Count this image's occcurance
-                imagecount = imagecount + 1
-
-                try:
-                    if image.tags is not None:
-                        deletion_date = [
-                            t.get('Value') for t in image.tags
-                            if t['Key'] == 'DeleteOn'][0]
-                        delete_date = time.strptime(deletion_date, "%m-%d-%Y")
-                except IndexError:
-                    deletion_date = False
-                    delete_date = False
-
-                today_time = datetime.datetime.now().strftime('%m-%d-%Y')
-                # today_fmt = today_time.strftime('%m-%d-%Y')
-                today_date = time.strptime(today_time, '%m-%d-%Y')
-
-                # If image's DeleteOn date is less than or equal to today,
-                # add this image to our list of images to process later
-                if delete_date <= today_date:
-                    imagesList.append(image.id)
-
-                # Make sure we have an AMI from today and mark backupSuccess as true
-                if image.name.endswith(date_fmt):
-                    # Our latest backup from our other Lambda Function succeeded
-                    backupSuccess = True
-                    print "Latest backup from " + date_fmt + " was a success"
-
-        print "instance " + instance['InstanceId'] + " has " + str(imagecount) + " AMIs"
-
-    print "============="
-
-    print "About to process the following AMIs:"
-    print imagesList
+    # Loop through each image
+    for image in images:
+
+        try:
+            if image.tags is not None:
+                deletion_date = [
+                    t.get('Value') for t in image.tags
+                    if t['Key'] == 'AMIDeleteOn'][0]
+                delete_date = time.strptime(deletion_date, "%m-%d-%Y")
+        except IndexError:
+            deletion_date = False
+            delete_date = False
+
+        # Our other Lambda Function names its AMIs label_id-
+        # We now know these images are auto created
+        if image.name.startswith(label_id + '-' + instance_id):
+
+            try:
+                if image.tags is not None:
+                    deletion_date = [
+                        t.get('Value') for t in image.tags
+                        if t['Key'] == 'AMIDeleteOn'][0]
+                    delete_date = time.strptime(deletion_date, "%m-%d-%Y")
+            except IndexError:
+                deletion_date = False
+                delete_date = False
+
+            today_time = datetime.datetime.now().strftime('%m-%d-%Y')
+            today_date = time.strptime(today_time, '%m-%d-%Y')
+
+            # If image's DeleteOn date is less than or equal to today,
+            # add this image to our list of images to process later
+            if delete_date <= today_date:
+                imagesList.append(image.id)
+
+            # Make sure we have an AMI from today and mark backupSuccess as true
+            if image.name.endswith(date_fmt):
+                # Our latest backup from our other Lambda Function succeeded
+                backupSuccess = True
+
+    print("=============")
+
+    print("About to process the following AMIs:")
+    print(imagesList)
 
     if backupSuccess == True:
 
@@ -102,7 +87,7 @@ def lambda_handler(event, context):
 
         # loop through list of image IDs
         for image in imagesList:
-            print "deregistering image %s" % image
+            print("deregistering image %s" % image)
             amiResponse = ec.deregister_image(
                 DryRun=False,
                 ImageId=image,
@@ -111,8 +96,8 @@ def lambda_handler(event, context):
             for snapshot in snapshots:
                 if snapshot['Description'].find(image) > 0:
                     snap = ec.delete_snapshot(SnapshotId=snapshot['SnapshotId'])
-                    print "Deleting snapshot " + snapshot['SnapshotId']
-                    print "-------------"
+                    print("Deleting snapshot " + snapshot['SnapshotId'])
+                    print("-------------")
 
     else:
-        print "No current backup found. Termination suspended."
\ No newline at end of file
+        print("No current backup found. Termination suspended.")
diff --git a/main.tf b/main.tf
index 77dfbf8..279d062 100644
--- a/main.tf
+++ b/main.tf
@@ -31,7 +31,11 @@ data "aws_iam_policy_document" "ami_backup" {
     actions = [
       "ec2:DescribeInstances",
       "ec2:CreateImage",
-      "ec2:CreateTags"
+      "ec2:DescribeImages",
+      "ec2:DeregisterImage",
+      "ec2:DescribeSnapshots",
+      "ec2:DeleteSnapshot",
+      "ec2:CreateTags",
     ]
 
     resources = [
@@ -63,23 +67,31 @@ module "label_backup" {
   source = "git::https://github.com/cloudposse/tf_label.git?ref=tags/0.1.0"
   namespace = "${var.namespace}"
   stage = "${var.stage}"
-  name = "${var.name}-backup"
+  name = "${var.name}-backup-${var.instance_id}"
 }
 
 module "label_cleanup" {
   source = "git::https://github.com/cloudposse/tf_label.git?ref=tags/0.1.0"
   namespace = "${var.namespace}"
   stage = "${var.stage}"
-  name = "${var.name}-cleanup"
+  name = "${var.name}-cleanup-${var.instance_id}"
 }
 
+module "label_role" {
+  source = "git::https://github.com/cloudposse/tf_label.git?ref=tags/0.1.0"
+  namespace = "${var.namespace}"
+  stage = "${var.stage}"
+  name = "${var.name}-${var.instance_id}"
+}
+
+
 resource "aws_iam_role" "ami_backup" {
-  name = "${module.label.id}"
+  name = "${module.label_role.id}"
   assume_role_policy = "${data.aws_iam_policy_document.default.json}"
 }
 
 resource "aws_iam_role_policy" "ami_backup" {
-  name = "${module.label.id}"
+  name = "${module.label_role.id}"
   role = "${aws_iam_role.ami_backup.id}"
   policy = "${data.aws_iam_policy_document.ami_backup.json}"
 }
@@ -87,7 +99,7 @@ resource "aws_iam_role_policy" "ami_backup" {
 resource "aws_lambda_function" "ami_backup" {
   filename = "${path.module}/ami_backup.zip"
   function_name = "${module.label_backup.id}"
-  description = "Automatically backup instances tagged with 'Snapshot: true'"
+  description = "Automatically backup EC2 instance (create AMI)"
   role = "${aws_iam_role.ami_backup.arn}"
   timeout = 60
   handler = "ami_backup.lambda_handler"
@@ -96,8 +108,11 @@ resource "aws_lambda_function" "ami_backup" {
 
   environment = {
     variables = {
-      region    = "${var.region}"
-      ami_owner = "${var.ami_owner}"
+      region      = "${var.region}"
+      ami_owner   = "${var.ami_owner}"
+      instance_id = "${var.instance_id}"
+      retention   = "${var.retention_days}"
+      label_id    = "${module.label.id}"
     }
   }
 }
@@ -105,7 +120,7 @@ resource "aws_lambda_function" "ami_backup" {
 resource "aws_lambda_function" "ami_cleanup" {
   filename = "${path.module}/ami_cleanup.zip"
   function_name = "${module.label_cleanup.id}"
-  description = "Cleanup old AMI backups"
+  description = "Automatically remove AMIs that have expired (delete AMI)"
   role = "${aws_iam_role.ami_backup.arn}"
   timeout = 60
   handler = "ami_cleanup.lambda_handler"
@@ -116,6 +131,8 @@ resource "aws_lambda_function" "ami_cleanup" {
     variables = {
       region    = "${var.region}"
       ami_owner = "${var.ami_owner}"
+      instance_id = "${var.instance_id}"
+      label_id = "${module.label.id}"
     }
   }
 }
@@ -159,3 +176,4 @@ resource "aws_lambda_permission" "ami_cleanup" {
   principal = "events.amazonaws.com"
   source_arn = "${aws_cloudwatch_event_rule.ami_cleanup.arn}"
 }
+
diff --git a/variables.tf b/variables.tf
index 727e90d..7b7bfe0 100644
--- a/variables.tf
+++ b/variables.tf
@@ -16,9 +16,9 @@ variable "region" {
   default = ""
 }
 
-variable "retention" {
-  default = ""
-}
+variable "retention_days" {}
+
+variable "instance_id" {}
 
 variable "name" {
   default = ""