AWS Backup Implementation Guide

Based on BACKUP-CONFIGURATION.md Specification

Account: 828879644785
CLI Profile: mnemonica
Date: 2025-10-31

This guide provides step-by-step instructions to implement a comprehensive backup strategy for S3 buckets and RDS PostgreSQL databases with dual-tier recovery capabilities and cross-region disaster recovery.

Note: This guide includes Object Lock configuration sections marked as [DEFERRABLE]. These can be implemented later as they require creating a new bucket. All other sections can be implemented immediately on existing infrastructure.


Table of Contents

  1. Prerequisites
  2. Pre-Implementation Checklist
  3. Step 0: Backup Current Configurations
  4. Current Infrastructure Status
  5. Phase 1: S3 Source Bucket Configuration
  6. Phase 2: S3 Replica Bucket Configuration
  7. Phase 3: S3 Cross-Region Replication
  8. Phase 4: IAM Security Configuration
  9. Phase 5: RDS Operational Backup Configuration
  10. Phase 6: AWS Backup for RDS
  11. Phase 7: Validation and Testing
  12. Phase 8: Monitoring and Alerting
  13. Recovery Procedures
  14. Summary Checklist
  15. [DEFERRABLE] Object Lock Implementation

Prerequisites

Production Environment

Current Metrics

Required Permissions

Your AWS user/role must have permissions to:

AWS CLI Configuration

# Verify AWS CLI is configured with mnemonica profile
aws sts get-caller-identity --no-cli-pager --profile mnemonica

# Expected output: Account "828879644785"
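
Because every command in this guide targets production, the identity check can be turned into a guard that stops a script when the profile resolves to a different account. The following is a minimal sketch; `require_account` is a hypothetical helper name, not part of the AWS CLI.

```shell
# Hypothetical guard: compare the account ID returned by STS ($1)
# with the account ID this guide expects ($2).
require_account() {
  actual="$1"
  expected="$2"
  if [ "$actual" = "$expected" ]; then
    echo "OK: using account $actual"
  else
    echo "ABORT: profile resolves to account '$actual', expected '$expected'" >&2
    return 1
  fi
}

# Usage (requires AWS credentials, so it is left commented out here):
# require_account "$(aws sts get-caller-identity --query Account --output text \
#   --no-cli-pager --profile mnemonica)" 828879644785 || exit 1
```

Placing the guard at the top of any script built from the later phases makes an accidental run against another account fail before any change is applied.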

Pre-Implementation Checklist


Step 0: Backup Current Configurations

Before making any changes, save current configurations for rollback:

# Create backups directory if it doesn't exist
mkdir -p backups

# Backup media bucket lifecycles
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-media-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica > backups/source-bucket-lifecycle-backup.json

aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-media-replica \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica > backups/replica-bucket-lifecycle-backup.json

# Backup vault bucket lifecycles
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-vault-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica > backups/vault-source-bucket-lifecycle-backup.json

# Note: vault-replica has no lifecycle policy currently, skip backup
echo "No lifecycle policy configured for mne-vault-replica (will be configured in Phase 2B)" > backups/vault-replica-bucket-lifecycle-backup.json

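# Back up the current AWS Backup plan
# Note: Phase 6 references plan ID 16cae9bf-5a0c-4ac7-8fb6-6f9ae2eec630, which differs
# from the ID below. Confirm the correct ID with `aws backup list-backup-plans` first.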
aws backup get-backup-plan \
  --backup-plan-id e88b2f03-25d3-4bc0-a585-f6994e54cdaa \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica > backups/aws-backup-plan-backup.json
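
A shell redirect creates the output file even when the AWS CLI call fails, so an empty backup file is easy to miss. The sketch below checks that each backup file exists and is non-empty before you continue; `verify_backups` is a hypothetical helper name.

```shell
# Hypothetical check: every file passed as an argument must exist and be
# non-empty. An empty file usually means the AWS CLI call failed while the
# redirect still created the output file.
verify_backups() {
  status=0
  for f in "$@"; do
    if [ -s "$f" ]; then
      echo "OK: $f"
    else
      echo "MISSING OR EMPTY: $f"
      status=1
    fi
  done
  return "$status"
}

# Usage after running the commands above:
# verify_backups backups/*.json || echo "Fix the failed backups before continuing"
```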

Current Infrastructure Status

What's Already Working ✅

Media Buckets:

Vault Buckets:

RDS:

What Needs Changes 🔧

Media Buckets:

Vault Buckets:

RDS:

IAM:

What's Deferred ⏸️


Phase 1: S3 Source Bucket Configuration

Bucket: mne-media-prod
Region: eu-west-1
Status: Already exists, needs lifecycle adjustment

Step 1.1: Review Current Configuration

# Check current versioning
aws s3api get-bucket-versioning \
  --bucket mne-media-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Check current lifecycle rules
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-media-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Current State:

Step 1.2: Update Lifecycle Configuration

Goal: Change noncurrent version retention from 180 days to 35 days per plan specification.

Configuration file: configs/source-bucket-lifecycle-35d.json (already created)

Note: Current configuration was backed up in Step 0.

# Apply new lifecycle configuration
aws s3api put-bucket-lifecycle-configuration \
  --bucket mne-media-prod \
  --lifecycle-configuration file://configs/source-bucket-lifecycle-35d.json \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Step 1.3: Validate Changes

# Verify new configuration
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-media-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Expected: NoncurrentVersionExpiration: { "NoncurrentDays": 35 }
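
The expected value can also be checked automatically instead of by eye. The sketch below uses only `tr` and `grep`, so it does not depend on `jq`; the function names are hypothetical. It assumes the JSON layout printed by `get-bucket-lifecycle-configuration`. The `jq` filter in Step 7.1 is the more robust option where `jq` is available.

```shell
# Print the NoncurrentDays value of every NoncurrentVersionExpiration
# block in the lifecycle JSON read from stdin (one value per line).
# Newlines are removed first because grep matches line by line.
noncurrent_expiration_days() {
  tr -d '\n' \
    | grep -o '"NoncurrentVersionExpiration"[[:space:]]*:[[:space:]]*{[^}]*}' \
    | grep -o '"NoncurrentDays"[[:space:]]*:[[:space:]]*[0-9][0-9]*' \
    | grep -o '[0-9][0-9]*$'
}

# Succeed only if at least one value is found and every value equals $1.
check_noncurrent_days() {
  expected="$1"
  found="$(noncurrent_expiration_days || true)"
  if [ -z "$found" ]; then
    echo "FAIL: no NoncurrentVersionExpiration found"
    return 1
  fi
  for days in $found; do
    if [ "$days" != "$expected" ]; then
      echo "FAIL: NoncurrentDays=$days, expected $expected"
      return 1
    fi
  done
  echo "OK: NoncurrentDays=$expected"
}

# Usage (requires AWS credentials):
# aws s3api get-bucket-lifecycle-configuration --bucket mne-media-prod \
#   --region eu-west-1 --no-cli-pager --profile mnemonica | check_noncurrent_days 35
```

The same check applies to the other buckets by changing the bucket, region, and expected number of days.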

Impact:


Phase 2: S3 Replica Bucket Configuration

Bucket: mne-media-replica
Region: eu-west-3
Status: Already exists, needs lifecycle adjustment

Step 2.1: Review Current Configuration

# Check current versioning
aws s3api get-bucket-versioning \
  --bucket mne-media-replica \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

# Check current lifecycle rules
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-media-replica \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

# Check Object Lock status
aws s3api get-object-lock-configuration \
  --bucket mne-media-replica \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

Current State:

Step 2.2: Update Lifecycle Configuration

Goal:

Design Principle: Replica bucket should mirror the source bucket. Current versions stay as long as they exist in source. Only noncurrent versions (deleted/replaced) expire after 180 days.

Configuration file: configs/replica-bucket-lifecycle-180d-mirror.json (already created)

Note: Current configuration was backed up in Step 0.

# Apply new lifecycle configuration
aws s3api put-bucket-lifecycle-configuration \
  --bucket mne-media-replica \
  --lifecycle-configuration file://configs/replica-bucket-lifecycle-180d-mirror.json \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

Step 2.3: Validate Changes

# Verify new configuration
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-media-replica \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

# Expected:
# - No expiration on current versions
# - NoncurrentVersionExpiration: { "NoncurrentDays": 180 }

Impact:


Phase 2A: S3 Vault Source Bucket Configuration

Bucket: mne-vault-prod
Region: eu-west-1
Status: Already exists, needs lifecycle adjustment

Step 2A.1: Review Current Configuration

# Check current versioning
aws s3api get-bucket-versioning \
  --bucket mne-vault-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Check current lifecycle rules
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-vault-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Check replication configuration
aws s3api get-bucket-replication \
  --bucket mne-vault-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Current State:

Step 2A.2: Update Lifecycle Configuration

Goal: Change noncurrent version retention from 180 days to 35 days to match media-prod retention strategy.

Configuration file: configs/vault-source-bucket-lifecycle-35d.json (already created)

Note: Current configuration was backed up in Step 0.

# Apply new lifecycle configuration
aws s3api put-bucket-lifecycle-configuration \
  --bucket mne-vault-prod \
  --lifecycle-configuration file://configs/vault-source-bucket-lifecycle-35d.json \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Step 2A.3: Validate Changes

# Verify new configuration
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-vault-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Expected: NoncurrentVersionExpiration: { "NoncurrentDays": 35 }

Impact:


Phase 2B: S3 Vault Replica Bucket Configuration

Bucket: mne-vault-replica
Region: eu-north-1 (Stockholm)
Status: Needs lifecycle policy ⚠️

Step 2B.1: Review Current Configuration

# Check current versioning
aws s3api get-bucket-versioning \
  --bucket mne-vault-replica \
  --region eu-north-1 \
  --no-cli-pager --profile mnemonica

# Check current lifecycle rules (currently none)
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-vault-replica \
  --region eu-north-1 \
  --no-cli-pager --profile mnemonica

Current State:

Step 2B.2: Apply Lifecycle Configuration

Goal: Configure 180-day noncurrent version retention, matching the media replica bucket (Phase 2) and aligning with the Deep Archive minimum storage duration (180 days).

Design Principle: Replica bucket should mirror the source bucket. Current versions stay as long as they exist in source. Only noncurrent versions (deleted/replaced) expire after 180 days.

Configuration file: configs/vault-replica-bucket-lifecycle-180d-mirror.json (already created)

# Apply lifecycle configuration
aws s3api put-bucket-lifecycle-configuration \
  --bucket mne-vault-replica \
  --lifecycle-configuration file://configs/vault-replica-bucket-lifecycle-180d-mirror.json \
  --region eu-north-1 \
  --no-cli-pager --profile mnemonica

Step 2B.3: Validate Changes

# Verify new configuration
aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-vault-replica \
  --region eu-north-1 \
  --no-cli-pager --profile mnemonica

# Expected:
# - No expiration on current versions
# - NoncurrentVersionExpiration: { "NoncurrentDays": 180 }
# - AbortIncompleteMultipartUpload: 180 days

Impact:


Phase 3: S3 Cross-Region Replication

Status: Already configured and working ✅

Step 3.1: Verify Media Bucket Replication

# Check media bucket replication configuration
aws s3api get-bucket-replication \
  --bucket mne-media-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Media Bucket Replication:

No changes needed - Media bucket replication is working correctly.

Step 3.2: Verify Vault Bucket Replication

# Check vault bucket replication configuration
aws s3api get-bucket-replication \
  --bucket mne-vault-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Vault Bucket Replication:

No changes needed - Vault bucket replication is working correctly.

Step 3.3: Test Replication (Optional)

Test media bucket replication:

# Upload a test file to media bucket (keep the exact name in a variable)
TEST_FILE="test-media-replication-$(date +%s).txt"
echo "Test media replication - $(date)" > "$TEST_FILE"
aws s3 cp "$TEST_FILE" s3://mne-media-prod/ --region eu-west-1 --no-cli-pager --profile mnemonica

# Wait 2-5 minutes, then check replica
aws s3 ls s3://mne-media-replica/ --region eu-west-3 --no-cli-pager --profile mnemonica | grep "$TEST_FILE"

# Clean up (S3 keys do not support shell wildcards, so pass the exact key)
aws s3 rm "s3://mne-media-prod/$TEST_FILE" --region eu-west-1 --no-cli-pager --profile mnemonica
rm -f "$TEST_FILE"

Test vault bucket replication:

# Upload a test file to vault bucket (keep the exact name in a variable)
VAULT_TEST_FILE="test-vault-replication-$(date +%s).txt"
echo "Test vault replication - $(date)" > "$VAULT_TEST_FILE"
aws s3 cp "$VAULT_TEST_FILE" s3://mne-vault-prod/ --region eu-west-1 --no-cli-pager --profile mnemonica

# Wait 2-5 minutes, then check replica
aws s3 ls s3://mne-vault-replica/ --region eu-north-1 --no-cli-pager --profile mnemonica | grep "$VAULT_TEST_FILE"

# Clean up (S3 keys do not support shell wildcards, so pass the exact key)
aws s3 rm "s3://mne-vault-prod/$VAULT_TEST_FILE" --region eu-west-1 --no-cli-pager --profile mnemonica
rm -f "$VAULT_TEST_FILE"
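
Instead of waiting a fixed 2-5 minutes, the replica check can be polled until the object appears. The sketch below shows a generic retry loop; `retry_until` and `replica_has_object` are hypothetical helper names, and the key in the usage line is a placeholder.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts, and stop as soon as it succeeds.
retry_until() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}

# Hypothetical check: succeeds once the replica returns metadata for the key.
# replica_has_object() {
#   aws s3api head-object --bucket "$1" --key "$2" --region "$3" \
#     --no-cli-pager --profile mnemonica >/dev/null 2>&1
# }
#
# Usage: poll every 30 seconds for up to 5 minutes (placeholder key).
# retry_until 10 30 replica_has_object mne-media-replica test-media-replication-example.txt eu-west-3
```

Polling returns as soon as replication completes and fails with a non-zero status if the object never arrives, which is easier to script than a fixed sleep.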

Phase 4: IAM Security Configuration

Step 4.1: Verify Existing IAM Roles

# Check replication role
aws iam get-role \
  --role-name s3crr_role_for_mne-media-prod_to_crr-media-prod \
  --no-cli-pager --profile mnemonica

# Check AWS Backup service role
aws iam get-role \
  --role-name AWSBackupDefaultServiceRole \
  --no-cli-pager --profile mnemonica

Status: ✅ Both roles exist and are working

Step 4.2: Create Application IAM Role

Purpose: Restrict application access to the source buckets only (media and vault), and deny both version deletion and access to the replica buckets.

Configuration files:

# Create the role
aws iam create-role \
  --role-name eks-mnemonica-prod-s3-role \
  --assume-role-policy-document file://configs/application-iam-role-trust-policy.json \
  --description "Restricted S3 access for eks-mnemonica-prod application" \
  --no-cli-pager --profile mnemonica

# Add the inline policy to the role
aws iam put-role-policy \
  --role-name eks-mnemonica-prod-s3-role \
  --policy-name eks-mnemonica-prod-s3-access \
  --policy-document file://configs/application-s3-policy.json \
  --no-cli-pager --profile mnemonica

Step 4.3: Validate Application Role

# Verify role exists
aws iam get-role \
  --role-name eks-mnemonica-prod-s3-role \
  --no-cli-pager --profile mnemonica

# Get role ARN for application configuration
aws iam get-role \
  --role-name eks-mnemonica-prod-s3-role \
  --query 'Role.Arn' \
  --output text \
  --no-cli-pager --profile mnemonica

# Expected: arn:aws:iam::828879644785:role/eks-mnemonica-prod-s3-role

Policy Restrictions:


Phase 5: RDS Operational Backup Configuration

Instance: eks-mnemonica-prod
Engine: PostgreSQL 16.8
Region: eu-west-1

Step 5.1: Verify Current RDS Backup Configuration

# Check current backup settings
aws rds describe-db-instances \
  --db-instance-identifier eks-mnemonica-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  --query 'DBInstances[0].{BackupRetentionPeriod:BackupRetentionPeriod,LatestRestorableTime:LatestRestorableTime,PreferredBackupWindow:PreferredBackupWindow}'

Current Configuration:

No changes needed - RDS automated backups are correctly configured.

Step 5.2: Validate PITR Capability

# Verify point-in-time recovery is available
aws rds describe-db-instances \
  --db-instance-identifier eks-mnemonica-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  --query 'DBInstances[0].[LatestRestorableTime,BackupRetentionPeriod]'

# Expected: Recent timestamp and 35 days

Capabilities:


Phase 6: AWS Backup for RDS

Purpose: Long-term disaster recovery snapshots with 6-hour frequency and 180-day retention

Architecture Note (Updated 2025-11-22):

Step 6.1: Review Current AWS Backup Plan

# List current backup plans
aws backup list-backup-plans \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Get details of existing plan
aws backup get-backup-plan \
  --backup-plan-id 16cae9bf-5a0c-4ac7-8fb6-6f9ae2eec630 \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Target Configuration:

Step 6.2: Update Backup Plan Configuration

Configuration File: configs/aws-backup-plan-6h-snapshots.json

This configuration contains only the 6-hour snapshot rule (PITR rule has been removed).

# Update the backup plan to use the snapshot-only configuration
aws backup update-backup-plan \
  --backup-plan-id 16cae9bf-5a0c-4ac7-8fb6-6f9ae2eec630 \
  --backup-plan file://configs/aws-backup-plan-6h-snapshots.json \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Expected Result:

Step 6.3: Validate Backup Plan

# Verify backup plan configuration
aws backup get-backup-plan \
  --backup-plan-id 16cae9bf-5a0c-4ac7-8fb6-6f9ae2eec630 \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Expected:
# - BackupPlanName: "eks-mnemonica-6h-snapshots-180d"
# - Single rule: 6-Hour-Snapshots-180-Day-Retention-DR-Copy
# - ScheduleExpression: "cron(0 */6 * * ? *)"
# - EnableContinuousBackup: false
# - Lifecycle DeleteAfterDays: 180
# - CopyActions to eu-west-3 with 180-day retention
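
To make the schedule concrete, the sketch below lists the UTC start times implied by the hour field `*/6` and estimates how many recovery points accumulate per region under 180-day retention. The function names are hypothetical, and the count is a steady-state approximation that ignores failed or skipped jobs.

```shell
# List the UTC start times for an hour field of the form */N with minute 0.
schedule_times() {
  step="$1"
  hour=0
  times=""
  while [ "$hour" -lt 24 ]; do
    times="$times$(printf '%02d:00' "$hour") "
    hour=$((hour + step))
  done
  echo "${times% }"
}

# Approximate number of recovery points kept per region at steady state.
retained_recovery_points() {
  step="$1"
  retention_days="$2"
  echo $(( (24 / step) * retention_days ))
}

schedule_times 6                 # 00:00 06:00 12:00 18:00
retained_recovery_points 6 180   # 720
```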

Step 6.4: Monitor Backup Jobs

# Wait for next backup (within 6 hours), then check
aws backup list-backup-jobs \
  --by-resource-arn arn:aws:rds:eu-west-1:828879644785:db:eks-mnemonica-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  --max-results 5

# Expected:
# - State: COMPLETED
# - CreatedBy.BackupRuleName: "6-Hour-Snapshots-180-Day-Retention-DR-Copy"
# - BackupVaultName: "eks-mnemonica-prod-vault"

# Check DR region copy
aws backup list-backup-jobs \
  --by-resource-arn arn:aws:rds:eu-west-1:828879644785:db:eks-mnemonica-prod \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica \
  --max-results 5

Backup Schedule:


Phase 7: Validation and Testing

Step 7.1: Validate S3 Configuration

# Source bucket validation
aws s3api get-bucket-versioning \
  --bucket mne-media-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-media-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  | jq '.Rules[] | {ID, NoncurrentVersionExpiration}'

# Replica bucket validation
aws s3api get-bucket-versioning \
  --bucket mne-media-replica \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

aws s3api get-bucket-lifecycle-configuration \
  --bucket mne-media-replica \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica \
  | jq '.Rules[] | {ID, NoncurrentVersionExpiration}'

Expected Results:

Step 7.2: Test File Deletion Recovery

# 1. Upload test file (keep the exact key in a variable; S3 keys do not support wildcards)
TEST_KEY="test-recovery-$(date +%s).txt"
echo "Test file - $(date)" > "$TEST_KEY"
aws s3 cp "$TEST_KEY" s3://mne-media-prod/ \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# 2. Wait for replication (5 minutes)
sleep 300

# 3. Verify replicated
aws s3 ls s3://mne-media-replica/ \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica | grep "$TEST_KEY"

# 4. Delete from source (creates delete marker)
aws s3 rm "s3://mne-media-prod/$TEST_KEY" \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# 5. Verify deleted
aws s3 ls s3://mne-media-prod/ \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica | grep "$TEST_KEY"
# (Should not appear)

# 6. List versions to see delete marker
aws s3api list-object-versions \
  --bucket mne-media-prod \
  --prefix "$TEST_KEY" \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# 7. Remove delete marker to restore
DELETE_MARKER_ID=$(aws s3api list-object-versions \
  --bucket mne-media-prod \
  --prefix "$TEST_KEY" \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  --query 'DeleteMarkers[0].VersionId' \
  --output text)

aws s3api delete-object \
  --bucket mne-media-prod \
  --key "$TEST_KEY" \
  --version-id "$DELETE_MARKER_ID" \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# 8. Verify restored
aws s3 ls s3://mne-media-prod/ \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica | grep "$TEST_KEY"
# (Should appear again)

# 9. Clean up
aws s3 rm "s3://mne-media-prod/$TEST_KEY" \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica
rm -f "$TEST_KEY"

Step 7.3: Validate RDS Backup Configuration

# Check RDS automated backup
aws rds describe-db-instances \
  --db-instance-identifier eks-mnemonica-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  | jq '.DBInstances[0] | {BackupRetentionPeriod, LatestRestorableTime, PreferredBackupWindow}'

# Check AWS Backup jobs
aws backup list-backup-jobs \
  --by-resource-arn arn:aws:rds:eu-west-1:828879644785:db:eks-mnemonica-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  --max-results 10

# Verify cross-region copies
aws backup list-recovery-points-by-backup-vault \
  --backup-vault-name Default \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica \
  --max-results 5

Phase 8: Monitoring and Alerting

Objective: Set up essential monitoring for backup failures, replication issues, and lifecycle policy problems.

Estimated Time: 15 minutes

Cost Impact: ~$1.70/month (SNS + CloudWatch Alarms)

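Architecture Note: All CloudWatch alarms are created in eu-west-1 (same region as the SNS topic) for simplicity. CloudWatch alarms generally evaluate metrics in the region where the alarm is created, so confirm that the metrics for buckets in eu-west-3 and eu-north-1 are actually visible from eu-west-1 (for example with `aws cloudwatch list-metrics --namespace AWS/S3 --region eu-west-1`). If they are not, create the affected alarms, and an SNS topic for them, in the bucket's own region.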

Step 8.1: Create SNS Topic for Alerts

Create an SNS topic in eu-west-1 that will handle all backup and replication alerts.

# Create SNS topic
aws sns create-topic \
  --name mnemonica-backup-alerts \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Save the TopicArn from the output - you'll need it for subsequent steps.

Subscribe email address(es):

# Replace with your email address
aws sns subscribe \
  --topic-arn arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --protocol email \
  --notification-endpoint your-email@example.com \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Confirm the subscription via the email you receive

Subscribe Slack/Teams webhook:

First, create an incoming webhook in your Slack/Teams workspace:

Then subscribe the webhook to SNS:

# Replace with your webhook URL
aws sns subscribe \
  --topic-arn arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --protocol https \
  --notification-endpoint https://hooks.slack.com/services/YOUR/WEBHOOK/URL \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

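Note: SNS delivers its own JSON envelope and requires an HTTPS endpoint to confirm the subscription, which Slack and Teams incoming webhooks typically cannot do. In practice, subscribe a Lambda function to the topic and have it post formatted messages to the webhook.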

Step 8.2: AWS Backup Job Failure Alarms

Create EventBridge rule to detect failed AWS Backup jobs.

# Create EventBridge rule for backup failures
aws events put-rule \
  --name mnemonica-backup-job-failures \
  --event-pattern file://configs/monitoring-backup-failure-pattern.json \
  --state ENABLED \
  --description "Alert on AWS Backup job failures" \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Add SNS topic as target
aws events put-targets \
  --rule mnemonica-backup-job-failures \
  --targets "Id"="1","Arn"="arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts" \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Grant EventBridge permission to publish to SNS
aws sns set-topic-attributes \
  --topic-arn arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --attribute-name Policy \
  --attribute-value file://configs/monitoring-sns-policy.json \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica
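
The two files referenced above, configs/monitoring-backup-failure-pattern.json and configs/monitoring-sns-policy.json, are not reproduced in this guide. As a rough illustration of what such files commonly contain, the sketch below writes example versions to a scratch directory. The contents are assumptions, not the actual project files, so compare them with the real ones before use.

```shell
# Write illustrative copies to a scratch directory so the real files in
# configs/ are not overwritten. The file contents are assumptions.
SKETCH_DIR="$(mktemp -d)"

# EventBridge pattern: AWS Backup jobs that ended unsuccessfully.
cat > "$SKETCH_DIR/backup-failure-pattern.json" <<'EOF'
{
  "source": ["aws.backup"],
  "detail-type": ["Backup Job State Change"],
  "detail": {
    "state": ["FAILED", "ABORTED", "EXPIRED"]
  }
}
EOF

# SNS topic policy: allow EventBridge to publish to the alert topic.
cat > "$SKETCH_DIR/sns-policy.json" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowEventBridgePublish",
      "Effect": "Allow",
      "Principal": {"Service": "events.amazonaws.com"},
      "Action": "sns:Publish",
      "Resource": "arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts"
    }
  ]
}
EOF

echo "Sketch files written to $SKETCH_DIR"
```

Because `set-topic-attributes` replaces the entire topic policy, a real policy would normally also keep the default statement that grants the account owner access. Otherwise other publishers in the account, such as CloudWatch alarms, may lose permission to publish.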

Repeat for DR region (eu-west-3):

# Create rule in DR region
aws events put-rule \
  --name mnemonica-backup-job-failures-dr \
  --event-pattern file://configs/monitoring-backup-failure-pattern.json \
  --state ENABLED \
  --description "Alert on AWS Backup job failures in DR region" \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

# Create SNS topic in DR region
aws sns create-topic \
  --name mnemonica-backup-alerts-dr \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

# Subscribe same email to DR topic
aws sns subscribe \
  --topic-arn arn:aws:sns:eu-west-3:828879644785:mnemonica-backup-alerts-dr \
  --protocol email \
  --notification-endpoint your-email@example.com \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

# Add SNS as target
aws events put-targets \
  --rule mnemonica-backup-job-failures-dr \
  --targets "Id"="1","Arn"="arn:aws:sns:eu-west-3:828879644785:mnemonica-backup-alerts-dr" \
  --region eu-west-3 \
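  --no-cli-pager --profile mnemonica

# Note: the DR topic also needs a topic policy that lets EventBridge publish,
# as set for the eu-west-1 topic above, with the DR topic ARN as the resource.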

Step 8.3: S3 Replication Monitoring

Monitor replication lag and failures for both media and vault buckets.

First, verify the replication rule IDs:

# Get media bucket replication rule ID
aws s3api get-bucket-replication \
  --bucket mne-media-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  --query 'ReplicationConfiguration.Rules[0].ID' \
  --output text

# Get vault bucket replication rule ID
aws s3api get-bucket-replication \
  --bucket mne-vault-prod \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  --query 'ReplicationConfiguration.Rules[0].ID' \
  --output text

# Save the outputs and replace rule IDs in the commands below

Create replication monitoring alarms for media buckets:

# IMPORTANT: Replace "paris-replica" with your actual media bucket rule ID

# Media bucket: replication lag alarm
aws cloudwatch put-metric-alarm \
  --alarm-name mnemonica-media-replication-lag \
  --alarm-description "Alert when media bucket replication takes more than 15 minutes" \
  --metric-name ReplicationLatency \
  --namespace AWS/S3 \
  --statistic Maximum \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 900 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=SourceBucket,Value=mne-media-prod Name=DestinationBucket,Value=mne-media-replica Name=RuleId,Value=paris-replica \
  --alarm-actions arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --treat-missing-data notBreaching \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Media bucket: replication failures alarm
aws cloudwatch put-metric-alarm \
  --alarm-name mnemonica-media-replication-failures \
  --alarm-description "Alert when media bucket replication operations fail" \
  --metric-name OperationsFailedReplication \
  --namespace AWS/S3 \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --dimensions Name=SourceBucket,Value=mne-media-prod Name=DestinationBucket,Value=mne-media-replica Name=RuleId,Value=paris-replica \
  --alarm-actions arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --treat-missing-data notBreaching \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Create replication monitoring alarms for vault buckets:

# IMPORTANT: Replace "vault-replica-rule" with your actual vault bucket rule ID

# Vault bucket: replication lag alarm
aws cloudwatch put-metric-alarm \
  --alarm-name mnemonica-vault-replication-lag \
  --alarm-description "Alert when vault bucket replication takes more than 15 minutes" \
  --metric-name ReplicationLatency \
  --namespace AWS/S3 \
  --statistic Maximum \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 900 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=SourceBucket,Value=mne-vault-prod Name=DestinationBucket,Value=mne-vault-replica Name=RuleId,Value=vault-replica-rule \
  --alarm-actions arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --treat-missing-data notBreaching \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Vault bucket: replication failures alarm
aws cloudwatch put-metric-alarm \
  --alarm-name mnemonica-vault-replication-failures \
  --alarm-description "Alert when vault bucket replication operations fail" \
  --metric-name OperationsFailedReplication \
  --namespace AWS/S3 \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --dimensions Name=SourceBucket,Value=mne-vault-prod Name=DestinationBucket,Value=mne-vault-replica Name=RuleId,Value=vault-replica-rule \
  --alarm-actions arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --treat-missing-data notBreaching \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica
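
The four alarms above differ only in bucket names, rule ID, and metric, so they can be derived from a small table. The sketch below is a dry run that prints a summary line per alarm rather than calling the AWS CLI; `replication_dimensions` is a hypothetical helper, and the rule IDs are the same placeholders used above. Note that S3 publishes ReplicationLatency in seconds, so a 15-minute limit corresponds to a threshold of 900.

```shell
# Build the --dimensions value shared by both alarms of one replication rule.
replication_dimensions() {
  printf 'Name=SourceBucket,Value=%s Name=DestinationBucket,Value=%s Name=RuleId,Value=%s' \
    "$1" "$2" "$3"
}

# ReplicationLatency is reported in seconds, so 15 minutes is 900.
LAG_THRESHOLD=$((15 * 60))

# Dry run: print one line per alarm instead of calling the AWS CLI.
# Columns: label, source bucket, destination bucket, replication rule ID.
while read -r label src dst rule; do
  dims="$(replication_dimensions "$src" "$dst" "$rule")"
  echo "mnemonica-$label-replication-lag ReplicationLatency>$LAG_THRESHOLD $dims"
  echo "mnemonica-$label-replication-failures OperationsFailedReplication>=1 $dims"
done <<'EOF'
media mne-media-prod mne-media-replica paris-replica
vault mne-vault-prod mne-vault-replica vault-replica-rule
EOF
```

Replacing the `echo` lines with the corresponding `aws cloudwatch put-metric-alarm` calls keeps the bucket and rule names in one place when a rule ID changes.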

Step 8.4: Lifecycle Policy Monitoring

Monitor for unexpected object deletions in replica buckets (potential lifecycle misconfiguration).

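Note: These alarms are created in eu-west-1 (same region as the SNS topic) but target buckets in other regions. S3 publishes storage metrics such as NumberOfObjects in the bucket's own region, so check that each alarm actually receives data. If it does not, recreate the alarm in eu-west-3 or eu-north-1 together with an SNS topic in that region.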

Media replica bucket (mne-media-replica in eu-west-3):

# Create anomaly detection alarm for media replica bucket object count
# Alarm is in eu-west-1, monitoring bucket in eu-west-3
aws cloudwatch put-metric-alarm \
  --alarm-name mnemonica-media-replica-object-count-anomaly \
  --alarm-description "Alert when media replica bucket object count deviates from expected baseline" \
  --comparison-operator LessThanLowerThreshold \
  --evaluation-periods 1 \
  --metrics '[
    {
      "Id": "m1",
      "ReturnData": true,
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/S3",
          "MetricName": "NumberOfObjects",
          "Dimensions": [
            {"Name": "BucketName", "Value": "mne-media-replica"},
            {"Name": "StorageType", "Value": "AllStorageTypes"}
          ]
        },
        "Period": 86400,
        "Stat": "Average"
      }
    },
    {
      "Id": "ad1",
      "Expression": "ANOMALY_DETECTION_BAND(m1, 2)",
      "Label": "Media Replica Object Count Anomaly Detection"
    }
  ]' \
  --threshold-metric-id ad1 \
  --alarm-actions arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --treat-missing-data notBreaching \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Vault replica bucket (mne-vault-replica in eu-north-1):

# Create anomaly detection alarm for vault replica bucket object count
# Alarm is in eu-west-1, monitoring bucket in eu-north-1
aws cloudwatch put-metric-alarm \
  --alarm-name mnemonica-vault-replica-object-count-anomaly \
  --alarm-description "Alert when vault replica bucket object count deviates from expected baseline" \
  --comparison-operator LessThanLowerThreshold \
  --evaluation-periods 1 \
  --metrics '[
    {
      "Id": "m1",
      "ReturnData": true,
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/S3",
          "MetricName": "NumberOfObjects",
          "Dimensions": [
            {"Name": "BucketName", "Value": "mne-vault-replica"},
            {"Name": "StorageType", "Value": "AllStorageTypes"}
          ]
        },
        "Period": 86400,
        "Stat": "Average"
      }
    },
    {
      "Id": "ad1",
      "Expression": "ANOMALY_DETECTION_BAND(m1, 2)",
      "Label": "Vault Replica Object Count Anomaly Detection"
    }
  ]' \
  --threshold-metric-id ad1 \
  --alarm-actions arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --treat-missing-data notBreaching \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Benefits of Anomaly Detection:

Step 8.5: Validation

Test SNS topic delivery:

# Send test notification
aws sns publish \
  --topic-arn arn:aws:sns:eu-west-1:828879644785:mnemonica-backup-alerts \
  --subject "Test: Backup Monitoring Alert" \
  --message "This is a test notification from your backup monitoring system. If you received this, notifications are working correctly." \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Check email and Slack/Teams channel to confirm receipt.

Verify alarms exist:

# List all alarms (all are in eu-west-1)
aws cloudwatch describe-alarms \
  --alarm-names mnemonica-media-replication-lag mnemonica-media-replication-failures mnemonica-vault-replication-lag mnemonica-vault-replication-failures mnemonica-media-replica-object-count-anomaly mnemonica-vault-replica-object-count-anomaly \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# List alarms currently in ALARM state (expect none; new alarms start as OK or INSUFFICIENT_DATA)
aws cloudwatch describe-alarms \
  --state-value ALARM \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica
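
For a scripted check, the alarm list can be reduced to name and state and scanned for anything firing. The sketch below reads tab-separated `AlarmName StateValue` lines from stdin; `report_firing_alarms` is a hypothetical helper name.

```shell
# Read "AlarmName<TAB>StateValue" lines on stdin and report any alarm that
# is in the ALARM state. Returns non-zero if at least one is found.
report_firing_alarms() {
  firing=0
  while read -r name state; do
    [ -n "$name" ] || continue
    if [ "$state" = "ALARM" ]; then
      echo "FIRING: $name"
      firing=1
    fi
  done
  return "$firing"
}

# Usage (requires AWS credentials):
# aws cloudwatch describe-alarms --alarm-name-prefix mnemonica- \
#   --query 'MetricAlarms[].[AlarmName,StateValue]' --output text \
#   --region eu-west-1 --no-cli-pager --profile mnemonica | report_firing_alarms
```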

Verify EventBridge rules:

# List rules
aws events list-rules \
  --name-prefix mnemonica-backup \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

# Check rule targets
aws events list-targets-by-rule \
  --rule mnemonica-backup-job-failures \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

Recovery Procedures

Scenario 1: Deleted File Recovery (0-35 days)

Objective: Restore a deleted file from source bucket with perfect consistency

RTO: 30-60 minutes
RPO: <1 minute

Steps:

  1. Identify deletion timestamp:
# List object versions to find when it was deleted
aws s3api list-object-versions \
  --bucket mne-media-prod \
  --prefix path/to/file.ext \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica
  2. Restore RDS to exact timestamp via PITR:
# Restore DB to specific point in time
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier eks-mnemonica-prod \
  --target-db-instance-identifier eks-mnemonica-prod-restored-$(date +%Y%m%d-%H%M) \
  --restore-time "2025-10-30T15:30:00Z" \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica
  3. Restore S3 file from noncurrent version:
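# Get the newest version ID of the exact key (the version that was current before deletion).
# The Key filter guards against other keys that merely share the same prefix.
VERSION_ID=$(aws s3api list-object-versions \
  --bucket mne-media-prod \
  --prefix path/to/file.ext \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica \
  --query "Versions[?Key=='path/to/file.ext'] | [0].VersionId" \
  --output text)

# Copy version to restore it as current (quoted so the shell does not treat "?" as a glob)
aws s3api copy-object \
  --copy-source "mne-media-prod/path/to/file.ext?versionId=${VERSION_ID}" \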
  --bucket mne-media-prod \
  --key path/to/file.ext \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica
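
Steps 1 and 2 are linked by a timestamp: the PITR restore time has to fall before the moment the object was deleted. The following is a minimal sketch of deriving that value, under stated assumptions: the function name restore_time_before and the 60-second default margin are illustrative, and the -d option requires GNU date (BSD/macOS date uses different flags).

```shell
# Hypothetical helper: print an ISO-8601 UTC timestamp that lies a safety
# margin (in seconds) before the given deletion timestamp.
restore_time_before() {
  local deleted_at="$1"     # e.g. LastModified of the delete marker
  local margin="${2:-60}"   # assumed default margin: 60 seconds
  local epoch
  epoch=$(date -u -d "$deleted_at" +%s) || return 1
  date -u -d "@$((epoch - margin))" +%Y-%m-%dT%H:%M:%SZ
}

# Usage sketch (not executed here):
# DELETED_AT=$(aws s3api list-object-versions \
#   --bucket mne-media-prod --prefix path/to/file.ext \
#   --query 'DeleteMarkers[?IsLatest].LastModified | [0]' --output text \
#   --region eu-west-1 --no-cli-pager --profile mnemonica)
# RESTORE_TIME=$(restore_time_before "$DELETED_AT")
# ...then pass --restore-time "$RESTORE_TIME" to restore-db-instance-to-point-in-time
```

The margin is a judgment call: a larger value lowers the risk of restoring a database state that already reflects the deletion, at the cost of losing more recent writes in the restored copy.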

Scenario 2: Deleted File Recovery (35-180 days)

Objective: Restore file from replica bucket Deep Archive

RTO: 12-48 hours (Deep Archive restoration time)
RPO: Up to 6 hours (AWS Backup snapshot frequency)

Steps:

  1. Initiate restoration from Deep Archive:
# Start restore operation (from Deep Archive, Standard tier typically completes within 12 hours; Bulk within 48 hours)
aws s3api restore-object \
  --bucket mne-media-replica \
  --key path/to/file.ext \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}' \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica
  2. Check restoration status:
aws s3api head-object \
  --bucket mne-media-replica \
  --key path/to/file.ext \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica \
  --query 'Restore'
  3. Download restored file (after restoration completes):
aws s3 cp s3://mne-media-replica/path/to/file.ext ./restored-file.ext \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica
  4. Restore RDS from closest AWS Backup snapshot:
# List recovery points near the desired time
aws backup list-recovery-points-by-backup-vault \
  --backup-vault-name Default \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica

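# Restore from recovery point
# (if the job is rejected for missing metadata, inspect the stored values with: aws backup get-recovery-point-restore-metadata)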
aws backup start-restore-job \
  --recovery-point-arn <recovery-point-arn> \
  --metadata '{"DBInstanceIdentifier":"eks-mnemonica-prod-restored"}' \
  --iam-role-arn arn:aws:iam::828879644785:role/service-role/AWSBackupDefaultServiceRole \
  --region eu-west-1 \
  --no-cli-pager --profile mnemonica
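
The Restore value returned by head-object in step 2 is a single string, which makes it awkward to script around. The following is a minimal sketch of interpreting it, assuming the command is run with --query 'Restore' --output text so the value arrives unescaped (or as None when no restore has been requested). The function name restore_state and the state labels are hypothetical.

```shell
# Hypothetical helper: map the raw Restore header value to a short state label.
restore_state() {
  case "$1" in
    *'ongoing-request="true"'*)  echo "in-progress" ;;
    *'ongoing-request="false"'*) echo "ready" ;;
    ""|None|null)                echo "not-requested" ;;
    *)                           echo "unknown" ;;
  esac
}

# Usage sketch (not executed here):
# RAW=$(aws s3api head-object \
#   --bucket mne-media-replica --key path/to/file.ext \
#   --query 'Restore' --output text \
#   --region eu-west-3 --no-cli-pager --profile mnemonica)
# [ "$(restore_state "$RAW")" = "ready" ] && echo "Safe to download"
```

Only proceed to the download in step 3 once the state is ready; copying an object that is still being restored fails.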

Scenario 3: Complete Region Failure (eu-west-1)

Objective: Failover to eu-west-3 for disaster recovery

RTO: 12-48 hours
RPO: Up to 6 hours

Steps:

  1. Restore RDS from eu-west-3 snapshot:
# List recovery points in DR region
aws backup list-recovery-points-by-backup-vault \
  --backup-vault-name Default \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

# Restore RDS instance in eu-west-3
aws backup start-restore-job \
  --recovery-point-arn <recovery-point-arn-in-eu-west-3> \
  --metadata '{"DBInstanceIdentifier":"eks-mnemonica-prod-dr"}' \
  --iam-role-arn arn:aws:iam::828879644785:role/service-role/AWSBackupDefaultServiceRole \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica
  2. Access S3 replica bucket:
# Replica bucket is already in eu-west-3
# Update application configuration to use replica bucket
# S3 endpoint: s3.eu-west-3.amazonaws.com
  3. Restore S3 objects from Deep Archive (if needed):

Option A: Restore individual files (see Scenario 2 for single-file restore commands)

Option B: Bulk restore using S3 Batch Operations (recommended for large-scale recovery):

S3 Batch Operations allows you to restore thousands of objects in parallel from Deep Archive.

Step 3a: Create S3 Inventory (if not already configured)

# Create inventory configuration for the replica bucket
aws s3api put-bucket-inventory-configuration \
  --bucket mne-media-replica \
  --id mnemonica-replica-inventory \
  --inventory-configuration file://configs/s3-inventory-config.json \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

# Example inventory config (save as configs/s3-inventory-config.json):
# {
#   "Destination": {
#     "S3BucketDestination": {
#       "AccountId": "828879644785",
#       "Bucket": "arn:aws:s3:::mne-media-replica",
#       "Format": "CSV",
#       "Prefix": "inventory/"
#     }
#   },
#   "IsEnabled": true,
#   "Id": "mnemonica-replica-inventory",
#   "IncludedObjectVersions": "Current",
#   "Schedule": {
#     "Frequency": "Daily"
#   }
# }

Step 3b: Create IAM role for Batch Operations (first time only)

# Create trust policy (save as configs/batch-ops-trust-policy.json):
# {
#   "Version": "2012-10-17",
#   "Statement": [{
#     "Effect": "Allow",
#     "Principal": {"Service": "batchoperations.s3.amazonaws.com"},
#     "Action": "sts:AssumeRole"
#   }]
# }

aws iam create-role \
  --role-name S3BatchOperationsRole \
  --assume-role-policy-document file://configs/batch-ops-trust-policy.json \
  --no-cli-pager --profile mnemonica

# Attach policy with S3 permissions (AmazonS3FullAccess is broad; scope it down for production use)
aws iam attach-role-policy \
  --role-name S3BatchOperationsRole \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess \
  --no-cli-pager --profile mnemonica

Step 3c: Create Batch Restore Job

# Create manifest with objects to restore (use inventory CSV or create custom manifest)
# Manifest format: bucket,key
# Example: mne-media-replica,path/to/file1.jpg

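# Create batch job (example for restoring all objects)
# Replace MANIFEST_ETAG, and point ObjectArn at the dated manifest.json that S3 Inventory writes under the inventory/ prefix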
aws s3control create-job \
  --account-id 828879644785 \
  --operation '{
    "S3InitiateRestoreObject": {
      "ExpirationInDays": 7,
      "GlacierJobParameters": {
        "Tier": "Bulk"
      }
    }
  }' \
  --manifest '{
    "Spec": {
      "Format": "S3InventoryReport_CSV_20161130"
    },
    "Location": {
      "ObjectArn": "arn:aws:s3:::mne-media-replica/inventory/manifest.json",
      "ETag": "MANIFEST_ETAG"
    }
  }' \
  --report '{
    "Bucket": "arn:aws:s3:::mne-media-replica",
    "Format": "Report_CSV_20180820",
    "Enabled": true,
    "Prefix": "batch-restore-reports/",
    "ReportScope": "AllTasks"
  }' \
  --priority 10 \
  --role-arn arn:aws:iam::828879644785:role/S3BatchOperationsRole \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica \
  --description "Bulk restore from Deep Archive for DR"

# Monitor job progress
aws s3control describe-job \
  --account-id 828879644785 \
  --job-id <JOB_ID_FROM_CREATE_OUTPUT> \
  --region eu-west-3 \
  --no-cli-pager --profile mnemonica

Important Notes:

Documentation:

  4. Update application DNS/endpoints to point to eu-west-3
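
The MANIFEST_ETAG placeholder in the Step 3c job definition has to be filled with the ETag of the manifest object. S3 returns ETags wrapped in literal double quotes, while the create-job examples in the AWS documentation show the bare value. The following is a minimal sketch of bridging the two, assuming the bare value is what create-job expects. The function name strip_etag_quotes and the MANIFEST_KEY variable are hypothetical.

```shell
# Hypothetical helper: remove one leading and one trailing double quote
# from an ETag value, leaving the rest untouched.
strip_etag_quotes() {
  local etag="$1"
  etag="${etag#\"}"
  etag="${etag%\"}"
  printf '%s\n' "$etag"
}

# Usage sketch (not executed here; MANIFEST_KEY is a placeholder for the
# key of the manifest.json you intend to use):
# RAW_ETAG=$(aws s3api head-object \
#   --bucket mne-media-replica --key "$MANIFEST_KEY" \
#   --query 'ETag' --output text \
#   --region eu-west-3 --no-cli-pager --profile mnemonica)
# MANIFEST_ETAG=$(strip_etag_quotes "$RAW_ETAG")
```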

Summary Checklist


What's NOT Implemented (Deferred)

⏸️ S3 Object Lock on replica bucket


[DEFERRABLE] Object Lock Implementation

Status: ⏸️ DEFERRED - Requires creating new bucket or risk acceptance

Object Lock provides immutability (WORM) protection for disaster recovery data. This section can be implemented later as a separate project.

Why Object Lock?

Protection Against:

Requirement from Plan: GOVERNANCE mode, 180-day default retention on replica bucket

Current Limitation

Object Lock is NOT enabled on mne-media-replica

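Problem: This section assumes Object Lock can ONLY be enabled at bucket creation time and cannot be added to existing buckets. S3 has since added support for enabling Object Lock on existing versioning-enabled buckets, so re-check current behavior before choosing an option below.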

Implementation Options

Option A: Create New Bucket with Object Lock (Full Compliance)

Steps:

  1. Create new bucket mne-media-replica-v2 with Object Lock enabled
  2. Configure Object Lock: GOVERNANCE mode, 180-day retention
  3. Update replication configuration to point to new bucket
  4. Choose migration strategy:
    • Parallel operation: Keep both buckets, let old one expire naturally (recommended)
    • Copy data: Restore from Deep Archive and copy (expensive: ~$2,570)
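
Steps 1 and 2 of Option A could look roughly like the following. This is a minimal sketch rather than the separate implementation guide: the function name write_object_lock_config and the configs/object-lock-config.json path are hypothetical, the bucket name comes from step 1, and the mode and retention come from the stated requirement (GOVERNANCE, 180 days).

```shell
# Hypothetical helper: write an Object Lock configuration file with a
# GOVERNANCE-mode default retention (180 days unless overridden).
write_object_lock_config() {
  local path="$1" days="${2:-180}"
  cat > "$path" <<EOF
{
  "ObjectLockEnabled": "Enabled",
  "Rule": {
    "DefaultRetention": {
      "Mode": "GOVERNANCE",
      "Days": ${days}
    }
  }
}
EOF
}

# Usage sketch (not executed here):
# aws s3api create-bucket \
#   --bucket mne-media-replica-v2 \
#   --object-lock-enabled-for-bucket \
#   --create-bucket-configuration LocationConstraint=eu-west-3 \
#   --region eu-west-3 --no-cli-pager --profile mnemonica
# write_object_lock_config configs/object-lock-config.json
# aws s3api put-object-lock-configuration \
#   --bucket mne-media-replica-v2 \
#   --object-lock-configuration file://configs/object-lock-config.json \
#   --region eu-west-3 --no-cli-pager --profile mnemonica
```

Default retention applies to objects written after the configuration is in place, which is one reason the parallel-operation strategy in step 4 fits this option.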

Pros:

Cons:

Cost:

Option B: Enhanced IAM Policies (Partial Protection)

Steps:

  1. Apply strict bucket policies denying deletion
  2. Use MFA delete on bucket
  3. Restrict IAM permissions

Pros:

Cons:

Option C: Accept Risk (Document Only)

Steps:

  1. Document risk acceptance
  2. Rely on IAM access controls
  3. Implement in future when convenient

Pros:

Cons:

Recommendation

For production environments with compliance requirements: Option A (New bucket)

Implementation guide available separately when ready to proceed with Object Lock.

Cost-benefit analysis: See COST-COMPARISON.md for detailed scenarios


Maintenance Tasks

Weekly

Monthly

Quarterly


Implementation Guide Version: 2.0
Last Updated: 2025-10-31
Customized for: Account 828879644785 (mnemonica profile)
