Overview
This plan outlines the recovery procedures for our services in the event of downtime, data loss, or platform outages.
1. Services & Infrastructure
1.1 Application
- Platform: Heroku
- Stack: Ruby on Rails
1.2 Database
- Service: Heroku Postgres
- Backups: Automated daily backups via Heroku
1.3 Website
- Platform: Netlify
- Type: Static site (e.g. marketing/docs)
2. Backup & Restore
2.1 Heroku Postgres
- Backups: Daily via Heroku PGBackups + offsite backups to AWS S3
- Retention: Based on Heroku plan tier; S3 backups per configured retention
- Monitoring: Verify backup status monthly
Restore Procedure:
heroku pg:backups:restore '<backup_url>' DATABASE_URL --app <app-name>
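The restore command above can be wrapped so that the available backups are listed before anything is overwritten. This is a minimal sketch rather than an established procedure: `restore_backup` is a hypothetical helper name, and the backup ID (for example `b101`) is whatever `heroku pg:backups` reports for the app.

```shell
# Sketch only: restore_backup is a hypothetical helper, not a Heroku CLI command.
# Usage: restore_backup <app-name> <backup-id-or-url>
restore_backup() {
  local app="$1" backup="$2"
  if [ -z "$app" ] || [ -z "$backup" ]; then
    echo "usage: restore_backup <app-name> <backup-id-or-url>" >&2
    return 2
  fi
  # List existing backups so the ID can be confirmed before anything is overwritten.
  heroku pg:backups --app "$app" || return 1
  # Restore over the primary database; the CLI prompts for confirmation.
  heroku pg:backups:restore "$backup" DATABASE_URL --app "$app"
}
```

Calling `restore_backup <app-name> b101` lists the backups and then starts the restore. A restore replaces the current contents of the target database, so the confirmation prompt is deliberately left in place.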
2.2 App Code (Heroku)
- Source of Truth: GitHub repository
- Recovery: Redeploy latest stable version via Git
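Redeploying via Git might look like the following sketch. Both function names are hypothetical, and the sketch assumes a Git remote named `heroku` and an app that deploys from its `main` branch; neither detail is stated in this plan.

```shell
# Sketch only: both helpers are hypothetical wrappers around real git and Heroku CLI commands.
# Assumes a Git remote named "heroku" and an app that deploys from its "main" branch.

# Push a known-good branch or commit to the Heroku remote.
# Usage: redeploy_app [branch-or-commit]   (defaults to "main")
redeploy_app() {
  local ref="${1:-main}"
  git push heroku "${ref}:refs/heads/main"
}

# Roll back to an earlier release without pushing code.
# Usage: rollback_app <app-name> <release>   (a release version taken from the list)
rollback_app() {
  local app="$1" release="$2"
  if [ -z "$app" ] || [ -z "$release" ]; then
    echo "usage: rollback_app <app-name> <release>" >&2
    return 2
  fi
  # Show recent releases for reference before rolling back.
  heroku releases --app "$app" || return 1
  heroku releases:rollback "$release" --app "$app"
}
```

A plain push of an older commit is rejected as a non-fast-forward update, so `rollback_app` is usually the quicker route when the latest release is the problem. A rollback does not change database state, so any migrations that already ran need separate handling.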
2.3 Website (Netlify)
- Source of Truth: GitHub repository
- Recovery: Redeploy via Netlify UI, CLI, or push to main branch
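For the CLI route, a minimal sketch is shown here. `redeploy_site` is a hypothetical helper, and the sketch assumes the site has already been built locally and that the Netlify CLI is linked to the site (for example via `netlify link`). The publish directory is passed in because this plan does not name one.

```shell
# Sketch only: redeploy_site is a hypothetical helper around the Netlify CLI.
# Assumes the site is already built and the CLI is linked to the site.
# Usage: redeploy_site <publish-dir>
redeploy_site() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "publish directory not found: $dir" >&2
    return 1
  fi
  # Deploy the built files straight to production.
  netlify deploy --prod --dir="$dir"
}
```

Pushing to the main branch remains the simpler path when GitHub and Netlify's build pipeline are both healthy. The CLI deploy is mainly useful when the automated build is the part that is failing.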
3. Failure Scenarios & Responses
3.1 Heroku App Down
- Check: https://status.heroku.com
- Inspect logs:
heroku logs --tail --app <app-name>
- Redeploy if needed
- If caused by a prolonged Heroku outage, communicate status to users and wait for platform recovery (Heroku manages all infrastructure)
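The checks above can be gathered into a first-pass triage sketch. `triage_app` is a hypothetical helper. It only wraps existing Heroku CLI commands, and the choice of 200 log lines is an arbitrary assumption.

```shell
# Sketch only: triage_app is a hypothetical helper, not a Heroku CLI command.
# Usage: triage_app <app-name>
triage_app() {
  local app="$1"
  if [ -z "$app" ]; then
    echo "usage: triage_app <app-name>" >&2
    return 2
  fi
  # Platform-wide incident status, the CLI counterpart of the status page.
  heroku status
  # Dyno states for this app (for example up or crashed).
  heroku ps --app "$app"
  # The most recent 200 log lines, printed once rather than tailed.
  heroku logs --num 200 --app "$app"
}
```

If the dynos show as crashed while the platform itself is healthy, `heroku ps:restart --app <app-name>` is a lighter first step than a full redeploy.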
3.2 Database Failure / Data Loss
- Restore the most recent daily backup using the procedure in 2.1
- Validate on staging if possible
- Communicate with users if downtime is required
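Validating on staging could be sketched as follows. `restore_to_staging` is a hypothetical helper, and the existence of a separate staging app with its own Heroku Postgres database is an assumption; this plan mentions staging but does not name the app.

```shell
# Sketch only: restore_to_staging is a hypothetical helper; the staging app is an assumption.
# Usage: restore_to_staging <prod-app> <staging-app> <backup-id>
restore_to_staging() {
  local prod="$1" staging="$2" backup="$3"
  if [ -z "$prod" ] || [ -z "$staging" ] || [ -z "$backup" ]; then
    echo "usage: restore_to_staging <prod-app> <staging-app> <backup-id>" >&2
    return 2
  fi
  if [ "$prod" = "$staging" ]; then
    echo "refusing to run: staging app must differ from production" >&2
    return 1
  fi
  # <app>::<backup-id> refers to a backup that belongs to another app.
  heroku pg:backups:restore "${prod}::${backup}" DATABASE_URL --app "$staging" || return 1
  # Print database details such as data size and table count for a quick sanity check.
  heroku pg:info --app "$staging"
}
```

Once the staging copy looks correct, the same backup can be restored to production. If the production app has to be offline during that restore, `heroku maintenance:on --app <app-name>` and `heroku maintenance:off --app <app-name>` bracket the window.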
3.3 Netlify Website Down
- Check: https://www.netlifystatus.com
- Redeploy the site from GitHub
- If Netlify is completely down, wait for platform recovery or consider temporary alternative static hosting
4. Communication Plan
Internal
- Notify team via Slack or email
- Use incident channel if needed
External
- Use status page, Twitter, or email list
Incident Template:
We’re currently experiencing downtime due to [brief reason]. The team is actively working on a resolution and will share updates shortly. Thank you for your patience.
5. Regular Checks
- Run heroku pg:backups monthly and confirm that recent backups completed
- Test restoring backup to staging monthly
- Review deploy logs and alerts regularly
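The monthly backup check can be sketched as one function that covers both the Heroku backups and the offsite copies. `monthly_backup_check` is a hypothetical helper, and the S3 location is an argument because this plan does not name a bucket. The sketch also assumes the AWS CLI is installed and has read access to that location.

```shell
# Sketch only: monthly_backup_check is a hypothetical helper.
# The S3 location is supplied by the caller; this plan does not name a bucket.
# Usage: monthly_backup_check <app-name> <s3-uri>
monthly_backup_check() {
  local app="$1" s3_uri="$2"
  if [ -z "$app" ] || [ -z "$s3_uri" ]; then
    echo "usage: monthly_backup_check <app-name> <s3-uri>" >&2
    return 2
  fi
  # Confirm that a backup schedule is still configured.
  heroku pg:backups:schedules --app "$app" || return 1
  # Show recent backups and their status for manual review.
  heroku pg:backups --app "$app" || return 1
  # List the offsite copies so their dates can be compared with the Heroku list.
  aws s3 ls "$s3_uri"
}
```

The function only surfaces the information. Someone still needs to read the output and confirm that the latest backup and the latest offsite copy are recent.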
6. Optional Improvements
- Set up uptime monitoring (Pingdom, UptimeRobot)
- Add fallback static hosting on an alternative provider