Need to act right now?
The Action Plan
Confirm the Outage
Before panicking, verify it's a real outage and not your connection:
- Check from a different device or network (mobile data vs WiFi)
- Use a free tool to check from an external location
- Check your monitoring dashboard if you have one
- Ask a colleague to try
→ Check your site from our servers
If it's just you: This is not a real outage. Clear your DNS cache, try incognito mode, and check your VPN.
If it's down for everyone: Continue to the next step, Communicate Immediately.
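The external check can also be scripted. The sketch below is a minimal example using only Python's standard library; the function names and the `outage-check/0.1` User-Agent string are illustrative assumptions, not part of any tool mentioned here. Run it from a machine outside your own network (a cloud VM, for example), otherwise it only repeats the test you already did in your browser.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code (or None for no response) to a rough verdict."""
    if status is None:
        return "unreachable"
    if 200 <= status < 400:
        return "up"
    if 400 <= status < 500:
        return "client error"  # the server answered but rejected the request
    return "server error"  # 5xx, or anything else unexpected


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, headers={"User-Agent": "outage-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server responded, just not with a success code
    except OSError:
        return None  # DNS failure, refused connection, timeout, or TLS error


# Example with a placeholder URL:
# print(classify_status(fetch_status("https://example.com")))
```

A verdict of "unreachable" or "server error" from an outside machine points to a real outage. A "client error" means the server is answering, so the problem is more likely the specific URL or an access rule.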
Communicate Immediately
Don't wait until you know the cause. Acknowledge the issue now.
- Update your status page: "We're investigating reports of [site/service] being unavailable"
- If you don't have a status page, create an emergency one now
- Post on social media if your users expect it
- Notify your team via Slack/Discord
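If your team uses Slack or Discord webhooks, the notification can be prepared in a few lines. This is a sketch under assumptions: the function names and the `incident-notifier/0.1` User-Agent are made up for illustration, and you need your own webhook URL. Slack incoming webhooks read the message from a JSON `text` field, while Discord webhooks read it from `content`.

```python
import json
import urllib.request


def investigating_message(service):
    """Acknowledgement text for the status page or team chat."""
    return f"We're investigating reports of {service} being unavailable"


def webhook_request(webhook_url, message, platform="slack"):
    """Build (but do not send) a POST request for a Slack or Discord webhook."""
    # Slack incoming webhooks read "text"; Discord webhooks read "content".
    field = "text" if platform == "slack" else "content"
    body = json.dumps({field: message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "User-Agent": "incident-notifier/0.1",
        },
        method="POST",
    )


# To send it (webhook_url is your own secret URL):
# urllib.request.urlopen(webhook_request(webhook_url, investigating_message("example.com")), timeout=10)
```

Building the request separately from sending it makes the payload easy to inspect before anything goes out during a stressful incident.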
→ Create emergency status page (no signup, instant)
Template message:
"We're aware that [site/service name] is currently experiencing issues. Our team is investigating and we'll provide updates every 15 minutes. We apologize for the inconvenience."
Quick Diagnostic Checklist
Work through this list to narrow down the cause:
Recent Changes
- ☐ Was there a deployment in the last few hours?
- ☐ Were any configuration changes made?
- ☐ Did you update DNS, SSL, or hosting recently?
Server & Infrastructure
- ☐ Can you SSH into the server?
- ☐ Check server resource usage (CPU, memory, disk)
- ☐ Check error logs for recent entries
- ☐ Is the database responding?
- ☐ Check your hosting provider's status page
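If you can get a shell on the server, a quick resource check might look like the sketch below. It uses only Python's standard library, so it covers disk and CPU load but not memory (there is no portable standard-library call for that). The 90% disk and 1.0 load-per-CPU thresholds are arbitrary assumptions; tune them to your system. `os.getloadavg()` is only available on Unix-like systems.

```python
import os
import shutil


def disk_percent_used(path="/"):
    """Percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def load_per_cpu():
    """1-minute load average divided by the CPU count (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    return one_minute / (os.cpu_count() or 1)


def resource_warnings(disk_percent, load, disk_limit=90.0, load_limit=1.0):
    """Return human-readable warnings for values over the given limits."""
    warnings = []
    if disk_percent >= disk_limit:
        warnings.append(f"disk {disk_percent:.0f}% full")
    if load >= load_limit:
        warnings.append(f"load per CPU {load:.2f}")
    return warnings


# Example: print(resource_warnings(disk_percent_used("/"), load_per_cpu()))
```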
DNS & Network
- ☐ Does the domain resolve correctly? (dig/nslookup)
- ☐ Are DNS records pointing to the right server?
- ☐ Have recent DNS changes finished propagating?
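The DNS checks can be approximated in Python as well. This sketch asks the local resolver for the domain's addresses and compares them with the addresses you expect; the function names and the example IPs (taken from documentation-reserved ranges) are assumptions. Because it uses the local resolver, it can return cached answers, so dig or nslookup against your authoritative nameserver remains the more definitive test.

```python
import socket


def resolve_ips(hostname):
    """Return the set of IP addresses the local resolver gives for hostname."""
    try:
        results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()  # the name did not resolve at all
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {sockaddr[0] for _family, _type, _proto, _canon, sockaddr in results}


def dns_verdict(resolved, expected):
    """Compare resolved addresses with the ones the records should point to."""
    if not resolved:
        return "no answer"
    stray = set(resolved) - set(expected)
    if not stray:
        return "ok"
    return "unexpected: " + ", ".join(sorted(stray))


# Example with placeholder values:
# print(dns_verdict(resolve_ips("example.com"), {"203.0.113.10"}))
```

If your site sits behind a CDN, the resolved addresses belong to the CDN and change over time, so an "unexpected" verdict is not necessarily a misconfiguration.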
SSL
- ☐ Is the SSL certificate valid and not expired?
- ☐ Is the browser showing a certificate error?
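Certificate expiry can be checked without a browser. The sketch below, using Python's standard `ssl` module, connects with verification enabled and reads the certificate's `notAfter` field; the function names are illustrative assumptions.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left given a certificate 'notAfter' string, e.g. 'Jan  5 09:34:43 2018 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return (expires - now) / 86400


def fetch_not_after(hostname, port=443, timeout=10):
    """Connect with certificate verification on and return the cert's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


# Example with a placeholder hostname:
# print(days_until_expiry(fetch_not_after("example.com")))
```

If the certificate has already expired or does not match the hostname, `fetch_not_after` raises `ssl.SSLCertVerificationError` during the handshake, which answers the checklist question on its own.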
Third-Party Services
- ☐ Check status pages of services you depend on (CDN, payment, auth, database)
- ☐ Is the issue with your code or a dependency?
Fix by Cause
| If the cause is... | Try this first |
|---|---|
| Bad deployment | Roll back to the previous version immediately |
| Server out of memory/CPU | Restart the service, then scale up resources |
| Database down | Restart database, check connection limits |
| Disk full | Clear logs and temp files, expand storage |
| SSL expired | Renew certificate (Let's Encrypt: certbot renew) |
| DNS misconfigured | Fix records at your DNS provider (often your registrar), wait for propagation |
| Hosting provider outage | Wait (check their status page) or failover if possible |
| DDoS attack | If you use Cloudflare, enable "Under Attack" mode; contact hosting |
| Third-party service down | Wait, implement fallback, or disable the failing feature |
| Domain expired | Renew immediately at registrar; restoration can take 24-48 hours |
Verify Recovery
Don't assume it's fixed. Verify thoroughly:
- Check the site from multiple locations/devices
- Test login functionality
- Test the feature that was broken (not just the homepage)
- Check your monitoring dashboard—is it showing green?
- Verify related services (API, webhooks, email)
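A single successful page load does not prove recovery, so it can help to check several URLs over a few rounds. The following is a sketch, not a replacement for real monitoring: the function names, the three rounds, the 20-second pause, and the example URLs are all assumptions. The `check` and `sleep` parameters exist so the logic can be exercised without touching the network.

```python
import http.client
import time
import urllib.request


def probe(url, timeout=10):
    """True if url answers with a 2xx/3xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        return False  # covers HTTP errors, timeouts, DNS and TLS failures


def verify_recovery(urls, check=probe, rounds=3, pause_seconds=20, sleep=time.sleep):
    """Return the sorted list of URLs that failed at least once across all rounds."""
    failed = set()
    for round_number in range(rounds):
        for url in urls:
            if not check(url):
                failed.add(url)
        if round_number < rounds - 1:
            sleep(pause_seconds)  # give intermittent failures a chance to show up
    return sorted(failed)


# Example with placeholder URLs:
# print(verify_recovery(["https://example.com/", "https://example.com/login"]))
```

An empty list means every URL passed every round. This only tests that pages respond; logging in and exercising the broken feature still need a manual or scripted end-to-end test.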
Post-Recovery Communication
- Update the status page to "Resolved"
- Post a brief summary of what happened and what you did
- Thank users for their patience
- If appropriate, mention what you'll do to prevent it from recurring
Template message:
"[Site/service] is back online. The issue was caused by [brief cause]. We've [brief fix]. Total downtime was approximately [duration]. We apologize for the disruption and will be implementing [prevention measure] to avoid this in the future."
More on this: Incident communication guide.
Post-Mortem (Within 48 Hours)
While the details are fresh, document what happened, why, and how to prevent it. This is how you actually improve reliability.
If You Don't Have Monitoring Yet
You found out your site was down from a customer. Or you checked manually and discovered it. That's the last time you should find out this way.
Set up monitoring now—before you close this page. It takes 5 minutes:
1. Sign up for a monitoring tool (free tier is fine)
2. Add your main URL
3. Configure email + Slack/Discord alerts
4. Create a status page
5. Done
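To make the alerting step concrete, here is one common pattern for deciding when to alert: wait for several consecutive failed checks before declaring the site down, then alert once more on recovery. This is a sketch of the idea under assumed names and an assumed threshold of 3, not a description of how any particular monitoring product works.

```python
class FailureTracker:
    """Turn a stream of up/down check results into 'down' and 'recovered' events."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # consecutive failures needed before alerting
        self.consecutive_failures = 0
        self.alerting = False

    def record(self, is_up):
        """Record one check result; return 'down', 'recovered', or None."""
        if is_up:
            self.consecutive_failures = 0
            if self.alerting:
                self.alerting = False
                return "recovered"
            return None
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.alerting:
            self.alerting = True
            return "down"  # alert once, not on every failed check
        return None
```

Requiring consecutive failures avoids alerting on a single dropped request, at the cost of a slightly later first alert.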
Full setup guide: see "What is uptime monitoring", or jump to the monitoring checklist.
Frequently Asked Questions
How do I check if my website is down for everyone or just me?
Use an external checking tool that tests your site from a different network and location. If the external tool shows your site is up but you can't access it, the issue is on your end (DNS cache, VPN, ISP). If the external tool also shows it as down, it's a real outage.
Should I tell customers about downtime or wait until it's fixed?
Communicate immediately. Users can already see the problem—silence makes it worse. A quick "We're aware and working on it" is far better than letting customers wonder if you know. Update every 15 minutes during active incidents.
My site is down and I can't access my server. What do I do?
If you can't SSH in, the issue is likely at the infrastructure level. Check your hosting provider's status page for known outages. Try accessing via their control panel. If it's a VPS, try a reboot from the provider's dashboard. If nothing works, contact their support.
How long does it take to recover from downtime?
It varies enormously by cause. A bad deployment can be rolled back in minutes. An expired SSL certificate can be renewed in minutes. A hosting provider outage may take hours (out of your control). An expired domain can take 24-48 hours to restore. Having a response plan and monitoring in place dramatically reduces recovery time for issues you can control.
Be Prepared, Not Panicked
Downtime is stressful enough without scrambling to figure out what to do. Bookmark this page. Set up monitoring so you know before customers do. Create a status page before you need one.
The best time to prepare for downtime is before it happens.
Know About Downtime Before Your Customers Do
PerkyDash monitors your site 24/7 from 12 global regions and alerts you within minutes. Plus instant status pages for when things go wrong.