Today started with two errors that looked related, but were not.
Codex was complaining about an old compacted chat record. GitHub Actions was complaining about the deploy:
ssh-keyscan failed after 5 attempts
The scary part was that Oracle Cloud still showed the VM as Running. Same public IP. Same VNIC. Same route table. Same ingress rules for 22, 80, and 443.
But from the outside, nothing answered:
- SSH timed out.
- HTTP timed out.
- HTTPS timed out.
- GitHub Actions could not even fetch the SSH host key.
The site itself was fine. The Astro build passed locally. DNS still pointed to the right IP. The problem sat below the website: the cloud dashboard reported the VM as alive, but nothing could reach it from the internet.
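One way to confirm this kind of outage from a laptop is to probe the three ports directly instead of trusting the console. This is a minimal sketch, assuming bash and the coreutils `timeout` command are installed; `probe_port` is a hypothetical helper name, not part of the actual deploy setup.

```shell
# Probe one TCP port from the outside and report whether it accepts connections.
# Usage: probe_port HOST PORT [TIMEOUT_SECONDS]
probe_port() {
  local host="$1" port="$2" secs="${3:-5}"
  # bash treats /dev/tcp/HOST/PORT as "open a TCP connection";
  # `timeout` bounds the wait so a silently dropped packet cannot hang the check.
  if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "open"
  else
    echo "unreachable"
  fi
}
```

For example, `for p in 22 80 443; do echo "$p $(probe_port zinchuk.online "$p")"; done` prints one line per port. If all three come back unreachable while the console says Running, the problem is probably not in the web server.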
The fix was refreshingly boring:
- Reboot the Oracle instance.
- Wait a minute.
- Confirm the site was back.
- Re-run the deploy workflow.
After the reboot, zinchuk.online responded again and the GitHub deploy passed.
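The "wait a minute, then confirm" step can be scripted so the deploy is only re-run once the site really answers. This is a rough sketch, not what was actually run that day; `wait_until_ok` is a hypothetical helper, and the attempt count and delay are arbitrary.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until_ok() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Run the remaining arguments as the check command.
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt is still coming.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

Something like `wait_until_ok 12 10 curl -fsS --max-time 10 -o /dev/null https://zinchuk.online/` would poll for roughly two minutes and return success as soon as the site responds.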
The useful follow-up was adding a separate daily health check workflow. It does one simple thing:
curl -fsS https://zinchuk.online/ > /dev/null
If that fails, GitHub marks the workflow red and sends a direct email alert through SMTP.
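A slightly more defensive version of that check might add a timeout and a couple of retries, so one slow response does not trigger an alert. This is only a sketch under assumptions: `check_site` is a hypothetical wrapper, the retry numbers are made up, and the real workflow may well use the one-liner as is.

```shell
# Fetch a URL quietly; return non-zero on timeouts, connection errors, or HTTP errors.
# Usage: check_site URL
check_site() {
  local url="$1"
  # -f: fail on HTTP errors, -sS: quiet but still print errors,
  # --max-time: cap each attempt, --retry: retry transient failures.
  if curl -fsS --max-time 20 --retry 2 --retry-delay 5 -o /dev/null "$url"; then
    return 0
  fi
  echo "health check failed for $url" >&2
  return 1
}
```

In a workflow step, `check_site https://zinchuk.online/ || exit 1` would turn the run red the same way the plain curl command does.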
The lesson is small but worth keeping: Running is not the same as reachable. For tiny personal infrastructure, a boring external check is more useful than trusting the cloud console alone.