Troubleshooting

This page is a starting point when something's wrong with a SkyNode install. It covers the subsystems that produce useful diagnostic signal, the recovery flow for unclean restarts, and a handful of recurring failure modes.

If you're hitting something not listed here, the fastest path to a diagnosis is usually:

Check the SkyNode log viewer (dashboard → log viewer) for the most recent errors.
Check the constraint snapshots — most "task didn't run" issues resolve to a failing constraint.
Check the hub publisher status. If the hub connection is degraded, the symptom is stale snapshots on the website.
If the backend won't start at all, jump straight to the recovery flow.

Subsystems to know

A few SkyNode modules are particularly useful when diagnosing:

Module	Useful for
`sync_monitor`	Watches hub connectivity; logs reconnect attempts, hardware-graph staleness, token-refresh failures
`connection_manager`	Owns the HTTPS / WS connection to the hub; surfaces transport-level errors
`token_manager`	Manages access + refresh tokens; logs token-refresh failures
`hub_publisher`	Publishes snapshots and events to the hub; backlog growth here means upstream is unhealthy
`device_manager`	Owns device lifecycle; surfaces driver-init errors
`resource_manager`	Health/thermal/battery/storage state for the host and devices
`crash_reporting`	Captures unhandled exceptions and writes a crash report to disk
`logging_setup`	Configures the log layout; check here if you need a different log destination or verbosity

The dashboard surfaces summarized state from most of these in real time; the underlying log file has the detail.
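When working from the raw log file rather than the log viewer, it can help to narrow the output to the modules listed above. The following is a minimal sketch, not part of SkyNode: it assumes each log line contains the name of the module that emitted it somewhere in the text, which depends on how `logging_setup` lays out lines in your install. The function name `lines_for_modules` is hypothetical.

```python
from typing import Iterable, Iterator


def lines_for_modules(lines: Iterable[str], modules: Iterable[str]) -> Iterator[str]:
    """Yield the log lines that mention any of the given module names.

    Assumption: the module name appears verbatim in each line it emits.
    """
    wanted = tuple(modules)
    for line in lines:
        if any(name in line for name in wanted):
            yield line
```

For a hub connectivity problem, for example, you might open the log file and pass it through `lines_for_modules(f, ["sync_monitor", "connection_manager", "token_manager"])` to see only transport and token activity.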

Recovery flow

When SkyNode comes up after an unclean shutdown — power loss, OS crash, force-quit — it enters a recovery flow instead of going straight to the dashboard. The Tauri shell routes you to the recovery/ page, which:

Inspects the persisted state. Identifies the bound telescope/observatory, the last known device states, any in-progress task, and any unreported task results sitting on disk.
Re-pulls the hardware graph from the hub. Confirms the bound records still exist and that the install token is still valid.
Validates device configurations. Each device's driver is loaded with its persisted config; any driver that fails to initialize is flagged.
Surfaces unresolved work. If a task was running when the crash happened, you decide whether to mark it canceled, retry, or accept and upload any partial results.
Returns to the dashboard. Once recovery succeeds the install resumes normal operation.

If recovery itself fails — usually because of corrupted config or a broken driver dependency — the recovery page shows the error and offers the option to roll back to the previous config snapshot or to re-pair the install from scratch.
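As a mental model, the recovery flow can be pictured as an ordered list of steps where the first hard failure stops the sequence and is reported back. The sketch below illustrates that shape only. `RecoveryResult`, `run_recovery`, and the stop-at-first-failure behavior are assumptions made for illustration, not SkyNode's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class RecoveryResult:
    completed: List[str]                # names of steps that finished
    failed_step: Optional[str] = None   # name of the step that raised, if any
    error: Optional[str] = None         # message to show on the recovery page

    @property
    def ok(self) -> bool:
        return self.failed_step is None


def run_recovery(steps: List[Tuple[str, Callable[[], None]]]) -> RecoveryResult:
    """Run steps in order; stop at the first one that raises (assumed behavior)."""
    completed: List[str] = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            return RecoveryResult(completed, failed_step=name, error=str(exc))
        completed.append(name)
    return RecoveryResult(completed)
```

In this model, a result with `ok` set to false corresponds to the point where the recovery page shows the error and offers a rollback or a re-pair.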

Common failure modes

Hub connection looks fine but tasks don't run

This is almost always a constraint failure. Open dashboard → constraints and look for any constraint in a non-OK state. The most common culprits:

Weather constraint still tripped after weather clears — some sensor drivers latch a failed reading until the sensor reports fresh data. Restart the sensor or wait for the next reading cycle.
Sun altitude constraint — observing window hasn't opened yet (or has closed).
Storage / battery constraint — host disk filled up, or battery on a portable rig dropped below threshold.
MountSunSeparation constraint — the mount happens to be parked near the sun's current position and the constraint is keeping it put.

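You can force-clear a constraint from the GUI when you know it's a false positive (stuck sensor, post-maintenance state). Use this sparingly. The force-clear holds until the next evaluation cycle, so the constraint offers no protection in that window, and it will trip again afterward if the underlying condition persists.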
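The "cleared until the next evaluation cycle" behavior is easy to misread, so here is a small model of it. This is a sketch under assumptions: the `Constraint` class, its `check` callable, and the method names are hypothetical and only illustrate the semantics described above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Constraint:
    name: str
    check: Callable[[], bool]  # returns True when the condition is satisfied
    ok: bool = True

    def evaluate(self) -> bool:
        """Run one evaluation cycle. This overwrites any earlier force-clear."""
        self.ok = self.check()
        return self.ok

    def force_clear(self) -> None:
        """Mark the constraint OK by hand. Lasts only until evaluate() runs again."""
        self.ok = True
```

If `check` keeps returning false, for instance because a sensor is still latched on a bad reading, the constraint flips back to non-OK on the next `evaluate()` call. Fixing the underlying source is the only durable way to clear it.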

Hardware graph looks stale

If the website shows configuration changes you've made but the SkyNode local GUI doesn't — or vice versa — the hardware graph cache in SkyNodeConfig is stale.

Check last_sync_at on the configuration page.
Check the sync monitor's log lines for failed pulls.
Force a refresh from the configuration page; the sync monitor will retry immediately.
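If you want to script the `last_sync_at` check, the comparison itself is simple. The sketch below assumes `last_sync_at` is available as an ISO 8601 string and treats a timestamp without a timezone as UTC. Both are assumptions, as is the helper name `is_stale`, and the age threshold is yours to choose.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(last_sync_at: str, max_age: timedelta,
             now: Optional[datetime] = None) -> bool:
    """Return True if the last sync is older than max_age."""
    synced = datetime.fromisoformat(last_sync_at)
    if synced.tzinfo is None:
        # Assumption: a naive timestamp is in UTC.
        synced = synced.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return now - synced > max_age
```

Note that `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 onward. On older versions, use an explicit offset such as `+00:00`.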

Devices keep dropping offline

This is a driver-level problem. Start in the log viewer and filter to the affected device. Common patterns:

ASCOM driver: the COM proxy is dying, often because the underlying ASCOM driver app crashed or was force-quit. Restarting the ASCOM driver host usually fixes it. If the COM ProgID has changed, update the driver config.
Serial / TCP driver: the connection string is wrong or the device endpoint moved. Reconnect from the device editor; verify cabling.
Vendor SDK driver (MaxIm, Forte, etc.): the SDK process is in a bad state. Restart the SDK before restarting SkyNode.

Tasks complete but don't appear on the hub

This points to file upload back-pressure. Check the hub publisher's backlog count and any upload errors in the log. Common causes:

Token-refresh failure — the install's refresh token expired or was revoked. Re-pair the install.
Network failure — site connectivity is down. Tasks complete locally but results queue until the hub is reachable again.

MQTT subscribers see stale data

If local MQTT consumers (status displays, monitoring tools) report stale state:

Check local_broker.enabled — easy to leave off after a config edit.
Check that the MQTT broker process is up.
Confirm the LWT topic still shows the SkyNode install as online — if it's offline, the SkyNode process died or the broker thinks it did (keep-alive failure).
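A quick way to cover the "is the broker process up" check from another machine is to test whether its TCP port accepts connections. This is a minimal sketch using only the Python standard library. The default host and port are assumptions (1883 is the conventional unencrypted MQTT port, and your broker may be configured differently), and `broker_reachable` is a hypothetical helper name.

```python
import socket


def broker_reachable(host: str = "localhost", port: int = 1883,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the broker port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

This only confirms that something is listening on the port. It does not perform an MQTT handshake or read the LWT topic, so a passing result still needs the LWT check above.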

Backend won't start at all

Drop into the recovery flow. If recovery also fails, the most reliable path to a diagnosis is to:

Look at the most recent crash report from crash_reporting on disk.
Try starting with a known-good config (the recovery page can roll back).
As a last resort, re-pair the install from scratch (the SkyNodeInstallation record on the hub is recreated, and the install_id changes — note this if you have MQTT consumers that filter on installation_uid).
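Finding the most recent crash report by hand is tedious if the directory has accumulated many files. The sketch below picks the newest file by modification time. It assumes crash reports are plain files directly inside one directory. The directory location is not given here, so you pass it in, and `latest_crash_report` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def latest_crash_report(crash_dir: Path) -> Optional[Path]:
    """Return the most recently modified file in crash_dir, or None if empty."""
    files = [p for p in crash_dir.iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

Modification time is used instead of the file name because the naming scheme of the reports is not assumed.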

Where to get help

Log files — local path is per-platform; the log viewer in the GUI is the fastest way to find them.
Crash reports — written by crash_reporting to a directory in the SkyNode data dir.
Telescope log book on the hub — TelescopeLogEntry records capture significant events (manual takeovers, shutdowns, device failures) and are persisted to the hub even when the local logs rotate away.

If you've checked all of the above and you still don't know what's wrong, capture a recent log slice and the relevant crash report and send both to the Skynet ops team.