# Retries, Timeouts, and Large Files
Long-running jobs fail in real life. A good retry and timeout policy turns random failures into recoverable events.
## Reliability Flags
| Flag | Role |
|---|---|
| `--retries` | Retry the whole sync this many times if it ends with errors |
| `--retries-sleep` | Sleep interval between those high-level retries |
| `--low-level-retries` | Retry individual backend API operations |
| `--timeout` | Drop a connection after this much network inactivity |
| `--contimeout` | Give up on establishing a connection after this long |
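Before tuning any of these, it can help to confirm the defaults your rclone build ships with. `rclone help flags` prints every global flag with its default; the grep pattern below is just one convenient filter:

```bash
# List the current defaults for the retry and timeout flags.
rclone help flags | grep -E -e '--(retries|low-level-retries|timeout|contimeout)'
```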
## Hardened Transfer Example
**resilient-sync.sh**

```bash
# Retry the whole job up to 8 times with a 10s pause between passes,
# absorb transient API errors with low-level retries, and use the two
# timeouts to abandon stalled or unreachable connections.
rclone sync /srv/large-data remote-prod:archive/large-data \
  --retries 8 \
  --retries-sleep 10s \
  --low-level-retries 20 \
  --timeout 5m \
  --contimeout 30s \
  --log-file /var/log/rclone-resilient.log
```
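These flags layer: `--low-level-retries` absorbs brief API errors inside a single transfer, `--retries` restarts the whole pass when a run still ends with failures, and `--timeout`/`--contimeout` keep a dead connection from stalling the job indefinitely. A non-zero exit code after the final retry is the signal to alert on (see the wrapper sketch under Common Pitfalls).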
## Large File Guidance
| Scenario | Practical adjustment |
|---|---|
| WAN with intermittent packet loss | Increase `--retries` and `--timeout` |
| Provider with strict API limits | Lower `--transfers`, add a `--tpslimit` cap (sketch after this table) |
| Multi-GB objects | Run during low-traffic windows |
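For the strict-API-limits row, the concrete knobs are `--transfers` and `--tpslimit`. A minimal sketch reusing the paths from the earlier example; the numbers are illustrative starting points, not provider-specific recommendations:

```bash
# Throttled sync for a provider with aggressive rate limiting.
# --transfers 2 caps concurrent file transfers; --tpslimit 4 caps
# API transactions to roughly 4 per second, with --tpslimit-burst
# allowing short spikes. Tune both to the provider's documented limits.
rclone sync /srv/large-data remote-prod:archive/large-data \
  --transfers 2 \
  --tpslimit 4 \
  --tpslimit-burst 8 \
  --retries 8 \
  --log-file /var/log/rclone-throttled.log
```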
> **Info:** If large transfers repeatedly fail, test with smaller batches first to isolate network problems from provider-side behavior.
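One way to run such a probe is to point rclone at a single representative subdirectory and cap the data moved with `--max-transfer`. A minimal sketch; `subset-dir`, `test-batch`, and the log path are placeholders:

```bash
# Probe run: copy one representative subdirectory, capped at ~1 GiB.
# Note: with the default cutoff mode, rclone reports hitting the
# --max-transfer cap as an error, so judge the probe by the log,
# not only the exit code.
rclone copy /srv/large-data/subset-dir remote-prod:archive/test-batch \
  --max-transfer 1G \
  --retries 3 \
  --timeout 5m \
  --log-level INFO \
  --log-file /var/log/rclone-probe.log
```

If this probe succeeds reliably while the full sync keeps failing, provider throttling on long jobs is the more likely culprit than raw network loss.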
## Common Pitfalls
| Pitfall | Result | Better approach |
|---|---|---|
| Tiny timeout values | Frequent false failures | Size `--timeout` to the link's real quality |
| Infinite retries without alerting | Hidden stuck jobs | Cap retries and monitor the exit code (wrapper sketch below) |
| One giant monolithic job | Hard to restart and debug | Split the tree into smaller per-directory jobs |
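For the alerting pitfall, a thin wrapper is usually enough: let rclone cap its own retries, then act on the final exit code. A minimal sketch, with `notify` standing in for whatever alerting command your environment provides (mail, webhook, pager):

```bash
#!/usr/bin/env bash
# Cap retries inside rclone, then surface the final exit code.
# rclone exits non-zero once --retries is exhausted, so a single
# check after the run is enough. "notify" is a placeholder.
set -u

rclone sync /srv/large-data remote-prod:archive/large-data \
  --retries 8 \
  --retries-sleep 10s \
  --log-file /var/log/rclone-resilient.log
rc=$?

if [ "$rc" -ne 0 ]; then
  notify "rclone sync failed with exit code $rc" || true
fi
exit "$rc"
```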