corso/website/docs/setup/maintenance.md
ashmrtn 9767e08b39
Add recommendations on when to run maintenance (#4398)
2023-09-28 18:19:14 +00:00


---
description: "Repository maintenance."
---
# Repository maintenance
Repository maintenance helps optimize the Corso repository as backups are created and possibly deleted by the user.
Maintenance can also free up space by removing data no longer referenced by any backups from the repository.
It's safe to run maintenance concurrently with backup, restore, and backup deletion operations. However, it's not safe
to run more than one maintenance operation at a time on the same repository. Corso uses file locks and the idea of a
repository owner to try to detect concurrent maintenance operations.
## Repository owner
The repository owner is set to the user and hostname of the machine that first runs maintenance on the repo.
If the user or hostname of the machine running maintenance can change, use either the `--force` flag or the `--user`
and `--host` flags.
The `--force` flag updates the repository owner and runs maintenance.
The `--user` and `--host` flags make the maintenance operation act as if the given user/hostname owns the repository,
but they don't update the repo owner info.
:::danger
If any of these flags are passed, the user must make sure no concurrent maintenance operations run on the same
repository. Concurrent maintenance operations on a repository may result in data loss.
:::
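The ownership rules above can be modeled with a small sketch. This is an illustration of the documented flag behavior only, not Corso's implementation: the `Repo` class, the `may_run_maintenance` function, and the `user@host` owner format are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Repo:
    # Hypothetical owner record, stored as "user@host".
    # None means maintenance has never run on this repo.
    owner: Optional[str] = None


def may_run_maintenance(
    repo: Repo,
    user: str,
    host: str,
    force: bool = False,
    as_user: Optional[str] = None,
    as_host: Optional[str] = None,
) -> bool:
    """Model of the owner check described above (an assumption, not Corso code)."""
    if repo.owner is None or force:
        # First maintenance run, or --force: record the caller as the owner.
        repo.owner = f"{user}@{host}"
        return True
    # --user/--host: act as the given identity for this run only,
    # leaving the stored owner untouched.
    effective = f"{as_user or user}@{as_host or host}"
    return effective == repo.owner
```

In this model, a second machine is rejected unless it either passes the original owner's identity (the `--user` and `--host` case) or takes over ownership (the `--force` case). Neither path checks whether another maintenance operation is already running, which is why the warning above applies.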
## Maintenance types
Corso allows for two different types of maintenance: `metadata` and `complete`.
Metadata maintenance runs quickly and optimizes indexing data. Complete maintenance takes more time but compacts data in
backups and removes unreferenced data from the repository.
As Corso allows concurrent backups during maintenance, running complete maintenance immediately after deleting a backup
may not result in a reduction of objects in the storage service Corso is backing up to.
Deletion of old objects in the storage service depends on both wall-clock time and running maintenance.
Later maintenance runs on the repository will remove the data.
## Maintenance guidelines
For the best experience, run metadata maintenance every 20–30 backups. Run
complete maintenance every 1–2 weeks, depending on how many backups are
deleted from the repo. More backup deletions mean that complete maintenance
should be run more often so that unneeded blobs in storage get deleted.
Not running maintenance exactly according to the recommendations won't impact
the correctness of the data in the repo, but could result in decreased
performance.
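As a minimal sketch, the guidelines above could be turned into a scheduling check like the one below. The function name, the constants, and the choice of the lower bounds (20 backups, 7 days) are assumptions made for illustration. Corso doesn't provide this helper, and tracking the backup count and the time of the last complete run is left to the caller.

```python
from datetime import datetime, timedelta
from typing import List, Optional

# Thresholds taken from the lower end of the recommended ranges (an assumption).
METADATA_EVERY_N_BACKUPS = 20        # guideline: every 20-30 backups
COMPLETE_EVERY = timedelta(days=7)   # guideline: every 1-2 weeks


def maintenance_due(
    backups_since_metadata: int,
    last_complete: Optional[datetime],
    now: datetime,
) -> List[str]:
    """Return the maintenance types that are due under the guidelines."""
    due = []
    if backups_since_metadata >= METADATA_EVERY_N_BACKUPS:
        due.append("metadata")
    # Treat a repo that has never had complete maintenance as due.
    if last_complete is None or now - last_complete >= COMPLETE_EVERY:
        due.append("complete")
    return due
```

A repo with frequent backup deletions might use a shorter `COMPLETE_EVERY` interval, in line with the advice to run complete maintenance more often in that case.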