## Description
Use the Reasons a snapshot was selected to retrieve only the metadata
corresponding to those reasons. This will avoid having multiple versions
of metadata for the same (resource owner, service, category) tuple as
well as pulling in more metadata than required for some backups.
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
## Type of change
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* closes#1829
## Test Plan
<!-- How will this be tested prior to merging.-->
- [x] 💪 Manual
- [ ] ⚡ Unit test
- [ ] 💚 E2E
## Description
Merge directory layouts between the passed in collections and the base snapshot(s). Also add unit tests to ensure the output kopia hierarchy looks as expected. (CLI) user observable behavior is not affected by this PR
This PR does not address:
* selecting subtrees for specific data categories in base snapshots
* not clobbering more recent info if multiple snapshots have subtrees for the same data category (ties into above)
* file deletions for services that can only report item deletions at a global level (e.x. OneDrive file deletions)
Viewing individual commits in PR may make changes easier to review
## Type of change
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* #1740
## Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Fixes two bugs: 1. building the correc tag key
when retrieving the prior backup id from a
manifest. 2. returning ErrNotFound when a
snapshot entry isn't found for restore. This
can occur for metadata if the service has
a prior backup that does not contain the
expected metadata files.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🐛 Bugfix
## Issue(s)
* #1823
## Test Plan
- [x] 💪 Manual
## Description
For each snapshot manifest returned when looking for previous snapshots for a given set of owners cats, return the reason the snapshot was selected. This will allow other code to select the correct metadata and base snapshot subtree(s) when making incremental backups.
An example of when all metadata and directories in a base snapshot may not be needed is
```text
backup create exchange --data email,contacts --users user1 -> B1
// uses B1 as the base
backup create exchange --data email --users user1 -> B2
// uses B1 as the base for contacts and B2 as the base for email
backup create exchange --data email,contacts --users user1 -> B3
```
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
## Type of change
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* closes#1779
## Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Instead of relying on KopiaWrapper to create the OwnersCats for a
backup, have BackupOp create them from the selector and pass them in.
This is necessary as incremental backups will no longer see all the data
in the backup, meaning it cannot accurately create the OwnersCats
because some data categories or owners in the backup may not have had
changes.
OwnersCats are eventually converted to tags on a kopia snapshot and used
to lookup snapshots when trying to find base snapshots for incrementals.
Additional minor changes:
* use pointers instead of values when passing parameters
* set backup details OwnersCats to nil
## Type of change
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* closes#1781
## Test Plan
- [x] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
In a backup operation, begins the operation by
retrieving all backup manifests and metadata
from prior operations.
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #1725
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
This PR uses a separate snapshot to store backup details instead of a Corso model (i.e. a kopia manifest).
Introduces a `StreamStore` that can be leveraged to store larger metadata objects. We can also leverage this
for incrementals or restartable backups going forward.
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #1735
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Pull code specific to making snapshots in kopia into a separate file. Factor out code specific to handling collections in preparation for having to deal with kopia items as well during incremental backups. Apart from code movement and factoring into functions no other changes have been made
Viewing by commit will make changes easier to see
## Type of change
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor
## Issue(s)
* #1740
## Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Expand interfaces for `GraphConnector.DataCollections` and `kopia.Wrapper.BackupCollections` to include parameters that will be needed during incremental backups. This patch only expands the interfaces, it does not add any extra functionality and the passed parameters are currently ignored.
In the future, passing nil for any of the new parameters should result in the current "full backup" behavior that Corso has. Passing values in these parameters should enable delta token-based incremental backups (assuming all the required data is there for the incremental backup)
## Type of change
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor
## Issue(s)
* closes#1700
## Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Begin persisting Exchange delta tokens for data categories that support delta queries. Tokens are stored in a single file as a `map[M365 container ID]token` where both the container ID and token are of type `string`. The file is located in the kopia snapshot that has all the other backup data at the path `tenant-id/{service}Metadata/user/category/delta`. No information about the delta token file is stored in backup details.
## Type of change
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* closes#1685
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Set the mod time of uploaded files to either the mod time of the item (if it has one) or the current time (if it does not have one). Also add tests to check caching in kopia works properly.
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* closes#621
part of:
* #547
merge after:
* #1427
* #1430
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Tell kopia about previous snapshots during the backup
so it can make use of them to skip uploading some data.
Currently only selects the most recent completed snapshot
and most recent incomplete snapshot that contains the
information being backed up. This can be tuned later if
it is not working good enough. However, increasing the
number of previous snapshots passed in may increase memory
usage during backup
## Type of change
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* closes#1404
merge after:
* #1427
## Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Helper functions that allow finding the most recent complete
(and maybe incomplete) snapshots for a given set of tags. The
returned snapshots can later be used to do more efficient kopia
snapshots by allowing kopia to determine it's already uploaded
a file and skip uploading it again.
The number of most recent snapshots to return can be tuned (right
now it returns 1 most recent), but may cause more memory usage
during backups.
Kopia currently has some oddities when getting snapshot
manifests via tags:
* tag values cannot be empty or the comparison returns always
true, selecting all snapshot manifests
* kopia does not currently tag snapshot manifests made during
checkpoints. A patch to upstream kopia is needed to fix this.
All other manifests that match the tags will be returned though
Added code is not currently connected to any backup logic.
Viewing by commit may help as there was a tad of lift'n'shift code
movement for the sake of organization
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* #1404
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Add tags to each kopia snapshot that include all service/category pairs in the snapshot and all resource owners in the snapshot. This allows future snapshots to lookup existing snapshots by those tags so they can be fed into the snapshot function. Feeding previous snapshots into the snapshot function enables kopia to detect previously uploaded files and skip uploading the data again
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* #1404
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Add aws xray observability tooling, plus ways to quickly and
easily run the tools/load tests. Next steps are 1/ running the
daemon container alongside the load tests within github, and
2/ creating client credentials in aws's xray service so that
gathered trace info is propagated from the daemon proxy
to the xray service.
## Type of change
- [x] 🤖 Test
- [x] 💻 CI/Deployment
## Issue(s)
* #902
## Test Plan
- [x] 💪 Manual
- [x] 💚 E2E
## Description
Adds a tag describing the service within the backup model and the backup details model. Adds a filter builder pattern to the store wrapper for filtering results according to certain tags. Finally, adds a filter builder to specify the service type when listing backups.
## Type of change
- [x] 🐛 Bugfix
## Issue(s)
* #1226
## Test Plan
- [x] 💪 Manual
- [x] ⚡ Unit test
## Description
As there's not a 1:1 relationship between backup details items and data stored in kopia as files, version all the data stored in kopia files. This will allow us to change the serialized format of this data in the future independently of the version in the backup details models. This is also less error prone than associating the version in the models with the layout of the kopia file data
See this [comment](https://github.com/alcionai/corso/issues/284#issuecomment-1255346379) for more context on why kopia layout versioning is useful and a sketch of the implemented solution
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* #284
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
<!-- Insert PR description-->
Currently does not throw an error. Open to other suggestions on return values.
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* closes#844
* #1000
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Adds some go runtime diagnostics tracking to
load testing, including some trace regioning.
Unfortunately, I couldn't find any third party trace library that didn't depend on a sidecar server
to sample against the application on. Therefore,
just starting with something basic.
## Type of change
- [x] 🤖 Test
## Issue(s)
* #902
## Test Plan
- [x] 💪 Manual
- [x] 💚 E2E
## Description
Fixes a couple of regressions that got introduced on the restore path via code review changes and
kopia wrapper refactor.
Found while writing the OneDrive restore test. Would have been prevented if we had said test.
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Test Plan
<!-- How will this be tested prior to merging.-->
- [x] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Track the count of bytes read and written in
kopia. For backups, this means the count of
bytes fed into kopia (hashed bytes), and the
amount written after compression and dedupe
(total file bytes). For restore, this is the count of bytes in all files read.
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #894
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Moves the `path` package to the `pkg` package so other code outside of Corso can use it if they need it
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor
## Issue(s)
* closes#908
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Extend the kopia metrics to include the snapshot's TotalFileSize parameter. This value also gets used as the metric for BytesWritten at the end of a
backup.
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #894
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Encode all paths given to kopia with base64URL to ensure no special characters that kopia can't handle end up in there. The encoded paths are not stored in backup details nor are they ever surfaced to the user. This also works around the previous limitation where Corso was unable to properly backup or restore exchange email layouts that had folders containing `/` characters
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
* closes#865
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
The wrapper.go file was not properly grouping items during restore, instead returning a bunch of single item data.Collections. Update this to properly group items by folder. Also update the tests to be more strict about checking for this.
Remove dead code around restoring a single item as it was unused. Lose some visibility into what happens if the code is unable to open the reader for a given item.
Generally moves some logic around in wrapper.go
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* closes#879
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Allows kopia data collection consumers to query stream size.
This is required for OneDrive that needs to set the file size prior to uploading file data.
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor
## Issue(s)
* #668
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Other components may need to rebuild the directory hierarchy of items. As the paths Corso deals with can be hard to properly parse at times, store that information in the Corso backup details. The hierarchy can be rebuilt by following the `ParentRef` fields of items. The item at the root of the hierarchy has an empty `ParentRef` field.
Also hide these folders from end-users. They are not displayed during backup list nor are they eligible as a target for restore
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* closes#862
* closes#861
* closes#818
merge after:
* #869
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Short refs allow the user to type less to select an item to restore. This PR
* Adds a short ref field to the BackupDetails entries
* Populates the short ref field during backup
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #572
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
* Make data.Collection.FullPath return path.Path
* Fixup graph connector code for path struct
* Add Elements call to Path interface
Still should not be used in place of the named functions to get specific
path elements (e.x. ResourceOwner())
* Fixup kopia wrapper path handling
Mostly removing shim code and minor fixup for building the directory
tree.
* All the test fixes
* Helper function to append elements to a path
Kopia wrapper will need to create the complete path of an item by
joining the collection path and the item name. This allows it to do so
without having to drop to a path Builder (not service/category safe) and
go back to a resource path. Right now it does not handle escaping.
* Use path struct while streaming entries
Use new path struct while streaming entries to kopia. Preparation for
FullPath returning a path struct.
* Update tests to use valid path structures
* Use path struct in kopia DataCollection
Does not change the external API of the DataCollection any, just updates
internals in preparation for switching support of
data.Collection.FullPath.
* Expand Path interface slightly
kopia.Wrapper needs some extra functionality from paths, mostly along
the lines of directly manipulating the elements in the path. This gives
access to those functions.
* Use path struct in kopia.Wrapper for restore
Pass path structs to the newly created collections during restore.
* Add tests for new path functionality
swaps the corso go module from github.com/
alcionai/corso to github.com/alcionai/corso/src
to align with the location of the go.mod and
go.sum files inside the repo.
All other changes in the repository update the
package imports to the new module path.
Introduces manual deletion of existing backups. The delete
includes: the modelStore backup, modelStore details, and
the kopia snapshot of the backup itself.
* Update to kopia with required callback
* Support structs for materializing backup details
Kopia will not allow us to pass data to it that should be passed back to
us in the `FinishedFile` callback. To work around this, create a small
thread-safe support struct that handles information about files kopia is
currently processing. Entries are removed from the set when kopia is
done with them and if no error occurred, the item's info will be added
to the BackupDetails.
* Switch to best attempt for iterating through files
Defaulting to "best-attempt" error handling where all data that didn't
result in an error is handed to kopia and then all errors encountered
are returned at the end.
* Test for uploads that have an error
Simple error reading a file. BackupDetails should not contain
information about the file that had the error (needs update to kopia
code and this code to pass). All other files should be present in kopia
and in BackupDetails.
Co-authored-by: Danny <danny@alcion.ai>
* Enable line width linter
Set to 120 which should be long enough to not be annoying but keep
things from getting "too long." Adding to get rid of the subjectiveness
of what is "too long." Tabs count as a single character.
* Remove artificial limit on kopia directories
Originally did not allow a directory to have both child directories and
items. Remove that limit and move logic to execute callbacks on static
items to the iteration function.
* Update tests for new kopia directory structure
* Turn on revive linter, ignore only a few things
* Fix lint errors
Ignore shadowing of 'suite' in tests for now. Also move some constants
that had the same value to tester.
Kopia maintainer mentioned this was a better way to deal with it than
the old version. Mostly superficial changes to logic.
Always safe to flush the write session during an upload. That will
ensure write sessions are closed. Validity of the uploaded data will
be via reachability of the data in the repo.
Drawback is if there's a problem flushing the write session it will
return that error instead of the original error (if there was one)
* store backup operation results in the backup manifest
Adds backup operation metadata like the outcome
statistics and selector definitions to the backup manifest
entry. These additional details will appear when users
call `corso backup list`.
* add e2e backup-restore integration test
Adds an e2e integration test that starts by backing up
data, and ends with restoring it. Also makes various
amendments to other code where necessary to
facilitate this exercise.