adds logical changes required for sharepoint permission handling, both backup and restore.
---
#### Does this PR need a docs update or release note?
- [x] ✅ Yes, it's included
#### Type of change
- [x] 🌻 Feature
#### Issue(s)
* #3135
#### Test Plan
- [x] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
When the user's mailbox is full, we cannot make use of delta apis. This adds initial changes needed to create separate delta and non delta pagers for all of exchange.
*I would suggest looking commit wise when reviewing the PR.*
---
#### Does this PR need a docs update or release note?
- [x] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [ ] ⛔ No
#### Type of change
<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Supportability/Tests
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
#### Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #<issue>
#### Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
Adds a context passdown that allows GC to define the service being queried at a high level, and the rate limiter to utilize different rate limiters based on that info. Malformed or missing limiter config uses the default limiter.
---
#### Does this PR need a docs update or release note?
- [x] 🕐 Yes, but in a later PR
#### Type of change
- [x] 🌻 Feature
#### Issue(s)
* #2951
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
introduces a new type wrapping a nested map so that aggregation of globally excluded items in driveish services don't need to manage the map updates themselves.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🧹 Tech Debt/Cleanup
#### Issue(s)
* #2340
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
a handful of errant cleanups to help the PR for adding incremental support to sharepoint.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🧹 Tech Debt/Cleanup
#### Issue(s)
* #3136
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
#### Does this PR need a docs update or release note?
- [x] 🕐 Yes, but in a later PR
#### Type of change
- [x] 🌻 Feature
#### Issue(s)
* #2825
#### Test Plan
- [x] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
The fetch paralellism checks and logs occur on
every item streamed from GC. This is a bit chatty,
and has been moved upstream in the process for
a more centralized behavior.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🧹 Tech Debt/Cleanup
#### Test Plan
- [x] 💪 Manual
- [x] ⚡ Unit test
In a case where the current manifests and the
fallback contain both distinct and overlapping
categories, the end result contains multiple
bases for the same category. Verifydistinctresons
didn't catch this because it includes the resource owner.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🐛 Bugfix
#### Issue(s)
* #2825
#### Test Plan
- [x] ⚡ Unit test
Adds a lookup step to graph connector to find
an owner's id and name given some identifier.
The identifier, for either sites or users, can be a
well formed id or name.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🌻 Feature
#### Issue(s)
* #2825
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
Adds an interface for GetUserInfo, which gets
corso-specific metadata for a single user. Also
does some refactoring around discovery for
better interface consistency, both as variables
and as access layers.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🌻 Feature
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
Refactors the sharepoint site lookup to use the
id-to-name maps. This has a momentary regression
that will get solved in the next PR: we no longer
match on weburl suffixes, and instead require
a complete match.
Also, migrates the sharepoint lookup code out
of GC and into Discovery.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🌻 Feature
#### Issue(s)
* #2825
* #1995
* #1547
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
Adds a common interface: idNamer, which is used
to pass around tuples of an id and a name for some
resource. Also adds compliance to this iface in
selectors, where a selector's ID and Name are the
DiscreteOwner values.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🌻 Feature
#### Issue(s)
* #2825
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
Replaces the operations graphConnector reference
with an interface. Restore and Backups have
separate, unique interfaces.
<!-- PR description-->
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🌻 Feature
#### Issue(s)
* #2825
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
in order to avoid errors cascading from graph API's
populated 204 responses containing `0/r/n/r/n`,
in particular in response to deletes, this change
hacks in the creation of a unique http client for
each delete call.
This is dangerous to do, since we already know
that generating clients per calls can leak
consumption of system resources such as sockets.
This change presumes that runtime deletes are
minimally, if ever, called during standard backup
and restore. The permission deletion in onedrive
is the stand-out exception that might need some
additional investigation.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🤖 Supportability/Tests
#### Issue(s)
* #2707
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
Updates and corrects input aliasing according to
the following rules (in priority order):
1. if the library name is usable, use it
2. if not, alias to the package name
3. if the package name is weird, alias sensibly
4. in case of collision, alias more distant imports
5. aliases should be consistent throughout
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🧹 Tech Debt/Cleanup
#### Issue(s)
* #1970
#### Test Plan
- [x] ⚡ Unit test
Mostly find/replace on errors.N and errors.W. Also turns all wrapf into wrap, and removes as many errorf calls as possible.
Might follow up with a linter to enforce this change.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🧹 Tech Debt/Cleanup
#### Issue(s)
* #1970
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
Ensures:
* context clues are added to logged messages
* kv-pairs added with `log.With` are added to logged messages
* add log messages for how many details items were sourced from a base snapshot during merging
* ensure log messages for base selection have all the info they should
---
#### Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
#### Type of change
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Supportability/Tests
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
#### Test Plan
- [x] 💪 Manual
- [ ] ⚡ Unit test
- [ ] 💚 E2E
In kopia select the longest prefix's exclude set. Has undeterministic
behavior if there are somehow prefixes of the same length.
In OneDrive, add a prefix that contains the drive ID to all excludes.
This makes incrementals safe even if two items in different drives
somehow have the same ID.
---
#### Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
#### Type of change
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
#### Issue(s)
* closes#2759
#### Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
Renaming the funcs in the fault
package to be more clear about
their purpose and behavior. Largely
just find&replace changes, except
for fault.go and the fault examples.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Does this PR need a docs update or release note?
- [ ] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Creates two files. One for Permissions and another for onedrive/data_collections.
1. Is a file used to hold all the permission specific functions for `OneDrive`. These may need to be leveraged in the future for SharePoint.
2. Previously, the backup function for `OneDrive` was housed within the graph connector data collections file. All other applications have their code housed within their respective directories. Moved the increment graph connector messages outside of the function to match the other applications.
<!-- Insert PR description-->
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🧹 Tech Debt/Cleanup
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Use the check to see if a user has access to OneDrive/Exchange to also see if the user exists. This does not stop the population of GraphConnector users every time an instance is created, but does move us one step closer to that
Manually tested the following cases:
* backup "*"
* backup single existing user without error
* backup multiple existing users without error
* error when attempting to backup user that doesn't exist in the same domain
* error when attempting to backup user that doesn't exist in a different domain
Sample error output (progress disabled):
```text
No backups available
Error: 1 error occurred:
* Failed to run Exchange backup for user foo@other.onmicrosoft.com: doing backup: producing backup data collections: resource owner [foo@other.onmicrosoft.com] not found within tenant
```
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
## Type of change
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1547
## Test Plan
- [x] 💪 Manual
- [ ] ⚡ Unit test
- [ ] 💚 E2E
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
- parentReference for deleted items will not have path.
- GC was not responding with deleted state
- Deleted items are not streamed by kopia
Borrows some changes from https://github.com/alcionai/corso/pull/2503/files as that might be delayed with on merge. Will rebase that if this gets merged fist.
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [x] 🕐 Yes, but in a later PR
- [ ] ⛔ No
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* https://github.com/alcionai/corso/issues/2117
* fixes https://github.com/alcionai/corso/issues/2517
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Does this PR need a docs update or release note?
- [ ] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Finalize the backup workflow for `SharePoint.Pages.`
Populate functions parallelizes
Fix for Incorrect Status during backup
<!-- Insert PR description-->
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* closes #2071<issue>
* closes#2257
* related to #2173
## Test Plan
- [x] ⚡ Unit test
## Description
Refactors error handling in graph_connector.
Also begins some error refactoring in support by
moving StackTraceErrror style funcs into a more
normalized handler in graph/errors.go. And
removes the (Non)Recoverable error wraps which
we weren't using anyway.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Split the collection interface into stuff used during backup and stuff used during restore. Does not change other code beyond fixing types
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
## Type of change
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* closes#1944
## Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Create helper functions to deserialize OneDrive metadata during subsequent backups. Currently deserialized data is not passed to the function that generates Collections nor is metadata passed in even though it's wired through GraphConnector
Additional changes to BackupOp and operations/manifests.go are required to begin passing in metadata
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
## Type of change
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
## Issue(s)
* closes#2122
## Test Plan
- [x] 💪 Manual
- [ ] ⚡ Unit test
- [ ] 💚 E2E
## Description
Return an item exclude list from GraphConnector to BackupOp. BackupOp does not yet pass this to kopia wrapper.
Returned list is set to nil (eventually) by all components so even if this were wired to kopia wrapper it wouldn't change the current behavior of the system
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
## Type of change
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
## Issue(s)
* #2243
merge after:
* #2143
## Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Description
This PR reintroduces the changes from #2266 with a change to *not* reset the transport
when initializing the shared client.
Doing so was removing the retry and other middlewares
and also resulting in throttled requests being masked as success
Also - we now decorate our download traffic with an ISV tag as recommended [here](https://learn.microsoft.com/en-us/sharepoint/dev/general-development/how-to-avoid-getting-throttled-or-blocked-in-sharepoint-online#how-to-decorate-your-http-traffic)
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [x] 🕐 Yes, but in a later PR
- [ ] ⛔ No
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #2266
## Test Plan
<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [ ] ⚡ Unit test
- [x] 💚 E2E
## Description
onedrive currently constructs a new http client
for every file it downloads. This causes the OS
to generate extra sockets, and hang onto them
after the download is complete. Replacing these
one-off clients with a singular, re-used client-
which is the behavior and standard suggested
for golang http clients- minimizes system
resource consumption.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🐛 Bugfix
## Issue(s)
* closes#2262
## Test Plan
- [x] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Adds the api client pkg pattern to the connector/
discovery package. Most code changes are plain
lift-n-shift, with minor clean-ups along the way.
User retrieval is now filtered to only include
member and on-premise accounts.
## Does this PR need a docs update or release note?
- [x] ✅ Yes, it's included
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #2094
## Test Plan
- [x] 💪 Manual
- [x] ⚡ Unit test
## Description
This commit adds logic in discovery and backup to check whether the specified user has
an exchange mailbox that is available/enabled.
If so - the backup is short-circuited to succeed but with "no data"
Going forward - we should be able to move the logic in the OneDrive connector that checks
for a valid drive and license in here.
## Does this PR need a docs update or release note?
- [x] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [ ] ⛔ No
## Type of change
<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
## Issue(s)
<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #2145
## Test Plan
<!-- How will this be tested prior to merging.-->
- [x] 💪 Manual
- [ ] ⚡ Unit test
- [ ] 💚 E2E
## Description
Now that resource owners are identified via
the selector itself, rather than each scope, we
can remove the resource owner data from
scope production and data.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #1617
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
DataCollections validation step was still using the full resourceOwner list in the selector to validate every backup, rather than checking only the DiscreteOwner.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🐛 Bugfix
## Issue(s)
* #1617
## Test Plan
- [x] ⚡ Unit test
## Description
Migrates code away from pulling the resource
owner from each scope, and instead usees the
selector as the canon identifier of the resource
owner.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #1617
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Switches the CLI from calling `DiscreteScopes` to `SplitByResourceOwner`
on the selector itself. This func will take the original selector and produce
a slice of selectors, each one with a DiscreteOwner (the single user involved
in usage of that selector) and all include/filter scopes in that selector re-rooted
to that discrete owner.
Does not yet solve the per-category tuple, since we are still pivoting on the
scopes inside the selector. That comes as a later change.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #1617
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
selector creation now includes a parameter for
a slice of resource owners (users or sites). This
is step one in migrating resource owner lists out
of scopes and into the selector. next step is to
have the selector utilize the primary list instead
of the per-scope list.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #1617
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
moves the DataCollections producer out of
collections and into exchange, along with the
integration tests. The only changes are the
code shuffles, passing down required values,
and the unexporting of funcs that were only
exported for the old design.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🐹 Trivial/Minor
## Issue(s)
* #1727
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Parses the previousPaths metadata collections
along with deltas, and hands paths down to
exchange backup collection producers. Does
not yet scrutinize previous/current path diffs.
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #1726
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Relocates the graphService struct to graph as
the Service struct. Replaces GC's embedded
graphService with a graph.Servicer reference.
## Type of change
- [x] 🐹 Trivial/Minor
## Issue(s)
* #1725
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Configuration and attenion to the graphService
failFast is haphazard and has shared ownership.
This change removes that property from the
service, along with the ErrPolicy func, in favor of passing around a control.Options struct.
## Type of change
- [x] 🐹 Trivial/Minor
## Issue(s)
* #1725
* #302
## Test Plan
- [x] ⚡ Unit test
## Description
When backing up exchange data, parse the
metadata collection of delta urls from prior runs
(if any exist) and pass those tokens along to the
fetch functions for re-use.
## Type of change
- [x] 🌻 Feature
## Issue(s)
* #1725
## Test Plan
- [x] ⚡ Unit test