30 Commits

Author SHA1 Message Date
ashmrtn
387f8e8cd7
Deserialize OneDrive metadata during backup (#2263)
## Description

Create helper functions to deserialize OneDrive metadata during subsequent backups. Currently deserialized data is not passed to the function that generates Collections nor is metadata passed in even though it's wired through GraphConnector

Additional changes to BackupOp and operations/manifests.go are required to begin passing in metadata

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

## Issue(s)

* closes #2122

## Test Plan

- [x] 💪 Manual
- [ ]  Unit test
- [ ] 💚 E2E
2023-01-31 22:48:30 +00:00
ashmrtn
2458508969
Wire up GraphConnector <-> BackupOp item exclude list (#2245)
## Description

Return an item exclude list from GraphConnector to BackupOp. BackupOp does not yet pass this to kopia wrapper.

Returned list is set to nil (eventually) by all components so even if this were wired to kopia wrapper it wouldn't change the current behavior of the system

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

## Issue(s)

* #2243

merge after:
* #2143 

## Test Plan

- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2023-01-30 19:44:06 +00:00
ashmrtn
57accfc9c4
Use mocks to test getting drive info from Graph API (#2306)
## Description

Make a separate interface for fetching drive information from Graph API. Interface allows for better testing via mocks

Merges SharePoint and OneDrive code for getting drives.

Also fixes potential bug where not all drives would be fetched. This could have occurred because the previous implementation for both SharePoint and OneDrive were not checking for paginated results

Viewing by commit is recommended

## Does this PR need a docs update or release note?

- [x]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [ ]  No 

## Type of change

- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [x] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🧹 Tech Debt/Cleanup

## Issue(s)

* #2264

## Test Plan

- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2023-01-27 22:48:30 +00:00
Keepers
d529d145cb
scrub pii from observe logs (#2285)
## Description

This is a quick hack to satisfy a primary case of PII scrubbing.  We expect to revisit it in the future.

## Does this PR need a docs update or release note?

- [x]  No 

## Type of change

- [x] 🌻 Feature

## Issue(s)

* #2284

## Test Plan

- [x] 💪 Manual
- [x]  Unit test
2023-01-27 00:31:57 +00:00
Vaibhav Kamra
bdb7f2b109
Re-use client for OneDrive item downloads (#2276)
## Description

This PR reintroduces the changes from #2266 with a change to *not* reset the transport
when initializing the shared client. 

Doing so was removing the retry and other middlewares
and also resulting in throttled requests being masked as success

Also - we now decorate our download traffic with an ISV tag as recommended [here](https://learn.microsoft.com/en-us/sharepoint/dev/general-development/how-to-avoid-getting-throttled-or-blocked-in-sharepoint-online#how-to-decorate-your-http-traffic)

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [x] 🕐 Yes, but in a later PR
- [ ]  No 

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

## Issue(s)

<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #2266 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [ ]  Unit test
- [x] 💚 E2E
2023-01-26 15:05:12 +00:00
Vaibhav Kamra
e9354a8429
Revert "add itemClient for re-usable od item downloads" (#2273)
Reverts alcionai/corso#2266. Am seeing issues on data restore that need to be debugged.
2023-01-26 07:52:56 +00:00
Keepers
bda8a5c60c
add itemClient for re-usable od item downloads (#2266)
## Description

onedrive currently constructs a new http client
for every file it downloads.  This causes the OS
to generate extra sockets, and hang onto them
after the download is complete.  Replacing these
one-off clients with a singular, re-used client-
which is the behavior and standard suggested
for golang http clients- minimizes system
resource consumption.

## Does this PR need a docs update or release note?

- [x]  No 

## Type of change

- [x] 🐛 Bugfix

## Issue(s)

* closes #2262

## Test Plan

- [x] 💪 Manual
- [x]  Unit test
- [x] 💚 E2E
2023-01-26 04:27:52 +00:00
ashmrtn
a4f2f0a437
Add deleted files to OneDrive excluded file list (#2250)
## Description

Begin populating a global exclude set with files deleted from OneDrive. The set contains the IDs of files that have been deleted.

This is actually safe to return from GraphConnector now (though it's not returned) as it uses item IDs instead of item names.

Future PRs will need to address handling of (potentially) moved files

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

## Issue(s)

* #2242 

## Test Plan

- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2023-01-25 16:57:30 +00:00
ashmrtn
bb7f54b049
Update OneDrive metadata with folder moves (#2193)
## Description

Track folder moves and update the persisted metadata accordingly. Moves need to take into account subtree changes because OneDrive will not notify us of subfolders moving if nothing else was updated on the subfolder.

Handle updates by making a copy of the folder map for ease of updating and later on ease of finding out which collection paths have changed

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

## Issue(s)

* #2120

## Test Plan

- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2023-01-20 19:18:59 +00:00
ashmrtn
6651305d4f
Update OneDrive metadata with folder deletions (#2192)
## Description

Track folder and package deletions in OneDrive metadata for incremental backups. Add test for deletions

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

## Issue(s)

* #2120

## Test Plan

- [x] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2023-01-20 17:02:23 +00:00
Danny
936789b6b4
GC: Backup: Sharepoint libraries to omit site drive. (#2098)
## Description
Branch treats Drive `Site Pages` as a restricted directory. All  `pages` within the directory, will be backed up via the Site Pages API.  This adds an additional call to per collection to obtain the parent `driveName`. 
<!-- Insert PR description-->
TODO: Create a Pipeline that distinguishes between SitePages, Libraries, and List for SharePoint
## Does this PR need a docs update or release note?

- [x] 🕐 Yes, but in a later PR
The documentation should state the assumption is that only `.aspx` files are to be found in the s
## Type of change

<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature


## Issue(s)

<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* closes #2072<issue>

## Test Plan
- [x]  Unit test
2023-01-19 03:19:15 +00:00
ashmrtn
45874abf7e
Begin persisting OneDrive/SharePoint library metdata (#2144)
## Description

Start persisting the folder path maps and delta URLs for backed up OneDrive/SharePoint drives. Delta URLs are saved in a map[drive ID]deltaURL while folder IDs are in a map[driveID]map[folder ID]folder path

Needs another patch to properly save the path for folders that match the selector, currently the selector comparison is only on the parent of an item

Later PRs can get the new folder map by taking the map from the previous backup and making changes to it when folder deletions/moves are encountered

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

## Issue(s)

* #2120 

## Test Plan

- [x] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2023-01-17 20:50:24 +00:00
Keepers
5980c4dca0
log observed messages (#2073)
## Description

add logging to the observe package, assume that
every instance where a message is observed, it
also gets logged.

Merger may want to wait until logging to a file is the standard behavior, else the terminal might get messy/buggy.

## Does this PR need a docs update or release note?

- [x]  No 

## Type of change

- [x] 🧹 Tech Debt/Cleanup

## Issue(s)

* closes #2061

## Test Plan

- [x] 💪 Manual
2023-01-13 19:34:01 +00:00
ashmrtn
df086bd6f4
Move OneDrive path helper func to path package (#1880)
## Description

Later code in details package will need access to this function so it can update item details. However, importing the onedrive package in the details package leads to an import cycle. Moving it to the path package breaks the cycle while still allowing the code to be accessed.

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor

## Issue(s)

* #1800 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2022-12-21 23:32:47 +00:00
Abin Simon
a4147f5498
Speedup OneDrive backup by pulling urls in delta API (#1842)
## Description

This brings in another huge improvement in the backup speed dropping from ~15m to ~1.5m for an account with ~5000 files. This one also drastically reduces the number of requests we have to make for the same account from 5505 to just ~55. This also means that we don't get throttled anymore and we can easily run multiple backup jobs in parallel before we hit the 1024 limit.

~The code works as of now, but I have to double check the numbers as well as fix an issue with us opening too many files and causing program to crash with 'too many open files' when we bump up the numbers. With the current numbers, it works, but I wanna double check and optimize them. Plus some code cleanup.~

## Does this PR need a docs update or release note?

- [x]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [ ]  No 

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor

## Issue(s)

<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #<issue>

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x]  Unit test
- [x] 💚 E2E
2022-12-20 09:34:28 +00:00
Keepers
e2775aeb95
Refactor service failfast (#1789)
## Description

Configuration and attenion to the graphService
failFast is haphazard and has shared ownership.
This change removes that property from the
service, along with the ErrPolicy func, in favor of passing around a control.Options struct.

## Type of change

- [x] 🐹 Trivial/Minor

## Issue(s)

* #1725
* #302

## Test Plan

- [x]  Unit test
2022-12-13 23:06:27 +00:00
Keepers
fc5f42545f
rename graph.Service to Servicer (#1787)
## Description

`graph.Service -> graph.Servicer`,  no other changes.

More compliant with golang naming standards,
and will allow us to eventually migrate the
Service struct out of connector and into graph.

## Type of change

- [x] 🐹 Trivial/Minor

## Issue(s)

* #1725
2022-12-13 17:37:03 +00:00
Keepers
4c976298d4
populate sharepoint itemInfo (#1687)
## Description

Currently, all drive backup and restore actions
populatet a details.OneDriveInfo struct.  This
change branches that struct between one-
drive and sharepoint info, depending on the
current source.

## Type of change

- [x] 🌻 Feature

## Issue(s)

* #1616

## Test Plan

- [x]  Unit test
2022-12-08 17:26:56 +00:00
Keepers
5b0549fb32
migrate sharepoint cat from file to library (#1555)
## Description

Adds a LibraryCategory in paths, and changes
the current sharepoint code to use that cat
instead of the onedrive files cat.  This is mostly to keep sharepoint library drives separate from
onedrive and sharepoint site list items.

## Type of change

- [x] 🌻 Feature

## Issue(s)

* #1506

## Test Plan

- [x]  Unit test
2022-11-22 19:01:05 +00:00
Keepers
10acf0ccf6
generic drive retrieval for sharepoint (#1536)
## Description

Adapts the graph onedrive library to handle
access to drive data across both onedrive and
sharepoint services.

## Type of change

- [x] 🌻 Feature

## Issue(s)

* #1506

## Test Plan

- [x] 💪 Manual
- [x]  Unit test
- [x] 💚 E2E
2022-11-22 18:05:47 +00:00
Vaibhav Kamra
562c468a91
Progress bar improvements (#1302)
## Description

This adds the following functionality to the CLI progress output:
- Add a message e.g. to describe a completed stage
- Add a message describing a stage in progress (using a spinner) and trigger completion
- Add a progress tracker (as a counter) for a specified number of items (when the # items is known up front
- Improves `ItemProgress` by aligning the columns a bit better and also removing the progress bar and replacing
  with a byte counter

Finally - the above are used in the `backup` and `backup create onedrive` flows. Follow up PRs will wire these up for
exchange and the restore flows also.
 
## Type of change

<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor

## Issue(s)

<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #1278 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2022-10-24 19:52:48 +00:00
Vaibhav Kamra
1cbb34e403
Scope OneDrive backup - for testing (#1109)
## Description

<!-- Insert PR description-->

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor

## Issue(s)

* closes #1108 
* closes #1137 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [x] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2022-10-13 17:24:42 +00:00
Vaibhav Kamra
03bb63f52d
Use selectors in OneDrive CLI (#996)
## Description

Adds the following selectors to OneDrive details/restore :
- `file-name`, `folder`, `file-created-after`, `file-created-before`, `file-modified-after`, `file-modified-before`

Also includes a change where we remove the `drive/<driveID>/root:` prefix from parent path entries in details. This
is to improve readability. We will add drive back as a separate item in details if needed later.

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor

## Issue(s)

* #627 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2022-10-03 07:23:30 +00:00
ashmrtn
ff2db0c553
Export path package to other codebases (#912)
## Description

Moves the `path` package to the `pkg` package so other code outside of Corso can use it if they need it

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor

## Issue(s)

* closes #908 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x]  Unit test
- [x] 💚 E2E
2022-09-26 18:34:19 +00:00
ashmrtn
7850272a5e
Apply wsl linter to onedrive package (#911)
## Description

Finish enabling wsl linter on Corso repo by removing the exception for onedrive package

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor

## Issue(s)

* closes #481

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [ ]  Unit test
- [ ] 💚 E2E
2022-09-20 20:02:35 +00:00
ashmrtn
0e6ae32fc3
Add OneDrive path base structs (#833)
* Constants for OneDrive stuff

* Tests and constructor for OneDrive paths

* Populate onedrive path struct in data collection (#835)

* Helper function to make path structs for onedrive

* Use path struct in onedrive data collection

Does not change the external API at all, just the internals of how
FullPath functions and what is stored for the path.

* Wire up making data collections with path struct

Requires addition of tenant as input to Collections().

* Fixup onedrive Collections tests

* Wire up call to onedrive.NewCollections()

Just requires adding the tenant ID to the call.
2022-09-13 14:48:03 -07:00
Vaibhav Kamra
1246f51b22
OneDrive Backup Operation (#738)
## Description

Wires up the OneDrive collection logic to `operation.Backup`

Includes an integration test that runs against the test domain

Two bug fixes:
- Skip the "root" item that is returned by the delta query
- Fix incorrect usage of the `filepath.SplitList` function which does
  not split a path into components. Instead use `strings.Split`. This
  is ok because the paths returned here are not OS specific.
  Regardless - this logic will be refactored when we use the `path` pkg.

## Type of change

Please check the type of change your PR introduces:
- [x] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 🐹 Trivial/Minor

## Issue(s)
#548 

## Test Plan

<!-- How will this be tested prior to merging.-->

- [ ] 💪 Manual
- [x]  Unit test
- [x] 💚 E2E
2022-09-07 23:47:12 +00:00
Keepers
6cdc691c6f
correct the corso module (#749)
swaps the corso go module from github.com/
alcionai/corso to github.com/alcionai/corso/src
to align with the location of the go.mod and
go.sum files inside the repo.

All other changes in the repository update the
package imports to the new module path.
2022-09-07 15:50:54 -06:00
Vaibhav Kamra
ed1a4bebce
Use a WaitGroup for AwaitStatus (#716)
## Description

This builds on the `MergeStatus` proposal and `WaitGroup` discussion proposed in #494 
Required for OneDrive where we are operating on multiple collections

## Type of change

Please check the type of change your PR introduces:
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 🐹 Trivial/Minor

## Issue(s)
#494 

## Test Plan

<!-- How will this be tested prior to merging.-->

- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2022-09-02 17:17:17 +00:00
Vaibhav Kamra
8634b1b1ac
Helpers to create OneDrive Collections (#692)
## Description

This adds the logic that materializes collections for a specified user

The collection paths are currently derived from OneDrive metadata but a future
PR will introduce `paths` pkg support to create these.

## Type of change

Please check the type of change your PR introduces:
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 🐹 Trivial/Minor

## Issue(s)

#388 

## Test Plan

- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2022-08-31 18:39:18 +00:00