25 Commits

Author SHA1 Message Date
Keepers
0a947386e6
rename data pkg Stream to Item (#3966)
A Stream is a continuous transmission of data.
An item is a single structure.  Crossing the two
definitions generates confusion.

Primarily code movement/renaming.  Though there
is also some reduction/replacement of structs
where we'd made a variety of testable Item implementations
instead of re-using the generic mock.

---

#### Does this PR need a docs update or release note?

- [x]  No

#### Type of change

- [x] 🧹 Tech Debt/Cleanup

#### Test Plan

- [x]  Unit test
- [x] 💚 E2E
2023-08-11 19:33:33 +00:00
neha_gupta
09f3196ee0
Onedrive restore- handle unknown path error (#3643)
<!-- PR description-->
 In case of Onedrive Restore give a proper message if the path provided do not exist or restore is done on empty backup.

---

#### Does this PR need a docs update or release note?

- [x]  No

#### Type of change

<!--- Please check the type of change your PR introduces: --->
- [x] 🐛 Bugfix

#### Issue(s)

<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* https://github.com/alcionai/corso/issues/3579

#### Test Plan

<!-- How will this be tested prior to merging.-->
- [x] 💪 Manual
2023-07-04 11:37:16 +00:00
Keepers
ce72acbcc1
auto-log recoverable errors with stack (#3598)
automatically log when we add a recoverable error or a skipped item to fault.  This log will include a stack trace of the call from the location of the logged recoverable.  Clues does not have a method for pulling a stack trace out of an error yet; that can be added at a future date.

---

#### Does this PR need a docs update or release note?

- [x]  No

#### Type of change

- [x] 🤖 Supportability/Tests

#### Test Plan

- [x]  Unit test
- [x] 💚 E2E
2023-06-13 22:18:18 +00:00
Abin Simon
7d1d9295eb
Read items within a collection from kopia on demand (#3506)
Don't read all the files that we need to restore, just the collections in the beginning. The rest of them are read using Fetch on demand when they will be restored.

I haven't tested for any changes in memory consumption, but this brings down time taken for "Enumerating items in repository" to <5s even for huge number of files.

<!-- PR description-->

---

#### Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [x] 🕐 Yes, but in a later PR
- [ ]  No

#### Type of change

<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Supportability/Tests
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

#### Issue(s)

<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* https://github.com/alcionai/corso/issues/3011
* closes https://github.com/alcionai/corso/issues/3440
* closes https://github.com/alcionai/corso/issues/3537

#### Test Plan

<!-- How will this be tested prior to merging.-->
- [x] 💪 Manual
- [ ]  Unit test
- [x] 💚 E2E
2023-06-05 11:02:39 +00:00
Keepers
cbbc8d2f6c
drive-focused api refactoring prep (#3471)
setting up the drive api calls into files/spaces that will cascade naturally to the addition of an api client for users and sites.  contains some partial implementation of these clients, which will get completed in the next pr.

---

#### Does this PR need a docs update or release note?

- [x]  No

#### Type of change

- [ ] 🧹 Tech Debt/Cleanup

#### Issue(s)

* #1996

#### Test Plan

- [x]  Unit test
- [x] 💚 E2E
2023-06-02 20:16:47 +00:00
ashmrtn
bcde15689f
Smarter item load ordering during ProduceRestoreCollections (#3294)
Optimize loading items for restore a little bit by first grouping items
by directory, loading the directory once, and then loading all items
from the loaded directory. This brings item loading on my local machine
(communicating with remote S3) down to ~1.5min/1k items

Future improvements could lazily load items as they're returned in the
Items() call of each collection but doing so would change the semantics
of ProduceRestoreCollections() (specifically item not found errors would
be returned during Items() instead of during
ProduceRestoreCollections())

The kopia data collection has also been updated to hold onto a
reference to the folder it corresponds to. This folder reference is
used to service Fetch() calls

---

#### Does this PR need a docs update or release note?

- [x]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [ ]  No

#### Type of change

- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Supportability/Tests
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

#### Issue(s)

* #3293

#### Test Plan

- [ ] 💪 Manual
- [x]  Unit test
- [x] 💚 E2E
2023-05-04 21:02:34 +00:00
Keepers
9e783efe3a
fault package funcs rename (#2583)
## Description

Renaming the funcs in the fault
package to be more clear about
their purpose and behavior.  Largely
just find&replace changes, except
for fault.go and the fault examples.

## Does this PR need a docs update or release note?

- [x]  No 

## Type of change

- [x] 🧹 Tech Debt/Cleanup

## Issue(s)

* #1970

## Test Plan

- [x]  Unit test
- [x] 💚 E2E
2023-02-25 03:29:02 +00:00
Keepers
1459e1406c
1970 11 tracker fails redo (#2624)
#### Type of change

- [x] 🐛 Bugfix

#### Issue(s)

* #1970

#### Test Plan

- [x]  Unit test
- [x] 💚 E2E
2023-02-24 15:57:50 +00:00
Keepers
28ad304bb7
update items to accept ctx, fault.Errors (#2493)
## Description

In order for corso to track recoverable errors,
we need to pass a fault.Errors struct into the
items stream.  As long as we're doing that, we
might as well pass along the available ctx as well.

## Does this PR need a docs update or release note?

- [x]  No 

## Type of change

- [x] 🧹 Tech Debt/Cleanup

## Issue(s)

* #1970

## Test Plan

- [x]  Unit test
- [x] 💚 E2E
2023-02-16 19:09:20 +00:00
ashmrtn
129d6b0b0c
Add Fetch() to RestoreCollection (#2434)
## Description

Add a function to fetch a file from the collection synchronously. This will help avoid data dependencies on the restore path created by splitting item information across multiple kopia files

Fetch function is currently unoptimized, though deeper analysis of memory footprint should be done before changing

Viewing by commit will help reduce chaff from updating tests to comply with the new interface

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

## Issue(s)

* #1535

## Test Plan

- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2023-02-08 22:05:38 +00:00
ashmrtn
373f0458a7
Split collection interface (#2415)
## Description

Split the collection interface into stuff used during backup and stuff used during restore. Does not change other code beyond fixing types

## Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No 

## Type of change

- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🧹 Tech Debt/Cleanup

## Issue(s)

* closes #1944

## Test Plan

- [ ] 💪 Manual
- [x]  Unit test
- [x] 💚 E2E
2023-02-07 22:15:48 +00:00
Keepers
aacb013b60
add doNotMerge func to collections (#1919)
## Description

Adds a new func to the data.Collection iface:
DoNotMergeItems() tells kopia to skip the
retention of items from prior snapshots when
generating the new snapshot for this collection.
A primary use case for this flag is when a delta
token expires, preventing an incremental lookup
and forcing gc to re-discover all items in the
container.

## Does this PR need a docs update or release note?

- [x]  No 

## Type of change

- [x] 🌻 Feature

## Issue(s)

* #1914

## Test Plan

- [x]  Unit test
2022-12-22 21:27:11 +00:00
ashmrtn
09c48c1ec9
Expand Stream and Collection interfaces for delta token-based incrementals (#1710)
## Description

Add functions to Collection and Stream interfaces that allow for getting
more information about the difference between the previous backup and
the currently in-progress one. These will allow delta token-based
incremental backups to determine how the state has evolved.

Current code does not use these functions and return values for them
are "default" values that should result in full backups even if
KopiaWrapper is updated to start checking the values and GraphConnector
still pulls all items

These functions are not used during restore and can return "default"
values

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [x] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🐹 Trivial/Minor

## Issue(s)

* #1700 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2022-12-07 22:20:59 +00:00
ashmrtn
ff2db0c553
Export path package to other codebases (#912)
## Description

Moves the `path` package to the `pkg` package so other code outside of Corso can use it if they need it

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor

## Issue(s)

* closes #908 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x]  Unit test
- [x] 💚 E2E
2022-09-26 18:34:19 +00:00
Vaibhav Kamra
ac638d3162
Provide interface to get stream size (#876)
## Description

Allows kopia data collection consumers to query stream size.

This is required for OneDrive that needs to set the file size prior to uploading file data.

## Type of change

<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [ ] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [x] 🐹 Trivial/Minor

## Issue(s)

* #668 

## Test Plan

<!-- How will this be tested prior to merging.-->
- [ ] 💪 Manual
- [x]  Unit test
- [ ] 💚 E2E
2022-09-16 18:15:43 +00:00
ashmrtn
5ae8259fd5
Change return value of data.Collection.FullPath to path.Path (#841)
* Make data.Collection.FullPath return path.Path

* Fixup graph connector code for path struct

* Add Elements call to Path interface

Still should not be used in place of the named functions to get specific
path elements (e.x. ResourceOwner())

* Fixup kopia wrapper path handling

Mostly removing shim code and minor fixup for building the directory
tree.

* All the test fixes
2022-09-14 10:01:11 -07:00
ashmrtn
226489c58f
use path struct in kopia DataCollection (#827)
* Use path struct in kopia DataCollection

Does not change the external API of the DataCollection any, just updates
internals in preparation for switching support of
data.Collection.FullPath.

* Expand Path interface slightly

kopia.Wrapper needs some extra functionality from paths, mostly along
the lines of directly manipulating the elements in the path. This gives
access to those functions.

* Use path struct in kopia.Wrapper for restore

Pass path structs to the newly created collections during restore.

* Add tests for new path functionality
2022-09-13 14:18:00 -07:00
Keepers
6cdc691c6f
correct the corso module (#749)
swaps the corso go module from github.com/
alcionai/corso to github.com/alcionai/corso/src
to align with the location of the go.mod and
go.sum files inside the repo.

All other changes in the repository update the
package imports to the new module path.
2022-09-07 15:50:54 -06:00
ashmrtn
c4e9046870
Fix wsl lint errors in kopia package (#650) 2022-08-26 08:13:25 -07:00
ashmrtn
da66cc4c2f
Enable more rigorous version of gofmt linting (#488)
* Enable a stricter linter

* Fix new lint errors
2022-08-05 13:33:20 -07:00
Danny
34a7a1a80c
Data Collection --> Collection refactor (#415)
DataCollection changed to Collection in the repository. All associated imports changed to reflect the change.
2022-07-27 12:04:31 -04:00
Danny
3e792e69eb
Move DataCollection to data package (#414)
DataCollection moved to `/src/internal/data` directory and imports changed to recognize the move.
2022-07-27 08:04:35 -04:00
ashmrtn
1143a33ce6
Allow multiple items in DataCollection from kopia (#296)
Use a slice to back the data instead of adding directly to the channel
for two reasons (this may change in the future though):
  * kopia loads all data about a directory at the same time
  * consumers of the DataCollection may not pull items from the channel
    at a fast rate, which could block adding to the channel. This could
    lead to delays in discovering other directories to traverse in
    multi-threaded scenarios
2022-07-07 14:54:46 -07:00
ashmrtn
1c6f0994e4
Switch DataCollection to return a channel (#244)
* Change DataCollection to return channel directly

Precursor to restoring multiple items from kopia. Allows one to keep a
DataCollection open until all items are processed without blocking
consumers of the DataCollection (they can use a select-block if needed).

* Update tests for new DataCollection interface

* Handle context cancellation with DataCollection
2022-06-28 12:48:24 -07:00
ashmrtn
622e8eab95
kopia code to translate single item to DataCollection (#216)
* DataCollection and DataStream structs for kopia

* Helper function to get a single item of data

Returns a DataCollection with a single item.

* Test for backing up and restoring a single item
2022-06-22 09:50:17 -07:00