HTTP/2 stream error failures can't be handled
through client middleware, so those requests
currently have no retry handler. This change
adds retries to http_wrapper for the explicit
case of an HTTP/2 stream error.
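
A minimal sketch of the retry idea, not the code in this PR. The names `streamError`, `isStreamError`, and `doWithRetry` are hypothetical stand-ins; the real http_wrapper may detect the stream error and bound its retries differently.

```go
package main

import (
	"errors"
	"fmt"
)

// streamError is a hypothetical stand-in for an HTTP/2 stream error.
type streamError struct{ msg string }

func (e *streamError) Error() string { return e.msg }

// errPermanent represents any failure that should not be retried.
var errPermanent = errors.New("permanent failure")

// isStreamError reports whether err is, or wraps, a streamError.
func isStreamError(err error) bool {
	var se *streamError
	return errors.As(err, &se)
}

// doWithRetry calls do up to maxAttempts times, retrying only when the
// failure is a stream error. It returns the attempts made and the last error.
func doWithRetry(maxAttempts int, do func() error) (int, error) {
	var err error

	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = do()
		if err == nil || !isStreamError(err) {
			return attempt, err
		}
	}

	return maxAttempts, err
}

func main() {
	calls := 0

	// Fails twice with a wrapped stream error, then succeeds.
	attempts, err := doWithRetry(3, func() error {
		calls++
		if calls < 3 {
			return fmt.Errorf("request failed: %w", &streamError{"stream error"})
		}
		return nil
	})
	fmt.Println("attempts:", attempts, "err:", err)

	// Any other error is returned immediately without a retry.
	attempts, err = doWithRetry(3, func() error { return errPermanent })
	fmt.Println("attempts:", attempts, "err:", err)
}
```

Retrying only on the one recognized error keeps the wrapper from masking failures that would never succeed on a second attempt.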
---
#### Does this PR need a docs update or release note?
- [x] ✅ Yes, it's included
#### Type of change
- [x] 🐛 Bugfix
#### Test Plan
- [x] 💪 Manual
- [x] 💚 E2E
Automatically logs whenever we add a recoverable error or a skipped item to fault. The log includes a stack trace of the call from the location of the logged recoverable. Clues does not yet have a method for pulling a stack trace out of an error; that can be added at a future date.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🤖 Supportability/Tests
#### Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
Prevents fault items and skips from clobbering each other by adding a namespace to the fault item that defines a deduplication boundary. This allows multiple items in a single service operation to have identical IDs so long as their namespaces differ.
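
A rough sketch of namespace-scoped deduplication. The `item`, `itemSet`, and `add` names below are hypothetical and simplified; they are not the fault package's actual types.

```go
package main

import "fmt"

// item is a hypothetical, simplified fault item.
type item struct {
	Namespace string
	ID        string
	Cause     string
}

// dedupeKey pairs namespace and ID so that identical IDs only collide
// when they share a namespace.
type dedupeKey struct {
	namespace string
	id        string
}

// itemSet keeps the first item seen for each (namespace, ID) pair.
type itemSet struct {
	seen  map[dedupeKey]struct{}
	items []item
}

func newItemSet() *itemSet {
	return &itemSet{seen: map[dedupeKey]struct{}{}}
}

// add stores the item unless its (namespace, ID) pair was already added.
// It reports whether the item was stored.
func (s *itemSet) add(i item) bool {
	k := dedupeKey{namespace: i.Namespace, id: i.ID}
	if _, ok := s.seen[k]; ok {
		return false
	}

	s.seen[k] = struct{}{}
	s.items = append(s.items, i)

	return true
}

func main() {
	s := newItemSet()

	fmt.Println(s.add(item{Namespace: "exchange", ID: "1", Cause: "first"}))
	fmt.Println(s.add(item{Namespace: "onedrive", ID: "1", Cause: "same id, other namespace"}))
	fmt.Println(s.add(item{Namespace: "exchange", ID: "1", Cause: "duplicate"}))
	fmt.Println(len(s.items))
}
```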
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🐛 Bugfix
- [x] 🤖 Supportability/Tests
#### Issue(s)
* #3283
#### Test Plan
- [x] 💪 Manual
- [x] ⚡ Unit test
- [x] 💚 E2E
<!-- PR description-->
---
#### Does this PR need a docs update or release note?
- [ ] ⛔ No
#### Type of change
- [ ] 🧹 Tech Debt/Cleanup
#### Issue(s)
* #<issue>
#### Test Plan
- [ ] 💪 Manual
A last-second change in #2708 caused us to
pass along the wrong fault.Errors into backup
persistence, thus slicing the count of skipped
items. That's been fixed, along with improved
end-of-operation logging of fault errors.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🐛 Bugfix
#### Issue(s)
* #2708
#### Test Plan
- [x] 💪 Manual
- [x] 💚 E2E
Adds a new streamstore controller for fault.Errors. This provides large-scale, extensible file storage for persisting fault errors, much like we do for backup details.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [x] 🌻 Feature
#### Issue(s)
* #2708
#### Test Plan
- [x] 💚 E2E
Adds the item struct to the fault package for tracking serializable and deduplicatable error sources.
---
#### Does this PR need a docs update or release note?
- [x] ⛔ No
#### Type of change
- [ ] 🌻 Feature
#### Issue(s)
* #2708
#### Test Plan
- [x] ⚡ Unit test
## Description
Renames the funcs in the fault
package to make their purpose and
behavior clearer. Mostly
find-and-replace changes, except
for fault.go and the fault examples.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Realized we had a race condition: in an async
runtime it's possible for an errs.Err() to be
returned by multiple functions, even though that
Err() was only sourced by one of them. The new
tracker confines the returned error to the scope
of that func, so that only the error produced in
the current iteration is returned.
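
A simplified sketch of the scoping idea, assuming a shared collector that many funcs write to. The `bus`, `tracker`, `local`, and `fail` names are hypothetical and do not mirror the fault package's API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errBoom = errors.New("boom")

// bus is a hypothetical shared collector: any func may set its failure.
type bus struct {
	mu  sync.Mutex
	err error
}

func (b *bus) fail(err error) {
	b.mu.Lock()
	defer b.mu.Unlock()

	if b.err == nil {
		b.err = err
	}
}

func (b *bus) Err() error {
	b.mu.Lock()
	defer b.mu.Unlock()

	return b.err
}

// tracker forwards failures to the shared bus, but remembers only the
// failure that was reported through this tracker.
type tracker struct {
	bus     *bus
	mu      sync.Mutex
	current error
}

// local returns a tracker scoped to the calling func.
func (b *bus) local() *tracker { return &tracker{bus: b} }

func (t *tracker) fail(err error) {
	t.mu.Lock()
	if t.current == nil {
		t.current = err
	}
	t.mu.Unlock()

	t.bus.fail(err)
}

// Err returns only the failure produced in this tracker's scope.
func (t *tracker) Err() error {
	t.mu.Lock()
	defer t.mu.Unlock()

	return t.current
}

func main() {
	shared := &bus{}
	a, b := shared.local(), shared.local()

	a.fail(errBoom) // only func A fails

	fmt.Println("shared:", shared.Err()) // the bus still sees the failure
	fmt.Println("func A:", a.Err())      // A returns its own error
	fmt.Println("func B:", b.Err())      // B no longer returns A's error
}
```

Reading from the shared collector alone is what lets one func return an error that another func produced; the local tracker removes that ambiguity.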
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
The error type does not get marshaled, which can lead to errors when we try to unmarshal a manifest
that had errors stored in it.
Repro here: https://go.dev/play/p/tgj8oq5CGFd
For now, this disables JSON marshaling and also fixes the error unmarshaling. I have added a regression test
to verify both behaviors.
As a follow-up, I believe we can implement a custom marshaler.
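
A small illustration of the failure mode, independent of the linked playground. The struct names are hypothetical, and the `json:"-"` tag is one assumed way to disable marshaling of the field; the PR's actual change may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

var errBoom = errors.New("boom")

// broken stores an error in a plain field. errors.New values have no
// exported fields, so the field marshals to an empty JSON object, which
// cannot be decoded back into the error interface.
type broken struct {
	ID  string
	Err error
}

// fixed excludes the error from JSON entirely.
type fixed struct {
	ID  string
	Err error `json:"-"`
}

// roundTrip marshals in and unmarshals the result into out.
func roundTrip(in, out interface{}) error {
	bs, err := json.Marshal(in)
	if err != nil {
		return err
	}

	return json.Unmarshal(bs, out)
}

func main() {
	var b broken
	fmt.Println("broken:", roundTrip(broken{ID: "m", Err: errBoom}, &b))

	var f fixed
	fmt.Println("fixed:", roundTrip(fixed{ID: "m", Err: errBoom}, &f), f.ID)
}
```

With the tag in place the round trip succeeds, at the cost of dropping the error from the stored form, which is why a custom marshaler is a natural follow-up.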
## Does this PR need a docs update or release note?
- [ ] ✅ Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x] ⛔ No
## Type of change
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Test
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup
## Test Plan
- [ ] 💪 Manual
- [x] ⚡ Unit test
- [ ] 💚 E2E
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Refactors error handling in graph_connector.
Also begins some error refactoring in support by
moving StackTraceError-style funcs into a more
normalized handler in graph/errors.go, and
removes the (Non)Recoverable error wraps, which
we weren't using anyway.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Description
Begins updating operations/backup with the new
error handling procedures. For backwards
compatibility, errors are currently duplicated in
the old stats.Errs and the new Errors struct.
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test
- [x] 💚 E2E
## Does this PR need a docs update or release note?
- [x] ⛔ No
## Type of change
- [x] 🧹 Tech Debt/Cleanup
## Issue(s)
* #1970
## Test Plan
- [x] ⚡ Unit test