Don't dump response body while checking for malware (#3592)

Corso detects malware in a few places for OD/SP.

1. During item enumeration, using [malware property](https://learn.microsoft.com/en-us/graph/api/resources/driveitem?view=graph-rest-1.0#properties) which is set by graph - [code](ed47c134b0/src/internal/connector/onedrive/collections.go (L658)) 

2. During item fetch using download URL - [code](ed47c134b0/src/internal/connector/onedrive/item.go (LL53C18-L53C18))
- Graph seems to return 403 with `{"error":{"code":"malwareDetected","message":"Malware detected"}}`. Tested with eicar.

3. By analyzing http response headers in `IsMalwareResp`  Don't see these headers documented by graph. This is for [sharepoint protocol compatibility](https://learn.microsoft.com/en-us/openspecs/sharepoint_protocols/ms-wdvmoduu/6fa6d4a9-ac18-4cd7-b696-8a3b14a98291) it seems.

4. By analyzing response body in `IsMalwareResp` . We check for malwareDetected string inside http response body.

We are accidentally dumping entire file ( http response dump) while checking for malware in 4. This is leading to high mem utilization/OOMs, especially while processing very large files.

I don't think we need 4 at all. If graph is the entity which detects malware, I think we will never get past 2. This PR removes item 4. 

---

#### Does this PR need a docs update or release note?

- [ ]  Yes, it's included
- [ ] 🕐 Yes, but in a later PR
- [x]  No

#### Type of change

<!--- Please check the type of change your PR introduces: --->
- [ ] 🌻 Feature
- [x] 🐛 Bugfix
- [ ] 🗺️ Documentation
- [ ] 🤖 Supportability/Tests
- [ ] 💻 CI/Deployment
- [ ] 🧹 Tech Debt/Cleanup

#### Issue(s)

<!-- Can reference multiple issues. Use one of the following "magic words" - "closes, fixes" to auto-close the Github issue. -->
* #<issue>

#### Test Plan

<!-- How will this be tested prior to merging.-->
- [x] 💪 Manual
- [ ]  Unit test
- [ ] 💚 E2E
This commit is contained in:
Abhishek Pandey 2023-06-12 11:15:19 -07:00 committed by GitHub
parent ea5eaf3aaf
commit bc7792744e
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -4,7 +4,6 @@ import (
"context"
"fmt"
"net/http"
"net/http/httputil"
"net/url"
"os"
"strings"
@ -18,7 +17,6 @@ import (
"github.com/alcionai/corso/src/internal/common/ptr"
"github.com/alcionai/corso/src/pkg/fault"
"github.com/alcionai/corso/src/pkg/filters"
"github.com/alcionai/corso/src/pkg/logger"
)
// ---------------------------------------------------------------------------
@ -169,27 +167,11 @@ func IsMalware(err error) bool {
}
func IsMalwareResp(ctx context.Context, resp *http.Response) bool {
if resp == nil {
return false
}
// https://learn.microsoft.com/en-us/openspecs/sharepoint_protocols/ms-wsshp/ba4ee7a8-704c-4e9c-ab14-fa44c574bdf4
// https://learn.microsoft.com/en-us/openspecs/sharepoint_protocols/ms-wdvmoduu/6fa6d4a9-ac18-4cd7-b696-8a3b14a98291
if len(resp.Header) > 0 && resp.Header.Get("X-Virus-Infected") == "true" {
return true
}
respDump, err := httputil.DumpResponse(resp, true)
if err != nil {
logger.Ctx(ctx).Errorw("dumping http response", "error", err)
return false
}
if strings.Contains(string(respDump), string(malwareDetected)) {
return true
}
return false
return resp != nil &&
len(resp.Header) > 0 &&
resp.Header.Get("X-Virus-Infected") == "true"
}
func IsErrFolderExists(err error) bool {