fix(data): handle SSL certificate verification failures in dataset download #3510
Lidang-Jiang wants to merge 1 commit into open-edge-platform:main
…wnload

Users behind corporate proxies that perform TLS interception see SSLCertVerificationError when anomalib tries to download datasets. The raw urllib traceback gives no guidance on how to resolve it.

Changes:
- Catch SSLCertVerificationError and re-raise with an actionable message listing three resolution options (CA trust store, SSL_CERT_FILE, ANOMALIB_NO_VERIFY_SSL).
- Add _ssl_context() context manager that temporarily disables SSL verification when ANOMALIB_NO_VERIFY_SSL=1, restoring it on exit.
- Add unit tests for both the context manager and the error path.

Closes open-edge-platform#3477

Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
@AlexanderBarabanov what do you think the security implications of this are? Personally, I feel people should install the right certificates on their system for SSL to work. My concern is that we might be introducing an unsafe download option that could compromise any downstream library.

@ashwinvaidya17 Yes, I agree. I also propose double-checking the submitted issues. In #3477 (Ubuntu + Python package), the provided logs show that pretrained weights could be downloaded from Hugging Face Hub, and the failure occurred with the mvtecad dataset. In #3492 (Windows + Windows App), the download from HF was unsuccessful.

Summary
Fixes #3477. Also addresses #3492 (same root cause on Windows).
When downloading datasets behind a corporate proxy that performs TLS interception, `urlretrieve` throws a bare `SSLCertVerificationError` with no guidance. This PR adds:
- A handler that catches `SSLCertVerificationError` and re-raises it as `RuntimeError` with three resolution options (CA trust store, `SSL_CERT_FILE`, `ANOMALIB_NO_VERIFY_SSL`).
- `_ssl_context()` context manager: when `ANOMALIB_NO_VERIFY_SSL=1` is set, temporarily disables SSL verification for the download block only, restoring the original context on exit.

Changed files
- `src/anomalib/data/utils/download.py`: `_ssl_context()`, catch `SSLCertVerificationError` with guidance, minor cleanup (`startswith` tuple)
- `tests/unit/data/utils/test_download.py`

Before (upstream main): raw SSL traceback with no guidance
```
Traceback (most recent call last):
...
File ".../urllib/request.py", line 1344, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED]
certificate verify failed: unable to get local issuer certificate
(_ssl.c:1000)>
```
The user has no idea how to fix this.
After (this PR) — actionable RuntimeError
```
RuntimeError: SSL certificate verification failed while downloading MVTecAD:
('[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed',)
If you are behind a corporate proxy, you can either:
```
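
For reviewers who want the shape of the error path without opening the diff, here is a minimal sketch. It is not the code in this PR: `download_with_guidance`, its `fetch` parameter, and the wording of the three options are hypothetical. It also assumes the failure may arrive wrapped in `urllib.error.URLError`, as the "Before" traceback suggests.

```python
import ssl
import urllib.error
from collections.abc import Callable


def _guidance(name: str, exc: BaseException) -> str:
    # Illustrative wording only; the real message text lives in the PR diff.
    return (
        f"SSL certificate verification failed while downloading {name}:\n"
        f"{exc.args}\n"
        "If you are behind a corporate proxy, you can either:\n"
        "  1. add the proxy CA certificate to the system CA trust store,\n"
        "  2. point SSL_CERT_FILE at a CA bundle that includes it, or\n"
        "  3. set ANOMALIB_NO_VERIFY_SSL=1 (disables verification, unsafe)."
    )


def download_with_guidance(fetch: Callable[[], None], name: str) -> None:
    """Run ``fetch`` and turn certificate failures into a RuntimeError."""
    try:
        fetch()
    except ssl.SSLCertVerificationError as exc:
        raise RuntimeError(_guidance(name, exc)) from exc
    except urllib.error.URLError as exc:
        # urllib wraps the underlying error and exposes it as ``reason``.
        if isinstance(exc.reason, ssl.SSLCertVerificationError):
            raise RuntimeError(_guidance(name, exc.reason)) from exc
        raise  # unrelated network errors propagate unchanged
```

In this sketch the original exception is chained with `from exc`, so the underlying traceback remains available for debugging.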
After (this PR) — ANOMALIB_NO_VERIFY_SSL=1 bypass works
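
A rough sketch of how an opt-in bypass of this kind could be scoped to the download block. This is an assumption about the mechanism, not the PR's implementation: `ssl_context_sketch` and the accepted flag values are hypothetical, and `ssl._create_default_https_context` is a private CPython hook rather than a public API.

```python
import contextlib
import os
import ssl
from collections.abc import Iterator

# Hypothetical truthy values; the test names only suggest "1" and a "true" string.
_TRUTHY = {"1", "true"}


@contextlib.contextmanager
def ssl_context_sketch() -> Iterator[None]:
    """Disable HTTPS verification only while the ``with`` block runs."""
    flag = os.environ.get("ANOMALIB_NO_VERIFY_SSL", "").strip().lower()
    if flag not in _TRUTHY:
        # Default path: leave the process-wide SSL settings untouched.
        yield
        return

    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Restore verification even if the download raised.
        ssl._create_default_https_context = original
```

Because the swap is process-wide while the block is active, any other HTTPS request made by urllib during that window would also skip verification, which is relevant to the security concern raised in review.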
Unit tests
```
tests/unit/data/utils/test_download.py::TestSSLContext::test_default_does_not_change_ssl PASSED
tests/unit/data/utils/test_download.py::TestSSLContext::test_no_verify_disables_ssl PASSED
tests/unit/data/utils/test_download.py::TestSSLContext::test_no_verify_true_string PASSED
tests/unit/data/utils/test_download.py::TestSSLContext::test_no_verify_false_keeps_ssl PASSED
tests/unit/data/utils/test_download.py::TestSSLContext::test_context_restores_on_exception PASSED
tests/unit/data/utils/test_download.py::TestDownloadSSLError::test_ssl_error_gives_actionable_message PASSED
tests/unit/data/utils/test_download.py::TestDownloadSSLError::test_invalid_scheme_raises PASSED
======================== 7 passed in 0.03s =========================
```
Test plan