fix(sites): remediate false positive for Archive.org #2789

Open
The-Khan-Labs wants to merge 3 commits into sherlock-project:master from The-Khan-Labs:patch-1

@The-Khan-Labs

Adds backend error message matching to correctly identify non-existent Archive.org accounts and prevent false positives.

@github-actions
Contributor

Automatic validation of changes

| Target | F+ Check | F- Check |
|---|---|---|
| Archive.org | ❌ Fail | ✔️ Pass |

Failures were detected on at least one updated target. Commits containing accuracy failures will often not be merged (unless a rationale is provided, such as false negatives due to regional differences).

@The-Khan-Labs
Author

Thanks for the check.

Archive.org returns a 200 response with a backend error message for non-existent users instead of a 404, which causes the F+ check to flag a failure.

I manually verified multiple real and random usernames and confirmed that the message "could not fetch an account with user item identifier" only appears for non-existent accounts, while valid profiles are still detected correctly.
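
The message-based check described above can be sketched roughly as follows. This is a minimal illustration and not Sherlock's actual implementation: the function name `is_claimed` and both sample bodies are hypothetical, and only the error string is taken from this thread.

```python
# Minimal sketch of message-based detection (hypothetical helper, not Sherlock's code).
# A username counts as "claimed" only if none of the known error strings
# appear in the response body, regardless of the HTTP status code.

def is_claimed(response_text: str, error_msgs: list[str]) -> bool:
    """Return True when no known error string is present in the body."""
    return not any(msg in response_text for msg in error_msgs)


# Error string quoted in the comment above.
ERROR_MSGS = ["could not fetch an account with user item identifier"]

# Hypothetical response bodies, for illustration only.
missing_body = '{"error": "could not fetch an account with user item identifier"}'
existing_body = "<html><body>profile page</body></html>"

missing_result = is_claimed(missing_body, ERROR_MSGS)
existing_result = is_claimed(existing_body, ERROR_MSGS)
print(missing_result, existing_result)
```

A check like this only works if the error string is actually present in the body that the HTTP client receives, which is why the status code alone (200 in both cases) cannot distinguish the two.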

@ppfeister
Member

Hey @The-Khan-Labs, just an FYI: it doesn't appear that your PR-ed commit includes any changes to the actual check, just to the comment attribute.

Something may have been left behind in devel.

@The-Khan-Labs
Author

Good catch, you’re right. The previous commit only updated the comment.
I’ve pushed a follow-up that updates the Archive.org errorMsg matcher to include the backend account_extra_info error.

Member

@ppfeister ppfeister left a comment

Missing comma throws a parsing error, also just fixing alignment
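
To illustrate the kind of failure mentioned here, the sketch below assumes the missing comma sat between two strings in a JSON array such as the `errorMsg` list. The thread does not show the actual diff, so both snippets are hypothetical reconstructions that reuse error strings quoted elsewhere in this conversation.

```python
import json

# Hypothetical fragment: two adjacent strings with no comma between them.
broken = """
{
  "errorMsg": [
    "The resource could not be found"
    "could not fetch an account with user item identifier"
  ]
}
"""

# The same fragment with the comma restored.
fixed = """
{
  "errorMsg": [
    "The resource could not be found",
    "could not fetch an account with user item identifier"
  ]
}
"""


def parses(text: str) -> bool:
    """Return True if the text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


broken_ok = parses(broken)
fixed_ok = parses(fixed)
print(broken_ok, fixed_ok)
```

Unlike Python source, JSON does not concatenate adjacent string literals, so the parser rejects the whole document instead of silently merging the two messages.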

@github-actions
Contributor

Automatic validation of changes

| Target | F+ Check | F- Check |
|---|---|---|
| Archive.org | ❌ Fail | ✔️ Pass |

Failures were detected on at least one updated target. Commits containing accuracy failures will often not be merged (unless a rationale is provided, such as false negatives due to regional differences).

@ppfeister
Member

It seems this endpoint isn't returning the expected error messages because the page requires JavaScript. Do you see a different outcome on your end?

(sherlock-project-py3.13) ➜  sherlock git:(v0.16.0) ✗ sherlock -j 2789 --site Archive\.org uehrighoierhg --dump-response
[*] Checking username uehrighoierhg on:

+++++++++++++++++++++
TARGET NAME   : Archive.org
USERNAME      : uehrighoierhg
TARGET URL    : https://archive.org/details/@uehrighoierhg
TEST METHOD   : message
Results...
RESPONSE CODE : 200
ERROR TEXT    : ['backend_request account_extra_info', 'could not fetch an account with user item identifier', 'The resource could not be found', 'Internet Archive services are temporarily offline']
>>>>> BEGIN RESPONSE TEXT
<!DOCTYPE html>
<html lang="en">
<!--  __ _ _ _ __| |_ (_)__ _____
     / _` | '_/ _| ' \| |\ V / -_)
     \__,_|_| \__|_||_|_| \_/\___|-->

<head>
  <!-- <base href="/"> is used by `router-slot` for routing -->
  <base href="/">
  <meta charset="utf-8" />
  <link rel="icon" href="/offshoot_assets/favicon.ico" />
  <link rel="preconnect" href="https://analytics.archive.org" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <meta name="google-site-verification" content="<--rem-->" />
  <!-- bpjKvUv is for Wayback Gsheets -->
  <meta name="google-site-verification" content="<--rem-->" />

  <!-- Don't cache the html, otherwise browsers may try to load outdated javascript chunks, see:
          https://raphael-leger.medium.com/react-webpack-chunkloaderror-loading-chunk-x-failed-ac385bd110e0
        -->
  <meta http-equiv="Pragma" content="no-cache" />
  <meta http-equiv="cache-control" content="no-cache, no-store, must-revalidate" />

  <link rel="stylesheet" type="text/css" href="/offshoot_assets/index.css" />
  <title>Internet Archive: Digital Library of Free &amp; Borrowable Texts, Movies, Music &amp; Wayback Machine</title>

  <script src="/offshoot_assets/vendor/lit/polyfill-support.js"></script>
  <script src="/offshoot_assets/vendor/@webcomponents/webcomponentsjs@2.6.0/webcomponents-loader.js"></script>

  <!-- <script>
    /* @licstart  The following is the entire license notice for the
    * JavaScript code in this page.
    *
    * This program is free software: you can redistribute it and/or modify
    * it under the terms of the GNU Affero General Public License as published by
    * the Free Software Foundation, either version 3 of the License, or
    * (at your option) any later version.
    *
    * This program is distributed in the hope that it will be useful,
    * but WITHOUT ANY WARRANTY; without even the implied warranty of
    * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    * GNU Affero General Public License for more details.
    *
    * You should have received a copy of the GNU Affero General Public License
    * along with this program.  If not, see <http://www.gnu.org/licenses/>.
    *
    * @licend  The above is the entire license notice
    * for the JavaScript code in this page.
    */
  </script> -->
  <script type="module" crossorigin src="/offshoot_assets/index-S6Q7Kd8r.js"></script>
</head>

<body>
  <app-root></app-root>

  <noscript>
    <h2>Javascript is required for this site. Consider enabling Javascript or upgrading to a modern browser.</h2>
  </noscript>

</body>

</html>

<<<<< END RESPONSE TEXT
VERDICT       : Claimed
+++++++++++++++++++++
[+] Archive.org: https://archive.org/details/@uehrighoierhg

[*] Search completed with 1 results
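
The verdict in the dump above can be reproduced with a small sketch. It takes the four strings from the `ERROR TEXT` line and a trimmed excerpt of the dumped body, then checks which strings occur in it. The helper name `matched_errors` is hypothetical, and the verdict logic is a simplification of the `message` test method, not Sherlock's actual code.

```python
# Error strings copied from the ERROR TEXT line of the dump.
ERROR_MSGS = [
    "backend_request account_extra_info",
    "could not fetch an account with user item identifier",
    "The resource could not be found",
    "Internet Archive services are temporarily offline",
]

# Trimmed excerpt of the dumped response body: a JavaScript app shell
# with no server-rendered account information.
SHELL_BODY = """<body>
  <app-root></app-root>

  <noscript>
    <h2>Javascript is required for this site. Consider enabling Javascript or upgrading to a modern browser.</h2>
  </noscript>

</body>"""


def matched_errors(body: str, error_msgs: list[str]) -> list[str]:
    """Return the error strings that literally occur in the body."""
    return [msg for msg in error_msgs if msg in body]


hits = matched_errors(SHELL_BODY, ERROR_MSGS)
# Simplified verdict: any matching error string means the account is missing.
verdict = "Available" if hits else "Claimed"
print(hits, verdict)
```

Because the static HTML is only a shell that loads the real content with JavaScript, none of the strings can match, and a non-existent username is still reported as claimed.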
