feat: x_get_likes (GraphQL) and x_discover_likes with human-like deep reads #24
Open
nj-io wants to merge 2 commits into nirholas:main
Conversation
Rewrites scrapeThread and adds scrapePost using X's TweetDetail GraphQL
API instead of DOM scraping. Introduces shared infrastructure for all
GraphQL-based scrapers.
New tools:
- x_read_post: read any tweet with full rich data, recursive QT resolution
Shared helpers:
- fetchTweetDetail: GraphQL API caller with retry/backoff on rate limits
- parseTweetResult: rich data extraction (text, media, articles, cards,
URLs, engagement)
- parseThreadFromEntries: self-reply thread chain detection
- checkAuth: post-navigation auth guard
- randomDelay: log-normal distribution with distraction spikes
- newTab: per-call tab isolation (shared browser, separate pages)
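The randomDelay helper above (log-normal distribution with distraction spikes) could be sketched roughly like this. All parameter values — median, sigma, spike probability, spike multiplier — are illustrative assumptions, not taken from the PR:

```typescript
// Sketch of a human-like randomDelay: log-normal base with occasional
// "distraction" spikes. Parameter defaults are illustrative assumptions.
function randomDelay(medianMs = 3000, sigma = 0.5, spikeProb = 0.05): number {
  // Box-Muller transform: one standard-normal sample
  const u1 = 1 - Math.random(); // avoid log(0)
  const u2 = Math.random();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  // Log-normal around medianMs; sigma controls the right-skewed spread
  let delay = medianMs * Math.exp(sigma * z);
  // Rare "distraction" spike: a much longer pause, like a human looking away
  if (Math.random() < spikeProb) delay *= 5 + Math.random() * 10;
  return Math.round(delay);
}
```

A log-normal is a natural fit here because human reaction gaps are right-skewed: mostly short, occasionally long, never negative.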
scrapeThread rewrite:
- Uses GraphQL API instead of DOM scraping
- Gets full_text (no truncation), note_tweet support
- screen_name from user.core (X moved it from user.legacy)
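The user.core migration noted above suggests a tolerant accessor that prefers the new field location and falls back to the old one. The payload shape below is a simplified assumption, not the full GraphQL response:

```typescript
// Sketch: X moved screen_name from user.legacy to user.core in newer
// GraphQL payloads; read the new location first, fall back to the old.
interface UserResult {
  core?: { screen_name?: string };
  legacy?: { screen_name?: string };
}

function getScreenName(user: UserResult): string | undefined {
  // Prefer user.core (new), fall back to user.legacy (old)
  return user.core?.screen_name ?? user.legacy?.screen_name;
}
```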
scrapePost:
- Handles single posts and threads
- Recursive quote tweet resolution (up to 5 levels)
- Each tweet: text, media, articles, cards, external URLs, engagement
- Error surfacing: returns { thread: [], error: "..." } on failure
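The bounded quote-tweet recursion could be sketched as below. The Tweet shape and the injected fetcher are illustrative placeholders; only the 5-level depth cap comes from the PR:

```typescript
// Sketch of recursive quote-tweet resolution with a depth cap of 5.
// Tweet shape and fetchTweet signature are assumptions for illustration.
interface Tweet {
  id: string;
  text: string;
  quotedId?: string;
  quoted?: Tweet;
}

async function resolveQuotes(
  tweet: Tweet,
  fetchTweet: (id: string) => Promise<Tweet | undefined>,
  depth = 0,
  maxDepth = 5,
): Promise<Tweet> {
  // Stop at the cap or when there is nothing left to resolve
  if (depth >= maxDepth || !tweet.quotedId) return tweet;
  const quoted = await fetchTweet(tweet.quotedId);
  if (quoted) {
    tweet.quoted = await resolveQuotes(quoted, fetchTweet, depth + 1, maxDepth);
  }
  return tweet;
}
```

The cap matters because quote chains can be arbitrarily long (or even cyclic via deleted-and-restored tweets), and each level costs another API round trip.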
Multi-tab isolation:
- x_read_post and x_get_thread each create their own browser tab
- Tabs share cookies/auth, don't conflict on concurrent calls
- 60s default timeout per tab
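The per-tab timeout above can be expressed as a generic deadline wrapper. This is a sketch, not the project's actual helper; only the 60s default comes from the PR, and what "work" is (a tab's scrape promise) is left to the caller:

```typescript
// Sketch: run one tab's work under a deadline (PR default: 60s per tab).
function withTimeout<T>(work: Promise<T>, ms = 60_000): Promise<T> {
  let timer!: ReturnType<typeof setTimeout>;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  // Clear the timer either way so a finished scrape doesn't hold the process open
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```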
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two new tools for scraping and deeply reading liked tweets.
x_get_likes — fast GraphQL-based likes index:
- Likes GraphQL API with cursor pagination (50 in 14s, 200 in 49s)
- JSONL output to ~/.xactions/exports/
- from/to timestamp filtering with early exit
- Rich data via parseTweetResult
x_discover_likes — interleaved fetch + deep read:
- Fetches likes via API, deep-reads each via scrapePost
- Human-like pacing: 3-8s between pages, 2-5s before reads, 5-15s after
- Produces two JSONL files: likes index + deep reads
- ~38s per tweet average
Both use multi-tab isolation (newTab) for concurrent safety.
Removes x_get_likes from xeepyTools, deletes old DOM handler.
Depends on: nirholas#23 (shared infrastructure)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
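The cursor pagination with early timestamp exit described in the commit above might look like the sketch below. The page-fetcher signature and record shape are assumptions; the real code calls X's Likes GraphQL endpoint:

```typescript
// Sketch of cursor-paginated likes fetching with early exit on a `from`
// timestamp bound. fetchPage and LikesPage are illustrative assumptions.
interface LikesPage {
  tweets: { id: string; likedAt: number }[];
  nextCursor?: string;
}

async function scrapeLikes(
  fetchPage: (cursor?: string) => Promise<LikesPage>,
  opts: { max?: number; from?: number } = {},
): Promise<{ id: string; likedAt: number }[]> {
  const out: { id: string; likedAt: number }[] = [];
  let cursor: string | undefined;
  while (out.length < (opts.max ?? Infinity)) {
    const page = await fetchPage(cursor);
    for (const t of page.tweets) {
      // Likes arrive newest-first: once past the `from` bound, stop entirely
      if (opts.from !== undefined && t.likedAt < opts.from) return out;
      out.push(t);
    }
    if (!page.nextCursor || page.tweets.length === 0) break;
    cursor = page.nextCursor;
  }
  return out;
}
```

The early return is what makes from/to filtering cheap: because the timeline is sorted newest-first, crossing the lower bound means no later page can contain a match.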
Summary
Two new tools for scraping and deeply reading liked tweets via GraphQL API.
x_get_likes — fast likes index
- Likes GraphQL API with cursor pagination (50 in 14s, 200 in 49s)
- JSONL output to ~/.xactions/exports/ — progress survives crashes
- from/to timestamp filtering with early exit
- Rich data via parseTweetResult
x_discover_likes — interleaved fetch + deep read
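The crash-safe progress property of the JSONL export comes from append-only, one-record-per-line writes: a crash mid-run loses at most the record in flight. A minimal sketch, assuming a synchronous append helper (the real export path is ~/.xactions/exports/):

```typescript
import { appendFileSync } from "node:fs";

// Sketch: append one record per line, flushed immediately, so a crash
// mid-run preserves everything written so far. The helper name is ours.
function appendJsonl(path: string, record: unknown): void {
  appendFileSync(path, JSON.stringify(record) + "\n");
}
```

This is also why `wc -l` works as a progress indicator: line count equals record count at all times.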
Fetches likes and deep-reads each one with human-like pacing:
- 3-8s between pages, 2-5s before each read, 5-15s after
- Produces two JSONL files: likes index + deep reads
- Line counts (wc -l) for progress
Architecture
- scrapeLikedTweets() — resolves userId via UserByScreenName, paginates via the Likes GraphQL endpoint (data.user.result.timeline.timeline.instructions)
- discoverLikes() — calls the scrapeLikedTweets pattern for fetching, _scrapePostRecursive for deep reads
- newTab() for multi-tab isolation
- Removes x_get_likes from xeepyTools, deletes old DOM handler
Depends on
- nirholas#23 (shared infrastructure)
Test plan
- from/to timestamp filtering
- x_discover_likes — 5 likes with full deep reads in 190s
🤖 Generated with Claude Code