feat: x_get_likes (GraphQL) and x_discover_likes with human-like deep reads #24

Open
nj-io wants to merge 2 commits into nirholas:main from nj-io:feat/graphql-likes-discover

nj-io commented Apr 7, 2026

Summary

Two new tools for scraping and deep-reading liked tweets via X's GraphQL API.

x_get_likes — fast likes index

| Metric     | DOM scraping (old)   | GraphQL (new) |
|------------|----------------------|---------------|
| 50 tweets  | ~8 min, capped at 25 | 14s           |
| 200 tweets | impossible           | 49s           |
  • Cursor-based pagination via Likes GraphQL API
  • JSONL output to ~/.xactions/exports/ — progress survives crashes
  • from/to timestamp filtering with early exit
  • Rich data via parseTweetResult
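
The from/to filtering with early exit could look roughly like the sketch below. It is not the PR's code: `filterPage`, `collectLikes`, `fetchPage`, and the `createdAt` field are hypothetical names, and it assumes each page arrives newest-first so that the first tweet older than `from` means nothing later can match.

```javascript
// Hypothetical helper: names and record shape are illustrative, not taken from the PR.
// Assumes tweets are ordered newest-first and createdAt is an ISO 8601 string.
function filterPage(tweets, { from, to } = {}) {
  const fromMs = from ? Date.parse(from) : -Infinity;
  const toMs = to ? Date.parse(to) : Infinity;
  const kept = [];
  let done = false;

  for (const tweet of tweets) {
    const ts = Date.parse(tweet.createdAt);
    if (ts > toMs) continue; // newer than the window: skip it, keep paging
    if (ts < fromMs) {       // older than the window: stop paginating
      done = true;
      break;
    }
    kept.push(tweet);
  }
  return { kept, done };
}

// Cursor loop. fetchPage is a stand-in for the real GraphQL call and is
// assumed to resolve to { tweets, nextCursor }.
async function collectLikes(fetchPage, { from, to, limit = 200 } = {}) {
  const results = [];
  let cursor = null;
  while (results.length < limit) {
    const { tweets, nextCursor } = await fetchPage(cursor);
    const { kept, done } = filterPage(tweets, { from, to });
    results.push(...kept);
    if (done || !nextCursor || tweets.length === 0) break;
    cursor = nextCursor;
  }
  return results.slice(0, limit);
}
```

Keeping the window check in a pure function separates the early-exit decision from the network loop, so it can be tested without a browser.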

x_discover_likes — interleaved fetch + deep read

Fetches likes and deep-reads each one with human-like pacing:

  • 3-8s between pages, 2-5s before tapping in, 5-15s after reading
  • Plus 8% distraction spikes on all delays
  • Produces two JSONL files: likes index + deep reads (full thread/QT data)
  • ~38s per tweet average
  • Long-running — check JSONL files on disk (wc -l) for progress
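
A minimal sketch of the pacing described above, under stated assumptions. The ranges come from this description; the log-normal shape follows the `randomDelay` note in the commit message. The function names, the clamping, the reading of "8%" as a per-delay probability, and the spike multiplier are all guesses and not the PR's `randomDelay`.

```javascript
// Illustrative only: names, clamping, and spikeFactor are assumptions.

// Standard normal sample via the Box-Muller transform.
function sampleNormal(rng = Math.random) {
  const u1 = 1 - rng(); // shift to (0, 1] so Math.log never receives 0
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Log-normal delay whose median is the geometric mean of [minMs, maxMs],
// clamped to that range, with an occasional "distraction" spike.
function humanDelayMs(
  minMs,
  maxMs,
  { spikeChance = 0.08, spikeFactor = 4, rng = Math.random } = {}
) {
  const mu = (Math.log(minMs) + Math.log(maxMs)) / 2;
  // The range spans 4 sigma in log space, so about 95% of samples fall inside it.
  const sigma = (Math.log(maxMs) - Math.log(minMs)) / 4;
  const base = Math.exp(mu + sigma * sampleNormal(rng));
  const clamped = Math.min(maxMs, Math.max(minMs, base));
  return rng() < spikeChance ? clamped * spikeFactor : clamped;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Ranges from the PR description, in milliseconds.
const PACING = {
  betweenPages: [3000, 8000],
  beforeRead: [2000, 5000],
  afterRead: [5000, 15000],
};

async function pause(kind) {
  const [min, max] = PACING[kind];
  await sleep(humanDelayMs(min, max));
}
```

Injecting `rng` keeps the sampler deterministic under test while defaulting to `Math.random` in normal use.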

Architecture

  • scrapeLikedTweets(): resolves userId via UserByScreenName, then paginates the Likes GraphQL endpoint (entries live under data.user.result.timeline.timeline.instructions)
  • discoverLikes(): follows the scrapeLikedTweets pattern for fetching and uses _scrapePostRecursive for deep reads
  • Both use newTab() for multi-tab isolation
  • Removes x_get_likes from xeepyTools, deletes old DOM handler
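
For reviewers unfamiliar with the response layout, here is a hedged sketch of pulling tweets and the next cursor out of one Likes page. Only the `instructions` path is taken from this PR. The `entryId` prefixes, `itemContent.tweet_results.result`, and `content.value` reflect how X's timeline responses have commonly looked, but the API is undocumented and may change, so treat them as assumptions. `parseLikesPage` is a hypothetical name.

```javascript
// Assumed response shape: everything except the instructions path is a guess
// about an undocumented API.
function parseLikesPage(response) {
  const instructions =
    response?.data?.user?.result?.timeline?.timeline?.instructions ?? [];
  const tweets = [];
  let nextCursor = null;

  for (const instruction of instructions) {
    for (const entry of instruction.entries ?? []) {
      const id = entry.entryId ?? "";
      if (id.startsWith("tweet-")) {
        // Raw tweet result, which would then go through parseTweetResult.
        const result = entry.content?.itemContent?.tweet_results?.result;
        if (result) tweets.push(result);
      } else if (id.startsWith("cursor-bottom-")) {
        // Cursor to send with the next request.
        nextCursor = entry.content?.value ?? null;
      }
    }
  }
  return { tweets, nextCursor };
}
```

Optional chaining means an unexpected or empty response yields an empty page and no cursor instead of throwing.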

Depends on

  • nirholas#23 (shared infrastructure)

Test plan

  • 50 likes in 14s, 200 in 49s
  • from/to timestamp filtering
  • x_discover_likes — 5 likes with full deep reads in 190s
  • JSONL written incrementally
  • Multi-tab isolation — concurrent calls safe
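
The incremental JSONL behaviour can be checked with something like the sketch below. The helper names and record fields are hypothetical, and the PR's own writer may differ. It appends one JSON object per line as each tweet is processed, and reads the file back while tolerating a partial last line left by an interrupted run.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helpers; file naming and record fields are not from the PR.
function appendJsonl(filePath, record) {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  // One JSON object per line, written before the next tweet is fetched.
  fs.appendFileSync(filePath, JSON.stringify(record) + "\n");
}

// Same count as `wc -l`: the number of newline characters in the file.
function countJsonlLines(filePath) {
  if (!fs.existsSync(filePath)) return 0;
  return (fs.readFileSync(filePath, "utf8").match(/\n/g) ?? []).length;
}

// Reads records back, skipping blank lines and any line that fails to parse.
function readJsonl(filePath) {
  const records = [];
  for (const line of fs.readFileSync(filePath, "utf8").split("\n")) {
    if (!line.trim()) continue;
    try {
      records.push(JSON.parse(line));
    } catch {
      // Partial line from an interrupted write: ignore it.
    }
  }
  return records;
}
```

Because every completed record ends in a newline, the line count doubles as a progress indicator for long runs.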

🤖 Generated with Claude Code

nj-io and others added 2 commits April 7, 2026 06:02
Rewrites scrapeThread and adds scrapePost using X's TweetDetail GraphQL
API instead of DOM scraping. Introduces shared infrastructure for all
GraphQL-based scrapers.

New tools:
- x_read_post: read any tweet with full rich data, recursive QT resolution

Shared helpers:
- fetchTweetDetail: GraphQL API caller with retry/backoff on rate limits
- parseTweetResult: rich data extraction (text, media, articles, cards,
  URLs, engagement)
- parseThreadFromEntries: self-reply thread chain detection
- checkAuth: post-navigation auth guard
- randomDelay: log-normal distribution with distraction spikes
- newTab: per-call tab isolation (shared browser, separate pages)
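
Illustrative sketch of retry with backoff on rate limits (not the actual fetchTweetDetail; the HTTP 429 check, attempt count, and delay constants are assumptions, and `backoffDelayMs` / `withRetry` are hypothetical names):

```javascript
// Exponential backoff with "equal jitter": half the delay is fixed and half
// is random, so concurrent tabs do not all retry at the same instant.
function backoffDelayMs(
  attempt,
  { baseMs = 1000, maxMs = 60000, rng = Math.random } = {}
) {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt);
  return exp / 2 + rng() * (exp / 2);
}

// requestFn is a stand-in for the GraphQL call and is assumed to resolve to
// an object with a numeric `status` (429 = Too Many Requests).
async function withRetry(
  requestFn,
  {
    maxAttempts = 5,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = {}
) {
  for (let attempt = 0; ; attempt++) {
    const response = await requestFn();
    if (response.status !== 429) return response;
    if (attempt + 1 >= maxAttempts) {
      throw new Error("rate limited: retries exhausted");
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```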

scrapeThread rewrite:
- Uses GraphQL API instead of DOM scraping
- Gets full_text (no truncation), note_tweet support
- screen_name from user.core (X moved it from user.legacy)

scrapePost:
- Handles single posts and threads
- Recursive quote tweet resolution (up to 5 levels)
- Each tweet: text, media, articles, cards, external URLs, engagement
- Error surfacing: returns { thread: [], error: "..." } on failure
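
Illustrative sketch of depth-limited quote resolution (not the actual scrapePost; `resolveQuotes`, `lookup`, and the `quotedId` field are hypothetical, and it is shown synchronously over an in-memory lookup for brevity where the real deep read would await a network call):

```javascript
const MAX_QUOTE_DEPTH = 5; // "up to 5 levels" from the commit message

// Returns the tweet with its quote chain nested under `quoted`.
function resolveQuotes(tweetId, lookup, depth = 0, seen = new Set()) {
  if (seen.has(tweetId)) return null; // guard against quote cycles
  seen.add(tweetId);

  const tweet = lookup(tweetId);
  if (!tweet) return null; // deleted or unavailable tweet

  const quoted =
    tweet.quotedId && depth < MAX_QUOTE_DEPTH
      ? resolveQuotes(tweet.quotedId, lookup, depth + 1, seen)
      : null;
  return { ...tweet, quoted };
}
```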

Multi-tab isolation:
- x_read_post and x_get_thread each create their own browser tab
- Tabs share cookies/auth, don't conflict on concurrent calls
- 60s default timeout per tab

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Two new tools for scraping and deeply reading liked tweets.

x_get_likes — fast GraphQL-based likes index:
- Likes GraphQL API with cursor pagination (50 in 14s, 200 in 49s)
- JSONL output to ~/.xactions/exports/
- from/to timestamp filtering with early exit
- Rich data via parseTweetResult

x_discover_likes — interleaved fetch + deep read:
- Fetches likes via API, deep-reads each via scrapePost
- Human-like pacing: 3-8s between pages, 2-5s before reads, 5-15s after
- Produces two JSONL files: likes index + deep reads
- ~38s per tweet average

Both use multi-tab isolation (newTab) for concurrent safety.
Removes x_get_likes from xeepyTools, deletes old DOM handler.

Depends on: nirholas#23 (shared infrastructure)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
nj-io requested a review from nirholas as a code owner April 7, 2026 06:04

vercel bot commented Apr 7, 2026

@nj-io is attempting to deploy a commit to the kaivocmenirehtacgmailcom's projects Team on Vercel.

A member of the Team first needs to authorize it.
