
StephanAkkerman/x-timeline-scraper


X-Timeline Scraper

A Python client to scrape tweets from X (formerly Twitter) timelines using a cURL command.



Introduction

This project provides a Python client for scraping tweets from X (formerly Twitter) timelines using a cURL command. It fetches data asynchronously and parses each tweet into a structured Tweet object.

Installation ⚙️

To install the X-Timeline Scraper, you can use pip:

pip install xtimeline

Usage ⌨️

To use the X-Timeline Scraper, you need to provide a cURL command that accesses the desired X timeline; see curl_example.txt for instructions. You can then use the XTimelineClient class to fetch and parse tweets.

Fetching tweets once

import asyncio
from xclient import XTimelineClient

async def main():
    async with XTimelineClient("curl.txt") as xc:
        tweets = await xc.fetch_tweets()
        for t in tweets:
            print(t.to_markdown())

asyncio.run(main())

Streaming new tweets

import asyncio
from xclient import XTimelineClient

async def main():
    async with XTimelineClient(
        "curl.txt", persist_last_id_path="state/last_id.txt"
    ) as xc:
        async for t in xc.stream():
            print(t.to_markdown())

asyncio.run(main())

By default, stream() polls roughly every 30 seconds and applies built-in jitter (a randomized, "fuzzy" interval) so that requests do not follow an identical cadence.

# 30s base with ±20% jitter (default)
async for t in xc.stream():
    process(t)

# Custom base interval and jitter
async for t in xc.stream(interval_s=45.0, jitter_ratio=0.15):
    process(t)

# Disable jitter if you need a fixed cadence
async for t in xc.stream(interval_s=30.0, jitter_ratio=0.0):
    process(t)
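
To make the jitter idea concrete, here is a minimal sketch of how a jittered polling delay could be computed. It is not the library's implementation: the function name `jittered_interval` and the choice of a uniform distribution are assumptions for illustration. Only the parameter names `interval_s` and `jitter_ratio` and the 30 s / ±20% defaults come from the text above.

```python
import random


def jittered_interval(interval_s: float = 30.0, jitter_ratio: float = 0.2, rng=None) -> float:
    """Return a delay drawn uniformly from interval_s * (1 +/- jitter_ratio).

    Hypothetical helper: the real client may compute its delay differently.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if not 0.0 <= jitter_ratio < 1.0:
        raise ValueError("jitter_ratio must be in [0, 1)")
    rng = rng or random  # allow a seeded random.Random for reproducible tests
    return interval_s * (1.0 + rng.uniform(-jitter_ratio, jitter_ratio))


if __name__ == "__main__":
    # With the defaults, every delay falls between 24 and 36 seconds.
    print([round(jittered_interval(), 1) for _ in range(5)])
```

Setting `jitter_ratio=0.0` in this sketch collapses the range to a single value, which matches the fixed-cadence case shown above.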

Fetch modes

Both fetch_tweets() and stream() accept a mode parameter that controls which tweets are returned:

| Mode | Behaviour |
| --- | --- |
| "new_only" (default) | Only returns tweets newer than the last-seen cursor. The cursor advances so the same tweet is never emitted twice. |
| "all" | Returns every tweet in each response. Nothing is filtered. Useful when your own store (e.g. a SQLite database) handles deduplication. |
| "with_updates" | Returns new tweets and re-emits previously seen tweets whenever their metrics change (likes, retweets, views). Re-emitted tweets have is_update=True. |

# Hand all deduplication to your own store
async for t in xc.stream(mode="all"):
    upsert_to_db(t)

# Only new tweets, cursor persisted across restarts
async with XTimelineClient(
    "curl.txt", persist_last_id_path="state/last_id.txt"
) as xc:
    async for t in xc.stream(mode="new_only"):
        process(t)

# New tweets + engagement updates
async for t in xc.stream(mode="with_updates"):
    if t.is_update:
        update_metrics_in_db(t)
    else:
        insert_new_tweet(t)
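
The three modes can be pictured as a filter applied to each response. The sketch below is an illustration of the behaviour described in the table, not the client's actual code: `SketchTweet`, `SketchState`, and `select_tweets` are hypothetical names, and it assumes that a larger tweet ID means a newer tweet and that the cursor is a single integer.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class SketchTweet:
    """Hypothetical stand-in for the library's Tweet (only a few fields)."""
    id: int
    likes: int = 0
    retweets: int = 0
    views: int = 0
    is_update: bool = False


@dataclass
class SketchState:
    """Hypothetical per-session state: a cursor plus last-seen metrics."""
    last_id: int = 0
    metrics: dict = field(default_factory=dict)


def select_tweets(batch, mode, state):
    """Return the tweets that a given mode would emit for one response."""
    if mode not in ("new_only", "all", "with_updates"):
        raise ValueError(f"unknown mode: {mode!r}")
    emitted = []
    for t in sorted(batch, key=lambda t: t.id):
        current = (t.likes, t.retweets, t.views)
        previous = state.metrics.get(t.id)
        if mode == "all":
            emitted.append(t)  # nothing is filtered
        elif t.id > state.last_id:
            emitted.append(t)  # assumption: larger ID means newer than the cursor
        elif mode == "with_updates" and previous is not None and previous != current:
            emitted.append(replace(t, is_update=True))  # metrics changed
        state.metrics[t.id] = current
    # Advance the cursor once per response so repeats are not emitted again.
    state.last_id = max([state.last_id, *(t.id for t in batch)])
    return emitted
```

In this sketch, feeding the same response twice in "new_only" mode yields nothing the second time, while "with_updates" re-emits a tweet only when its like, retweet, or view count differs from the last observation.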

Tweet fields

Each Tweet object contains:

| Field | Type | Description |
| --- | --- | --- |
| id | int | Tweet ID |
| text | str | Full text, HTML entities unescaped, t.co links expanded, long-form tweets supported |
| user_name | str | Display name |
| user_screen_name | str | @handle (without @) |
| user_img | str | Profile image URL |
| url | str | Canonical tweet URL |
| created_at | str | Post time in ISO 8601 format (2026-04-01T19:15:49Z) |
| likes | int | Like count |
| retweets | int | Retweet count |
| replies | int | Reply count |
| views | int | View count |
| media | list[MediaItem] | Attached photos/videos |
| tickers | list[str] | Uppercased $TICKER symbols |
| hashtags | list[str] | Uppercased hashtags |
| title | str | Human-readable summary, e.g. "TraderSZ retweeted Jelle" |
| is_update | bool | True if this tweet was seen in a previous fetch this session |
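
Because the scalar fields above map directly onto table columns, a store that handles its own deduplication can be quite small. The following is a sketch only: `TweetRow`, `open_store`, and `upsert_tweet` are hypothetical names, `TweetRow` stands in for the library's Tweet with a subset of its fields, and the `ON CONFLICT` clause assumes SQLite 3.24 or newer.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class TweetRow:
    """Hypothetical stand-in carrying a subset of the Tweet fields."""
    id: int
    text: str
    user_screen_name: str
    created_at: str
    likes: int = 0
    retweets: int = 0
    replies: int = 0
    views: int = 0


COLUMNS = ("id", "text", "user_screen_name", "created_at",
           "likes", "retweets", "replies", "views")


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tweets (
            id INTEGER PRIMARY KEY,
            text TEXT NOT NULL,
            user_screen_name TEXT NOT NULL,
            created_at TEXT NOT NULL,
            likes INTEGER, retweets INTEGER, replies INTEGER, views INTEGER
        )
        """
    )
    return conn


def upsert_tweet(conn: sqlite3.Connection, t) -> None:
    """Insert a tweet, or refresh its text and metrics if the ID already exists."""
    values = tuple(getattr(t, column) for column in COLUMNS)
    conn.execute(
        """
        INSERT INTO tweets (id, text, user_screen_name, created_at,
                            likes, retweets, replies, views)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?)
        ON CONFLICT(id) DO UPDATE SET
            text = excluded.text,
            likes = excluded.likes,
            retweets = excluded.retweets,
            replies = excluded.replies,
            views = excluded.views
        """,
        values,
    )
    conn.commit()
```

Since `upsert_tweet` only reads attributes by name, it should accept any object exposing the fields listed in the table, which makes it a plausible fit for `mode="all"` or `mode="with_updates"`; list-valued fields such as media, tickers, and hashtags are left out here and would need their own tables or a serialized column.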

Citation ✍️

If you use this project in your research, please cite as follows:

@misc{project_name,
  author  = {Stephan Akkerman},
  title   = {X-Timeline Scraper},
  year    = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/StephanAkkerman/x-timeline-scraper}}
}

Contributing 🛠

Contributions are welcome! If you have a feature request, bug report, or proposal for code refactoring, please feel free to open an issue on GitHub. We appreciate your help in improving this project.
https://github.com/StephanAkkerman/x-timeline-scraper/graphs/contributors

License 📜

This project is licensed under the MIT License. See the LICENSE file for details.

About

Scrapes X (formerly Twitter) timeline using your account

Generated from StephanAkkerman/template