A Python-based scraper built with Telethon to collect data from Telegram private chats, groups, and channels.
The tool extracts messages, metadata, user information, and media, and provides an option for automatic translation of messages using deep_translator (Google Translate API or other supported services).
All collected data is stored in a pandas DataFrame and can be exported as CSV and/or JSON for further analysis.
- 📥 Scrape messages, metadata, media, and user details from Telegram
- 🌍 Automatic message translation (Google Translate, DeepL, etc. via
deep_translator) - 📊 Export data to CSV or work directly with a pandas DataFrame
- ⚡ Simple configuration and flexible usage
Clone the repository and install dependencies. It is recommended to use a Python virtual environment:
# Clone the repository and change into the directory
git clone https://github.com/sergejlembke/telegram-scraper.git
cd telegram-scraper
# Create and activate a virtual environment (recommended)
python -m venv venv
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txtRequired Python version: 3.10 or higher
-
Get your Telegram API credentials
- Create an app at my.telegram.org
- Copy your
api_idandapi_hash
-
Prepare your configuration
- Copy the provided
config.example.jsontoconfig.jsonin the project root:cp config.example.json config.json # On Windows: copy config.example.json config.json - Open
config.jsonand enter- your own Telegram API credentials
- the desired translation settings
- the export options (file format and append mode)
- the chat IDs you want to scrape
- Example
config.json:{ "api_id": "YOUR_API_ID", "api_hash": "YOUR_API_HASH", "phone_number": "YOUR_PHONE_NUMBER", "translation": { "translate": true, "source_language": "auto", "target_language": "en" }, "export_option": { "append": true, "format": ["json", "csv"] }, "chats": [ {"chat_name": "Example Private Channel", "chat_id": -100123456789}, {"chat_name": "Example Public Channel", "chat_id": "@ExamplePublicChannelName"}, {"chat_name": "Example Group Chat1", "chat_id": -100987654321}, {"chat_name": "Example Group Chat2", "chat_id": "@ExampleGroupName"}, {"chat_name": "Example Regular Chat1", "chat_id": 123456789}, {"chat_name": "Example Regular Chat2", "chat_id": "@ExampleUserName"} ] } - To get the chat ID, login to the web version of Telegram and open the desired chat.
- Click on the chat name at the top to open the chat info, where you can find the chat ID.
- Insert the chat ID in the config.json in the following formats:
- Private channels: '-100' followed by the channel ID (e.g., -100123456789)
- Public channels: '@channel-username' (e.g., @PublicChannelName)
- Group chats: '-100' followed by the group ID (e.g., -100987654321) or '@GroupName'
- Regular chats: user ID or username as a string (e.g., 123456789 or '@UserName')
- Copy the provided
-
Run the scraper
python main.py
CSV file structure:
| SENDER_NAME | SENDER_ID | MESSAGE_ID | DATE | MESSAGE | TRANSLATED_MESSAGE | MEDIA_PATH |
|---|---|---|---|---|---|---|
| PythonDev | 123456789 | 742 | 2025-09-02 10:44:22+00:00 | Ich arbeite mit Python | I work with Python | |
| JohnDoe | 987654321 | 743 | 2025-09-03 15:30:10+00:00 | Ja, ich auch, sieh dir meinen Screenshot an. | Yes, me too, have a look at my screenshot. | ./data/PythonLover/PythonLover_photo_742.jpg |
This project is licensed under the AGPL-3.0 License – see the LICENSE file for details.
This project uses the following third-party dependencies:
This tool is intended for educational and personal use only.
Please ensure that scraping and storing Telegram data complies with the Telegram Terms of Service and local privacy regulations.
Do not use this project for unauthorized data collection.
Pull requests, feature suggestions, and bug reports are welcome!
Feel free to fork the repository and adapt the tool to your needs.