src | ||
.gitignore | ||
README.md |
Pastebin Client
An exercise in Python package development and web scraping.
Because Pastebin does not offer an API unless you have a Pro account, this package scrapes HTML for its data.
The PastebinAPI.get_public_paste_list
method does not download full paste text to avoid
hammering the site. When the paste list is fetched, it will return a list of Paste
objects
with the following fields:
title
- Paste titlehref
- URL of the pastelang
- Paste languagefetched
- True if the full text and metadata have been fetched
To fetch the full text and metadata of a paste, pass the paste as the argument to
PastebinAPI.get_paste
. This will return a Paste
object with the following fields populated:
author
- Paste authorpub_date
- Publication datecategory
- Paste categorytext
- Full paste text
This workflow will change once I figure out a decent method to create a full Pastebin Client class that manages an internal paste list.
Usage
git clone https://git.juggalol.com/agatha/pastebin-client
cd pastebin-client/src
pip install -r requirements.txt
python main.py
Example in src/main.py simply fetches the public paste list and tests out the scraping functions.