pastebin-client/README.md
2023-09-12 14:05:27 -04:00

# Pastebin Client
An exercise in Python package development and web scraping.

Because Pastebin does not offer an API unless you have a Pro account, this package scrapes
HTML for its data.
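
As a rough illustration of what scraping a paste list involves, here is a minimal sketch using only the standard library. The HTML fragment is made up for the example and does not reflect Pastebin's real markup, and `PasteListParser` is a hypothetical name; the package's own parsing code may look quite different.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; Pastebin's real HTML differs.
SAMPLE_HTML = """
<table>
  <tr><td><a href="/abc123">Hello world</a></td><td>Python</td></tr>
  <tr><td><a href="/def456">Untitled</a></td><td>None</td></tr>
</table>
"""


class PasteListParser(HTMLParser):
    """Collect href, title and lang from each row of the sample table."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, one dict per <tr>
        self._row = None        # row currently being parsed
        self._cells = []        # text of each <td> in the current row
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {"href": None}
            self._cells = []
        elif tag == "td":
            self._in_cell = True
            self._cells.append("")
        elif tag == "a" and self._row is not None:
            self._row["href"] = dict(attrs).get("href")

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None and len(self._cells) >= 2:
            self._row["title"] = self._cells[0].strip()
            self._row["lang"] = self._cells[1].strip()
            self.rows.append(self._row)
            self._row = None


parser = PasteListParser()
parser.feed(SAMPLE_HTML)
for row in parser.rows:
    print(row)
```
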

To avoid hammering the site, the `PastebinAPI.get_public_paste_list` method does not download
the full text of each paste. It returns a list of `Paste` objects with the following fields:

- `title` - Paste title
- `href` - URL of the paste
- `lang` - Paste language
- `fetched` - True if the full text and metadata have been fetched

To fetch the full text and metadata of a paste, pass the paste as the argument to
`PastebinAPI.get_paste`. This will return a `Paste` object with the following fields populated:

- `author` - Paste author
- `pub_date` - Publication date
- `category` - Paste category
- `text` - Full paste text
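
The two-step workflow described above could be sketched as follows. To keep the example runnable offline, it uses a stand-in `Paste` dataclass and a `FakePastebinAPI` stub; both are assumptions that only mirror the field and method names in this README, and the URLs and field values are invented. The `fetch_details` helper, with its `limit` and `delay` parameters, is likewise hypothetical.

```python
import time
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass
class Paste:
    """Stand-in mirroring the fields listed above; not the package's class."""
    title: str
    href: str
    lang: str
    fetched: bool = False
    author: Optional[str] = None
    pub_date: Optional[str] = None
    category: Optional[str] = None
    text: Optional[str] = None


class FakePastebinAPI:
    """Offline stub exposing the two method names used in this README."""

    def get_public_paste_list(self) -> List[Paste]:
        # Made-up entries; the real method would scrape the public list.
        return [
            Paste("Hello world", "https://pastebin.com/abc123", "Python"),
            Paste("Untitled", "https://pastebin.com/def456", "None"),
        ]

    def get_paste(self, paste: Paste) -> Paste:
        # The real method would request paste.href and scrape that page.
        return replace(
            paste,
            author="a guest",
            pub_date="2023-09-12",
            category="None",
            text="print('hello')",
            fetched=True,
        )


def fetch_details(api, limit=1, delay=0.0):
    """Fetch the list, then fully fetch at most `limit` pastes.

    `delay` is a pause in seconds after each request, to stay polite.
    """
    pastes = api.get_public_paste_list()
    for i, paste in enumerate(pastes[:limit]):
        pastes[i] = api.get_paste(paste)
        time.sleep(delay)
    return pastes


for paste in fetch_details(FakePastebinAPI(), limit=1):
    print(paste.title, paste.fetched, paste.text)
```

Only pastes that are explicitly passed to `get_paste` end up with `fetched` set, which keeps the number of requests under the caller's control.
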

This workflow will change once I figure out a decent method to create a full Pastebin Client
class that manages an internal paste list.

## Usage
```shell
git clone https://git.juggalol.com/agatha/pastebin-client
cd pastebin-client/src
pip install -r requirements.txt
python main.py
```
The example in [src/main.py](src/main.py) simply fetches the public paste list and
exercises the scraping functions.

## Notes