37 lines
1.2 KiB
Markdown
37 lines
1.2 KiB
Markdown
# Pastebin Client
|
|
An exercise in Python package development and web scraping.
|
|
|
|
Because Pastebin does not offer an API unless you have a Pro account, this package scrapes
|
|
HTML for its data.
|
|
|
|
The `PastebinAPI.get_public_paste_list` method does not download full paste text to avoid
|
|
hammering the site. When the paste list is fetched, it will return a list of `Paste` objects
|
|
with the following fields:
|
|
- `title` - Paste title
|
|
- `href` - URL of the paste
|
|
- `lang` - Paste language
|
|
- `fetched` - True if the full text and metadata have been fetched
|
|
|
|
To fetch the full text and metadata of a paste, pass the paste as the argument to
|
|
`PastebinAPI.get_paste`. This will return a `Paste` object with the following fields populated:
|
|
- `author` - Paste author
|
|
- `pub_date` - Publication date
|
|
- `category` - Paste category
|
|
- `text` - Full paste text
|
|
|
|
This workflow will change once I figure out a decent method to create a full Pastebin Client
|
|
class that manages an internal paste list.
|
|
|
|
## Usage
|
|
```shell
|
|
git clone https://git.juggalol.com/agatha/pastebin-client
|
|
cd pastebin-client/src
|
|
pip install -r requirements.txt
|
|
python main.py
|
|
```
|
|
|
|
Example in [src/main.py](src/main.py) simply fetches the public paste list and
|
|
tests out the scraping functions.
|
|
|
|
## Notes
|