# Pastebin Client
An exercise in Python package development and web scraping.

Because Pastebin does not offer an API unless you have a Pro account, this package scrapes
HTML for its data.

To avoid hammering the site, the `PastebinAPI.get_public_paste_list` method does not download
full paste text. It returns a list of `Paste` objects with the following fields:
- `title` - Paste title
- `href` - URL of the paste
- `lang` - Paste language
- `fetched` - `True` if the full text and metadata have been fetched

To fetch the full text and metadata of a paste, pass the paste as the argument to
`PastebinAPI.get_paste`. This returns a `Paste` object with the following fields populated:
- `author` - Paste author
- `pub_date` - Publication date
- `category` - Paste category
- `text` - Full paste text

This workflow will change once I figure out a decent way to build a full Pastebin client
class that manages an internal paste list.

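The two-step workflow described above (list first, fetch full text only for the pastes you want) can be sketched roughly as follows. This is an offline illustration, not the package's actual code: the `Paste` dataclass is a stand-in built from the field names listed above, `StubPastebinAPI` is a hypothetical stub that returns canned data instead of scraping, and the URLs, author, and dates are made up. It also assumes `get_paste` fills in the same object it was given, and it stores `pub_date` as a string; the real package may do either differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Paste:
    """Stand-in for the package's Paste object (field names from this README)."""
    # Populated when the public paste list is fetched
    title: str
    href: str
    lang: str
    fetched: bool = False
    # Populated only after get_paste() has been called
    author: Optional[str] = None
    pub_date: Optional[str] = None  # assumption: the real type is not documented
    category: Optional[str] = None
    text: Optional[str] = None


class StubPastebinAPI:
    """Hypothetical offline stub; the real PastebinAPI scrapes HTML instead."""

    def get_public_paste_list(self) -> List[Paste]:
        # Listing data only, no full paste text (made-up entries)
        return [
            Paste(title="hello.py", href="https://pastebin.com/EXAMPLE1", lang="Python"),
            Paste(title="notes", href="https://pastebin.com/EXAMPLE2", lang="None"),
        ]

    def get_paste(self, paste: Paste) -> Paste:
        # Canned values stand in for whatever the paste page would contain
        paste.author = "example_user"
        paste.pub_date = "2024-01-01"
        paste.category = "None"
        paste.text = "print('hello world')\n"
        paste.fetched = True
        return paste


api = StubPastebinAPI()

# Step 1: fetch the lightweight listing
pastes = api.get_public_paste_list()

# Step 2: fetch full text and metadata only for the pastes of interest
wanted = [p for p in pastes if p.lang == "Python"]
for paste in wanted:
    full = api.get_paste(paste)
    print(full.title, full.author, len(full.text))
```

Filtering on the listing fields before calling `get_paste` keeps the number of page requests down, which is the point of splitting the workflow in two.
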
## Usage
```shell
git clone https://git.juggalol.com/agatha/pastebin-client
cd pastebin-client/src
pip install -r requirements.txt
python main.py
```

The example in [src/main.py](src/main.py) simply fetches the public paste list and
tests out the scraping functions.

## Notes