Update README.md

commit 668a77ade0 (parent 1af8a63717)

README.md | 63
@@ -1,6 +1,16 @@
# Harvester
Python package for harvesting commonly available data, such as free proxy servers.

## Running the Demo
If you just want proxies, run the demo code in [main.py](main.py):

```shell
git clone https://git.juggalol.com/agatha/harvester
cd harvester
pip install -r requirements.txt
mkdir proxies
python main.py
```

## Modules
### Proxy
#### fetch_list
@@ -10,56 +20,11 @@ It functions by running a regular expression against the HTTP response, looking
strings that match a `username:password@ip:port` pattern where username and password
are optional.
```python
from harvester.proxy import fetch_list


# Source URLs that serve plain-text proxy lists
URLS = [
    'https://api.openproxylist.xyz/socks4.txt',
    'https://api.openproxylist.xyz/socks5.txt',
    'https://api.proxyscrape.com/?request=displayproxies&proxytype=socks4',
]


def main():
    """Main entry point."""
    for url in URLS:
        # Fetch and print the proxies found at each source URL
        proxies = fetch_list(url)
        print(proxies)


if __name__ == '__main__':
    main()
```
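
The regular-expression matching described for `fetch_list` can be sketched roughly as follows. This is only an illustration: `PROXY_PATTERN` and `extract_proxies` are hypothetical names, and the pattern is an assumption about what a `username:password@ip:port` matcher might look like, not the expression harvester actually uses.

```python
import re

# Hypothetical pattern (an assumption, not taken from the harvester source):
# an optional "username:password@" prefix followed by a dotted IPv4 address and a port.
PROXY_PATTERN = re.compile(
    r'(?:(?P<username>[^\s:@]+):(?P<password>[^\s:@]+)@)?'
    r'(?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d{1,5})'
)


def extract_proxies(text):
    """Return every substring of text that looks like [username:password@]ip:port."""
    return [match.group(0) for match in PROXY_PATTERN.finditer(text)]


sample = '1.2.3.4:1080\nuser:pw@5.6.7.8:9050 junk'
print(extract_proxies(sample))
```

The pattern does not check that each octet is at most 255 or that the port is in range, so a stricter validator would still be needed before using the results.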
#### fetch_all
Proxies can be fetched from multiple source URLs by using the `fetch_all` function.

It takes a list of URLs and an optional `max_workers` parameter. Proxies will be fetched from
the source URLs concurrently using a `ThreadPoolExecutor`:
```python
from harvester.proxy import fetch_all


# Source URLs that serve plain-text proxy lists
URLS = [
    'https://api.openproxylist.xyz/socks4.txt',
    'https://api.openproxylist.xyz/socks5.txt',
    'https://api.proxyscrape.com/?request=displayproxies&proxytype=socks4',
]


def main():
    """Main entry point."""
    # Fetch from every source URL concurrently and print the combined result
    proxies = fetch_all(URLS)
    print(proxies)


if __name__ == '__main__':
    main()
```
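
For readers unfamiliar with `ThreadPoolExecutor`, the concurrent fetching described for `fetch_all` might look something like the sketch below. It is not harvester's implementation: `fetch_all_sketch`, the `fetch_one` callable, the default of 8 workers, and the `fake_fetch` stand-in are all assumptions made so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_sketch(urls, fetch_one, max_workers=8):
    """Fetch every URL concurrently and merge the results into one list.

    fetch_one is a hypothetical callable that takes a URL and returns a list
    of proxy strings; max_workers=8 is an arbitrary default for this sketch.
    """
    proxies = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map runs fetch_one in worker threads and yields results in input order
        for result in executor.map(fetch_one, urls):
            proxies.extend(result)
    return proxies


def fake_fetch(url):
    """Stand-in fetcher so the sketch runs offline."""
    return [f'{url}-proxy-1', f'{url}-proxy-2']


print(fetch_all_sketch(['a', 'b'], fake_fetch))
```

Because `executor.map` re-raises any exception from a worker when its result is consumed, a production version would probably want per-URL error handling so that one dead source does not abort the whole run.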
#### validate_socks
SOCKS5 proxies can be tested with the `validate_socks` method. The method takes a proxy
@@ -69,8 +34,14 @@ with no issues, otherwise it will raise an exception and the caller can decide h
For an example implementation, see [main.py](main.py).
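
The "raise on failure, let the caller decide" contract described above could be consumed along these lines. The signature of `validate_socks` is not visible in this diff, so the sketch takes the validator as a parameter; `filter_working` and `fake_validate` are hypothetical helpers, and catching a broad `Exception` is an assumption, since the exception types harvester raises are not shown here.

```python
def filter_working(proxies, validate):
    """Split proxies into those that validate cleanly and those that raise.

    validate is any callable that returns normally for a working proxy and
    raises for a broken one (hypothetical stand-in for validate_socks).
    """
    working, failed = [], []
    for proxy in proxies:
        try:
            validate(proxy)
        except Exception as exc:  # broad on purpose: real exception types are unknown here
            failed.append((proxy, exc))
        else:
            working.append(proxy)
    return working, failed


def fake_validate(proxy):
    """Stand-in validator: rejects anything that is not on port 1080."""
    if not proxy.endswith(':1080'):
        raise ConnectionError(f'cannot reach {proxy}')


working, failed = filter_working(['1.2.3.4:1080', '5.6.7.8:9'], fake_validate)
print(working)
print([proxy for proxy, _ in failed])
```

Keeping the exception alongside each failed proxy lets the caller log, retry, or discard it as it sees fit.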
## Testing
I was trying to get into the habit of writing unit tests, but god damn I hate them. There are
a few, but I don't plan on continuing any time soon.

```shell
pip install -r requirements.txt
pip install -r requirement-dev.txt
pytest -v
```

## Greets
Shoutouts to [acidvegas](https://git.supernets.org/acidvegas/). This project was inspired by
[proxytools](https://git.supernets.org/acidvegas/proxytools).