How to rotate proxies for a web crawler with Python

While scraping data you might need a large list of proxies to avoid blocks. We will show how you can use our API to rotate proxies easily while scraping. In this example we will use the requests library, which you can install with pip:
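pip install requests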

import requests
url = 'https://api.proxypage.io/v1/list'
payload = 'type=HTTPS&limit=5&latency=500&ssl=True&country=US'
headers = {
  'Content-Type': 'application/x-www-form-urlencoded',
  'api_key': 'YOUR_API_KEY'
}
# Send the filters as a form-encoded body; time out after 10 seconds
response = requests.request('GET', url, headers=headers, data=payload, allow_redirects=False, timeout=10)
print(response.status_code)

If you see 201, that means your request was successful.

This request will get you 5 US proxies with SSL support and latency lower than 500 ms.
You can pass your api_key either as a header or in the request body.
You can play around with the /list endpoint by constructing your own request:
https://api.proxypage.io/v1/list?api_key=YOUR_API_KEY&limit=1
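For example, here is the same request made from Python, passing api_key and limit as query parameters just as in the URL above:

import requests

# Mirror the example URL above: api_key and limit go in the query string
response = requests.get(
    'https://api.proxypage.io/v1/list',
    params={'api_key': 'YOUR_API_KEY', 'limit': 1},
    timeout=10,
)
print(response.json())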

For example, if you need to access Google through a proxy, you can use:

# Take the first proxy from the response and build an "ip:port" string
proxy = response.json()[0]
proxy_temp = proxy['ip'] + ':' + str(proxy['port'])
# Route HTTPS traffic through that proxy (add an "http" key for plain HTTP URLs)
proxy_dict = {"https": proxy_temp}
r = requests.get('https://google.com', proxies=proxy_dict)
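Since the first request returned five proxies, you can also rotate through the whole list yourself. Here is a minimal sketch, continuing from the /list response above, that cycles through the proxies with itertools.cycle (the target URLs are just placeholders):

from itertools import cycle

# Build "ip:port" strings for every proxy returned by /list
proxy_pool = cycle(p['ip'] + ':' + str(p['port']) for p in response.json())

for target in ['https://example.com', 'https://example.org', 'https://example.net']:
    proxy = next(proxy_pool)  # each request goes out through the next proxy in the pool
    r = requests.get(target, proxies={'https': proxy}, timeout=10)
    print(proxy, r.status_code)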

With our /v1/list endpoint you can set different parameters for the proxies that you need.

You can set type, which is your proxy type: HTTP, HTTPS, SOCKS4, SOCKS5, or CONNECT:25 (an SMTP proxy).

For limit, set an integer that tells us how many proxies you need. Our users usually set a limit to avoid using too many credits.

With latency you can set an integer that will filter out all proxies with a latency higher than specified.

ssl is a boolean parameter that lets you filter for proxies that do or don't support SSL.

is_anonymous is also a boolean parameter, which you can use to filter for anonymous proxies.

country is a parameter that you can use to request proxies from a specific country.
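Putting those parameters together, here is a minimal sketch requesting ten anonymous SOCKS5 proxies from Germany with latency under 300 ms. The parameter names are as documented above; the values are just an example, and it assumes the endpoint accepts the same filters as query parameters, as the example URL earlier suggests:

import requests

params = {
    'api_key': 'YOUR_API_KEY',
    'type': 'SOCKS5',        # proxy type
    'limit': 10,             # number of proxies to return
    'latency': 300,          # only proxies with latency below 300 ms
    'is_anonymous': 'True',  # only anonymous proxies
    'country': 'DE',         # only proxies located in Germany
}
response = requests.get('https://api.proxypage.io/v1/list', params=params, timeout=10)
print(response.json())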

To bring it all together, let's imagine a situation where you need a simple function that finds a proxy for you, such that every time you call it, it gives you a single US HTTPS proxy with latency below 500 ms as a string.

import requests

def get_us_ssl():
    # Request one US HTTPS proxy with SSL support and latency under 500 ms
    api_endpoint = 'https://api.proxypage.io/v1/list'
    headers = {
        'Content-Type': 'application/x-www-form-urlencoded',
        'api_key': 'YOUR_API_KEY'
    }
    payload = 'type=HTTPS&limit=1&latency=500&ssl=True&country=US'
    response = requests.get(api_endpoint, headers=headers, data=payload, timeout=10)
    # Return the proxy as an "ip:port" string
    proxy = response.json()[0]
    return proxy['ip'] + ':' + str(proxy['port'])
    
    

# Each call to get_us_ssl() fetches a fresh proxy, so every request uses a different one
r = requests.get('https://proxypage.io', proxies={"https": get_us_ssl()})
r2 = requests.get('https://proxypage.io', proxies={"https": get_us_ssl()})
r3 = requests.get('https://proxypage.io', proxies={"https": get_us_ssl()})
    

Thus, with this simple function, we can easily scrape while rotating our proxies.

If you take advantage of our API, you can easily scrape data without getting blocked. Our proxy list is continuously updated so that only working proxies remain. You can also subscribe to our alien plan to simplify this process further, as your dedicated forward proxy will rotate them for you.