Getting status code 403 most of the time even when using rotating proxies and user agents in Python


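There is a link, https://et.interac.ca/sh/d7f807a0e70, that returns status code 200 whenever I open it in a browser. From a script, it stops working after 2-3 attempts, and I have to change the proxy to access the site again. I used premium rotating proxies and added different user agents in Python to request the URL and read its status code. No matter how many times I try, I keep getting error code 403.

Update: it does sometimes return status code 200, but only about 1 time in 10; the rest of the time it returns 403.

Here is the code: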

import random
import re
from collections import OrderedDict

import requests


username = "xxxxxxx"
password = "xxxxxxx"
PROXY_RACK_DNS = "xxxxxxx"
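# requests selects a proxy by the scheme of the target URL, so the https://
# URL used below only goes through the proxy if there is an "https" entry.
proxy_url = "http://{}:{}@{}".format(username, password, PROXY_RACK_DNS)
proxy = {"http": proxy_url, "https": proxy_url}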

headers_list = [
    # Chrome 89 Windows
    {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        "Referer": "https://www.google.com/",
        "DNT": "1",
        "Connection": "keep-alive",
        "Upgrade-Insecure-Requests": "1",
    },
    # Firefox 77 Windows
    {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0",
        "Accept-Encoding": "gzip,deflate,br",
    },
    # Chrome 83 Mac
    {
        "Connection": "keep-alive",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "sec-fetch-site": "none",
        "sec-fetch-mode": "navigate",
        "Sec-Fetch-Dest": "document",
        "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
    },
    # Chrome 83 Windows
    {
        "Connection": "keep-alive",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36",
        "sec-fetch-site": "same-origin",
        "Sec-Fetch-User": "?1",
        "Accept-Language": "en-US,en;q=0.9",
    },
]
# Create ordered dict from Headers above
ordered_headers_list = list()
for headers in headers_list:
    h = OrderedDict()
    for header,value in headers.items():
        h[header]=value
    ordered_headers_list.append(h)
    
    
string1="https://et.interac.ca/sh/d7f807a0e70"
error_message = 'nothing found'

s = 'Request unsuccessful'


try:
    print(string1)
    # Pick from the ordered copies built above so header order is preserved.
    headers = random.choice(ordered_headers_list)
    response = requests.get(string1,headers=headers,proxies=proxy)
#     print(response.text)
    print(response.status_code)
#     result = re.search(s,response.text)
#     print("result",result)
    if (response.status_code==200):
        result = re.search(s,response.text)
        print(string1)
        print(response.status_code)
        print("result",result)
        if error_message in response.text or not response.text:
            print('Bad response')
        else: 
            print(response.text)

except requests.ConnectionError:
    print("Failed to connect")

print("END")
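
Two things may be worth checking in a setup like this; the sketch below illustrates them under stated assumptions. First, requests chooses a proxy by the scheme of the target URL, so an https:// URL only goes through the proxy when the mapping has an "https" key. Second, since the site answers 200 only some of the time, a small retry loop that switches header sets and waits between attempts may raise the success rate. The helper names `build_proxies`, `fetch_with_retries`, and `make_requests_sender`, the retry count, the timeout, and the delays are all illustrative assumptions, not part of the original code, and nothing here is guaranteed to get past the site's bot protection.

```python
import random
import time


def build_proxies(username, password, host):
    """Return a requests-style proxy mapping that covers both URL schemes."""
    proxy_url = "http://{}:{}@{}".format(username, password, host)
    # requests looks the proxy up by the scheme of the target URL, so an
    # https:// target needs its own "https" entry.
    return {"http": proxy_url, "https": proxy_url}


def fetch_with_retries(send, url, headers_pool, max_attempts=5,
                       base_delay=1.0, sleep=time.sleep):
    """Call send(url, headers) until it returns status 200 or attempts run out.

    `send` is any callable returning an object with a `status_code` attribute.
    The last response is returned, which may still be a non-200 one.
    """
    response = None
    for attempt in range(max_attempts):
        headers = random.choice(headers_pool)  # new header set per attempt
        response = send(url, headers)
        if response.status_code == 200:
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return response


def make_requests_sender(proxies, timeout=15):
    """Build a `send` callable backed by the third-party requests package."""
    import requests  # imported lazily so the retry logic runs without it

    def send(url, headers):
        # A fresh Session per attempt means no pooled connection is reused.
        # Whether the gateway then assigns a new exit IP depends on the
        # proxy provider (assumption).
        with requests.Session() as session:
            return session.get(url, headers=headers,
                               proxies=proxies, timeout=timeout)

    return send
```

With the variables from the question, a call could look like `fetch_with_retries(make_requests_sender(build_proxies(username, password, PROXY_RACK_DNS)), string1, headers_list)`. Passing `send` in as a parameter keeps the retry logic separate from the HTTP client, so it can be tried with a stub before any real request is made.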
