How to scrape a page that requires login and uses a redirect in Python
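I am trying to log in to a financial service of which I am a customer, so that I can retrieve some data automatically with Python requests.
I was inspired by this page: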
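import requests
from lxml import html
from typing import Dict

# URL was not defined in the original snippet; this value is an assumption
# taken from the "saved from url" comment in the page source shown later.
URL = "https://somewebsite.com/scripts/customer.cgi?option=login"


def get_payload(username: str, password: str) -> Dict[str, str]:
    """Return the dictionary of login credentials."""
    return {"USERNAME": username, "PASSWORD": password, "option": "login"}


session_requests = requests.session()
result_login = session_requests.post(
    URL,
    data=get_payload("myusername", "MyPasswordSuperSafe"),
    headers=dict(referer=URL),
)
# Parse the login response (the original referenced an undefined `result`)
tree = html.fromstring(result_login.text)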
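I can send the username and password and submit the login. However, the site uses what I believe is some kind of security mechanism: it performs an automatic redirect (see the screenshot).
I don't know how to handle it, and my Python scraper ends up timing out.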
<!DOCTYPE html>
<!-- saved from url=(0059)https://somewebsite.com/scripts/customer.cgi?option=login -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><Meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<script language="JavaScript">
function redirect() {
top.location.href = 'https://somewebsite.com/scripts/customer.cgi/SC/';
}
</script>
<Meta http-equiv="X-UA-Compatible" content="IE=edge">
<Meta name="author" content="RL360">
<Meta name="copyright" content="RL360">
<link href="./Online services redirection_files/screen.css" rel="styleSheet" media="screen">
<link href="./Online services redirection_files/print.css" rel="styleSheet" media="print">
<!--[if lte IE 8]>
<link href="https://somewebsite.com/scripts/customer.cgi/SF/stylesheets/desktop/ie8fix.css" rel="stylesheet" type="text/css" />
<![endif]-->
<script>
function setCookie(cname,cvalue,exdays,path) {
var d = new Date();
d.setTime(d.getTime() + (exdays * 24 * 60 * 60 * 1000));
var expires = "expires="+d.toUTCString();
document.cookie = cname + "=" + cvalue + ";" + expires + ";path=" + path;
}
function getCookie(cname) {
var name = cname + "=";
var ca = document.cookie.split(';');
for(var i = 0; i < ca.length; i++) {
var c = ca[i];
while (c.charAt(0) == ' ') {
c = c.substring(1);
}
if (c.indexOf(name) == 0) {
return c.substring(name.length,c.length);
}
}
return "";
}
</script>
<title>Online services redirection</title>
<link href="./Online services redirection_files/css" rel="stylesheet"></head><span id="warning-container"><i data-reactroot=""></i></span>
<body onload="redirect();" style="background-color: #ffffff;">
<div id="mainarea">
<div id="title"></div>
<!-- main content -->
<form action="https://somewebsite.com/scripts/customer.cgi/SC/" name="redirform" method="POST">
<div class="level1" style="width: 700px; margin-left: 123px; height: auto;"><h2>Online services redirection</h2>
<p><a href="https://somewebsite.com/scripts/customer.cgi/SC/" target="_top">Attempting to redirect, please click here if nothing happens after 30 seconds.</a></p>
</div>
</form>
</div>
</body></html>
How can I handle this kind of redirect?
I am open to using requests, mechanize, BeautifulSoup, or any other solution (though I would prefer to avoid selenium if possible).
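For illustration, here is one possible way to approach it without a browser. The page above redirects on the client side (the `onload` handler sets `top.location.href`), so requests will not follow it automatically the way it follows HTTP 3xx redirects. A minimal sketch, assuming the target URL always appears as a quoted string assigned to `location` or `location.href`: pull the URL out of the response with a regular expression and request it with the same session so the login cookies are sent along. The names `extract_js_redirect` and `follow_js_redirect` are hypothetical helpers, not part of requests, and whether the real site accepts this is untested.

```python
import re
from typing import Optional
from urllib.parse import urljoin

# Matches e.g.  top.location.href = 'https://example.com/next';
_JS_REDIRECT = re.compile(r"""location(?:\.href)?\s*=\s*['"]([^'"]+)['"]""")


def extract_js_redirect(page_html: str, base_url: str) -> Optional[str]:
    """Return the absolute URL a JavaScript redirect points to, or None."""
    match = _JS_REDIRECT.search(page_html)
    if match is None:
        return None
    # urljoin resolves relative targets and leaves absolute ones unchanged.
    return urljoin(base_url, match.group(1))


def follow_js_redirect(session, response, timeout: float = 30.0):
    """Fetch the redirect target with the same session so cookies are reused.

    `session` is expected to be a requests.Session and `response` a
    requests.Response. If no redirect is found, `response` is returned as-is.
    """
    target = extract_js_redirect(response.text, response.url)
    if target is None:
        return response
    return session.get(target, headers={"referer": response.url}, timeout=timeout)
```

With the snippet from the question this would be called as `result_page = follow_js_redirect(session_requests, result_login)`. Passing an explicit `timeout` makes a stalled request fail with an exception instead of hanging. If the site also depends on cookies set by JavaScript, this approach may not be enough.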