如果希望登录状态一直保持,则需要进行Cookie处理。进行Cookie处理的一种常用思路如下:
- 导入Cookie处理模块http.cookiejar
- 使用http.cookiejar.CookieJar()创建CookieJar对象
- 使用HTTPCookieProcessor创建cookie处理器,并以其为参数构建opener对象
- 创建全局默认的opener对象
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
|
import urllib.request import urllib.parse import http.cookiejar
url1 = "http://bbs.chinaunix.net/member.php?mod=logging&action=login&loginsubmit=yes&loginhash=LhGEr" postdata = urllib.parse.urlencode({ "username": "", "password": "" }).encode("utf-8") req = urllib.request.Request(url1, postdata) req.add_header("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/604.4.7 (KHTML, like Gecko) Version/11.0.2 Safari/604.4.7")
cjar = http.cookiejar.CookieJar()
cookie = urllib.request.HTTPCookieProcessor(cjar) opener = urllib.request.build_opener(cookie)
urllib.request.install_opener(opener)
data1 = opener.open(req).read() file1 = open("/Users/matianyao/Desktop/login.html", "wb") file1.write(data1) file1.close()
url2 = "http://bbs.chinaunix.net" data2 = urllib.request.urlopen(url2).read() file2 = open("/Users/matianyao/Desktop/crawler.html", "wb") file2.write(data2) file2.close()
|
近期评论