Assignment 12.3

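Repeat the previous exercise using urllib: (1) retrieve the document from a URL; (2) display up to 3000 characters; (3) count the total number of characters in the document. There is no need to worry about the headers here; just show the first 3000 characters of the document contents.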

The code is as follows:

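import codecs
import re
import urllib.request

url = input('Enter - ')
# Accept only URLs of the form http://name.name.name/...
if re.search(r'^http://[a-zA-Z0-9]+\.[a-zA-Z0-9]+\.[a-zA-Z0-9]+/', url):
    try:
        web = urllib.request.urlopen(url)
    except Exception:
        print(url, 'is not a correct server')
        raise SystemExit(1)
    # Decode incrementally so a character split across two reads is not broken
    # (assumes the document is UTF-8, as the original code did)
    decoder = codecs.getincrementaldecoder('utf-8')(errors='replace')
    count = 0
    while True:
        data = web.read(3000)
        if len(data) < 1:
            break
        text = decoder.decode(data)
        # Print only until 3000 characters have been shown in total
        if count < 3000:
            print(text[:3000 - count], end='')
        count = count + len(text)
    print()
    print('The total count of this web is', count)
else:
    print('The URL that you input is badly formatted')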
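
The counting logic can also be tried without a network connection. The sketch below is not part of the original solution. It moves the read loop into a hypothetical helper, `preview_and_count` (the name and its `limit`, `chunk_size`, and `encoding` parameters are assumptions made for illustration), which accepts any object with a `read` method. An in-memory `io.BytesIO` stream stands in for the `urlopen` response, and an incremental decoder is used so that a multi-byte character split across two reads is still counted once.

```python
import codecs
import io


def preview_and_count(stream, limit=3000, chunk_size=3000, encoding='utf-8'):
    """Return (first `limit` characters, total character count) of a byte stream.

    Hypothetical helper for illustration; `stream` is any object with read(n).
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors='replace')
    preview = []
    count = 0
    while True:
        data = stream.read(chunk_size)
        # final=True on the empty read flushes any incomplete trailing bytes
        text = decoder.decode(data, final=len(data) < 1)
        if count < limit:
            # Keep only as many characters as are still missing from the preview
            preview.append(text[:limit - count])
        count += len(text)
        if len(data) < 1:
            break
    return ''.join(preview), count


if __name__ == '__main__':
    # A fake 6000-character document in place of a real HTTP response
    fake_document = io.BytesIO(('abc' * 2000).encode('utf-8'))
    shown, total = preview_and_count(fake_document)
    print(len(shown), total)
```

Because the helper only relies on `read`, the object returned by `urllib.request.urlopen(url)` could be passed in the same way as the `BytesIO` stream used here.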