Port 80 is the HTTP port, used to exchange HTML with the browser.
    import socket

    # Creating a socket is like opening a file.
    # AF_INET means we are making an Internet socket.
    # SOCK_STREAM means we are making a stream socket.
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Create a socket connection between our program and port 80 on www.py4inf.com.
    mysock.connect(('www.py4inf.com', 80))
A URL such as http://www.dr-chuck.com/page1.htm has three parts: the protocol (http), the host (www.dr-chuck.com), and the document (/page1.htm).
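The three labels (protocol, host, document) can be pulled out of a URL with the standard library. This is a minimal sketch; the helper name `url_parts` is made up for illustration.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Return (protocol, host, document) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path

protocol, host, document = url_parts('http://www.dr-chuck.com/page1.htm')
print(protocol)  # http
print(host)      # www.dr-chuck.com
print(document)  # /page1.htm
```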
Clicking the "Second Page" link is just another socket connection.
You can fetch a page's content with telnet plus a GET request (Windows 7 does not include telnet by default).
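What gets typed into telnet is plain text: a request line, optional headers, and a blank line. The sketch below composes that text in Python. The helper name `build_get_request` is hypothetical, and the `Host` header is an addition of this sketch (HTTP/1.0 does not require it, but many servers that host several sites expect it).

```python
def build_get_request(host, path):
    """Compose the text of a minimal HTTP/1.0 GET request."""
    lines = [
        f'GET {path} HTTP/1.0',  # request line: method, document, protocol version
        f'Host: {host}',         # assumption: added for servers hosting several sites
        '',                      # the empty line that ends the headers
        '',
    ]
    # HTTP separates lines with CRLF ('\r\n').
    return '\r\n'.join(lines)

request = build_get_request('www.dr-chuck.com', '/page1.htm')
print(repr(request))
```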
Every page visit involves a dozen or two GET requests: a GET for the HTML, a GET for the CSS, a GET for each image, and so on.
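To see where the extra GETs come from, the sketch below scans an HTML page for stylesheets, scripts, and images, each of which a browser would fetch separately. The class name, the helper `count_gets`, and the sample page are all made up for illustration; a real browser fetches more kinds of resources than these three.

```python
from html.parser import HTMLParser

class ResourceCollector(HTMLParser):
    """Collect URLs of resources a browser would fetch with extra GETs."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ('img', 'script') and attrs.get('src'):
            self.resources.append(attrs['src'])
        elif tag == 'link' and attrs.get('rel') == 'stylesheet' and attrs.get('href'):
            self.resources.append(attrs['href'])

# Hypothetical page, not taken from any real site.
SAMPLE_HTML = """
<html><head>
<link rel="stylesheet" href="style.css">
<script src="app.js"></script>
</head><body>
<img src="logo.png">
<img src="photo.jpg">
</body></html>
"""

def count_gets(html):
    """Return (number of GETs, list of extra resources) for one page."""
    collector = ResourceCollector()
    collector.feed(html)
    # One GET for the HTML itself plus one per referenced resource.
    return 1 + len(collector.resources), collector.resources

total, resources = count_gets(SAMPLE_HTML)
print(total, resources)
```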
    import socket

    # Creating a socket is like opening a file.
    # AF_INET means we are making an Internet socket.
    # SOCK_STREAM means we are making a stream socket.
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Create a socket connection between our program and port 80 on www.py4inf.com.
    mysock.connect(('www.py4inf.com', 80))
    toSend = 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n'
    mysock.send(toSend.encode('ascii'))
    while True:
        # 65 is the buffer size: read at most 65 bytes per call,
        # which also sets how much data is printed at a time.
        data = mysock.recv(65)
        if len(data) < 1:
            break
        print(data)
    mysock.close()
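The chunks returned by `recv()` contain the response headers as well as the document. If the chunks are joined into one bytes object, the headers can be separated from the body at the first blank line. This is a rough sketch: the helper name `split_response` and the sample response are made up, and no network access is needed to try it.

```python
def split_response(raw):
    """Split raw HTTP response bytes into (status_line, headers, body)."""
    # Headers end at the first empty line (CRLF CRLF).
    head, _, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('ascii', errors='replace').split('\r\n')
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body

# Made-up response standing in for b''.join(chunks) from the recv() loop.
SAMPLE = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: text/plain\r\n'
          b'Content-Length: 9\r\n'
          b'\r\n'
          b'But soft\n')

status, headers, body = split_response(SAMPLE)
print(status)
print(headers)
print(body)
```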
In Python 3, send() expects bytes rather than str, so convert the request with encode() as follows:
    toSend = 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n'
    mysock.send(toSend.encode('ascii'))
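The conversion can be checked without opening a socket. The short sketch below shows the types on both sides of `encode()` and the reverse step, `decode()`, which is what you would apply to data coming back from `recv()`. The `chunk` value is a made-up stand-in for received data.

```python
to_send = 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n'
payload = to_send.encode('ascii')   # str -> bytes, the type send() accepts

print(type(to_send).__name__)
print(type(payload).__name__)

# Data returned by recv() is bytes as well; decode() turns it back into str.
chunk = b'HTTP/1.1 200 OK'          # made-up stand-in for recv() output
text = chunk.decode('ascii')
print(text)
```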
socket is closer to the low level than urllib; in other words, urllib is simpler to use.
socket works at the Transport Layer; urllib works at the Application Layer.
Note: Python 2.x uses import urllib, but Python 3.x uses import urllib.request.
    import urllib.request

    fhand = urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
    for line in fhand:
        print(line.strip())
    import urllib.request

    fhand = urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
    counts = dict()
    for line in fhand:
        words = line.split()
        for word in words:
            counts[word] = counts.get(word, 0) + 1
    print(counts)
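One detail worth noting: in Python 3, urlopen() yields each line as bytes, so the keys of `counts` come out as `b'the'` rather than `'the'`. A possible variation, sketched below, decodes each line first and then lists the most frequent words. The helper names and the two sample lines are assumptions for illustration, so the sketch runs without network access.

```python
def count_words(lines):
    """Count words in an iterable of lines (bytes or str)."""
    counts = dict()
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode('utf-8')  # assumption: the text is UTF-8/ASCII
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts

def top_words(counts, n=3):
    """Return the n most frequent words, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Sample lines standing in for what urlopen() would yield.
SAMPLE_LINES = [
    b'But soft what light through yonder window breaks\n',
    b'It is the east and Juliet is the sun\n',
]

counts = count_words(SAMPLE_LINES)
print(top_words(counts, 2))
```

With a live connection, `count_words(fhand)` would accept the object returned by urlopen() directly.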
subtlety 微妙