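Today I learned: on 51cto blogs you have to switch to the rich-text editor, select the language first, and only then paste the code for the indentation to come out right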

Date: 2020-01-06



#!/usr/bin/python3
#coding=UTF-8
import requests
from bs4 import BeautifulSoup

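'''
Goal: [small Python project] Collect the URLs of the free (non-VIP) pages that correspond to paid VIP articles on the biancheng.net programming site. For example, the paid article http://c.biancheng.net/view/vip_6005.html has the free address http://c.biancheng.net/view/5315.html. The site has quite a few VIP articles, but their titles can still be found through Baidu, so I want to crawl the titles and URLs of all such pages. Later, when I come across a VIP article, I can search by its title to get the non-VIP link.
Written: 2019-10-18
Author: xiaoxiaohui
Note: Python 3 program; best run on Linux, because Windows has the GBK encoding problem
'''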

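def get_biaoti(url):
    # Return the text of the first <h1> on the page, or '' if the page cannot be fetched or has no <h1>
    try:
        response = requests.get(url, timeout=10)
    except requests.RequestException:
        return ''
    response.encoding = 'utf-8'  # without this the Chinese text is garbled or raises an error; see https://www.cnblogs.com/supery007/p/8303472.html
    soup = BeautifulSoup(response.text, 'html.parser')
    links_div = soup.find_all('h1')
    return links_div[0].text if links_div else ''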

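# Explicit UTF-8 avoids depending on the platform default encoding (GBK on Chinese Windows)
with open("a1.txt", 'a', encoding='utf-8') as f:
    for yema in range(1, 500):  # page IDs 1 through 499
        url = 'http://c.biancheng.net/view/' + str(yema) + '.html'
        biaoti = get_biaoti(url)
        print(url, biaoti)
        f.write(url + '\t' + biaoti + '\n')  # one "url<TAB>title" record per line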

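The code above was posted by switching to the rich-text editor ----> then selecting the language -----> then pasting the code; only in that order does the indentation come out right.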
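
The docstring in the script mentions looking up a VIP article by title later on, but the post does not show that step. Below is a minimal sketch of how it might look, assuming a1.txt holds one "url<TAB>title" record per line as the script writes it. The function name find_by_title and the sample records are hypothetical and only for illustration.

```python
import io


def find_by_title(lines, keyword):
    """Return (url, title) pairs whose title contains keyword.

    `lines` is any iterable of "url<TAB>title" strings, e.g. an open a1.txt.
    """
    matches = []
    for line in lines:
        line = line.rstrip('\n')
        if '\t' not in line:
            continue  # skip lines that are not in the expected format
        url, title = line.split('\t', 1)
        if keyword in title:
            matches.append((url, title))
    return matches


# Made-up records in the same format as a1.txt, kept in memory for the demo
sample = io.StringIO(
    "http://c.biancheng.net/view/1.html\tExample title A\n"
    "http://c.biancheng.net/view/2.html\tExample title B\n"
)
print(find_by_title(sample, "title B"))
```

With the real file, the call would be along the lines of `with open("a1.txt", encoding="utf-8") as f: find_by_title(f, "some title text")`. A plain substring match is used here for simplicity; it is case-sensitive and does no fuzzy matching.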