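GitHub Trending is a page I check almost every day. It surfaces promising projects on GitHub in near real time; in effect, it is a leaderboard of daily star gains.
However, because GitHub Trending is updated continuously, you will still miss some projects you would have liked, no matter how often you visit. Many people have come up with their own solutions, for example josephyzhou, whose github-trending project has become quite popular. I read through his source code (Go), found the implementation fairly simple, and rewrote it in Python, which turned out to need much less code. See my github-trending for details.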
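The core is a single job that runs in three steps:
1. Create a file named <date>.md.
2. Visit the GitHub Trending page, scrape the trending list for each language of interest, and write it to the md file.
3. Git add + commit + push.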
import codecs
import datetime
import os

import requests
from pyquery import PyQuery as pq


def job():
    strdate = datetime.datetime.now().strftime('%Y-%m-%d')
    filename = '{date}.md'.format(date=strdate)

    # create markdown file
    createMarkdown(strdate, filename)

    # write markdown, one section per language
    scrape('python', filename)
    scrape('swift', filename)
    scrape('javascript', filename)
    scrape('go', filename)

    # git add commit push
    git_add_commit_push(strdate, filename)


def createMarkdown(date, filename):
    # start a fresh file whose first line is the date heading
    with open(filename, 'w') as f:
        f.write("###" + date + "\n")


def scrape(language, filename):
    # browser-like headers for the request
    HEADERS = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0) Gecko/20100101 Firefox/11.0',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Encoding': 'gzip,deflate,sdch',
        'Accept-Language': 'zh-CN,zh;q=0.8'
    }

    url = 'https://github.com/trending/{language}'.format(language=language)
    r = requests.get(url, headers=HEADERS)
    assert r.status_code == 200

    # these selectors depend on the page markup at the time of writing
    d = pq(r.content)
    items = d('ol.repo-list li')

    # codecs to solve the problem utf-8 codec like chinese
    with codecs.open(filename, "a", "utf-8") as f:
        f.write('\n####{language}\n'.format(language=language))

        for item in items:
            i = pq(item)
            title = i("h3 a").text()
            owner = i("span.prefix").text()  # parsed but not written out
            description = i("p.col-9").text()
            url = i("h3 a").attr("href")
            url = "https://github.com" + url
            f.write(u"* [{title}]({url}):{description}\n".format(title=title, url=url, description=description))


def git_add_commit_push(date, filename):
    # stage the new file, commit with the date as the message, then push
    cmd_git_add = 'git add {filename}'.format(filename=filename)
    cmd_git_commit = 'git commit -m "{date}"'.format(date=date)
    cmd_git_push = 'git push -u origin master'

    os.system(cmd_git_add)
    os.system(cmd_git_commit)
    os.system(cmd_git_push)
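The git step above builds shell command strings and ignores their exit codes. As a hedged alternative, and not the author's implementation, the same three commands could be issued through `subprocess.run` with argument lists, so that a failed step stops the job. The names `build_git_commands` and `git_add_commit_push_checked` and the `branch` parameter are hypothetical, introduced only for this sketch.

```python
import subprocess


def build_git_commands(date, filename, branch="master"):
    """Return the three git invocations as argument lists (no shell involved)."""
    return [
        ["git", "add", filename],
        ["git", "commit", "-m", date],
        ["git", "push", "-u", "origin", branch],
    ]


def git_add_commit_push_checked(date, filename, branch="master"):
    """Run add, commit and push in order, stopping at the first failure."""
    for cmd in build_git_commands(date, filename, branch):
        # check=True raises subprocess.CalledProcessError on a non-zero exit code
        subprocess.run(cmd, check=True)
```

Note that `git commit` exits with a non-zero code when there is nothing to commit, so this variant would raise in that case instead of silently going on to push.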
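With the code written, it is time to deploy. You can of course run it on your own computer, but this is a daily scheduled task, so the machine can never be switched off, which is awkward. A better option is to deploy to a VPS. I won't recommend a particular provider; there are only a handful, so take your pick. Before deploying, remember to add the VPS's SSH key to the trusted keys of your GitHub account, so that the script can push without trouble.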
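Since the job has to fire once a day, one possible way to keep it running on the VPS is a small loop that sleeps until the next run time. This is only a sketch under assumptions: `seconds_until`, `run_daily`, and the default 08:00 run time are hypothetical and not taken from the original project.

```python
import datetime
import time


def seconds_until(hour, minute, now=None):
    """Seconds from `now` until the next occurrence of hour:minute (local time)."""
    now = now or datetime.datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # today's slot has already passed, so aim for tomorrow
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()


def run_daily(task, hour=8, minute=0):
    """Call `task()` once a day at hour:minute, forever."""
    while True:
        time.sleep(seconds_until(hour, minute))
        task()
```

With the `job` function from the script, `run_daily(job)` would block and run the scrape once a day. A cron entry that invokes the script is an equally reasonable choice.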
$ git clone https://github.com/bonfy/github-trending.git
$ cd github-trending
$ pip install -r requirements.txt
$ python scraper.py
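There is one more perk, just between us: because the code runs on a schedule every day, it commits to GitHub every day. Imagine how nice the commit graph on your GitHub profile will look a year from now! So go star my project and get started!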
Project URL: https://github.com/bonfy/github-trending
Stars are welcome!