Wait no longer! Create RSS feeds for all websites you care about and read them from the comfort of your feed reader.
如今愈来愈多的网站都不支持 RSS 订阅了,而做为 RSS 的忠实粉丝,仍是但愿有个工具能够将本身关注的网站内容聚合在一块儿,而后实时推送到手机上,及时获取最新消息和新闻动态。php
因此今天就让咱们用 2 个小时,撸一个 RSS 生成器。html
本文的主角仍然是 Laravel。node
因为须要有一个后台,添加咱们关注的网站,因此咱们仍是沿用 laravel-damin 插件。laravel
// 1. 建立 Laravel 5.5版本项目 composer create-project --prefer-dist laravel/laravel:5.5 lrss cd lrss cp .env.example .env php artisan key:generate // 2. 使用 laravel-admin 插件 composer require encore/laravel-admin "1.5.*" php artisan vendor:publish --provider="Encore\Admin\AdminServiceProvider" php artisan admin:install
注:如出现问题:SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes git
解决方案:在 AppServiceProvider.php
加入默认字符串长度github
use Illuminate\Support\Facades\Schema; public function boot() { Schema::defaultStringLength(191); }
刚巧,咱们想借助 Symfonys 提供的 DomCrawler
插件,来解析网站的 xpath 信息,发现 laravel-admin
插件有引入:web
以前有想借助 huginn 这个神器来生成咱们的 RSS Feed,主要参看文章:让全部网页变成RSS —— Huginn http://git.huginn.cn/docs/%E8%AE%A9%E6%89%80%E6%9C%89%E7%BD%91%E9%A1%B5%E5%8F%98%E6%88%90RSS%E2%80%94%E2%80%94Huginn.htmljson
但在实际使用中,发现一直出现 Huginn 无端宕机,或者后台 jobs 动不动就失败。这才有了本身撸个工具的想法。数组
但 Huginn 给了我灵感,能够利用解析 XPath 来生成 RSS Feed。bash
为了验证输入的 XPath 信息的准确性,咱们能够参考 Huginn,
首先在 Huginn 测试 XPath 的效果,在建立 WebsiteAgent 界面,输入以下信息:
{ "expected_update_period_in_days": "2", "url": "http://www.woshipm.com/", "type": "html", "mode": "on_change", "extract": { "title": { "xpath": "//div[@class=\"postlist-item u-clearfix\"]/div[2]/h2/a/text()", "value": "normalize-space(.)" }, "desc": { "xpath": "//div[@class=\"postlist-item u-clearfix\"]/div[2]/p/text()", "value": "normalize-space(.)" }, "url": { "xpath": "//div[@class=\"postlist-item u-clearfix\"]/div[2]/h2/a", "value": "@href" } } }
而后点 「Dry Run」便可测试:
最后根据 Huginn 填入的信息,咱们来建立 Xpath Controller
// bash php artisan make:model Xpath -m // migration public function up() { Schema::create('xpaths', function (Blueprint $table) { $table->increments('id'); // url $table->string('url', 250); $table->string("urldesc", 250); // title $table->string('titlexpath', 250); $table->string('titlevalue', 100) ->nullable(); // desc $table->string('descxpath', 250); $table->string('descvalue', 100) ->nullable(); // url $table->string("preurl", 50)->nullable(); $table->string('urlxpath', 250); $table->string('urlvalue', 100) ->nullable(); $table->timestamps(); }); } // migrate php artisan migrate // 建立 admin/Controller php artisan admin:make XpathController --model=App\\Xpath // 创建 route $router->resource('xpaths', XpathController::class); // 加入到 admin 的 menu 中 // 略
注: 能够参考以前的文章:推荐一个 Laravel admin 后台管理插件
有了 laravel-admin 插件,操做 XPath 信息就好简单了,直接看代码:
<?php namespace App\Admin\Controllers; use App\Xpath; use Encore\Admin\Form; use Encore\Admin\Grid; use Encore\Admin\Facades\Admin; use Encore\Admin\Layout\Content; use App\Http\Controllers\Controller; use Encore\Admin\Controllers\ModelForm; class XpathController extends Controller { use ModelForm; /** * Index interface. * * @return Content */ public function index() { return Admin::content(function (Content $content) { $content->header('header'); $content->description('description'); $content->body($this->grid()); }); } /** * Edit interface. * * @param $id * @return Content */ public function edit($id) { return Admin::content(function (Content $content) use ($id) { $content->header('header'); $content->description('description'); $content->body($this->form()->edit($id)); }); } /** * Create interface. * * @return Content */ public function create() { return Admin::content(function (Content $content) { $content->header('header'); $content->description('description'); $content->body($this->form()); }); } /** * Make a grid builder. * * @return Grid */ protected function grid() { return Admin::grid(Xpath::class, function (Grid $grid) { $grid->id('ID')->sortable(); $grid->column('url'); $grid->column('urldesc', "描述"); $grid->column('titlexpath'); $grid->column('titlevalue'); $grid->column('descxpath'); $grid->column('descvalue'); $grid->column('preurl'); $grid->column('urlxpath'); $grid->column('urlvalue'); $grid->created_at(); $grid->updated_at(); }); } /** * Make a form builder. * * @return Form */ protected function form() { return Admin::form(Xpath::class, function (Form $form) { $form->display('id', 'ID'); // url $form->text('url', '连接') ->placeholder('请输入解析的网址') ->rules('required|min:5|max:250'); $form->text('urldesc', '一句话描述') ->placeholder('一句话描述') ->rules('required|min:5|max:250'); // title $form->divide(); $form->text('titlexpath', 'title xpath') ->placeholder('请输入标题 xpath') ->rules('required|min:5|max:250'); $form->text('titlevalue', 'title value 默承认以不填') ->default('') ->rules('max:100'); // desc $form->divide(); $form->text('descxpath', 'desc xpath') ->placeholder('请输入详情 xpath') ->rules('required|min:5|max:250'); $form->text('descvalue', 'desc value,默承认以不填') ->default('') ->rules('max:100'); // url $form->divide(); $form->text('preurl', 'url 前缀') ->placeholder('请输入文章的url 前缀') ->rules('max:50'); $form->text('urlxpath', 'url xpath') ->placeholder('请输入文章的url xpath') ->rules('required|min:5|max:250'); $form->text('urlvalue', 'url value 默承认以不填') ->default('') ->rules('max:100'); $form->divide(); $form->display('created_at', 'Created At'); $form->display('updated_at', 'Updated At'); }); } }
添加两个网站信息试试:
1. 根据填入的 Xpath 信息,解析内容:
public static function analysis(XpathModel $model) { $html = file_get_contents($model->url); $crawler = new Crawler($html); $titlenodes = $crawler->filterXPath($model->titlexpath); $titles = self::getValueByNodes($titlenodes, $model->titlevalue); $descnodes = $crawler->filterXPath($model->descxpath); $desces = self::getValueByNodes($descnodes, $model->descvalue); $urlnodes = $crawler->filterXPath($model->urlxpath); $urls = self::getValueByNodes($urlnodes, $model->urlvalue); return RssFeeds::feeds($model, $titles, $desces, $urls); } // 经过规则获取 nodes 的值 public static function getValueByNodes(Crawler $crawler, $key = null) { return $crawler->each(function (Crawler $node) use ($key) { if (empty($key)) { return trim($node->text()); } else { return $node->attr($key); } }); }
2. 将得到 title、desc 和 url 数组装入 Feed Item 中,构建 RSS。
public static function feeds(Xpath $xpath, $titles = [], $desces = [], $urls = []) { if (!empty($xpath->preurl)) { $preurl = $xpath->preurl; $urlss = collect($urls)->map(function ($url, $key) use ($preurl) { return $preurl.trim($url); }); } else { $urlss = collect($urls); } return response() ->view('rss', [ 'xpath' => $xpath, 'titles' => $titles, 'desces' => $desces, 'urls' => $urlss->toArray(), 'pubDate' => Carbon::now() ]) ->header('Content-Type', 'text/xml'); }
3. 编写 blade 模板
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"> <channel> <title>{{ $xpath->url or ' title' }}</title> <description>{{ $xpath->urldesc or '描述' }}</description> <link>{{ $xpath->url }}</link> <atom:link href="{{ url("/feed/$xpath->id") }}" rel="self" type="application/rss+xml"/> <pubDate>{{ $pubDate }}</pubDate> <lastBuildDate>{{ $pubDate }}</lastBuildDate> <generator>coding01</generator> @foreach ($titles as $key => $title) <item> <title>{{ $title }}</title> <link>{{ $urls[$key] }}</link> <description>{{ $desces[$key] }}</description> <pubDate>{{ $pubDate }}</pubDate> <author>coding01</author> <guid>{{ $urls[$key] }}</guid> <category>{{ $title }}</category> </item> @endforeach </channel> </rss>
4. 最终来看看效果吧,为每个网站作成一个 RSS:
至此,当前的 Laravel 代码告一段落了,但为了达到及时推送内容到手机的目标,我借助了两个工具:
- Tiny Tiny RSS
- IFTTT + 钉钉
把制做好的 RSS 连接加入 Tiny Tiny RSS 上,每隔半个小时,更新一次,获取最新的内容:
而后借助 IFTTT 绑定钉钉的群机器人 Webhook:
最后在手机钉钉或者在 PC 上就能及时收到最新资讯和信息了:
今天花了 2 个小时,主要是借助 laravel-amin 和 symfony/dom-crawler 插件来本身动手搭建一个 RSS Feed 生成工具Demo。
接下来还有待于继续优化,如向 https://feed43.com/ 那样,输入 Web URL 就能生成 RSS Feed,又能根据实际须要本身设定更新时间等。
最后,代码已放在 github 上,可供参考: https://github.com/fanly/lrss
「未完待续」