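Welcome to part two of the series on building a custom web editor with TypeScript, React, ANTLR4, and Monaco Editor. Before continuing, I recommend reading part one, "Building a custom web editor with TypeScript, React, ANTLR4, and Monaco Editor (1)".
In this article I will show how to implement the language service. The language service does the heavy lifting of parsing the text typed into the editor. We will use the abstract syntax tree (AST) produced by the Parser to find syntax or lexical errors, format the text, and offer smart suggestions for TODOs as the user types (I will not implement autocompletion in this article). Basically, the language service exposes the following functions: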
format(code: string): string
validate(code: string): Errors[]
autoComplete(code: string, currentPosition: Position): string[]
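The three signatures above can be written down as a TypeScript interface. The following is only a sketch: the shapes of `CursorPosition` (standing in for `Position`) and `LanguageError` (standing in for `Errors`) are assumptions, since the article has not defined them at this point, and `noopLanguageService` is a hypothetical placeholder rather than part of the project.

```typescript
// Assumed shape of a cursor position (1-based, as in Monaco).
interface CursorPosition {
    lineNumber: number;
    column: number;
}

// Assumed shape of a reported error.
interface LanguageError {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
    message: string;
}

// The contract described in the text.
interface LanguageService {
    format(code: string): string;
    validate(code: string): LanguageError[];
    autoComplete(code: string, currentPosition: CursorPosition): string[];
}

// A do-nothing stand-in that satisfies the contract; handy while wiring up the editor.
const noopLanguageService: LanguageService = {
    format: code => code,
    validate: () => [],
    autoComplete: () => [],
};
```

Starting from a stub like this lets the editor integration be built and tested before the real parsing logic exists.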
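I will bring in the ANTLR library and add a script that generates the Parser and Lexer from the TodoLang grammar file (TodoLangGrammar.g4). First, install the two required libraries: antlr4ts and antlr4ts-cli. Parsers generated by the ANTLR4 TypeScript target have a runtime dependency on the antlr4ts package; antlr4ts-cli, as the name suggests, is the CLI we will use to generate the Parser and Lexer for the language.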
npm add antlr4ts
npm add -D antlr4ts-cli
In the project root, create the file TodoLangGrammar.g4 containing the TodoLang grammar rules:
grammar TodoLangGrammar;

todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;

ADD : 'ADD';
TODO : 'TODO';
COMPLETE: 'COMPLETE';
STRING: '"' ~ ["]* '"';
EOL: [\r\n] + -> skip;
WS: [ \t] -> skip;
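To get a feel for what the lexer rules in this grammar do, here is a small hand-written tokenizer that mimics them. It is an illustration only, not the code ANTLR generates; `SimpleToken` and `tokenize` are hypothetical names, and error handling is reduced to throwing on the first unexpected character.

```typescript
type TokenType = "ADD" | "TODO" | "COMPLETE" | "STRING";

interface SimpleToken {
    type: TokenType;
    text: string;
}

// Hand-written stand-in for the generated lexer, for illustration only.
function tokenize(code: string): SimpleToken[] {
    // Whitespace (skipped, like EOL and WS), a keyword, or a double-quoted string.
    const pattern = /^(?:[ \t\r\n]+|(ADD|TODO|COMPLETE)|("[^"]*"))/;
    const tokens: SimpleToken[] = [];
    let offset = 0;
    while (offset < code.length) {
        const match = pattern.exec(code.slice(offset));
        if (match === null) {
            throw new Error(`Unexpected character at offset ${offset}`);
        }
        const keyword = match[1];
        const quoted = match[2];
        if (keyword !== undefined) {
            tokens.push({ type: keyword as TokenType, text: keyword });
        } else if (quoted !== undefined) {
            tokens.push({ type: "STRING", text: quoted });
        }
        // Every alternative consumes at least one character, so the loop always advances.
        offset += match[0].length;
    }
    return tokens;
}
```

The parser rules (`addExpression`, `completeExpression`) then only have to check the order of these token types, which is what the generated Parser does on top of the generated Lexer.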
Now add a script to package.json that generates the Parser and Lexer with antlr4ts-cli:
"antlr4ts": "antlr4ts ./TodoLangGrammar.g4 -o ./src/ANTLR"
Run the antlr4ts script, and the generated TypeScript source of the parser will appear in the ./src/ANTLR directory:
npm run antlr4ts
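As we can see, there is a Lexer and a Parser. If you open the Parser file, you will find that it exports the TodoLangGrammarParser class, whose constructor is constructor(input: TokenStream). It takes as its argument the TokenStream that TodoLangGrammarLexer produces for the given code. TodoLangGrammarLexer in turn has a constructor that takes the code as input: constructor(input: CharStream).
The Parser file contains the method public todoExpressions(): TodoExpressionsContext, which returns the context object for all the TodoExpressions defined in the code. Where does TodoExpressions come from? It comes from the first rule in our grammar file: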
todoExpressions : (addExpression)* (completeExpression)*;
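TodoExpressionsContext is the root of the AST. Each node in it is the context of another rule; the tree is made of rule contexts and terminal nodes, and the terminal nodes hold the final tokens (the ADD token, the TODO token, and the token for the todo's name).
TodoExpressionsContext contains the lists of addExpression and completeExpression contexts, which come from the following three rules: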
todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;
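Each context class, in turn, contains terminal nodes, which hold the text itself (a piece of code or a token, for example ADD, COMPLETE, or the string that names the TODO). How complex the AST is depends on the grammar rules you write.
Let's look at AddExpressionContext: it contains the ADD, TODO and STRING terminal nodes, matching the rule: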
addExpression : ADD TODO STRING;
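The STRING terminal node holds the text of the Todo we want to add. To see how the AST works, let's parse some simple TodoLang code. In the ./src/language-service directory, create a file parser.ts with the following content: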
import { TodoLangGrammarParser, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { TodoLangGrammarLexer } from "../ANTLR/TodoLangGrammarLexer";
import { ANTLRInputStream, CommonTokenStream } from "antlr4ts";

export default function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    // Parse the input; `todoExpressions` is the entry rule defined in the grammar
    return parser.todoExpressions();
}
parser.ts exports the function parseAndGetASTRoot(code), which takes TodoLang code and produces the corresponding AST. For example, parse the following TodoLang code:
parseAndGetASTRoot(`
ADD TODO "Create an editor"
COMPLETE TODO "Create an editor"
`)
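In this section I will walk you through adding syntax validation to the editor. ANTLR reports lexical and syntax errors out of the box; we only need to implement the ANTLRErrorListener interface and give it to the Lexer and Parser, so that we can collect the errors while ANTLR parses the code.
In the ./src/language-service directory, create the file TodoLangErrorListener.ts, which exports the TodoLangErrorListener class implementing the ANTLRErrorListener interface: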
import { ANTLRErrorListener, RecognitionException, Recognizer } from "antlr4ts";

export interface ITodoLangError {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
    message: string;
    code: string;
}

export default class TodoLangErrorListener implements ANTLRErrorListener<any> {
    private errors: ITodoLangError[] = []

    syntaxError(recognizer: Recognizer<any, any>, offendingSymbol: any, line: number, charPositionInLine: number, message: string, e: RecognitionException | undefined): void {
        this.errors.push({
            startLineNumber: line,
            endLineNumber: line,
            // ANTLR columns are 0-based, Monaco marker columns are 1-based
            startColumn: charPositionInLine + 1,
            // For simplicity, assume the error is only 1 character long
            endColumn: charPositionInLine + 2,
            message,
            code: "1" // The error code; customize it as you like
        })
    }

    getErrors(): ITodoLangError[] {
        return this.errors;
    }
}
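One detail worth spelling out when turning ANTLR positions into editor markers: ANTLR reports a 1-based line and a 0-based charPositionInLine, while Monaco marker positions use 1-based lines and 1-based columns. A minimal sketch of the conversion follows; `MarkerRange` and `toMarkerRange` are hypothetical names, and the error is assumed to stay on a single line.

```typescript
interface MarkerRange {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// line: 1-based line as reported by ANTLR.
// charPositionInLine: 0-based column as reported by ANTLR.
// length: number of characters to underline (at least 1).
function toMarkerRange(line: number, charPositionInLine: number, length: number = 1): MarkerRange {
    const startColumn = charPositionInLine + 1; // shift to Monaco's 1-based columns
    return {
        startLineNumber: line,
        endLineNumber: line,
        startColumn,
        endColumn: startColumn + Math.max(length, 1),
    };
}
```

For instance, an error at the very first character of the first line maps to columns 1 through 2, which underlines exactly that character.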
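Every time ANTLR hits an error while parsing the code, it calls this TodoLangErrorListener with information about the error. The listener records where in the code the error occurred, together with the error message. Now let's attach TodoLangErrorListener to the Lexer and Parser in parser.ts: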
import { TodoLangGrammarParser, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { TodoLangGrammarLexer } from "../ANTLR/TodoLangGrammarLexer";
import { ANTLRInputStream, CommonTokenStream } from "antlr4ts";
import TodoLangErrorListener, { ITodoLangError } from "./TodoLangErrorListener";

function parse(code: string): {ast:TodoExpressionsContext, errors: ITodoLangError[]} {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    lexer.removeErrorListeners()
    const todoLangErrorsListner = new TodoLangErrorListener();
    lexer.addErrorListener(todoLangErrorsListner);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    parser.removeErrorListeners();
    parser.addErrorListener(todoLangErrorsListner);
    const ast = parser.todoExpressions();
    const errors: ITodoLangError[] = todoLangErrorsListner.getErrors();
    return {ast, errors};
}

export function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const {ast} = parse(code);
    return ast;
}

export function parseAndGetSyntaxErrors(code: string): ITodoLangError[] {
    const {errors} = parse(code);
    return errors;
}
In the ./src/language-service directory, create LanguageService.ts; this is what it exports:
import { TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { parseAndGetASTRoot, parseAndGetSyntaxErrors } from "./parser";
import { ITodoLangError } from "./TodoLangErrorListener";

export default class TodoLangLanguageService {
    validate(code: string): ITodoLangError[] {
        const syntaxErrors: ITodoLangError[] = parseAndGetSyntaxErrors(code);
        // Later we will append semantic errors
        return syntaxErrors;
    }
}
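Good, we now have error parsing for the editor. To put it to use, I will create the web worker discussed in the previous article and add a worker service proxy that calls the language service to carry out the editor's advanced features.
First, we call monaco.editor.createWebWorker, which uses built-in ES6 Proxies to create a proxy for TodoLangWorker. TodoLangWorker uses the language service to perform the editor features. The methods that run in the web worker are proxied by Monaco, so calling a method of the web worker amounts to calling the proxied method from the main thread.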
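The idea of a proxied worker can be sketched with a plain ES6 Proxy. This is not Monaco's implementation: a real proxy would post a message to the worker thread and resolve the promise when the reply arrives, whereas this sketch calls the target directly. `ValidationWorkerSketch`, `AsyncClient` and `createClientProxy` are hypothetical names used only for illustration.

```typescript
// The object that would live inside the web worker.
class ValidationWorkerSketch {
    doValidation(code: string): string[] {
        return code.trim().length === 0 ? ["Code is empty"] : [];
    }
}

// Client-side view of the same object: every method returns a promise.
type AsyncClient<T> = {
    [K in keyof T]: T[K] extends (...args: infer A) => infer R ? (...args: A) => Promise<R> : never;
};

function createClientProxy<T extends object>(target: T): AsyncClient<T> {
    const handler: ProxyHandler<T> = {
        get(obj, property) {
            const member: unknown = Reflect.get(obj, property);
            if (typeof member !== "function") {
                return undefined;
            }
            // A real implementation would send { method, args } to the worker here.
            return (...args: unknown[]) => Promise.resolve(member.apply(obj, args));
        },
    };
    return new Proxy(target, handler) as unknown as AsyncClient<T>;
}
```

From the caller's point of view, `createClientProxy(new ValidationWorkerSketch()).doValidation("")` looks like an ordinary method call that happens to return a promise, which is the experience the proxied worker gives on the main thread.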
In the ./src/todo-lang folder, create TodoLangWorker.ts with the following content:
import * as monaco from "monaco-editor-core";
import IWorkerContext = monaco.worker.IWorkerContext;
import TodoLangLanguageService from "../language-service/LanguageService";
import { ITodoLangError } from "../language-service/TodoLangErrorListener";

export class TodoLangWorker {
    private _ctx: IWorkerContext;
    private languageService: TodoLangLanguageService;

    constructor(ctx: IWorkerContext) {
        this._ctx = ctx;
        this.languageService = new TodoLangLanguageService();
    }

    doValidation(): Promise<ITodoLangError[]> {
        const code = this.getTextDocument();
        return Promise.resolve(this.languageService.validate(code));
    }

    // Returns the text of the first mirrored model, i.e. the editor's current content
    private getTextDocument(): string {
        const model = this._ctx.getMirrorModels()[0];
        return model.getValue();
    }
}
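We created an instance of the language service and added a doValidation method, which in turn calls the language service's validate method. We also added getTextDocument, which returns the text in the editor. The TodoLangWorker class can be extended with much more, for example if you want to support multi-file editing. _ctx: IWorkerContext is the editor's context object; it holds the file models.
Now let's create the web worker file todolang.worker.ts in the ./src/todo-lang directory: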
import * as worker from 'monaco-editor-core/esm/vs/editor/editor.worker';
import { TodoLangWorker } from './todoLangWorker';

self.onmessage = () => {
    worker.initialize((ctx) => {
        return new TodoLangWorker(ctx)
    });
};
We use the built-in worker.initialize to initialize our worker, and TodoLangWorker to do the necessary method proxying.
This is a web worker, so we must have webpack emit the corresponding worker file:
// webpack.config.js
entry: {
    app: './src/index.tsx',
    "editor.worker": 'monaco-editor-core/esm/vs/editor/editor.worker.js',
    "todoLangWorker": './src/todo-lang/todolang.worker.ts'
},
output: {
    globalObject: 'self',
    filename: (chunkData) => {
        switch (chunkData.chunk.name) {
            case 'editor.worker':
                return 'editor.worker.js';
            case 'todoLangWorker':
                return "todoLangWorker.js"
            default:
                return 'bundle.[hash].js';
        }
    },
    path: path.resolve(__dirname, 'dist')
}
We named the worker file todoLangWorker.js. Now add getWorkerUrl in the editor's setup function:
(window as any).MonacoEnvironment = {
    getWorkerUrl: function (moduleId, label) {
        if (label === languageID)
            return "./todoLangWorker.js";
        return './editor.worker.js';
    }
}
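This is how Monaco obtains the URL of a web worker. Note that if the worker's label is the TodoLang ID, we return the name of the worker file that webpack bundles. If you build the project now, you should find a file named todoLangWorker.js (or, in dev tools, two workers in the threads section).
Now create a WorkerManager that manages the creation of the worker and gives access to the proxied worker client: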
import * as monaco from "monaco-editor-core";
import Uri = monaco.Uri;
import { TodoLangWorker } from './todoLangWorker';
import { languageID } from './config';

export class WorkerManager {
    private worker: monaco.editor.MonacoWebWorker<TodoLangWorker>;
    private workerClientProxy: Promise<TodoLangWorker>;

    constructor() {
        this.worker = null;
    }

    private getClientproxy(): Promise<TodoLangWorker> {
        if (!this.workerClientProxy) {
            this.worker = monaco.editor.createWebWorker<TodoLangWorker>({
                moduleId: 'TodoLangWorker',
                label: languageID,
                createData: {
                    languageId: languageID,
                }
            });
            this.workerClientProxy = <Promise<TodoLangWorker>><any>this.worker.getProxy();
        }
        return this.workerClientProxy;
    }

    async getLanguageServiceWorker(...resources: Uri[]): Promise<TodoLangWorker> {
        const _client: TodoLangWorker = await this.getClientproxy();
        await this.worker.withSyncedResources(resources)
        return _client;
    }
}
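We use createWebWorker to create the web worker proxied by Monaco, then get the proxied client object, and use workerClientProxy to call the proxied methods. Next, let's create the DiagnosticsAdapter class, which connects Monaco's markers API with the errors returned by the language service, so that parse errors are marked correctly in Monaco: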
import * as monaco from "monaco-editor-core";
import { WorkerAccessor } from "./setup";
import { languageID } from "./config";
import { ITodoLangError } from "../language-service/TodoLangErrorListener";

export default class DiagnosticsAdapter {
    constructor(private worker: WorkerAccessor) {
        const onModelAdd = (model: monaco.editor.IModel): void => {
            let handle: any;
            model.onDidChangeContent(() => {
                // Debounce the user's changes: after every change we wait 500ms before validating,
                // and if another change arrives in the meantime we cancel the pending validation
                clearTimeout(handle);
                handle = setTimeout(() => this.validate(model.uri), 500);
            });
            this.validate(model.uri);
        };
        monaco.editor.onDidCreateModel(onModelAdd);
        monaco.editor.getModels().forEach(onModelAdd);
    }

    private async validate(resource: monaco.Uri): Promise<void> {
        // get the worker proxy, run the validation in the worker, then set the markers on the model
        const worker = await this.worker(resource)
        const errorMarkers = await worker.doValidation();
        const model = monaco.editor.getModel(resource);
        monaco.editor.setModelMarkers(model, languageID, errorMarkers.map(toDiagnostics));
    }
}

function toDiagnostics(error: ITodoLangError): monaco.editor.IMarkerData {
    return {
        ...error,
        severity: monaco.MarkerSeverity.Error,
    };
}
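The onDidChangeContent listener watches the model. When the model changes, we wait until 500 ms have passed since the last change (debouncing) and then call the web worker to validate the code and add error markers; setModelMarkers tells Monaco to add the markers. To complete the editor's syntax validation, make sure these are wired up in the setup function, and note that we use WorkerManager to obtain the proxied worker: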
    monaco.languages.onLanguage(languageID, () => {
        monaco.languages.setMonarchTokensProvider(languageID, monarchLanguage);
        monaco.languages.setLanguageConfiguration(languageID, richLanguageConfiguration);
        const client = new WorkerManager();
        const worker: WorkerAccessor = (...uris: monaco.Uri[]): Promise<TodoLangWorker> => {
            return client.getLanguageServiceWorker(...uris);
        };
        // Call the errors provider
        new DiagnosticsAdapter(worker);
    });
} // closes the enclosing setup function, whose opening lines are not shown here

export type WorkerAccessor = (...uris: monaco.Uri[]) => Promise<TodoLangWorker>;
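The debouncing done inside DiagnosticsAdapter can be pulled out into a small helper. The sketch below is one possible shape, not code from the project: `debounce`, `Timers` and `realTimers` are hypothetical names, and the timer functions are injectable only so the behaviour can be checked without waiting.

```typescript
interface Timers {
    set(callback: () => void, delayMs: number): unknown;
    clear(handle: unknown): void;
}

// Default timers backed by setTimeout / clearTimeout.
const realTimers: Timers = {
    set: (callback, delayMs) => setTimeout(callback, delayMs),
    clear: handle => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Returns a function that runs `action` only after `delayMs` have passed
// without the returned function being called again.
function debounce(action: () => void, delayMs: number, timers: Timers = realTimers): () => void {
    let handle: unknown;
    return () => {
        if (handle !== undefined) {
            timers.clear(handle); // a newer change arrived, drop the pending run
        }
        handle = timers.set(() => {
            handle = undefined;
            action();
        }, delayMs);
    };
}
```

With such a helper, the content listener could be registered as `model.onDidChangeContent(debounce(() => this.validate(model.uri), 500))`, so the worker is only asked to validate once the user pauses typing.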
Everything is now in place. Run the project and type some invalid TodoLang code; you will see the errors underlined in the code.
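Now let's add semantic validation to the editor. Recall the two semantic rules I mentioned in the previous article: a TODO that has already been defined with ADD TODO cannot be added again, and COMPLETE cannot be applied to a TODO that has not been defined yet.
To check whether a TODO is already defined, all we have to do is walk the AST, take each ADD expression, and push its TODO into definedTodos. Before pushing, we check whether the TODO already exists in definedTodos. If it does, that is a semantic error, so we take the position of the error from the context of the ADD expression and push the error into the errors array. The second rule works the same way: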
import { Token } from "antlr4ts";
import { AddExpressionContext, CompleteExpressionContext, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { ITodoLangError } from "./TodoLangErrorListener";

// Build a semantic error covering the given token (here, the STRING that names the todo).
// ANTLR columns are 0-based and Monaco columns are 1-based, hence the + 1.
function semanticError(token: Token, message: string): ITodoLangError {
    const startColumn = token.charPositionInLine + 1;
    return {
        code: "2",
        message,
        startLineNumber: token.line,
        endLineNumber: token.line,
        startColumn,
        endColumn: startColumn + (token.stopIndex - token.startIndex + 1)
    };
}

function checkSemanticRules(ast: TodoExpressionsContext): ITodoLangError[] {
    const errors: ITodoLangError[] = [];
    const definedTodos: string[] = [];
    (ast.children || []).forEach(node => {
        if (node instanceof AddExpressionContext) {
            // An ADD expression: ADD TODO "STRING"
            const todo = node.STRING().text;
            // A TODO that was already defined with ADD TODO cannot be added again
            if (definedTodos.some(todo_ => todo_ === todo)) {
                errors.push(semanticError(node.STRING().symbol, `Todo ${todo} already defined`));
            } else {
                definedTodos.push(todo);
            }
        } else if (node instanceof CompleteExpressionContext) {
            const todoToComplete = node.STRING().text;
            // Only the todos defined before this expression are checked, so the order matters
            if (definedTodos.every(todo_ => todo_ !== todoToComplete)) {
                errors.push(semanticError(node.STRING().symbol, `Todo ${todoToComplete} is not defined`));
            }
        }
    });
    return errors;
}
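Now call the checkSemanticRules function in the language service's validate method and return the syntax and semantic errors merged together. With that, our editor supports semantic validation.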
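The two semantic rules can also be looked at in isolation, on plain data instead of the generated ANTLR context classes. The following sketch assumes a simplified statement list; `TodoStatement`, `SemanticIssue` and `checkStatements` are hypothetical names and not part of the project.

```typescript
// Simplified stand-in for an ADD or COMPLETE expression taken from the AST.
interface TodoStatement {
    kind: "add" | "complete";
    todo: string;
    line: number;
}

interface SemanticIssue {
    line: number;
    message: string;
}

function checkStatements(statements: TodoStatement[]): SemanticIssue[] {
    const issues: SemanticIssue[] = [];
    const definedTodos: string[] = [];
    for (const statement of statements) {
        const alreadyDefined = definedTodos.indexOf(statement.todo) !== -1;
        if (statement.kind === "add") {
            if (alreadyDefined) {
                // Rule 1: a todo cannot be added twice.
                issues.push({ line: statement.line, message: `Todo ${statement.todo} already defined` });
            } else {
                definedTodos.push(statement.todo);
            }
        } else if (!alreadyDefined) {
            // Rule 2: only a todo defined earlier in the code can be completed.
            issues.push({ line: statement.line, message: `Todo ${statement.todo} is not defined` });
        }
    }
    return issues;
}
```

Because the list is processed in order, completing a todo before the line that adds it is reported as an error, which matches the order-sensitive check in checkSemanticRules.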
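For the editor's auto-formatting feature, you need to provide and register a formatting provider with Monaco by calling the Monaco API registerDocumentFormattingEditProvider. See the monaco-editor documentation for more details. Walking the AST gives you the prettified code: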
// languageService.ts
format(code: string): string {
    // If the code contains errors there is no point in formatting it, because this way of
    // formatting would drop some of the code. To keep things simple, only valid code is formatted.
    if (this.validate(code).length > 0)
        return code;
    let formattedCode = "";
    const ast: TodoExpressionsContext = parseAndGetASTRoot(code);
    ast.children.forEach(node => {
        if (node instanceof AddExpressionContext) {
            // An ADD expression: ADD TODO "STRING"
            const todo = node.STRING().text;
            formattedCode += `ADD TODO ${todo}\n`;
        } else if (node instanceof CompleteExpressionContext) {
            // A COMPLETE expression: COMPLETE TODO "STRING"
            const todoToComplete = node.STRING().text;
            formattedCode += `COMPLETE TODO ${todoToComplete}\n`;
        }
    });
    return formattedCode;
}
Add a format method to todoLangWorker; this format method simply calls the language service's format method.
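The article does not show this worker method, so here is a minimal sketch of what it could look like. To stay self-contained it uses a stand-in interface instead of TodoLangLanguageService, and the class name `FormattingWorkerSketch` is hypothetical; in the project the method would sit next to doValidation in the worker class.

```typescript
// Stand-in for the language service; only the method used here is declared.
interface FormattingService {
    format(code: string): string;
}

class FormattingWorkerSketch {
    private languageService: FormattingService;

    constructor(languageService: FormattingService) {
        this.languageService = languageService;
    }

    // Returns a promise, like doValidation, because callers reach it through the worker proxy.
    format(code: string): Promise<string> {
        return Promise.resolve(this.languageService.format(code));
    }
}
```

Taking the code as a parameter (rather than reading it from the mirrored model) matches how the formatting provider below calls `worker.format(code)`.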
Now create the TodoLangFormattingProvider class, which implements the DocumentFormattingEditProvider interface:
import * as monaco from "monaco-editor-core";
import { WorkerAccessor } from "./setup";

export default class TodoLangFormattingProvider implements monaco.languages.DocumentFormattingEditProvider {
    constructor(private worker: WorkerAccessor) { }

    provideDocumentFormattingEdits(model: monaco.editor.ITextModel, options: monaco.languages.FormattingOptions, token: monaco.CancellationToken): monaco.languages.ProviderResult<monaco.languages.TextEdit[]> {
        return this.format(model);
    }

    private async format(model: monaco.editor.ITextModel): Promise<monaco.languages.TextEdit[]> {
        // get the worker proxy
        const worker = await this.worker(model.uri);
        // call the format method proxy of the language service and get the formatted code
        const formattedCode = await worker.format(model.getValue());
        // replace the whole document; getFullModelRange() returns the 1-based range covering all of the model's text
        return [
            { text: formattedCode, range: model.getFullModelRange() }
        ];
    }
}
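TodoLangFormattingProvider calls the format method exposed by the worker, passing model.getValue() as the argument, and hands Monaco the formatted code together with the range of code to replace. Now go to the setup function and register the formatting provider with Monaco's registerDocumentFormattingEditProvider API; after rerunning the application, the editor supports auto-formatting: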
monaco.languages.registerDocumentFormattingEditProvider(languageID, new TodoLangFormattingProvider(worker));
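The range handed to Monaco has to cover the whole document, and Monaco's line numbers and columns start at 1. As a sketch of what such a range looks like when computed by hand from the text (the names `EditRange` and `fullDocumentRange` are hypothetical; inside the editor, model.getFullModelRange() provides the same information):

```typescript
interface EditRange {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// Computes the 1-based range that spans all of `code`.
function fullDocumentRange(code: string): EditRange {
    const lines = code.split(/\r\n|\r|\n/);
    const lastLine = lines[lines.length - 1] || "";
    return {
        startLineNumber: 1,
        startColumn: 1,
        endLineNumber: lines.length,
        // The end column sits just after the last character of the last line.
        endColumn: lastLine.length + 1,
    };
}
```

The end position depends on the last line of the text, not on the shortest or longest line, and an empty document yields the range from (1, 1) to (1, 1).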
Try clicking Format Document or pressing Shift + Alt + F, and the code in the editor is reformatted.
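To make autocompletion aware of the defined TODOs, all you need to do is collect every defined TODO from the AST and provide a completion provider by calling registerCompletionItemProvider in setup. The completion provider gives you the code and the current position of the cursor, so you can inspect the context the user is typing in; if they are typing a TODO inside a COMPLETE expression, you can suggest the already defined TODOs. Keep in mind that, by default, Monaco Editor autocompletes from the tokens already present in the code; you may want to disable that and implement your own completion to make it smarter and more contextual.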
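As a rough sketch of that idea, the function below suggests defined TODOs when the cursor sits after `COMPLETE TODO`. It is not the article's implementation: it uses regular expressions as a stand-in for walking the AST, and the names `CursorPosition` and `suggestTodos` are hypothetical. The returned strings would still have to be wrapped into completion items inside a completion provider.

```typescript
// Assumed 1-based cursor position, as in Monaco.
interface CursorPosition {
    lineNumber: number;
    column: number;
}

function suggestTodos(code: string, position: CursorPosition): string[] {
    const lines = code.split(/\r\n|\r|\n/);
    const currentLine = lines[position.lineNumber - 1] || "";
    const textBeforeCursor = currentLine.slice(0, position.column - 1);

    // Only suggest after `COMPLETE TODO `, optionally followed by a partially typed string.
    const context = /COMPLETE\s+TODO\s+("[^"]*)?$/.exec(textBeforeCursor);
    if (context === null) {
        return [];
    }
    const typed = context[1] || "";

    // Collect the todos added on the lines above the cursor (a regex stand-in for the AST walk).
    const definedTodos: string[] = [];
    const addPattern = /ADD\s+TODO\s+("[^"]*")/g;
    const codeBefore = lines.slice(0, position.lineNumber - 1).join("\n");
    let match = addPattern.exec(codeBefore);
    while (match !== null) {
        const name = match[1];
        if (name !== undefined && definedTodos.indexOf(name) === -1) {
            definedTodos.push(name);
        }
        match = addPattern.exec(codeBefore);
    }

    // Keep the todos that start with what the user has typed so far.
    return definedTodos.filter(todo => todo.indexOf(typed) === 0);
}
```

Restricting suggestions to todos defined above the cursor mirrors the order-sensitive semantic rule: a todo that is only added later could not be completed at this point anyway.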