Architecture
This document covers some of the internals of Biome and how they are used inside the project.
Scanner
Biome has a scanner that is responsible for crawling the file system to extract important metadata about projects. Specifically, the scanner is used in three ways:
- To discover nested biome.json/biome.jsonc files in monorepos.
- To discover nested .gitignore files if the vcs.useIgnoreFile setting is enabled.
- To index package.json manifests as well as source files in a project if any rules from the project domain are enabled.
Scanner targeting
If project rules are not enabled, the scanner automatically targets only the folders that are relevant for a given session.
This means that if you have a large monorepo and you run biome check from inside the packages/foo/ folder, that folder will be "targeted". The following folders then get scanned for nested configuration files and/or nested ignore files:
- The root folder of the repository.
- The packages/ folder.
- The packages/foo/ folder.
- Any folders that exist under packages/foo/, except node_modules/ or those that are excluded by your configuration (see below).
Other folders that may be adjacent to either packages/ or packages/foo/ will be automatically skipped.
Similarly, if you run biome format packages/bar/src/index.ts from the root of the repository, the scanner will target the packages/bar/src/ folder.
If project rules are enabled, these optimisations don't apply.
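The targeting rule described above can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions, not Biome's implementation: ancestorChain and isScanned are hypothetical names, paths are POSIX-style and relative to the repository root, and the excluded list stands in for whatever your configuration excludes.

```typescript
// Hypothetical helpers, NOT Biome's API: they only illustrate which folders a
// targeted scan would visit. Paths are relative to the repository root, and
// the empty string "" stands for the root itself.
function normalize(path: string): string {
  return path.split("/").filter((segment) => segment.length > 0).join("/");
}

// Returns the root folder, every folder between the root and the target,
// and the target itself.
function ancestorChain(target: string): string[] {
  const chain: string[] = [""]; // the repository root is always included
  let current = "";
  for (const segment of normalize(target).split("/")) {
    if (segment === "") continue; // target is the root itself
    current = current === "" ? segment : `${current}/${segment}`;
    chain.push(current);
  }
  return chain;
}

// A folder is scanned if it lies on the chain from the root to the target, or
// if it sits below the target and is neither node_modules nor excluded.
function isScanned(folder: string, target: string, excluded: string[] = []): boolean {
  const normalized = normalize(folder);
  if (ancestorChain(target).includes(normalized)) return true;

  const targetPath = normalize(target);
  const prefix = targetPath === "" ? "" : `${targetPath}/`;
  if (!normalized.startsWith(prefix)) return false; // adjacent folder: skipped

  const below = normalized.slice(prefix.length).split("/");
  if (below.includes("node_modules")) return false;
  return !excluded.some((e) => normalized === e || normalized.startsWith(`${e}/`));
}
```

With packages/foo/ as the target, ancestorChain yields the root, packages and packages/foo. isScanned accepts packages/foo/src and rejects both the adjacent packages/bar and anything under packages/foo/node_modules.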
Configuring the scanner
The scanner can be configured through the files.includes setting.
Parser and CST
The architecture of the parser is based on an internal fork of rowan, a library that implements the Green and Red tree pattern.
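For readers unfamiliar with the Green and Red tree pattern, the sketch below shows the general idea in TypeScript. It is a simplified illustration, not rowan's or Biome's actual API: all type and function names are made up for this example, and trivia is left out. The green layer is immutable and position-independent; the red layer is a view built on demand that adds parent pointers and absolute offsets.

```typescript
// Green layer: immutable and position-independent. An element knows its kind,
// its text or children, but not its parent or its absolute offset.
type GreenToken = { kind: string; text: string };
type GreenNode = { kind: string; children: GreenElement[] };
type GreenElement = GreenToken | GreenNode;

function textLength(element: GreenElement): number {
  if ("text" in element) return element.text.length;
  return element.children.reduce((sum, child) => sum + textLength(child), 0);
}

// Red layer: a view created while walking down from the root. It adds what
// the green layer lacks: a parent pointer and an absolute offset.
class RedNode {
  readonly green: GreenElement;
  readonly parent: RedNode | null;
  readonly offset: number;

  constructor(green: GreenElement, parent: RedNode | null, offset: number) {
    this.green = green;
    this.parent = parent;
    this.offset = offset;
  }

  // Absolute [start, end) range of this element in the source text.
  range(): [number, number] {
    return [this.offset, this.offset + textLength(this.green)];
  }

  // Red children are computed on demand from the green children.
  children(): RedNode[] {
    if ("text" in this.green) return [];
    let offset = this.offset;
    return this.green.children.map((child) => {
      const red = new RedNode(child, this, offset);
      offset += textLength(child);
      return red;
    });
  }
}

// The same green token can appear twice in one tree ("a+a"), because it
// carries no position; the red layer gives each occurrence its own range.
const identA: GreenToken = { kind: "IDENT", text: "a" };
const plus: GreenToken = { kind: "PLUS", text: "+" };
const root = new RedNode({ kind: "BINARY_EXPRESSION", children: [identA, plus, identA] }, null, 0);
const ranges = root.children().map((child) => child.range());
// ranges is [[0, 1], [1, 2], [2, 3]]
```

Because green elements carry no position, they can be shared and reused freely; only the lightweight red views need to be rebuilt when a tree is traversed from a different root.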
The CST (Concrete Syntax Tree) is a data structure very similar to an AST (Abstract Syntax Tree) that keeps track of all the information of a program, trivia included.
Trivia is all the information that is not needed for a program to run:
- spaces
- tabs
- comments
Trivia is attached to a node. A node can have leading trivia and trailing trivia. If you read code from left to right, leading trivia appears before a keyword, and trailing trivia appears after a keyword.
Leading trivia and trailing trivia are categorized as follows:
- Every piece of trivia up to the token/keyword (including line breaks) is leading trivia;
- Everything after the token/keyword until the next line break (but not including it) is trailing trivia.
Given the following JavaScript snippet, // comment 1 is trailing trivia of the token ;, and // comment 2 is leading trivia of the keyword const. Below is a minimized version of the CST represented by Biome:
    const a = "foo"; // comment 1
    // comment 2
    const b = "bar";

    0: JS_MODULE@0..55
        ...
          1: SEMICOLON@15..27 ";" [] [Whitespace(" "), Comments("// comment 1")]
        1: JS_VARIABLE_STATEMENT@27..55
            ...
            1: CONST_KW@27..45 "const" [Newline("\n"), Comments("// comment 2"), Newline("\n")] [Whitespace(" ")]
      3: EOF@55..55 "" [] []

The CST is never directly accessible by design; a developer reads its information through the Red tree, using a number of APIs that are autogenerated from the grammar of the language.
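Returning to the leading and trailing trivia rules, the split can be expressed as a small function over the run of trivia found between two tokens. This is a minimal sketch, not Biome's lexer: TriviaPiece and splitTrivia are hypothetical names, the piece kinds mirror the labels in the CST dump, and edge cases such as multi-line comments are ignored.

```typescript
// Hypothetical model of a trivia piece; the kinds mirror the CST dump labels.
type TriviaPiece = { kind: "Whitespace" | "Comments" | "Newline"; text: string };

// Splits the trivia found between two tokens: everything before the first
// line break trails the previous token, and the rest (line break included)
// leads the next token.
function splitTrivia(between: TriviaPiece[]): { trailing: TriviaPiece[]; leading: TriviaPiece[] } {
  const firstNewline = between.findIndex((piece) => piece.kind === "Newline");
  const cut = firstNewline === -1 ? between.length : firstNewline;
  return { trailing: between.slice(0, cut), leading: between.slice(cut) };
}

// The trivia between `;` and the second `const` in the snippet.
const betweenTokens: TriviaPiece[] = [
  { kind: "Whitespace", text: " " },
  { kind: "Comments", text: "// comment 1" },
  { kind: "Newline", text: "\n" },
  { kind: "Comments", text: "// comment 2" },
  { kind: "Newline", text: "\n" },
];
const split = splitTrivia(betweenTokens);
// split.trailing holds the whitespace and "// comment 1" (attached to `;`);
// split.leading holds the newline, "// comment 2" and the final newline
// (attached to `const`).
```

The result lines up with the dump: the semicolon carries the whitespace and the first comment as trailing trivia, and the const keyword carries both line breaks and the second comment as leading trivia.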
Resilient and recoverable parser
In order to construct a CST, a parser needs to be error-resilient and recoverable:
- resilient: a parser that is able to resume parsing after encountering syntax errors that belong to the language;
- recoverable: a parser that is able to understand where an error occurred and to resume parsing by creating correct information.
The recoverable part of the parser is not an exact science, and no rules are set in stone. Depending on what the parser was parsing and where an error occurred, the parser may or may not be able to recover in the expected way.
The parser also uses "Bogus" nodes to protect consumers from consuming incorrect syntax. These nodes are used to decorate the broken code caused by a syntax error.
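The two recovery strategies, leaving a required slot marked as missing and wrapping unexpected tokens in a bogus node, can be sketched with a toy parser. This is a deliberately tiny TypeScript illustration under stated assumptions, not Biome's parser: every name below is hypothetical, the toy grammar only knows while (identifier) {} statements, and unlike a production parser it reports one diagnostic per missing slot instead of suppressing follow-on errors.

```typescript
// Hypothetical toy model, NOT Biome's API.
type Token = { kind: string; text: string };
type Slot = Token | "missing (required)";
type WhileStatement = {
  type: "WhileStatement";
  whileToken: Token;
  lParen: Slot;
  test: Slot;
  rParen: Slot;
  body: { lCurly: Slot; rCurly: Slot };
};
type BogusStatement = { type: "BogusStatement"; items: Token[] };
type Statement = WhileStatement | BogusStatement;

const PUNCTUATION: Record<string, string> = {
  "(": "L_PAREN", ")": "R_PAREN", "{": "L_CURLY", "}": "R_CURLY",
};

function tokenize(source: string): Token[] {
  const texts: string[] = source.match(/[A-Za-z_]\w*|[(){}]/g) ?? [];
  return texts.map((text) => ({
    kind: PUNCTUATION[text] ?? (text === "while" ? "WHILE_KW" : "IDENT"),
    text,
  }));
}

function parse(source: string): { statements: Statement[]; errors: string[] } {
  const tokens = tokenize(source);
  const statements: Statement[] = [];
  const errors: string[] = [];
  let pos = 0;

  // Consumes the next token if it has the expected kind. Otherwise it records
  // a diagnostic and leaves the slot missing WITHOUT consuming anything, so
  // the unexpected token stays available for the next slot.
  const expect = (kind: string): Slot => {
    const token: Token | undefined = tokens[pos];
    if (token !== undefined && token.kind === kind) {
      pos += 1;
      return token;
    }
    errors.push(`expected ${kind} but instead found ${token ? token.kind : "EOF"}`);
    return "missing (required)";
  };

  while (pos < tokens.length) {
    const token: Token | undefined = tokens[pos];
    if (token === undefined) break;
    pos += 1;
    if (token.kind === "WHILE_KW") {
      statements.push({
        type: "WhileStatement",
        whileToken: token,
        lParen: expect("L_PAREN"),
        test: expect("IDENT"),
        rParen: expect("R_PAREN"),
        body: { lCurly: expect("L_CURLY"), rCurly: expect("R_CURLY") },
      });
    } else {
      // No rule can start with this token: wrap it in a bogus node and go on.
      errors.push(`unexpected ${token.kind}`);
      statements.push({ type: "BogusStatement", items: [token] });
    }
  }
  return { statements, errors };
}
```

Calling parse("while {}") yields a single WhileStatement whose parentheses and test are marked as missing while the body is parsed normally, and parse("}") yields a BogusStatement wrapping the stray brace. In both cases parsing runs to the end of the input instead of stopping at the first error.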
In the following example, the parentheses of the while are missing, but the parser recovers well and can still represent the code with a decent CST. The parentheses and the condition of the loop are marked as missing, and the code block is parsed correctly:
    while {}

    JsModule {
      interpreter_token: missing (optional),
      directives: JsDirectiveList [],
      items: JsModuleItemList [
        JsWhileStatement {
          while_token: WHILE_KW@0..6 "while" [] [Whitespace(" ")],
          l_paren_token: missing (required),
          test: missing (required),
          r_paren_token: missing (required),
          body: JsBlockStatement {
            l_curly_token: L_CURLY@6..7 "{" [] [],
            statements: JsStatementList [],
            r_curly_token: R_CURLY@7..8 "}" [] [],
          },
        },
      ],
      eof_token: EOF@8..8 "" [] [],
    }

This is the error emitted during parsing:
    main.tsx:1:7 parse ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

      ✖ expected `(` but instead found `{`

      > 1 │ while {}
          │       ^

      ℹ Remove {

The same can't be said for the following snippet. The parser can't properly understand the syntax during the recovery phase, so it needs to rely on bogus nodes to mark some syntax as erroneous. Notice the JsBogusStatement:
    function}

    JsModule {
      interpreter_token: missing (optional),
      directives: JsDirectiveList [],
      items: JsModuleItemList [
        TsDeclareFunctionDeclaration {
          async_token: missing (optional),
          function_token: FUNCTION_KW@0..8 "function" [] [],
          id: missing (required),
          type_parameters: missing (optional),
          parameters: missing (required),
          return_type_annotation: missing (optional),
          semicolon_token: missing (optional),
        },
        JsBogusStatement {
          items: [
            R_CURLY@8..9 "}" [] [],
          ],
        },
      ],
      eof_token: EOF@9..9 "" [] [],
    }

This is the error we get from the parsing phase:
    main.tsx:1:9 parse ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

      ✖ expected a name for the function in a function declaration, but found none

      > 1 │ function}
          │         ^

Formatter
Linter
Daemon
Biome uses a server-client architecture to run its tasks.
A daemon is a long-running server that Biome spawns in the background and uses to process requests from the editor and CLI.