Supabase Blog

Postgres Language Server: implementing the Parser

  • The Postgres Language Server aims to improve the experience of writing SQL in editors like VSCode.
  • The core component of a language server is the parser, which takes input source code and creates a semantic model of the code.
  • Implementing a parser for Postgres is challenging because its syntax is complex and constantly evolving.
  • The parser needs to handle both expressions and statements, with statements being particularly difficult to parse.
  • The current parser API parses the input only as a whole and returns an error if it contains any syntax error, so a single invalid statement prevents the valid statements around it from being parsed.
  • To implement the parser, we need to work with the AST and a stream of tokens.
  • The LL parser we use is simple and relies on distinct keywords to determine the start of each statement.
  • Ambiguity in statement identification is acceptable for our use case, as we only need to know whether a statement is present, not its exact type.
  • The parser should generate a concrete syntax tree (CST) with proper ranges for each node in the AST.
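
To make the keyword-based approach more concrete, here is a minimal sketch in Python of how a statement splitter along these lines could work. It is not the language server's actual implementation: the tokenizer, the `STATEMENT_START_KEYWORDS` set, the `Statement` type, and the `split_statements` function are all hypothetical names invented for this illustration, and the keyword list is deliberately incomplete. The sketch only decides where a statement starts and ends and records its source range, so each statement can later be parsed on its own and a syntax error in one does not affect the others.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately incomplete set of keywords that may start a statement.
STATEMENT_START_KEYWORDS = {"select", "insert", "update", "delete", "create", "alter", "drop"}

# A tiny tokenizer: string literals are matched as one token so that a
# semicolon inside quotes is not mistaken for a statement terminator.
TOKEN_RE = re.compile(
    r"""
    (?P<string>'(?:[^']|'')*')            # single-quoted string literal
  | (?P<word>[A-Za-z_][A-Za-z_0-9]*)      # keyword or identifier
  | (?P<semicolon>;)                      # statement terminator
  | (?P<other>\S)                         # any other non-whitespace character
    """,
    re.VERBOSE,
)


@dataclass
class Statement:
    start: int  # offset of the first character of the statement
    end: int    # offset one past the last character (exclusive)
    text: str   # the raw source text covered by the range


def tokenize(source):
    """Yield (kind, text, start, end) for every token; whitespace is skipped."""
    for match in TOKEN_RE.finditer(source):
        yield match.lastgroup, match.group(), match.start(), match.end()


def split_statements(source):
    """Split source into statements without identifying their exact type."""
    statements = []
    start = None     # start offset of the statement currently being read
    last_end = None  # end offset of the last token seen in that statement
    for kind, text, tok_start, tok_end in tokenize(source):
        if start is None:
            # Outside a statement: only a known start keyword opens one.
            if kind == "word" and text.lower() in STATEMENT_START_KEYWORDS:
                start = tok_start
            else:
                continue
        last_end = tok_end
        if kind == "semicolon":
            statements.append(Statement(start, tok_end, source[start:tok_end]))
            start = None
    if start is not None:
        # Input ended without a terminating semicolon.
        statements.append(Statement(start, last_end, source[start:last_end]))
    return statements


if __name__ == "__main__":
    sql = "select 1;\nselect from where;\ninsert into t values ('a;b');"
    for stmt in split_statements(sql):
        print(stmt.start, stmt.end, repr(stmt.text))
```

In this sketch the second statement, `select from where;`, is syntactically invalid, yet it is still isolated with its own range. A real parser could then be run on each range separately and report the error for that statement alone. Because a start keyword is only recognised when no statement is open, the `select` in `insert into t select ...` would not open a new statement, which is the kind of ambiguity the approach tolerates.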