Introduction to the Clang AST (updating)

Source: Internet  Editor: 程序博客网  Time: 2024/06/11 21:55

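Introduction to the Clang AST

This document gives a gentle introduction to the mysteries of the Clang AST. It is aimed at developers who want to contribute to Clang, or who use tools built on Clang's AST, such as the AST matchers.

Slides (personally I think they are very good and worth studying closely)

Introduction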

Clang’s AST is different from ASTs produced by some other compilers in that it closely resembles both the written C++ code and the C++ standard. For example, parenthesis expressions and compile time constants are available in an unreduced form in the AST. This makes Clang’s AST a good fit for refactoring tools.

Documentation for all Clang AST nodes is available via the generated Doxygen. The doxygen online documentation is also indexed by your favorite search engine, which will make a search for clang and the AST node’s class name usually turn up the doxygen of the class you’re looking for (for example, search for: clang ParenExpr).
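To make the point about unreduced parentheses concrete, here is a minimal sketch of a toy expression tree. The class names are borrowed from Clang for readability, but these are hypothetical stand-ins written for this note, not Clang's real classes or API. Because the parentheses are kept as their own node, printing the tree gives back the expression as it was written, which is the property a refactoring tool relies on.

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy expression nodes (hypothetical; NOT Clang's real classes).
struct Expr {
  virtual ~Expr() = default;
  virtual std::string print() const = 0;
};

struct IntegerLiteral : Expr {
  int value;
  explicit IntegerLiteral(int v) : value(v) {}
  std::string print() const override { return std::to_string(value); }
};

struct DeclRefExpr : Expr {
  std::string name;
  explicit DeclRefExpr(std::string n) : name(std::move(n)) {}
  std::string print() const override { return name; }
};

struct BinaryOperator : Expr {
  char op;
  std::unique_ptr<Expr> lhs, rhs;
  BinaryOperator(char o, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
      : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
  std::string print() const override {
    return lhs->print() + " " + op + " " + rhs->print();
  }
};

// The parentheses are a node of their own instead of being dropped.
struct ParenExpr : Expr {
  std::unique_ptr<Expr> sub;
  explicit ParenExpr(std::unique_ptr<Expr> s) : sub(std::move(s)) {}
  std::string print() const override { return "(" + sub->print() + ")"; }
};

// Builds the tree for the written expression: (x / 42)
std::unique_ptr<Expr> buildExample() {
  return std::make_unique<ParenExpr>(std::make_unique<BinaryOperator>(
      '/', std::make_unique<DeclRefExpr>("x"),
      std::make_unique<IntegerLiteral>(42)));
}
```

Calling buildExample()->print() returns the string "(x / 42)". A tree that had discarded the ParenExpr node could only print "x / 42", and the original spelling would be lost.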

Examining the AST

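A good way to get familiar with the Clang AST is to look at the AST produced for some simple example code. Clang has a builtin AST-dump mode, which can be enabled with the flag -ast-dump.

Let's look at a simple example AST: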

    $ cat test.cc
    int f(int x) {
      int result = (x / 42);
      return result;
    }

    # Clang by default is a frontend for many tools; -Xclang is used to pass
    # options directly to the C++ frontend.
    $ clang -Xclang -ast-dump -fsyntax-only test.cc
    TranslationUnitDecl 0x5aea0d0 <<invalid sloc>>
    ... cutting out internal declarations of clang ...
    `-FunctionDecl 0x5aeab50 <test.cc:1:1, line:4:1> f 'int (int)'
      |-ParmVarDecl 0x5aeaa90 <line:1:7, col:11> x 'int'
      `-CompoundStmt 0x5aead88 <col:14, line:4:1>
        |-DeclStmt 0x5aead10 <line:2:3, col:24>
        | `-VarDecl 0x5aeac10 <col:3, col:23> result 'int'
        |   `-ParenExpr 0x5aeacf0 <col:16, col:23> 'int'
        |     `-BinaryOperator 0x5aeacc8 <col:17, col:21> 'int' '/'
        |       |-ImplicitCastExpr 0x5aeacb0 <col:17> 'int' <LValueToRValue>
        |       | `-DeclRefExpr 0x5aeac68 <col:17> 'int' lvalue ParmVar 0x5aeaa90 'x' 'int'
        |       `-IntegerLiteral 0x5aeac90 <col:21> 'int' 42
        `-ReturnStmt 0x5aead68 <line:3:3, col:10>
          `-ImplicitCastExpr 0x5aead50 <col:10> 'int' <LValueToRValue>
            `-DeclRefExpr 0x5aead28 <col:10> 'int' lvalue Var 0x5aeac10 'result' 'int'

(Below is the complete output I got when compiling and running this on my own machine.)

    TranslationUnitDecl 0x582af70 <<invalid sloc>>
    |-TypedefDecl 0x582b470 <<invalid sloc>> __int128_t '__int128'
    |-TypedefDecl 0x582b4d0 <<invalid sloc>> __uint128_t 'unsigned __int128'
    |-TypedefDecl 0x582b820 <<invalid sloc>> __builtin_va_list '__va_list_tag [1]'
    `-FunctionDecl 0x582b940 <test.c:1:1, line:5:1> f 'int (int)'
      |-ParmVarDecl 0x582b880 <line:1:7, col:11> x 'int'
      `-CompoundStmt 0x582bb78 <line:2:1, line:5:1>
        |-DeclStmt 0x582bb00 <line:3:3, col:22>
        | `-VarDecl 0x582ba00 <col:3, col:21> result 'int'
        |   `-ParenExpr 0x582bae0 <col:16, col:21> 'int'
        |     `-BinaryOperator 0x582bab8 <col:17, col:19> 'int' '/'
        |       |-ImplicitCastExpr 0x582baa0 <col:17> 'int' <LValueToRValue>
        |       | `-DeclRefExpr 0x582ba58 <col:17> 'int' lvalue ParmVar 0x582b880 'x' 'int'
        |       `-IntegerLiteral 0x582ba80 <col:19> 'int' 42
        `-ReturnStmt 0x582bb58 <line:4:3, col:10>
          `-ImplicitCastExpr 0x582bb40 <col:10> 'int' <LValueToRValue>
            `-DeclRefExpr 0x582bb18 <col:10> 'int' lvalue Var 0x582ba00 'result' 'int'

The toplevel declaration in a translation unit is always the translation unit declaration. In this example, our first user written declaration is the function declaration of “f”. The body of “f” is a compound statement, whose child nodes are a declaration statement that declares our result variable, and the return statement.
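When reading such a dump, it helps to see how the tree prefixes come about: a child that has later siblings is introduced with |-, the last child with `-, and the indentation under a non-last child carries a | column. The following is only a sketch of that layout rule over a made-up DumpNode type (a hypothetical name); it is not how Clang implements -ast-dump, and it omits addresses, source ranges and types.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy node: a label plus child nodes (not a Clang type).
struct DumpNode {
  std::string label;
  std::vector<DumpNode> children;
};

// Appends one line per child, choosing the prefix by sibling position.
void dumpChildren(const DumpNode& n, const std::string& prefix,
                  std::string& out) {
  for (std::size_t i = 0; i < n.children.size(); ++i) {
    const bool last = (i + 1 == n.children.size());
    out += prefix + (last ? "`-" : "|-") + n.children[i].label + "\n";
    // Below a non-last child the vertical bar continues; below the last
    // child it is replaced by spaces.
    dumpChildren(n.children[i], prefix + (last ? "  " : "| "), out);
  }
}

// Renders the whole tree; the root itself is printed without a prefix.
std::string dump(const DumpNode& root) {
  std::string out = root.label + "\n";
  dumpChildren(root, "", out);
  return out;
}
```

Feeding it a tree shaped like the function “f” (a parameter, then a compound statement holding a declaration statement and a return statement) yields the same nesting pattern as the dumps shown in this section.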

AST Context

All information about the AST for a translation unit is bundled up in the class ASTContext. It allows traversal of the whole translation unit starting from getTranslationUnitDecl, or to access Clang’s table of identifiers for the parsed translation unit.

AST Nodes

Clang’s AST nodes are modeled on a class hierarchy that does not have a common ancestor. Instead, there are multiple larger hierarchies for basic node types like Decl and Stmt. Many important AST nodes derive from Type, Decl, DeclContext or Stmt, with some classes deriving from both Decl and DeclContext.

There are also a multitude of nodes in the AST that are not part of a larger hierarchy, and are only reachable from specific other nodes, like CXXBaseSpecifier.
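The shape of these hierarchies can be sketched with a few empty classes. This is a deliberately simplified toy model: the names mirror Clang's, but the real classes have intermediate bases and many members, and Clang uses its own cast helpers rather than dynamic_cast. The helper isAlsoDeclContext is a hypothetical name introduced only for this illustration.

```cpp
#include <type_traits>

// Toy roots (hypothetical stand-ins): four separate hierarchies,
// with no shared base class above them.
struct Type { virtual ~Type() = default; };
struct Decl { virtual ~Decl() = default; };
struct DeclContext { virtual ~DeclContext() = default; };
struct Stmt { virtual ~Stmt() = default; };

struct Expr : Stmt {};                       // expressions are statements
struct VarDecl : Decl {};                    // a plain declaration
struct FunctionDecl : Decl, DeclContext {};  // both a Decl and a DeclContext

static_assert(std::is_base_of<Stmt, Expr>::value, "Expr is a Stmt");
static_assert(!std::is_base_of<Decl, Stmt>::value, "Stmt is not a Decl");
static_assert(!std::is_base_of<Stmt, Decl>::value, "Decl is not a Stmt");
static_assert(std::is_base_of<DeclContext, FunctionDecl>::value,
              "FunctionDecl is also a DeclContext");

// Cross-cast from the Decl side to the DeclContext side of an object.
bool isAlsoDeclContext(const Decl& d) {
  return dynamic_cast<const DeclContext*>(&d) != nullptr;
}
```

In this model isAlsoDeclContext returns true for a FunctionDecl and false for a VarDecl, which mirrors the remark that only some classes derive from both Decl and DeclContext.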

Thus, to traverse the full AST, one starts from the TranslationUnitDecl and then recursively traverses everything that can be reached from that node - this information has to be encoded for each specific node type. This algorithm is encoded in the RecursiveASTVisitor. See the RecursiveASTVisitor tutorial.

The two most basic nodes in the Clang AST are statements (Stmt) and declarations (Decl). Note that expressions (Expr) are also statements in Clang’s AST.
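The traversal idea itself is small enough to sketch on a toy tree. The code below is not the RecursiveASTVisitor API; AstNode, traverse and collectKinds are hypothetical names, and every node here stores its children uniformly, whereas in Clang the set of reachable children differs per node type. It only shows the pre-order pattern: visit a node, then recurse into everything reachable from it.

```cpp
#include <string>
#include <vector>

// Hypothetical toy node (not a Clang type).
struct AstNode {
  std::string kind;
  std::vector<AstNode> children;
};

// Pre-order traversal: call the visitor on the node, then on each subtree.
template <typename Visitor>
void traverse(const AstNode& n, Visitor&& visit) {
  visit(n);
  for (const AstNode& child : n.children) {
    traverse(child, visit);
  }
}

// Example visitor: record the kind of every node in visiting order.
std::vector<std::string> collectKinds(const AstNode& root) {
  std::vector<std::string> kinds;
  traverse(root, [&kinds](const AstNode& n) { kinds.push_back(n.kind); });
  return kinds;
}
```

Starting from a node labelled TranslationUnitDecl, collectKinds lists the root first and then each subtree in order, the same top-down order in which the dump output is laid out.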

*Please credit the source when reposting.