Green Tree Snakes - the missing Python AST docs¶
Abstract Syntax Trees, ASTs, are a powerful feature of Python. You can write programs that inspect and modify Python code, after the syntax has been parsed, but before it gets compiled to byte code. That opens up a world of possibilities for introspection, testing, and mischief.
The official documentation for the ast module is good, but somewhat brief. Green Tree Snakes is more like a field guide (or should that be forest guide?) for working with ASTs. To contribute to the guide, see the source repository.
Contents:
Getting to and from ASTs¶
To build an ast from code stored as a string, use ast.parse(). To turn the
ast into executable code, pass it to compile() (which can also compile a
string directly).
>>> tree = ast.parse("print('hello world')")
>>> tree
<_ast.Module object at 0x9e3df6c>
>>> exec(compile(tree, filename="<ast>", mode="exec"))
hello world
Modes¶
Python code can be compiled in three modes. The root of the AST depends on the
mode parameter you pass to ast.parse(), and it must correspond to the
mode parameter when you call compile().
- exec - Normal Python code is run with
mode='exec'. The root of the AST is aast.Module, whosebodyattribute is a list of nodes. - eval - Single expressions are compiled with
mode='eval', and passing them toeval()will return their result. The root of the AST is anast.Expression, and itsbodyattribute is a single node, such asast.Callorast.BinOp. This is different fromast.Expr, which holds an expression within an AST. - single - Single statements or expressions can be compiled with
mode='single'. If it’s an expression,sys.displayhook()will be called with the result, like when code is run in the interactive shell. The root of the AST is anast.Interactive, and itsbodyattribute is a list of nodes.
Note
The type_comment and ignore_types fields introduced in Python 3.8
are only populated if ast.parse() is called with type_comment=True.
Fixing locations¶
To compile an AST, every node must have lineno and col_offset attributes.
Nodes produced by parsing regular code already have these, but nodes you create
programmatically don’t. There are a few helper functions for this:
ast.fix_missing_locations()recursively fills in any missing locations by copying from the parent node. The rough and ready answer.ast.copy_location()copieslinenoandcol_offsetfrom one node to another. Useful when you’re replacing a node.ast.increment_lineno()increaseslinenofor a node and its children, pushing them further down a file.
Going backwards¶
Python itself doesn’t provide a way to turn a compiled code object into an AST, or an AST into a string of code. Some third party tools can do these things:
- astor can convert an AST back to readable Python code.
- Meta also tries to decompile Python bytecode to an AST, but it appears to be unmaintained.
- uncompyle6 is an actively maintained Python decompiler at the time of writing. Its documented interface is a command line program producing Python source code.
Meet the Nodes¶
An AST represents each element in your code as an object. These are instances of
the various subclasses of AST described below. For instance, the code
a + 1 is a BinOp, with a Name on the left, a Num
on the right, and an Add operator.
Literals¶
-
class
Constant(value, kind)¶ New in version 3.6.
A constant. The
valueattribute holds the Python object it represents. This can be simple types such as a number, string orNone, but also immutable container types (tuples and frozensets) if all of their elements are constant.kindis'u'for strings with auprefix, andNoneotherwise, allowing tools to distinguishu"a"from"a".This class is available in the
astmodule from Python 3.6, but it isn’t produced by parsing code until Python 3.8.Changed in version 3.8: The
kindfield was added.
-
class
Num(n)¶ Deprecated since version 3.8: Replaced by
ConstantA number - integer, float, or complex. The
nattribute stores the value, already converted to the relevant type.
-
class
Str(s)¶ Deprecated since version 3.8: Replaced by
ConstantA string. The
sattribute hold the value. In Python 2, the same type holds unicode strings too.
-
class
FormattedValue(value, conversion, format_spec)¶ New in version 3.6.
Node representing a single formatting field in an f-string. If the string contains a single formatting field and nothing else the node can be isolated otherwise it appears in
JoinedStr.valueis any expression node (such as a literal, a variable, or a function call).conversionis an integer:- -1: no formatting
- 115:
!sstring formatting - 114:
!rrepr formatting - 97:
!aascii formatting
format_specis aJoinedStrnode representing the formatting of the value, orNoneif no format was specified. Bothconversionandformat_speccan be set at the same time.
-
class
JoinedStr(values)¶ New in version 3.6.
An f-string, comprising a series of
FormattedValueandStrnodes.>>> parseprint('f"sin({a}) is {sin(a):.3}"') Module(body=[ Expr(value=JoinedStr(values=[ Str(s='sin('), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None), Str(s=') is '), FormattedValue(value=Call(func=Name(id='sin', ctx=Load()), args=[ Name(id='a', ctx=Load()), ], keywords=[]), conversion=-1, format_spec=JoinedStr(values=[ Str(s='.3'), ])), ])), ])
Note
The pretty-printer used in these examples is available in the source repository for Green Tree Snakes.
-
class
Bytes(s)¶ Deprecated since version 3.8: Replaced by
ConstantA
bytesobject. Thesattribute holds the value. Python 3 only.
-
class
List(elts, ctx)¶ -
class
Tuple(elts, ctx)¶ A list or tuple.
eltsholds a list of nodes representing the elements.ctxisStoreif the container is an assignment target (i.e.(x,y)=pt), andLoadotherwise.
-
class
Set(elts)¶ A set.
eltsholds a list of nodes representing the elements.
-
class
Dict(keys, values)¶ A dictionary.
keysandvalueshold lists of nodes with matching order (i.e. they could be paired withzip()).Changed in version 3.5: It is now possible to expand one dictionary into another, as in
{'a': 1, **d}. In the AST, the expression to be expanded (aNamenode in this example) goes in thevalueslist, with aNoneat the corresponding position inkeys.
Variables¶
-
class
Name(id, ctx)¶ A variable name.
idholds the name as a string, andctxis one of the following types.
-
class
Load¶ -
class
Store¶ -
class
Del¶ Variable references can be used to load the value of a variable, to assign a new value to it, or to delete it. Variable references are given a context to distinguish these cases.
>>> parseprint("a") # Loading a
Module(body=[
Expr(value=Name(id='a', ctx=Load())),
])
>>> parseprint("a = 1") # Storing a
Module(body=[
Assign(targets=[
Name(id='a', ctx=Store()),
], value=Num(n=1)),
])
>>> parseprint("del a") # Deleting a
Module(body=[
Delete(targets=[
Name(id='a', ctx=Del()),
]),
])
-
class
Starred(value, ctx)¶ A
*varvariable reference.valueholds the variable, typically aNamenode.Note that this isn’t used to define a function with
*args-FunctionDefnodes have special fields for that. In Python 3.5 and above, though,Starredis needed when building aCallnode with*args.
>>> parseprint("a, *b = it")
Module(body=[
Assign(targets=[
Tuple(elts=[
Name(id='a', ctx=Store()),
Starred(value=Name(id='b', ctx=Store()), ctx=Store()),
], ctx=Store()),
], value=Name(id='it', ctx=Load())),
])
Expressions¶
-
class
Expr(value)¶ When an expression, such as a function call, appears as a statement by itself (an expression statement), with its return value not used or stored, it is wrapped in this container.
valueholds one of the other nodes in this section, or a literal, aName, aLambda, or aYieldorYieldFromnode.
>>> parseprint('-a')
Module(body=[
Expr(value=UnaryOp(op=USub(), operand=Name(id='a', ctx=Load()))),
])
-
class
NamedExpr(target, value)¶ New in version 3.8.
Used to bind an expression to a name using the walrus operator
:=.targetholds aNamewhich is the name the expression is bound to. Note that thectxof theNameshould be set toStore.valueis any node valid as thevalueofExpr.
>>> parseprint("b = (a := 1)")
Module(body=[
Assign(targets=[
Name(id='b', ctx=Store()),
], value=NamedExpr(target=Name(id='a', ctx=Store()), value=Constant(value=1, kind=None)), type_comment=None),
], type_ignores=[])
-
class
UnaryOp(op, operand)¶ A unary operation.
opis the operator, andoperandany expression node.
-
class
UAdd¶ -
class
USub¶ -
class
Not¶ -
class
Invert¶ Unary operator tokens.
Notis thenotkeyword,Invertis the~operator.
-
class
BinOp(left, op, right)¶ A binary operation (like addition or division).
opis the operator, andleftandrightare any expression nodes.
-
class
Add¶ -
class
Sub¶ -
class
Mult¶ -
class
Div¶ -
class
FloorDiv¶ -
class
Mod¶ -
class
Pow¶ -
class
LShift¶ -
class
RShift¶ -
class
BitOr¶ -
class
BitXor¶ -
class
BitAnd¶ -
class
MatMult¶ Binary operator tokens.
New in version 3.5:
MatMult- the@operator for matrix multiplication.
-
class
BoolOp(op, values)¶ A boolean operation, ‘or’ or ‘and’.
opisOrorAnd.valuesare the values involved. Consecutive operations with the same operator, such asa or b or c, are collapsed into one node with several values.This doesn’t include
not, which is aUnaryOp.
-
class
Compare(left, ops, comparators)¶ A comparison of two or more values.
leftis the first value in the comparison,opsthe list of operators, andcomparatorsthe list of values after the first. If that sounds awkward, that’s because it is:>>> parseprint("1 < a < 10") Module(body=[ Expr(value=Compare(left=Num(n=1), ops=[ Lt(), Lt(), ], comparators=[ Name(id='a', ctx=Load()), Num(n=10), ])), ])
-
class
Eq¶ -
class
NotEq¶ -
class
Lt¶ -
class
LtE¶ -
class
Gt¶ -
class
GtE¶ -
class
Is¶ -
class
IsNot¶ -
class
In¶ -
class
NotIn¶ Comparison operator tokens.
-
class
Call(func, args, keywords, starargs, kwargs)¶ A function call.
funcis the function, which will often be aNameorAttributeobject. Of the arguments:argsholds a list of the arguments passed by position.keywordsholds a list ofkeywordobjects representing arguments passed by keyword.starargsandkwargseach hold a single node, for arguments passed as*argsand**kwargs. These are removed in Python 3.5 - see below for details.
When compiling a Call node,
argsandkeywordsare required, but they can be empty lists.starargsandkwargsare optional.>>> parseprint("func(a, b=c, *d, **e)") # Python 3.4 Module(body=[ Expr(value=Call(func=Name(id='func', ctx=Load()), args=[Name(id='a', ctx=Load())], keywords=[keyword(arg='b', value=Name(id='c', ctx=Load()))], starargs=Name(id='d', ctx=Load()), # gone in 3.5 kwargs=Name(id='e', ctx=Load()))), # gone in 3.5 ]) >>> parseprint("func(a, b=c, *d, **e)") # Python 3.5 Module(body=[ Expr(value=Call(func=Name(id='func', ctx=Load()), args=[ Name(id='a', ctx=Load()), Starred(value=Name(id='d', ctx=Load()), ctx=Load()) # new in 3.5 ], keywords=[ keyword(arg='b', value=Name(id='c', ctx=Load())), keyword(arg=None, value=Name(id='e', ctx=Load())) # new in 3.5 ])) ])
You can see here that the signature of
Callhas changed in Python 3.5. Instead ofstarargs,Starrednodes can now appear inargs, andkwargsis replaced bykeywordnodes inkeywordsfor whichargisNone.
-
class
keyword(arg, value)¶ A keyword argument to a function call or class definition.
argis a raw string of the parameter name,valueis a node to pass in.
-
class
IfExp(test, body, orelse)¶ An expression such as
a if b else c. Each field holds a single node, so in that example, all three areNamenodes.
-
class
Attribute(value, attr, ctx)¶ Attribute access, e.g.
d.keys.valueis a node, typically aName.attris a bare string giving the name of the attribute, andctxisLoad,StoreorDelaccording to how the attribute is acted on.>>> parseprint('snake.colour') Module(body=[ Expr(value=Attribute(value=Name(id='snake', ctx=Load()), attr='colour', ctx=Load())), ])
Subscripting¶
-
class
Subscript(value, slice, ctx)¶ A subscript, such as
l[1].valueis the object, often aName.sliceis one ofIndex,SliceorExtSlice.ctxisLoad,StoreorDelaccording to what it does with the subscript.
-
class
Index(value)¶ Simple subscripting with a single value:
>>> parseprint("l[1]") Module(body=[ Expr(value=Subscript(value=Name(id='l', ctx=Load()), slice=Index(value=Num(n=1)), ctx=Load())), ])
-
class
Slice(lower, upper, step)¶ Regular slicing:
>>> parseprint("l[1:2]") Module(body=[ Expr(value=Subscript(value=Name(id='l', ctx=Load()), slice=Slice(lower=Num(n=1), upper=Num(n=2), step=None), ctx=Load())), ])
Comprehensions¶
-
class
ListComp(elt, generators)¶ -
class
SetComp(elt, generators)¶ -
class
GeneratorExp(elt, generators)¶ -
class
DictComp(key, value, generators)¶ List and set comprehensions, generator expressions, and dictionary comprehensions.
elt(orkeyandvalue) is a single node representing the part that will be evaluated for each item.generatorsis a list ofcomprehensionnodes. Comprehensions with more than oneforpart are legal, if tricky to get right - see the example below.
-
class
comprehension(target, iter, ifs, is_async)¶ One
forclause in a comprehension.targetis the reference to use for each element - typically aNameorTuplenode.iteris the object to iterate over.ifsis a list of test expressions: eachforclause can have multipleifs.New in version 3.6:
is_asyncindicates a comprehension is asynchronous (using anasync forinstead offor). The value is an integer (0 or 1).
>>> parseprint("[ord(c) for line in file for c in line]", mode='eval') # Multiple comprehensions in one.
Expression(body=ListComp(elt=Call(func=Name(id='ord', ctx=Load()), args=[
Name(id='c', ctx=Load()),
], keywords=[], starargs=None, kwargs=None), generators=[
comprehension(target=Name(id='line', ctx=Store()), iter=Name(id='file', ctx=Load()), ifs=[], is_async=0),
comprehension(target=Name(id='c', ctx=Store()), iter=Name(id='line', ctx=Load()), ifs=[], is_async=0),
]))
>>> parseprint("(n**2 for n in it if n>5 if n<10)", mode='eval') # Multiple if clauses
Expression(body=GeneratorExp(elt=BinOp(left=Name(id='n', ctx=Load()), op=Pow(), right=Num(n=2)), generators=[
comprehension(target=Name(id='n', ctx=Store()), iter=Name(id='it', ctx=Load()), ifs=[
Compare(left=Name(id='n', ctx=Load()), ops=[
Gt(),
], comparators=[
Num(n=5),
]),
Compare(left=Name(id='n', ctx=Load()), ops=[
Lt(),
], comparators=[
Num(n=10),
]),
],
is_async=0),
]))
>>> parseprint(("async def f():"
" return [i async for i in soc]")) # Async comprehension.
Module(body=[
AsyncFunctionDef(name='f', args=arguments(args=[], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[
Return(value=ListComp(elt=Name(id='i', ctx=Load()), generators=[
comprehension(target=Name(id='i', ctx=Store()), iter=Name(id='soc', ctx=Load()), ifs=[], is_async=1),
])),
], decorator_list=[], returns=None),
])
Statements¶
-
class
Assign(targets, value, type_comment)¶ An assignment.
targetsis a list of nodes, andvalueis a single node.type_commentis optional. It is a string containing the PEP 484 type comment associated to the assignment.>>> parseprint("a = 1 # type: int", type_comments=True) Module(body=[ Assign(targets=[ Name(id='a', ctx=Store()), ], value=Num(n=1)), type_comment="int" ], type_ignores=[])
Multiple nodes in
targetsrepresents assigning the same value to each. Unpacking is represented by putting aTupleorListwithintargets.>>> parseprint("a = b = 1") # Multiple assignment Module(body=[ Assign(targets=[ Name(id='a', ctx=Store()), Name(id='b', ctx=Store()), ], value=Num(n=1)), ])
>>> parseprint("a,b = c") # Unpacking Module(body=[ Assign(targets=[ Tuple(elts=[ Name(id='a', ctx=Store()), Name(id='b', ctx=Store()), ], ctx=Store()), ], value=Name(id='c', ctx=Load())), ])
-
class
AnnAssign(target, annotation, value, simple)¶ New in version 3.6.
An assignment with a type annotation.
targetis a single node and can be aName, aAttributeor aSubscript.annotationis the annotation, such as aStrorNamenode.valueis a single optional node.simpleis a boolean integer set to True for aNamenode intargetthat do not appear in between parenthesis and are hence pure names and not expressions.>>> parseprint("c: int") Module(body=[ AnnAssign(target=Name(id='c', ctx=Store()), annotation=Name(id='int', ctx=Load()), value=None, simple=1), ])
>>> parseprint("(a): int = 1") # Expression like name Module(body=[ AnnAssign(target=Name(id='a', ctx=Store()), annotation=Name(id='int', ctx=Load()), value=Num(n=1), simple=0), ])
>>> parseprint("a.b: int") # Attribute annotation Module(body=[ AnnAssign(target=Attribute(value=Name(id='a', ctx=Load()), attr='b', ctx=Store()), annotation=Name(id='int', ctx=Load()), value=None, simple=0), ])
>>> parseprint("a[1]: int") # Subscript annotation Module(body=[ AnnAssign(target=Subscript(value=Name(id='a', ctx=Load()), slice=Index(value=Num(n=1)), ctx=Store()), annotation=Name(id='int', ctx=Load()), value=None, simple=0), ])
Changed in version 3.8:
type_commentwas introduced in Python 3.8
-
class
AugAssign(target, op, value)¶ Augmented assignment, such as
a += 1. In that example,targetis aNamenode fora(with theStorecontext), op isAdd, andvalueis aNumnode for 1.targetcan beName,SubscriptorAttribute, but not aTupleorList(unlike the targets ofAssign).
-
class
Print(dest, values, nl)¶ Print statement, Python 2 only.
destis an optional destination (forprint >>dest.valuesis a list of nodes.nl(newline) is True or False depending on whether there’s a comma at the end of the statement.
-
class
Raise(exc, cause)¶ Raising an exception, Python 3 syntax.
excis the exception object to be raised, normally aCallorName, orNonefor a standaloneraise.causeis the optional part foryinraise x from y.In Python 2, the parameters are instead
type, inst, tback, which correspond to the oldraise x, y, zsyntax.
-
class
Assert(test, msg)¶ An assertion.
testholds the condition, such as aComparenode.msgholds the failure message, normally aStrnode.
-
class
Delete(targets)¶ Represents a
delstatement.targetsis a list of nodes, such asName,AttributeorSubscriptnodes.
-
class
Pass¶ A
passstatement.
Other statements which are only applicable inside functions or loops are described in other sections.
Imports¶
-
class
ImportFrom(module, names, level)¶ Represents
from x import y.moduleis a raw string of the ‘from’ name, without any leading dots, orNonefor statements such asfrom . import foo.levelis an integer holding the level of the relative import (0 means absolute import).
-
class
alias(name, asname)¶ Both parameters are raw strings of the names.
asnamecan beNoneif the regular name is to be used.
>>> parseprint("from ..foo.bar import a as b, c")
Module(body=[
ImportFrom(module='foo.bar', names=[
alias(name='a', asname='b'),
alias(name='c', asname=None),
], level=2),
])
Control flow¶
Note
Optional clauses such as else are stored as an empty list if they’re
not present.
-
class
If(test, body, orelse)¶ An
ifstatement.testholds a single node, such as aComparenode.bodyandorelseeach hold a list of nodes.elifclauses don’t have a special representation in the AST, but rather appear as extraIfnodes within theorelsesection of the previous one.
-
class
For(target, iter, body, orelse, type_comment)¶ A
forloop.targetholds the variable(s) the loop assigns to, as a singleName,TupleorListnode.iterholds the item to be looped over, again as a single node.bodyandorelsecontain lists of nodes to execute. Those inorelseare executed if the loop finishes normally, rather than via abreakstatement.type_commentis optional. It is a string containing the PEP 484 type comment associated to for statement.Changed in version 3.8:
type_commentwas introduced in Python 3.8
In [2]: %%dump_ast
...: for a in b:
...: if a > 5:
...: break
...: else:
...: continue
...:
Module(body=[
For(target=Name(id='a', ctx=Store()), iter=Name(id='b', ctx=Load()), body=[
If(test=Compare(left=Name(id='a', ctx=Load()), ops=[
Gt(),
], comparators=[
Num(n=5),
]), body=[
Break(),
], orelse=[
Continue(),
]),
], orelse=[]),
])
-
class
Try(body, handlers, orelse, finalbody)¶ tryblocks. All attributes are list of nodes to execute, except forhandlers, which is a list ofExceptHandlernodes.New in version 3.3.
-
class
TryFinally(body, finalbody)¶ -
class
TryExcept(body, handlers, orelse)¶ tryblocks up to Python 3.2, inclusive. Atryblock with bothexceptandfinallyclauses is parsed as aTryFinally, with the body containing aTryExcept.
-
class
ExceptHandler(type, name, body)¶ A single
exceptclause.typeis the exception type it will match, typically aNamenode (orNonefor a catch-allexcept:clause).nameis a raw string for the name to hold the exception, orNoneif the clause doesn’t haveas foo.bodyis a list of nodes.In Python 2,
namewas aNamenode withctx=Store(), instead of a raw string.
In [3]: %%dump_ast
...: try:
...: a + 1
...: except TypeError:
...: pass
...:
Module(body=[
Try(body=[
Expr(value=BinOp(left=Name(id='a', ctx=Load()), op=Add(), right=Num(n=1))),
], handlers=[
ExceptHandler(type=Name(id='TypeError', ctx=Load()), name=None, body=[
Pass(),
]),
], orelse=[], finalbody=[]),
])
-
class
With(items, body, type_comment)¶ A
withblock.itemsis a list ofwithitemnodes representing the context managers, andbodyis the indented block inside the context.type_commentis optional. It is a string containing the PEP 484 type comment associated to the assignment (added in Python 3.8).Changed in version 3.3: Previously, a
Withnode hadcontext_exprandoptional_varsinstead ofitems. Multiple contexts were represented by nesting a secondWithnode as the only item in thebodyof the first.Changed in version 3.8:
type_commentwas introduced in Python 3.8
-
class
withitem(context_expr, optional_vars)¶ A single context manager in a
withblock.context_expris the context manager, often aCallnode.optional_varsis aName,TupleorListfor theas foopart, orNoneif that isn’t used.
In [3]: %%dump_ast
...: with a as b, c as d:
...: do_things(b, d)
...:
Module(body=[
With(items=[
withitem(context_expr=Name(id='a', ctx=Load()), optional_vars=Name(id='b', ctx=Store())),
withitem(context_expr=Name(id='c', ctx=Load()), optional_vars=Name(id='d', ctx=Store())),
], body=[
Expr(value=Call(func=Name(id='do_things', ctx=Load()), args=[
Name(id='b', ctx=Load()),
Name(id='d', ctx=Load()),
], keywords=[], starargs=None, kwargs=None)),
]),
])
Function and class definitions¶
-
class
FunctionDef(name, args, body, decorator_list, returns, type_comment)¶ A function definition.
nameis a raw string of the function name.argsis aargumentsnode.bodyis the list of nodes inside the function.decorator_listis the list of decorators to be applied, stored outermost first (i.e. the first in the list will be applied last).returnsis the return annotation (Python 3 only).type_commentis optional. It is a string containing the PEP 484 type comment of the function (added in Python 3.8)
Changed in version 3.8:
type_commentwas introduced in Python 3.8
-
class
Lambda(args, body)¶ lambdais a minimal function definition that can be used inside an expression. UnlikeFunctionDef,bodyholds a single node.
-
class
arguments(posonlyargs, args, vararg, kwonlyargs, kw_defaults, kwarg, defaults)¶ The arguments for a function. In Python 3:
args,posonlyargsandkwonlyargsare lists ofargnodes.varargandkwargare singleargnodes, referring to the*args, **kwargsparameters.kw_defaultsis a list of default values for keyword-only arguments. If one isNone, the corresponding argument is required.defaultsis a list of default values for arguments that can be passed positionally. If there are fewer defaults, they correspond to the last n arguments.
Changed in version 3.8:
posonlyargswas introduced in Python 3.8Changed in version 3.4: Up to Python 3.3,
varargandkwargwere raw strings of the argument names, and there were separatevarargannotationandkwargannotationfields to hold their annotations.Also, the order of the remaining parameters was different up to Python 3.3.
In Python 2, the attributes for keyword-only arguments are not needed.
-
class
arg(arg, annotation, type_comment)¶ A single argument in a list; Python 3 only.
argis a raw string of the argument name,annotationis its annotation, such as aStrorNamenode.type_commentis optional. It is a string containing the PEP 484 type comment of the argument.In Python 2, arguments are instead represented as
Namenodes, withctx=Param().
In [52]: %%dump_ast
....: @dec1
....: @dec2
....: def f(a: 'annotation', b=1, c=2, *d, e, f=3, **g) -> 'return annotation':
....: pass
....:
Module(body=[
FunctionDef(name='f', args=arguments(posonlyargs=[],
args=[
arg(arg='a', annotation=Str(s='annotation')),
arg(arg='b', annotation=None),
arg(arg='c', annotation=None),
], vararg=arg(arg='d', annotation=None), kwonlyargs=[
arg(arg='e', annotation=None),
arg(arg='f', annotation=None),
], kw_defaults=[
None,
Num(n=3),
], kwarg=arg(arg='g', annotation=None), defaults=[
Num(n=1),
Num(n=2),
]), body=[
Pass(),
], decorator_list=[
Name(id='dec1', ctx=Load()),
Name(id='dec2', ctx=Load()),
], returns=Str(s='return annotation')),
])
.. versionchanged:: 3.8
``type_comment`` was introduced in Python 3.8
-
class
Return(value)¶ A
returnstatement.
-
class
Yield(value)¶ -
class
YieldFrom(value)¶ A
yieldoryield fromexpression. Because these are expressions, they must be wrapped in aExprnode if the value sent back is not used.New in version 3.3: The
YieldFromnode type.
-
class
Global(names)¶ -
class
Nonlocal(names)¶ globalandnonlocalstatements.namesis a list of raw strings.
-
class
ClassDef(name, bases, keywords, starargs, kwargs, body, decorator_list)¶ A class definition.
nameis a raw string for the class namebasesis a list of nodes for explicitly specified base classes.keywordsis a list ofkeywordnodes, principally for ‘metaclass’. Other keywords will be passed to the metaclass, as per PEP-3115.starargsandkwargsare each a single node, as in a function call. starargs will be expanded to join the list of base classes, and kwargs will be passed to the metaclass. These are removed in Python 3.5 - see below for details.bodyis a list of nodes representing the code within the class definition.decorator_listis a list of nodes, as inFunctionDef.
In [59]: %%dump_ast
....: @dec1
....: @dec2
....: class foo(base1, base2, metaclass=meta):
....: pass
....:
Module(body=[
ClassDef(name='foo', bases=[
Name(id='base1', ctx=Load()),
Name(id='base2', ctx=Load()),
], keyword=
keyword(arg='metaclass', value=Name(id='meta', ctx=Load())),
], starargs=None, # gone in 3.5
kwargs=None, # gone in 3.5
body=[
Pass(),
], decorator_list=[
Name(id='dec1', ctx=Load()),
Name(id='dec2', ctx=Load()),
]),
])
Async and await¶
New in version 3.5: All of these nodes were added. See the What’s New notes on the new syntax.
-
class
AsyncFunctionDef(name, args, body, decorator_list, returns, type_comment)¶ An
async deffunction definition. Has the same fields asFunctionDef.
-
class
Await(value)¶ An
awaitexpression.valueis what it waits for. Only valid in the body of anAsyncFunctionDef.
In [2]: %%dump_ast
...: async def f():
...: await g()
...:
Module(body=[
AsyncFunctionDef(name='f', args=arguments(args=[], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[
Expr(value=Await(value=Call(func=Name(id='g', ctx=Load()), args=[], keywords=[]))),
], decorator_list=[], returns=None),
])
-
class
AsyncFor(target, iter, body, orelse)¶ -
class
AsyncWith(items, body)¶ async forloops andasync withcontext managers. They have the same fields asForandWith, respectively. Only valid in the body of anAsyncFunctionDef.
Top level nodes¶
Those nodes are at the top-level of the AST. The manner by which you obtain the AST determine the top-level node used.
-
class
Module(stmt* body, type_ignore *type_ignores)¶ The root of the AST for code parsed using the exec mode. The
bodyattribute is a list of nodes.type_ignoresis a list ofTypeIgnoreindicating the lines on whichtype: ignorecomments are present. If type comments are not stored in the ast it is an empty list.Changed in version 3.8:
type_ignoreswas introduced in Python 3.8 and is mandatory when manually creating aModule
-
class
Interactive(stmt* body)¶ The root of the AST for single statements or expressions parsed using the single mode. The
bodyattribute is a list of nodes.
Working on the Tree¶
ast.NodeVisitor is the primary tool for ‘scanning’ the tree. To use it,
subclass it and override methods visit_Foo, corresponding to the node classes
(see Meet the Nodes).
For example, this visitor will print the names of any functions defined in the given code, including methods and functions defined within other functions:
class FuncLister(ast.NodeVisitor):
def visit_FunctionDef(self, node):
print(node.name)
self.generic_visit(node)
FuncLister().visit(tree)
Note
If you want child nodes to be visited, remember to call
self.generic_visit(node) in the methods you override.
Alternatively, you can run through a list of all the nodes in the tree using
ast.walk(). There are no guarantees about the order in which
nodes will appear. The following example again prints the names of any functions
defined within the given code:
for node in ast.walk(tree):
if isinstance(node, ast.FunctionDef):
print(node.name)
You can also get the direct children of a node, using ast.iter_child_nodes().
Remember that many nodes have children in several sections: for example, an
If has a node in the test field, and list of nodes in body
and orelse. ast.iter_child_nodes() will go through all of these.
Finally, you can navigate directly, using the attributes of the nodes.
For example, if you want to get the last node within a function’s body, use
node.body[-1]. Of course, all the normal Python tools for iterating and
indexing work. In particular, isinstance() is very useful for checking
what nodes are.
Inspecting nodes¶
The ast module has a couple of functions for inspecting nodes:
ast.iter_fields()iterates over the fields defined for a node.ast.get_docstring()gets the docstring of aFunctionDef,ClassDeforModulenode.ast.dump()returns a string showing the node and any children. See also the pretty printer used in this guide.
Modifying the tree¶
The key tool is ast.NodeTransformer. Like ast.NodeVisitor, you
subclass this and override visit_Foo methods. The method should return the
original node, a replacement node, or None to remove that node from the tree.
The ast module docs have this example, which rewrites name lookups, so
foo becomes data['foo']:
class RewriteName(ast.NodeTransformer):
def visit_Name(self, node):
return ast.copy_location(ast.Subscript(
value=ast.Name(id='data', ctx=ast.Load()),
slice=ast.Index(value=ast.Str(s=node.id)),
ctx=node.ctx
), node)
tree = RewriteName().visit(tree)
When replacing a node, the new node doesn’t automatically have the lineno
and col_offset parameters. The example above doesn’t deal with this
completely: it copies the location to the Subscript node, but not
to any of the newly created children of that node. See Fixing locations.
Be careful when removing nodes. You can quite easily remove a node from a
required field, such as the test field of an If node. Python
won’t complain about the invalid AST until you try to compile() it, when
a TypeError is raised.
Examples of working with ASTs¶
Working versions of these examples are in the examples directory of the source repository.
Wrapping integers¶
In Python code, 1/3 would normally be evaluated to a floating-point number,
that can never be exactly one third. Mathematical software, like SymPy or Sage, often wants to use
exact fractions instead. One way to make 1/3 produce an exact fraction is
to wrap the integer literals 1 and 3 in a class:
class IntegerWrapper(ast.NodeTransformer):
"""Wraps all integers in a call to Integer()"""
def visit_Num(self, node):
if isinstance(node.n, int):
return ast.Call(func=ast.Name(id='Integer', ctx=ast.Load()),
args=[node], keywords=[])
return node
tree = ast.parse("1/3")
tree = IntegerWrapper().visit(tree)
# Add lineno & col_offset to the nodes we created
ast.fix_missing_locations(tree)
# The tree is now equivalent to Integer(1)/Integer(3)
# We would also need to define the Integer class and its __truediv__ method.
See wrap_integers.py for a working demonstration.
Simple test framework¶
These two manipulations let you write test scripts as a simple series of
assert statements. First, we need to run the statements one by one,
so execution doesn’t stop at the first test failure:
tree = ast.parse(code)
lines = [None] + code.splitlines() # None at [0] so we can index lines from 1
test_namespace = {}
for node in tree.body:
wrapper = ast.Module(body=[node])
try:
co = compile(wrapper, "<ast>", 'exec')
exec(co, test_namespace)
except AssertionError as e:
print("Assertion failed on line", node.lineno, ":")
print(lines[node.lineno])
# If the error has a message, show it.
if e.args:
print(e)
print()
Next, we transform assert a == b into a function call assert_equal(a, b),
which can give more information about the failure. We could turn many other
assertions into similar function calls.
class AssertCmpTransformer(ast.NodeTransformer):
def visit_Assert(self, node):
if isinstance(node.test, ast.Compare) and \
len(node.test.ops) == 1 and \
isinstance(node.test.ops[0], ast.Eq):
call = ast.Call(func=ast.Name(id='assert_equal', ctx=ast.Load()),
args=[node.test.left, node.test.comparators[0]],
keywords=[])
# Wrap the call in an Expr node, because the return value isn't used.
newnode = ast.Expr(value=call)
ast.copy_location(newnode, node)
ast.fix_missing_locations(newnode)
return newnode
# Remember to return the original node if we don't want to change it.
return node
See test_framework/run.py for a working demonstration of both parts.
Real projects¶
- pytest uses the AST to produce useful error messages when assertions fail.
- astsearch lets you search through
Python code based on semantics rather than text, e.g. to find every
+= 1in your code. - astpath is a more powerful search tool using XPath expressions on Python code.
- bellybutton is a linter designed to be readily customised.
See also
- Python AST explorer
- Web-based AST viewer: paste some code in and see the AST
- Thonny
- A Python IDE with AST explorer built in (Main menu => View => AST)
- showast
- An IPython extension to show ASTs in Jupyter notebooks
- Instrumenting the AST
- Using AST tools to assess code coverage