Many articles have been published on the Abstract Syntax Tree (AST) in Python, but most of them are limited to basic AST traversal and in-place modification. Such examples are fine for a basic understanding of the AST, but they rarely show a practical use of AST traversal. Many times I read such articles and wondered how I could use the technique to solve real-world problems.
In this article I’ll show how to create a graph of Django signals using the AST traversal technique and the NetworkX library.
Using signals in Django allows you to create event-driven, loosely coupled backends, which is crucial when dealing with huge projects consisting of dozens of small Django applications. However, all event-driven systems share the same flaw - it’s almost impossible to tell what’s going on in the system upon receiving an event.
This can be partially solved if we can answer the question: who listens to which events (signals)? To answer it we need some kind of analysis - either static source code analysis or analysis of signals at runtime. As always, both approaches have pros and cons, but in this post I will introduce you to static source code analysis using AST traversal.
Dealing with AST
There is a good module in Python’s stdlib for dealing with Python’s own AST -
ast. It allows you to parse Python source code into an AST and provides two
useful classes for tree analysis and transformation - NodeVisitor and
NodeTransformer.
For tree traversal we’ll use NodeVisitor.
To use these classes we have to define visit_{NodeType} methods, where
NodeType is the type of tree node we are interested in.
The full list of node types is given in the Abstract Grammar section of the
ast module’s documentation.
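To make the visit_{NodeType} convention concrete, here is a minimal sketch that is not part of the signal graph itself. The FunctionNameCollector class and the sample source are hypothetical, made up only to show how NodeVisitor dispatches to a method named after the node type:

```python
import ast

# Made-up sample source; it is only parsed, never executed.
SOURCE = """
def handler(sender, **kwargs):
    pass

class Billing:
    def charge(self):
        pass
"""


class FunctionNameCollector(ast.NodeVisitor):
    """Hypothetical helper: records the name of every function definition."""

    def __init__(self):
        self.names = []

    def visit_FunctionDef(self, node):
        # NodeVisitor calls this for every ast.FunctionDef node it reaches.
        self.names.append(node.name)
        # Keep descending so that nested definitions are visited too.
        self.generic_visit(node)


collector = FunctionNameCollector()
collector.visit(ast.parse(SOURCE))
print(collector.names)  # ['handler', 'charge']
```

Node types without a matching visit_ method (ClassDef here) fall back to generic_visit, which simply walks their children. That is why the method inside the class is still found.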
We need to define two methods:
visit_FunctionDef - to find @receiver-decorated functions;
visit_Assign - to find all created Signal() instances.
For graph building we’ll use networkx.DiGraph().
import ast
import networkx as nx


class Visitor(ast.NodeVisitor):
    def __init__(self):
        self.graph = nx.DiGraph()

    def visit_FunctionDef(self, node):
        ...

    def visit_Assign(self, node):
        ...
We could use only visit_FunctionDef, but in that case we’d be unable to find
unused signals.
Finding Nemo receivers
Every function definition passed to visit_FunctionDef has a list of the
decorators it is wrapped with (decorator_list). We have to find the function
definitions wrapped with the @receiver decorator.
Let’s look at some code.
First of all, we need to find a parametrized decorator named receiver:
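    def visit_FunctionDef(self, node):
        for dec in node.decorator_list:
            # a parametrized decorator is an ast.Call whose func is a plain name
            if (isinstance(dec, ast.Call) and isinstance(dec.func, ast.Name)
                    and dec.func.id == 'receiver'):
                ...  # signal extraction goes here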
Next, we have to find the signal name, which is passed to the @receiver decorator.
The signal is passed as an argument, which can be represented in the AST in
several ways:
ast.Attribute: @receiver(billing_app_signals.my_signal)
ast.Name: @receiver(my_signal)
There are other ways to pass the signal name (e.g. billing_app.signals.my_signal,
a nested ast.Attribute), but we won’t treat them specially for the sake of simplicity.
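                if not dec.args:
                    continue  # no positional signal argument, skip
                arg = dec.args[0]
                if isinstance(arg, ast.Attribute):
                    signal = arg.attr  # last component of a dotted name
                elif isinstance(arg, ast.Name):
                    signal = arg.id
                else:
                    continue  # unsupported argument form, skip this decorator
                receiver = node.name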
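To see how the decorator argument forms look in the tree, the following sketch parses three made-up receivers and records the node type of each decorator argument. The function and signal names are invented for illustration. Nothing here is executed as Django code, the source is only parsed:

```python
import ast

# Made-up receivers; `receiver` and the signal names are never resolved,
# because ast.parse only builds the tree and does not run the code.
SOURCE = """
@receiver(my_signal)
def on_plain(sender, **kwargs): pass

@receiver(billing_app_signals.my_signal)
def on_attribute(sender, **kwargs): pass

@receiver(billing_app.signals.my_signal)
def on_dotted(sender, **kwargs): pass
"""

shapes = {}
last_parts = {}
for node in ast.parse(SOURCE).body:
    arg = node.decorator_list[0].args[0]
    shapes[node.name] = type(arg).__name__
    # ast.Name stores the identifier in .id, ast.Attribute in .attr
    last_parts[node.name] = arg.id if isinstance(arg, ast.Name) else arg.attr

print(shapes)
print(last_parts)
```

A longer dotted path is still an ast.Attribute whose value is itself an ast.Attribute, so reading .attr yields the final component (my_signal) in all three cases.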
Once we have the signal name, we add the signal and the receiver (the name of the current node) to the graph as vertices and connect them with an edge.
                self.graph.add_node(signal, role='signal')
                self.graph.add_node(receiver, role='receiver')
                self.graph.add_edge(signal, receiver)
Looking for unused signals
Every node passed to visit_Assign has targets and value attributes, which
hold the left-hand and right-hand sides of the assignment.
Signal instantiation looks like this:
signal_name = Signal(...)
We need to find all ast.Assign nodes whose value is an ast.Call
of the Signal identifier and get the target variable name.
After that we are free to add the found signal name to the graph.
    def visit_Assign(self, node):
        val = node.value
        target = node.targets[0]
        # only plain `name = Signal(...)` assignments are recorded
        if (isinstance(val, ast.Call) and isinstance(val.func, ast.Name)
                and val.func.id == 'Signal'
                and isinstance(target, ast.Name)):
            self.graph.add_node(target.id, role='signal')
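Once the graph is built, the unused signals are the signal nodes with no outgoing edges. Below is a minimal sketch of that query, assuming a graph shaped like the one the visitor produces (nodes tagged with a role attribute, edges pointing from signal to receiver). The graph is built by hand and the signal and receiver names are invented:

```python
import networkx as nx

# Hand-built graph mimicking the visitor's output; all names are made up.
graph = nx.DiGraph()
graph.add_node('order_paid', role='signal')
graph.add_node('order_cancelled', role='signal')
graph.add_node('send_receipt', role='receiver')
graph.add_edge('order_paid', 'send_receipt')

# A signal is "unused" here if no receiver edge leaves it.
unused = sorted(
    name
    for name, data in graph.nodes(data=True)
    if data.get('role') == 'signal' and graph.out_degree(name) == 0
)
print(unused)  # ['order_cancelled']
```

Keep in mind that a signal defined in one file and received in another is only connected if both files are fed to the same visitor instance.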
Wrapping up
Knowing how to work with the AST is useful for solving various source code analysis problems that would be much harder to solve with other tools (such as regular expressions).
In my opinion, the signal graph example shows that the AST in Python is a very powerful, flexible tool that can be used to deal with everyday problems. The quite complicated task of finding signal-receiver relations can be solved in fewer than 30 lines of code - pretty impressive. The main thing is to understand what you want to find out.
If you want to learn more about the AST in Python, I suggest starting with https://greentreesnakes.readthedocs.io.
But if you really want to understand the AST and how to work with it, you should spend time playing with it and analyzing your own small code snippets. There is also a great tool for pretty-printing the AST - astpretty.
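A quick way to start experimenting with only the standard library is ast.dump, which returns a textual representation of a node. The sketch below applies it to the signal assignment from this article. The exact dump text differs between Python versions, so it is printed rather than compared against a fixed string:

```python
import ast

tree = ast.parse("my_signal = Signal()")
assign = tree.body[0]

# ast.dump returns a string describing the node and all of its children.
dumped = ast.dump(assign)
print(dumped)

# The same pieces the visitor relies on, accessed directly:
print(type(assign).__name__)   # Assign
print(assign.targets[0].id)    # my_signal
print(assign.value.func.id)    # Signal
```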
Full example
import sys
import ast
import networkx as nx


class Visitor(ast.NodeVisitor):
    def __init__(self):
        self.graph = nx.DiGraph()

    def write_graph(self, filename):
        nx.write_gexf(self.graph, filename)

    def visit_FunctionDef(self, node):
        for dec in node.decorator_list:
            # look for a parametrized decorator: @receiver(...)
            if (isinstance(dec, ast.Call) and isinstance(dec.func, ast.Name)
                    and dec.func.id == 'receiver'):
                if not dec.args:
                    continue  # no positional signal argument, skip
                arg = dec.args[0]
                if isinstance(arg, ast.Attribute):
                    signal = arg.attr
                elif isinstance(arg, ast.Name):
                    signal = arg.id
                else:
                    continue  # unsupported argument form, skip
                receiver = node.name
                self.graph.add_node(signal, role='signal')
                self.graph.add_node(receiver, role='receiver')
                self.graph.add_edge(signal, receiver)

    def visit_Assign(self, node):
        val = node.value
        target = node.targets[0]
        # only plain `name = Signal(...)` assignments are recorded
        if (isinstance(val, ast.Call) and isinstance(val.func, ast.Name)
                and val.func.id == 'Signal'
                and isinstance(target, ast.Name)):
            self.graph.add_node(target.id, role='signal')


if __name__ == '__main__':
    if len(sys.argv) < 2:
        sys.exit('no source files specified!')
    files = sys.argv[1:]
    visitor = Visitor()
    for fn in files:
        with open(fn) as f:
            module_ast = ast.parse(f.read(), filename=fn)
        visitor.visit(module_ast)
    visitor.write_graph('output.gexf')