Designing a safe way for users to input code is a challenge: accepting code or code-like strings from a user opens the door to attacks. This guide shows how to build an endpoint that takes code from untrusted user input more safely, using Python’s ast module. The main use I have seen for the ast module is code analysis and assessment in DevOps. This article lays out an implementation that uses the ast module to parameterize user input and restrict what the user can do, which makes our endpoint safer.

What is AST:

First, an introduction to the ast module. AST stands for Abstract Syntax Tree, a tree representation of source code. Python’s ast module has classes for traversing and understanding code: it turns a string of source code into a structured representation of what that code is, without running it. Calling ast.parse passes the string to the built-in compile() and returns the root node of the tree. Later we will see that you can pass this root node to a custom node visitor class. In this class you define methods that run when the walk reaches a specific type of node, e.g. an if statement node.
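
As a small illustration of the idea above, the sketch below parses a one-line expression and uses a tiny visitor to record which node types it reaches. The class name NodeTypeCollector and the sample string are hypothetical, chosen only for this example.

```python
import ast


class NodeTypeCollector(ast.NodeVisitor):
    """Hypothetical visitor that records which node types it is called for."""

    def __init__(self):
        self.seen = []

    def visit_Call(self, node):
        # Runs only when the walk reaches a function-call node.
        self.seen.append("Call")
        self.generic_visit(node)  # keep walking into the call's children

    def visit_Constant(self, node):
        # Runs for literals such as strings and numbers.
        self.seen.append("Constant")


# ast.parse builds the tree; nothing in the string is executed.
tree = ast.parse("play('my song')", mode="eval")

collector = NodeTypeCollector()
collector.visit(tree)

print(type(tree).__name__)  # type of the root node
print(collector.seen)       # node types the visitor methods were called for
```

Note that play does not need to exist anywhere for this to work, because the string is only parsed, never run.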

(Image: example abstract syntax tree, from the Wikipedia article on ASTs.)

It is important to use the ast module to parameterize the input so that the user’s code is never run directly. This system is designed to take untrusted user input. When designing a system for use in an open environment, the developer should be aware of two situations. One is that the user does not know what they are doing and could get the system into a bad state. The other, more dangerous scenario is that the user has bad intentions and wants to attack the endpoint. By locking down the input, we can drastically limit what a malicious user can do. If a developer writes their own parameterization for user input, it may lead to unforeseen vulnerabilities and attack vectors (see SQL injection attacks). By using a standard parameterization tool that analyzes strings before they reach a dangerous place, we make a safer endpoint. The ast module is well suited to taking a string and turning it into a Python structure without directly running the input.

Before we jump into our implementation, let’s talk about some of the ast helpers. First is ast.literal_eval(), which allows safe evaluation of Python literals. This is a little too restrictive for our implementation; however, it is a great helper for securing an application. We will focus on ast.parse(), which takes a string of source code (plus an optional filename that is only used in error messages) and returns the root AST node. We could compile this node directly with Python’s built-in compile() function, but we want to inspect the tree representation instead, so we will do some more setup and write a custom ast.NodeVisitor. Below is the ast.parse method from Python’s source.
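
To make the point about ast.literal_eval() concrete, here is a minimal sketch of what it accepts and what it refuses. The sample strings are made up for illustration.

```python
import ast

# Plain literals evaluate to the corresponding Python objects.
playlist = ast.literal_eval("['song a', 'song b']")
print(playlist)

# Anything that is not a literal, such as a function call, is rejected
# with ValueError instead of being run.
try:
    ast.literal_eval("play('song a')")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

This is why literal_eval is too restrictive here: our users need to write calls such as play('song a'), which it will never accept.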

ast.parse method from the Python (v.3.12.0 alpha 0) source. Be aware that it calls the Python compiler.
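
As a rough check of the relationship between ast.parse and the compiler, the sketch below builds the same tree both ways. It assumes the default arguments of ast.parse; the sample string is hypothetical.

```python
import ast

source = "play('my song')"

# ast.parse hands the string to the built-in compile() with the
# ast.PyCF_ONLY_AST flag, so compile() stops after building the tree
# and returns it instead of a code object.
tree_from_parse = ast.parse(source, mode="eval")
tree_from_compile = compile(source, "<unknown>", "eval", ast.PyCF_ONLY_AST)

same_tree = ast.dump(tree_from_parse) == ast.dump(tree_from_compile)
print(same_tree)
```

In both cases the result is a tree of nodes, not something that has been run.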


Here is an example scenario. We will create a music app that is controlled through code supplied as user input. We will expose the functions play, stop, skip, and get_song_length. These functions take at most one argument: song. We will also keep track of a result in self.value. This way we can act on the value without passing it around between function calls. These functions are a good subset of example use cases: an exposed function can either perform an action or be a transformation whose result is given back to the user.

To start, we will look at the FunctionFinderVisitor class. It inherits from ast.NodeVisitor, which the official Python documentation describes as “A node visitor base class that walks the abstract syntax tree and calls a visitor function for every node found”. Within FunctionFinderVisitor we define visitor methods, which are called when the walk reaches a node of the matching type. For example, a visit_Call method handles every function call in the tree. A function is called either by name, e.g. play(), or as an attribute, e.g. play().get_song_length(). If you care about names or attributes on their own, you can add visit_Name and visit_Attribute methods to track those as well. Although our exposed functions do not return anything, this kind of function chaining still works because of how we implement visit_Call.

Once we have the visitor methods we care about, we can insert some tracking logic. The tracking logic reads the AST node attributes and assigns them to object variables. For our use case we keep track of function names, arguments, and keyword arguments. When saving function names and arguments, make sure that nothing is saved automatically, that is, outside a checking if statement. For example, we check whether each argument is a string; if it is any other type we do not save it, and we may even want to print a warning. Because we look only at those three kinds of token and save them as strings, we are able to restrict the user input. A full NodeVisitor class is implemented below.

Implementing FunctionFinderVisitor class:

import ast


class FunctionFinderVisitor(ast.NodeVisitor):
    def __init__(self, code):
        self.code = code
        self.functions = []

    def visit_Call(self, node: ast.Call) -> None:
        """handles all function calls within an ast node visit"""
        kwargs = {}
        args = []
        if isinstance(node.func, ast.Name):
            # is first function call, e.g. play()
            function_name = node.func.id
        else:
            # is attribute, e.g. .get_song_length()
            function_name = node.func.attr
        for kwarg in node.keywords:
            # only save keyword values that are literals
            if isinstance(kwarg.value, ast.Constant):
                kwargs[kwarg.arg] = kwarg.value.value
            else:
                print("Passing a keyword argument type that's not allowed")
        for arg in node.args:
            # the functions defined below only take strings;
            # string literals are ast.Constant nodes (ast.Str was removed in Python 3.12)
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                args.append(arg.value)
            # you could look for other types here
            else:
                # argument is not a type that we want
                print("Passing an argument type that's not allowed")
        function_package = {function_name: {'args': args, 'kwargs': kwargs}}
        # the outermost call is visited first, so insert at the front to preserve call order
        self.functions.insert(0, function_package)
        # keep walking so that chained and nested calls are visited too
        self.generic_visit(node)
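
To see why visit_Call has to distinguish between a name and an attribute, it can help to look at the tree for a chained call. The sketch below is only an inspection aid, using a made-up input string; it is not part of the visitor.

```python
import ast

tree = ast.parse("play('song a').get_song_length('song b')", mode="eval")

outer_call = tree.body               # the last call in the chain
inner_call = outer_call.func.value   # the call it is chained onto

# A chained call hangs off an Attribute node, so its name is in .attr ...
print(type(outer_call.func).__name__, outer_call.func.attr)
# ... while the first call in the chain is a plain Name with its name in .id.
print(type(inner_call.func).__name__, inner_call.func.id)
```

The last call in the chain sits at the top of the tree and is reached first during the walk, which is why the order of the saved functions needs care.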

The controller for all of this is implemented in the MusicPlayer class. This class has all the functions exposed to the user (play, stop, skip, get_song_length) as well as developer methods: unknown_func, analyze_value, and, importantly, handle_function. By having each method do only one specific action, we keep our code base clean and readable. It also makes the implementation easy to update and change.

This is also where we call the ast.parse method and initialize the FunctionFinderVisitor class. We call ast.parse with mode='eval', which further locks down our implementation. The options for mode are ‘exec’, ‘eval’, ‘single’ or ‘func_type’. These get passed to the compiler (see above). The difference is in what kind of source each mode accepts. For this blog post all you need to know is that eval is more restricted than exec, and fits our use case because it accepts only a single expression. ‘single’ accepts a single interactive statement, and func_type relates to Python type hints (PEP 484). If we were to implement a full Python sandbox we would change the mode to exec. While we pass these mode names as arguments to the compiler, we never call the functions eval or exec on code from a user.
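
The effect of the mode argument can be seen with a few sample inputs. This is a minimal sketch with made-up strings; it only parses them and never runs them.

```python
import ast

# An expression parses in 'eval' mode and yields an Expression root.
expr_tree = ast.parse("play('my song')", mode="eval")
print(type(expr_tree).__name__)

# Statements such as imports or assignments are not expressions, so
# 'eval' mode rejects them at parse time with a SyntaxError.
rejected = []
for statement in ("import os", "x = 1", "play('a'); stop('a')"):
    try:
        ast.parse(statement, mode="eval")
    except SyntaxError:
        rejected.append(statement)
print(rejected)

# The same statement is accepted in 'exec' mode, which yields a Module root.
module_tree = ast.parse("import os", mode="exec")
print(type(module_tree).__name__)
```

With mode='eval', an import or an assignment never even makes it to our visitor.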

Once FunctionFinderVisitor has saved the function calls and their arguments, we compare them to our whitelist, self.available_functions, in MusicPlayer. Good security for exposed applications includes a whitelist, which is a restricted list of actions the user can do. This is better than a blacklist, which restricts user actions based on what they cannot do (see the appendix for why this is better). If the user’s input is not in the whitelist we call self.unknown_func(), which could be configured to alert; for this post we raise an exception. This is important because it tells us the user is trying to access something outside our whitelist. The whitelist is a dictionary where the keys are the function names and the values are references to the functions. After comparing, self.handle_function() calls each exposed function with its saved arguments, in the order in which the calls were collected.

For this implementation we return a value to the user for the function get_song_length. We do the work we need to and save the result in self.value. After the function finishes we can analyze the result by looking at self.value. If needed, we return the transformed value to the user. Depending on the implementation, this could be as simple as printing self.value.

Implementing MusicPlayer class:

class MusicPlayer:
    def __init__(self, function_to_call: str, value=None) -> None:
        """
        :param function_to_call: user-supplied string of calls to functions defined in this class
        :param value: optional starting value for self.value
        """
        self.value = value
        # whitelist: the only functions a user is allowed to call
        self.available_functions = {
            "play": self.play,
            "stop": self.stop,
            "skip": self.skip,
            "get_song_length": self.get_song_length
        }
        if isinstance(function_to_call, str):
            # builds the tree only; the user's code is never executed
            parsed_func = ast.parse(function_to_call, mode='eval')
            mv = FunctionFinderVisitor(function_to_call)
            mv.visit(parsed_func)
            self.h = self.handle_function(mv.functions)
        else:
            self.h = self.handle_function(function_to_call)

    def handle_function(self, function_package: list) -> None:
        """
        runs the mapped functions after parsing
        :param function_package: a list of dictionaries which hold the functions and parameters
        """
        for func in function_package:
            if isinstance(func, str):
                # a bare function name with no arguments
                transform = self.available_functions.get(func, self.unknown_func)
                transform()
            else:
                for func_name, call_args in func.items():
                    transform = self.available_functions.get(func_name, self.unknown_func)
                    if 'args' in call_args:
                        transform(*call_args['args'], **call_args["kwargs"])
                    else:
                        transform()

    def analyze_value(self) -> None:
        # Implement analytics here
        pass

    def play(self, song: str) -> None:
        # logic to play song
        self.value = "playing song"

    def stop(self, song: str) -> None:
        # stop current song
        self.value = "stopping song"

    def get_song_length(self, song: str) -> None:
        # can be a transform function, acts on a value
        self.value = len(song)

    def skip(self) -> None:
        # skip to next song
        self.value = "skipping song"

    def unknown_func(self, *args, **kwargs):
        print("Calling a function not in whitelist")
        self.value = "No No No"
        # one could implement a custom Exception for this
        raise Exception

Now that you have your handlers set up, you can connect this to a framework such as Flask. Your Flask endpoint could simply take function_to_call from a text box and return MusicPlayer(function_to_call).value to the user.

A study of Attacks:

There are some key points I would like to make. First, eval or exec is never called directly on the user’s input. The closest we get to this is the call to ast.parse, which calls compile. It is important to keep this in mind, and the documentation gives this warning: “It is possible to crash the Python interpreter with a sufficiently large/complex string due to stack depth limitations in Python’s AST compiler.”

A study of attacks would include a variety of Python sandbox escapes. Those attacks stem from the fact that the user input at some point is executed. Below is my journey in testing my own code.

When I first developed this method I focused more on securing the functions than the arguments. So I stood up a Flask endpoint of my own so that I could “attack” my own code, and I found some interesting results. At first I had been saving the arguments and passing them directly to the function. This gives an attacker an opening, depending on the function being run and how the inputs are used. I looked at the get_song_length function, as it directly calls len() on the user’s input for the argument song. I was being hindered not by roadblocks I had purposely set up myself but by roadblocks hit while walking the tree. For example, instead of passing a string to the get_song_length method I passed a proof-of-concept sandbox escape: get_song_length(__builtins__.__dict__['__import__']('subprocess').check_output('whoami')). That structure is a common way of escaping sandboxes because many people delete builtins, but not __builtins__ (again, a whitelist is better than a blacklist). Against an open sandbox, the expected result would be the name of a user on the server. The result against my endpoint was an attribute error: AttributeError: 'Subscript' object has no attribute 'attr'. This result was the same whether the parse mode was exec or eval.

While it may be good to implement a more deliberate security feature, a typical attack was still stopped on the first try by using a parser. This is a good example of why using a well-established parser is better than a developer creating their own. Even though as a developer I had not focused on this particular attack, I still had a layer of protection. I also found out that it is much more fun to test your code if you throw an endpoint in front of it and treat it like a CTF (Capture the Flag competition).
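
The AttributeError described above can be traced back to the shape of the tree. The sketch below only parses the payload and inspects it; the string is never executed, and the variable names are mine, not part of the original implementation.

```python
import ast

payload = (
    "get_song_length(__builtins__.__dict__['__import__']"
    "('subprocess').check_output('whoami'))"
)

# Parsing only builds the tree; the payload is never executed.
tree = ast.parse(payload, mode="eval")

# Record what kind of node each call is made on.
callee_types = [
    type(node.func).__name__
    for node in ast.walk(tree)
    if isinstance(node, ast.Call)
]
print(callee_types)

# The innermost call is made on a Subscript node, which has neither
# an .id nor an .attr, so reading node.func.attr raises AttributeError.
subscript_calls = [
    node for node in ast.walk(tree)
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Subscript)
]
print(hasattr(subscript_calls[0].func, "attr"))
```

A visitor that assumes every call is made on a name or an attribute fails on this input before anything reaches the exposed functions. That is an accidental safeguard, so an explicit check on the node type would be a more deliberate way to reject it.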

Another type of attack is against the Python compiler. In thinking about these attacks I ask myself: ‘What types of strings can be passed to the Python compile function?’ With consideration for the length of this post, I will make a general statement: the Python compile function on its own does not execute source code. However, we are running compile inside an active Python interpreter. Care should be taken so that users cannot pass arbitrarily long strings to the parse method. An example of Python doing something similar is raising an IndentationError when there are too many levels of indentation. If you need arbitrarily long strings you could also run compile in a separate Python subprocess, e.g. python -m py_compile; however, this opens another attack vector, so buyer beware. A deeper study of breaking the Python compiler is needed, as I was able to cause a memory error with a long string. This can be addressed in subsequent posts.
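
One way to act on the advice about long strings is to check the input before it reaches ast.parse. The sketch below is an assumption on my part rather than part of the original implementation: the function name parse_untrusted and the limit MAX_SOURCE_LENGTH are hypothetical, and the right limit depends on the inputs you expect.

```python
import ast

# Hypothetical limit; pick a value that fits the inputs you expect.
MAX_SOURCE_LENGTH = 200


def parse_untrusted(source: str) -> ast.Expression:
    """Parse a short, single expression, refusing oversized input."""
    if len(source) > MAX_SOURCE_LENGTH:
        raise ValueError("input too long")
    try:
        return ast.parse(source, mode="eval")
    except (SyntaxError, RecursionError, MemoryError) as err:
        # Deeply nested input can fail inside the parser itself,
        # so treat those failures as a rejected input as well.
        raise ValueError("input could not be parsed") from err


tree = parse_untrusted("play('my song')")
print(type(tree).__name__)

try:
    parse_untrusted("play('" + "a" * 1000 + "')")
    too_long_rejected = False
except ValueError:
    too_long_rejected = True
print(too_long_rejected)
```

A length check does not rule out every way of stressing the compiler, but it removes the simplest one.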

Related Work

A search would reveal Tim Savage’s implementation from 2019. In his example he uses control flow, while I visit each node with custom methods. It is my understanding that his example is not a full implementation and was shortened for the talk. Without commenting on how his code is fully implemented, I will compare results from a test of that code to mine. When running Tim’s implementation on the command line, using a builtins sandbox escape, one could get any code to run. In fact, by running len(__builtins__['__import__']('subprocess').run('bash')) one could get a complete terminal. I notified Tim when I found this out, and he assured me that his final implementation was mitigated against this attack and is no longer in production; for those reasons I am comfortable including the case here.

Next Steps:

A good question for further study would be: can we generalize this to a total Python sandbox? What would it take to implement a code interview tool? In my research for this I came across many references to Go’s ast package. How would a Go implementation look compared with Python? Which one is safer as an exposed endpoint to an untrusted user? What would a study of the Python compiler look like? I hope to go into these topics in subsequent blog posts.


Appendix:

When I say a whitelist is better than a blacklist in this context, I mean it is more secure. It is more secure because you specify only the actions a user can take, and nothing more. A blacklist is open-ended: there may be something the developer forgot to add, and in some cases it turns security into a catch-up game where users (or attackers) find things that should have been in the blacklist from the start. A whitelist, however, lets you add features one at a time and review the security of each.
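
The difference can be shown with a toy comparison. Both lists below are hypothetical and exist only for this illustration.

```python
# Hypothetical lists, for illustration only.
BLACKLIST = {"eval", "exec", "__import__"}
WHITELIST = {"play", "stop", "skip", "get_song_length"}


def allowed_by_blacklist(name: str) -> bool:
    # Everything is allowed unless someone remembered to forbid it.
    return name not in BLACKLIST


def allowed_by_whitelist(name: str) -> bool:
    # Nothing is allowed unless it was explicitly approved.
    return name in WHITELIST


for name in ("play", "eval", "__builtins__"):
    print(name, allowed_by_blacklist(name), allowed_by_whitelist(name))
```

The blacklist in this sketch lets __builtins__ through because nobody thought to list it, while the whitelist refuses it without needing to know it exists.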