Been looking into something interesting (regarding Python anyway) today. I currently want to programmatically analyse and edit Python source code files. At first I thought I could just do some old-fashioned string parsing, but this would quickly become a big pain in the neck. Instead I found the ast module, which parses .py files and represents the code as an Abstract Syntax Tree. This structure can then be queried, edited, compiled and executed. Turning it back into Python source code is not supported though, so I'll need to find another way to deal with this. The nodes do carry line number and column offset information, so that could prove useful if I decide to write my own converter! Another complication is that the AST does not retain comments, but I guess I can do some normal string parsing (or use the tokenize module) to store these, and finally inject them back into the source code. Seems like things are never quite as straightforward as I would like them to be! :)
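To make that a bit more concrete, here is a minimal sketch of the query / edit / execute cycle, plus collecting comments with tokenize. The sample source, the greet function and the RenameGreet class are made up for illustration; only the ast, tokenize and io calls come from the standard library.

```python
import ast
import io
import tokenize

# Hypothetical sample source, used only for illustration.
SOURCE = '''\
# greet the user
def greet(name):
    return "Hello, " + name  # simple concatenation
'''

# Parse the source text into an abstract syntax tree.
tree = ast.parse(SOURCE)

# Query: list every function definition with its line and column offset.
functions = [
    (node.name, node.lineno, node.col_offset)
    for node in ast.walk(tree)
    if isinstance(node, ast.FunctionDef)
]


class RenameGreet(ast.NodeTransformer):
    """Rename the function 'greet' to 'welcome' (an arbitrary example edit)."""

    def visit_FunctionDef(self, node):
        if node.name == "greet":
            node.name = "welcome"
        self.generic_visit(node)  # keep visiting nested nodes
        return node


# Edit: apply the transformer, then fill in any missing position info.
tree = ast.fix_missing_locations(RenameGreet().visit(tree))

# Execute: compile the modified tree and run it in a fresh namespace.
namespace = {}
exec(compile(tree, filename="<ast>", mode="exec"), namespace)
result = namespace["welcome"]("world")

# Comments are absent from the tree, so collect them with tokenize instead.
# Each entry is (line, column, comment text).
comments = [
    (tok.start[0], tok.start[1], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(SOURCE).readline)
    if tok.type == tokenize.COMMENT
]

print(functions)
print(result)
print(comments)
```

The recorded line and column of each comment is what would let them be injected back into regenerated source later, although doing that reliably is the hard part.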


Quick Update:

Have found an ‘unparser’ online which converts ASTs back into code. It works great, but appears to discard the original formatting entirely. But, halfway there!!
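The unparser I found isn't reproduced here. As a stand-in, newer Python versions (3.9 and later) ship ast.unparse in the standard library, and a small sketch with it shows the same formatting loss. The sample source is made up for illustration.

```python
import ast

# Hypothetical sample with unusual spacing, a line break and a comment.
SOURCE = '''\
x   =   [1,
         2]  # a comment
'''

tree = ast.parse(SOURCE)

# ast.unparse (Python 3.9+) regenerates source from the tree alone,
# so the original spacing, line breaks and comments do not survive.
regenerated = ast.unparse(tree)
print(regenerated)  # x = [1, 2]
```

The regenerated code is equivalent to the original, but anything the tree does not record is gone, which is why the comments would need to be stored separately.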