I think it was less than a week after I announced my little Android Manifest auditor tool, Manitree, that Anthony Desnos, the developer of Androguard, sent me a message in the tone of “hey, why didn’t you use Androguard for that?” If nothing else, why didn’t I use Andoguard’s native AXML converter?
Andoguard is this immense Android app analysis project. If you take a look at the first page, you may get overwhelmed pretty quickly. I hope Anthony doesn’t take this the wrong way because it’s an impressive tool when I’ve seen it working, and it’s great for all kinds of things besides malware analysis. For instance it can analyze apks, diff binary apps, visualize the flow of an app between classes — fun stuff. But for my dinky project, most of the work was focused on the AndroidManifest.xml file. But the simplest feature was most impressive to me: a native python Android XML file format converter. As of writing this, I’ve not seen someone publicly do this.
Mandatory technical background: The AndroidManifest.xml file is stored in a format called the Android XML format or AXML. This is an optimized binary format and not a lot of fun to look through. So tools like AXMLPrinter, AXMLPrinter2, aapt, and apktool converted these files back to a standard XML format that it was originally created in. This format was created to link to the resources.arsc file without having to duplicate efforts. For instance instead of calling the name of a string value over and over in a Manifest, the resources.arsc file is linked to it so actually what you’ll see in the binary is the location of the value in this file.
For the reason above, this weekend, a few of us have started to extract Androguard’s AXML into a separate project that aims to be a native python library for parsing AXML files. It’s up on github and is still in progress but the goal is that it can be useful as a standalone python module without having to import all of Androguard. https://github.com/antitree/AxmlParserPY
Here’s a quick example that takes in AndroidManifest.xml in binary format and spits it out in xml:
from xml.dom import minidom
ap = axmlprinter.AXMLPrinter(open('AndroidManifest.xml', 'rb').read())
buff = minidom.parseString(ap.getBuff()).toxml()
if __name__ == "__main__":