Exploring Android's binary XML file format

Submitted by olaf on 2015-05-11
Tags: android c++ xml

There are several apps on Google’s Play Store that let you remove permissions from other apps. They do this by modifying the binary AndroidManifest.xml inside the APK file, repacking it, and then reinstalling the modified APK.

I was curious how this could be done, because information about Android’s binary XML format is pretty scarce. There is some sort of documentation, however, in the form of source code.

Exploring the format

aapt is used to compile the application resources into the binary format. It can also show an XML tree of binary XML files. Its source code is available at googlesource.com in the frameworks/base module. Starting at main and working towards dump xmltree, we get to handleCommand and doDump. When you look at the doDump function in Command.cpp, you will see that it uses ResXMLTree, which in turn can be found in the same module in include/androidfw/ResourceTypes.h and its companion libs/androidfw/ResourceTypes.cpp.

Digging deeper into ResourceTypes.h reveals struct ResChunk_header:

/**
 * Header that appears at the front of every data chunk in a resource.
 */
struct ResChunk_header
{
    // Type identifier for this chunk.  The meaning of this value depends
    // on the containing chunk.
    uint16_t type;

    // Size of the chunk header (in bytes).  Adding this value to
    // the address of the chunk allows you to find its associated data
    // (if any).
    uint16_t headerSize;

    // Total size of this chunk (in bytes).  This is the chunkSize plus
    // the size of any data associated with the chunk.  Adding this value
    // to the chunk allows you to completely skip its contents (including
    // any child chunks).  If this value is the same as chunkSize, there is
    // no data associated with the chunk.
    uint32_t size;
};

and the possible chunk types:

enum {
    RES_NULL_TYPE               = 0x0000,
    RES_STRING_POOL_TYPE        = 0x0001,
    RES_TABLE_TYPE              = 0x0002,
    RES_XML_TYPE                = 0x0003,

    // Chunk types in RES_XML_TYPE
    RES_XML_FIRST_CHUNK_TYPE    = 0x0100,
    RES_XML_START_NAMESPACE_TYPE= 0x0100,
    RES_XML_END_NAMESPACE_TYPE  = 0x0101,
    RES_XML_START_ELEMENT_TYPE  = 0x0102,
    RES_XML_END_ELEMENT_TYPE    = 0x0103,
    RES_XML_CDATA_TYPE          = 0x0104,
    RES_XML_LAST_CHUNK_TYPE     = 0x017f,
    // This contains a uint32_t array mapping strings in the string
    // pool back to resource identifiers.  It is optional.
    RES_XML_RESOURCE_MAP_TYPE   = 0x0180,

    // Chunk types in RES_TABLE_TYPE
    RES_TABLE_PACKAGE_TYPE      = 0x0200,
    RES_TABLE_TYPE_TYPE         = 0x0201,
    RES_TABLE_TYPE_SPEC_TYPE    = 0x0202,
    RES_TABLE_LIBRARY_TYPE      = 0x0203
};

This describes more or less the basic structure of a binary XML file. Any binary XML file, or more generally any binary resource file, contains a sequence of resource chunks. Each resource chunk is composed of this basic header, plus any additional header fields where necessary, and the associated data, if any. (The chunkSize mentioned in the comment on the size field is not an actual field; it presumably refers to headerSize.)
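
To make the layout concrete, here is a minimal sketch of reading such a header from a byte buffer. ChunkHeader, readU16, readU32 and readChunkHeader are hypothetical names made up for this example, and the code assumes the fields are stored little-endian, which is how ResourceTypes.cpp treats them on device.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the three fields of ResChunk_header. The name ChunkHeader is
// hypothetical and only used in this sketch.
struct ChunkHeader {
    uint16_t type;
    uint16_t headerSize;
    uint32_t size;
};

// Read little-endian integers byte by byte, so the code does not depend
// on the host's byte order or alignment rules.
uint16_t readU16(const std::vector<uint8_t>& buf, size_t off) {
    return static_cast<uint16_t>(buf[off] | (buf[off + 1] << 8));
}

uint32_t readU32(const std::vector<uint8_t>& buf, size_t off) {
    return static_cast<uint32_t>(buf[off]) |
           (static_cast<uint32_t>(buf[off + 1]) << 8) |
           (static_cast<uint32_t>(buf[off + 2]) << 16) |
           (static_cast<uint32_t>(buf[off + 3]) << 24);
}

// Fills `out` with the chunk header found at offset `off`. Returns false
// if the header is truncated or its sizes are inconsistent with the buffer.
bool readChunkHeader(const std::vector<uint8_t>& buf, size_t off, ChunkHeader& out) {
    if (off > buf.size() || buf.size() - off < 8) return false;
    ChunkHeader h;
    h.type = readU16(buf, off);
    h.headerSize = readU16(buf, off + 2);
    h.size = readU32(buf, off + 4);
    // The basic header alone is 8 bytes, the chunk must contain its own
    // header, and it must not run past the end of the buffer.
    if (h.headerSize < 8 || h.size < h.headerSize || h.size > buf.size() - off)
        return false;
    out = h;
    return true;
}
```

The data of a chunk then starts at off + headerSize, and the next sibling chunk at off + size.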

For example, a binary XML file is made up of just one resource chunk with a type of RES_XML_TYPE (0x0003), a header size of 8, and a size equal to that of the whole file. The actual XML tree is contained inside the data part as a sequence of nested resource chunks describing parts such as the string pool (RES_STRING_POOL_TYPE), namespaces (RES_XML_START_NAMESPACE_TYPE, RES_XML_END_NAMESPACE_TYPE), and opening (RES_XML_START_ELEMENT_TYPE) and closing (RES_XML_END_ELEMENT_TYPE) element tags.
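
As a rough illustration of this nesting, the following sketch walks the chunks directly inside the outer RES_XML_TYPE chunk and collects their types. The function name listXmlChunkTypes and the helpers u16 and u32 are invented for this example; little-endian storage is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Little-endian readers (hypothetical helpers for this sketch).
uint16_t u16(const std::vector<uint8_t>& b, size_t o) {
    return static_cast<uint16_t>(b[o] | (b[o + 1] << 8));
}

uint32_t u32(const std::vector<uint8_t>& b, size_t o) {
    return static_cast<uint32_t>(b[o]) |
           (static_cast<uint32_t>(b[o + 1]) << 8) |
           (static_cast<uint32_t>(b[o + 2]) << 16) |
           (static_cast<uint32_t>(b[o + 3]) << 24);
}

// Returns the type of every chunk nested directly inside the outer
// RES_XML_TYPE chunk. Returns an empty list if the outer header is not
// valid, and stops early at the first inconsistent child chunk.
std::vector<uint16_t> listXmlChunkTypes(const std::vector<uint8_t>& file) {
    std::vector<uint16_t> types;
    if (file.size() < 8 || u16(file, 0) != 0x0003) return types;  // RES_XML_TYPE

    size_t headerSize = u16(file, 2);
    size_t total = u32(file, 4);
    if (headerSize < 8 || total < headerSize || total > file.size()) return types;

    size_t off = headerSize;  // the children start right after the outer header
    while (total - off >= 8) {
        size_t childHeader = u16(file, off + 2);
        size_t childSize = u32(file, off + 4);
        if (childHeader < 8 || childSize < childHeader || childSize > total - off)
            break;
        types.push_back(u16(file, off));
        off += childSize;  // skip the child including its data
    }
    return types;
}
```

For a typical manifest one would expect the list to start with 0x0001 (string pool), possibly followed by 0x0180 (resource map), and then the namespace and element chunks.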

To save space and simplify parsing, strings such as tag names, attribute names, and attribute values are kept in a string pool and referenced from the other chunks by index. The string pool has its own binary header, struct ResStringPool_header, which is defined later in the same header file.
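
A small sketch of reading that header could look as follows. The field order (stringCount, styleCount, flags, stringsStart, stylesStart after the basic chunk header) and the flag values follow my reading of ResourceTypes.h; StringPoolInfo, readStringPoolHeader, le16 and le32 are hypothetical names, and decoding the strings themselves is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Little-endian readers (hypothetical helpers for this sketch).
uint16_t le16(const std::vector<uint8_t>& b, size_t o) {
    return static_cast<uint16_t>(b[o] | (b[o + 1] << 8));
}

uint32_t le32(const std::vector<uint8_t>& b, size_t o) {
    return static_cast<uint32_t>(b[o]) |
           (static_cast<uint32_t>(b[o + 1]) << 8) |
           (static_cast<uint32_t>(b[o + 2]) << 16) |
           (static_cast<uint32_t>(b[o + 3]) << 24);
}

// Decoded view of ResStringPool_header (hypothetical struct name).
struct StringPoolInfo {
    uint32_t stringCount;   // number of strings in the pool
    uint32_t styleCount;    // number of style span arrays
    bool sorted;            // SORTED_FLAG (1 << 0)
    bool utf8;              // UTF8_FLAG (1 << 8); otherwise UTF-16
    uint32_t stringsStart;  // offset from the chunk start to the string data
    uint32_t stylesStart;   // offset from the chunk start to the style data
};

// Reads the string pool header of the chunk starting at `off`.
// Returns false if the chunk is too short or is not a string pool.
bool readStringPoolHeader(const std::vector<uint8_t>& buf, size_t off, StringPoolInfo& out) {
    const size_t kHeaderBytes = 28;  // 8 byte chunk header + 5 * uint32_t
    if (off > buf.size() || buf.size() - off < kHeaderBytes) return false;
    if (le16(buf, off) != 0x0001) return false;  // RES_STRING_POOL_TYPE
    if (le16(buf, off + 2) < kHeaderBytes) return false;

    uint32_t flags = le32(buf, off + 16);
    out.stringCount = le32(buf, off + 8);
    out.styleCount = le32(buf, off + 12);
    out.sorted = (flags & (1u << 0)) != 0;
    out.utf8 = (flags & (1u << 8)) != 0;
    out.stringsStart = le32(buf, off + 20);
    out.stylesStart = le32(buf, off + 24);
    return true;
}
```

The header is followed by stringCount 32-bit offsets, one per string, each relative to stringsStart.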

Removing permissions

With this knowledge, removing permissions from a binary AndroidManifest.xml becomes easy. You simply copy the chunks from the original file to the new one, omitting only the element opening and closing chunks that belong to the uses-permission tag. One caveat remains, however: since chunks are removed, the file, and with it the enclosing XML chunk, becomes smaller, so the size field in its header must be adjusted accordingly.
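
The copying step might be sketched like this. It is a simplified illustration, not the code any of the apps actually use: dropElements, rd16 and rd32 are hypothetical names, the caller is assumed to have already looked up the string pool index of "uses-permission", and the element name is assumed to sit 4 bytes after the chunk's header (after the namespace reference), as the ResXMLTree_attrExt and ResXMLTree_endElementExt structs in ResourceTypes.h suggest.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Little-endian readers (hypothetical helpers for this sketch).
uint16_t rd16(const std::vector<uint8_t>& b, size_t o) {
    return static_cast<uint16_t>(b[o] | (b[o + 1] << 8));
}

uint32_t rd32(const std::vector<uint8_t>& b, size_t o) {
    return static_cast<uint32_t>(b[o]) |
           (static_cast<uint32_t>(b[o + 1]) << 8) |
           (static_cast<uint32_t>(b[o + 2]) << 16) |
           (static_cast<uint32_t>(b[o + 3]) << 24);
}

// Copies a binary XML file, leaving out every start and end element chunk
// whose name is string pool entry `nameIndex`, and patches the size in the
// outer header. Returns an empty vector if the input looks malformed.
std::vector<uint8_t> dropElements(const std::vector<uint8_t>& file, uint32_t nameIndex) {
    if (file.size() < 8 || rd16(file, 0) != 0x0003) return {};  // RES_XML_TYPE
    size_t headerSize = rd16(file, 2);
    size_t total = rd32(file, 4);
    if (headerSize < 8 || total < headerSize || total > file.size()) return {};

    // Start with the outer header, then append the chunks we keep.
    std::vector<uint8_t> out(file.data(), file.data() + headerSize);
    size_t off = headerSize;
    while (total - off >= 8) {
        uint16_t type = rd16(file, off);
        size_t childHeader = rd16(file, off + 2);
        size_t size = rd32(file, off + 4);
        if (childHeader < 8 || size < childHeader || size > total - off) return {};

        // 0x0102 = RES_XML_START_ELEMENT_TYPE, 0x0103 = RES_XML_END_ELEMENT_TYPE
        bool isElement = (type == 0x0102 || type == 0x0103);
        // Assumed layout: the data starts with the namespace index (4 bytes),
        // followed by the name index (4 bytes).
        bool drop = isElement && size >= childHeader + 8 &&
                    rd32(file, off + childHeader + 4) == nameIndex;
        if (!drop) out.insert(out.end(), file.data() + off, file.data() + off + size);
        off += size;
    }

    // The enclosing XML chunk shrank, so write the new total size.
    uint32_t newSize = static_cast<uint32_t>(out.size());
    for (int i = 0; i < 4; ++i)
        out[4 + i] = static_cast<uint8_t>((newSize >> (8 * i)) & 0xff);
    return out;
}
```

Note that this sketch drops every element with the given name. Removing a single permission would additionally require looking at the element's attributes to compare the permission name, and the string pool is left untouched, so strings that are no longer referenced simply stay in the file.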

This shows how, at least in this case, the binary format makes parsing and modifying a file easier than parsing a regular XML text file would be.

1 Comment

Simone on 2016-05-24 16:13:00 +0200

I do it in this way:

https://github.com/simoneaonzo/RmPerm/blob/master/src/main/java/it/saonzo/rmperm/AndroidManifest.java

and I released an app that removes permissions in a better way: check out AndRmPerm on the Play Store
