Extracting bookmark icons (favicons) from Firefox

Favicons are the little icons that sites provide and that end up in your bookmark list when you bookmark the site. I was wondering where all these icons were stored when I ran across this post, which explains that Firefox holds them as base64-encoded strings inside the bookmarks.html file.

I wanted to get access to all those icons and wrote a small Python script to help me do that.

I used the Python built-in base64 module to handle base64 encoding and decoding and the wonderful BeautifulSoup library for parsing the bookmarks.html file.

The resulting code snippet is quite short:

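import base64
import re
from bs4 import BeautifulSoup  # BeautifulSoup 4 (version 3 was imported as "from BeautifulSoup import BeautifulSoup")

# Prefix that Firefox puts in front of the base64 data in each "icon" attribute
HEADER = "data:image/x-icon;base64,"

with open("bookmarks.html", encoding="utf-8") as f:
    page = BeautifulSoup(f, "html.parser")

for tag in page.find_all("a"):
    iconData = tag.get("icon")
    if iconData is None:
        continue  # this bookmark has no stored icon
    # tag.string is None when the link text is empty or contains nested tags
    title = tag.string or "untitled"
    print(title)
    if iconData.startswith(HEADER):
        iconBinaryData = base64.b64decode(iconData[len(HEADER):])
        # Replace characters that are unsafe in file names with underscores
        iconFilename = re.sub(r"[^a-zA-Z0-9_\-.' ]", "_", title) + ".ico"
        with open(iconFilename, "wb") as out:
            out.write(iconBinaryData)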

Using BeautifulSoup, I loop over all the <a> tags in the bookmarks.html file. For each tag that has an “icon” attribute, I strip the small header that Firefox puts in front of the actual icon data and then decode the remaining base64 string with the base64 module. The decoded bytes are written to an .ico file named after the bookmark's title, with any characters outside a small safe set replaced by underscores.
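
To make the header-stripping and decoding steps concrete, here is a minimal Python 3 sketch of the same idea that uses only the standard library, with html.parser standing in for BeautifulSoup. The names decode_icon and IconCollector are made up for illustration, and so is the sample bookmark line: its payload is four placeholder bytes, not a real icon. Treat it as a sketch under those assumptions rather than a replacement for the script above.

```python
import base64
from html.parser import HTMLParser

# Same prefix the script above looks for in each "icon" attribute
HEADER = "data:image/x-icon;base64,"


def decode_icon(icon_attr):
    """Return the raw icon bytes, or None if the expected header is missing."""
    if not icon_attr.startswith(HEADER):
        return None
    return base64.b64decode(icon_attr[len(HEADER):])


class IconCollector(HTMLParser):
    """Collect (title, icon_bytes) pairs from <a> tags that carry an icon."""

    def __init__(self):
        super().__init__()
        self.icons = []
        self._pending = None  # decoded icon of the <a> tag currently open
        self._title = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # html.parser lowercases tag and attribute names for us
            icon_attr = dict(attrs).get("icon")
            self._pending = decode_icon(icon_attr) if icon_attr else None
            self._title = []

    def handle_data(self, data):
        if self._pending is not None:
            self._title.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._pending is not None:
            self.icons.append(("".join(self._title), self._pending))
            self._pending = None


# Made-up sample entry; the payload is placeholder data, not a real favicon.
payload = base64.b64encode(b"\x00\x00\x01\x00").decode("ascii")
sample = '<DT><A HREF="http://example.com/" ICON="%s%s">Example</A>' % (HEADER, payload)

collector = IconCollector()
collector.feed(sample)
collector.close()
print(collector.icons)
```

Icons whose attribute starts with a different header (a PNG data URI, for example) are skipped here, just as they are in the script above.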

Maybe somebody will find it useful someday.
