Extracting bookmark icons (favicons) from Firefox

Favicons are the little icons that sites provide and that end up in your bookmark list when you bookmark a site. I was wondering where all these icons were stored when I ran across this post, which explains that Firefox holds them as base64-encoded strings inside the bookmarks.html file.

I wanted to get access to all those icons and wrote a small Python script to help me do that.

I used the Python built-in base64 module to handle base64 encoding and decoding and the wonderful BeautifulSoup library for parsing the bookmarks.html file.

The resulting code snippet is quite short:

import base64
import re
from BeautifulSoup import BeautifulSoup

HEADER = "data:image/x-icon;base64,"

f = file("bookmarks.html")
page = BeautifulSoup(f)
for tag in page.findAll("a"):
    try:
        iconData = tag["icon"]
        print tag.string
        if iconData.startswith(HEADER):
            iconData = iconData[len(HEADER):]
            iconBinaryData = base64.decodestring(iconData)
            iconFilename = re.sub("[^a-zA-Z0-9_\-.' ]", "_", tag.string) + ".ico"
            file(iconFilename, "wb").write(iconBinaryData)
    except KeyError:
        pass

Using BeautifulSoup, I loop over all the <a> tags in the bookmarks.html file. For each tag, I read the “icon” attribute, strip the short header that Firefox puts before the actual icon data, and decode the remaining base64 string into the binary icon using the base64 module. Tags without an “icon” attribute raise a KeyError, which the script ignores.
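For anyone on Python 3, where the print statement, file() and base64.decodestring are no longer available, here is a minimal sketch of the same idea. It is not the original script: it swaps BeautifulSoup for the standard library's html.parser so that nothing extra needs installing, and the names IconCollector, extract_icons, safe_filename and save_icons are made up for this illustration. Like the original, it assumes every icon starts with the "data:image/x-icon;base64," header and skips anything else.

```python
import base64
import re
from html.parser import HTMLParser
from pathlib import Path

# Header assumed to precede the base64 data, as in the original script
HEADER = "data:image/x-icon;base64,"


class IconCollector(HTMLParser):
    """Collects (title, icon bytes) pairs from <a icon="..."> tags."""

    def __init__(self):
        super().__init__()
        self.icons = []       # list of (title, bytes) tuples
        self._pending = None  # decoded icon of the <a> tag being read
        self._title = []      # text fragments found inside that tag

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us
        if tag != "a":
            return
        icon = dict(attrs).get("icon") or ""
        if icon.startswith(HEADER):
            self._pending = base64.b64decode(icon[len(HEADER):])
            self._title = []

    def handle_data(self, data):
        if self._pending is not None:
            self._title.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._pending is not None:
            self.icons.append(("".join(self._title), self._pending))
            self._pending = None


def safe_filename(title):
    # Same character whitelist as the script above; "untitled" is a
    # fallback added here for bookmarks without a title
    return re.sub(r"[^a-zA-Z0-9_\-.' ]", "_", title or "untitled") + ".ico"


def extract_icons(html):
    """Return a list of (title, icon bytes) found in the given HTML text."""
    collector = IconCollector()
    collector.feed(html)
    collector.close()
    return collector.icons


def save_icons(bookmarks_path, out_dir="."):
    """Write one .ico file per bookmark that carries an icon."""
    html = Path(bookmarks_path).read_text(encoding="utf-8")
    for title, data in extract_icons(html):
        (Path(out_dir) / safe_filename(title)).write_bytes(data)
```

Calling save_icons("bookmarks.html") should write the icons into the current directory. Since nothing is printed along the way, the console encoding never comes into play.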

Maybe somebody will find it useful someday.

16 Comments on “Extracting bookmark icons (favicons) from Firefox”


By Joshua. March 26th, 2009 at 01:52

I will find it useful! …except… I don’t know what to do with it. Can I execute it like a bat file, or do I need something to compile it or… as you can see I am clueless about Python. But this is exactly what I want to do: extract the bookmark icons.

By gooli. March 26th, 2009 at 08:50

You’ll need to install Python from http://www.python.org (version 2.6 should do just fine) and install BeautifulSoup from http://www.crummy.com/software/BeautifulSoup/. You’ll need to create a text file called extract.py and copy the code in the post into it. You’ll need to copy the bookmarks.html file from the Firefox profile to the place where you created extract.py. Now you can run it using “python extract.py” and you should get the icons in the same directory.

By Joshua. March 28th, 2009 at 00:46

Thank you! Downloading now, I’ll let you know how it goes. :)

By Linda. April 17th, 2009 at 17:02

I’m going to ask a couple of even dumber questions!…
1) I’ve installed Python and downloaded BeautifulSoup. It doesn’t seem to have an installer, so at present the extracted files are sitting in the same folder as extract.py. Is this correct?
2) I’ve created extract.py and copied bookmarks.html to the same folder but I don’t know how to run ‘python extract.py’ I started Python which opens a command window and typed “Python C:\Users\Linda….extract.py” and the syntax is invalid. Can you explain (in words of one syllable!) exactly what to do?!!
Sorry for being a pain and thanks for providing the code to enable icon extractions. :)
Regards,
Linda

By gooli. April 18th, 2009 at 21:31

Copying BeautifulSoup into the same directory as extract.py should work. By running ‘python extract.py’ I meant opening a command window (Start -> Run -> cmd), going to the directory where you have the extract.py script (using the cd command) and then typing ‘python extract.py’. That should work, let me know if it doesn’t.

By Innovative1. May 3rd, 2009 at 20:16

I tried this using Python 3.0.1 and BeautifulSoup 3.1.0.1 and I get “invalid syntax” at line 12 print tag.string
^

By Innovative1. May 3rd, 2009 at 20:21

Just FYI, the caret (^) is located under the ‘g’ of tag.string on the error. I don’t know if there is any significance.

By Innovative1. May 4th, 2009 at 00:48

Don’t worry about it man. I uninstalled 3.0.1 and installed 2.6.2 and it worked fine. Thanks for this great script! I have been looking for something like this for a long time!

By lionelb. September 14th, 2009 at 03:24

help !

Traceback (most recent call last):
  File "extract.py", line 12, in <module>
    print tag.string
  File "C:\Program Files\Python\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019' in position 53: character maps to <undefined>

By Mike Belsky. September 25th, 2009 at 03:14

Or, using Firefox, open the bookmark (actually load the page), then go to Tools > Page Info > Media.
Scroll through the list to the one that ends in /favicon.ico.

At the bottom of the window click “Save As”.
Save your icon!

By Arturic. October 1st, 2009 at 02:49

Wonderful Script!

I’m still running Ubuntu 8.04 Hardy (just one more month till 9.10 Karmic, yeah!). I installed BeautifulSoup from Synaptic, used the script as described and it worked perfectly. I only got a couple of errors from filenames being too long (from bad websites giving huge page titles, definitely not the script’s fault). I manually edited the bookmarks file and the script did its magic.

python 2.5.2
python-beautifulsoup 3.0.4-1build1

THANK YOU VERY MUCH !!!

By Wanted: bookmarks.html merging program. May 13th, 2013 at 06:41

[...] It also looks pretty easy to use. I like this short example of extracting favicons to .ico files from a bookmarks.html file using Beautiful Soup. At least one other person has released tools to manipulate bookmarks.html files with Beautiful [...]
