Gvim 7.2 with Python 2.5/2.6 support Windows binaries

March 31st, 2009

Using the excellent instructions at ShowMeDo by Jack Atkinson, I built versions of both Gvim and Vim for Windows with support for Python 2.5 and for Python 2.6.

Python 2.4: Support is already built-in in Gvim 7.2

Python 2.5: Install Gvim 7.2 and replace vim.exe and gvim.exe with the files in gvim72python25.zip

Python 2.6: Install Gvim 7.2 and replace vim.exe and gvim.exe with the files in gvim72python26.zip

Python 3.1: I wasn’t able to build Gvim 7.2 with Python 3.1 (most probably due to the backwards incompatible changes in Python 3)

To determine what version of Python your Gvim installation supports use the following command in Gvim:

:python import sys; print sys.version

Digital reading

February 10th, 2009

I haven’t written here in quite a while and was going to keep it that way, but an interesting post about eBook readers caught my eye today. I started writing a comment and, as often happens, discovered I had way too much to say. Hope I won’t bore you to tears.

I’ve been a longtime proponent of digital reading. I started out with the (slightly darker) gray on (slightly lighter) gray display of an old Palm III. I read the entire Hitchhiker’s Guide to the Galaxy series (which, incidentally, I didn’t like very much) on that device, straining my eyes to see the badly drawn fonts. When the old PDA broke, I moved on to a Palm Tungsten T3 and have been reading books on it ever since using the wonderful Plucker software. I read about a hundred books over the last 4 years, almost all of them in digital format, and I enjoyed every minute of it. Most of the time when I’m reading I forget I’m holding a digital device in my hands and not a real book, except that I find it’s much more convenient to hold the PDA. The first thing I noticed about digital reading is that I could hold the device and turn pages using just one hand. I was taking the train to work at the time, and it was so crammed I had to stand most of the way. I found that I could stand on the train squeezed between odd-smelling passengers, hold on to a pole with one hand and hold the PDA in the other, comfortably reading and turning pages with my thumb. Paperbacks, on the other hand, I can’t even figure out how to hold properly with both hands. If you open a paperback all the way, you ruin the book with an ugly crease on the back of the cover. If you don’t, you have to awkwardly hold it half open, sticking your thumb in the middle, and struggle to read the inner part of the pages. Perhaps I’m overly modern in my approach, but I fail to see the sentiment in holding, owning and reading paperbacks printed on the cheapest recycled grayish paper with the cheapest possible ink in a font that’s either too small or too large, but is never the right size. In modern terms, I’d say the user interface of the common paperback leaves much to be desired.
Note, I’m not talking about a rare edition of a classic whose pages were manually arranged with love and caring. I’m talking about those pulp fiction books we all read and forget about a day later which are printed at a factory somewhere in China and shipped in containers for us to read. There’s no sentimental value in most of those books either in form or content and we don’t read most of them more than once.

I believe the psychological aspects of holding a book, smelling it (ew!) and turning its pages will last at most a single generation once a proper technology is available. One of the things I like best about reading from a PDA is the immediate access to more information about what I’m reading. Even when I read something as simple as a Vince Flynn thriller I sometimes use a dictionary or turn to Wikipedia for some more info on a term, a place or a person. This is one thing Amazon got right with their Kindle – it’s always online and you have full access to Wikipedia, a dictionary and other internet resources. The experience of reading a book is so much richer when you can instantly find out more about what you are reading. This is even more important for scientific or professional books. The ability to jump directly to a cited source and read the original material the author based her claims on is quite amazing, as I imagine you know from reading stuff on the web. The feeling is only enhanced by the ability to do that while in bed or sprawled on a sofa.

A final point that came to me while writing this is that the digitization of books can help books evolve into a different, non-linear medium. The result might be similar to the hypertext we see on the web, but perhaps more akin to the Dungeons and Dragons books that were common in the early nineties. In those books you could choose the storyline by jumping to some page in the book according to the author’s instructions. Perhaps the Interactive Fiction style of games will return with a vengeance and allow us to be more active participants in the books we read.
I don’t believe paper books will disappear and be completely replaced by digital books. After all, the TV hasn’t completely replaced the radio and CDs and MP3 players haven’t replaced vinyl records. However, when new technology emerges, the old ones usually mutate and find the niche they serve best. After all, people watch TV at home but still prefer the radio when they are driving. They listen to old jazz records on vinyl, but take their MP3 player when they jog (which might be a bit inconvenient carrying a record player). In a similar manner, paper books and digital books will coexist each finding its niche. Meanwhile, I’ll be ordering myself a brand new Sony PRS-700 and hoping that somebody ports a dictionary program and a Wikipedia browser to it so I’ll be able to enjoy a truly enhanced digital reading experience.

Unicode and permalinks

September 22nd, 2008

While working on the integration of automation scripts with Testuff, I encountered an interesting Unicode-related issue I’d like to share.

The integration allows an automated testing script to report the results of its run to the Testuff server. In order for the results to be grouped, displayed and summarized correctly, the automation script needs to tell the server which test it ran, and whether the test has passed or failed. A long discussion emerged on the best way to uniquely identify tests.

After quite a bit of back and forth, we’ve settled on permalinks, those more-or-less-readable URLs that are in common use in blogs. The idea of a permalink is to take the title (of a blog post or a test) and replace any characters that aren’t numbers or letters with an underscore or a hyphen. Using this simple scheme, “Unicode and permalinks” becomes “unicode-and-permalinks”, which is quite suitable for use in a URL.

The implementation is a simple regular expression:

import re

def to_permalink(string):
    # Replace every run of non-alphanumeric ASCII characters with "_".
    return re.sub("[^a-zA-Z0-9]+", "_", string).lower()

While this code works perfectly for English, it doesn’t work at all if string is a Unicode string containing something in Hebrew, Russian or Polish – languages that some of our customers use. And so, I set out to write code that would essentially behave like the regular expression above, but work for letters and numbers in all the languages of the world.

Fortunately the Unicode standard includes a rarely used classification of characters into various categories. For each given character we can find out whether it is an uppercase letter, a lowercase letter, a number, a punctuation mark and so on. Surprisingly, Python includes a module called unicodedata that contains all that information. The function category accepts a character and returns a string that tells us what the character is: “Lu” denotes an uppercase letter, “Nd” denotes a decimal digit, etc.
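To make the categories concrete, here is a minimal sketch, assuming Python 3 (the snippets in this post were written for Python 2). The sample characters and the categories helper are my own picks for illustration, not part of the Testuff code.

```python
import unicodedata

# Sample characters chosen for illustration only.
samples = ["A", "a", "5", "א", "я", "-", " "]

def categories(chars):
    """Map each character to its two-letter Unicode general category."""
    return {c: unicodedata.category(c) for c in chars}

for char, cat in categories(samples).items():
    # The first letter of the category is the broad class:
    # L = letter, N = number, P = punctuation, Z = separator.
    print(repr(char), cat)
```

The first letter of the category is all we need to tell letters and numbers apart from everything else.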

All that remains to be done is go over the characters in the title, keep the letters and numbers, and replace all the other characters with a dash or an underscore. The regular expression at the end replaces any sequence of underscores with a single underscore to make the resulting URLs even nicer to look at.

def to_permalink(s):
    """
    Converts sequences of characters that aren’t letters or numbers
    to a single underscore to achieve Wikipedia-like Unicode URLs.
    """
    import re
    import unicodedata
    def conv(c):
        if unicodedata.category(c)[0] in ["L", "N"]:
            return c
        else:
            return "_"
    s2 = "".join([conv(c) for c in s])
    return re.sub("_+", "_", s2)

[Update] Or, as Almad correctly pointed out, you could just use the re module support for Unicode and be done with it in two lines, which kind of takes the air out of this post.

def to_permalink(s):
    import re
    return re.compile(r"\W+", re.UNICODE).sub("_", s)
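For comparison, here is a rough side-by-side of the two approaches, assuming Python 3, where re patterns on str are Unicode-aware by default. The function names and sample titles are made up for this sketch.

```python
import re
import unicodedata

def permalink_by_category(s):
    """Keep letters and numbers (categories L* and N*), turn the rest into "_"."""
    kept = [c if unicodedata.category(c)[0] in ("L", "N") else "_" for c in s]
    return re.sub("_+", "_", "".join(kept))

def permalink_by_regex(s):
    r"""Replace every run of non-word characters (\W) with a single "_"."""
    return re.sub(r"\W+", "_", s)

# Hypothetical sample titles in English, Hebrew and Russian.
for title in ["Unicode and permalinks", "עמוד ראשי", "Привет, мир 2008"]:
    assert permalink_by_category(title) == permalink_by_regex(title)
    print(permalink_by_regex(title))

# One difference worth knowing: \w also matches "_", so the regex version
# leaves an existing run of underscores alone instead of collapsing it.
print(permalink_by_category("a__b"), permalink_by_regex("a__b"))
```

For ordinary titles the two agree, so the shorter regex version is usually good enough.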

There’s one other thing to consider when dealing with Unicode permalinks. If you’re a native speaker of a language other than English, you’ve probably seen URLs in your own language on Wikipedia.

From the looks of it, URLs can include characters in any language. Right?

Wrong.

RFC3986 defines the syntax for URLs (actually URIs, but that’s a moot point) explicitly and states which characters are allowed in a URL. This includes little more than English letters and numbers from the lower half of the ASCII chart.

If you look at the headers your browser passes when you access such a URL, you’ll see that it encodes all the characters with percent encoding, so neither the browser nor the web server is violating the standard. This is what the server saw when I navigated to the main Hebrew page of Wikipedia:

GET /wiki/%D7%A2%D7%9E%D7%95%D7%93_%D7%A8%D7%90%D7%A9%D7%99 HTTP/1.1
Host: he.wikipedia.org

In order to understand what this percent encoding means, you need to know a bit about Unicode. Basically, the Unicode URL is encoded in UTF-8 and each byte of the UTF-8-encoded string is encoded using percent encoding. The browser apparently recognizes this specific encoding scheme (which isn’t documented anywhere I could find) and displays nice internationalized URLs for the user.

If you want to support such URLs in your server, you’ll probably need to write some code to translate the percent-encoded URLs into their actual Unicode representation.
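Here is a minimal sketch of that translation, assuming Python 3 and its standard urllib.parse module. The decode_path helper is a name I made up; the path is the one from the Wikipedia request above.

```python
from urllib.parse import quote, unquote

def decode_path(raw_path):
    """Turn a percent-encoded request path into readable Unicode text.

    unquote() decodes the %XX escapes as UTF-8 by default.
    """
    return unquote(raw_path)

raw = "/wiki/%D7%A2%D7%9E%D7%95%D7%93_%D7%A8%D7%90%D7%A9%D7%99"
decoded = decode_path(raw)
print(decoded)

# quote() goes the other way and rebuilds the percent-encoded form.
assert quote(decoded) == raw
```

Most web frameworks probably do this decoding for you, so check what yours hands you before decoding a second time.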

Solution for Edimax router BR-6204WG being slow

August 31st, 2008

This message is for anybody who has recently bought the aforementioned router and is experiencing a slow connection and many timeouts. This affects the routers that have the 1.57 version of the firmware installed.

To check the firmware version, go to http://192.168.2.1 (that’s the IP of the router in its default configuration) and click Status Info at the top of the page that appears. The firmware version is written under Runtime Code Version.

The 1.57 version apparently has a bug that did not exist in previous versions and has been fixed in newer ones. The latest official firmware release is 1.51, and it can be found at the Edimax site. I’ve also found a 1.58 version on an Israeli dealer’s site here.

Direct link to 1.51 (edimax.com): BR6204Wg_1.51.zip (unzip it before upgrading)
Direct link to 1.58 (pikok.co.il): EdiEngBR6204Wg_1.58.bin

I’ve tested both 1.51 and 1.58, and they both fix the problem.

Upgrading the firmware is actually quite easy.

Go to http://192.168.2.1, select System Tools at the top, then Firmware Upgrade on the left and follow the instructions on the screen.

Hope this helps somebody someday.

Disable scroll wheel clicking in Firefox

August 14th, 2008

I have one of those Microsoft mice here at work and its wheel is quite
light to the touch. It is so light in fact, that I often click it while
scrolling. By default, clicking the scroll wheel in Firefox puts it
into scrolling mode, showing this icon:

Now when you move the mouse up and down, the page will scroll.

That is so annoying!

It took me over 20 minutes of Googling to find how to disable that feature, so I’m documenting it here.

The option that controls it in Firefox is Tools > Options > Advanced > General > Browsing > Use autoscrolling.

Apparently “auto scrolling” is the name of this obscure feature, which is now forever disabled on all my machines.

Accessing SVN revision via a browser

August 10th, 2008

Most people who use Subversion know that you can access the repository with your browser to get a read-only interface that you can use to take a cursory look at the files in there.

This is what the Python repository looks like via http://svn.python.org/projects/python:

It says Revision 65620 at the top and I’ve always wondered if you could access another revision in the same simple way. Turns out there is a way.

All you need to do is insert !svn/bc/REVISION into the URL, right after the repository root (here, /projects):

http://svn.python.org/projects/!svn/bc/5000/python shows revision 5000 of the Python Subversion repository.
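If you need such URLs often, a tiny helper can build them. This is only a sketch based on the example above: revision_url is a made-up name, and it assumes you know where the repository root ends and the path inside the repository begins.

```python
def revision_url(repo_root, revision, path=""):
    """Build a browsable URL for a path as it looked at a given revision.

    repo_root is the repository root URL, revision is a revision number,
    and path is the location inside the repository.
    """
    base = "%s/!svn/bc/%d" % (repo_root.rstrip("/"), revision)
    path = path.strip("/")
    return base + "/" + path if path else base

print(revision_url("http://svn.python.org/projects", 5000, "python"))
```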