Just a collection of random rants that I can't really place anywhere else.

2016-07-20 Fswatch 1.09
2016-05-04 Another fswatch update
2016-03-24 fswatch updated
2016-02-22 OTG Price Claimed
2016-02-01 File System Watcher fswatch
2016-01-05 Commandline flag parsing in Go
2015-11-19 Phoning the Dutch Vehicle Registration
2015-11-09 FE 1.05 released
2015-09-28 Bad Horse
2015-09-24 A generic attribute holder class in Python
2015-08-05 Autorotating jpegs according to EXIF tags
2015-04-15 0MQ in Python
2015-04-13 FE update
2015-02-23 Perl Course notes moved
2015-01-13 For my dad
2014-12-11 Peugeot 307 te koop
2014-11-20 Combining tornado.web StaticFileHandler with dynamic handlers
2014-11-05 Pushing scripts to servers using ssh
2014-11-03 Python3 argparse Mini HOWTO
2014-10-30 Handling secure httponly cookies with Python3 and tornado.web
2014-10-29 Overloading libc Calls
2014-10-27 Python Bootcamp Notes
2014-10-06 Custom Squid Proxy Authentication
2014-08-30 CyclicLog
2014-05-02 Playing with Python
2014-04-08 Kalk 1.38
2014-03-18 Closing C++ lecture
2014-02-26 Spring Cleanup
2014-02-26 Gravity error
2014-02-03 More on the knight tour
2014-01-28 The circular knight tour
2014-01-24 A Perl IBAN Checker
2013-10-16 Go2SEPA is live
2013-07-24 AJAXy Fileuploads with JQuery
2013-05-23 The Ticket
2013-04-24 Who is leeching my Facebook
2013-04-15 Time for a compact Coolpix
2013-04-15 Selling my Peavey amp
2013-03-28 encfs Helper Revisited
2013-03-15 c conf 1.17
2013-02-25 Flitsers op de Nederlandse wegen
2013-01-14 OTG Revisited
2013-01-11 Kalk revisited
2012-12-12 God is coming
2012-11-29 Strangely disturbing
2012-11-27 How to sneeze on a motorbike
2012-11-23 Een aanslag uit de toekomst
2012-11-22 More voluntary layoffs
2012-10-17 Browser fingerprint
2012-10-15 Site layout revamped
2012-10-14 Crossroads performing well
2012-10-08 The saddle seat
2012-09-30 Sudo Exploit
2012-09-26 Technomona is live
2012-09-19 A Perl to HTML Prettyprinter
2012-09-17 Playing with Moose
2012-08-22 How do you export photos from Adobe Photoshop Elements
2012-08-21 Security hole at the Empire State Building
2012-07-15 Matrix Sunglasses
2012-07-12 Two Morgans
2012-07-12 Review of the BMW R1200RT Motorbike
2012-06-13 Passwords: maak het hackers moeilijk
2012-06-13 Mijndomein foutje
2012-06-07 YOLNT
2012-06-06 Human Origins
2012-05-23 Me on Facebook
2012-05-16 Voluntary Layoffs
2012-05-07 RBS Bike Tour
2012-04-27 Room with a view
2012-04-23 Browser cookies and Javascript revisited
2012-02-17 Schrodingers Cat
2012-02-14 KPN heeft problemen
2012-01-12 Memebase forever
2012-01-11 Strange squares
2011-12-22 TVV zuigt ezel
2011-12-08 Dilbert vs Skype
2011-11-29 The uncanny resilience of bulshytt
2011-11-23 Another silly Trojan attempt
2011-10-29 ACTA is coming our way
2011-10-28 Burgernet in the Netherlands
2011-10-27 Facepalm art
2011-10-26 Do not drag this image
2011-10-22 Off The Grid Challenge
2011-10-12 PI like a boss
2011-10-07 Once upon a time
2011-07-13 Dutch eticket system for trains
2011-07-12 Is Hell exothermic or endothermic
2011-04-27 Optical Illusions
2011-04-19 Odd lyrics
2011-04-16 Band Revival at MON
2011-03-13 Protests in the Middle East and you
2011-03-10 Mac OSX Hotkey for locking your system
2011-02-12 dnspb 0.06 is out
2011-02-08 Would I buy this fridge
2011-02-06 InstaYouth
2011-02-05 The Thinker is back
2011-01-17 Math challenge
2011-01-11 Zero tolerance and zero intelligence
2011-01-05 My interest income in 1991
2011-01-01 Your horoscope by Eddie
2010-12-22 New York City Tours might be half price for you
2010-12-20 Weather Forecast
2010-12-14 World Economy Collapse explained in 3 minutes
2010-12-13 The Salvation Army and its choice of toys
2010-12-08 Elizabeth thinks highly of me
2010-12-06 Should I trust my government with my data
2010-12-05 Announcing dnspb
2010-12-03 Realistic piechart
2010-11-26 Crossroads 2.71 is out
2010-11-24 8 bit Starwars
2010-11-17 Six to eight black men
2010-11-16 Canada wants backdoors and data and everything
2010-11-11 Autumn storm over the Netherlands
2010-10-08 USA wants backdoors to everything
2010-10-05 Sudoku solver in Perl
2010-10-02 Finally wrote up a Syscheck page
2010-09-28 Neon sign fail
2010-09-27 The Renault Eco Team
2010-09-23 Crossroads 2.68 is out
2010-09-20 How to suppress Flash cookies
2010-09-15 Meanwhile on Facebook
2010-09-09 The Yes Men Fix The World
2010-09-07 ed is not dead
2010-08-26 Installing Perl modules in a non root environment
2010-08-22 Magic self levitation
2010-08-20 Google Chrome does not support offline Gmail
2010-08-19 The number 48
2010-08-12 Welsh trout mini HOWTO
2010-08-04 Fooling a NetCache proxy into fetching forbidden files
2010-07-30 The world will end on May 21, 2011
2010-07-28 Hiding or showing a textbox with image animation using JQuery
2010-07-27 Manipulating browser cookies using Javascript
2010-07-25 Survival of the fittest book
2010-07-23 Pastafarians in Spain
2010-07-22 You have two sheep
2010-07-09 Highway bank fire
2010-07-08 Setting up a remote git repository
2010-07-06 Bye bye trusted old Macbook
2010-06-28 John Cleese on Football
2010-06-23 ABN Amro and the Pathetic Customer Service Dept.
2010-06-22 Wally does not like criticism
2010-06-14 Soccermatch Netherlands vs Denmark
2010-06-13 Lazy Cat
2010-06-08 Reading public Buzz using the Google API
2010-06-07 A Personal Letter from Steve Martin
2010-06-05 Sushi Saturday
2010-06-04 Suppressing the Enter key with Javascript
2010-05-31 Temporal spacial anomaly on the Dutch highway
2010-05-23 Greenhost will not log your traffic
2010-05-10 Jarlsberg Webapp Exploits
2010-05-04 A Thought Experiment
2010-05-03 SafeEdit information updated
2010-05-01 Microproxy now supports ftp
2010-04-30 What could get Data angry
2010-04-29 Lego Mindstorm solving the Rubik Cube
2010-04-28 Crossroads 2.65 is out
2010-04-17 Goggomobil in its natural habitat
2010-04-14 Bacon Time
2010-04-11 104 More friends to connect with
2010-04-10 Bacteria infested radio reporter
2010-04-07 The Kubat STAR
2010-03-30 Homework Essay
2010-03-29 C++ mutexes again
2010-03-20 Weird Eyechart
2010-03-15 Microproxy 1.01
2010-03-05 Microproxy
2010-03-03 Sven Kramer and the wrong lane
2010-02-26 Endearing Babe Magnet
2010-02-17 Speed of light measured using chocolate and a microwave
2010-02-17 Never again expires after 65 years
2010-02-16 encfs on the Mac
2010-02-15 and sexual predators
2010-02-10 Funny textbook
2010-02-09 DNS failing after sleep wake cycle
2010-02-06 Blast from the past
2010-01-28 Simple and straight Perl HTTP::Proxy
2010-01-15 Avatar the Movie
2010-01-08 Slightly NSFW Linux Ad
2010-01-07 WTF
2010-01-05 Stop Software Patents in the EU
2009-12-05 HammerServer 1.02
2009-11-28 Perls Automagical Autoloading
2009-10-07 Office Poster
2009-10-06 The nr 1 Nerdjoke
2009-10-04 WoW Startscript for my Mac
2009-09-27 HammerServer section is online
2009-09-26 The BING HQ
2009-09-26 Digging a WOW Tunnel
2009-06-29 Wee Todd
2009-06-23 The On Off Switch Revisited
2009-06-22 Meatspace
2009-05-30 My old houses
2009-05-11 LOLcats are funny
2009-05-11 Civic Duty WIN
2009-05-10 Vote for the baby, Sky Radio promo FAIL
2009-05-05 My secure data center
2009-02-15 My Valentine is sending me a dot exe
2009-02-05 MacPorts trash: .mp_123456 savefiles cleaning
2009-02-01 Truecrypt 6 on Linux and the ext3 filesystem
2009-01-28 www versus
2009-01-27 Songsmith and The Police
2009-01-25 My own Ministery of Silly Walks
2009-01-09 CoolIris Mini HOWTO
2008-11-04 UDP and DNS balancing
2008-11-02 Life in graphs
2008-11-01 Skeined yet?
2008-10-30 New Crossroads on the horizon
2008-10-28 Thread safe or not
2008-10-15 WOW patch 3 on a case sensitive MacOSX filesystem
2008-10-15 Surprising C++ optimizations
2008-10-14 Weird system message
2008-10-08 Data mining against terrorism does not work
2008-09-16 Crossroads at the top of
2008-09-09 Stupid spammers at Computable
2008-09-06 Spam prevention with Postfix and Postgrey
2008-09-03 The Gnomish Flying Machine
2008-08-27 Bank customer data on eBay
2008-08-26 Mutexes in C++ Threads
2008-08-22 4M dataloss in the UK last year
2008-08-21 Dropping spam with Postfix and Spamassassin
2008-08-18 Bayes and the War on Photography
2008-08-13 Good marital advice
2008-08-12 Squid proxy for personal usage
2008-08-11 Posix threads in C++
2008-08-09 Crossroads mailing list
2008-08-08 Crossroads 2.00 is out
2008-08-01 Fail Pics
2008-07-14 The Fish Dance
2008-07-01 Big Bother and Massive Data Storage
2008-06-30 MMV One of omitted Unix tools
2008-06-08 Even anonymous breadcrumbs can give you away
2008-05-29 Crossroads in Argentina
2008-05-20 The Party at the Company Outing
2008-05-19 Crossroads 1.80 is out
2008-05-18 Where does technical innovation really come from
2008-05-16 Corporate bs generator
2008-05-15 Even the Vatican has to adapt
2008-05-12 Big Brother is watching your dog
2008-05-09 666 all over the place
2008-04-17 Security and privacy are incompatible
2008-04-16 The Hallmark E Card
2008-04-15 Crossroads Solaris port is out
2008-04-04 Identity theft can cost you dearly
2008-04-03 Crossroads can already do that
2008-03-31 A dangerous safari
2008-03-28 Why some Java J2EE projects are inefficient
2008-03-26 The Hummingbird
2008-03-25 The Easter delusion
2008-03-18 McAfee detects mass hack of 200.000 webpages
2008-03-17 More predictive statistics
2008-03-10 Backwards conclusions even on Slashdot
2008-02-18 A fractal photograph
2008-02-15 Kaprekar revisited
2008-02-14 Kaprekar numbers
2008-02-12 A tale of the criminal ineptitude
2008-02-10 Irritating Selfregistered users in PHPBB
2008-02-08 B2B Spam in the Netherlands
2008-02-06 Surprising iSight Capture
2008-02-05 Breadcrumbs at
2008-01-29 iSight Capture Utility
2008-01-28 The Male Brain
2008-01-26 Searching for the next Uri Geller
2008-01-24 Opt in for b2b spam
2008-01-14 Bokito Revisited
2008-01-13 Top Crossroads User
2008-01-12 World of Warcraft Dancing
2008-01-12 Justice dispensed better late than never
2008-01-11 Jeremy Clarkson and Identity Theft
2008-01-10 Terrorism in the Netherlands
2007-12-07 The mind and bodysnatchers are among us
2007-12-05 Bruce Schneier and Hildo
2007-12-04 Bye bye, good Christian soul
2007-12-03 Confusing mail message
2007-11-30 Medion MD 85276 reviewed
2007-11-29 Recent cases of data exposure
2007-11-20 Bayes bites
2007-11-19 Japan starts fingerprinting foreigners
2007-11-14 Privacy, Yahoo and the Strange World

2007-11-14 Privacy, Fall through algorithms, and Securing data

Today I was discussing data privacy with my good buddy and colleague Eddie. We were talking about the idea of hashing sensitive information and storing the hash, instead of storing the actual plain-text information - an idea that I suggested in my previous note on Yahoo and data privacy. Is storing hashes instead of plain data feasible? When can it be used, and when not? Our discussion prompted me to elaborate a bit.

What's a hash value anyway?

The principle works as follows. One can convert text to a value by chopping the text into chunks and feeding them into some algorithm which is non-reversible. That means that if you have the text, you can compute the value - but not the other way 'round.

Imagine the following hash function: If you want to compute the hash value of a name, then take each letter of that name, where a=1, b=2 and so on, and add these values. The result is the hash of that name. This very simple hash function illustrates the concept: "karel" would become 11+1+18+5+12=47, while "eddie" would become 27. But knowing only the number 47, it would be impossible to reconstruct my first name. So the algorithm is truly one-way.

What good is storing a hash instead of the original data?

So based on this algorithm, one could distinguish two names without actually knowing them, but only knowing the hash. E.g., let's say that both Eddie and I register at some website. I am mainly interested in music and build up my profile accordingly, while Eddie is interested in tech news stories. Now let's say that the website owner is so privacy-aware that he decides not to store the usernames in a database, but only the hashes. I can still log on using "karel" and Eddie can log on using "eddie", but the first thing the website will do, is convert my name to "number 47" and Eddie's to "number 27". Then the website looks up a personal profile based on the hash number, and shows corresponding data on the site. So profile number 47 will display playlists of music, and profile number 27 will display the latest tech news. The trick here is that the actual user name only exists on the login page, where I enter "karel" and Eddie enters "eddie". Beyond that login page, the user name no longer exists, only the hash!

No problems so far. But what is it good for? Well, one important aspect is, that my username is not stored in the site's list of profiles. So a malevolent sysop there cannot get at my username and log in using my account! Since only number "47" is known there, the malevolent sysop has no way of knowing that he must type "karel" to steal my identity.

But wait! This hash algorithm isn't too safe, is it? Number 47 can also be constructed using four j's (four times 10) and one g (7). So user name "jjjjg" would also access my profile. And what if my evil twin Lerak decided to register? He would also hit the same hash!
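The profile lookup and the collision problem can be tried out together. Below is a minimal sketch, assuming the letter-sum hash described above; the profile table and the names toy_hash and profile_for_login are made up for illustration and are not part of any real site.

```perl
use strict;
use warnings;

# Toy hash from the text: sum of letter values, a=1, b=2, and so on.
sub toy_hash {
    my ($name) = @_;
    my $h = 0;
    $h += ord($_) - ord('a') + 1 for split //, lc $name;
    return $h;
}

# Hypothetical profile store: keyed on the hash, never on the user name.
my %profile;
$profile{ toy_hash('karel') } = 'music playlists';
$profile{ toy_hash('eddie') } = 'tech news';

# The user name exists only here, at login time.
sub profile_for_login {
    my ($name) = @_;
    return $profile{ toy_hash($name) };
}

# "lerak" and "jjjjg" collide with "karel" and land on the same profile.
for my $name (qw(karel eddie lerak jjjjg)) {
    print "$name -> ", toy_hash($name), " -> ", profile_for_login($name), "\n";
}
```

Running it shows "karel", "lerak" and "jjjjg" all mapping to 47 and therefore to the music profile, while "eddie" maps to 27.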

Accidentally hitting the same hash is called a 'collision'. Obviously a good hash function should not only be one-way, but should also avoid collisions. There are very good hash functions out there, such as the SHA family. However, none of these is guaranteed to avoid collisions: a hash has a fixed length, so there are always more possible inputs than possible hash values.

Is there a practical application for this?

When I purchase items over the Internet, I often use my last name and account number for billing. The account number in the Netherlands consists of nine digits (e.g., 123456789). I enter the name and account number at the site where I'm purchasing an item, and the site then connects to my bank to see if my balance is sufficient for the transaction. Furthermore the site will surely store my last name and account number, so that I don't have to retype them during a next purchase. They'll do anything to improve the "user experience" and to entice me into visiting them again.

Two things are happening here: One, when I'm purchasing something for ten euros, the site asks my bank: "Is the balance of Kubat, with account number 123456789, sufficient for a transaction of 10 euros?" Two, the last name Kubat and the account number 123456789 are stored in some database of the site.

There are a number of dangers lurking here. First, someone may be eavesdropping on the chitchat between the site and my bank. They can find out that the combination Kubat/123456789 is valid for making purchases, and start making purchases in my name! Second, any malevolent employee of the site can find this in the site's database and also misuse my identity for their profit.

Obviously I don't want that to happen. So we can take a number of measures - e.g., secure transactions between the site and my bank, and restrict database access for all but a few site employees. Unfortunately, however well designed the security measures may be, there will always be ways around them. And there is still the basic question - do the data need to be there in the first place?

So, why don't we use hashes instead of plain text data?

  • The sha512 value of my last name "Kubat" is UlieNqKVJFYPlgu7ZAMUl0J5SG5ZV7nKtF9AK+fuuV/K3uljp0Glj2CyUFvC3N1GCV7SgxBmxkTOmCM+EeIGeA.
  • When I log onto the site, I state my last name just once - on the login page. Beyond that, my last name just isn't available anymore, only the above hash.
  • The hash of my account number "123456789" is 2eZ2LdHI6vbWGzxhkvxAjU1tXxF20MKRabwk5xw/J0rSf81YEbMT1oH35V7ALXPUmclUVba1u1A6z1dPuo/+hQ. I'd have to enter my account number just once, when making the first purchase. After that, the site's database would store that "Person UlieNqKVJFYPlgu7ZAMUl0J5SG5ZV7nKtF9AK+fuuV/K3uljp0Glj2CyUFvC3N1GCV7SgxBmxkTOmCM+EeIGeA has account 2eZ2LdHI6vbWGzxhkvxAjU1tXxF20MKRabwk5xw/J0rSf81YEbMT1oH35V7ALXPUmclUVba1u1A6z1dPuo/+hQ".
  • When I purchase an item, the site would ask my bank: "Does person UlieNqKVJFYPlgu7ZAMUl0J5SG5ZV7nKtF9AK+fuuV/K3uljp0Glj2CyUFvC3N1GCV7SgxBmxkTOmCM+EeIGeA with account 2eZ2LdHI6vbWGzxhkvxAjU1tXxF20MKRabwk5xw/J0rSf81YEbMT1oH35V7ALXPUmclUVba1u1A6z1dPuo/+hQ have a balance high enough to cover ten euros?"
  • My bank cannot reverse the supplied hashes to my name and account number; no-one can. But that's not necessary! My bank has a full list of customers and account numbers. They can pre-generate lists of hashes, and compare the two hashes that the site sends with their list to find my data.
Much better. Since the hashes aren't reversible to values, the security risks have magically disappeared. But what about collisions? Fortunately we are quite able to verify whether collisions occur. They don't - each account number, ranging from 000000000 to 999999999, yields its own unique hash.
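The bank's side of this exchange could look roughly like the sketch below. It is only an illustration under assumptions: the customer list and balances are made-up sample data, and the name balance_sufficient and the "name-hash|account-hash" key layout are hypothetical choices of mine, not an existing banking interface.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha512_base64);

# Hypothetical customer list held by the bank (made-up sample data).
my @customers = (
    { name => 'Kubat', account => '123456789', balance => 250 },
    { name => 'Smith', account => '987654321', balance => 5 },
);

# Pre-generate a lookup table keyed on "name-hash|account-hash".
# The '|' separator never occurs in base64 output, so keys are unambiguous.
my %customer_for;
for my $c (@customers) {
    my $key = join '|', sha512_base64($c->{name}), sha512_base64($c->{account});
    $customer_for{$key} = $c;
}

# Answer the site's question using only the two hashes it sends.
sub balance_sufficient {
    my ($name_hash, $account_hash, $amount) = @_;
    my $c = $customer_for{"$name_hash|$account_hash"};
    return 0 unless $c;    # unknown combination of hashes
    return $c->{balance} >= $amount ? 1 : 0;
}

# The site only ever stores and transmits the hashes.
my ($nh, $ah) = (sha512_base64('Kubat'), sha512_base64('123456789'));
print balance_sufficient($nh, $ah, 10) ? "ok\n" : "refused\n";
```

The bank never reverses a hash; it simply recognises hashes that it computed itself beforehand.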

And consider the following real-life situation. Many people purchase items over the Internet and pay with their credit card. Each transaction needs to be verified with a credit card company, where a website might ask: "Is credit card holder A. Smith, with card number 1111222233334444 and verification code 123 and expiry date 02/10, good for 20 dollars?" Any malevolent person only needs to get hold of these data; once they have them, they are free to roam Internet sites and purchase items using Smith's stolen identity.

In contrast, imagine that the identifying data were a hash of the combined values. In this case we could build up one long identifier, consisting of "Smith,1111222233334444,123,02/10", compute the hash (which is incidentally "svzW3IjYHVr+sFj85FVXyZmMHrtcPMSJdZoTb9BXOjSfoEOdfZYeGYjSlMCbBcaPheYS1yiIMqu7ox+ICxjoIw") and use that -again- in the following way:

  • The hash would be computed just once, when the user registers at the site. After that, the original data would be discarded, and only the hash would be stored in the site's database.
  • Verification of the transaction for 20 dollars would only transmit the hash, not the separate data.
All wannabe identity thieves who are looking for a quick buck: good luck extracting the separate name, credit card number, verification code and card expiry date from this hash.
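Building such a combined identifier takes only a few lines. This is a minimal sketch, assuming the fields are simply joined with commas as in the example above; the function name card_identifier is hypothetical.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha512_base64);

# Join the separate fields with commas and hash the result in one go.
sub card_identifier {
    my ($name, $number, $code, $expiry) = @_;
    return sha512_base64(join ',', $name, $number, $code, $expiry);
}

my $id = card_identifier('Smith', '1111222233334444', '123', '02/10');
print length($id), " characters: $id\n";

# Changing any single field yields a completely different identifier.
my $other = card_identifier('Smith', '1111222233334444', '124', '02/10');
print "differs from the original\n" if $other ne $id;
```

Only the resulting string would be stored and transmitted; the four separate values never leave the registration page.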

So where and how can this approach be used?

The hash approach is suitable for all situations where a requestor asks some central service a question concerning a person, or indeed, anything that's identified by some data, such as in the previous examples. The approach is not suitable in situations where a requestor asks a generic question without supplying an identifier.

So for example, if websites were allowed to ask a credit card company: "Give me a list of all your customers whose balance is high enough to make a $20 purchase", then this approach couldn't be used - there is no "identifier" in the question to hash. Fortunately, this isn't a request that's allowed by the credit card companies (one may hope).

Is this approach in use?

Personally, I haven't heard of it. Except for the fact that passwords are stored in encrypted format; the encryption routine being a one-way non-reversible algorithm. That's why you can't phone the helpdesk and ask, "what is my password again, I lost it". The helpdesk guys don't know, and they have no way of finding out. All they can do is reset your password to some value that you can use for your next login. The password storage methodology is so common that I wonder why the same concept hasn't been applied to other privacy-sensitive data.
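The password case boils down to "hash on storage, hash again on login, compare the hashes". Here is a deliberately simplified sketch of that idea; the in-memory table and the names set_password and check_password are hypothetical, and real systems add a per-user salt and use a purpose-built password hashing scheme rather than a bare sha512.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha512_base64);

# Hypothetical password table: user => digest. The plain password is never stored.
my %stored_digest;

sub set_password {
    my ($user, $password) = @_;
    $stored_digest{$user} = sha512_base64($password);
    return;
}

# Hash the attempt and compare digests; the original password is not needed.
sub check_password {
    my ($user, $attempt) = @_;
    my $digest = $stored_digest{$user};
    return 0 unless defined $digest;
    return sha512_base64($attempt) eq $digest ? 1 : 0;
}

set_password('karel', 'correct horse');
print check_password('karel', 'correct horse') ? "welcome\n" : "denied\n";
print check_password('karel', 'guess')         ? "welcome\n" : "denied\n";
```

This is also why the helpdesk can only reset a password: the table holds nothing from which the original could be read back.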

Will the approach ever be used? One may hope so, I don't see any obstacles, only benefits. Well there is of course one obstacle - websites and credit card companies would need to change their systems to conform to the new way. This won't ever happen unless someone tells them to. As I stated previously: storing private data should be prohibited, unless it is absolutely necessary and there's no other way. That would be a good incentive!

Perl snippets for the playful

If you want to play around with the algorithms, below are a few Perl snippets that I used when writing this.

Here is the "idiotically simple" hash algorithm.


use strict;

# ihash - idiotic hash
# --------------------

# Check the command line.
die ("Usage: ihash name(s)\n") if ($#ARGV < 0);

# Show the hash of all arguments.
for my $n (@ARGV) {
    print ("$n: ", hash($n), "\n");
}

# The hash function: add up the letter values, where a=1, b=2 and so on.
sub hash ($) {
    my $n = shift;
    my $h = 0;
    for my $c (split ('', $n)) {
	$c = lc($c);
	$h += ord($c) - ord('a') + 1;
    }
    return ($h);
}

Here's a short script to display sha512 hashes. You'll need the Perl module Digest::SHA to make this run.


# sha512 - displays the sha512 hash of all arguments

use strict;
use Digest::SHA qw(sha512_base64);

# Check the command line
die ("Usage: sha512 strings\n",
     "Displays the sha512 digest of all strings.\n") if ($#ARGV < 0);

for my $str (@ARGV) {
    print ("$str: ", sha512_base64 ($str), "\n");
}

I verified that Dutch account numbers between 000000000 and 999999999 do not collide when hashed using sha512. Here's how.


# sha-accnr-checker
# If we sha512-encode Dutch account numbers (which have 9 digits),
# will we encounter collisions?

use strict;
use Digest::SHA qw(sha512_base64);

# Make the output unbuffered
$| = 1;

# Hash of seen digests
my %used;

for my $nr (0..999999999) {
    # Pad account number to 9 positions, compute the digest
    my $acc = sprintf ("%9.9d", $nr);
    my $dig = sha512_base64 ($acc);
    # Stop if there's a collision
    die ("\n$acc conflicts with $used{$dig}\n",
	 "both yield sha512: $dig\n") if (exists $used{$dig});
    # Store digest since we've now used it
    $used{$dig} = $acc;

    # Show a ticker so we see what's going on
    print ("\r$acc") if (! ($nr % 10000));
}

# All done
print ("\nno collisions detected\n");

2007-11-07 European airlines to retain data
2007-11-03 BloggEd
2007-10-30 Wilders and
2007-10-28 The goldplated Mac
2007-10-26 More morons
2007-10-26 Dilbert nails it again
2007-10-23 Rough yet funny
2007-10-05 Another silly Trojan mail
2007-10-01 So ugly it is beautiful
2007-09-28 Here is a nickel kid
2007-09-23 Spy Shredder
2007-08-29 Web svn view 1.08
2007-08-24 Caught in THE Process
2007-08-21 Stupid Trojan attack
2007-08-21 Back in 1994
2007-08-20 A girly iPod
2007-08-17 Crossroads for RDP connections
2007-08-15 Firewall art
2007-08-14 jpeginfo
2007-08-13 Good People
2007-08-07 The Real Crossroads
2007-07-30 BBC Documentaries in the Netherlands
2007-07-12 No problems with Crossroads so far
2007-07-11 Politically correct ad nauseam
2007-07-02 Waka Waka Poem
2007-07-02 Voyage of the rubber ducks
2007-06-28 The On Off Switch
2007-06-27 No free lunch
2007-06-25 Crossroads web interface
2007-06-25 Blinkenlights
2007-06-21 There is no silver bullet
2007-06-18 Motto of the week
2007-06-18 Do not feed the troll
2007-06-17 Which programming language are you
2007-06-13 Crossroads support request
2007-06-12 Bokito glasses
2007-06-07 Apache mod_proxy balancer description
2007-06-05 A ticketnumber is not support
2007-06-05 403 Hammertime
2007-06-04 Playground Fun
2007-05-24 Ascii man
2007-05-07 Cannot find the damn server
2007-05-02 The BFG200
2007-04-27 Crossroads Top User
2007-03-30 Crossroads Usage
2007-03-25 The guy with the dark motorhelmet
2007-03-22 The Process and The Result
2007-03-21 Quotes attributed to Jos
2007-03-20 A really nice comment about Crossroads
2007-03-18 Kubat in the air