PS Consultants - ideas & solutions

Your life in bits

July 2003

Some computer users seem to have a weakness for hoarding. They just can’t stop gathering data and information “because they can”. Watch anyone new to the internet who has discovered how to download files – their hard disk soon fills up as they collect anything and everything they can. Some eventually pause to reflect and realise that the internet icon on their desktop can be regarded as the world’s largest disk drive, and it’s accessible on demand. So why bother setting up your very own mirror site?

However, Big Brother is much the same as an internet neophyte, and he has not begun to learn the benefit of restraint. The collection of data by all those “who can” continues apace, without anyone really thinking through the issues, and numerous new snooping devices – like the mushrooming closed-circuit TV in town centres and Ken’s London congestion charge spies – are adding countless terabytes of data to the vaults of Big Brother.

The Data Protection Act exists to protect private individuals, but as anyone who has ever tried to navigate the Act will realise, it’s a very convoluted and frequently self-contradictory piece of legislation. And as with the laws on firearms, the bottom line is that the only two classes of society that have unfettered access to the subject of these two bodies of legislation are criminals and the government. And in pursuit of its self-appointed right to possess that data, the government has provided itself with all sorts of powers to enable its many agents to sniff, tap and intercept.

Just recently, we’ve seen this government take some pretty big decisions with dubious public approbation, because “Tony knows best”; and although the consequences of those decisions are difficult to hide, the consequences of HMG squirrelling away personal data are very easy to hide.

And now Gordon Bell, the father of the VAX (a venerable minicomputer from DEC that led to Windows NT – and that’s a very long story cut short) and one of Microsoft’s research gurus, is heading a Microsoft research project, the goal of which is to collect everything you watch, read, listen to, and write about into a single, searchable database. For many people, simply keeping every bit of email sent and received would provide the basis of the mother of all diaries, but Gordon has gone the whole hog with the typical enthusiasm of a technologist with a new toy, and has been transferring as much of his life as he can assemble – the stuff he’s written, the books he’s read, CDs he’s listened to, DVDs he’s watched, e-mail, phone calls, voice messages, television programs, and even the URLs he visits – into electronic form for storage in his MyLifeBits database.

He reckons that the storage required to contain all this is about one terabyte (1,000 Gbytes), which is just about five hard drives’ worth today, and probably a single drive in the foreseeable future. And guess what? Microsoft has just the OS required to manage this sort of data compendium, with an integrated database/filing system, coming along in development for release in 2004/5. You can read more about all this at: where you can watch Gordon at the scanner and collect other documents and links. The conclusion of the report includes this line: “Gordon Bell, our alpha user, has digitized nearly everything possible from his entire life, and will have eliminated all paper (except those legally required) by the time this paper is published.”
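For a rough sense of the storage figures in the paragraph above, here is a minimal Python sketch of the arithmetic. The one terabyte and five drives come from the article; the per-drive capacity is simply derived from them, and the function names are made up for illustration.

```python
import math

# Figures quoted in the article (decimal units: 1 terabyte = 1,000 Gbytes).
TOTAL_GBYTES = 1000
DRIVES_TODAY = 5


def gbytes_per_drive(total_gbytes: float, drives: int) -> float:
    """Capacity each drive must have if the total is spread evenly."""
    return total_gbytes / drives


def drives_needed(total_gbytes: float, drive_gbytes: float) -> int:
    """Whole number of drives needed to hold the total."""
    return math.ceil(total_gbytes / drive_gbytes)


# Implied size of one of "today's" drives, derived rather than quoted.
implied_drive_size = gbytes_per_drive(TOTAL_GBYTES, DRIVES_TODAY)
print(f"Implied drive size: {implied_drive_size:.0f} Gbytes")
print(f"Drives needed at that size: {drives_needed(TOTAL_GBYTES, implied_drive_size)}")
```

On these numbers, "a single drive in the foreseeable future" means a fivefold increase in capacity per drive.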

This all displays a touching faith in MS SQL Server – but I know that if I were to try to do the same, just as the last photo dropped into the shredder, the blue screen of death would appear. (I still bear a huge grudge for the 2Gbyte file size limit imposed on Outlook 2000 PST files, which simply trashes the entire file without warning.)

And maybe Gordon is not too bothered with privacy (perhaps all Microsoft staff realise that they have been assimilated and that all resistance is futile), but as any 12-year-old knows, it’s the easiest thing in the world to alter data. From the time on an email message to the finest works of Industrial Light and Magic, it is not possible to believe anything that comes off a hard disk at face value. Computer law specialist, barrister Alistair Kelman, has been aware of the issues longer than most in his profession, and has been thinking them through.

“The legal process presently accepts the validity of any digital data presented as evidence in a somewhat naive and unquestioning manner. The coughs on the sound track that convicted Major Charles Ingram in the Millionaire trial come at the low end of the scale of the ‘easily faked’, but no one questioned the validity of that data.

“The idea that information drawn from something like the MyLifeBits database is taken at face value is alarming, to put it mildly. So before getting carried away with this type of endeavour, we should be spending more time considering the fundamental issues of proof and privacy.”
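To make the earlier point about the time on an email message concrete, here is a minimal sketch using Python's standard `email` library. The addresses, subject, dates and the helper names `build_message` and `backdate` are all invented for illustration; the only point is that the `Date` header is ordinary text that whoever holds the file can rewrite.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime, parsedate_to_datetime


def build_message(sent_at: datetime) -> EmailMessage:
    """Create a small message stamped with the given time (hypothetical content)."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Meeting notes"
    msg["Date"] = format_datetime(sent_at)
    msg.set_content("See you on Thursday.")
    return msg


def backdate(msg: EmailMessage, new_time: datetime) -> EmailMessage:
    """Overwrite the Date header in place; nothing else in the message changes."""
    msg.replace_header("Date", format_datetime(new_time))
    return msg


real_time = datetime(2003, 7, 1, 9, 30, tzinfo=timezone.utc)
fake_time = datetime(2001, 1, 1, 12, 0, tzinfo=timezone.utc)

original = build_message(real_time)
forged = backdate(build_message(real_time), fake_time)

print("Original Date:", original["Date"])
print("Forged Date:  ", forged["Date"])
```

Nothing in the stored message records that the header was changed, so the file alone cannot show which date is genuine.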

But whatever anyone thinks of the MyLifeBits project – a wheeze to flog a new Microsoft OS and more hard drives, or something really significant – the need to radically rethink the way that personal operating systems cope with terabytes of data is a highly pertinent topic. Although when faced with the enormity of such tasks, we should remember that the internet is the ultimate in big filing systems, storing googols of data that can be accessed in milliseconds by the magic of IP and DNS – techniques that are now 30 years old and were invented before hard drives could handle more than a megabyte or two.

And Bill Gates was still in high school.