Your life in bits
July 2003
Some computer users seem to have a weakness for
hoarding. They just can’t stop gathering data and information
“because they can”. Watch anyone new to the internet who has
discovered how to download files – their hard disk soon fills up as
they collect anything and everything they can. Some eventually pause
to reflect and realise that the internet icon on their desktop can
be regarded as the world’s largest disk drive, and it’s accessible
on demand. So why bother setting up your very own mirror site?
However, Big Brother is much the same as an
internet neophyte, and he has not begun to learn the benefit of
restraint. Without really thinking through all the issues, the
collection of data by all those “who can” continues apace, with
numerous new snooping devices – like the mushrooming of closed
circuit TV in town centres and Ken’s London congestion charge spies
– adding countless terabytes of data to the vaults of Big Brother.
The Data Protection Act exists to protect
private individuals, but as anyone who has ever tried to navigate
the Act will realise, it’s a very convoluted and frequently
self-contradictory piece of legislation. And as with the laws on
firearms, the bottom line is that the only two classes of society
with unfettered access to what these two bodies of legislation are
meant to control are criminals and the government. And in pursuit of
its self-appointed right to possess that data, the government has
provided itself with all sorts of powers to enable its many agents
to sniff, tap and intercept.
Just recently, we’ve seen this government take
some pretty big decisions with dubious public approbation, because
“Tony knows best”; and although the consequences of those decisions
are difficult to hide, the consequences of HMG squirrelling away
personal data are very easy to hide.
And now Gordon Bell, the father of the VAX (a
venerable minicomputer from DEC that led to Windows NT – and
that’s a very long story cut short) and one of Microsoft’s research
gurus, is heading a Microsoft research project, the goal of which is
to collect everything you watch, read, listen to, and write about
into a single, searchable database. For many people, simply
keeping every bit of email sent and received would provide the basis
of the mother of all diaries, but Gordon has gone the whole hog
with the typical enthusiasm of a technologist with a
new toy, and has been transferring as much of his life as he can
assemble – the stuff he’s written, the books he’s read, the CDs he’s
listened to, the DVDs he’s watched, e-mail, phone calls, voice messages,
television programs, and even the URLs he visits – into electronic
form for storage in his MyLifeBits database.
He reckons that the storage required to contain
all this is about one terabyte (1000Gbyte), which is just about five
hard drives’ worth today, and probably a single drive in the
foreseeable future. And guess what? Microsoft has just the OS
required to manage this sort of data compendium with an integrated
database/filing system coming along in development, for release in
2004/5. You can read more about all this at: http://research.microsoft.com/barc/MediaPresence/MyLifeBits.aspx
where you can watch Gordon at the scanner and collect other
documents and links. The conclusion of the report includes this
line: “Gordon Bell, our alpha user, has digitized nearly
everything possible from his entire life, and will have eliminated
all paper (except those legally required) by the time this paper is
published.”
This all displays a touching faith in MS SQL
Server – but I know that if I were to try to do the same, just as the
last photo dropped into the shredder, the blue screen of death would
appear. (I still bear a huge grudge over the 2Gbyte file size limit
imposed on Outlook 2000 PST files, which simply trashes the entire
file without warning.)
And maybe Gordon is not too bothered with
privacy (perhaps all Microsoft staff realise that they have been
assimilated and that all resistance is futile), but as any 12-year-old
knows, it’s the easiest thing in the world to alter data. From
the time on an email message to the finest works of Industrial Light
and Magic, it is not possible to believe anything that comes off a
hard disk at face value. Computer law specialist and barrister Alistair
Kelman has been aware of these issues longer than most in his
profession, and has been thinking them through.
“The legal process presently accepts the
validity of any digital data presented as evidence in a somewhat
naive and unquestioning manner. The coughs on the sound track that
convicted Major Charles Ingram in the Millionaire trial come at the
low end of the scale of the ‘easily faked’, but no one questioned the
validity of that data.
“The idea that information drawn from something
like the MyLifeBits database is taken at face value is alarming, to
put it mildly. So before getting carried away with this type of
endeavour, we should be spending more time considering the
fundamental issues of proof and privacy.”
But whatever anyone thinks of the MyLifeBits
project – a wheeze to flog a new Microsoft OS and more hard drives,
or something really significant – the need to radically rethink the
way that personal operating systems cope with terabytes of data is a
highly pertinent topic. Although when faced with the enormity of
such tasks, we should remember that the internet is the ultimate in
big filing systems, storing googols of data that can be accessed in
milliseconds by the magic of IP and DNS – techniques that are now 30
years old and were invented before hard drives could handle more than a
megabyte or two.
And Bill Gates was still in high school.