timothy falconer's semantic weblog
Big Fractal Tangle


RDF
 



universal human identifiers

Anyone who's made software for any length of time will be familiar with the perennial quandary: how do we uniquely identify human beings? First and last names are no good. Home addresses and phone numbers change all the time. People are reluctant to give out Social Security Numbers. This leaves the most unlikely candidate of all: email addresses.

How did we arrive at email addresses as our universal identifier? Nearly every web account uses email addresses. PGP uses email addresses. Even FOAF uses email addresses (hashed or not) as the primary key. It seems odd that the thing about us that changes most often (other than our spleen) should be chosen as our persistent identifier. Perhaps the people making this decision have more stable email addresses (universities, companies, etc.). Most non-corporate, non-academic people I know change ISPs about every fifteen milliseconds, requiring an email change each time.
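
For the curious, the "hashed" form of a FOAF mailbox is the foaf:mbox_sha1sum property, which holds the SHA-1 digest of the full mailto: URI. Here's a minimal sketch in Python. The function name and the example.org address are made up for illustration, and I'm assuming the address is hashed exactly as written, with no case normalization.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Return the SHA-1 hex digest of the mailto: URI for an address.

    Assumption: the address is hashed as given (only surrounding
    whitespace is stripped); no lowercasing is applied.
    """
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


# Hypothetical address, used only to show the shape of the result.
digest = mbox_sha1sum("someone@example.org")
print(digest)  # 40 hexadecimal characters
```

The hash hides the address from casual readers, but it is still a stable key: change the mailbox and the digest changes with it.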

One solution is domain names. I happen to own timothyfalconer.com, which serves the purpose. Companies are likewise identified by domain. Is it worth $15 a year to preserve my identity? And what of the "Bob Smiths"? Will they have to get "BobSmithOfSchenectadyNewYork.com"? Web services like the Handle System and purl.org and even Microsoft's Passport have all tried to address this problem. There doesn't seem to be any contender (that I know of) for the Universal Human Identifier.

Let's reconsider the SSN (or equivalent) for a sec. Everyone in the US has a unique lifetime number assigned to them, and most of us have it memorized. It's used in all our official paperwork. It's used as a student ID number in college. Yet it's considered too risky to use more generally. For some reason, people are more reluctant to give out their SSN online than their credit card numbers. Most cite privacy issues as the reason:

There are two problems with the way SSNs are used these days. The first is that they are used (by different parties) as if they were both a representation of identity and a secure password. The second problem is that they have become a widely used identifier which can be used to tie multiple records together about a single individual.

Hmm..."widely used identifier which can be used to tie multiple records together." This sounds strangely familiar. Isn't timothyfalconer.com just as dangerous to me as my SSN? Won't human URIs, whatever we use, cause exactly the same concern?

And is this concern really warranted? Wouldn't we just be better off finally accepting that our DNA Fingerprint (or the like) will be emblazoned on everything we touch, and this information will be accessible to everyone, whether we like it or not? Isn't our trust in "privacy policies" a kind of social pacifier, that in the end just relaxes our guard? And when you come right down to it, don't you think that anyone, anyone, with enough motivation and money, could pretty much find out everything they wanted about you, even now?

I mean really ... we live in an age where anyone can type your name in Google or InfoSpace and find tons about you, including your street address, complete with aerial map. Isn't it time to just give up and let our clothes drop to the floor? Aren't you just as likely, or more, to be victimized by a random passing stranger as a lurking Internet creep or evil corporation?

Lots of questions, but no answers tonight. In this area, I really just don't know. It's still worth asking the questions though.






Comments

I'm not afraid that "http://www.markbaker.ca" is passed around as identifier for me, because although anybody can claim it as their own (as with an SSN), I'm the only one who can back up that claim as I control the content that's returned on a GET (unlike with an SSN).

Check out Technorati's means of "claiming" a weblog.

posted by Mark Baker at January 4, 2004 10:32 PM


I think you overlook the main problem with SSNs as a public identifier. If someone has access to that, they can apply for credit in your name. I think widely dispersed SSNs would make it much more likely that you'd be victimized in this way than by a passing stranger. Would you be willing to post yours online?

posted by Jen Golbeck at January 4, 2004 10:49 PM


Hi. Quick correction re FOAF. FOAF doesn't dictate that mailboxes (or their hashes) are "the FOAF way" of identifying you. FOAF's way is to say that any property that is an owl:InverseFunctionalProperty can be used. This includes but isn't limited to foaf:homepage, foaf:mbox etc.
My take on the problem is at
http://rdfweb.org/mt/foaflog/archives/000039.html

One observation re mailboxes changing: http://xmlns.com/foaf/0.1/#term_mbox
[[
personal mailbox - A personal mailbox, ie. an Internet mailbox associated with exactly one owner, the first owner of this mailbox. This is a 'static inverse functional property', in that there is (across time and change) at most one individual that ever has any particular value for foaf:mbox.
]]

This allows for scenarios such as my losing access to mailto:daniel.brickley@bristol.ac.uk yet it remains in FOAF terms a foaf:mbox of mine, and hence a thing that can be used to identify me, even years later. Of course we might now want to add some notion of 'current mailbox', 'work vs home' mailboxes etc., but all in good time...
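
A rough sketch of how that kind of identification could work in practice: records that share a value for any inverse functional property are merged into one individual, so an old mailbox keeps linking descriptions together even after it stops being used. The record layout, the IFPS tuple, and the smush function are illustrative assumptions, not part of the FOAF spec or of any FOAF library.

```python
from collections import defaultdict

# Properties treated as inverse functional in this sketch (assumption:
# a small hand-picked set; real data would take this from the schema).
IFPS = ("mbox", "homepage")


def smush(records):
    """Group record indices that share a value for any IFP (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group's root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    seen = {}  # (property, value) -> index of the first record carrying it
    for i, rec in enumerate(records):
        for prop in IFPS:
            for value in rec.get(prop, []):
                key = (prop, value)
                if key in seen:
                    # Same IFP value seen before: same individual.
                    parent[find(i)] = find(seen[key])
                else:
                    seen[key] = i

    groups = defaultdict(list)
    for i in range(len(records)):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Hypothetical descriptions; the addresses are placeholders.
records = [
    {"mbox": ["mailto:old@example.org"]},
    {"mbox": ["mailto:old@example.org", "mailto:new@example.org"]},
    {"homepage": ["http://example.com/"]},
    {"mbox": ["mailto:new@example.org"]},
]
merged = smush(records)
print(merged)  # [[0, 1, 3], [2]]
```

Records 0 and 3 never mention the same mailbox, yet they end up in one group because record 1 carries both the old and the new address.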

posted by Dan Brickley at January 5, 2004 06:25 PM


I really don't know how I feel about SSNs. I'm simply raising two questions: 1) are privacy fears regarding SSNs still justified? 2) if they are, why wouldn't *any* universal identifier (URI) carry the same danger? This second point is connected to SSNs' ability to tie information about me together easily.

As for their "password" quality, perhaps the question that needs asking is "Why would a bank give credit simply because I gave them a number?" Perhaps the real danger is our false comfort in thinking people don't know our SSNs. If everyone knew our SSNs, then people wouldn't be so stupid as to treat them as some kind of personal password.

Maybe our efforts are better spent lobbying for banks and hospitals to make more of an effort verifying our identity. I mean, come on.... "SSN", "mother's maiden name?", "zip code", "city of birth" ... seems like identity theft would be a piece of cake. I mean, how hard could it be to find these things out on anyone?

posted by Timothy Falconer at January 5, 2004 09:24 PM


Dan, about URIs ... I see the problem of choosing one method for all, but it'd sure be sweet if we could all agree on (and trust) some common naming scheme. I'm not sure how often email addresses get re-used, but I'm sure it happens. Without some kind of registration procedure or central repository, the chance of two people using the same URI for their record certainly exists.

I guess the real point of this post is this: let's say there was a "foaf.org" registration server that merely existed to prevent conflicts. I made my foaf file, submitted it to foaf.org, it checked my URI for uniqueness, and sent me some kind of ascii-armored something or other I could include in the foaf file. To prevent privacy concerns, no information could be retrieved from the server other than YES or NO. Let's say (for example) that it actually assigned a unique ID number or key sequence, so you didn't have to rely on a mailbox.... the foaf server gave you the URI you could use.
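
To make the idea concrete, here's a minimal in-memory sketch of such a server in Python. Everything in it is hypothetical: the UriRegistry class, its method names, and the choice of an HMAC receipt as the "ascii-armored something" are my assumptions for illustration, not an existing service or protocol.

```python
import base64
import hashlib
import hmac
import secrets


class UriRegistry:
    """Stand-in for the hypothetical uniqueness server (not a real API)."""

    def __init__(self):
        self._secret = secrets.token_bytes(32)  # server-side signing key
        self._claimed = set()  # holds only hashes of claimed URIs

    @staticmethod
    def _digest(uri):
        return hashlib.sha256(uri.encode("utf-8")).hexdigest()

    def _sign(self, uri):
        mac = hmac.new(self._secret, uri.encode("utf-8"), hashlib.sha256)
        return base64.b64encode(mac.digest()).decode("ascii")

    def register(self, uri):
        """Claim a URI; return a receipt string, or None if already taken."""
        digest = self._digest(uri)
        if digest in self._claimed:
            return None
        self._claimed.add(digest)
        return self._sign(uri)

    def is_taken(self, uri):
        """The only lookup exposed to the public: YES or NO."""
        return "YES" if self._digest(uri) in self._claimed else "NO"

    def verify(self, uri, receipt):
        """Check that a receipt was issued by this server for this URI."""
        return hmac.compare_digest(self._sign(uri), receipt)


registry = UriRegistry()
my_uri = "http://example.org/foaf.rdf#me"  # placeholder URI
receipt = registry.register(my_uri)

print(registry.is_taken(my_uri))          # YES
print(registry.register(my_uri))          # None, the URI is already claimed
print(registry.verify(my_uri, receipt))   # True
```

The server stores nothing but hashes and answers nothing but YES or NO, yet the URI it vouches for is still a single key that anyone can use to line up records about me.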

Here's the point: isn't this the same problem as with SSNs? Don't personal URIs, however they're done, make it too easy to tie information together, much as the SSN privacy folks are worried about?

posted by Timothy Falconer at January 5, 2004 09:38 PM


Crikey!

posted by dave at January 11, 2004 06:55 AM