Cloudy Verification
Apr 26th, 2008 by jens

Continuing from the previous Cloudy post

The first time you connect to someone, how do you establish that the digital identifier you’re communicating with is the human being you think it is? This is surprisingly difficult to do, because it’s prone to what cryptographers call the “man-in-the-middle attack”.

(Those of you already wearing tinfoil hats can skip past the general explanation, down to “What Cloudy Does”.)

1. A Quick Overview Of Verification Attacks.

First, consider the most obvious attack: simple spoofing.

Spoofing.

Let’s suppose there’s an instant-messaging UI, and while working at home you receive a message from someone with an unknown key, whose nickname is “AliceLiddell”, which happens to be the name of a co-worker.

“AliceLiddell”: yo, this is alice
You: hi alice, what’s up?
You add this identity to your friends-list.
Alice: i need the admin password to the web server to fix a template
You: oh ok, it’s wend4743kt
Alice: kthxbye

Fifteen minutes later your company’s website is pwned by the hacker who posed as Alice. All he had to do was create a new identity with her name as the nickname, and pretend to be her.

How do we get around this? You might think that asking questions before accepting someone’s claimed identity would help, and it does help with spoofing, but there are nastier attacks.

Man-In-The-Middle

“AliceLiddell”: yo, this is alice
You: You haven’t contacted me before … how do I know you?
“AliceLiddell”: i’m down the hall next door to brad. i need to ask you a question but you’re not in the office today.
You: yeah, i’m working from home. sorry to be paranoid, but what’s the poster on your wall say?
“AliceLiddell”: it used to say “hang in there baby” but i took it down when lolcats started getting too popular =)
You add this identity to your friends-list.
You: cool … hi alice, what’s up?

Having established that this is really Alice, you go on to give her the password … and fifteen minutes later your company’s website gets pwned anyway. What went wrong? Well, it really was Alice you were talking with; but the hacker was able to listen in and read the password. Wasn’t the industrial-strength 2048-bit RSA encryption supposed to prevent this?

The problem is that you and Alice were talking with each other; but you weren’t directly connected to each other. Instead each of you was connected to the hacker, who was relaying your messages back and forth. In this scenario what probably happened was that Alice tried to look you up by your name, found the hacker’s fake account instead, and the hacker’s computer then quickly created an identity with the same nickname as Alice, connected to the real you using that identity, and started forwarding your messages to each other while recording them itself.

What’s even worse: That identity you added to your friend-list as Alice? It’s really the hacker’s identity. From now on the hacker can talk directly to you and you’ll probably assume it’s Alice.

How Do We Solve This?

The man-in-the-middle attack is resistant to nearly any kind of in-band verification. You can ask Alice any personal questions you want, but it won’t reveal that you’re not connected directly to Alice. You can ask Alice to type in her public key, but the hacker can edit her reply and substitute the key he’s using to connect to you.

About the only practical way to solve this, unfortunately, is to use an out-of-band channel. You need to talk with the real Alice and compare notes, before you can trust that her digital identity belongs to her. All you have to do, really, is get her real public key and compare it to the key you’re communicating with. (And she has to do likewise, of course.)

The canonical way to do this is to meet Alice in person and swap public keys. (PGP users call this a “key-signing ceremony”.) Or you and Alice can read your keys to each other over the phone (or Skype, or an iChat video conference.) Sending the keys over IM is somewhat less reliable, but still good enough for many purposes, since forging centralized IMs is a fairly involved task.

Of course, we don’t want to read 512 hexadecimal digits to each other! One optimization is to compare secure hashes of the keys (as PGP does), but that’s still 40 digits. And those “B”s and “D”s are so easy to mix up over the phone.

2. What Cloudy Does.

CloudyAlert.png

Cloudy’s verification scheme is blatantly stolen from the one used in Bryan Ford et al’s Unmanaged Internet Architecture. Instead of making you read a number as a string of digits, Cloudy converts it into a three-word phrase by mapping consecutive chunks of bits into words in an English dictionary, moreover a dictionary that’s been specially constructed of words that are easy to recognize and hard to mix up.

And instead of making you listen to the words and type them in, Cloudy (like UIA) presents a short list of phrases with radio buttons, for you to pick from. One of them is the one that will be correct if the connection is genuine, the others are chosen at random, and there’s a catch-all “None of the above” at the end. If the user doesn’t select the expected phrase, something’s wrong.

CloudyVerification.png

(An aside: the phrase only encodes 32 bits, which is far less than even the SHA-1 hash. Just hashing the key down to 32 bits would not be secure enough; instead Cloudy creates a one-time 32-bit key by combining the public key with a randomly-chosen integer that’s sent to the other peer at the time of verification.)
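To make the aside concrete, here is a minimal sketch of how a one-time 32-bit code could be derived and turned into a three-word phrase plus a multiple-choice list. Everything specific in it is an assumption for illustration: the SHA-1-over-concatenation mixing step, the 11/11/10-bit split, the 2048-entry placeholder word list, and the function names are not taken from Cloudy or UIA.

```python
import hashlib
import secrets

# Hypothetical stand-in dictionary: 2048 placeholder entries (11 bits each).
# A real list would hold distinct English words chosen to be hard to confuse.
WORDS = [f"word{i:04d}" for i in range(2048)]


def one_time_code(public_key: bytes, nonce: int) -> int:
    """Mix the key with a per-verification nonce and keep 32 bits (assumed scheme)."""
    digest = hashlib.sha1(public_key + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:4], "big")


def phrase(code: int) -> str:
    """Split 32 bits into chunks of 11, 11 and 10 bits; look up one word per chunk."""
    first = (code >> 21) & 0x7FF
    second = (code >> 10) & 0x7FF
    third = code & 0x3FF
    return " ".join(WORDS[i] for i in (first, second, third))


def choices(correct: str, decoys: int = 3) -> list:
    """Return the correct phrase among random decoys, plus a catch-all entry."""
    options = {correct}
    while len(options) < decoys + 1:
        options.add(phrase(secrets.randbits(32)))
    shuffled = list(options)
    secrets.SystemRandom().shuffle(shuffled)
    return shuffled + ["None of the above"]


nonce = secrets.randbits(64)          # sent to the other peer at verification time
expected = phrase(one_time_code(b"example public key bytes", nonce))
menu = choices(expected)
```

Both peers compute the same phrase only if they hold the same public key and nonce, so one side reads the phrase aloud and the other picks it from `menu`.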

Ford points out another benefit of this interface: “its multiple-choice design prevents users from just clicking ‘OK’ without actually comparing the keys”, which defeats the user’s damnable tendency to just dismiss all security-related alerts.

Once this is done, and the user has chosen the right verification phrase, Cloudy adds the other person’s public key/identity to your “contact list” in its persistent storage. You can then decide to associate that key with an entry in your Address Book. Cloudy also mints a “relationship” certificate attesting that you have verified the other person’s identity; you can choose to annotate the relationship with XFN tags like “friend” or “co-worker”. These certs can be passed to other friends to transitively extend trust.

How well does this user interface work? Cloudy hasn’t seen much real-world use yet, but I’ve gone through the initial setup with a half-dozen people, and the verification (once I debugged it!) is quite easy to follow and takes only ten seconds or so.

3. Is This Too Paranoid?

One of the unpleasant side effects of learning too much about computer security is that you start to become paranoid. You swallow the red pill of the Internet and discover how much we take for granted, how much trust we implicitly place in things that are not trustworthy: domain names, centralized databases, passwords, emails. In severe cases, you start to self-identify as a cypherpunk and refuse to connect to any server through fewer than three anonymizing proxies. It’s a bit like Medical Student Syndrome.

On the other hand, I think a lot of this paranoia is justified. I remember the old days, when “spam” was just a Monty Python sketch and you could trust the “From:” line of an email. Nowadays most of the emails we get have forged senders, and even a message that sounds like it came from a friend might have been sent by some shady social-networking site he foolishly uploaded his address book to. Not too many people worry about domain names yet, but DNS is not hard to mess with, either by hackers or by profit-motivated ISPs.

Eventually, anything that can be subverted, will. And since peer-to-peer software can’t use the standard brute-force obstacles (centralized authority, locked-down servers) to delay attacks, it has to rely on actually being secure. And that means public keys, encryption, webs of trust. As many have pointed out, if you make security an optional add-on to a product, hardly anyone will use it. (How many people do you know who sign or encrypt their email?) It needs to be built in by default. And the more our privacy is invaded by advertisers, ISPs, search engines, phishers, monopolistic content owners and the like, the more that drives the adoption of actually-secure software by end-users.

Having to go out-of-band and swap three-word verification codes with your buddies is an inconvenience. But you only have to do it once with any particular person; after that, Cloudy remembers their key. And I will probably, in the future, put in some form of transitive trust: if I haven’t verified you, but I verified Jean-Claude and he’s verified you [and signed a cert to that effect] then I’ll decide to trust you too.

Next: Cloudy Gossip.

Why They’re Doing This
Apr 19th, 2008 by jens

I don’t want to make a habit of replying on my blog to posts on other blogs, because (a) it’s dorky in an autistic way, and (b) it only encourages the annoying practice of blogs that stick their fingers in their ears.

But I’ve seen a couple of references now to Dean Allen’s complaint about sites that offer multiple RSS feed formats, but no place to post a comment about it; and since it directly relates to my past job monkeying with feeds, I feel like I should answer.

There are two reasons why a web page would link to multiple feeds.

  1. To support feed-readers that don’t understand every format. The XML-syndication-format field has a totally ludicrous history of incompatible versionitis, and the only format that’s actually sanely designed (Atom) is new enough that for a while some major clients, such as BlogLines, didn’t support it. So it’s reasonable during such a transition period to generate both formats.
  2. Because there might be feeds with different content. Some sites offer headlines-only feeds and full-content feeds. Some blogs offer a feed of all the comments on one post, as well as the usual feed of all the posts. Some wikis offer a feed of revisions of a single page.

The problem for a client like Safari is that there isn’t a clear way to tell the difference between those two cases. When a site offers multiple feeds (by including multiple LINK tags in its HTML) do they have the same human-readable content or not? Answering that would require, at a minimum, downloading and parsing both feeds and trying to match the contents of the entries.

The first version of Safari RSS (in Mac OS X 10.4) went for simplicity, by indicating in its UI only that there was a feed available, not how many. If the user pressed the “RSS” button, it would pick one of the feeds to switch to. The heuristic was to give priority both to order (picking the first feed listed) and to format (preferring Atom to RSS, because the format is much better-defined and less prone to nasty ambiguities and parsing problems.)
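As a rough illustration of that kind of heuristic, here is a small sketch. It is not Safari’s code: the `FeedLink` type, the `pick_feed` name, and the exact way order and format are combined (first Atom feed if any, otherwise the first feed listed) are assumptions.

```python
from typing import List, NamedTuple, Optional


class FeedLink(NamedTuple):
    """One auto-discovery LINK tag, reduced to the two fields used here."""
    mime_type: str   # e.g. "application/atom+xml" or "application/rss+xml"
    href: str


def pick_feed(links: List[FeedLink]) -> Optional[FeedLink]:
    """Prefer the first Atom feed; otherwise fall back to the first feed listed."""
    for link in links:
        if link.mime_type == "application/atom+xml":
            return link
    return links[0] if links else None


page_links = [
    FeedLink("application/rss+xml", "/feed.rss"),
    FeedLink("application/atom+xml", "/feed.atom"),
]
chosen = pick_feed(page_links)
```

Note that a rule like this says nothing about whether the two feeds carry the same content, which is exactly the ambiguity described above.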

But some people complained that, on sites that offered multiple feeds with different content (like articles vs. comments) you couldn’t use the button to pick which one you wanted. So in Mac OS X 10.5 we decided to make the button into a pop-up if there were multiple feeds listed.

The problem now is that you more commonly end up being offered a choice between two formats with identical content, which is what Dean and others complain about, because most blog engines are configured to offer both Atom and RSS (and sometimes multiple flavors of each.)

Here’s where I could talk about the design problems of the feed auto-discovery mechanism, and describe better ways it could have been designed to support multiple feed formats [hint: HTTP 1.1 already supports content-type negotiation] or even improvements that could be made to reduce this annoyance in the future. But you know what? I don’t care about Trendy XML-Based Syndication Formats anymore. OK, all right, I clearly care enough to write a multi-paragraph explanation and post it to my blog. But not any more than that.

Cloudy Networking
Apr 17th, 2008 by jens

Next I need to talk about networking; having an identity and minting certificates isn’t very interesting until you can connect to someone else.

Point-to-Point Communications.

When one Cloudy peer wants to communicate with another one, it opens a TCP socket to its IP address —

[Hang on, there are two issues I suddenly glossed over in that last phrase. First, how did this peer find out the other’s IP address? These are just random computers, not servers, so they don’t have their own domain names or even stable addresses. This is indeed a problem with any unstructured peer-to-peer network, but the solution involves things I won’t get to until the next installment, in an unfortunate but necessary violation of layering.]

[Oh, and issue #2 is that most home computers are now behind Network Address Translators (usually some kind of WiFi base station or broadband router), which means they don’t have their own real IP addresses and can’t receive incoming connections. Fortunately, most NATs now support protocols that allow clients to open listening ports to the outside world, and doubly fortunately, Mac OS X 10.5 includes an API for making such connections. Cloudy opens such a port whenever it finds itself behind a NAT.]

— and runs a protocol called BEEP over the socket.

BEEP.

BEEP is a sort of generic application protocol that multiplexes a TCP socket into multiple virtual channels, each of which can send and receive binary messages. It’s very handy for designing your own protocols, since it lets you focus on the high-level tasks of defining how your messages are encoded and when to send them and what to send in response.

I’m using an open-source (LGPL) implementation of BEEP called Vortex. Its API is in C, but I’ve written a Foundation-level Objective-C wrapper around it. (I’ll probably open-source that code sometime.)

One nice feature of BEEP and Vortex is that they handle SSL for you. The BEEP protocol lets the two peers negotiate what type of SSL they support, before switching over to it, almost transparently to the application code. Since the first thing that happens in SSL setup is exchanging certificates, each instance of Cloudy immediately learns the identity of the peer it’s connecting to. (In normal HTTP-over-SSL, only the server has a certificate and the browser remains anonymous; but SSL supports bidirectional authentication and Cloudy uses it.) Unlike most client-server protocols, Cloudy has no need for a login: each peer has seen the other’s public key, and completing the SSL handshake proves that the other peer holds the matching private key, and hence that identity.

So now the two peers are connected, they’ve identified and authenticated each other, and their communication channel is encrypted. They can now open BEEP channels and send each other messages across them. The primary types of messages Cloudy sends are signed objects (certificates); I’ll get into those later.

Local Area Discovery (Bonjour).

As you’d expect, Cloudy also uses Bonjour. This is somewhat orthogonal to BEEP: Bonjour’s a discovery protocol, so its main purpose is to let peers on the same LAN find out each other’s names and addresses. But Bonjour does support a thing called a TXT Record, which is a small chunk of arbitrary metadata that a service can associate with itself. For example, iChat stores your availability and status message in its Bonjour TXT record, which is how its Bonjour buddy list can show that information for everyone on your network.

Remember the “CallingCard” I used as an example of a signed object in the last post? Well, that’s what Cloudy puts in its TXT record. The CallingCard contains your availability and status, but what’s really important is that it contains your public key, which is your identity.

So Bonjour solves, at least on a LAN, the discoverability problem I pointed out at the start of this post. At this point, if the peer you want to send messages to is on the same network, Cloudy can easily find it via Bonjour, open a BEEP socket, and authenticate over SSL.

What’s more, Cloudy’s view of who’s on the network is actually trustworthy. The CallingCard is signed with your private key, proving that you created it. iChat’s Bonjour IM has always been insecure in that there’s no way to tell whether anyone else is who they say they are: all you know about someone is their name, which they can easily change to anything they want by editing their address book. In Cloudy, on the other hand, once you’ve communicated with someone once, your app remembers their public key, and it can identify in the future whether a peer appearing on the network is that person or not. (To make this clear in the UI, the name of anyone you haven’t previously vouched for is shown “in quotes”.)
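The remember-the-key idea can be sketched in a few lines. This is only an illustration under assumptions: the `ContactList` class, its method names, and the choice to index contacts by a SHA-1 fingerprint of the key are hypothetical, not Cloudy’s actual storage.

```python
import hashlib


class ContactList:
    """Maps previously vouched-for public keys to locally chosen names."""

    def __init__(self):
        self._verified = {}   # key fingerprint -> name picked by the local user

    @staticmethod
    def fingerprint(public_key: bytes) -> str:
        return hashlib.sha1(public_key).hexdigest()

    def vouch_for(self, public_key: bytes, name: str) -> None:
        self._verified[self.fingerprint(public_key)] = name

    def display_name(self, public_key: bytes, claimed_nickname: str) -> str:
        """Known keys get their trusted name; unknown keys get a quoted nickname."""
        known = self._verified.get(self.fingerprint(public_key))
        return known if known is not None else f'"{claimed_nickname}"'


contacts = ContactList()
contacts.vouch_for(b"alice's real key bytes", "Alice")

real = contacts.display_name(b"alice's real key bytes", "AliceLiddell")
fake = contacts.display_name(b"some other key bytes", "AliceLiddell")
```

The nickname a peer claims never decides what is displayed without quotes; only a key that was vouched for earlier does.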

— Oops, I just skipped over a tricky problem again. The first time you connect to someone, how do you establish that the digital identifier you’re communicating with corresponds to the human being you think it is? This is surprisingly difficult to do, because it’s vulnerable to what cryptographers call the man-in-the-middle attack. It’s worth a post by itself…

Next: Verifying Identities.

Cloudy Identity
Apr 15th, 2008 by jens

Continuing from the previous Cloudy post

At the root of Cloudy is the means for creating and establishing identity. A lot of peer-to-peer systems treat the peers mostly as interchangeable anonymous nodes, often deliberately so, but Cloudy is a social system.

Quick Crypto Recap.

The identity and security layers of Cloudy are tightly intertwined, because identity without security is useless. And security is accomplished entirely through cryptography, because the centralized alternatives like locking all of your servers up in a closet don’t apply. Cloudy doesn’t do anything new cryptographically (wisely so), but for the benefit of those who aren’t familiar with it, here’s a superficial overview of the off-the-shelf tools I’m using:

Cryptographic Hashes, or, Digests.

Like any hash algorithm, a cryptographic hash converts a block of data of arbitrary length into a short fixed-length output; the same input always produces the same output; and even the slightest change to the input should produce an entirely different output. Unlike a regular hash, two different inputs should never result in the same hash output. (That’s “never” in the practical sense: collisions are mathematically inevitable, but it should take impractically long, ideally millions of years, to find one.) And it should be infeasible to identify anything about the original data given only the hash.

Cryptographic hashes are rather weird and neat. I’ve previously called them “the Dewey Decimal numbers for the Universal Library”. They also remind me of the scene in the old TV cartoon of “The Cat In The Hat”, where the Cat and kids are running around labeling every object in the house with cryptic identifiers like “QW-X12”. Digests are a bit longer than that (SHA-1 outputs 160 bits, i.e. 20 bytes) but it’s still very handy to have a compact label to identify any conceivable chunk of data.
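A quick way to see these properties is to hash two nearly identical inputs with Python’s standard `hashlib`. The sample strings and the `differing_bits` helper are just for illustration.

```python
import hashlib

label_a = hashlib.sha1(b"The Cat In The Hat").digest()
label_b = hashlib.sha1(b"The Cat In The Hat!").digest()   # one extra byte


def differing_bits(x: bytes, y: bytes) -> int:
    """Count how many bit positions differ between two equal-length digests."""
    return sum(bin(p ^ q).count("1") for p, q in zip(x, y))


print(len(label_a), "bytes:", label_a.hex())
print(len(label_b), "bytes:", label_b.hex())
print("bits that differ:", differing_bits(label_a, label_b))
```

Both digests are 20 bytes regardless of input length, and for a well-behaved hash roughly half of the 160 bits should differ even though the inputs differ by a single character.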

Public/Private Key Pairs, or, Asymmetric Keys

A regular cipher uses what’s called a symmetric key. The sender and receiver choose a single key that they both have to know, but keep secret. The sender inserts the key into the encryption algorithm and feeds the message in, and out comes the encrypted form. The receiver then inserts the same key into the decryption algorithm, feeds the encrypted data into it, and out comes the original message. The point is that they use a single key, and the one who generates the key has to somehow convey it secretly to the other party before they can use it, leading to an obvious chicken-and-egg problem.

Asymmetric encryption algorithms, the best known of which is RSA, use two keys. The keys are generated together in a matched pair. When one key is used by the sender to encrypt data, it takes the other key of the pair to decrypt it.

The genius of this is that you only have to keep one of the keys secret. The other one can be given away freely and is called the “public key”. If someone else has a copy of your public key, s/he can use your public key to encrypt a message to you. Remember, it doesn’t matter how people get your key; you can read it to them over the phone, email it, print it on a billboard. The encrypted message still can’t be read by anyone else, because only you have the matching private key to decrypt it. In other words, it becomes possible to send secure messages without having to share a secret key in advance.

Digital Signatures.

There’s another use for key-pairs. You can use your private key to generate a signature of a message, a small block of data that can be attached to it. Anyone who has your public key can then use it to verify the signature, i.e. prove that you generated the signature of that message with your private key. No one else could have generated the signature. In other words, as the name “signature” implies, this is a way to unforgeably mark a document to show your authorship or approval.

(A digital signature is really just a cryptographic hash of the document, which is then encrypted with the private key. To verify someone’s signature you just decrypt it using their public key, then compute your own hash of the document and compare the two results.)
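Here is a toy version of that hash-then-sign idea using textbook RSA and nothing but Python integers. It is deliberately insecure and only illustrative: the key is built from two small known primes, there is no padding, and the names `sign` and `verify` are made up for this sketch. Real systems use a vetted library and a proper padding scheme.

```python
import hashlib

# Toy key pair (assumption for illustration only): two Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def digest_int(message: bytes) -> int:
    """SHA-1 of the message as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the digest with the private key."""
    return pow(digest_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest_int(message)


sig = sign(b"I approve this message")
ok = verify(b"I approve this message", sig)
tampered = verify(b"I approve this massage", sig)
```

Anyone holding only `N` and `E` can run `verify`, but producing a matching `sig` requires `D`.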

Digital Certificates.

A certificate is, in general, just a signed document that attests to something. Usually it’s vouching for the identity of the owner of another key; something like “the owner of the public key 3FD8B640 is Joe-Bob Briggs, of Dallas TX, joebob@example.com. Signed, Verisign.com.” This is the standard way that trust gets spread around in a distributed system.

Back To Cloudy: Generating An Identity.

Your Cloudy identity is simply a public key, currently 2048-bit RSA, generated the first time you launch the program. (The matching private key is stored securely in the Mac OS Keychain.) From then on, that public key uniquely identifies you. It’s unique because it’s randomly picked from a space so large that the possibility of collisions is for all practical purposes zero. It identifies you because it can be used by others to verify anything you signed with the matching private key.

[2048 bits (256 bytes) is somewhat bulky to use as an identifier; so a public key can be run through an SHA-1 hash and converted into a 160-bit (20-byte) digest form that’s, for all intents and purposes, equally unique. (2^160 is about 10^48, nearly the number of atoms in the Earth.) The digest, however, has no cryptographic value to a recipient who doesn’t already have the public key, so it’s not a secure identifier by itself.]
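The size of the 160-bit digest space is easy to check with Python’s arbitrary-precision integers. The variable names are just for this example.

```python
import math

space = 2**160                                 # number of possible SHA-1 digests
exponent = math.floor(math.log10(space))       # power of ten
mantissa = space / 10**exponent                # leading digits

print(f"2^160 is roughly {mantissa:.2f} x 10^{exponent}")
print("hex digits needed to write a digest:", 160 // 4)
```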

The first thing the new key is used for is to mint an SSL certificate, which will be used for identification when you communicate with other peers over SSL sockets. It’s a “self-signed” certificate because it doesn’t contain a signature from any trusted higher authority (there aren’t any). But that’s OK: when Cloudy peers connect, they only need to make sure of the identities they’re contacting, which are literally just the public keys in the certificates.

Cloudy Certificates.

SSL unfortunately uses an awkward and complex certificate format called X.509, which is one of two evolutionary relics left over from a long-dead overly-ambitious network architecture called X.500. (The other one is the LDAP directory protocol; the fact that the L stands for “Lightweight” gives you an idea of how comparatively elephantine X.500 must have been.) Most cryptographic experts seem to hate X.509: Ferguson and Schneier’s Practical Cryptography flatly recommends avoiding it if possible, and Peter Gutmann’s overview is a masterful takedown, with sarcasm worthy of Doug Piranha.

After spending a week painfully figuring out how to generate a goddamn trivial self-signed cert, even with the help of state-of-the-art system APIs, I could understand what the experts meant. I didn’t want to use X.509 anymore. And it wasn’t flexible enough anyway, since it was designed around the idea of hierarchical authorities. Unfortunately I didn’t have a choice for SSL, but I went with an alternate approach for all the other certs Cloudy peers use when talking to each other.

I really liked the approach taken by SDSI, a distributed identity system from about 10 years ago that never took off. It defined a simple textual syntax for certificates. SDSI used LISP-like S-expressions as the syntax, but the details aren’t important—I took the abstract concepts and went with something I found more readable. I tried JSON first, but found it too limiting, so I ended up using YAML.

[YAML is a data serialization syntax; it’s language-agnostic, but most popular in the Ruby community. Its main advantages over JSON (or OS X property lists) are a richer set of data types, custom typing for collections (i.e. you can say “this array is a Rectangle” or “this dictionary is a Person”), and the ability to represent arbitrary object graphs, not just trees. You can think of it as being like a pretty syntax for Cocoa object archives or Java object serialization.]

All I had to do (aside from writing a good Cocoa wrapper API for YAML) was define a schema for representing things like keys and signatures in YAML. Then I could use those to define my own signed certificate objects.

A YAML Certificate Example.

.  --- !cloudy/CallingCard
.  host: 76.191.199.123
.  prof: 229474364
.  port: 60507
.  stat: 4
.  signature: !cloudy/Signature
.    signed: !binary |-
.      oVCuVVlXPEdRPR+gy1k/UNOXtwvcN7LNpK6xTcA/hmlKh6uIT56E19LxWzA7POxm
.      nhc351NVdoKC9XaUVsaZYDOnp2wWEWLUtdYYA8I++NZZIVlCHOjHCHr7mcfNcceD
.      v+15RE9vguQ/PO1yaOU4DlviYt75y7xKMRs5REbZss6E/mr+0r1KE+f73dpHCVoD
.      SW0azTD43pug2Pyh2Kar0GHXQcS4Iq/Y2nRFv7wyLUUmyVA7XI665a8QjMCiec2w
.      0PqQ32FwGBYkH/iR/cfmaKjuwjAbW/qo7NoTH6WSFQy2ua/PVQs9B+dyjnZ5Z30E
.      rnl9UTCVwjUmCc8J4hoaTQ==
.    digest: !cloudy/Digest pFCzUK7yuO0dWtm0oATB7ag6vj0=
.    date: 2008-04-15 21:55:46.830 -07:00
.    expires: 21600
.    signer: !cloudy/Person
.      nickname: snej
.      publicKey: !cloudy/PublicKey
.        algorithm: 42
.        format: 1
.        bits: 2048
.        data: !binary |-
.          MIIBCgKCAQEApP6/D5aZm7nYfGwSMD3xQCCWw+XeU1NmZE7N/7eHvQlCUHMS8Aac
.          Wh+s/PlPd1o7k+YePhoHnc1vR9uAfWm8iowiUU0RluUNxY0dRkTauRqeYM6//s+5
.          ZXuh27pDDq2BgQYPL6EOp2UtWSQ/ojQjqX2/sGMkZ3k+uYiu1ZGQS2s0xTHPkgtu
.          VI+Kg2TBY/28zAG4H/seUHNAP+frlpX+fizSC2oYNdREpEcVcVacHMQGwrj3mAr7
.          g/LpJTnWgZhiJYvp7c4MkAYfHOIbKIXeXrF8oOz0EwgwSp0ZWkezuIYa4BMAns52
.          WYK3LooQ+GttPIdVhSzzhLlY3psLeOf6nQIDAQAB

This represents a “CallingCard” object, which a peer broadcasts in order to tell other peers that it’s online, and where it is on the network and what its current state is. (One of the places a CallingCard appears is in the TXT record of Cloudy’s Bonjour service.)

The syntax is more complex than JSON, but still pretty easy to read:

  • The first line says this is a dictionary structure whose higher-level type is “cloudy/CallingCard” (this gets mapped to the CallingCard class in my code.)
  • The next four lines describe four key/value pairs in the dictionary. In a CallingCard these represent the IP address, timestamp of the user’s current “profile”, port number, and online status (4 = “Available”). The keys are four letters long just to save some room and because I get nostalgic for OSTypes sometimes.
  • Line 6 assigns the “signature” key to a nested dictionary of type “cloudy/Signature”. This dictionary is the digital signature of the enclosing object.
  • The “signed” attribute of the signature is the raw RSA signature data.
  • The “digest” attribute is the SHA-1 digest of the object being signed, in this case the enclosing CallingCard, ignoring the “signature” attribute.
  • The “date” attribute is the timestamp of the moment the signature was generated.
  • The “expires” attribute is the lifetime of the signature, in seconds, starting from the “date”. After this interval the signature expires, and the signed object loses its validity and will generally be deleted, or at least not passed on to other peers anymore.
  • The “signer” attribute is a cloudy/Person object, the identity who generated the signature.
  • “nickname” is a brief human-readable name for this identity. It doesn’t really mean anything; it’s just useful as a default name to display (like an AIM handle in a buddy list) if the local user hasn’t set up a customized name.
  • “publicKey” is the identity’s public key, the actual unique identifier.
  • “algorithm” identifies the type of key (RSA), “format” identifies the format of the key data (PEM, I think), “bits” is the number of bits in the key, and “data” is the key data itself.

This does look a bit verbose when written out, but of course usually you’d never see this unless you were debugging something. (And it compresses well, by about 50% using gzip.) One space-saving feature that doesn’t show up here is that, if the same object appears more than once, it’s only written out once; after that it appears as a short reference back to the definition. So if I YAML-encode an array of signed objects (which is very common), my cloudy/Person data only appears once.
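The “date” and “expires” attributes described in the list above combine into a simple expiry check. A minimal sketch, assuming the timestamp is always written in the format shown in the example; the function names are hypothetical.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d %H:%M:%S.%f %z"   # matches "2008-04-15 21:55:46.830 -07:00"


def expiry_time(date_text: str, lifetime_seconds: int) -> datetime:
    """Return the moment the signature stops being valid."""
    signed_at = datetime.strptime(date_text, DATE_FORMAT)
    return signed_at + timedelta(seconds=lifetime_seconds)


def is_expired(date_text: str, lifetime_seconds: int, now: datetime) -> bool:
    """True once 'now' has reached or passed the expiry moment."""
    return now >= expiry_time(date_text, lifetime_seconds)


expires_at = expiry_time("2008-04-15 21:55:46.830 -07:00", 21600)
print(expires_at.isoformat())
```

For the CallingCard in the example, a lifetime of 21600 seconds means the signature lapses six hours after it was generated.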

A Nasty Detail: Canonical Form

I glossed over an important detail: to sign an object you have to compute a digest of it, and to compute a digest you have to be able to express the object as raw data. Clearly the raw data in this case is the YAML encoding of the object, right?

The catch is that in YAML, as with any human-readable syntax, there are many different ways to write out the same object. I can change the indentation, I can change the line breaks, I can list dictionary attributes in a different order. Any of those changes causes the resulting digest to look completely different.

This is a real problem because, if I read a signed object from YAML into a native object and then write it back out to YAML, it’s likely to come out slightly differently. For example, if I write it as part of an array, then as a nested element its lines will be indented. Also, the way dictionaries are stored in hashtables means that their keys come out in unpredictable orders when iterated. But if that happens, the digest changes, which invalidates the signature.

So any certificate syntax has to define a single standard (“canonical”) encoding of an object into binary data. In my YAML code I had to enable a “canonical mode” that, when turned on, causes a specific set of spacing rules to be used, dictionary and set entries to be written in alphabetical order, et cetera. This mode isn’t normally used, but it has to be turned on when computing the digest of an object, in order to sign it or to verify a signature.
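The effect is easy to reproduce. The sketch below uses JSON rather than YAML, purely because Python’s standard library ships a JSON encoder and not a YAML one; the `canonical_bytes` rules (sorted keys, fixed separators) are an assumed stand-in for whatever rules Cloudy’s canonical mode applies.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """One fixed encoding per object: sorted keys, no optional whitespace."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def digest(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


# The same logical object, with keys inserted in different orders.
card_a = {"host": "76.191.199.123", "port": 60507, "stat": 4}
card_b = {"stat": 4, "port": 60507, "host": "76.191.199.123"}

naive_a = digest(json.dumps(card_a).encode("utf-8"))
naive_b = digest(json.dumps(card_b).encode("utf-8"))
canon_a = digest(canonical_bytes(card_a))
canon_b = digest(canonical_bytes(card_b))
```

The naive digests differ even though the two dictionaries are equal, while the canonical digests agree, which is what a signature needs.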

[Incidentally, one of the reasons that digital signatures aren’t being used much in the various trendy XML-based data formats, like RSS and Atom, is that XML is much more difficult to canonicalize. I don’t understand all of the details, but they looked nasty enough that I was glad enough to rule out using XML.]

Verifying A Signed Object.

When you verify the signature of a block of YAML like the above, you have to do this:

  1. Parse the YAML into a graph of native objects.
  2. Take the root Signed object, remove the “signature” attribute, and write it back into YAML in “canonical mode”.
  3. Compute the digest of that canonical YAML.
  4. Compare the digest with the “digest” attribute of the Signature. If it doesn’t match, the object’s been tampered with (or damaged) and should be ignored.
  5. Otherwise, write the “Signature” object back into canonical YAML and compute the digest.
  6. Decrypt the “signed” attribute using the public key in the “signer.publicKey” attribute.
  7. Compare the result with that digest. If it doesn’t match, the signature was forged (or damaged.)
  8. Otherwise, the signature is valid and the outer Signed object can be treated as being definitively created by the Person listed in the Signature.
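The whole procedure can be sketched end to end. This is an illustration under loud assumptions, not Cloudy’s code: canonical JSON stands in for canonical YAML, a toy unpadded RSA key built from two small known primes stands in for a real 2048-bit key, step 1 (parsing) is skipped because the objects are already native dictionaries, and the final check follows the recap in “Digital Signatures” by applying the public key to the “signed” value and comparing it with the digest.

```python
import hashlib
import json

# Toy RSA key pair, for illustration only (far too small, no padding).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def canonical(obj) -> bytes:
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(obj) -> bytes:
    return hashlib.sha1(canonical(obj)).digest()


def as_int(raw: bytes, modulus: int) -> int:
    return int.from_bytes(raw, "big") % modulus


def sign(obj: dict, signer: dict) -> dict:
    """Attach a signature dictionary to a copy of obj (uses the private key D)."""
    sig = {"digest": digest(obj).hex(), "signer": signer}
    sig["signed"] = pow(as_int(digest(sig), N), D, N)
    return {**obj, "signature": sig}


def verify(signed_obj: dict) -> bool:
    # Step 2: strip the signature and re-canonicalize the rest.
    body = {k: v for k, v in signed_obj.items() if k != "signature"}
    sig = dict(signed_obj["signature"])
    # Steps 3-4: the stored digest must match the object as received.
    if digest(body).hex() != sig["digest"]:
        return False
    # Step 5: digest of the signature object itself, minus the raw signature.
    signed_value = sig.pop("signed")
    n, e = sig["signer"]["publicKey"]
    expected = as_int(digest(sig), n)
    # Steps 6-7: apply the signer's public key and compare.
    return pow(signed_value, e, n) == expected


me = {"nickname": "snej", "publicKey": [N, E]}
card = sign({"host": "76.191.199.123", "port": 60507, "stat": 4}, me)
forged = {**card, "port": 1}
```

Here `verify(card)` succeeds, while `verify(forged)` fails at the digest comparison because the body no longer matches the digest stored in the signature.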

Whew.

OK, so we can create secure identities, encrypt stuff, and sign arbitrary objects. Now what do we do with them? The CallingCard example above should give you some ideas, but I’ll go into more detail in the next ‘thrilling’ installment.

Next: Networking.

Cloudy As Buzzwords
Apr 13th, 2008 by jens

Continuing from Unstealthing, Incrementally

I have many ideas for applications, but most of them seem to rely on similar kinds of infrastructure, in particular a distributed, secure application-level messaging system. Unfortunately, this doesn’t really exist yet, at least not in any form that meets my needs.

What am I talking about here? More colloquially, it’s a mechanism for letting applications all over the network send messages to each other, without requiring a central server, and without allowing messages to be eavesdropped upon or faked.

Let’s take it one buzzword at a time…

Distributed.

I don’t know about you, but I’m getting fed up with centralization. It happens because it’s the path of least resistance: buy a domain name, rent a server, buy more servers and stick a load-balancer up front as your user base grows. It’s solving problems by throwing hardware at them. The end result can certainly work fine, but too often it’s fragile: both technically (site goes down, ten million users get pissed off) and politically (just one domain for China to censor, one company for France to file lawsuits against.)

In social software especially, there’s an additional type of cultural fragility, since the owners, implementors and users of a big social site have different goals and massively different levels of power. Many examples have shown that this creates sites that scale up into the equivalent of planned communities, shopping malls, theme parks, marred periodically by the protests of an angry minority of users against what they see as privacy intrusions or censorship.

Unfortunately it’s hard to use the Web architecture in a distributed way. This would involve lots of small groups (or individuals) running their own Web-based services that interoperated seamlessly. There are certainly technologies that help with this interoperability, such as OpenID and web services (whether REST or SOAP), but the bottom line is that setting up any kind of Web-based system is a task that’s way beyond nearly all end-users, just as it was in 1994. It just never got any easier! You still have to know about FTP uploads and file permissions and CGI-BIN directories and Apache logs and MySQL configuration, even to set up something trivial like a damn blog.

So the options for Web-based social software end up being (a) Install it yourself on your vanity domain, if you’re a total geek and don’t mind doing your own tech support; or (b) Patronize some hosted service that will take care of it for you. What happens then, as popularity of this medium increases, is that the hosted services get bigger and bigger, money pours into them, they go into arms races of feature creep and marketing, they get even bigger, and voilà! It’s MySpace and Facebook all over again.

So that leads me to distributed non-Web-based systems. Which are completely untenable because the only way you can run custom oddball software on a real server is to rent your own private server (or virtual one), which costs orders of magnitude more than the regular web hosts that just let you run CGI scripts. Or if you want to run server software on your computer at home, you find that your consumer broadband connection comes with throttled upload bandwidth, no fixed IP address, and terms of service that forbid you from running servers.

Do you see where I’m going? Peer to peer. It’s the ultimate decentralization. If you resolve the scaling problems at the protocol level, then none of the network nodes need unusual amounts of horsepower or bandwidth, and the total supply of both naturally increases along with the user base. Installation of software on desktop computers is a cut-and-dried problem: it’s as easy as downloading and double-clicking an app. And the content is completely in the hands of its creators and users: no Disneylands, no thought police, no ad beacons.

Secure.

“Secure” means a lot of things. In a messaging system, no legit user wants messages to be spoofed, tampered with, censored, read by the wrong people, or sent by the wrong people.

In the Good Old Days, no one worried about this because the net consisted of a few thousand geeks with Ph.D.s who pretty much trusted each other. (And that’s why email is so pathetically vulnerable to all the attacks I just listed.)

In the present day, developers focus on making their big monolithic servers secure from intruders, and the end users mostly trust that the developers are (a) doing a good job of that, and (b) not doing anything unethical themselves with the customers’ data. It mostly works out, I guess, although there are daily reports of embarrassing security holes, crazy new types of cross-site-scripting attacks, “creative” uses of sensitive customer data for marketing purposes, and backup tapes of millions of customer credit card numbers being accidentally left behind on the bus.

The more you think about that, the less crazy it seems to trust a peer-to-peer system where there are no authority figures handing out accounts or enforcing access privileges. Because in such a system you have to actually make everything secure by design instead of just trusting people. Want to create an identity that no one can steal? Generate your own key-pair (and keep your private key safe!) Want to keep the wrong people from reading something? Encrypt it. Want to prove you wrote something and keep anyone from altering it? Digitally sign it.

It’s not like there aren’t any known solutions to this stuff. It’s just that applying them can be tricky, and so far it’s generally been easier to just lock your big server up in a data center instead.

Application-Level Messaging.

What I mean by this is that the messages sent between users’ computers are not necessarily directly human-readable. Unlike emails or instant messages, they might not be delivered right to the screen, but rather to an application that presents them differently. For example, a message might represent a move in a chess game, a revision of a shared document, or a notification of new content stored elsewhere. Sending these as emails or IMs is awkward and relies on having the user intervene by opening file enclosures or URLs.
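To make that concrete, here’s a minimal Python sketch of routing a message to an application instead of to the screen. Everything in it is hypothetical: the message types, the `type`/`body` envelope and the function names are invented for illustration and are not Cloudy’s wire format.

```python
import json

HANDLERS = {}  # message type -> function belonging to some application

def handles(message_type):
    # Decorator that registers an application's handler for one message type.
    def register(func):
        HANDLERS[message_type] = func
        return func
    return register

@handles("chess.move")
def on_chess_move(body):
    return "chess: move %s to %s" % (body["from"], body["to"])

@handles("document.revision")
def on_revision(body):
    return "editor: apply revision %d of %s" % (body["revision"], body["document"])

def deliver(raw_message):
    # Hand the message to whichever application understands its type.
    message = json.loads(raw_message)
    handler = HANDLERS.get(message["type"])
    if handler is None:
        return None  # no application installed for this kind of message
    return handler(message["body"])

print(deliver('{"type": "chess.move", "body": {"from": "e2", "to": "e4"}}'))
```

The user never sees the raw message; the chess app just shows the piece moving.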

Next: How Cloudy Does These Things.

Unstealthing, Incrementally
Apr 12th, 2008 by jens

I got about 14 minutes of fame back in January with a blog post, wherein I grumbled about (among other things) how I disliked Apple’s culture of secrecy, and announced that I’d left Apple to work on my own, unspecified, project. In the intervening three months, I haven’t said anything about what that project is, almost as though it were … secret.

The irony of this is not lost on me.

Admittedly, there are things about my app that I do want to keep under my hat until they’re ready to show off in their full glory. I want to spend my one minute of remaining fame wisely; ideally accompanied by a large friendly “BUY NOW” button on my website.

But the main reason I haven’t been talking is just that I’ve been lazy. Well, not lazy, but focused on coding rather than talking. I’m mindful of a quote by (I think) John Crowley, which goes something like:

—There are two kinds of poems: the ones you write, and the ones you talk about writing. They’re both important, but never get mixed up about which kind you have.

I feel like I’ve been talking about writing this type of app (if only to myself) for a decade now, so it’s really been time to buckle down and make it happen.

But now I’ve got a lot of stuff up and running, and I’m excited about it, and feeling annoyed that I can’t just blab about my progress. Oh wait, but I can! So starting now, I’ll be writing about my project here — I want to post high-level overviews, geekier details of the innards, and progress notes. Sort of like those “developer diaries” the videogame sites love to run.

The catch is that I’m going to talk about the architecture of the app, and its core functionality … but not the primary user-level feature, the selling point. Not yet. Even without that, I hope this will still be interesting to some of you.

Meet Cloudy.

Cloudy is a comic-strip character my son Jed started drawing two years ago, when he was ten.

I’ve already appropriated Cloudy in the past for a mix CD and a t-shirt, so it was a no-brainer to make her the mascot of my new project as well.

I suspect the app itself will have a more descriptive name by the time it ships, but Cloudy’s a good name to keep for the underlying architecture. And what’s that?

Next: What Cloudy Is.

Japanese Advertisers Discover Zooko’s Triangle
Mar 26th, 2008 by jens

Cabel Sasser, of indie developer Panic, reports from Japan:

“Within minutes of riding on the first trains in Japan, I notice a significant change in advertising, from train to television. The trend? No more printed URLs. The replacement? Search boxes! With recommended search terms!” [*]

He goes on to note how common it is for people to type URLs or domain names into their browser’s search box instead of the address field. To American geeks this seems clueless, but Cabel points out that in Japan it makes more sense, since URLs are in a foreign alphabet, so search words are much more memorable.

First off, this instantly reminded me of two favorite jokes:

  • Homer Simpson, picking up the phone: “Operator! Get me the number for ‘911’!”
  • Scott Pilgrim, on finding out that the cute girl he saw at a party in Toronto works as a delivery courier for Amazon.ca: “Hey, Amazon.ca, that’s the online bookstore or whatever, right? … What’s the website for that?”

But seriously: This is another example of Zooko’s Triangle, which basically says “names cannot be global, securely unique, and memorable, all at the same time”. URLs are global and unique, but not memorable, especially not in Japan; search terms are global and memorable, but not unique. Japanese advertisers are betting that you’re more likely to reach their site through keywords, even if nine competing sites show up next to it on Google, than if you forget the URL before you even get to a browser.

Marc Stiegler, on that page, predicted this:

“A good example of a nickname management system is Google. Type in a name, and Google will return a list that includes all the entities Google knows, to which the name refers. Google makes a mapping between these nicknames and their keys (if we think of the url of a page as a trusted-path-style key, which will be discussed later). Often enough to be interesting, the first item in the list will be the one you wanted. But it fails often enough, and endless pages of other choices appear often enough, to never leave us in doubt that these identifiers are not unique mappings to single keys. As is already true in the current world, in a world filled with petname systems, a key goal of marketing would be to get your nickname listed at the top of the Google rankings for that nickname.”

I wrote about this earlier, somewhere in the middle of my post “FaceBook And Decentralized Identifiers”.

The iPhone Has Blinders On
Mar 21st, 2008 by jens

I bow to my esteemed colleague Craig Hockenberry’s greater experience in iPhone development; but I must disagree with his take on the infeasibility of background applications. He gives two reasons why networked apps shouldn’t run in the background — one technical and one user-interface.

Battery life.

The heart of the problem are the radios. Both the EDGE and Wi-Fi transceivers have significant power requirements. Whenever that hardware is on, your battery life is going to suck. My 5 minute refresh kept the hardware on and used up a lot of precious power. *

My immediate response is that, yes, polling is inefficient. Everybody knows this; but it’s also easy to implement, which is why way too many protocols use it. Normally the problems with polling first manifest as scalability problems on the server (as Twitter quickly discovered), but in the case of mobile devices, polling kills battery life.

So it’s a good thing that none of the real instant-messaging services poll. AIM, Jabber, MSN, Yahoo and ICQ all open a socket at login and leave it open, sending data only when necessary. If you suppress buddy-list updates while the app isn’t active, then data only needs to be sent when you send or receive an IM.

The remaining issue is one I don’t have the technical knowledge to answer: how much less power does it take to leave the EDGE radio passively listening for packets, as opposed to sending them? (Anyone who knows, please comment.)

One possibility is that it’s the same standby state the radio is already in (since obviously it has to be listening for incoming cellphone calls and SMS messages), so it incurs no extra battery drain. That would make non-polling background network services like IM clients totally feasible.

The other possibility, though, is that even listening for EDGE IP packets uses a lot of power. Which would be bad news, as it makes any background notifications other than phone calls and SMS impractical. But I don’t really believe this is the case — because if it were, how would the iPhone’s upcoming “push email” support for Microsoft Exchange work?

(Moreover, I have to point out that my prior cellphone, the T-Mobile Sidekick, had excellent AIM support, as well as push email, from day one, so it clearly is possible. The Sidekick’s battery life was decent, with maybe 3/4 the standby time of my iPhone.)

User-interface clutter.

From Craig’s second post:

You now have five independent sources for notifications. How do you let the user know which one is which? One might say, “make the sound different.” Another might say, “make something flash in the status bar.” Someone else might say, “make the phone vibrate.” Or even, “put up an alert box.” A truly sick individual might say, “Do all four.”
Can you see where I’m going with this? Your phone soon becomes a fricken’ pinball machine as multiple applications fight for your attention. With 24 notification permutations for every application, things will quickly get out of hand. *

I don’t buy this at all. In fact, I think it’s paternalistic. Yes, user interface design has to consider unintended consequences of users’ own actions, but this is a situation where the consequences are entirely obvious to the user: the more notifications you turn on, the more distractions you’ll get. The remedy is just as obvious: if you end up with too many distractions, you turn some of them down or off by using the exact same steps you used to turn them on in the first place.

I don’t mean to put words in Craig’s mouth, but I think that between the lines, that quote says: “You’d be able to turn on so many notifications that I would find it intolerable.” He’s dismissing a huge swath of functionality because it could be overused in ways that he personally would not want on his phone. That’s not a valid argument.

[It is, however, the kind of thing that’s already gone on with Apple’s own iPhone apps. My pet peeve is that the calendar’s alarm cannot be customized. It’s hardwired to emit two pathetic little bleeps, and that’s all, whereas phone ringtones are customizable and repeat. Clearly this was designed by and for people who have no problem remembering to attend meetings, and never procrastinate appointments, but will drop everything to answer the phone. Unfortunately I am completely the other way around, and I’m sure I’m not the only one.]

There’s also considerable irony here, in the fact that Craig’s best-known application is a client for Twitter, the service that is today’s poster child for “constant annoying notifications”. Clearly, then, he finds Twitter’s level of intrusiveness tolerable, whereas I generally don’t (I have an account but almost never use it.) On the other hand, I do find instant messaging invaluable, and mourn the loss of the mobile AIM presence I used to have on my Sidekick. Not that I got a lot of IMs while away from the computer; but the ones I did get, or sent, were often important to me. Chacun à son goût.

Gloomy conclusions.

I’ll end this with two spot-on quotes. First, from developer Hank Williams:

Prohibiting background processing is not just a question of one feature being left off a long list of otherwise very well executed features. The issue of background processing is the issue for a mobile device because it is key to two things:
• telling the world about your status in some ongoing way
• receiving notification of important events
These two things are the key to most new real innovations in the mobile space.

And the response from c|net’s Tom Krazit:

That makes sense; remember that friend or relative who got a mobile phone but never turned it on? That practice greatly diminishes (although some might say it enhances) the value of a mobile communications device, and one-way communication is not what has made the Web so interesting in its second decade.

Bingo. I’ve been obsessed with what we now call “social software” for over a decade; and the iPhone SDK doesn’t give us the full functionality we need to do this kind of stuff. Basically, the iPhone has blinders on, like a cart-horse: if it’s not already looking at you (in your online manifestation) it can’t see you, and you can’t get its attention. That’s great for a cart-horse, but it’s death for any kind of social interaction.

As I said above, I am aware of the technical issues (and would like to understand the details better). There may need to be work done to allow multiple applications to receive background notifications, in a practical manner. For all we know, Apple may already be working on this. But of course we have no way of knowing, because, in typical Apple fashion, they will absolutely refuse to admit the existence of the problem, much less any efforts to work around it, until they have a complete solution to spring on us. Unless of course they just don’t care; entirely likely, since quite a lot of people upstairs at Apple don’t seem to ‘get’ social software at all. We just have no way of knowing…

The Origin Of The iChat UI
Mar 18th, 2008 by jens

I had lost this historical document for a long time, but finally found it the other day on an old backup CD. It’s the original 1997 sketch I made of a chat user interface based on speech balloons.

Dear Lazyweb: Certificates in RDF?
Jan 27th, 2008 by jens

Dear Lazyweb,

The project I’m working on will be using cryptographic certificates in a distributed web-of-trust model a little like that of PGP. It will also use certs as more than just proofs of identity. Given that I’ll be writing a lot of code using certs, I want to avoid the nastiness of X.509 whenever possible.

After thinking about this a while, it seems to me that RDF ought to be a good way to represent certs, since it describes arbitrary types of relationships between entities (e.g. FOAF), and allows them to be composed in complex ways. And there are a lot of tools available for parsing/storing/querying RDF.

Unfortunately, I know very little about RDF so far, or about the uses to which it’s being put. I’ve been looking, but I haven’t found any existing schema yet for using RDF for cryptographic certificates. Does anyone know of such a thing, or something related?

(The closest thing I know of is SDSI, the Simple Distributed Security Infrastructure, which was inspirational to me in showing how one can use general-purpose data structures like S-expressions to describe certs and form a web of trust. But SDSI and its successor SPKI seem to be dead, sadly, and nothing comparable has replaced them.)

Thanks,
—Jens

Update, 30 January:

No answer being forthcoming, and given the learning curve of RDF, I’m now pursuing the approach of representing certs in YAML. I also considered JSON, but YAML is essentially a superset of JSON that has some very useful features like tagging and aliasing.
