I’m a qwitter
Mar 12th, 2011 by jens

I have backed up all the tweets from my Twitter account (@snej) to a local file, and am now mass-deleting all of them. This is a venerable form of protest that goes back to early BBSs like the WELL. Basically, I am no longer willing to donate my ‘valuable’ user-generated content to a centralized service that issues fuck-yous of this magnitude to its developers and users.

I could rant at length about the arrogance, stupidity and just plain creepiness of that message and the policies behind it, but I don’t know that it’s even worth it. Others have already done a pretty good job of deconstructing its marketroid Newspeak. I just can’t resist pointing out that two of the major components of Twitter’s content model—the @-mention and the #hashtag—were invented by early users and app developers, not by Twitter itself, then later integrated directly into the system to make them more useful. That’s a great example of collaborative development. Now, perversely, Twitter sees fit to tell app developers exactly how they can and can’t represent those same features in their UIs.

And yes, this is enforceable, because thanks to OAuth they can and will revoke an app’s access to Twitter at the flick of a switch. They brag about how they “revoke literally hundreds of API tokens / apps a week” [ibid]. I just now realized the implications of this, actually. OAuth may be more secure than traditional HTTP auth in that it doesn’t give apps access to your account password, but the centralization of control that it gives to service providers is really disturbing.

“But Jens”, you say, “you still have accounts on other centralized social networking sites such as Facebook, Tumblr, LiveJournal and flickr, many of which have also shown a similar disregard for users and developers. Why aren’t you deleting those accounts?”

Good question, anonymous readership. It comes down to three factors:

  1. These other services feel more like real apps, with idiosyncratic features. Twitter has always seemed more like (and promoted itself as, at least to developers) a general purpose platform. It’s a series of tubes for publishing and subscribing to 140-character blobs. I could ignore its stupid star-shaped topology, centralized control, and frustrating payload limitations … as long as it stuck to being a generic service. Now they’re taking that back. It’s as though the phone company is telling me what color of telephone I’m allowed to plug into their lines and what size the touch-tone buttons have to be. (Think that’s a silly example? The old monopoly AT&T actually did enforce that ludicrous degree of control until court rulings in 1956 and 1968.)
  2. Some of these services are, frankly, a lot more important to me than Twitter. I am basically on Twitter mostly to keep up with some friends/acquaintances who post about their daily lives there. (It makes me very sad that some of those friends once used to post far more meaningful content on a regular basis on LiveJournal. I miss those days.) But even though I keep my list of follows really small, my stream still has so many retweets and links to random URLs and unreadably-shorthanded opinions, that it’s often more frustrating than useful.
  3. Finally, some of those services just haven’t done anything particularly evil yet. Turns out I can put up with innocent everyday failure pretty well when I’m not paying anything for it.

The big question in my mind is what to replace Twitter with. Ironically (and perhaps pathetically) I think I will end up reading Facebook more, because some of my Twitter friends are also there. At least until the next time Facebook does something egregiously evil.

In a larger sense, it should not be rocket science to build some plumbing that does what Twitter does—publish and subscribe small blobs—with an actually-decentralized architecture. There are a lot of smart developers out there, but to some extent we’ve been seduced into suckling at the proprietary API teats of big providers, at the expense of developing the next generation of open protocols.

Yeah, in my current day job I’m as guilty of this as anyone else. But at home I’ve got a garage full of various pieces of half-built tech that attempt to solve that problem in one form or another, if I could ever finish any of them. A lot of the trouble is motivation. Anyone want to help out?

Social Networks Personified
Jul 7th, 2010 by jens

Twitter: Charming in brief doses, he tells you little one-liner jokes, then wanders off after two sentences to go talk at somebody else. He absolutely will not shut up for an instant, and namedrops shamelessly about his famous friends. When he’s outworn his welcome he passes out drunk on the floor and has to be dragged home.

MySpace: Who? Oh, right, this anorexic high-school girl who threw herself at you at a party once in 2005. She kept bragging about all the bands she knew (and which you could overhear on the tinny earbuds she wore.) After one too many Jägermeister Jell-O shots she barfed Day-Glo all over your shoes. Last you’ve heard, she’s found some 80-year-old media mogul to be her sugar daddy.

Facebook: You vaguely remember him from high school. He was a nonentity then and he’s equally uninteresting now, but he’s somehow infiltrated your circle of friends and shows up at every social event you go to, telling boring anecdotes about last night’s game and what he bought at Wal*Mart. Worse, it seems he’s joined some cult and wants you to join too so he can go up a level.

Tumblr: She’s got impeccable taste, a lovely apartment and fascinating stories, but after a while you realize she only talks about what other people have done; she doesn’t have an original thought in her head. She won’t carry on a conversation, either, so the only way to get her to pay attention to you is to repeat back something she’s already told you.

Soup.io: Similar to Tumblr, but with a cute Austrian accent. She’s more conversational, but on the downside she sometimes insists on talking to you in German.

LiveJournal: A mysterious Goth chick you were introduced to at a club. After you strike up a friendship with her, she starts telling you all her innermost secrets whenever you see her. This is terribly alluring at first, enough so that you can overlook her appallingly bad fanfic, but after a while you begin to realize how seriously disturbed she is. Around then she abruptly stops showing up, and you’re never sure whether she killed herself or just moved to a more elite social circle. You never learn her real name.

[Update: I’ve changed two of them to male. I wanted to be consistent in the personification, but in retrospect that leaves me open to charges of sexism, which absolutely wasn’t intended. They all have male counterparts, of course, whom I’d love to hear about if you want to write about them.]

Re: Idea for alternative RSS syncing system
Feb 9th, 2010 by jens

Brent “NetNewsWire” Simmons raises the idea of an open protocol for syncing RSS/Atom subscriptions, that is, a way of keeping multiple local newsreader apps (like on a Mac and an iPhone) in sync with each other, so that they share the same set of subscribed feeds, and remember which articles have already been read. You can think of it as “IMAP for RSS”.

NetNewsWire already does this using Google Reader as an intermediary, and Apple’s PubSub framework (which is what Safari and Mail use) shares the read/unread state using MobileMe. But it would be nice to have an open protocol.

I have some experience with this, having implemented the sync system used by PubSub. It’s an interesting problem—you might think I would have just used Apple’s SyncServices, and it’s true that it would have worked great for the subscription list, but it doesn’t scale well to huge numbers of rapidly-changing “read/unread” flags.

I have two suggestions (which I would have made on Brent’s blog, except he doesn’t allow comments anymore.)

CouchDB

CouchDB is an awesome web-centric database engine. It doesn’t use SQL; instead, it’s a glorified key-value store whose values are arbitrary JSON objects, and which uses map-reduce for efficient querying. The basic API is pure REST, though glue libraries for many languages exist.

CouchDB natively supports syncing data through distributed groups of servers. It’s sort of like the way distributed version-control systems like Git or Mercurial work: multiple CouchDB instances each store a replica of the same data set, but can “pull” changes from each other over HTTP to stay in sync.
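
As a rough illustration of that "pull" step, here is a minimal Python sketch that builds (but does not send) a replication request for CouchDB's `_replicate` endpoint. The host names and the database name `feeds` are made up for the example, and the function name `build_pull_replication` is hypothetical.

```python
import json
import urllib.request


def build_pull_replication(local_server, remote_db_url, local_db):
    """Build (but do not send) a request asking local_server to pull from remote_db_url."""
    # CouchDB replication is requested by POSTing a source/target pair to /_replicate.
    body = json.dumps({"source": remote_db_url, "target": local_db}).encode("utf-8")
    return urllib.request.Request(
        url=local_server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical hosts and database name, for illustration only.
req = build_pull_replication(
    "http://localhost:5984",
    "http://sync.example.com:5984/feeds",
    "feeds",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually trigger the pull, assuming both servers exist and allow it.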

CouchDB is pretty lightweight and is already being used on the desktop by client apps: GNOME has been integrating it into the Linux desktop to use as a shared store for user data like contacts and bookmarks. It plays a similar role to SyncServices on Mac OS, but it’s all open source and any two instances can sync with each other instead of requiring a proprietary server. I hear this is already shipping in the latest Ubuntu releases.

It doesn’t look as though anyone’s designed a schema for storing RSS subscriptions this way, but it would be pretty easy to define one. You then need a local agent running CouchDB (it can be stripped down to be pretty small), a client library for Cocoa apps, and an upstream CouchDB server to sync to.
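
One possible shape for such a schema, sketched in Python as plain JSON-ready dictionaries. Everything here is an assumption made for illustration: the `type` values, the ID prefixes, and the idea of one small document per touched article are not taken from any existing design. Only the `_id` field is CouchDB's own.

```python
import json


def subscription_doc(feed_url, title):
    """One document per subscribed feed; the feed URL doubles as the document ID."""
    return {
        "_id": "feed:" + feed_url,
        "type": "subscription",
        "url": feed_url,
        "title": title,
    }


def article_state_doc(feed_url, article_guid, read):
    """One small document per article whose read/unread flag has been touched."""
    return {
        "_id": "article:" + feed_url + "#" + article_guid,
        "type": "article_state",
        "feed": feed_url,
        "guid": article_guid,
        "read": read,
    }


# Hypothetical feed and article identifiers.
docs = [
    subscription_doc("http://example.com/atom.xml", "Example Feed"),
    article_state_doc("http://example.com/atom.xml", "tag:example.com,2010:1", True),
]
print(json.dumps(docs, indent=2))
```

Deriving the document ID from the feed URL means two devices that subscribe to the same feed independently end up with the same document rather than two copies.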

REST-Logging

This protocol is similar to what I came up with for PubSub. It’s a simple extension of REST, but I haven’t heard of it being used elsewhere. The idea is that you model an append-only log file as an HTTP resource. The items that are logged are ‘events’ describing changes in the data model, in this case the subscriptions and articles.

The sync algorithm looks like this:

  1. Download all the data that’s been added to the remote log file since your last sync. Remember the file’s ETag.
  2. Parse that data into a sequence of log entries, and process them in order. Each entry names a model object (feed or article) and an action (subscribe, unsubscribe, mark read, mark unread). Apply those changes to the local data store.
  3. Query your local data store to find all the changes that have been made since your last sync. Ignore the remote changes you just applied in the previous step, and also any earlier local changes that duplicate a remote change (like marking the same article as read.)
  4. Generate a series of log entries for those changes and concatenate them into a data blob.
  5. Upload that blob, appending it to the remote log file. Remember the resulting ETag. In case of a conflict (someone else has changed the remote file since step 1), toss out the blob and return to step 1.
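
The five steps above can be sketched in Python against an in-memory stand-in for the remote file. This is not the PubSub implementation. The `RemoteLog` and `Client` classes, the one-entry-per-line "action name" format, and the use of the file length as the ETag are all assumptions made to keep the example self-contained.

```python
class Conflict(Exception):
    """Raised when the remote log changed since the ETag the client presented."""


class RemoteLog:
    """In-memory stand-in for the append-only log resource on the server."""

    def __init__(self):
        self.data = b""

    def etag(self):
        return str(len(self.data))  # append-only, so the length identifies a version

    def read_from(self, offset):
        return self.data[offset:], self.etag()

    def append(self, blob, if_match):
        if if_match != self.etag():
            raise Conflict()
        self.data += blob
        return self.etag()


class Client:
    def __init__(self):
        self.offset = 0    # bytes of the remote log already processed
        self.state = {}    # object name -> last action applied
        self.pending = []  # local changes not yet uploaded: (action, name)

    def change(self, action, name):
        """Record a change made by the user on this device."""
        self.state[name] = action
        self.pending.append((action, name))

    def sync(self, log):
        while True:
            # Step 1: fetch everything appended since the last sync.
            data, etag = log.read_from(self.offset)
            self.offset += len(data)
            # Step 2: parse the entries and apply them in order.
            remote = [tuple(line.split(" ", 1)) for line in data.decode().splitlines()]
            for action, name in remote:
                self.state[name] = action
            # Step 3: drop local changes that duplicate a remote one.
            self.pending = [c for c in self.pending if c not in remote]
            # Local changes will land after the remote ones in the log, so they win.
            for action, name in self.pending:
                self.state[name] = action
            if not self.pending:
                return
            # Step 4: serialize the remaining local changes.
            blob = "".join(f"{a} {n}\n" for a, n in self.pending).encode()
            # Step 5: append; on conflict, toss the blob and go back to step 1.
            try:
                log.append(blob, if_match=etag)
            except Conflict:
                continue
            self.offset += len(blob)
            self.pending = []
            return


log = RemoteLog()
mac, phone = Client(), Client()
mac.change("subscribe", "feed:http://example.com/atom.xml")
mac.change("read", "article:1")
phone.change("read", "article:1")  # the same change, made independently
mac.sync(log)
phone.sync(log)
print(log.data.decode(), end="")
```

In this run the phone's duplicate "read" entry is dropped in step 3, so the log only ever contains the two entries the Mac uploaded.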

You can think of the log file as a queue or message stream that’s being collaboratively read and written by all of the clients. This sounds like something you’d need a fancy web-app to manage, but it turns out that all it takes is a typical HTTP 1.1 server and a trivial server-side script.

The download is a conditional GET, as used for fetching feeds themselves. The difference is that you use a “Range:” header to request only the bytes past the last known EOF. For example, if the last time you read the log it was 123456 bytes long, you add the header “Range: bytes=123456-” to the request. This ensures that you only get back the new bytes that were added to the end. (And since this is a conditional GET, if the file hasn’t changed at all you just get back an empty 304 response.)

That’s all you need to do to track changes. Since the file is append-only, the only bytes you need to read are the ones added to the end. This request efficiently sends you just those bytes.
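
A minimal Python sketch of that request, built with the standard library but not actually sent. The log URL and the ETag value are placeholders, and `build_log_fetch` is a hypothetical helper name.

```python
import urllib.request


def build_log_fetch(log_url, known_length, known_etag):
    """Build (but do not send) a request for the bytes past known_length."""
    request = urllib.request.Request(log_url)
    # Ask only for the tail of the file; the "bytes=" unit prefix is required.
    request.add_header("Range", f"bytes={known_length}-")
    # Make the GET conditional so an unchanged file comes back as 304.
    request.add_header("If-None-Match", known_etag)
    return request


# Placeholder URL and ETag, for illustration only.
req = build_log_fetch("http://sync.example.com/feeds.log", 123456, '"abc123"')
print(req.header_items())
```

A client sending this would expect a 206 response carrying only the new bytes, or a 304 if nothing has changed.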

What’s cool is that this requires no server-side software. If the log is a static file, any regular HTTP server like Apache will automatically handle GET requests for it, even byte-range ones. (Ranges are already used by browsers to resume interrupted downloads.) And it sends the response at high speed, since the server’s just streaming from a file, without multiple back-and-forth requests and without expensive database queries.

How about writing? Ideally you’d use the same approach, with a byte-range PUT that specifies that the request body should go at the end of the file. Unfortunately most servers don’t support this for static files, even though it’s basically just HTTP 1.1. But it’s really easy to implement. Any PHP crufter should be able to whip up a one-page script that simply responds to a POST by reading the request body and appending it to a local file (while doing the necessary ETag and range verification.) The great thing is that this script doesn’t have to know anything at all about RSS or subscriptions or unread counts; it’s completely generic. You can upgrade the data model without having to touch the script, and you could use the same script to sync anything, not just RSS.
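
The core of such a script is small. Here is a sketch in Python rather than PHP, written as a plain function so it stays independent of any web framework. It assumes the file length serves as the ETag, checks only the ETag (not a byte range), and omits the file locking a real deployment would need to make the check-and-append atomic. The name `handle_append` is hypothetical.

```python
import os
import tempfile


def etag_for(path):
    """Use the file length as a version tag; valid only because the log is append-only."""
    return '"%d"' % os.path.getsize(path)


def handle_append(path, body, if_match):
    """Append body to the log if the client's ETag is current.

    Returns (status, etag): 200 on success, 412 if the log changed in the meantime.
    """
    # "a+b" creates the file if needed and forces every write to the end.
    # Note: no locking here, so concurrent requests could still race.
    with open(path, "a+b") as log:
        if if_match != etag_for(path):
            return 412, etag_for(path)
        log.write(body)
    return 200, etag_for(path)


path = os.path.join(tempfile.mkdtemp(), "feeds.log")
status, tag = handle_append(path, b"read article:1\n", '"0"')
stale_status, _ = handle_append(path, b"read article:2\n", '"0"')
print(status, tag, stale_status)
```

The function never inspects the body, which is the point: the same handler would work for any data model that can be expressed as an append-only log.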

(Yes, there is a semi-obvious drawback to this protocol: the file grows without limit. Surprisingly, this is not a problem most of the time, since clients only upload or download new data; the only real limit is the maximum file size or disk quota allowed by the server. But it does present a problem for a new client, whose first-time sync would download the entire file. This can be worked around by having new clients ignore very old data (only download the latest 10MB, say) or by periodically writing a compact subscription list to a separate URL.)

The Lost Lesson Of Instant Typing
Oct 14th, 2009 by jens

Farhad Manjoo writing in Slate about Google Wave:

The trouble is, everything you type into Wave is transmitted live, in real time—every keystroke was getting sent to Zach just as I hit it. This made me too self-conscious to get my thoughts across.
… Maybe I should just delete what I’d written and say, “Twitter works because it’s simple.” But I couldn’t do that, because Zach was watching me. He could see me struggling right now—he could see that I’d gotten myself stuck in a textual cul-de-sac and that I was desperately searching for a way out without looking foolish. Now I saw Zach beginning to type: “Don’t let the live-typing get you down!” The game was up; what was the point of making a point now? I ended my thought clumsily and then resolved never to attempt to say anything very deep on Wave.

The same thing happened seven years ago with the live-typing feature that I implemented in iChat 1.0 (which was only supported for Bonjour chats.) I thought it was an awesome idea, and I’d wanted to have it in a chat program since about 1997. But it turned out that, in actual use, people hated it, for exactly the reasons Manjoo describes: it makes you self-conscious. We took it out in the next release.

(Interestingly, I hate video chat for a very similar reason. Somehow, the fact that my picture is being shown in real time to the remote person makes me horrifically self-conscious, even though it wouldn’t bother me at all to talk face-to-face with that person. I don’t know whether it’s the little preview on my screen, or the fact that the person is spookily both present and not-present, but the few times I’ve tried video-chat have been really unpleasant.)

I’m usually on the side of more technology. I believe that our online communications tools are still horribly primitive and have only scratched the surface of what’s possible. But this was a case where more technology was bad.

The low-tech alternative that lots of people use in IM,
is to write in short fragments,
each a separate message,
so the other person can see them one by one
without waiting for you to finish the whole sentence.

The difference is that you’re in control of when to send a partial message, and the other person knows you’re in control, and so on. I still think it might be possible to do this in a higher-tech way, like using a hot-key to send a partial message on demand without having the funky line-breaks, but the current approach isn’t so bad as long as you’re not embarrassed about unintentional free verse.

I could have told the Wave people about what I’d learned, except I didn’t know Wave existed until April (shortly before the public announcement), and even then I was just some guy lost in the crowd at the demos….

Part of the problem, in both cases, is that live typing is one of those Cool Demo Features that looks really awesome when showing off the app. Features like that can be dangerous because they are legitimately very useful during the app’s gestation, when exciting demos are a key survival trait; but then they can’t be removed later on because they’re so well-known, even if they turn out to be useless. Sometimes these features aren’t actually harmful to the user experience, they just make the code more complex and harder to maintain. Instant typing is both, unfortunately. (The clever sync algorithms and rapid-fire network messages Wave uses would be needed even without live typing, but the fact that they have to run on every few keystrokes, not just every minute or so, pushes those things so much harder.)

Gossip For Lakitu
Aug 16th, 2009 by jens

Last year I wrote a series of blog posts about a peer-to-peer system called Cloudy that I was developing. I was going up the stack, from messaging to identity, but didn’t finish documenting all the layers I’d built. I mostly stopped working on Cloudy after I went back to gainful employment, but I keep thinking about this stuff.

“Lakitu”?

I’ve since heard about another unrelated project nicknamed Cloudy; and the whole term “cloud” has gotten so debased in the past year that it now stands for outsourcing to giant hidden server farms, which is the antithesis of what I stand for. So I’ve decided to use the name Lakitu instead. Nintendo fans will recognize Lakitu as a bit character in the Mario games—he’s a goggled turtle who rides a little one-seater cloud. This makes him an appropriate mascot for P2P technologies, I think.

[I’m sure Nintendo has a trademark on the character, but they don’t appear to have copyrighted the word “Lakitu”. He’s not even known by that name in Japan, where he’s called “ジュゲム” or “Jugem”. I have been unable to find out what “Lakitu” means or why they decided to use it in the English translation. I could also note threateningly that I have some intellectual-property issues of my own with Nintendo’s depiction of Lakitu’s smiling cloud, which is clearly infringing on my son’s comic-strip character Cloudy. So let’s call it a draw, Iwata-san?]

My last Cloudy post was about verifying people’s identities, and the next one was going to be about gossip. I’ve become unhappy about the rather kludgy way I designed gossip in Cloudy, so yesterday I started designing a new protocol for it, which I’m going to write about.

“Gossip”?

A gossip protocol is a means of broadcasting information in a distributed system. Pairs of computers periodically connect and swap new bits of information with each other; the result is that the information gets dispersed through the whole network (provided it’s a connected graph.) The tricky part is avoiding infinite loops and combinatorial explosions, and optimizing the way pairs of computers swap messages so it scales well.

I started defining a protocol, based on stuff I’ve been thinking about for a while. I don’t think it’s as advanced as what’s reported in research papers, but I’m hoping it will work well enough when used in a socially-driven network—one where the connections between machines are driven by the social connections between their users. Social networks have short horizons, so any particular participant only “sees” a constrained number of near-neighbors even though the entire network may be huge.

I’m making this protocol agnostic as to the type of messaging being used. BLIP will work well, but it ought to be possible to use Jabber or even email; anything that can send messages between two participants. It’s also agnostic as to message content, beyond a few simple assumptions that a message has an author, a timestamp, and some arbitrary “topic” tags.

For example, it ought to work fine at distributing tweet-like micro-blog posts.
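
To make the idea concrete, here is a toy Python simulation of pairwise gossip. It is not the Lakitu protocol, which this post has not specified yet. It assumes only what is stated above (messages carry an author, a timestamp and topic tags), and it cheats by comparing in-memory sets directly where real peers would have to exchange some kind of digest. All class and function names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    author: str
    timestamp: int
    topics: tuple
    body: str


class Peer:
    def __init__(self, name):
        self.name = name
        self.known = set()  # every message this peer has seen; the set prevents duplicates

    def swap(self, other):
        """Exchange only the messages the other side is missing."""
        mine, theirs = self.known - other.known, other.known - self.known
        other.known |= mine
        self.known |= theirs


def gossip_until_converged(peers, links, rng, max_rounds=1000):
    """Pick random linked pairs to swap until every peer knows every message."""
    everything = set().union(*(p.known for p in peers))
    for round_number in range(1, max_rounds + 1):
        a, b = rng.choice(links)
        a.swap(b)
        if all(p.known == everything for p in peers):
            return round_number
    return None  # did not converge within max_rounds


alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.known.add(Message("alice", 1, ("lakitu",), "hello"))
carol.known.add(Message("carol", 2, ("gossip",), "hi"))
# A connected chain: alice and carol never talk to each other directly.
links = [(alice, bob), (bob, carol)]
rounds = gossip_until_converged([alice, bob, carol], links, random.Random(0))
print("converged after", rounds, "swaps")
```

Because each peer remembers what it has already seen, a message is handed to a given peer at most once, which is what keeps the exchange from looping forever.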

Right now I have the protocol written down as an outline in Notebook. I’ll flatten it out, expand it and post it here in a day or two.

iTunes 9 Deja Vu
Aug 11th, 2009 by jens

AppleInsider reports on the iTunes 9 rumors:

“The social networking integration that we reported iTunes 9 would have seems to be part of a bigger social networking push by Apple,” the report states. “We’ve been informed that Apple has plans to tie iTunes 9 into a “Social” application that they plan to release in the future.”

This sounds like the kind of app (though separate from iTunes) that Jessica Kahn and I kept trying in vain to get Apple to build, circa 2003-2005. Maybe they’ll get some use out of our abandoned prototypes.

The report goes on to say that the new application would allow users to “share their listening habits with friends [and] send music to friends.”

Mike Estee and I had actually prototyped this in iChat in 2003, but the feature never got approved since there were so many more important things to add, like 3-way video conferencing. (Plus the fact that Apple execs turned white as a sheet if you said the words “send music” near them.)

Anyway, personal bitterness aside, I think it’s really amusing that Apple keeps shoving the kitchen sink into iTunes, since that has to be the single nastiest, hardest-to-extend codebase they have — it’s their last remaining Carbon app, with a foundation that dates back to Casady & Greene’s SoundJam, circa 1998.

Plugging a hole in GameKit
Mar 18th, 2009 by jens

The GameKit framework in iPhone OS 3.0 is very interesting to a Bonjour / P2P head like yrs truly. It basically provides a very easy-to-use API for ad-hoc group formation and many-to-many messaging on a local network. Great for games, of course, but also for many other types of social apps. (I just saw a report on a dev forum that somebody had whipped up a basic chat app in about 15 minutes.)

GameKit uses Bluetooth networking; that lets it work where there’s no WiFi, but it also limits the range. Bluetooth covers just a few meters, whereas a WiFi network connected to an Ethernet subnet can easily cover a whole floor of a building.

My MYNetwork framework seems like a good way to bridge that gap. The TCP connection classes provide Bonjour discovery and make point-to-point connections, and the BLIP protocol lets you send data blobs over those connections.

It should be pretty straightforward to build some classes that are plug-compatible with the GameKit network classes but use MYNetwork. Then iPhone apps could easily support both protocols, and compatible Mac apps could be developed. Anyone want to try it?

[Note: I’m only referring to information that was publicly discussed at Apple’s press event yesterday. I’ve read through the APIs, but I won’t go into details about them here in public.]

What will Web 3.0 be?
Feb 15th, 2009 by jens

So, Web 2.0’s heyday is over, and somewhere out there, Web 3.0 is slouching toward us waiting to be born. What will it be?

There’s really no such single thing as “Web x”, of course. And all predictions are really just wishes. That being said, my wish is that Web 3.0 will be about distributed systems. To oversimplify:

Web 1.0 built up big brand-name websites with their own content—things written by them, or repurposed from the media companies that owned them, or stuff to buy.

Web 2.0 embraced “user-created content” and interaction between users. The content creation has become less centralized, outsourced to whoever wants to register an account and post stuff, but the sites managing, storing and serving the content are still centralized.

Web 3.0, I hope, will take the decentralization to the software, and the storage. Monolithic web apps run by huge server farms—Facebook, Blogger, Twitter, Flickr, etc.—will be at least in part supplanted by apps that users run locally (or at least ‘nearby’) and which share data among each other.

Why is this important?

  • Centralization creates concentrations of power, and that’s dangerous. The people who run the servers have total control over your (and everyone’s) data. They can snoop at it (however private it’s supposed to be), they can sell it to advertisers, they can accidentally lose it, they can accidentally expose it to hackers.
  • Centralization leads to walled gardens. Your data on each service is intrinsically disjoint. It can be linked together, through hyperlinks and feeds and APIs and mashups, but only to the degree each service allows, and it’s never seamless.
  • Centralized services are hard to run. The more popular they get, the heavier the demands on the servers, and the worse the problems of abuse. (I see this every day in my job, even though the service I work on is one of the smallest Google runs.) This acts as a tax on innovation. Modern frameworks like Rails and Django make it easy to create a site, but to take it beyond just you and your friends, you have to get expensive hosting and deal with server clusters, database replication and so on. The early days of 3rd-party Facebook apps were a great example of this: no sooner would an app come online, than it’d drown under the load of its users and have to be resurrected by panicked owners upgrading their servers to more-expensive hosting plans.
  • Centralized services are usually closed-source. I’m not an open-source zealot, and I’ve spent my career working on closed-source software, but I don’t think it’s healthy for [nearly] all large web apps to keep their source code locked away. It discourages innovation and it makes it hard for open alternatives to compete (especially when you consider the huge intrinsic network effects that discourage switching.)

What do we need?

Decentralized systems need well-defined protocols and data formats for communicating. We’ve been making headway with that as part of Web 2.0—there’s an arsenal of technologies like REST, Atom, AtomPub, OpenID, OAuth, RDF, JSON and so on—but they’re not well integrated with each other. And we need higher level abstractions.

I’ve been researching CouchDB this week, and I’m getting more and more excited by it the more I learn. It combines data storage, REST-based APIs, scalability and data propagation through replication, and even application hosting. It’s actually a lot like Google’s internal infrastructure, but in an open and modular form.

You can use CouchDB as the back end of a traditional web service, glomming more and more instances of the server together for scalability; that’s the kind of architecture that Google and Amazon use. But you can also run instances independently from each other, and have them pull data from each other, very much like the way distributed version control systems like Git and Mercurial operate. As I’ve said before, once you have a decentralized system, you can easily design centralized systems of any form as special cases.

Since each CouchDB instance also runs as a web server, that means I can run my social network from my machine, and you can run yours from yours, and yet they can be the same social network. But I can keep my private data private, and I can hack on my software if I want, and the load on my server only scales with the size of my friend list, no matter how big the entire global network grows.

These are things I’ve been thinking of for a while (and my unfinished Cloudy app includes some of them), but CouchDB comes closer than any other software platform I’ve seen to making them implementable. It’s still unfinished (nearing version 0.9 right now), and some of the authentication and replication features that would be needed for this aren’t ready yet, but it really sounds like the people developing CouchDB Get It, and are working to make this vision of Web 3.0 come true.

[If this sounds interesting to you, go and read the preliminary draft of the upcoming O’Reilly book on CouchDB. Only the first few chapters exist yet, but they’re well-written and lay out the basics pretty well.]

Security hole in Safari RSS
Jan 13th, 2009 by jens

Brian Mastenbrook has discovered a really bad security hole in Safari RSS:

I have discovered that Apple’s Safari browser is vulnerable to an attack that allows a malicious web site to read files on a user’s hard drive without user intervention. This can be used to gain access to sensitive information stored on the user’s computer, such as emails, passwords, or cookies that could be used to gain access to the user’s accounts on some web sites. The vulnerability has been acknowledged by Apple.
All users of Mac OS X 10.5 Leopard who have not performed the workaround steps listed below are affected, regardless of whether they use any RSS feeds. Users of previous versions of Mac OS X are not affected.

He hasn’t released details yet, presumably to give Apple time to release a patch, so I don’t know what the bug is. But it’s my fault, since I either wrote the bad code myself, or at least didn’t notice a mistake a co-worker made. And since I’m not at Apple anymore I can’t help fix it.

Shit. I’m sorry, everyone.

Beautiful snej soup, yum
Aug 9th, 2008 by jens

I’m fooling around with Soup, a newish micro-blogging service I just discovered. I’ve never signed up for tumblr or its other clones, but I’m kind of smitten with Soup, so I set up my own:

beautiful snej soup, yum

I’ve got it aggregating stuff from my del.icio.us, flickr and last.fm accounts, as well as this blog. And I’m directly posting some things I’ve run across today, via its very nice bookmarklet.

Part of the reason I got sucked in is that Soup has the single best new-user experience I’ve ever seen on the web. You just click the “try it” button on the home page, and you get your own soup blog. No signup, no registration, just instant gratification. Then you can slide open the control panel (that slider itself is a beautiful piece of UI), import from your other social sites, and fool with the settings, all in privacy. Only after you’re hooked do you need to press the Create button and choose a username and password, whereupon your soup goes live. It’s brilliant: the web equivalent of the “untitled document” UI introduced in the early ’80s by the Xerox Star.

Anyway, please take a look and join me! (It’s not obvious from the untitled-blog experience, but Soup has friends and groups like other social networks.)
