“Getting Into Modeling With CouchCocoa”
Jan 19th, 2012 by jens

I gave a webcast last month for O’Reilly, and it’s up on YouTube now so you can watch it at your leisure. Here’s the abstract:

It can be very liberating to store your Couchbase app’s data as free-form dictionaries, free of the rigid schema of a relational database. But there’s a lot to be said for the syntactic comforts and conveniences of modeling your data as native Cocoa objects with predeclared @properties, the way CoreData does. Don’t fret – the CouchCocoa framework lets you do both, even in the same database. During this webcast I’ll show you how to acquire the glamor of an object model, while still letting your NoSQL freak-flag fly.
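
To give a rough feel for the "both at once" idea, here is a minimal Python sketch in which declared properties are just a thin layer over a free-form document dictionary. This is not CouchCocoa's actual API (which is Objective-C); the names `DocProperty`, `Model` and `GroceryItem` are made up for the illustration.

```python
class DocProperty:
    """Descriptor that maps a declared attribute onto a key of the document dict."""

    def __set_name__(self, owner, name):
        self.key = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.document.get(self.key)

    def __set__(self, obj, value):
        obj.document[self.key] = value


class Model:
    """Wraps a free-form document; undeclared keys stay reachable via .document."""

    def __init__(self, document=None):
        self.document = dict(document or {})


class GroceryItem(Model):
    # Hypothetical schema: only these two keys get predeclared properties.
    text = DocProperty()
    checked = DocProperty()


item = GroceryItem({"text": "milk", "checked": False, "aisle": 7})
item.checked = True                 # property-style access to a declared key
print(item.text, item.checked)      # milk True
print(item.document["aisle"])       # 7 (free-form key, no declaration needed)
```

The declared properties and the raw dictionary share the same storage, so the two styles can be mixed freely on the same document.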

Announcing TouchDB
Dec 19th, 2011 by jens

[I just posted this to the Couchbase Mobile community mailing list.]

TouchDB is a project I’ve been feverishly working on for a few weeks. It’s an investigation into the feasibility of a CouchDB-compatible database rewritten from the ground up for mobile apps. The comparison I like to make is that “if CouchDB is MySQL, then TouchDB is SQLite”. In fact, it uses SQLite as its underlying storage engine. You can read a longer justification for it on its wiki, as well as an FAQ and design document.

• It speaks CouchDB’s replication protocol. I’m pretty serious about that; I’m even documenting the protocol.
• It also understands a large subset of the REST API, enough so that it works with CouchCocoa. I’ve got a clone of Grocery Sync working as one of the demo apps in the project.
• The current implementation is for iOS. If the investigation pans out we’ll port it to Android, and possibly other platforms.

TouchDB is certainly not ready for prime-time yet, but here are some current statistics to whet your appetite:
• Code size of an ‘empty’ iOS app with nothing in it but TouchDB: ~150k.
• Time to initialize TouchDB and open a database, on iPad 2: ~100ms (cold) or ~60ms (warm).
• Size of source code: ~4000 lines of Objective-C (plus another ~2500 lines from some existing utility libraries.)

What’s left to do? Probably a lot — that infamous “second 90%”. Prominently:
• Attachments
• Reduce functions and grouping
• Filters for views and replication
• Performance tuning
See the issue tracker for more.

So, what does this mean for Couchbase Mobile? Honestly, we don’t know yet. It may be that TouchDB turns out to be so awesome that it replaces embedded-CouchDB entirely in Couchbase Mobile on iOS and Android. It may be that there are still scenarios where embedded-CouchDB works better and is worth the extra overhead for some developers, in which case we’ll still support it. This is not a product announcement; it’s a technical announcement of something that isn’t a product yet, because we like to do our development in the open. We’d love your feedback or even contributions.

My presentation from Keeping It Realtime
Nov 10th, 2011 by jens

I gave another talk about Couchbase/CouchDB at the Keeping It Realtime conference this week in Portland. This one is titled “_ch_ch_changes: CouchDB/Couchbase Notifications And Replications”, and the slides are now up on slideshare.

I had a great time. The conference itself was pretty exciting, even if some of the content was over my head (I’m not primarily a web developer, server-side isn’t how I roll, and I’ve only just started learning about node.js this week!) Plus: Portland. OMG, I love Portland.

Couchbase news (and my preso)
Aug 10th, 2011 by jens

My new employer is doing well:

MOUNTAIN VIEW, Calif. – August 10, 2011 – Couchbase, the leading NoSQL database company, today announced it has secured $14 million in a Series C round of financing led by venture capital firm Ignition Partners with participation from the company’s existing investors Accel Partners, Mayfield Fund, and North Bridge Venture Partners. The company has also reserved an additional $1 million for investment from strategic customers and partners. The new funding will be used to further invest in NoSQL product development, support the adoption and growth of Couchbase in enterprise organizations, and support international expansion.
On the heels of its inaugural CouchConf developer conference, held July 29 in San Francisco with more than 300 attendees from around the world, Couchbase announced a Series C round of financing, bringing the company’s total funding to $30 million.

CouchConf was a great time. I really enjoyed meeting developers, and learning more about Couchbase from the other presentations. The slides from my own talk on Couchbase Mobile for iOS are now online (minus the gratuitous Keynote transitions) if you’d like to take a look.

Fudge
May 16th, 2011 by jens

I’ve just released a new open-source project, a small one—Fudge-Cpp, a fast C++ library for reading and writing Fudge messages.

I hadn’t heard of Fudge either, till a few weeks ago, but it’s a type of thing that’s always interested me: a generic structured binary data format. A quick elevator pitch would be “it’s sorta like JSON, except more compact and faster to parse”. (It’s also sorta like Mac property-lists, YAML, etc.) So, it lets you turn collections of scalars, strings, arrays and dictionaries into a standardized blob of data that can be sent over a network or stored on disk or whatever.

From the Fudge website:

“Fudge is primarily useful in situations where you have:

  • Data exchanging between nodes in a distributed system; where
  • You want to be able to do meaningful work with data without knowing at compile time the precise message format; and
  • Performance and message size are important; or
  • A requirement to encode data and translate between efficient binary encodings, and more accepted text encodings (XML, JSON) depending on the communications channel.”

The obvious advantage of a binary format for this is that it’s faster to parse. Instead of having to tokenize the input and walk through it counting braces and commas, and converting sequences of digits into numbers, you just read big-endian numbers from the stream and interpret them as types and byte-counts. This can also make the data smaller, if the format is careful to use the smallest number of bytes to represent a number (which Fudge does). It’s especially compact for messages containing payloads of binary data like images, audio, or digital signatures, since those blobs don’t have to be converted to Base64 (which makes them about 33% larger).
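
To make the parsing and size arguments concrete, here is a small Python sketch. It uses a made-up type/length/value layout (a 1-byte type tag followed by a 2-byte big-endian length), not the real Fudge wire format, and `encode_field` and `decode_fields` are hypothetical helpers.

```python
import base64
import json
import struct

HEADER = struct.Struct(">BH")  # 1-byte type tag, 2-byte big-endian payload length


def encode_field(type_tag, payload):
    """Prefix the payload with its type and byte count."""
    return HEADER.pack(type_tag, len(payload)) + payload


def decode_fields(data):
    """Walk the buffer by reading fixed-size headers; no tokenizing needed."""
    fields, offset = [], 0
    while offset < len(data):
        type_tag, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        fields.append((type_tag, data[offset:offset + length]))
        offset += length
    return fields


blob = bytes(range(256)) * 3                        # 768 bytes of binary payload
binary_msg = encode_field(1, b"photo") + encode_field(2, blob)
json_msg = json.dumps({"name": "photo",
                       "data": base64.b64encode(blob).decode("ascii")})

print(decode_fields(binary_msg) == [(1, b"photo"), (2, blob)])  # True
print(len(binary_msg))                              # 779
print(len(base64.b64encode(blob)))                  # 1024, i.e. 4/3 of 768
```

The binary message carries the blob as-is with a few bytes of header, while the JSON version has to carry the 1024-character Base64 text plus its own punctuation.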

A more subtle advantage is that the library can use less memory. In fact, Fudge-Cpp basically allocates no memory at all from the heap. How does it manage this? When the library returns the caller an object representing a data value from the parsed message (a scalar, string, array, etc.) it doesn’t allocate that object on the heap. Instead, it just returns a pointer to where that value is stored in the binary message itself. This really is a valid C++ object pointer; it’s just that the object’s members have the exact same layout as the corresponding Fudge data.

(Now, there is a bit of impedance mismatch that adds overhead to accessing the data this way. For one thing, on a little-endian CPU every multi-byte numeric value will need to be byte-swapped when accessed. And random access to dictionaries and object arrays is O(N) since they are basically linked lists. But the overhead is low.)
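
The zero-copy trick and its costs can be sketched in Python, with `memoryview` playing the role of a pointer into the message buffer. The field layout here (1-byte name length, 2-byte big-endian value length, then the name and value bytes) is invented for the example and is not the Fudge encoding; `pack_field` and `find_field` are hypothetical helpers.

```python
import struct

HEADER = struct.Struct(">BH")  # name length (1 byte), value length (2 bytes, big-endian)


def pack_field(name, value):
    return HEADER.pack(len(name), len(value)) + name + value


def find_field(message, name):
    """Linear scan: each header says where the next field starts, so lookup is O(N)."""
    view = memoryview(message)
    offset = 0
    while offset < len(view):
        name_len, value_len = HEADER.unpack_from(view, offset)
        name_start = offset + HEADER.size
        value_start = name_start + name_len
        if view[name_start:value_start] == name:
            # Slicing a memoryview copies nothing; it just points into `message`.
            return view[value_start:value_start + value_len]
        offset = value_start + value_len
    return None


message = bytearray(pack_field(b"id", struct.pack(">I", 42)) +
                    pack_field(b"title", b"Fudge"))

title = find_field(message, b"title")
print(title.obj is message)              # True: the view shares the message's memory
print(bytes(title))                      # b'Fudge'

# The integer is stored big-endian, so a little-endian CPU swaps bytes on each read.
(record_id,) = struct.unpack(">I", find_field(message, b"id"))
print(record_id)                         # 42
```

Nothing is decoded until a value is actually read, which is where the byte-swapping cost shows up.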

This library isn’t a world-changing project; it was more of a fun diversion for me over a few weekends. I’ve done this sort of thing before—Ottoman uses similar memory-saving pointer tricks, and AEGizmos was literally the first code I ever wrote at Apple back in 1991—and it’s always fun to twiddle bits this way.

MYCrypto update (0.5)
Apr 17th, 2011 by jens

I’ve been making little updates to the MYCrypto library for a few months, and after the latest batch I did some housekeeping—fixing iOS and 64-bit build errors, updating the docs—and decided to dignify them with a new version number, 0.5.

Notable improvements:

  • More certificate I/O functionality. You can now import and export PKCS12 (.p12) files, which are encrypted archives that contain a private key as well as its cert.
  • More certificate trust validation APIs, including read/write access to user trust settings.
  • Access to X.509 cert extensions like key-usage.
  • Can verify signatures that use algorithms other than SHA-1, and parse certs that use such algorithms.
  • MYMockKey, a testing fixture that lets you test code that uses digital signatures without having to generate or use real key-pairs.

If you infer from this list that I’ve been working on an app that manages X.509 certs, I won’t deny it :) Maybe I’ll have more to say about that soon.

I’m a qwitter
Mar 12th, 2011 by jens

I have backed up all the tweets from my Twitter account (@snej) to a local file, and am now mass-deleting all of them. This is a venerable form of protest that goes back to early BBSs like the WELL. Basically, I am no longer willing to donate my ‘valuable’ user-generated content to a centralized service that issues fuck-yous of this magnitude to its developers and users.

I could rant at length about the arrogance, stupidity and just plain creepiness of that message and the policies behind it, but I don’t know that it’s even worth it. Others have already done a pretty good job of deconstructing its marketroid Newspeak. I just can’t resist pointing out that two of the major components of Twitter’s content model—the @-mention and the #hashtag—were invented by early users and app developers, not by Twitter itself, then later integrated directly into the system to make them more useful. That’s a great example of collaborative development. Now, perversely, Twitter sees fit to tell app developers exactly how they can and can’t represent those same features in their UIs.

And yes, this is enforceable, because thanks to OAuth they can and will revoke an app’s access to Twitter at the flick of a switch. They brag about how they “revoke literally hundreds of API tokens / apps a week” [ibid]. I just now realized the implications of this, actually. OAuth may be more secure than traditional HTTP auth in that it doesn’t give apps access to your account password, but the centralization of control that it gives to service providers is really disturbing.

“But Jens”, you say, “you still have accounts on other centralized social networking sites such as Facebook, Tumblr, LiveJournal and flickr, many of which have also shown a similar disregard for users and developers. Why aren’t you deleting those accounts?”

Good question, anonymous readership. It comes down to three factors:

  1. These other services feel more like real apps, with idiosyncratic features. Twitter has always seemed more like (and promoted itself as, at least to developers) a general purpose platform. It’s a series of tubes for publishing and subscribing to 140-character blobs. I could ignore its stupid star-shaped topology, centralized control, and frustrating payload limitations … as long as it stuck to being a generic service. Now they’re taking that back. It’s as though the phone company is telling me what color of telephone I’m allowed to plug into their lines and what size the touch-tone buttons have to be. (Think that’s a silly example? The old monopoly AT&T actually did enforce that ludicrous degree of control until the Hush-A-Phone court ruling in 1956 and the FCC’s Carterfone decision in 1968.)
  2. Some of these services are, frankly, a lot more important to me than Twitter. I am basically on Twitter mostly to keep up with some friends/acquaintances who post about their daily lives there. (It makes me very sad that some of those friends once used to post far more meaningful content on a regular basis on LiveJournal. I miss those days.) But even though I keep my list of follows really small, my stream still has so many retweets and links to random URLs and unreadably-shorthanded opinions, that it’s often more frustrating than useful.
  3. Finally, some of those services just haven’t done anything particularly evil yet. Turns out I can put up with innocent everyday failure pretty well when I’m not paying anything for it.

The big question in my mind is what to replace Twitter with. Ironically (and perhaps pathetically) I think I will end up reading Facebook more, because some of my Twitter friends are also there. At least until the next time Facebook does something egregiously evil.

In a larger sense, it should not be rocket science to build some plumbing that does what Twitter does—publish and subscribe small blobs—with an actually-decentralized architecture. There are a lot of smart developers out there, but to some extent we’ve been seduced into suckling at the proprietary API teats of big providers, at the expense of developing the next generation of open protocols.

Yeah, in my current day job I’m as guilty of this as anyone else. But at home I’ve got a garage full of various pieces of half-built tech that attempt to solve that problem in one form or another, if I could ever finish any of them. A lot of the trouble is motivation. Anyone want to help out?

The Music I Liked Of 2009
Dec 13th, 2009 by jens

Every year the Albums Of The Year lists seem more and more removed from my experience. (Most of the time I haven’t heard a single album on the list.) Worse, we’re now getting into the Of The Decade lists, making me realize how long this has been going on*. If you ask me the top albums of the ‘80s or ‘90s, I don’t have too much trouble rattling off a bunch of names. But this decade? I get confused and have to start thinking hard and looking through the back covers of my mix CDs. Why is that? [Ed.: it’s because you’re getting old. Duh.]

Let me start with this year, 2009. What was good? Hm; my prosthetic brain units at iTunes and last.fm tell me that it’s:

Dysrhythmia, "Psychic Maps" [jaw-dropping instrumental math-metal, will have you banging your head in 7/13 time.]

Isis, "Wavering Radiant" [post-metal? huge lowercase-’p’ progressive epics. so good I’m willing to overlook the cookie-monster vocals.]

Pelican, "Ephemeral" EP [is it post-rock with big riffs? or restrained brainy instrumental metal?]

Apricot Rail, "Apricot Rail" [ok, this is instrumental-post-rock for sure, but atypically cheerful, kind of like Do Make Say Think. they’ve even got oboes omg!]

The Happy Hollows, "Spells" [this is the token recognizably-indie-rock album on the list. are they secretly Deerhoof covering the Pixies? Or Lush covering Interpol? also, they are hella cute and I can’t wait to see them live]

Dirty Projectors, "Bitte Orca" [I still can’t figure this out, with its Afrobeat guitar lines, intricate interlocking vocals, weird rhythms and weirder lyrics. ok, that description makes it sound like "Remain In Light", which it isn’t at all like, but maybe is somehow.]

Elisa Luu, "Chromatic Sigh" [can I say "eclectic" with a straight face? really interesting electronic music, clearly song-like, ranging between poles of rock and ambient.]

Anduin + Jasper TX, "The Bending Of Light" [guitar-based drone. majestic.]

Brock van Wey, "White Clouds Drift On And On" [exactly what it sounds like. it comes with or without beats, your choice.]

Eluder, "Drift" [exactly what it sounds like, too. careful with this one, some people have had trouble returning to earth afterwards.]

Jónsi & Alex, "Riceboy Sleeps" [Sigur Rós guitarist and his boyfriend do Stars Of The Lid. delicious in small doses.]

Victoire, "A Door Into The Dark" EP [a string ensemble that embraces electronics as a natural part of their sound]

Sub, "Id" EP [nothing new, but basically the best Photek tracks I’ve heard in ten years.]

PS: there might be some last minute additions to this list if Kate Simko’s "Sounds Of The Atom Smashers" and Concern’s "Truth And Distance" live up to their potential.
PPS: ZOMG I forgot Sunn O)))’s stark and forbidding "Monoliths And Dimensions", whose final track "Alice" is deserving of a space up there.
PPPS: Yes, a mix of this stuff is forthcoming…

* It’s not like I’ve been having trouble finding great new music, or that I resent other people for all picking different music than me; but it’s a little sad to not be part of the zeitgeist. Long ago it was really important to me whether punk and new wave would break dinosaur rock’s lock on the mainstream, or whether little-known underground bands like the Cure or the Pixies would get the recognition they deserved. I think the peak of my with-it-ness was circa 1990-91 when I could rattle off the name of every shoegazer band that mattered and I treated every issue of Melody Maker as a shopping list to take with me to the import bin at Tower to find the next Cranes or Chapterhouse.

Ottoman Status
Dec 8th, 2009 by jens

Yes, I’m still working on Ottoman (my append-only multiversion-concurrent storage library). As the code grows in size and complexity, so it grows in its resistance to being changed, but as Piet Hein said and I never tire of quoting:

Problems worthy of attack
Prove their worth by fighting back.

I just pushed my latest changes up to bitbucket.org. What’s new?

Variable-sized top-level index. Previously, the hash data structure used a top level array of 256 hashtables. I’ve now made that ‘256’ a variable that scales with the number of records. This saves a lot of room for smaller datasets. (Sounds easy, right? I thought so too. But it ended up triggering significant changes to the file format and algorithms and took a lot of debugging.)
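
As a rough illustration of what "scales with the number of records" can mean, here is a Python sketch. The sizing rule is my assumption for the example, not Ottoman's actual algorithm: it picks a power-of-two table count, aims for a hypothetical `records_per_table` load, and caps the result at the old fixed size of 256.

```python
def top_level_size(record_count, records_per_table=64, max_tables=256):
    """Smallest power of two keeping tables at or under the target load, capped."""
    size = 1
    while size < max_tables and size * records_per_table < record_count:
        size *= 2
    return size


def table_for(key_hash, table_count):
    """Pick a sub-table from the low bits of the hash (table_count is a power of two)."""
    return key_hash & (table_count - 1)


for n in (10, 1_000, 1_000_000):
    print(n, top_level_size(n))
# 10 1
# 1000 16
# 1000000 256
```

With a rule like this, a ten-record file carries a single top-level slot instead of 256 mostly-empty ones.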

The return of the trees! part un. I started this as a project to learn about B+trees, but when I got to the point of implementing the append-only multiversion stuff I did it with hashtables first because it’s more straightforward that way, and the tree code went into the freezer. Well, it’s back now. There is a “new” Tree class that can be used to read and write persistent B+trees. It’s not yet integrated into the top-level Ottoman class, though, so you can only use it on its own, not intermixed with hash-tables, and without the API for tracking versions.

The next task is of course to integrate the tree API into the Ottoman class. But more than that, it will be entwined with the hashtable such that trees will serve as searchable indexes into the data store. So the hashtable will provide extremely fast but unordered access via a unique “record ID”, but you can build as many indexes as you like that will use ordered secondary keys (of your choice) to look up hash values.
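
Here is a toy Python sketch of that arrangement. It is not Ottoman's API: a plain `dict` stands in for the hashtable, and a sorted list searched with `bisect` stands in for the B+tree. The `Store` class and its method names are hypothetical.

```python
import bisect


class Store:
    """Dict for primary lookup by record ID, plus sorted (key, record ID) lists as indexes."""

    def __init__(self):
        self.records = {}    # record ID -> record (fast, unordered)
        self.indexes = {}    # index name -> (key function, sorted list of (key, record ID))

    def add_index(self, name, key_func):
        entries = sorted((key_func(rec), rid) for rid, rec in self.records.items())
        self.indexes[name] = (key_func, entries)

    def put(self, record_id, record):
        # Sketch only: assumes each record ID is inserted once (no updates or deletes).
        self.records[record_id] = record
        for key_func, entries in self.indexes.values():
            bisect.insort(entries, (key_func(record), record_id))

    def range(self, name, low, high):
        """Records whose secondary key k satisfies low <= k < high, in key order."""
        _, entries = self.indexes[name]
        start = bisect.bisect_left(entries, (low,))
        stop = bisect.bisect_left(entries, (high,))
        return [self.records[rid] for _, rid in entries[start:stop]]


store = Store()
store.add_index("by_price", lambda rec: rec["price"])
store.put("r1", {"name": "tea", "price": 3})
store.put("r2", {"name": "coffee", "price": 5})
store.put("r3", {"name": "milk", "price": 2})

print(store.records["r2"]["name"])                            # coffee
print([r["name"] for r in store.range("by_price", 2, 4)])     # ['milk', 'tea']
```

The index entries hold only the secondary key and the record ID, so every ordered query ends with a lookup in the primary store.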

At that point, I believe Ottoman will be ready to serve as a substrate for HTML local data storage, for implementing a CouchDB-compatible database (a la JSONDB), or for other fun databasey purposes.

ZSync
Nov 28th, 2009 by jens

ZSync is a new Mac/iPhone library that uses my BLIP P2P networking protocol:

“ZSync is an open source syncing library designed to allow easy syncing of data between an iPhone/iPod Touch and the OS X Desktop.
ZSync utilizes the BLIP library and Apple’s Sync Services to allow easy and seamless syncing of data.”

It’s still in early development though, with a first public release expected in January:

Right now the code is in a private GitHub repository while the initial framework and protocols are fleshed out. This is expected to go public in January of 2010. Until then we are keeping the development team very small so that we can flesh out the design without a lot of overhead.

This looks like it’ll be super useful for iPhone apps that want to integrate with their Mac siblings, especially since their design won’t require you to have the Mac app running while you sync.
