What if we started over from Email / Usenet?

I've been thinking for the past few days about what might happen if we went back to the rudiments of, say, email or Usenet, and built message formats and transport protocols back up from there. What you'd need to add or take away. What new capabilities we have now.

An obvious win would be the integrated metadata -- sender, recipients (defined ... somehow), dates, and potentially useful subject lines, though even in email that's hit-or-miss now.

I'm actually considering creating an mbox or maildir archive as one format for my G+ data takeout. Payloads in bodies, first sentence or so as a subject line, and then other bits. Run "notmuch" over that and have a highly-indexed, searchable trove. For my own use, not externally accessible. Text-only. But interesting.
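As a rough sketch of what that conversion might look like -- with the caveat that the Takeout path and JSON field names here ("content", "creationTime", "author") are my guesses, not the actual schema:

```python
# Sketch: turn a G+ Takeout post dump into a maildir that notmuch can index.
# The Takeout path and JSON field names are assumptions, not the real schema.
import json
import email.utils
import mailbox
from email.message import EmailMessage
from pathlib import Path

box = mailbox.Maildir("gplus-archive", create=True)

for path in Path("Takeout/Google+ Stream/Posts").glob("*.json"):
    post = json.loads(path.read_text(encoding="utf-8"))
    body = post.get("content", "")                        # hypothetical field
    msg = EmailMessage()
    msg["From"] = post.get("author", "me@example.net")    # hypothetical field
    # May need reformatting to an RFC 2822 date, depending on what Takeout stores.
    msg["Date"] = post.get("creationTime") or email.utils.formatdate()
    msg["Subject"] = body.split(".")[0][:72] or "(untitled)"  # first sentence-ish
    msg.set_content(body)                                  # payload in the body
    box.add(msg)

# Then index it:  notmuch new  (with the maildir configured as the mail root)
```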

And mutt (a console-mode email program) would be the client interface. Or any other email client. I'd have pretty much instant search by any of mutt's standard fields: from, to, subject, date, full text. Could add a few additional custom fields -- attached URL(s), whether or not there's an image. Maybe even thread the discussion underneath using "In-Reply-To" and "References" headers.
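The headers might look something like this -- the X-* names and the Message-ID scheme are made up for illustration, nothing G+ itself hands you:

```python
# Sketch of the per-post headers, including custom fields and threading.
# The X-* header names and the Message-ID scheme are invented for illustration.
import email.utils
from email.message import EmailMessage

def as_message(post_id, author, body, urls=(), has_image=False, parent_id=None):
    msg = EmailMessage()
    msg["From"] = author
    msg["Date"] = email.utils.formatdate(usegmt=True)
    msg["Subject"] = body.split(".")[0][:72]
    msg["Message-ID"] = f"<{post_id}@gplus.example.invalid>"
    if urls:
        msg["X-Attached-URLs"] = ", ".join(urls)          # custom field
    msg["X-Has-Image"] = "yes" if has_image else "no"     # custom field
    if parent_id:
        # Comments reference their parent post, so mutt threads them normally.
        msg["In-Reply-To"] = f"<{parent_id}@gplus.example.invalid>"
        msg["References"] = f"<{parent_id}@gplus.example.invalid>"
    msg.set_content(body)
    return msg
```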

With a few tweaks, that could be a local Usenet spool, with similar access.

It's the sort of thing I'd long wanted out of G+.

And you're right, that's not the sort of capability we're getting from the present generation of distributed Web clients. For various reasons. Though I'm increasingly asking myself "why the goddamned hell not?"


Over the past few days, for various reasons, I've been speccing out just what the entire size of the Google+ text corpus would be, stripped of its HTML, CSS, and JS packaging, and excluding images. RFC 822 is, at the very least, a reasonable foundation for a message-based format.

For Communities, it appears that there are on the order of 300 million messages, most quite short (20-40 words), call it 250 bytes of content per message, on average. The posting rate for the nearly 1 million active communities appears to be 1/wk, and over six years and some change, we get about 320 million posts. That is about 80 GB of text. Larger than my typical mail spool (slightly), but not actually a horrendous amount of data. Particularly if you think of "Google Scale" as being, well, Google Scale.
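The back-of-envelope arithmetic, for anyone who wants to check my figures:

```python
# Back-of-envelope, using the figures above.
communities    = 1_000_000   # roughly 1 million active communities
posts_per_week = 1           # apparent posting rate per community
weeks          = 320         # six years and some change
bytes_per_post = 250         # short posts, 20-40 words

posts = communities * posts_per_week * weeks       # ~320 million posts
text  = posts * bytes_per_post                     # ~80 GB of bare text
print(f"{posts/1e6:.0f} M posts, {text/1e9:.0f} GB")   # -> 320 M posts, 80 GB
```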

Estimating the total G+ size is a bit sketchier, but it seems that non-Community posts may be 2.5x or so larger, which nets us about 1 TB total. Mind, delivered over HTML this bloats tremendously, well into the petabyte range. There's some 800 kB of HTML/JS/CSS packaging in a basic G+ page to start, and then you start adding images (on 30% of all posts), at 4-24 MB each. That's ... considerable. A more efficient transport would reduce that tremendously.
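Same envelope for the as-delivered size; the overall post count and the image-size midpoint here are assumptions on my part:

```python
# As-delivered estimate: HTML/JS/CSS packaging plus images, per the figures above.
posts       = 1.0e9     # overall post count -- an order-of-magnitude assumption
page_bytes  = 800e3     # ~800 kB of HTML/JS/CSS packaging per post page
image_share = 0.30      # fraction of posts carrying an image
image_bytes = 14e6      # midpoint of the 4-24 MB range

delivered = posts * page_bytes + posts * image_share * image_bytes
print(f"{delivered/1e15:.1f} PB as delivered, versus ~1 TB of bare text")
# -> 5.0 PB as delivered, versus ~1 TB of bare text
```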

But if the platform had been designed with the thought of distributing content permanently to end users for their own access, it ... would have been pretty doable. Subscribing to a stream -- via Collections, Communities, Circles, or, way back at the beginning of time, Sparks -- could have happened.

There's a bit of a create, update, destroy cycle to deal with: some posts are edited over time (most are not). And there's the question of keeping content nobody ever reads. But really, distributing content on a wide, if not global, scale is within the realm of reason.

And yet that's not what we have.

Why not?

#socialMedia #email #usenet #rfc822 #rfc850 #doItOver #plexodus #darcyProject
