

My current jq project: create a Diaspora post-abstracter

Given the lack of a search utility on Diaspora*, my evolved strategy has been to create an index or curation of posts, generally with a short entry for each consisting of the title, a brief summary (usually the first paragraph), the date, and the URL.

I'd like to group these by time segment, say, by month, quarter, or year (probably quarter/year).

And as I'm writing this, I'm thinking that it might be handy to indicate some measure of interactions: comments, reshares, likes, etc.

My tools for developing this would be my Diaspora* profile data extract, and jq, the JSON query tool.

It's possible to do some basic extraction and conversion pretty easily. Going from there to a more polished output is ... more complicated.

A typical original post might look like this (excluding the subscribed_pods_uris array):

  "entity_type": "status_message",
  "entity_data": {
    "author": "dredmorbius@joindiaspora.com",
    "guid": "cc046b1e71fb043d",
    "created_at": "2012-05-17T19:33:50Z",
    "public": true,
    "text": "Hey everyone, I'm #NewHere. I'm interested in #debian and #linux, among other things. Thanks for the invite, Atanas Entchev!\r\n\r\nYet another G+ refuge.",
    "photos": []

Key points here are:

  • entity_type: Values "status_message" or "reshare".
  • author: This is the user_id of the author, yours truly (in this case in my joindiaspora.com incarnation).
  • guid: Can be used to construct a URL in the form of https://<hostname>/posts/<guid>
  • created_at: The original posting date, in UTC ("Zulu" time).
  • public: Visibility status, either true or false. The key also appears to be missing from a significant number of posts.
  • text: The post text itself.
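
As a minimal sketch of the guid and created_at points above, the following builds the link line for a single entry. The hostname is passed in as a jq variable ($host, a name chosen here for illustration), since only the guid is stored with the post; the sample entry is trimmed from the post shown earlier.

```shell
# One entry, trimmed from the sample post; a real export holds many of these.
entry='{"entity_type":"status_message","entity_data":{"author":"dredmorbius@joindiaspora.com","guid":"cc046b1e71fb043d","created_at":"2012-05-17T19:33:50Z","public":true}}'

# $host is an assumption: the pod hostname is supplied by hand, not read from the entry.
link=$(printf '%s' "$entry" | jq -r --arg host "joindiaspora.com" '
  .entity_data
  | (.created_at | fromdateiso8601 | strftime("%Y-%m-%d %H:%M")) as $when
  | "https://\($host)/posts/\(.guid) (\($when))"
')
printf '%s\n' "$link"
```

The timestamp stays in UTC; fromdateiso8601 expects exactly the "Z"-suffixed form shown in the sample.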

A reshare looks like:

  "entity_type": "reshare",
  "entity_data": {
    "author": "dredmorbius@joindiaspora.com",
    "guid": "5bfac2041ff20567",
    "created_at": "2013-12-15T12:45:08Z",
    "root_author": "willhill@joindiaspora.com",
    "root_guid": "53e457fd80e73bca"

Again, excluding .subscribed_pods_uris. In most cases, reshares are of less interest than direct posts.

Interestingly, I've a pretty even split between posts and reshares (52% status_message, that is, post).

My theory in creating an abstract is:

  • Automation is good.
  • It's easier to peel stuff off an automatically-created abstract than to add bits back in manually.
  • The compilation should contain only public posts and exclude reshares.


It's relatively easy to create a basic extract:

  jq '.user.posts[].entity_data | .author, .guid, .created_at, .text'

Adding in selection and formatting logic gets ... more complicated.
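
The selection half is probably the more tractable part. A minimal sketch, assuming the same .user.posts[] path as the basic extract and a small made-up export (the guids a1 to a4 are hypothetical), keeps only public status_message entries:

```shell
# Made-up export fragment: one public post, one private, one with no
# "public" key at all, and one reshare.
export_json='{"user":{"posts":[
  {"entity_type":"status_message","entity_data":{"guid":"a1","public":true,"text":"public post"}},
  {"entity_type":"status_message","entity_data":{"guid":"a2","public":false,"text":"private post"}},
  {"entity_type":"status_message","entity_data":{"guid":"a3","text":"no public key"}},
  {"entity_type":"reshare","entity_data":{"guid":"a4","root_guid":"zz"}}
]}}'

public_guids=$(printf '%s' "$export_json" | jq -r '
  .user.posts[]
  | select(.entity_type == "status_message")   # drop reshares
  | .entity_data
  | select(.public == true)                    # a missing key is null, so it is dropped too
  | .guid
')
printf '%s\n' "$public_guids"
```

Comparing against true explicitly means posts with no public key are treated as non-public, which seems the safer default for a published index.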

Among other factors, jq is a very quirky language.

Desired Output Format

I would like to produce output which renders something like this for any given post:

Diaspora Tips: Pods, Hashtags & Following

For the many Google Plus refugees showing up on Diaspora and Pluspora, some pointers: ...

https://diaspora.glasswings.com/posts/a53ac360ae53013611b60218b786018b (2018-10-10 00:45)

What if any options are there for running Federated social networking tools on or through #OpenWRT or related router systems on a single-user or household basis?

I'm trying to coordinate and gather information for #googleplus (and other) users looking to migrate to Fediverse platforms, and I'm aware that OpenWRT, #Turris (I have a #TurrisOmnia), and several other router platforms can run services, mostly #NextCloud that I'm aware of. ...

https://diaspora.glasswings.com/posts/91f54380af58013612800218b786018b (2018-10-11 07:52)

The original posts can of course be viewed at the URLs shown.

What this is doing is:

  • Extracting the first line of the post text itself.
  • Stripping all formatting from it.
  • Bolding the result by surrounding it in ** Markdown.
  • Including the second paragraph, terminating it in an ellipsis (...).
  • Including a generated URL, based on the GUID, and here parked on Glasswings. (I might also create links to Archive.Today and Archive.Org of the original content.)
  • Including the post date, with time in YYYY-MM-DD hh:mm resolution.
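
The steps above can be sketched in jq roughly as follows. This is one possible approach under stated assumptions, not a tested solution against a full export: the hostname pod.example, the guids, and the sample texts are all made up, and "stripping formatting" is reduced to removing leading # heading marks and */_ emphasis characters.

```shell
# Made-up export: one two-paragraph post with Markdown in its first line,
# and one single-line post.
export_json='{"user":{"posts":[
  {"entity_type":"status_message","entity_data":{"guid":"a1","created_at":"2018-10-10T00:45:12Z","public":true,"text":"## *Hello* world\r\n\r\nSecond paragraph here.\r\n\r\nThird paragraph."}},
  {"entity_type":"status_message","entity_data":{"guid":"a2","created_at":"2018-10-11T07:52:03Z","public":true,"text":"Just one line"}}
]}}'

abstract=$(printf '%s' "$export_json" | jq -r --arg host "pod.example" '
  .user.posts[]
  | select(.entity_type == "status_message" and .entity_data.public == true)
  | .entity_data
  # Split on runs of line breaks (CRLF or LF) and drop empty pieces.
  | ([(.text // "") | splits("(\r?\n)+")] | map(select(length > 0))) as $paras
  # Title: first line, minus leading heading marks and emphasis characters.
  | (($paras[0] // "") | sub("^#+\\s*"; "") | gsub("[*_]"; "")) as $title
  | (.created_at | fromdateiso8601 | strftime("%Y-%m-%d %H:%M")) as $when
  | "**\($title)**", "",
    (if ($paras | length) > 1 then "\($paras[1]) ...", "" else empty end),
    "https://\($host)/posts/\(.guid) (\($when))", ""
')
printf '%s\n' "$abstract"
```

Posts with a single line simply get no summary paragraph. Because the split pattern matches any run of line breaks, single and doubled \r\n separators are handled the same way.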

Including the month and year where those change might also be useful for creating archives.
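
One way to get year and month headings only where they change is to carry the previous post's values as state through jq's foreach. The sketch below assumes the posts can be sorted by created_at as plain strings (true for the ISO 8601 form in the export) and uses made-up guids; it prints a bare "- guid" line per post to keep the example short.

```shell
# Made-up export, deliberately out of date order.
export_json='{"user":{"posts":[
  {"entity_data":{"guid":"a3","created_at":"2013-12-20T08:00:00Z"}},
  {"entity_data":{"guid":"a1","created_at":"2012-05-17T19:33:50Z"}},
  {"entity_data":{"guid":"a4","created_at":"2014-01-02T10:00:00Z"}},
  {"entity_data":{"guid":"a2","created_at":"2013-12-15T12:45:08Z"}}
]}}'

outline=$(printf '%s' "$export_json" | jq -r '
  [.user.posts[].entity_data | select(.created_at != null)]
  | sort_by(.created_at)
  | foreach .[] as $p (
      {year: "", month: ""};                  # state: values from the previous post
      . as $old
      | {year: $p.created_at[0:4], month: $p.created_at[0:7]}
      | .lines = [
          (if .year  != $old.year  then "## \(.year)"   else empty end),
          (if .month != $old.month then "### \(.month)" else empty end),
          "- \($p.guid)"
        ];
      .lines[]                                # emit headings (if any), then the post line
    )
')
printf '%s\n' "$outline"
```

Here $old plays the role of a lagged variable: it holds the state left by the previous iteration, so a heading is emitted only when the year or month differs from it.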

Specific questions / challenges:

  • How to conditionally export only public posts.
  • How to conditionally export only status_message (that is, original) posts, rather than reshares.
  • How to create lagged "oldYear" and "oldMonth" variables.
  • How to conditionally output content when computed Month and Year values > oldMonth and oldYear respectively. Goal is to create ## .year and ### .month segments in output.
  • How to output up to two paragraphs, where posts may consist of fewer than two separate text lines, and lines may be separated by multiple or only single line breaks (\r\n).
  • Collect and output hashtags used in the post.
  • Include counts of comments, reshares, likes, etc. I'm not even sure this is included in the JSON output.
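
For the hashtag item, a minimal sketch using jq's scan with a regular expression; the sample text is made up, and the pattern (# followed by word characters) is an assumption about what counts as a tag. It would miss tags containing hyphens and would also pick up #fragments inside URLs. Interaction counts are not attempted here, since it is unclear whether the export carries them.

```shell
# Made-up entry with a repeated tag.
entry='{"entity_type":"status_message","entity_data":{"text":"Interested in #debian and #linux. Yet another #linux user who is #NewHere."}}'

tags=$(printf '%s' "$entry" | jq -r '
  [(.entity_data.text // "") | scan("#\\w+")]   # every "#word" match, in order
  | unique                                      # de-duplicate (also sorts)
  | join(" ")
')
printf '%s\n' "$tags"
```

Note that unique sorts by codepoint, so capitalised tags come before lowercase ones.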

There might be more, but that's a good start.

And of course, if I have to invoke other tools for part of the formatting, that's an option, though an all-in-jq solution would be handy.

#jq #json #diaspora #scripting #linux


What if any options are there for running Federated social networking tools on or through #OpenWRT or related router systems on a single-user or household basis?

I'm trying to coordinate and gather information for #googleplus (and other) users looking to migrate to Fediverse platforms, and I'm aware that OpenWRT, #Turris (I have a #TurrisOmnia), and several other router platforms can run services, mostly #NextCloud that I'm aware of.

Is #diaspora itself viable on these systems? I'm thinking that may be ambitious.

If not, what are considerations for running a small node through a router? Primary considerations would be capacity planning, bandwidth and load impacts, and configuration and security considerations.

Hoping there's some expertise here.

#openwrt #networking #selfhosting #servers #linux