FSF Events: Free Software Directory meeting on IRC: Friday, January 26, starting at 12:00 EST (17:00 UTC)
Join the FSF and friends on Friday, January 26, from 12:00 to 15:00 EST (17:00 to 20:00 UTC) to help improve the Free Software Directory.
It's not long until FOSDEM 2024, where Guixers will come together to learn and hack. As usual there are some great talks and opportunities to meet other users and contributors.
FOSDEM is Europe's biggest Free Software conference. It's aimed at developers and anyone who's interested in the Free Software movement. While it's an in-person conference, there are live video streams and lots of ways to participate remotely.
The schedule is varied with development rooms covering many interests. Here are some of the talks that are of particular interest to Guixers:
The Declarative and Minimalistic Computing track takes place Sunday morning. Important topics are:
Guix-related talks are:
This year the track commemorates Joe Armstrong, who was the principal inventor of Erlang. His focus on concurrency, distribution and fault-tolerance are key topics in declarative and minimalistic computing. This article is a great introduction to his legacy, along with "The Mess We're In", a classic where he discusses why software is getting worse with time, and what can be done about it.
On Sunday afternoon, the Distributions devroom has another Guix talk:
Guix Days will be taking place on the Thursday and Friday before FOSDEM. This is an "unconference-style" event, where the community gets together to focus on Guix's development. All the details are on the Libreplanet Guix Wiki.
Come and join in the fun, whether you're a new Guix user or seasoned hacker! If you're not in Brussels you can still take part:
GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the Hurd or the Linux kernel, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, AArch64, and POWER9 machines.
In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.
Today, a middle-aged note: when you are young, unless you have been failed by The System, you enjoy a radiant confidence: everything you say burns with rightness and righteousness, that the world Actually Is This Way, You See, and if you think about it, it Actually Should Be This Other Specific Way. This is how you get the fervent young communists and Scala enthusiasts and ecologists and Ayn Randians. The ideas are so right that you become an evangelist, a prophet, a truth-speaker; a youtuber, perhaps.
Then, with luck, you meet the world: you build, you organize, you invest, you double down. And in that doubling, the ideas waver, tremble, resonate, imperceptibly at first, reinforced in some ways, impeded in others. The world works in specific ways, too, and you don’t really know them in the beginning: not in the bones, anyway. The unknowns become known, enumerate themselves, dragons everywhere; and in the end, what can you say about them? Do you stand in a spot that can see anything at all? Report, observe, yes; analyze, maybe, eventually; prophesize, never. Not any more.
And then, years later, you are still here. The things you see, the things you know, other people don’t: they can’t. They weren’t here. They aren’t here. They hear (and retell) stories, back-stories, back-back-stories, a whole cinematic universe of narrative, and you know that it’s powerful and generative and yet unhinged, essentially unmoored and distinct from reality, right and even righteous in some ways, but wrong in others. This happens in all domains: macroeconomics, programming languages, landscape design, whatever. But you see. You see through stories, their construction and relation to the past, on a meta level, in a way that was not apparent when you were young.
I tell this story (everything is story) as an inexorable progression, a Hegelian triptych of thesis-antithesis-synthesis; a conceit. But there are structures that can get you to synthesis more efficiently. PhD programs try: they break you down to allow you to build. They do it too quickly, perhaps; you probably have to do it again in your next phase, academia or industry, though I imagine it’s easier the second time around. Some corporate hierarchies also manage to do this: when you become Staff Engineer, you become the prophet.
Of course, synthesis is not inexorable; you can stop turning the crank anywhere. Perhaps you never move from ideal to real. Perhaps, unmoored, you drift, painter rippling the waters. But what do you do when the crank comes around? Where to next?
Anyway, all this is to say that I have lately been backing away from bashfulness in a professional context: there are some perspectives that I see that can’t be seen or expressed by others. It feels very strange to write it, but I am even trying to avoid self-deprecation and hedging; true, I might not possess the authoritative truth on, I don’t know, WebAssembly, or Scheme language development, but nobody else does either, and I might as well just say what I think as if it’s true.
Getting old is not so bad. You say very cheesy things, you feel cheesy, but it is a kind of new youth too, reclaiming a birthday-right of being earnest. I am doubling down on Dad energy. (Yes, there is a similar kind of known-tenuous confidence necessary to raise kids. I probably would have been forced into this position earlier if I had kids younger. But, I don’t mean to take the metaphor fa(r)ther; responsible community care for the young is by far not the sole province of the family man.)
So, for the near future, I embrace the cheese. And then, where to? I suspect excessive smarm. But if I manage to succeed in avoiding that, I look forward to writing about ignorance in another 5 years. Until then, happy hacking to all, and thank you for your forbearance!
Staff seat board member and senior sysadmin Ian Kelling shares his personal musings on the board process improvements, his experience working at the Free Software Foundation (FSF), why Service as a Software Substitute (SaaSS) should get more attention, some lessons learned from the GNU Tools Cauldron, FSF's legal defense of GCC, and why the FSF needs your financial support.
Starting with version 2.33, the GNU C library (glibc) grew the capability to search for shared libraries using additional paths, based on the hardware capabilities of the machine running the code. This was a great boon for x86_64, which was first released in 2003, and has seen many changes in the capabilities of the hardware since then. While it is extremely common for Linux distributions to compile for a baseline which encompasses all of an architecture, there is performance being left on the table by targeting such an old specification and not one of the newer revisions.
One option used internally in glibc and in some other performance-critical libraries is indirect functions, or IFUNCs (see also here). The loader, ld.so, uses them to pick function implementations optimized for the available CPU at load time. GCC's functional multi-versioning (FMV, https://gcc.gnu.org/wiki/FunctionMultiVersioning) generates several optimized versions of functions, using the IFUNC mechanism so the appropriate one is selected at load time. Most performance-sensitive libraries use these strategies, but not all of them do.
With the --tune package transformation option, Guix implements so-called package multi-versioning, which creates package variants using compiler flags set to use optimizations targeted for a specific CPU.
Finally - and we're getting to the central topic of this post! - glibc since version 2.33 supports another approach: ld.so would search not just the /lib folder, but also the glibc-hwcaps folders, which for x86_64 included /lib/glibc-hwcaps/x86-64-v2, /lib/glibc-hwcaps/x86-64-v3 and /lib/glibc-hwcaps/x86-64-v4, corresponding to the psABI micro-architectures of the x86_64 architecture. This means that if a library was compiled against the baseline of the architecture then it should be installed in /lib, but if it were compiled a second time, this time using (depending on the build instructions) -march=x86-64-v2, then the libraries could be installed in /lib/glibc-hwcaps/x86-64-v2 and then glibc, using ld.so, would choose the correct library at runtime.
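The preference order described above can be sketched in C. This is only an illustration, not how ld.so is implemented: the helper name hwcaps_search_order, the buffer sizes, and the idea that the caller already knows the highest psABI level the CPU supports are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define MAX_DIRS 4
#define DIR_LEN 64

/* Hypothetical helper: fill `out` with the directories to try, most
   specific first.  `level` is the highest x86-64 psABI level the CPU
   supports (1 = baseline, up to 4).  Returns the number of entries. */
static int hwcaps_search_order(const char *libdir, int level,
                               char out[MAX_DIRS][DIR_LEN]) {
  assert(libdir != NULL);
  int n = 0;
  if (level > 4)
    level = 4;
  /* Optimized variants first: v4, then v3, then v2. */
  for (int v = level; v >= 2; v--)
    snprintf(out[n++], DIR_LEN, "%s/glibc-hwcaps/x86-64-v%d", libdir, v);
  /* The baseline library directory is always the last resort. */
  snprintf(out[n++], DIR_LEN, "%s", libdir);
  return n;
}
```

For a CPU that supports x86-64-v3, calling hwcaps_search_order("/lib", 3, dirs) yields the v3 directory, then the v2 directory, then /lib itself, so a library that ships only a v2 variant still gets picked ahead of the baseline build.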
These micro-architectures aren't a perfect match for the different hardware available; it is often the case that a particular CPU satisfies the requirements of one tier and part of the next. Such a CPU can only use the optimizations provided by the first tier, and not those for the additional features it also supports.
This of course shouldn't be a problem in Guix; it's possible, and even encouraged, to adjust packages to be more useful for one's needs. The problem comes from the search paths: ld.so will only search for the glibc-hwcaps directory if it has already found the base library in the preceding /lib directory. This isn't a problem for distributions following the Filesystem Hierarchy Standard (FHS), but for Guix we will need to ensure that all the different versions of the library will be in the same output.
With a little bit of planning this turns out to not be as hard as it sounds. Let's take, for example, the GNU Scientific Library, gsl, a math library which helps with all sorts of numerical analysis. First we create a procedure to generate our 3 additional packages, corresponding to the psABIs that are searched for in the glibc-hwcaps directory.
(define (gsl-hwabi psabi)
  (package/inherit gsl
    (name (string-append "gsl-" psabi))
    (arguments
     (substitute-keyword-arguments (package-arguments gsl)
       ((#:make-flags flags #~'())
        #~(append (list (string-append "CFLAGS=-march=" #$psabi)
                        (string-append "CXXFLAGS=-march=" #$psabi))
                  #$flags))
       ((#:configure-flags flags #~'())
        #~(append (list (string-append "--libdir=" #$output
                                       "/lib/glibc-hwcaps/" #$psabi))
                  #$flags))
       ;; The building machine can't necessarily run the code produced.
       ((#:tests? _ #t) #f)
       ((#:phases phases #~%standard-phases)
        #~(modify-phases #$phases
            (add-after 'install 'remove-extra-files
              (lambda _
                (for-each (lambda (dir)
                            (delete-file-recursively (string-append #$output dir)))
                          (list (string-append "/lib/glibc-hwcaps/" #$psabi "/pkgconfig")
                                "/bin" "/include" "/share"))))))))
    (supported-systems '("x86_64-linux" "powerpc64le-linux"))
    (properties `((hidden? . #t)
                  (tunable? . #f)))))
We remove some directories and any binaries since we only want the libraries produced from the package; we want to use the headers and any other bits from the main package. We then combine all of the pieces together to produce a package which can take advantage of the hardware on which it is run:
(define-public gsl-hwcaps
  (package/inherit gsl
    (name "gsl-hwcaps")
    (arguments
     (substitute-keyword-arguments (package-arguments gsl)
       ((#:phases phases #~%standard-phases)
        #~(modify-phases #$phases
            (add-after 'install 'install-optimized-libraries
              (lambda* (#:key inputs outputs #:allow-other-keys)
                (let ((hwcaps "/lib/glibc-hwcaps/"))
                  ;; Copy each psABI-specific library directory into this output.
                  (for-each
                   (lambda (psabi)
                     (copy-recursively
                      (string-append (assoc-ref inputs (string-append "gsl-" psabi))
                                     hwcaps psabi)
                      (string-append #$output hwcaps psabi)))
                   '("x86-64-v2" "x86-64-v3" "x86-64-v4")))))))))
    (native-inputs
     (modify-inputs (package-native-inputs gsl)
       (append (gsl-hwabi "x86-64-v2")
               (gsl-hwabi "x86-64-v3")
               (gsl-hwabi "x86-64-v4"))))
    (supported-systems '("x86_64-linux"))
    (properties `((tunable? . #f)))))
In this case the size of the final package is increased by about 13 MiB, from 5.5 MiB to 18 MiB. It is up to you if the speed-up from providing an optimized library is worth the size trade-off.
To use this package as a replacement build input in a package, package-input-rewriting/spec is a handy tool:
(define use-glibc-hwcaps
  (package-input-rewriting/spec
   ;; Replace some packages with custom variants built with
   ;; glibc-hwcaps support.
   `(("gsl" . ,(const gsl-hwcaps)))))

(define-public inkscape-with-hwcaps
  (package
    (inherit (use-glibc-hwcaps inkscape))
    (name "inkscape-with-hwcaps")))
Of the Guix supported architectures, x86_64-linux and powerpc64le-linux can both benefit from this new capability.
Through the magic of newer versions of GCC and LLVM it is safe to use these libraries in place of the standard libraries while compiling packages; these compilers know about the glibc-hwcaps directories and will purposefully link against the base library during build time, with glibc's ld.so choosing the optimized library at runtime.
One possible use case for these libraries is creating guix packs of packages to run on other systems. By substituting these libraries it becomes possible to create a guix pack which will have better performance than a standard package used in a guix pack. This works even when the included libraries don't make use of the IFUNCs from glibc or functional multi-versioning from GCC. Providing optimized yet portable pre-compiled binaries is a great way to take advantage of this feature.
https://www.foxnews.com/tech/another-home-thermostat-found-vulnerable-to-attack
A network cable connection to any thermostat is still a safer and overall less expensive long term choice.
GNU cpio version 2.15 is available for download. This is a bug-fixing release. Short summary of changes:
Join the FSF and friends on Friday, January 19, from 12:00 to 15:00 EST (17:00 to 20:00 UTC) to help improve the Free Software Directory.
Today, a tiny tale: about 15 years ago I was working on Guile’s macro expander. Guile inherited this code from an early version of Kent Dybvig’s portable syntax expander. It was... not easy to work with.
Some difficulties were essential. Scope is tricky, after all.
Some difficulties were incidental, but deep. The expander is ultimately a function that translates Scheme-with-macros to Scheme-without-macros. However, it is itself written in Scheme-with-macros, so to load it on a substrate without macros requires a pre-expanded copy of itself, whose data representations need to be compatible with any incremental change, so that you will be able to use the new expander to produce a fresh pre-expansion. This difficulty could have been avoided by incrementally bootstrapping the library. It works once you are used to it, but it’s gnarly.
But then, some difficulties were just superfluously egregious. Dybvig is a totemic developer and researcher, but a generation or two removed from me, and when I was younger, it never occurred to me to just email him to ask why things were this way. (A tip to the reader: if someone is doing work you are interested in, you can just email them. Probably they write you back! If they don’t respond, it’s not you, they’re probably just busy and their inbox leaks.) Anyway in my totally speculatory reconstruction of events, when Dybvig goes to submit his algorithm for publication, he gets annoyed that “expand” doesn’t sound fancy enough. In a way it’s similar to the original SSA developers thinking that “phony functions” wouldn’t get published.
So Dybvig calls the expansion function “χ”, because the Greek chi looks like the X in “expand”. Fine for the paper, whatever paper that might be, but then in psyntax, there are all these functions named chi and chi-lambda and all sorts of nonsense.
In early years I was often confused by these names; I wasn’t in on the pun, and I didn’t feel like I had enough responsibility for this code to think what the name should be. I finally broke down and changed all instances of “chi” to “expand” back in 2011, and never looked back.
Anyway, this is a story with a very specific moral: don’t name your functions chi.
GNU anubis version 4.3 is available for download. This is a maintenance release, including some new features:
Used in CONTROL section, this boolean statement enables or disables the use of the Pluggable Authentication Module interface for accounting and session management.
Sets the name of the file with shared keys used for decrypting replies from the auth service. It is used in traditional mode if anubis receives an encrypted response from the client's identd server (e.g. if they are running pidentd with encryption).
GNU mailutils version 3.17 is available for download. This is a maintenance release, including some new features:
If not explicitly specified, the TLS mode to use (ondemand, connect, etc.) is derived from the configured port. E.g., for imap4d, port 143 implies ondemand mode, and port 993 implies connection mode.
The global tls-mode setting is used only when the mode cannot be determined otherwise, i.e. neither per-server tls-mode is given nor the port gives any clues as to the TLS mode to use.
Check out the important work our volunteers accomplished at today's Free Software Directory (FSD) IRC meeting.
A remembered set is used by a garbage collector to identify graph edges between partitioned sub-spaces of a heap. The canonical example is in generational collection, where you allocate new objects in newspace, and eventually promote survivor objects to oldspace. If most objects die young, we can focus GC effort on newspace, to avoid traversing all of oldspace all the time.
Collecting a subspace instead of the whole heap is sound if and only if we can identify all live objects in the subspace. We start with some set of roots that point into the subspace from outside, and then traverse all links in those objects, but only to other objects within the subspace.
The roots are, like, global variables, and the stack, and registers; and in the case of a partial collection in which we identify live objects only within newspace, also any link into newspace from other spaces (oldspace, in our case). This set of inbound links is a remembered set.
There are a few strategies for maintaining a remembered set. Generally speaking, you start by implementing a write barrier that intercepts all stores in a program. Instead of:
obj[slot] := val;
You might abstract this away:
write_slot(obj, sizeof obj, &obj[slot], val);
As you can see, it’s quite an annoying transformation to do by hand; typically you will want some sort of language-level abstraction that lets you keep the more natural syntax. C++ can do this pretty well, or if you are implementing a compiler, you just add this logic to the code generator.
Then the actual write barrier... well, its implementation is twingled up with implementation of the remembered set. The simplest variant is a card-marking scheme, whereby the heap is divided into equal-sized power-of-two-sized cards, and each card has a bit. If the heap is also divided into blocks (say, 2 MB in size), then you might divide those blocks into 256-byte cards, yielding 8192 cards per block. A barrier might look like this:
#include <stddef.h>
#include <stdint.h>

// Assumed representations: an object reference is a raw address,
// and a slot is a pointer to a field holding such a reference.
typedef uintptr_t ObjRef;
typedef ObjRef *SlotAddr;

void write_slot(ObjRef obj, size_t size,
                SlotAddr slot, ObjRef val) {
  *slot = val; // Start with the store.

  uintptr_t block_size = 1<<21;
  uintptr_t card_size = 1<<8;
  uintptr_t cards_per_block = block_size / card_size;

  uintptr_t obj_addr = obj;
  uintptr_t card_idx = (obj_addr / card_size) % cards_per_block;

  // Assume remset allocated at block start.
  uintptr_t block_start = obj_addr & ~(block_size-1);
  uint32_t *cards = (uint32_t *) block_start;

  // Set the bit.
  cards[card_idx / 32] |= UINT32_C(1) << (card_idx % 32);
}
Then when marking the new generation, you visit all cards, and for all marked cards, trace all outbound links in all live objects that begin on the card.
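To make the marking and the later scan concrete, here is a minimal sketch of a card table for one block, using the same 2 MB block and 256-byte card sizes. It works on byte offsets within the block rather than real addresses, and keeps the bits in a separate struct instead of at the block start. The names CardTable, card_mark and card_table_visit are invented for this example and are not Whippet's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
  BLOCK_SIZE = 1 << 21,                      /* 2 MB block */
  CARD_SIZE = 1 << 8,                        /* 256-byte cards */
  CARDS_PER_BLOCK = BLOCK_SIZE / CARD_SIZE   /* 8192 */
};

/* One bit per card; 8192 bits is 1 kB of metadata per block. */
typedef struct {
  uint32_t bits[CARDS_PER_BLOCK / 32];
} CardTable;

static void card_table_clear(CardTable *t) {
  memset(t->bits, 0, sizeof t->bits);
}

/* The barrier's bookkeeping: record a store to the object that
   starts at byte `offset` within the block. */
static void card_mark(CardTable *t, size_t offset) {
  assert(offset < BLOCK_SIZE);
  size_t card = offset / CARD_SIZE;
  t->bits[card / 32] |= UINT32_C(1) << (card % 32);
}

typedef void (*CardVisitor)(size_t card_idx, void *data);

/* Visit every marked card, skipping 32 clean cards per word test.
   `visit` may be NULL.  Returns the number of marked cards. */
static size_t card_table_visit(const CardTable *t, CardVisitor visit,
                               void *data) {
  size_t marked = 0;
  for (size_t w = 0; w < CARDS_PER_BLOCK / 32; w++) {
    uint32_t word = t->bits[w];
    if (word == 0)
      continue;
    for (unsigned b = 0; b < 32; b++) {
      if (word & (UINT32_C(1) << b)) {
        if (visit)
          visit(w * 32 + b, data);
        marked++;
      }
    }
  }
  return marked;
}
```

Marking offsets 0 and 255 sets the same bit, which is the imprecision at issue: the collector learns that something on the card changed, but has to re-trace every object that begins there to find out what.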
Card-marking is simple to implement and simple to statically allocate as part of the heap. Finding marked cards takes time proportional to the size of the heap, but you hope that the constant factors and SIMD minimize this cost. However, iterating over objects within a card can be costly. You hope that there are few old-to-new links but what do you know?
In Whippet I have been struggling a bit with sticky-mark-bit generational marking, in which new and old objects are not spatially partitioned. Sometimes generational collection is a win, but in benchmarking I find that often it isn’t, and I think Whippet’s card-marking barrier is at fault: it is simply too imprecise. Consider firstly that our write barrier applies to stores to slots in all objects, not just those in oldspace; a store to a new object will mark a card, but that card may contain old objects which would then be re-scanned. Or consider a store to an old object in a more dense part of oldspace; scanning the card may incur more work than needed. It could also be that Whippet is being too aggressive at re-using blocks for new allocations, where it should be limiting itself to blocks that are very sparsely populated with old objects.
There is a tradeoff in write barriers between the overhead imposed on stores, the size of the remembered set, and the precision of the remembered set. Card-marking is relatively low-overhead and usually small as a fraction of the heap, but not very precise. It would be better if a remembered set recorded objects, not cards. And it would be even better if it recorded slots in objects, not just objects.
V8 takes this latter strategy: it has per-block remembered sets which record slots containing “interesting” links. All of the above words were to get here, to take a brief look at its remembered set.
The main operation is RememberedSet::Insert. It takes the MemoryChunk (a block, in our language from above) and the address of a slot in the block. Each block has a remembered set; in fact, six remembered sets for some reason. The remembered set itself is a SlotSet, whose interesting operations come from BasicSlotSet.
The structure of a slot set is a bitvector partitioned into equal-sized, possibly-empty buckets. There is one bit per slot in the block, so in the limit the size overhead for the remembered set may be 3% (1/32, assuming compressed pointers). Currently each bucket is 1024 bits (128 bytes), plus the 4 bytes for the bucket pointer itself.
Inserting into the slot set will first allocate a bucket (using C++ new) if needed, then load the “cell” (32-bit integer) containing the slot. There is a template parameter declaring whether this is an atomic or normal load. Finally, if the slot bit in the cell is not yet set, V8 will set the bit, possibly using atomic compare-and-swap.
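The shape of that structure can be sketched in plain C. This is not V8's code: the 2 MB block size is borrowed from the card-marking example above, the 4-byte slot size assumes compressed pointers, all names are invented for the illustration, and the sketch uses ordinary loads and stores where V8 may use atomic ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum {
  SS_BLOCK_SIZE = 1 << 21,       /* assumed block size */
  SS_SLOT_SIZE = 4,              /* assumed: compressed pointers */
  SS_BITS_PER_CELL = 32,
  SS_CELLS_PER_BUCKET = 32,      /* 32 cells * 32 bits = 1024 bits */
  SS_BITS_PER_BUCKET = SS_BITS_PER_CELL * SS_CELLS_PER_BUCKET,
  SS_SLOTS = SS_BLOCK_SIZE / SS_SLOT_SIZE,
  SS_BUCKETS = SS_SLOTS / SS_BITS_PER_BUCKET
};

/* A bitvector with one bit per slot, split into buckets that are
   allocated only when a slot inside them is first recorded. */
typedef struct {
  uint32_t *buckets[SS_BUCKETS]; /* NULL means "no slots recorded" */
} SlotSet;

/* Record the slot at byte `offset` in the block.  Returns true if
   the slot was newly added, false if it was already present. */
static bool slot_set_insert(SlotSet *set, size_t offset) {
  assert(offset < SS_BLOCK_SIZE && offset % SS_SLOT_SIZE == 0);
  size_t slot = offset / SS_SLOT_SIZE;
  size_t bucket = slot / SS_BITS_PER_BUCKET;
  size_t cell = (slot % SS_BITS_PER_BUCKET) / SS_BITS_PER_CELL;
  uint32_t mask = UINT32_C(1) << (slot % SS_BITS_PER_CELL);

  if (set->buckets[bucket] == NULL) {
    set->buckets[bucket] = calloc(SS_CELLS_PER_BUCKET, sizeof(uint32_t));
    if (set->buckets[bucket] == NULL)
      abort();
  }
  uint32_t *c = &set->buckets[bucket][cell];
  if (*c & mask)
    return false;
  *c |= mask;
  return true;
}

static bool slot_set_contains(const SlotSet *set, size_t offset) {
  size_t slot = offset / SS_SLOT_SIZE;
  const uint32_t *b = set->buckets[slot / SS_BITS_PER_BUCKET];
  if (b == NULL)
    return false;
  size_t cell = (slot % SS_BITS_PER_BUCKET) / SS_BITS_PER_CELL;
  return (b[cell] >> (slot % SS_BITS_PER_CELL)) & 1;
}

static size_t slot_set_allocated_buckets(const SlotSet *set) {
  size_t n = 0;
  for (size_t i = 0; i < SS_BUCKETS; i++)
    n += set->buckets[i] != NULL;
  return n;
}

static void slot_set_free(SlotSet *set) {
  for (size_t i = 0; i < SS_BUCKETS; i++) {
    free(set->buckets[i]);
    set->buckets[i] = NULL;
  }
}
```

With these assumed sizes a block has 512 possible buckets of 128 bytes each, so a fully populated set costs 64 kB against a 2 MB block, which is the 1/32 overhead mentioned above. A block with few interesting slots pays only for the buckets it touches.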
In the language of Blackburn’s Design and analysis of field-logging write barriers, I believe this is a field-logging barrier, rather than the bit-stealing slot barrier described by Yang et al in the 2012 Barriers Reconsidered, Friendlier Still!. Unlike Blackburn’s field-logging barrier, however, this remembered set is implemented completely on the side: there is no in-object remembered bit, nor remembered bits for the fields.
On the one hand, V8’s remembered sets are precise. There are some tradeoffs, though: they require off-managed-heap dynamic allocation for the buckets, and traversing the remembered sets takes time proportional to the whole heap size. And, should V8 ever switch its minor mark-sweep generational collector to use sticky mark bits, the lack of a spatial partition could lead to similar problems as I am seeing in Whippet. I will be interested to see what they come up with in this regard.
Well, that’s all for today. Happy hacking in the new year!
Check out the important work our volunteers accomplished at today's Free Software Directory (FSD) IRC meeting.
The dates and location of LibrePlanet 2024 have been announced!
I'm very pleased to announce the release of a new version of GNU PSPP. PSPP is a program for statistical analysis of sampled data. It is a free replacement for the proprietary program SPSS.
Changes from 1.6.2-pre2 to 2.0.0:
Please send PSPP bug reports to bug-gnu-pspp@gnu.org.