#gnu

federatica_bot@federatica.space

gnulib @ Savannah: GNU gnulib: gnulib-tool has become much faster

If you are developer on a package that uses GNU gnulib as part of its build system:

gnulib-tool has been known for being slow for many years. We have listened to your complaints. We have rewritten gnulib-tool in another programming language (Python). It is between 8 times and 100 times faster than the previous implementation.

Both implementations behave identically, that is, produce the same generated files and the same output. Nothing changes in your way to use Gnulib; it's only faster.

In order to reap the new speed:

1. Make sure you have Python (version 3.7 or newer) installed on your machine.

2. Update your gnulib checkout. (For some packages, it comes as a git submodule named 'gnulib'.) Like this:

$ git checkout master

$ git pull

Set the environment variable GNULIB_SRCDIR, pointing to this checkout.

If the package is using a git submodule named 'gnulib', it is also advisable to do

$ git commit -m 'build: Update gnulib submodule to latest.' gnulib

(as a preparation for step 4, because the --no-git option does not work as expected in all variants of 'bootstrap').

3. Clean the built files of your package:

$ make -k distclean

4. Regenerate the fetched and generated files of your package. Depending on the package, this may be a command such as

$ ./bootstrap --no-git --gnulib-srcdir=$GNULIB_SRCDIR

or

$ export GNULIB_SRCDIR; ./autopull.sh; ./autogen.sh

or, if no such script is available:

$ $GNULIB_SRCDIR/gnulib-tool --update

5. Continue with

$ ./configure

$ make

as usual.

Enjoy! The rewritten gnulib-tool was implemented by Dmitry Selyutin, Collin Funk, and me.

#gnu #gnuorg #opensource

mkwadee@diasp.eu

I can finally say I've upgraded successfully to #Fedora40. It was not without hassle this time and it started with what seemed to be a system that did not even give me a prompt after #rebooting, although the update process had seemed to go smoothly and quickly. Luckily, the virtual screens were working and so I could #login to a #shell. Although #sddm didn't seem to be working, #kdm was and so I was able to open a desktop session, but only in #Gnome. There were still problems running #kde #programs such as #konsole and #korganizer as some #qt #libraries were missing. Installing those allowed the programs to run. Also, no #audio devices were being detected and so I couldn't play any clips, etc.

Today I found that #kwin was also lacking a library (#qtsensors) and so after that was installed and I rebooted the machine, I found that sddm was working again and so I could log in to the #kde session that I usually do. Also, the #sound issue was solved by removing the directory #wireplumber from ~/.local/state.

#GNU #Linux #Fedora #F40 #FreeSoftware

federatica_bot@federatica.space

gnulib @ Savannah: GNU gnulib: calling for beta-testers

If you are developer on a package that uses GNU gnulib as part of its build system:

gnulib-tool has been known for being slow for many years. We have listened to your complaints. A rewrite of gnulib-tool in another programming language (Python) is ready for beta-testing. It is between 8 times and 100 times faster than the original gnulib-tool.

Both implementations should behave identically, that is, produce the same generated files and the same output. You can help us ensure this, through the following steps:

1. Make sure you have Python (version 3.7 or newer) installed on your machine.

2. Update your gnulib checkout. (For some packages, it comes as a git submodule named 'gnulib'.) Like this:

$ git checkout master

$ git pull

Set the environment variable GNULIB_SRCDIR, pointing to this checkout.

If the package is using a git submodule named 'gnulib', it is also advisable to do

$ git commit -m 'build: Update gnulib submodule to latest.' gnulib

(as a preparation for step 5, because the --no-git option does not work as expected in all variants of 'bootstrap').

3. Set an environment variable that enables checking that the two implementations behave the same:

$ export GNULIB_TOOL_IMPL=sh+py

4. Clean the built files of your package:

$ make -k distclean

5. Regenerate the fetched and generated files of your package. Depending on the package, this may be a command such as

$ ./bootstrap --no-git --gnulib-srcdir=$GNULIB_SRCDIR

or

$ export GNULIB_SRCDIR; ./autopull.sh; ./autogen.sh

or, if no such script is available:

$ $GNULIB_SRCDIR/gnulib-tool --update

If there is a failure, due to differences between the 'sh' and 'py' results, please report it to bug-gnulib@gnu.org.

6. If this invocation was successful, you can trust the rewritten gnulib-tool and use it from now on, by setting the environment variable

$ export GNULIB_TOOL_IMPL=py

7. Continue with

$ ./configure

$ make

as usual.

And enjoy the speed! The rewritten gnulib-tool was implemented by Dmitry Selyutin, Collin Funk, and me.

#gnu #gnuorg #opensource

federatica_bot@federatica.space

parallel @ Savannah: GNU Parallel 20240422 ('Børsen') [stable]

GNU Parallel 20240422 ('Børsen') has been released. It is available for download at: lbry://@GnuParallel:4

Quote of the month:

I’m a big fan of GNU parallel!

-- Scott Cain @scottjcain@twitter

New in this release:

  • Bug fixes and man page updates.

GNU Parallel - For people who live life in the parallel lane.

If you like GNU Parallel record a video testimonial: Say who you are, what you use GNU Parallel for, how it helps you, and what you like most about it. Include a command that uses GNU Parallel if you feel like it.

About GNU Parallel

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU Parallel can then split the input and pipe it into commands in parallel.

If you use xargs and tee today you will find GNU Parallel very easy to use as GNU Parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU Parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU Parallel can even replace nested loops.

GNU Parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU Parallel as input for other programs.

For example you can run this to convert all jpeg files into png and gif files and have a progress bar:

parallel --bar convert {1} {1.}.{2} ::: *.jpg ::: png gif

Or you can generate big, medium, and small thumbnails of all jpeg files in sub dirs:

find . -name '*.jpg' |

parallel convert -geometry {2} {1} {1//}/thumb{2}_{1/} :::: - ::: 50 100 200

You can find more about GNU Parallel at: http://www.gnu.org/s/parallel/

You can install GNU Parallel in just 10 seconds with:

$ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \

fetch -o - http://pi.dk/3 ) > install.sh

$ sha1sum install.sh | grep 883c667e01eed62f975ad28b6d50e22a

12345678 883c667e 01eed62f 975ad28b 6d50e22a

$ md5sum install.sh | grep cc21b4c943fd03e93ae1ae49e28573c0

cc21b4c9 43fd03e9 3ae1ae49 e28573c0

$ sha512sum install.sh | grep ec113b49a54e705f86d51e784ebced224fdff3f52

79945d9d 250b42a4 2067bb00 99da012e c113b49a 54e705f8 6d51e784 ebced224

fdff3f52 ca588d64 e75f6033 61bd543f d631f592 2f87ceb2 ab034149 6df84a35

$ bash install.sh

Watch the intro video on http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.

When using programs that use GNU Parallel to process data for publication please cite:

O. Tange (2018): GNU Parallel 2018, March 2018, https://doi.org/10.5281/zenodo.1146014.

If you like GNU Parallel:

  • Give a demo at your local user group/team/colleagues
  • Post the intro videos on Reddit/Diaspora*/forums/blogs/ Identi.ca/Google+/Twitter/Facebook/Linkedin/mailing lists
  • Get the merchandise https://gnuparallel.threadless.com/designs/gnu-parallel
  • Request or write a review for your favourite blog or magazine
  • Request or build a package for your favourite distribution (if it is not already there)
  • Invite me for your next conference

If you use programs that use GNU Parallel for research:

  • Please cite GNU Parallel in you publications (use --citation)

If GNU Parallel saves you money:

About GNU SQL

GNU sql aims to give a simple, unified interface for accessing databases through all the different databases' command line clients. So far the focus has been on giving a common way to specify login information (protocol, username, password, hostname, and port number), size (database and table size), and running queries.

The database is addressed using a DBURL. If commands are left out you will get that database's interactive shell.

When using GNU SQL for a publication please cite:

O. Tange (2011): GNU SQL - A Command Line Tool for Accessing Different Databases Using DBURLs, ;login: The USENIX Magazine, April 2011:29-32.

About GNU Niceload

GNU niceload slows down a program when the computer load average (or other system activity) is above a certain limit. When the limit is reached the program will be suspended for some time. If the limit is a soft limit the program will be allowed to run for short amounts of time before being suspended again. If the limit is a hard limit the program will only be allowed to run when the system is below the limit.

#gnu #gnuorg #opensource

federatica_bot@federatica.space

www-zh-cn @ Savannah: It is easy to contribute to GNU

I will be delivering my talk, "It is easy to contribute to GNU," Saturday, May 4, 2024, 12:15--13:00 EDT (16:00 UTC), at the LibrePlanet 2024 conference, and I hope you’ll check it out!

LibrePlanet is a conference about software freedom, happening on May 4 & 5, 2024. The event is hosted by the Free Software Foundation (FSF), and brings together software developers, law and policy experts, activists, students, and computer users to learn skills, celebrate free software accomplishments, and face upcoming challenges. Newcomers are always welcome, and LibrePlanet 2024 will feature programming for all ages and experience levels.

Please register in advance at <https://libreplanet.org/2024/>.

wxie

#gnu #gnuorg #opensource

linuxmao.org@diaspora-fr.org

Éditorial d'avril 2024

#art #art_libre #artiste #artlibre #cc-by-sa #chanson #copyleft #creative-commons #creative_commons #creativecommons #culture #culture-libre #culture_libre #culturelibre #francophone #français #gnu #gnu-linux #gnulinux #gpl #informatique-musicale #informatique_musicale #informatiquemusicale #libre #libre-art #linux #linux-mao #linux_mao #linuxaudio #linuxmao #logiciel-libre #logiciel_libre #logiciellibre #mao #mao-linux #mao_linux #maolinux #musicien #musiciens #musique #musique-libre #musique_libre #numerique #productionmusicale

Comment tourner la page ? Quand renoncer à la musique ? Est-ce seulement possible ?

Je pense souvent à l’un de mes amis, violoniste, qui a pris récemment sa retraite, après une carrière bien remplie, brillante et internationale, débutée fin 60’s début 70’s. Une cinquantaine d’années de bons et loyaux services, la mentonnière sous les joues.

Son petit étui ne le quitte pourtant jamais et, pour peu que l’un de nous se pose devant un piano ou empoigne une guitare, l’instrument sort immédiatement de sa boite et les heures passent, enluminées de rythmes et de mélodies improvisés.

Alors pourquoi mettre un terme à une aussi flamboyante carrière dont le dernier épisode s’est joué au sein d’un groupe d’Afrique du Sud basé à Londres ?

Parce qu’il ne veut plus voyager. Avions, cars, trains, autos, terminé ! Passer le restant de ses jours peinard, dans sa petite maison sous le soleil du Sud Ouest. Rencontrer ses potes trop longtemps négligés, plaisanter, improviser des apéros, laisser les heures s’étirer nonchalamment, enfin.

Mais, encore une fois, l’étui reste à portée de main, l’archet et le violon surgissant hors de leur boîte à la moindre sollicitation.

Bien sûr, nous ne menons pas tous ce type de carrière, certains en rêvent, d’autres y renoncent sans même tenter le coup, beaucoup pratiquent sans espérer en vivre et quelques uns misent tout sur une hypothétique réussite, sacrifiant toute forme de vie intime dans cette quête.

Quoi qu’il en soit, nos existences sont toutes tissées de rythmes et de mélodies, d’airs et de paroles qui ne nous quitteront jamais, dans le bonheur ou les chagrins, les joies ou les peines.

Alors, la réponse aux questions initiales :

Eh bien non, nous ne renoncerons jamais !

federatica_bot@federatica.space

Simon Josefsson: Reproducible and minimal source-only tarballs

With the release of Libntlm version 1.8 the release tarball can be reproduced on several distributions. We also publish a signed minimal source-only tarball, produced by git-archive which is the same format used by Savannah, Codeberg, GitLab, GitHub and others. Reproducibility of both tarballs are tested continuously for regressions on GitLab through a CI/CD pipeline. If that wasn’t enough to excite you, the Debian packages of Libntlm are now built from the reproducible minimal source-only tarball. The resulting binaries are hopefully reproducible on several architectures.

What does that even mean? Why should you care? How you can do the same for your project? What are the open issues? Read on, dear reader…

This article describes my practical experiments with reproducible release artifacts, following up on my earlier thoughts that lead to discussion on Fosstodon and a patch by Janneke Nieuwenhuizen to make Guix tarballs reproducible that inspired me to some practical work.

Let’s look at how a maintainer release some software, and how a user can reproduce the released artifacts from the source code. Libntlm provides a shared library written in C and uses GNU Make, GNU Autoconf, GNU Automake, GNU Libtool and gnulib for build management, but these ideas should apply to most project and build system. The following illustrate the steps a maintainer would take to prepare a release:

git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
./bootstrap
./configure
make distcheck
gpg -b libntlm-1.8.tar.gz

The generated files libntlm-1.8.tar.gz and libntlm-1.8.tar.gz.sig are published, and users download and use them. This is how the GNU project have been doing releases since the late 1980’s. That is a testament to how successful this pattern has been! These tarballs contain source code and some generated files, typically shell scripts generated by autoconf, makefile templates generated by automake, documentation in formats like Info, HTML, or PDF. Rarely do they contain binary object code, but historically that happened.

The XZUtils incident illustrate that tarballs with files that are not included in the git archive offer an opportunity to disguise malicious backdoors. I blogged earlier how to mitigate this risk by using signed minimal source-only tarballs.

The risk of hiding malware is not the only motivation to publish signed minimal source-only tarballs. With pre-generated content in tarballs, there is a risk that GNU/Linux distributions such as Trisquel, Guix, Debian/Ubuntu or Fedora ship generated files coming from the tarball into the binary *.deb or *.rpm package file. Typically the person packaging the upstream project never realized that some installed artifacts was not re-built through a typical autoconf -fi && ./configure && make install sequence, and never wrote the code to rebuild everything. This can also happen if the build rules are written but are buggy, shipping the old artifact. When a security problem is found, this can lead to time-consuming situations, as it may be that patching the relevant source code and rebuilding the package is not sufficient: the vulnerable generated object from the tarball would be shipped into the binary package instead of a rebuilt artifact. For architecture-specific binaries this rarely happens, since object code is usually not included in tarballs — although for 10+ years I shipped the binary Java JAR file in the GNU Libidn release tarball, until I stopped shipping it. For interpreted languages and especially for generated content such as HTML, PDF, shell scripts this happens more than you would like.

Publishing minimal source-only tarballs enable easier auditing of a project’s code, to avoid the need to read through all generated files looking for malicious content. I have taken care to generate the source-only minimal tarball using [git-archive](https://git-scm.com/docs/git-archive). This is the same format that GitLab, GitHub etc offer for the automated download links on git tags. The minimal source-only tarballs can thus serve as a way to audit GitLab and GitHub download material! Consider if/when hosting sites like GitLab or GitHub has a security incident that cause generated tarballs to include a backdoor that is not present in the git repository. If people rely on the tag download artifact without verifying the maintainer PGP signature using GnuPG, this can lead to similar backdoor scenarios that we had for XZUtils but originated with the hosting provider instead of the release manager. This is even more concerning, since this attack can be mounted for some selected IP address that you want to target and not on everyone, thereby making it harder to discover.

With all that discussion and rationale out of the way, let’s return to the release process. I have added another step here:

make srcdist
gpg -b libntlm-1.8-src.tar.gz

Now the release is ready. I publish these four files in the Libntlm’s Savannah Download area, but they can be uploaded to a GitLab/GitHub release area as well. These are the SHA256 checksums I got after building the tarballs on my Trisquel 11 aramo laptop:

91de864224913b9493c7a6cec2890e6eded3610d34c3d983132823de348ec2ca  libntlm-1.8-src.tar.gz
ce6569a47a21173ba69c990965f73eb82d9a093eb871f935ab64ee13df47fda1  libntlm-1.8.tar.gz

So how can you reproduce my artifacts? Here is how to reproduce them in a Ubuntu 22.04 container:

podman run -it --rm ubuntu:22.04
apt-get update
apt-get install -y --no-install-recommends autoconf automake libtool make git ca-certificates
git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
./bootstrap
./configure
make dist srcdist
sha256sum libntlm-*.tar.gz

You should see the exact same SHA256 checksum values. Hooray!

This works because Trisquel 11 and Ubuntu 22.04 uses the same version of git, autoconf, automake, and libtool. These tools do not guarantee the same output content for all versions, similar to how GNU GCC does not generate the same binary output for all versions. So there is still some delicate version pairing needed.

Ideally, the artifacts should be possible to reproduce from the release artifacts themselves, and not only directly from git. It is possible to reproduce the full tarball in a AlmaLinux 8 container – replace almalinux:8 with rockylinux:8 if you prefer RockyLinux:

podman run -it --rm almalinux:8
dnf update -y
dnf install -y make wget gcc
wget https://download.savannah.nongnu.org/releases/libntlm/libntlm-1.8.tar.gz
tar xfa libntlm-1.8.tar.gz
cd libntlm-1.8
./configure
make dist
sha256sum libntlm-1.8.tar.gz

The source-only minimal tarball can be regenerated on Debian 11:

podman run -it --rm debian:11
apt-get update
apt-get install -y --no-install-recommends make git ca-certificates
git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
make -f cfg.mk srcdist
sha256sum libntlm-1.8-src.tar.gz 

As the Magnus Opus or chef-d’œuvre, let’s recreate the full tarball directly from the minimal source-only tarball on Trisquel 11 – replace docker.io/kpengboy/trisquel:11.0 with ubuntu:22.04 if you prefer.

podman run -it --rm docker.io/kpengboy/trisquel:11.0
apt-get update
apt-get install -y --no-install-recommends autoconf automake libtool make wget git ca-certificates
wget https://download.savannah.nongnu.org/releases/libntlm/libntlm-1.8-src.tar.gz
tar xfa libntlm-1.8-src.tar.gz
cd libntlm-v1.8
./bootstrap
./configure
make dist
sha256sum libntlm-1.8.tar.gz

Yay! You should now have great confidence in that the release artifacts correspond to what’s in version control and also to what the maintainer intended to release. Your remaining job is to audit the source code for vulnerabilities, including the source code of the dependencies used in the build. You no longer have to worry about auditing the release artifacts.

I find it somewhat amusing that the build infrastructure for Libntlm is now in a significantly better place than the code itself. Libntlm is written in old C style with plenty of string manipulation and uses broken cryptographic algorithms such as MD4 and single-DES. Remember folks: solving supply chain security issues has no bearing on what kind of code you eventually run. A clean gun can still shoot you in the foot.

Side note on naming: GitLab exports tarballs with pathnames libntlm-v1.8/ (i.e.., PROJECT-TAG/) and I’ve adopted the same pathnames, which means my libntlm-1.8-src.tar.gz tarballs are bit-by-bit identical to GitLab’s exports and you can verify this with tools like diffoscope. GitLab name the tarball libntlm-v1.8.tar.gz (i.e., PROJECT-TAG.ARCHIVE) which I find too similar to the libntlm-1.8.tar.gz that we also publish. GitHub uses the same git archive style, but unfortunately they have logic that removes the ‘v’ in the pathname so you will get a tarball with pathname libntlm-1.8/ instead of libntlm-v1.8/ that GitLab and I use. The content of the tarball is bit-by-bit identical, but the pathname and archive differs. Codeberg (running Forgejo) uses another approach: the tarball is called libntlm-v1.8.tar.gz (after the tag) just like GitLab, but the pathname inside the archive is libntlm/, otherwise the produced archive is bit-by-bit identical including timestamps. Savannah’s CGIT interface uses archive name libntlm-1.8.tar.gz with pathname libntlm-1.8/, but otherwise file content is identical. Savannah’s GitWeb interface provides snapshot links that are named after the git commit (e.g., libntlm-a812c2ca.tar.gz with libntlm-a812c2ca/) and I cannot find any tag-based download links at all. Overall, we are so close to get SHA256 checksum to match, but fail on pathname within the archive. I’ve chosen to be compatible with GitLab regarding the content of tarballs but not on archive naming. From a simplicity point of view, it would be nice if everyone used PROJECT-TAG.ARCHIVE for the archive filename and PROJECT-TAG/ for the pathname within the archive. This aspect will probably need more discussion.

Side note on git archive output: It seems different versions of git archive produce different results for the same repository. The version of git in Debian 11, Trisquel 11 and Ubuntu 22.04 behave the same. The version of git in Debian 12, AlmaLinux/RockyLinux 8/9, Alpine, ArchLinux, macOS homebrew, and upcoming Ubuntu 24.04 behave in another way. Hopefully this will not change that often, but this would invalidate reproducibility of these tarballs in the future, forcing you to use an old git release to reproduce the source-only tarball. Alas, GitLab and most other sites appears to be using modern git so the download tarballs from them would not match my tarballs – even though the content would.

Side note on ChangeLog: ChangeLog files were traditionally manually curated files with version history for a package. In recent years, several projects moved to dynamically generate them from git history (using tools like git2cl or gitlog-to-changelog). This has consequences for reproducibility of tarballs: you need to have the entire git history available! The gitlog-to-changelog tool also output different outputs depending on the time zone of the person using it, which arguable is a simple bug that can be fixed. However this entire approach is incompatible with rebuilding the full tarball from the minimal source-only tarball. It seems Libntlm’s ChangeLog file died on the surgery table here.

So how would a distribution build these minimal source-only tarballs? I happen to help on the libntlm package in Debian. It has historically used the generated tarballs as the source code to build from. This means that code coming from gnulib is vendored in the tarball. When a security problem is discovered in gnulib code, the security team needs to patch all packages that include that vendored code and rebuild them, instead of merely patching the gnulib package and rebuild all packages that rely on that particular code. To change this, the Debian libntlm package needs to Build-Depends on Debian’s gnulib package. But there was one problem: similar to most projects that use gnulib, Libntlm depend on a particular git commit of gnulib, and Debian only ship one commit. There is no coordination about which commit to use. I have adopted gnulib in Debian, and add a git bundle to the *_all.deb binary package so that projects that rely on gnulib can pick whatever commit they need. This allow an no-network GNULIB_URL and GNULIB_REVISION approach when running Libntlm’s ./bootstrap with the Debian gnulib package installed. Otherwise libntlm would pick up whatever latest version of gnulib that Debian happened to have in the gnulib package, which is not what the Libntlm maintainer intended to be used, and can lead to all sorts of version mismatches (and consequently security problems) over time. Libntlm in Debian is developed and tested on Salsa and there is continuous integration testing of it as well, thanks to the Salsa CI team.

Side note on git bundles: unfortunately there appears to be no reproducible way to export a git repository into one or more files. So one unfortunate consequence of all this work is that the gnulib *.orig.tar.gz tarball in Debian is not reproducible any more. I have tried to get Git bundles to be reproducible but I never got it to work — see my notes in gnulib’s debian/README.source on this aspect. Of course, source tarball reproducibility has nothing to do with binary reproducibility of gnulib in Debian itself, fortunately.

One open question is how to deal with the increased build dependencies that is triggered by this approach. Some people are surprised by this but I don’t see how to get around it: if you depend on source code for tools in another package to build your package, it is a bad idea to hide that dependency. We’ve done it for a long time through vendored code in non-minimal tarballs. Libntlm isn’t the most critical project from a bootstrapping perspective, so adding git and gnulib as Build-Depends to it will probably be fine. However, consider if this pattern was used for other packages that uses gnulib such as coreutils, gzip, tar, bison etc (all are using gnulib) then they would all Build-Depends on git and gnulib. Cross-building those packages for a new architecture will therefor require git on that architecture first, which gets circular quick. The dependency on gnulib is real so I don’t see that going away, and gnulib is a Architecture:all package. However, the dependency on git is merely a consequence of how the Debian gnulib package chose to make all gnulib git commits available to projects: through a git bundle. There are other ways to do this that doesn’t require the git tool to extract the necessary files, but none that I found practical — ideas welcome!

Finally some brief notes on how this was implementated. Enabling bootstrappable source-only minimal tarballs via gnulib’s ./bootstrap is achieved by using the GNULIB_REVISION mechanism, locking down the gnulib commit used. I have always disliked git submodules because they add extra steps and has complicated interaction with CI/CD. The reason why I gave up git submodules now is because the particular commit to use is not recorded in the git archive output when git submodules is used. So the particular gnulib commit has to be mentioned explicitly in some source code that goes into the git archive tarball. Colin Watson added the GNULIB_REVISION approach to ./bootstrap back in 2018, and now it no longer made sense to continue to use a gnulib git submodule. One alternative is to use ./bootstrap with --gnulib-srcdir or --gnulib-refdir if there is some practical problem with the GNULIB_URL towards a git bundle the GNULIB_REVISION in bootstrap.conf.

The srcdist make rule is simple:

git archive --prefix=libntlm-v1.8/ -o libntlm-v1.8.tar.gz HEAD

Making the make dist generated tarball reproducible can be more complicated, however for Libntlm it was sufficient to make sure the modification times of all files were set deterministically to a timestamp found in the git repository. Interestingly there seems to be a couple of different ways to accomplish this, Guix doesn’t support minimal source-only tarballs but rely on a .tarball-timestamp file inside the tarball. Paul Eggert explained what TZDB is using some time ago. The approach I’m using now is fairly similar to the one I suggested over a year ago.

Doing continous testing of all this is critical to make sure things don’t regress. Libntlm’s pipeline definition now produce the generated libntlm-*.tar.gz tarballs and a checksum as a build artifact. Then I added the [000-reproducability](https://gitlab.com/gsasl/libntlm/-/blob/73cc1f220c7173ebb5437b3c23fda8194d742f07/.gitlab-ci.yml#L392) job which compares the checksums and fails on mismatches. You can read its delicate output in the job for the v1.8 release. Right now we insists that builds on Trisquel 11 match Ubuntu 22.04, that PureOS 10 builds match Debian 11 builds, that AlmaLinux 8 builds match RockyLinux 8 builds, and AlmaLinux 9 builds match RockyLinux 9 builds. As you can see in pipeline job output, not all platforms lead to the same tarballs, but hopefully this state can be improved over time. There is also partial reproducibility, where the full tarball is reproducible across two distributions but not the minimal tarball, or vice versa.

If this way of working plays out well, I hope to implement it in other projects too.

What do you think? Happy Hacking!

#gnu #gnuorg #opensource