#developers

waynerad@diasp.org

"Comparing algorithms for extracting content from web pages."

Remember, kids, it's only legal to extract content from web pages if the Terms of Service permit it.

That said, extractors compared: BTE (Python), Goose3 (Python), jusText (Python), Newspaper3k (Python), Readability (JavaScript), Resiliparse (Python), Trafilatura (Python), news-please (Python), Boilerpipe (Java), Dragnet (Python), ExtractNet (Python), Go DOM Distiller (Go), BoilerNet (Python + JavaScript), and Web2Text (Python).

Looks like if you want to extract content from web pages, you should be using Python.
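For a feel of what these extractors do at their very simplest, here's a minimal sketch using only Python's standard library. It is not the algorithm of any tool listed above; the set of skipped tags and the `extract_text` name are assumptions made up for illustration. It just drops text inside tags that usually hold boilerplate and keeps the rest.

```python
from html.parser import HTMLParser

# Tags whose text is usually boilerplate. This set is an assumption for the
# sketch, not a rule taken from any of the extractors listed above.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.chunks = []     # text fragments kept so far

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the visible text outside the skipped tags, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

The real tools go much further (text density, link density, trained classifiers), which is presumably why a comparison is worth doing at all.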

Comparing algorithms for extracting content from web pages

#solidstatelife #developers

waynerad@diasp.org

"At Antithesis, we build an autonomous, deterministic simulation testing (DST) tool. Determinism is so in the water here that it has even seeped into our front-end: our reactive notebook. In this case, determinism was a tool that enabled us to build the low-latency, reactive experiences our users enjoy."

"Reactivity is traditionally defined as a system reacting in response to changing data. In the UI/UX world, reactivity is considered a feature of some libraries (denoting automatic interface updates as data changes), rather than a programming style."

"We're seeing glimmers of instant reactivity in dev tools. First it was syntax highlighting that updates without saving; later it was syntax checks, autocomplete, and linters. Now we even have AI copilots suggesting code as you type. But great developers know there's something more important than what color the code is or how your linter feels: what's most important is what the code does when it runs."

"By running your code on keystroke, the Antithesis Notebook's reactive paradigm informs you of just that, and with an immediacy that's essential to shortening iteration cycles and flattening learning curves. When you're in a reactive regime, you're immediately forced to reckon with the result of your code. The age-old saying of 'test early, test often' becomes the default."

"It turns out that if you build something reactive enough, something magical falls out: reproducibility. In this case, maintaining the illusion of having just run every line of Notebook code from top-to-bottom mandates that if we actually did restart the Notebook and run from top-to-bottom, then we should be in the same state."

Wow, that's a strong claim. They have a demo you can interact with, and it seems to work as advertised.
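The top-to-bottom guarantee in that quote can be sketched naively in a few lines of Python. This is a toy under my own assumptions (the `ReactiveNotebook` class and its methods are hypothetical), not how Antithesis implements it: every edit simply re-runs all cells in order in a fresh namespace, so the visible state always equals the state of a clean restart.

```python
class ReactiveNotebook:
    """Toy notebook: every edit re-runs all cells top to bottom."""

    def __init__(self):
        self.cells = []  # cell source code, in display order
        self.state = {}  # namespace produced by the last full run

    def set_cell(self, index, source):
        # Replace an existing cell or append a new one, then re-run everything.
        if index < len(self.cells):
            self.cells[index] = source
        else:
            self.cells.append(source)
        self._run_all()

    def _run_all(self):
        # A fresh namespace on each run means no hidden state can survive
        # from cells that were edited or deleted.
        namespace = {}
        for source in self.cells:
            exec(source, namespace)
        namespace.pop("__builtins__", None)  # exec() adds this entry itself
        self.state = namespace


nb = ReactiveNotebook()
nb.set_cell(0, "x = 1")
nb.set_cell(1, "y = x + 1")
nb.set_cell(0, "x = 10")  # editing the first cell also refreshes y
print(nb.state)           # {'x': 10, 'y': 11}
```

Re-running everything on each keystroke is obviously too slow for real work; presumably that is where the determinism they mention comes in.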

"This stands in stark contrast to Jupyter, the best known notebook out there, where users decide which cells to run and in which order. Imagine Google Sheets allowing you to decide which cells were up-to-date. Chaos. For Jupyter, this scheme produces enough hidden state to motivate research on the resulting bugs. One study found that only 24% of sampled Jupyter notebooks ran without exceptions."

Introducing our reactive Notebook: the paradigm devs deserve

#solidstatelife #developers #reactive

anonymiss@despora.de

80% of #developers are #unhappy. The #problem is not AI, nor is #coding

source: https://shiftmag.dev/unhappy-developers-stack-overflow-survey-3896/

According to the #Workplace Satisfaction Survey, 80% of professional programmers are unhappy. One in three respondents actively hates their #job, while almost half are in survival mode. This leaves only 20% who claim to be somewhat happy. Although programmers are well-paid and often able to work remotely, many are still dissatisfied. Why is that so?

#economy #development #software #program #news #survey #system #technology #ai #coder

kurt@pod.thing.org

Why Is the #Amiga so Beloved in the #Demoscene?

The Commodore Amiga and its Undying Adoration by Anarchist Creatives

The Commodore Amiga is a successful line of personal #computers from the mid-1980s to 1990s. At the time of its introduction, the Amiga had superior #graphics and #sound capabilities compared to all other home computers. This magical computer enabled its young users to create graphics, #music and #animations with a level of sophistication that was previously unattainable in the home.

To this day, the Amiga holds a great significance in the demoscene, a niche #subculture of #developers, musicians and #artists who create real-time audiovisual presentations called demos.

This essay investigates the gestation of the demoscene, the Amiga platform's revolutionary beginnings, its emotional resonance within a dedicated community, and its broader influence on the field of computer graphics and music. Why is the Amiga still one of the most beloved and significant platforms in the demoscene?

https://marincomics.com/amiga-demoscene/

#demo #demos #art #arts

waynerad@diasp.org

GitHub Copilot causes code churn? The term "code churn" is a fancy way of saying Copilot writes crappy code. Copilot writes crappy code, developers fail to notice it (at first), check it in, then discover it's crappy (within 2 weeks -- that's the arbitrary time window chosen for the study), causing them to go in and fix it, thus causing the code to "churn", get it?
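To make the metric concrete, here's a rough sketch of how a churn rate like that could be computed. The record layout and the `churn_rate` name are made up for illustration; the study's actual methodology may well differ.

```python
from datetime import date, timedelta

CHURN_WINDOW = timedelta(days=14)  # the two-week window mentioned above


def churn_rate(lines):
    """Fraction of added lines that were changed or removed within the window.

    `lines` is a list of (added_on, changed_on) date pairs; changed_on is
    None for lines that were never touched again. This record layout is
    hypothetical, invented for the sketch.
    """
    if not lines:
        return 0.0
    churned = sum(
        1
        for added_on, changed_on in lines
        if changed_on is not None and changed_on - added_on <= CHURN_WINDOW
    )
    return churned / len(lines)


sample = [
    (date(2024, 1, 1), date(2024, 1, 5)),  # fixed after 4 days: churn
    (date(2024, 1, 1), None),              # never touched again
    (date(2024, 1, 1), date(2024, 3, 1)),  # changed much later: not churn
]
print(churn_rate(sample))  # one of three lines churned
```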

Copilot Causes Code Churn? This Study Is Concerning... Theo - t3․gg

#solidstatelife #ai #genai #llms #openai #copilot #developers

waynerad@diasp.org

"Engineering leaders have long sought to improve the productivity of their developers, but knowing how to measure or even define developer productivity has remained elusive. Past approaches, such as measuring the output of developers or the time it takes them to complete tasks, have failed to account for the complex and diverse activities that developers perform. Thus, the question remains: What should leaders measure and focus on to improve developer productivity?"

"Today, many organizations aiming to improve developer productivity are finding that a new developer-centric approach focused on developer experience (also known as DevEx) unlocks valuable insights and opportunities."

So, subjective experience?

"Developer experience encompasses how developers feel about, think about, and value their work. In prior research, we identified more than 25 sociotechnical factors that affect DevEx. For example, interruptions, unrealistic deadlines, and friction in development tools negatively affect DevEx, while having clear tasks, well-organized code, and pain-free releases improve it."

"A common misconception is that DevEx is primarily affected by tools. Our research, however, shows that human factors such as having clear goals for projects and feeling psychologically safe on a team have a substantial impact on developers' performance."

They go on to say the "three dimensions of DevEx" are feedback loops, cognitive load, and flow state.

DevEx: What actually drives productivity

#solidstatelife #computerscience #developers

waynerad@diasp.org

"If this future was visible 15 years ago in research, what is visible now that is coming in 15 years? In my view, I believe that most of the central issues about programming and software engineering will not be about code construction, but about everything before and after construction: namely, requirements and verification. Deciding what to make, why to make it, and whether what is made actually achieves these goals, these are the next frontier of software."

"But these two big challenges have very different 'attack surfaces', if you will. Verification has long been studied in software engineering research, and I'm highly confident that its decades of sophisticated techniques will be brought to bear on large language model-driven synthesis to eventually create highly productive iterative loops of querying and verification, automating much of the construction and evaluation of programs. Give the research community 10-15 more years and we will see consistently high quality programs for this 80% of routine programs emerging from these models."

"But what this will do is put great pressure on requirements."

Large language models will change programming … a lot

#solidstatelife #ai #nlp #llms #developers

quetzop1@diasp.org

Internet

I'm really annoyed by the #Internet of today:

  • #Trackers and #data #collection everywhere
  • #JavaScript-heavy #Web #applications instead of document-oriented #websites
  • No #JavaScript most often translates to an empty page with a single sentence: "Please activate JavaScript"; the page content however is often nothing that actually requires JavaScript, the website creators just want to feel like actual #application #developers, so they re-build much of what the #browser already supplies with #inefficient and #bug heavy JavaScript code
  • Content almost always behind a #login wall
  • More often than not only very superficial #information
  • #Ads
  • Thousands of 3rd party JS files included, most of which have the only purpose of tracking you across websites
  • #Misinformation and #biased #information everywhere
  • Deliberately misleading advertisement, such as "save 80% now", and artificial time pressure
  • "Best viewed on #Google #Chrome"
  • "Login with Facebook"
  • Newsletter subscription and cookie pop-ups featuring #dark #patterns
  • #Search #engine #optimization ( #SEO ) acts in the worst interest of the user by skewing search results
  • Artificial restriction of web #app functionality to promote their native apps
  • Large parts of the Web are only accessible by #smartphone
  • You have to provide your #phone #number to login
  • If you didn't provide a phone number, your account gets blocked right after the initial login because we suspect you of being a malicious actor, because why not (=> #Instagram, #Facebook)
  • #Proprietary #platforms are required to participate in public #online life (Amazon, Google, Facebook, Instagram, Twitter, YouTube)
  • One-sentence-paragraphs and sloppy language (especially found in #Medium #articles)
  • "We care about your #privacy" actually means: "We were forced by law to do this shit, we just want to collect and store as much information on you as possible to make money off of you now or in an unspecified future"
  • JavaScript code minimizer
  • Large font sizes, much whitespace, large illustrative, but useless images, HD screen required to browse most websites
  • Lack of #government #regulation and #law #enforcement, too many malicious actors (#spam, #phishing, etc.)
  • Emotional content to increase #interaction, #clickbait

Once an open platform geared towards information exchange and bringing people into contact, most of the public Internet today is nothing but annoying, useless #marketing, #advertising and #data #collection. Providing information, connecting people, and making life convenient is definitely NOT the primary goal of whoever is big on the Internet today. It's shocking to see how much of it exists only to sell you stuff or to sell your information.

And the worst is: we are even paying them to do this shit. #Marketing spending will be reflected in product prices, and with much of marketing being done in 1st world countries, a substantial amount of the price goes into this destructive industry.

I could go on with this for hours. Really sick of it.

canoodle@nerdpol.ch

Rant: Open Source and the concept of "release early, release often" or "publish early & publish often" -> continuous development/continuous integration (CD/CI) -> tight loops are ok, but still: linking into nirvana without redirection & badly written software that everyone uses. Another clear case of "nothing works ok".

https://administrator.de/forum/wol-geht-nicht-mit-broadcast-adresse-101944.html

-> it’s catastrophic, when webpages change their url setup…

https://www.heise.de/netze/Wake-on-WAN–/artikel/89304/0

because it will result in

“nothing works” “ok”

this has nothing to do with luck, but with:

  1. bad url management:
    • WordPress does a pretty good job there: whenever the user changes the url (more keywords?) it also redirects from the older urls to the new url
      • that is how it is SUPPOSED to be for EVERY website of the (not so) “eternal” part of the internet called www
  2. Elasticsearch seems to be very, very badly written software that does not do any sort of software quality checks?
    • or maybe it’s wrongful integration? (but maybe it just sucks)
    • why is every developer-user using it?
  • PS: as mankind still ponders and evolves (by making mistakes) how to best deal with computers
    • yes someone said “publish early” & “publish often” (doing this with the blog… also… often too often and too early X-D)
      • or: “Release early, release often” (wiki)
        • “tight feedback loop between developers and testers or users” (wiki) - yeah sure as a developer that might be a good thing, as a user… really doubt it… - there are highly intelligent respected developers that pioneered this concept… it might work for small teams… (of one)
        • “This philosophy was popularized by Eric S. Raymond in his 1997 essay The Cathedral and the Bazaar, where Raymond stated ‘Release early. Release often. And listen to your customers’.[4]” “This philosophy was originally applied to the development of the Linux kernel and other open-source software, but has also been applied to closed source, commercial software development.” “The alternative to the release early, release often philosophy is aiming to provide only polished, bug-free releases.[5] Advocates of RERO question that this would in fact result in higher-quality releases.[4]”
      • has this led to every developer going in the continuous development/continuous integration direction? (definitely sounds like it)
        • it really should be called CD/CI not CI/CD because first comes the development, then the integration (but well, here we go: CI/CD@RedHat)
        • still pondering if it’s really a good idea
          • well, if software quality sticks to the UNIX principle of K.I.S.S (most do not, and have NO IDEA what non-K.I.S.S means for their software project or company)
            • it is the difference between being lost in a chaos of complexity (= dysfunctionality) and a lean, smoothly running software company
            • src: https://homepage.cs.uri.edu/~thenry/resources/unix_art/ch01s07.html
          • plus test-driven development: 100,000 use case checks tested afterwards, automatic & semi-automatic & manual
          • then that probably works (but then that is what needs to be done anyway to ensure good software quality)
          • plus: maybe a feedback channel that does not de-motivate: always say something positive first, then the critique
        • signal.org is a very cool mobile & desktop messenger (that usually works pretty well) but: - what is already annoying: if updates per program are 100MBytes and more… (always downloads the full thing (signal.org desktop client) no differential updates?)
  • word of advice: never blindly follow “the trends”
    • always think for yourself, “does it make sense”?
      • test it if it works for you, if not, drop it, what’s the point?

imho gotta do both…
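The "redirect old urls to the new url" idea from point 1 can be sketched as a small lookup table. This is only an illustration under my own assumptions (the `RedirectTable` class and the example paths are hypothetical), not how WordPress implements it:

```python
class RedirectTable:
    """Remembers old paths so that links to them keep working."""

    def __init__(self):
        self.moved = {}  # old path -> newer path

    def rename(self, old_path, new_path):
        # Record the move instead of forgetting the old path.
        self.moved[old_path] = new_path

    def resolve(self, path):
        # Follow the chain of renames; `seen` guards against redirect loops.
        seen = set()
        while path in self.moved and path not in seen:
            seen.add(path)
            path = self.moved[path]
        return path


table = RedirectTable()
table.rename("/wol", "/wake-on-lan")
table.rename("/wake-on-lan", "/wake-on-lan-howto")
print(table.resolve("/wol"))  # /wake-on-lan-howto
```

A web server would answer a request for an old path with an HTTP 301 (Moved Permanently) pointing at the resolved path, so old links keep working instead of leading into nirvana.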

#linux #gnu #gnulinux #opensource #administration #sysops #rant #software #quality #mess #archive #heise #url #urls #redirects #ci-cd #cd-ci #CICD #CDCI #dev #systems #system #company #developers #developer #buckminster #buckminister

Originally posted at: https://dwaves.de/2022/02/03/rant-open-source-and-the-concept-of-release-early-release-often-or-publish-early-publish-often-continuous-development-continuous-integration-cd-ci-tight-loops-ok-but-still-linking-to-n/

grey@pod.tchncs.de

Would the Diaspora developers ever consider adding support for .WEBM uploads?
Obviously supporting videos would be a hassle most of the time, but .WEBM files are so tiny compared to other formats.
There are plenty of times when a video can be converted into both a .GIF and a .WEBM and the .WEBM is always smaller than the .GIF.
And since .GIFs are allowed on Diaspora, why not .WEBM files?
I just converted a bunch of .AVI files to .WEBM and this shit is tiny. It goes from 100MB to like 5.4MB, that's crazy.
#diaspora #diasp #development #developers #video #gif #gifs #webm #video #videos #thefederation

podmin@societas.online

Working on #diaspora #migration currently. On my local machine I just managed to import a profile from another machine by using the user settings UI.
Still polishing many things.
In the next few days I will import photos and provide a draft pull request to the diaspora #developers.

riveravaldez@joindiaspora.com

A comment about the news on Audacity development, its acquisition by Muse Group, etc. (MuseScore comes to mind as an earlier case) and about free software work and communities in general, on which I would very much like any insights and opinions:

Judging by the reactions here[1] and here[2], it seems that forking is one option being considered.
But development (of software and communities) is already hard enough to sustain and grow, so it's reasonable to try to avoid fragmentation as long as that's sensible and convenient.

Under capitalism this happens time and again. Developers want to eat and live properly (from their own work), and capitalists make use of that necessity to capture work already done (and future work), i.e., profit, with the prospect of an initial investment (sometimes, at least) and a viable structure.

But the reason-to-be of capital is the capture of surplus value (sorry for the jargon, please allow me) in an always increasing cycle, and that's inevitably in conflict with the social common interest and solidarity that's at the kernel of free software culture and communities.

In a very superficial analysis, I guess the Muse Group[3] is first of all this guy Eugeny Naidenov, who apparently founded "Ultimate Guitar" (nothing FLOSS so far) and then went through a succession of acquisitions just to grow his group's portfolio. Those acquisitions are of both free software developments/platforms and proprietary ones, so it doesn't seem like a free software endeavor but just "business as usual"...

Free software communities will have to show their ability to defend themselves. This is historic, and I really hope for the best.

[1] https://github.com/audacity/audacity/discussions/889
[2] https://github.com/audacity/audacity/discussions/932
[3] https://mu.se/

#news #audacity #developers #development #musegroup #musescore #copyleft #gpl #software #freesoftware #work #workingclass #capitalism #communities #talk #music #daw #licenses #opensource