Dnext

#ocr

October 19, 2024 3:00am

Smart Glasses Read Text

#raspberrypi #assistivetechnolgy #ocr #smartglasses #hackaday
posted by pod_feeder_v2

You normally think of smart glasses as something you wear as either an accessory or, if you need a little assistance, with corrective lenses. But [akhilnagori] has a different kind of smart eyewear…

Azure Cerulean

June 5, 2024 3:21pm

kawaiidesune/zxx: The ZXX typeface.

The ZXX typeface. The site it was on, z-x-x.org, is dead. It appears cybersquatters are redirecting the domain name to some kind of browser-based spyware plugin. It now resides on GitHub and at zxx.vero.moe.

https://github.com/kawaiidesune/zxx

#NSA #Font #obfuscation #OCR #Windows

brainwavelost

April 15, 2024 11:08am

NormCap: Screen Capture Tool For Text Using OCR

https://www.linuxuprising.com/2023/05/normcap-screen-capture-tool-for-text.html

#linux #ocr #text from #images

NormCap is a free and open source screen capture tool for text. Instead of capturing an image of the screen, this application captures the text displa

Emmanuel Florac

April 6, 2024 11:47am

OCR PDFs and images directly in your browser

#OCR #libre #freesoftware

https://tools.simonwillison.net/ocr?language=fra

Hackaday (unofficial)

February 15, 2024 12:00pm

Make Your Bookshelf Clickable

#softwaredevelopment #softwarehacks #gpt #imageprocessing #ocr #hackaday
posted by pod_feeder_v2

We’ll confess that we have a fondness for real books and plenty of them. So does [James], and he decided he needed a way to take a picture of his bookshelves and make each book clickable to f…

Hackaday (unofficial)

September 20, 2023 3:00pm

You’ve Got Mail: Reading Addresses With OCR

#featured #history #interest #originalart #ocr #postoffice #usps #hackaday
posted by pod_feeder_v2

Last time I delivered on this column, I told you about the USPS’ attempts to fully automate a post office. Of course, that’s a bit of a misnomer, since it took 1,500 employees to actual…

Wayne Radinsky

September 15, 2023 3:16am

An OCR system that can convert PDFs of scientific papers dense with mathematical equations has been developed. For mathematical equations, it outputs the LaTeX format.

"Next to HTML, PDFs are the second most prominent data format on the internet, making up 2.4% of common crawl. However, the information stored in these files is very difficult to extract into any other formats. This is especially true for highly specialized documents, such as scientific research papers, where the semantic information of mathematical expressions is lost. Existing Optical Character Recognition (OCR) engines, such as Tesseract OCR, excel at detecting and classifying individual characters and words in an image, but fail to understand the relationship between them due to their line-by-line approach. This means that they treat superscripts and subscripts in the same way as the surrounding text, which is a significant drawback for mathematical expressions. In mathematical notations like fractions, exponents, and matrices, relative positions of characters are crucial. Converting academic research papers into machine-readable text also enables accessibility and searchability of science as a whole. The information of millions of academic papers can not be fully accessed because they are locked behind an unreadable format. Existing corpora, such as the S2ORC dataset, capture the text of 12M papers using GROBID, but are missing meaningful representations of the mathematical equations. To this end, we introduce Nougat, a transformer based model that can convert images of document pages to formatted markup text."

The researchers have released a pre-trained model capable of converting a PDF to a lightweight markup language.

"Our method is only dependent on the image of a page, allowing access to scanned papers and books."

"To the best of our knowledge there is no paired dataset of PDF pages and corresponding source code out there, so we created our own from the open access articles on arXiv. For layout diversity we also include a subset of the PubMed Central (PMC) open access non-commercial dataset. During the pretraining, a portion of the Industry Documents Library (IDL) is included."

The model they came up with to do this is called Nougat, "an end-to-end trainable encoder-decoder transformer based model for converting document pages to markup." It's basically a vision transformer model.

A lot of the paper is concerned with technicalities such as splitting pages, ignoring headers and footers with page numbers, and handling the compression and distortion artifacts, blur, and noise that can exist in the image to be OCRed.

To measure the performance of the model, they calculated edit distance, BLEU score, METEOR score, and F1-score.

"The edit distance, or Levenshtein distance, measures the number of character manipulations (insertions, deletions, substitutions) it takes to get from one string to another. In this work we consider the normalized edit distance, where we divide by the total number of characters."
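As a rough illustration of the metric described in that quote, here is a minimal Python sketch of a normalized edit distance. The function names are made up for this example, and the choice to divide by the length of the longer string is an assumption: the quote only says "the total number of characters", so the paper's exact normalization may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the insertions, deletions and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return levenshtein(prediction, reference) / longest


print(levenshtein("kitten", "sitting"))
print(normalized_edit_distance("kitten", "sitting"))
```

With this normalization the value always falls between 0 (identical strings) and 1 (nothing in common), which matches the "smaller is better" reading of the scores reported further down.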

"The BLEU metric was originally introduced for measuring the quality of text that has been machine-translated from one language to another. The metric computes a score based on the number of matching n-grams between the candidate and reference sentence."
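To make "matching n-grams" concrete, here is a small Python sketch of clipped n-gram precision, the building block that BLEU is based on. This is not the full BLEU score (there is no brevity penalty and no averaging over several n-gram sizes), whitespace tokenization is an assumption, and the function names are invented for this example.

```python
from collections import Counter


def ngram_counts(tokens: list[str], n: int) -> Counter:
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_ngram_precision(candidate: str, reference: str, n: int) -> float:
    """Fraction of candidate n-grams that also appear in the reference.

    Each n-gram is credited at most as many times as it occurs in the
    reference, so repeating one correct word does not inflate the score.
    """
    candidate_counts = ngram_counts(candidate.split(), n)
    reference_counts = ngram_counts(reference.split(), n)
    total = sum(candidate_counts.values())
    if total == 0:
        return 0.0  # candidate is shorter than n tokens
    matched = sum(min(count, reference_counts[gram])
                  for gram, count in candidate_counts.items())
    return matched / total


reference = "the cat is on the mat"
print(clipped_ngram_precision("the cat sat on the mat", reference, 2))
print(clipped_ngram_precision("the the the the", reference, 1))
```

The second call shows why the clipping matters: "the" occurs only twice in the reference, so only two of the four repeated tokens count as matches.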

METEOR is "another machine-translating metric with a focus on recall instead of precision."

The F1-score combines precision and recall into a single number: "We also compute the F1-score and report the precision and recall."
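For reference, F1 is the harmonic mean of precision and recall. The Python sketch below computes all three from a bag-of-tokens overlap between a prediction and a reference. How the paper decides what counts as a match is not stated in the quotes above, so the whitespace tokenization and the overlap rule here are assumptions made for illustration.

```python
from collections import Counter


def precision_recall_f1(prediction: str, reference: str) -> tuple[float, float, float]:
    """Token-level precision, recall and F1 using multiset overlap (assumed matching rule)."""
    predicted = Counter(prediction.split())
    expected = Counter(reference.split())
    # Counter & Counter keeps the minimum count of each shared token.
    overlap = sum((predicted & expected).values())
    precision = overlap / sum(predicted.values()) if predicted else 0.0
    recall = overlap / sum(expected.values()) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


print(precision_recall_f1("a b c d", "a b e"))
```

In this example two of the four predicted tokens are correct (precision 0.5) and two of the three reference tokens are found (recall about 0.67), so F1 lands between the two at 4/7.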

They compared it with a previous OCR pipeline, GROBID with LaTeX OCR. For edit distance on math equations, GROBID with LaTeX OCR got 0.727, while Nougat Small (250 million parameters) got 0.117 and Nougat Base (350 million parameters) got 0.128. On edit distance, smaller is better. For BLEU, the numbers were 0.3 for GROBID + LaTeX OCR, 56.0 for Nougat Small and 56.9 for Nougat Base -- larger is better. On METEOR, the numbers were 5.0 for GROBID + LaTeX OCR, 74.7 for Nougat Small and 75.4 for Nougat Base -- larger is better. For F1, the numbers were 9.7 for GROBID + LaTeX OCR, 76.9 for Nougat Small, and 76.5 for Nougat Base -- larger is better.

This sounds like something that could be incredibly useful.

Nougat: Neural optical understanding for academic documents

#solidstatelife #ai #computervision #ocr #latex

utzer [Friendica]

March 21, 2023 4:21pm

Sometimes a good #Formelsammlung (formula collection) for #Elektrotechnik (electrical engineering) would be really helpful, the kind of thing that used to be called a #Tabellenbuch (book of tables) and was used in #Berufsschule (vocational school) or early in one's #Studium (university studies). But lugging a book around is not much fun. Is there anything good available as a PDF? Otherwise I'll buy a book and scan it; if you cut off the spine, that works quite well. #OCR is not perfect, but these days it works reasonably well.

Hackaday (unofficial)

October 6, 2022 2:00pm

Immersive Cursive: Growing Up Loopy

#hackadaycolumns #doublestoreya #doublestorya #handwritingfonts #ocr #uncial #hackaday
posted by pod_feeder_v2

Growing up, ours was a family of handwritten notes for every occasion. The majority were left on the kitchen counter next to the sink, or in a particular spot on the all-purpose table in the breakf…

Hackaday (unofficial)

September 27, 2022 3:00pm

Cursing the Curse of Cursive

#featured #interest #originalart #rants #cursive #dnealian #handwriting #ocr #palmermethod #spencerian #zanerbloser #hackaday
posted by pod_feeder_v2

Unlike probably most people, I enjoy the act of writing by hand — but I’ve always disliked signing my name. Why is that? I think it’s because signatures are supposed to be in curs…

utzer [Friendica]

May 5, 2022 7:45am

I wonder if someone could set up an #OCR #bot on a #Friendica server; after all, it has the #Mastodon API for apps, so it should be possible.
Anyone here willing to volunteer and set up an OCR bot to transcribe text in pictures? It would be great if it could be triggered by a hashtag or mention.

Hackaday (unofficial)

April 21, 2022 9:00pm

Scanning Receipts Proves Trickier Than Anticipated

#digitalcamerashacks #softwarehacks #api #ocr #opticalcharacterrecognition #receipt #scanner #hackaday
posted by pod_feeder_v2

It’s one of those things that certainly sounds simple enough: take a picture of a receipt, run it through optical character recognition (OCR), and send the resulting information to whatever e…

utzer [Friendica]

March 11, 2022 3:40pm

Is there an #OCR bot in the #Fediverse that does OCR and #translation in one go?

Danie

February 8, 2022 3:55pm

Use ‘TextSnatcher’ to easily Copy Text from Images to Your Clipboard on Linux

Being able to extract text from photos, PDFs and the like isn’t something new. Indeed, many ace tools exist for the job, including several well-regarded command line ones available on Linux. But being able to do it very easily? That is new.

With modern operating systems like macOS and Android making image OCR an integrated feature of their native image viewer tools or photo managers, it’s understandable that some folks new to Ubuntu, Linux Mint, and other distros expect similar functionality.

And with TextSnatcher, they do. The tool performs optical character recognition (OCR) in seconds, allowing you to quickly copy text from anything visible on your screen to your system clipboard, ready to paste elsewhere.

This application’s interface couldn’t be easier to use: you open it, click the “snatch” button, then use your DE’s default screenshot tool to take a full screenshot or partial screenshot (recommended) focusing on just the text you want to copy.

See https://www.omgubuntu.co.uk/2022/02/textsnatcher-copy-text-from-images-linux

#technology #opensource #ocr #linux

Dr. Roy Schestowitz

November 27, 2021 2:31pm

12 Best #FreeSW #OCR Tools • Tux Machines ⇨ http://www.tuxmachines.org/node/158441 #GNU #Linux #TuxMachines

Hackaday (unofficial)

October 31, 2021 9:00pm

Extracting Data from Smart Scale Gives Rube Goldberg A Run For His Money

#homehacks #fitness #ocr #rubegoldberg #screenscraping #screenshot #smartscale #tesseractocr #hackaday
posted by pod_feeder_v2

[Kevin Norman] got himself a smart body scale with the intention of logging data for his own analysis, but discovered that extracting data from the device was anything but easy. It turns out that t…

Hackaday (unofficial)

October 31, 2021 6:00pm

Raspberry Pi Reads What It Sees, Delights Children

#raspberrypi #espeak #ocr #texttospeech #hackaday
posted by pod_feeder_v2

[Geyes30]’s Raspberry Pi project does one thing: it finds arbitrary text in the camera’s view and reads it out loud. Does it do so flawlessly? Not really. Was it at least effortless to …

Hackaday (unofficial)

October 21, 2021 9:00am

British Licence Plate Camera Fooled By Clothing

#news #transportationhacks #licenseplate #numberplate #ocr #hackaday
posted by pod_feeder_v2

It’s a story that has caused consternation and mirth in equal measure amongst Brits, that the owners of a car in Surrey received a fine for driving in a bus lane miles away in Bath, when in f…

Michael Keukert

September 28, 2021 7:03am

Wow, iOS 15 does built in #OCR now? Discovered it by accident.

Dr. Roy Schestowitz

August 16, 2021 6:39pm

#Tesseract 5.0 #OCR Engine Bringing Faster Performance With "Fast Floats" • 𝗧𝘂𝘅 𝗠𝗮𝗰𝗵𝗶𝗻𝗲𝘀 ⇨ http://www.tuxmachines.org/node/154573 #deletegithub #microsoft http://techrights.org/wiki/index.php/Delete_Github
