It comes down to the way many PDFs store text internally: the stored order need not match where the text sits on the page, so it takes a lot of dedicated coding to make copied text follow the visual placement. Adobe clearly has tons of coders and can keep improving what is already thousands of lifetimes' worth of code. Foxit also has big teams working on catering for oddities. My guess is that the document is a scan with OCR, and that the OCR text shows up as ragged blocks when you select it? The easiest way to get simple lines is to try pdftotext, either from xpdf or from the poppler binaries on GitHub. SumatraPDF could be set up to pass a single page (or less/more) to that command and convert it into a txt file. In the past I wrote one for export via mutool: https://github.com/GitHubRulesOK/SumatraPDF-Plus/blob/master/Plus/ExportTxt.cmd
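As a rough sketch of the pdftotext route (this is not the linked ExportTxt.cmd, which uses mutool), the snippet below wraps the command in Python. It assumes `pdftotext` from xpdf or poppler is on the PATH. The function names `build_pdftotext_cmd` and `extract_text` are made up for illustration. Whether `-layout` gives cleaner lines for this particular PDF is something you would have to try.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, first_page=None, last_page=None, layout=True):
    """Build the pdftotext argument list; the trailing "-" sends text to stdout."""
    cmd = ["pdftotext", "-enc", "UTF-8"]  # ask for UTF-8 output explicitly
    if layout:
        cmd.append("-layout")  # try to keep the physical layout of the page
    if first_page is not None:
        cmd += ["-f", str(first_page)]  # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]  # last page to convert
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path, first_page=None, last_page=None, layout=True):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH (install xpdf tools or poppler)")
    cmd = build_pdftotext_cmd(pdf_path, first_page, last_page, layout)
    result = subprocess.run(cmd, capture_output=True, encoding="utf-8", check=True)
    return result.stdout
```

For a single page you could call something like `extract_text("rules.pdf", first_page=12, last_page=12)` (the file name and page number are placeholders) and paste the returned text into the Word table. If the result is still ragged, passing `layout=False` is worth comparing.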
So I have a really weird problem with SumatraPDF:

I work as a translator and regularly need to create translation tables from PDFs, which means that I simply copy the text from a PDF into a Word table. I use SumatraPDF to view PDFs because it's fast and easy to use.
However, I recently noticed that, for some reason, SumatraPDF moves numbers onto different lines.
If I copy the text above from SumatraPDF and paste it, I get this result:
Then,
the
Homunculus
resolves
advancements on the Mastery tracks. It 2
has an Experiment marker on the Water
track (10), and its Mastery marker is on
the “1” space.
However, if I copy and paste it from a different PDF software, e.g. Foxit Reader, it looks like this:
Then, the Homunculus resolves 2
advancements on the Mastery tracks. It
has an Experiment marker on the Water
track (10), and its Mastery marker is on
the “1” space.
So as you can see, SumatraPDF first of all adds a line break after every word in the first line (probably because there's so much space between the words) and then also moves the "2" from the end of the first line to after the last word, "It", of the second line.
Is this a known issue? And is there any way to solve it?