Converts PDF without Fonts/Text #42
Comments
I'm having the same issue as @dms-ts. Any ideas, please? |
Same issue, however only for some types of PDFs. Regular PDF files uploaded from the user's device can be converted fine as they are, however for some reason this library fails to convert PDFs created with React PDF. |
I'm having the same issue - any advice? I've installed Microsoft Fonts and have checked that Arial is installed on my EC2 Ubuntu system running Node, but still no luck. I'm looking for a package that doesn't save to the file system and can import a PDF from a URL and export an array of images. I'm very happy with this package except that some text is missing (obviously a big problem), and I'm happy to switch to an alternative if anyone has any advice. |
I changed the verbosity of the PDF.js command to 1 so that I could get the following error messages; the ones relating to Helvetica match the text that is missing. These are my error messages:
I think my system is saying that it would substitute the Helvetica with Arial:
So I'm not sure what's going on... I'll keep trying to find a solution and post back if I find something. |
Think I found a fix that is legit: I changed line 100 in the file pdf-img-convert.js:
It looks like this should be okay from the 2018 answer here. |
So that didn't work. As mentioned in the earlier part of that 2018 thread, that change will break other documents' fonts. |
I was able to resolve this issue using the instructions in mozilla/pdf.js#4244 (comment). Final version:
diff --git a/pdf-img-convert.js b/pdf-img-convert.js
index 01e8c64c9ffa13ea226a689fa08e78d97213dabe..97939693584b700a985fe3ef3a2fe054a26ddf41 100644
--- a/pdf-img-convert.js
+++ b/pdf-img-convert.js
@@ -29,6 +29,7 @@ const Canvas = require("canvas");
const assert = require("assert").strict;
const fs = require("fs");
const util = require('util');
+const path = require('path');
const readFile = util.promisify(fs.readFile);
@@ -95,9 +96,9 @@ module.exports.convert = async function (pdf, conversion_config = {}) {
// At this point, we want to convert the pdf data into a 2D array representing
// the images (indexed like array[page][pixel])
-
+ let packagePath = path.dirname(require.resolve("pdfjs-dist/package.json"));
var outputPages = [];
- var loadingTask = pdfjs.getDocument({data: pdfData, disableFontFace: true, verbosity: 0});
+ var loadingTask = pdfjs.getDocument({data: pdfData, disableFontFace: true, verbosity: 0, standardFontDataUrl: packagePath + '/standard_fonts/'});
var pdfDocument = await loadingTask.promise
@ol-th would you accept a PR for this? |
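For readers who want the gist of the patch without reading the diff, here is a minimal sketch of the same idea: point PDF.js at the `standard_fonts` directory bundled with `pdfjs-dist` via the `standardFontDataUrl` option. The helper names `resolvePdfjsDir` and `buildLoadOptions` are hypothetical and not part of pdf-img-convert; the option values mirror the patch above, and whether this resolves missing text for a given PDF is an assumption you would need to test.

```javascript
const path = require("path");

// Hypothetical helper: locate the installed pdfjs-dist package directory,
// using the same require.resolve approach as the patch above.
function resolvePdfjsDir() {
  return path.dirname(require.resolve("pdfjs-dist/package.json"));
}

// Hypothetical helper: build the options object passed to pdfjs.getDocument().
// The trailing slash on standardFontDataUrl matches the patch above.
function buildLoadOptions(pdfData, pdfjsDir, verbosity = 0) {
  return {
    data: pdfData,
    disableFontFace: true,
    verbosity, // raise to 1 to surface font warnings while debugging
    standardFontDataUrl: pdfjsDir + "/standard_fonts/",
  };
}

// Usage sketch (assumes pdfjs-dist is installed and `pdfjs` is already required):
// const loadingTask = pdfjs.getDocument(buildLoadOptions(pdfData, resolvePdfjsDir()));
// const pdfDocument = await loadingTask.promise;
```

Keeping the directory as a parameter makes it easy to point at a different font folder if the fonts live somewhere other than inside `pdfjs-dist`.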
I would also like to bump this issue. I will have to look for another library if this doesn't get solved. I love the simplicity of this library and just hope this issue can be resolved. |
Hope you find it useful. That patch successfully converts our 300+ PDFs daily. |
How can I implement your change, @deathemperor? I can't edit the file directly, and if I do have to implement that change myself, I'd prefer not to. If you have an alternative suggestion, please share it. Thanks for your response though, @deathemperor. |
@deathemperor if you could send a PR for this fix that would be great. I'll test it out and add it to a new release if all good. |
I use https://www.npmjs.com/package/patch-package to maintain patches like these until the repo officially supports the fix. |
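As a rough sketch of the patch-package workflow mentioned above: after editing the file inside `node_modules`, running `npx patch-package pdf-img-convert` records the change as a patch file, and a `postinstall` script of `patch-package` in `package.json` reapplies it on every install. The helper below, `withPatchPostinstall`, is a hypothetical name used only for illustration; it shows what the `package.json` change amounts to and is not part of patch-package itself.

```javascript
// Hypothetical helper: given a parsed package.json object, return a copy whose
// "postinstall" script runs patch-package, so saved patches are reapplied
// after every install.
function withPatchPostinstall(pkg) {
  const scripts = { ...(pkg.scripts || {}) };
  const existing = scripts.postinstall;
  if (!existing) {
    // No postinstall yet: add one that only runs patch-package.
    scripts.postinstall = "patch-package";
  } else if (!existing.includes("patch-package")) {
    // Keep the existing hook and run patch-package afterwards.
    scripts.postinstall = existing + " && patch-package";
  }
  return { ...pkg, scripts };
}

// Usage sketch (assumes a package.json in the current directory):
// const fs = require("fs");
// const pkg = JSON.parse(fs.readFileSync("package.json", "utf8"));
// fs.writeFileSync("package.json", JSON.stringify(withPatchPostinstall(pkg), null, 2) + "\n");
```

Editing `package.json` by hand works just as well; the function only makes the intended end state explicit.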
sure, here's the PR #50 |
Hi guys, has this been merged into latest? |
Hi @deathemperor, thank you so much for leading me to https://www.npmjs.com/package/patch-package. I managed to implement it successfully and can continue using the library seamlessly. Much appreciated. |
I'm glad it helped! |
I'm trying to convert some shipping labels to PNG. It converts the barcodes and images, but no text/fonts. I already installed Font fix, but it doesn't work.