In a previous post, I explored ways to print many MHT files. I wanted to revisit the matter, perhaps finding a more streamlined way to complete the project. This time, I was working with HTM and HTML rather than MHT files.
I began with the PrintHTML option mentioned in a comment to that previous post. For some reason, it wasn’t working for me. Example webpages led me to wonder whether PrintHTML worked in Windows 7. I had previously used VeryPDF, but I believed there was a limit on how many files you could convert with that program before you would need to buy a copy. VeryPDF offered a bewildering variety of PDF tools, and I was not sure how much they all cost, but one of the recommended ones cost $299. I decided to keep looking.
I flailed around quite a bit. I looked at HTMLDOC, but it turned out to date from 2006, and thus apparently did not support more modern webpages. I looked at a number of webpages suggesting various command line approaches that did not work. I did not keep notes, but I did burn up several hours on efforts that other people said would work, and that seemed like they should work, but that just were not working for me. Possibly they were using Windows XP; possibly there was some other reason that escaped me.
Ultimately, I came back to the wkHTMLtoPDF approach described in the previous post. There appeared to be a newer version, so I downloaded and installed the Windows version (wkhtmltox-0.11.0_rc1-installer.exe). As I could see from the Everything file finder, the installation gave me C:\Program Files (x86)\wkhtmltopdf\wkhtmltopdf.exe. I put a copy of that file in C:\Windows. That way, it would run on the command line in any folder where I might be working. To test it, I typed “wkhtmltopdf” on the command line in another directory. It gave me an error:
wkhtmltopdf.exe - System Error
The program can’t start because libgcc_s_dw2-1.dll is missing from your computer. Try reinstalling the program to fix this problem.
The actual problem, I found, was that I had not copied that and three other .dll files from C:\Program Files (x86)\wkhtmltopdf to C:\Windows. Once I took care of that, I was back in the game. I started with a simplified version of the command I had tried last time:
start /wait wkhtmltopdf "D:\Workspace\BIOS.htm" "D:\Workspace\Output\Testfile.pdf"
where BIOS.htm was the name of a file I was trying to print. It worked, but it defaulted to the European A4 paper size, so I had to use the additional instructions I had used last time:
start /wait wkhtmltopdf -s Letter -T 25 -B 25 -L 25 -R 25 --minimum-font-size 10 "D:\Workspace\BIOS.htm" "D:\Workspace\Output\Testfile.pdf"
and that worked. Unlike the situation I’d had last time, with MHT files, it looked like these HTML files were going to print directly into a nice-looking PDF file, with no further hassle.
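For anyone who would rather script this than type it, here is a minimal Python sketch of the same command. It assumes wkhtmltopdf is reachable on the PATH, as it was for me after the copy to C:\Windows. The function names build_wkhtmltopdf_args and convert are my own inventions, not part of wkhtmltopdf. The options are the ones used above: -s sets the page size, -T, -B, -L, and -R set the top, bottom, left, and right margins (in millimeters unless a unit is given), and --minimum-font-size takes two hyphens.

```python
import subprocess


def build_wkhtmltopdf_args(source, target, margin_mm=25, min_font=10):
    """Return the argument list for one conversion, mirroring the command above.

    The function name and parameters are hypothetical; only the flags
    themselves come from wkhtmltopdf.
    """
    margin = str(margin_mm)
    return [
        "wkhtmltopdf",
        "-s", "Letter",                      # page size (the default was A4)
        "-T", margin, "-B", margin,          # top and bottom margins
        "-L", margin, "-R", margin,          # left and right margins
        "--minimum-font-size", str(min_font),
        source,
        target,
    ]


def convert(source, target):
    """Run one conversion and raise if wkhtmltopdf reports a failure."""
    subprocess.run(build_wkhtmltopdf_args(source, target), check=True)
```

Calling convert(r"D:\Workspace\BIOS.htm", r"D:\Workspace\Output\Testfile.pdf") should be roughly equivalent to the command above. Passing the arguments as a list sidesteps the quoting problems that come up with paths containing spaces.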
Now I had to come up with my list of commands, one for each file to be converted to PDF. For this, I used my usual combination of DIR and Excel to produce a set of commands that I could then paste into Notepad, save with a name like converter.bat, and execute. This process gave me a series of commands, each like the one shown above.
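The DIR-and-Excel step could also be done with a short script. The following is only a sketch of that idea, not what I actually ran: it takes a list of file paths (such as the output of DIR /B /S) and writes one command line per .htm or .html file. The name make_batch_lines and the OPTIONS constant are hypothetical, and ntpath is used so that the Windows-style paths are handled the same way on any system.

```python
import ntpath

# Same options as in the command shown above.
OPTIONS = "-s Letter -T 25 -B 25 -L 25 -R 25 --minimum-font-size 10"


def make_batch_lines(html_paths, output_dir):
    """Build one wkhtmltopdf command line per .htm/.html path.

    Paths with any other extension are skipped. Each PDF gets the same
    base name as its source file and is placed in output_dir.
    """
    lines = []
    for path in html_paths:
        stem, ext = ntpath.splitext(ntpath.basename(path))
        if ext.lower() not in (".htm", ".html"):
            continue
        pdf = ntpath.join(output_dir, stem + ".pdf")
        lines.append(f'start /wait wkhtmltopdf {OPTIONS} "{path}" "{pdf}"')
    return lines


if __name__ == "__main__":
    sample = [r"D:\Workspace\BIOS.htm"]
    for line in make_batch_lines(sample, r"D:\Workspace\Output"):
        print(line)
```

The printed lines can be pasted into Notepad and saved as converter.bat, just as with the Excel approach. One caveat: a file name containing a percent sign would need that character doubled (%%) before it would survive inside a batch file, and this sketch does not handle that.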
Actually, I used that sort of procedure for several different steps: to identify the .HTM and .HTML files to be converted; to write batch commands that moved them all to D:\Workspace; to produce the PDFs; and then to move the PDFs back to where the .HTMs and .HTMLs had been, to replace them. Since I had not specified a source path in the command shown above, I would have to run that converter.bat file in the folder containing the HTMs. That went without any major problems. End of project.
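For the record, the round trip described above (move in, convert, move back) can be laid out as a plan before anything is touched. This is a hypothetical sketch, not the batch files I used; plan_round_trip is an invented name, and it assumes the PDFs are written into the workspace folder itself. It only computes the paths, so it is safe to run and inspect before doing any real moves.

```python
import ntpath


def plan_round_trip(html_paths, workspace):
    """Return, for each source file, the three (from, to) steps of the round trip.

    move_in:   original location -> workspace
    convert:   workspace HTML    -> workspace PDF
    move_back: workspace PDF     -> original folder
    """
    plan = []
    for original in html_paths:
        folder, name = ntpath.split(original)
        stem = ntpath.splitext(name)[0]
        work_html = ntpath.join(workspace, name)
        work_pdf = ntpath.join(workspace, stem + ".pdf")
        plan.append({
            "move_in": (original, work_html),
            "convert": (work_html, work_pdf),
            "move_back": (work_pdf, ntpath.join(folder, stem + ".pdf")),
        })
    return plan


if __name__ == "__main__":
    for step in plan_round_trip([r"C:\Docs\Old\BIOS.htm"], r"D:\Workspace"):
        print(step)
```

One thing a plan like this makes visible: if two source folders contain files with the same name, they would collide in the workspace, so such names would need to be made unique first.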