As described in another post, I was using the Bullzip PDF printer to print a bunch of PDFs. It would normally not make sense to PDF files that were already in PDF format. In this case, I was hoping that the attempt to print a PDF — even if only to another PDF — would produce error messages that would identify any PDFs that might have become corrupted.
PDFing the PDFs proved more complex than expected. I began by using a spreadsheet, applying techniques described in another post, to produce one command line per file. So if I had a thousand PDFs to print, now I would have a thousand command lines, one for each PDF, and each of those lines would repeat all of the commands necessary to print and otherwise process a single PDF. Then I would copy those thousand commands from my spreadsheet into Notepad, save it as a .bat file, and run it.
That was actually not a terrible way to proceed. If I needed to change something in the commands, I would just modify the spreadsheet formula, or do a global search and replace throughout the batch file. It’s not as though the commands were going to become too long for the Windows 7 command line: Microsoft's documented limit for a single cmd.exe command line is 8,191 characters, far more than I needed. (The figure of more than 32,000 characters that some reports cite is the limit of the underlying Windows process-creation call, not of the command prompt.)
But at a certain point, the command lines became unwieldy, and anyway it seemed that some of the things I wanted to do would not fit very well within the single-line structure. It was time to transition to a looping batch file. A FOR loop would give me more room to develop the necessary commands, and would permit the batch file to repeat those commands for each of the PDF files I wanted to test.
This post presents the batch file that I wound up with, and explains each of the lines in it. Here is that batch file, with built-in DOS/Windows commands in all caps (e.g., ECHO):
:: PRINTER.BAT
ECHO off
SETLOCAL EnableDelayedExpansion
IF EXIST ErrorLog.txt DEL ErrorLog.txt
START "Killer" /min Killer.bat
FOR /f "tokens=1,2 delims=|" %%i IN (PDFlist.txt) DO (
  ECHO Printing file no. !filecount! : %%j
  SET /a filecount=!filecount!+1
  IF NOT EXIST R:\TempPDF\%%j printto "%%i" "Bullzip PDF Printer" >> ErrorLog.txt 2>&1
  FOR /l %%m IN (1,1,120) DO (
    IF NOT EXIST R:\TempPDF\%%j TIMEOUT 1 > NUL
  )
  ECHO.
  ECHO.
)
TASKKILL /f /im Killer.bat
The core of it was the first FOR loop. That loop contained, among other things, a printto command. As described in the other post, printto was a free batch printing utility. Printto required me to specify the file being printed. I found that it was also best to specify the printer to use, even if it was the default printer. So here, the printto command designates “%%i” as the input file and “Bullzip PDF Printer” as the output printer.
Items in this batch file that used percent symbols (%) and exclamation marks (!) were variables. So in the printto command, %%i was a variable whose value would change each time the FOR loop cycled. Each time, %%i would contain the name of the next file on the list of PDFs, and that would be the file that printto would print. I designated Bullzip as the printer because, as described in the other post, I was able to set it so that it would not ask questions or open up the printed PDFs. That way, the printing process would go pretty quickly.
The rest of that FOR loop — really, the rest of the batch file — was built around that printto command. That main FOR loop began with this command:
FOR /f "tokens=1,2 delims=|" %%i IN (PDFlist.txt) DO (
Note the opening parenthesis at the end of that line. This opening parenthesis announced the beginning of the commands that would run again during each cycle of the FOR loop. That main FOR loop ended with the last closing parenthesis, near the end of the batch file. The concept of the FOR command was basically this: FOR each item of data IN the data source DO the following commands.
The line just quoted would begin the FOR loop with an instruction to look in PDFlist.txt. That was my list of PDF files. Using techniques detailed in the post describing my spreadsheet approach, each line of PDFlist.txt contained the full path and file name for a single PDF, such as D:\Folder\File 1.pdf. The /f option stated here, on the first line of the FOR loop, told the batch file to analyze PDFlist.txt line by line. That was different from the other, minor FOR loop appearing later in the batch file. That one used the /l option to instruct the FOR loop to work with numbers, not with the contents of a file.
The FOR line just quoted also contained references to “tokens” and “delims.” In this case, there were two tokens on each line of PDFlist.txt: that is, two distinct items, delimited (i.e., separated) by the vertical bar ( | ) character. The first item, as just described, was the full path and file name of each PDF. The second item, also developed in my spreadsheet and pasted into the PDFlist.txt file, was just the file name. With some complicated programming, I probably could have eliminated the redundancy in that, but this was easier. So here’s an illustration of how a line in PDFlist.txt might have looked:
D:\Folder\File 1.pdf|File 1.pdf
So token 1, in the FOR command, would refer (in this example) to “D:\Folder\File 1.pdf,” and token 2 would refer to “File 1.pdf.” I decided to use the vertical bar to separate them because I wanted a character that would not appear in the path or file names of any of the PDFs listed in PDFlist.txt, and I knew that the vertical bar was among the handful of reserved characters that were not permitted in Windows filenames. Here is the full list of Windows reserved characters:
< > : " / \ | ? *
Of course, a few of those characters (notably : and \, as in D:\Folder\File 1.pdf) were used in full Windows path names. That is, they appeared on every line of PDFlist.txt, and thus were not good choices for dividing a full path from a filename. I was not sure whether the vertical bar was a good choice because, like a number of other characters (e.g., * and ? and ^), it did have meaning within the Windows command context. Fortunately, it seemed to work here.
The first line of the FOR loop explicitly assigned the first token to the first variable, which I chose to designate as %%i on that first line. (I could have chosen %%a or %%I or other lower- or uppercase letters instead; it seemed that %%i was used most commonly, perhaps denoting “input.”) So in the case of the sample line displayed above, %%i would be shorthand for “D:\Folder\File 1.pdf.” This meant that %%j (which I did not have to designate explicitly) would be shorthand for the other token on the PDFlist.txt line being analyzed (in this example, “File 1.pdf”). These variables would have entirely different meanings after the batch file finished processing that particular line in PDFlist.txt; at that point, they would loop to whatever values appeared on the next line of PDFlist.txt.
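To see the token assignment in isolation, a minimal sketch like the following could be saved as its own .bat file. It is not part of Printer.bat: it feeds FOR /f a single quoted sample line instead of PDFlist.txt, and the ECHO labels are just for illustration. The last ECHO uses the ~nx modifier, a standard feature of FOR variables that extracts the file name and extension from a path. That may be one way to do without the second token, though the batch file above does not rely on it.

```batch
@ECHO off
:: Sketch only: a quoted string after IN is parsed as a line of text, not as a file name.
FOR /f "tokens=1,2 delims=|" %%i IN ("D:\Folder\File 1.pdf|File 1.pdf") DO (
  ECHO Full path, token 1: %%i
  ECHO File name, token 2: %%j
  ECHO File name derived from token 1: %%~nxi
)
```

The second and third lines of output should both read File 1.pdf, since the modifier works on the text of the path and does not need the file to exist.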
The %%i and %%j variables played different roles. As just described, I needed %%i as input for printto, because %%i contained not only the name of the file to PDF-print but also the folder where printto could find it. Otherwise, though, %%j was more useful — not only as a short reference in the line that would tell me where we were in the list of PDFs:
ECHO Printing file no. !filecount! : %%j
but also as the name of the PDF-printed PDF that I expected to see in the output folder. Two separate lines looked for that output file:
IF NOT EXIST R:\TempPDF\%%j printto "%%i" "Bullzip PDF Printer" >> ErrorLog.txt 2>&1
IF NOT EXIST R:\TempPDF\%%j TIMEOUT 1 > NUL
Both of those lines said, “If the output PDF does not appear in R:\TempPDF, then further action is necessary.” The first one was going to run the essential printto command only if the output file was not already there: no point printing it twice. The second one was going to invoke a one-second TIMEOUT (i.e., delay). The TIMEOUT command did not have a quiet mode, so I used the > NUL approach to send its noisy reports off into oblivion. As just shown, there was a similar appendage at the end of the first of those two lines referring to %%j:
>> ErrorLog.txt 2>&1
That said, “If printto produces any reports or information about its attempt to print the PDF file, send that information to the ErrorLog.txt file.” The “>>” symbol said, “Add that information to the end of whatever is already in ErrorLog.txt,” as distinct from erasing ErrorLog.txt and providing no prior history. At the end of the whole printing process, I wanted ErrorLog.txt to contain all of the messages from the entire process.
The 2>&1 symbol at the end of that line was something of a mystery to me. I accepted that it worked; I just did not spend the time to explore in detail why it worked, or what other similarly cryptic options there might be. Much the same was true of the % and ! symbols surrounding variables. I had a sense of the concept of delayed expansion of variables; I saw that the related line, “SETLOCAL EnableDelayedExpansion,” was necessary to get the program to work as I wanted; but I arrived at a working format, for some variables in this batch file, through trial and error rather than clear understanding of what I was doing.
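For what it is worth, 2>&1 is standard command-prompt redirection rather than anything specific to printto. Programs write ordinary output to stream 1 (standard output) and error messages to stream 2 (standard error); 2>&1 sends stream 2 to wherever stream 1 is currently going. A minimal sketch, assuming a hypothetical log file named demo.log and a deliberately nonexistent folder name to provoke an error message:

```batch
@ECHO off
:: Sketch only: demo.log and the folder name are hypothetical.
IF EXIST demo.log DEL demo.log
:: DIR on a folder that does not exist writes its complaint to stderr (stream 2).
:: 2>&1 sends stream 2 wherever stream 1 (stdout) is going, so both land in demo.log.
DIR "C:\NoSuchFolder_Demo" >> demo.log 2>&1
:: The double arrow appends, so this line is added below the DIR output instead of replacing it.
ECHO Second entry >> demo.log
TYPE demo.log
```

Without the 2>&1, the error text from DIR would appear on the screen and never reach the log.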
When the batch file ran, it put information like this on the screen:
Printing file no. 5 : CaseyPaper.pdf
where the “5” was the value of the !filecount! variable and CaseyPaper.pdf was the value of the %%j variable in the relevant line of the batch file:
ECHO Printing file no. !filecount! : %%j
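The difference between the % and ! forms can be seen in a minimal sketch like this one, which is separate from Printer.bat and uses a hypothetical counter named count. Inside a parenthesized block, cmd expands %count% once, when it reads the whole block, whereas !count! is expanded each time the line actually runs:

```batch
@ECHO off
SETLOCAL EnableDelayedExpansion
:: Sketch only: count is a hypothetical variable, set explicitly before the loop.
SET count=0
FOR %%a IN (one two three) DO (
  SET /a count+=1
  ECHO Item %%a - percent form shows %count%, delayed form shows !count!
)
```

This should print 0 for the percent form on every pass and 1, 2, 3 for the delayed form. Note that the sketch sets the counter before the loop; Printer.bat as listed does not set filecount beforehand, so its first message would presumably show a blank where the number belongs.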
With this message being constantly updated, I could compare the number of the file being printed against the number of complete files reported for the output folder in the bottom (status) bar in Windows Explorer, so as to ensure that an approximately correct number of PDFs were arriving in the output folder. There were some files that just did not want to print; I realized at some point that I was going to have to do a filename comparison (perhaps using the spreadsheet approach again, or perhaps with a name check using a program like DoubleKiller), to see which ones did not make it. But for the most part, it appeared that this approach would keep printto from burying Bullzip in large numbers of print commands that it could not keep up with.
There seemed to be some problematic PDFs, and also some that took a long time to print. That’s why I built a two-minute lag into the batch file’s subordinate FOR loop:
FOR /l %%m IN (1,1,120) DO (
  IF NOT EXIST R:\TempPDF\%%j TIMEOUT 1 > NUL
)
That loop told the computer to start with the number 1, count upward in steps of 1, and stop at 120. On each pass, it would wait one second if it failed to find the expected PDF in the output folder. Early tinkering suggested that very large PDFs could take a while to print, hence the generous limit; but after two minutes (i.e., 120 one-second waits), the batch file would give up and move on. As soon as the expected PDF did appear in the output folder, there would be no further timeouts, and the subordinate loop would run through its remaining passes very rapidly.
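The waiting logic can be tried on its own with a minimal sketch like this. The target path and the five-second limit are hypothetical stand-ins for the output file in R:\TempPDF and the 120-second limit. The path is wrapped in quotation marks because IF EXIST otherwise treats a space as the end of the file name, which would matter for output names such as File 1.pdf:

```batch
@ECHO off
:: Sketch only: the target file and the 5-second limit are hypothetical stand-ins.
SET "target=%TEMP%\wait demo.pdf"
FOR /l %%m IN (1,1,5) DO (
  IF NOT EXIST "%target%" TIMEOUT /t 1 /nobreak > NUL
)
IF EXIST "%target%" (ECHO Found it.) ELSE (ECHO Gave up after 5 seconds.)
```

Creating the file while the loop is running should end the waiting early, since each remaining pass then skips its TIMEOUT.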
The batch file contained one other complication. I had discovered that, for some reason, Bullzip needed to fire up Adobe Acrobat (my default PDF reader) every time it produced a PDF. It would do that even though I had instructed it not to do so. And then, for some reason, as long as Acrobat was up and running, the batch file would not advance to the next PDF. Indeed, it would not even advance beyond the printto line in the batch file; it would just freeze there. I was not sure whether other PDF printer or reader programs would present similar problems. After much flailing around, including some approaches that worked more slowly, I concluded that the best way to manage this situation was to run a separate batch file, Killer.bat, by adding this command to the start of the batch file:
START "Killer" /min Killer.bat
Hence Killer.bat would start minimized (/min), so as not to keep popping up in the middle of my screen. Killer.bat contained these lines:
:: KILLER.BAT
:start
TIMEOUT 2
TASKKILL /f /im acrobat.exe
GOTO :start
In other words, Killer.bat would keep looping forever, until closed by some outside force (such as the final line in the main batch program); and every two seconds, Killer.bat would make another try to kill Acrobat. (Needless to say, if I needed to work with other PDFs while this process was underway, I would have to use a tool other than Acrobat.) The printto command seemed to require only a fraction of a second to print the PDF, so I hoped this approach would work without problems in most cases. If necessary, I could run the batch file again; it would quickly pass over the PDFs that it found had already been produced in the output folder.
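One caveat, offered tentatively: TASKKILL /im matches the name of a running program, and a batch file runs inside cmd.exe, so the final TASKKILL /f /im Killer.bat line may report that no such process was found and leave the Killer window open. A possible alternative is to match on the window title given to START. The sketch below assumes a hypothetical title, KillerDemo, and uses a 60-second TIMEOUT as a stand-in for Killer.bat; the trailing asterisk is there because cmd can append the running command to the title:

```batch
@ECHO off
:: Sketch only: KillerDemo is a hypothetical title; the 60-second wait stands in for Killer.bat.
START "KillerDemo" /min CMD /c "TIMEOUT /t 60 > NUL"
:: Give the new window a moment to appear before looking for it.
TIMEOUT /t 2 /nobreak > NUL
:: /fi filters by window title; /t also ends any child process started in that window.
TASKKILL /f /t /fi "WINDOWTITLE eq KillerDemo*"
```

If this behaves as expected, the minimized window should disappear from the taskbar a couple of seconds after it opens.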
The batch file seemed to run OK. There were some performance questions that I would have to learn more about as I went on. Another post continues the discussion.