While searching for optimal compression settings, I decided to compare WinRAR x64 5.40 against 7-Zip x64 16.02. This post describes that comparison.
This discussion includes references to WinRAR’s settings. In a recent version of WinRAR, I found relevant settings by going into Windows Explorer > select a file > right-click > WinRAR > Add to Archive. (Of course, that would not work if the user had configured WinRAR not to be available via the context menu.)
Both 7-Zip and WinRAR offered multiple configuration options. I began by choosing what I thought would be optimal for each. For WinRAR, I chose RAR5, Best compression method, 1024MB dictionary, no options other than delete after compression. For 7-Zip, I started with 7z format, Ultra compression, LZMA2 (the default) (chosen because of 7-Zip’s limitation on options if I chose LZMA), 1024MB dictionary, word size = 256 bytes, non-solid, 6/8 CPU threads (so that it would not completely dominate the system), delete after compression. (7-Zip Help, available via F1, provided more information on word size and other aspects of the 7-Zip interface.)
I chose a non-solid setting because WinRAR’s help file explained that solid archives carried a significant risk for my purposes. A solid archive would “significantly increase compression when adding a large number of small, similar files.” On the downside, a solid archive would be not only slower when I went to retrieve a single file, but also riskier, because one damaged file in the archive would render all following files inaccessible.
I ran 7-Zip and WinRAR with those settings, one at a time, on a reasonably fast machine (Intel Core i7 4790 with 16GB RAM). The computer was not particularly busy with anything else, aside from the Firefox and Chrome browsers, in which I was reading from a few tabs I had already opened. I ran the programs on the following file sets. The first bullet point explains the various pieces of information included in each of these test result reports. These are actual times, not times reported by the programs, though the latter seemed to be off by only a few seconds, mostly due to lag in program startup. File sizes are as reported by Windows Explorer > select file > right-click > Properties.
- The first set of files tested: 200 JPGs of varying sizes, totaling 193MB before compression. WinRAR: 0:16 to compress into a 187MB file (i.e., 97% of original size); 0:02 to decompress. 7-Zip: 0:33 > 187MB (97%); 0:06. Conclusions: it was probably not worth compressing these JPGs; but if I did compress them, WinRAR compressed them as densely as 7-Zip, and both compressed and decompressed them much faster.
- Ten large MP3 files (each 2-channel, 32 kHz, 256 kbps CBR), totaling 1.36GB. WinRAR: 2:22 > 1.31GB (96%); 0:11. 7-Zip: 3:19 > 1.31GB (96%); 1:10. Conclusions: largely the same as with the JPGs (above).
- Four large PDFs produced by Adobe Acrobat, totaling 3.47GB. WinRAR: 9:09 > 3.44GB (99%); 0:35. 7-Zip: stalled with an error, after some minutes of trying: “The system cannot allocate the required amount of memory.” (I was not sure how long the program sat with that error message displayed, because the error came up under my browser, and there was no flashing taskbar notification.) The problem seemed to be in the dictionary size. I was not sure why WinRAR seemed to work on this system with a 1024MB dictionary, while 7-Zip indicated that it would need 8472MB of RAM even for a 256MB dictionary; nor was I sure why this error had arisen only now. It appeared that WinRAR’s interpretation might be that a 1024MB dictionary would be used as long as sufficient RAM was available, and otherwise the program would drop back to a smaller dictionary, but I was not sure whether that was the explanation. It seemed that 7-Zip would stall halfway through a compression, if some other program grabbed some of the available RAM. As already noted, that had not happened here; nothing new was running. Anyway, I saw that 7-Zip switched to a 64MB dictionary and a word size of 64 when I chose Ultra compression, so I went with those changes; 7-Zip said they would require only 4413MB of RAM. Results: 11:03 > 3.43GB (98%); 2:24. Conclusions: same as above, except that 7-Zip decompression was especially slow here.
- 3,106 (mostly small) PDFs produced by Adobe Acrobat, totaling 3.49GB. WinRAR: 10:21 > 3.45GB (99%); 0:41. In 7-Zip, I decided to switch to Maximum and LZMA compression, and accepted the default change to only 2 CPU threads, which was the maximum allowed by the program for LZMA. Results: I canceled this process at 15 minutes, with an estimated four minutes remaining in the job. Plainly, these changes had not improved 7-Zip’s performance. I tried again with Ultra compression, going back to LZMA2 with 128MB dictionary, 256 word size, 6 CPU threads, still a non-solid archive. Results: 11:48 > 3.43GB (98%); 2:35. I tried again with the same settings, but this time let 7-Zip have all 8 CPU threads. Results: 11:22 > 3.43GB. Conclusion: WinRAR achieved almost the same compression as 7-Zip in about 10% less time.
- One large MPG, 982MB. WinRAR: 2:59 > 925MB (94%); 0:10. 7-Zip, using the settings just listed, including 8 CPU threads: 2:54 > 921MB (94%); 0:53. Conclusion: 7-Zip yielded slightly faster and denser compression.
- One large ISO, 801MB (specifically, the Linux Mint 13 Cinnamon live DVD). WinRAR: 1:58 > 788MB (98%); 0:05. 7-Zip: 2:03 > 793MB (99%); 0:03. Conclusion: WinRAR seemed slightly faster and better at compressing this filetype, but 7-Zip was faster at decompressing, though both were very fast.
- One large MP4, 1.52GB. WinRAR: 5:37 > 1.52GB (100%); 0:02. 7-Zip: 4:13 > 1.51GB (99%); 0:03. Conclusion: 7-Zip was notably faster and slightly more efficient in compressing MP4 files.
- One large AVI, 1.75GB. WinRAR: 6:35 > 1.48GB (85%); 0:21. 7-Zip: 5:54 > 1.45GB (83%); 1:19. Conclusion: while still much slower on decompression, 7-Zip compressed noticeably faster than WinRAR, with results that were only marginally denser.
- One large WAV, 2.03GB. WinRAR: 14:10 > 1.67GB (82%); 0:21. 7-Zip: 6:32 > 1.65GB (81%); 1:32. Conclusion: 7-Zip was much faster than WinRAR at compressing this file, and much slower at decompressing, while achieving about the same compression ratio.
- 313 Microsoft Word (.doc and .docx) files, totaling 730MB. WinRAR: 1:32 > 331MB (45%); 0:04. 7-Zip: 3:38 > 320MB (44%); 0:08. Conclusion: 7-Zip took more than twice as long as WinRAR to compress these files, and twice as long to decompress them. Presumably similar results would follow for other Microsoft Office filetypes (e.g., XLS, PPT).
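The speed comparisons in the list above come down to simple arithmetic on the m:ss timings. The following is a minimal Python sketch of that arithmetic, using a few of the compression times reported above. The function names (to_seconds, time_ratio) are my own, chosen for illustration; they are not part of either program.

```python
def to_seconds(mmss: str) -> int:
    """Convert a timing written as m:ss (e.g., '14:10') into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def time_ratio(winrar_time: str, sevenzip_time: str) -> float:
    """Return 7-Zip's time divided by WinRAR's time (above 1 means WinRAR was faster)."""
    return to_seconds(sevenzip_time) / to_seconds(winrar_time)


# (file set, WinRAR compression time, 7-Zip compression time), copied from the list above
results = [
    ("JPG", "0:16", "0:33"),
    ("MP3", "2:22", "3:19"),
    ("WAV", "14:10", "6:32"),
    ("DOC", "1:32", "3:38"),
]

for name, winrar, sevenzip in results:
    ratio = time_ratio(winrar, sevenzip)
    faster = "WinRAR" if ratio > 1 else "7-Zip"
    print(f"{name}: 7-Zip took {ratio:.2f}x WinRAR's time ({faster} was faster)")
```

The same two functions would work on the decompression times, if those were substituted into the list.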
These findings were subject to further discoveries (below).
Overall, it appeared that WinRAR would be noticeably faster than 7-Zip, when compressing files that were not very compressible. That could sound silly, but often what I wanted from WinRAR was simple consolidation of multiple files, sometimes even multiple folders, into a single compressed file. WinRAR would be even faster, for such purposes, if I added poorly compressible filetypes (e.g., *.rar) to WinRAR’s list of filetypes exempt from compression. I did not see that 7-Zip had such an option.
When compressing large amounts of material, even a few percentage points could make a difference in the number of optical discs consumed. My tests on the JPGs and MP3s yielded 3% and 4% compression, respectively, by both programs. On the PDFs, 7-Zip achieved 2% compression vs. WinRAR’s 1%. Those minor gains could be useful in some situations.
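To illustrate how a few percentage points could change the disc count, here is a small Python sketch. The figures are hypothetical, not from my tests: it assumes 100,000MB of material, a 4,700MB disc (the nominal capacity of a single-layer DVD), and compression to 97% of original size, as with the JPGs above.

```python
import math


def discs_needed(total_mb: int, disc_capacity_mb: int = 4700) -> int:
    """Number of discs required to hold total_mb, assuming files can be split freely."""
    return math.ceil(total_mb / disc_capacity_mb)


original_mb = 100_000                      # hypothetical amount of material
compressed_mb = original_mb * 97 // 100    # 97% of original size

print(discs_needed(original_mb), discs_needed(compressed_mb))  # prints: 22 21
```

In this example, 3% compression saves one disc. With other totals it would save none, since the benefit appears only when the compressed size crosses a disc boundary.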
For my purposes, however, in a variety of day-to-day uses, I decided that compression yielding just a few percentage points of extra space was not as important as notably faster performance. I reached that decision because sometimes I would be waiting on the computer to finish zipping or unzipping files, whereas I rarely if ever noticed whether a compressed archive was a few percentage points larger or smaller than it might have been.
Based on the foregoing results, this decision suggested that I should add several filetypes to WinRAR’s list of filetypes not to bother trying to compress, except perhaps in the unusual case where I would be creating a solid archive. (The list of filetypes not to be re-compressed was available in the context menu dialog mentioned above.) Except as otherwise revealed in further testing with other files, it seemed I should exclude these filetypes from compression in WinRAR: JPG, PDF, MP3, MP4, ISO, and (presumably) RAR (and I would add the Linux GZ to that list). By contrast, it appeared worthwhile to compress MPG, AVI, WAV, DOC, and (presumably) ZIP. Ultimately, here’s what I entered in the space titled “files to store without compression”:
.jpg .pdf .mp3 .mp4 .iso .flv .wmv .wma .rar .7z .gz
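For anyone wanting to apply the same rule outside WinRAR (e.g., in a backup script), the following Python sketch sorts filenames into "store" and "compress" groups using the list above. It is only an illustration of the idea, not a description of how WinRAR works internally. The names STORE_EXTENSIONS, should_store, and partition are my own, and the case-insensitive matching is an assumption of this sketch.

```python
from pathlib import Path

# The extensions I entered under "files to store without compression"
STORE_EXTENSIONS = {
    ".jpg", ".pdf", ".mp3", ".mp4", ".iso", ".flv",
    ".wmv", ".wma", ".rar", ".7z", ".gz",
}


def should_store(filename: str) -> bool:
    """True if the file's final extension is on the store-only list (case-insensitive)."""
    return Path(filename).suffix.lower() in STORE_EXTENSIONS


def partition(filenames):
    """Split filenames into (store, compress) lists, preserving their order."""
    store, compress = [], []
    for name in filenames:
        (store if should_store(name) else compress).append(name)
    return store, compress


store, compress = partition(["Photo.JPG", "report.docx", "song.mp3", "take1.wav"])
print("store:", store)
print("compress:", compress)
```

Only the final extension is examined, so a file named backup.tar.gz would be treated as .gz and stored.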
Note that the list of filetypes to be skipped included PDF, in my case, because I was using Adobe Acrobat to create PDFs. Some PDF-creation programs produced less efficiently packaged output, in which case compression might make a visible difference. It was also reportedly possible to achieve better compression of Acrobat PDFs, as well as of numerous other file types (e.g., JPG, ZIP, PNG) by using the command-line Precomp program before using 7-Zip or WinRAR.
How did WinRAR and 7-Zip compare, on those filetypes that did seem worth compressing? To summarize the foregoing results, for MPG, 7-Zip’s compression was slightly faster and denser. For AVI and WAV, 7-Zip was at best slightly denser, but was much faster. For DOC, 7-Zip was slightly denser but much slower.
So I might have been content with 7-Zip, if I hadn’t already bought a WinRAR license several years earlier. Since I did already have my own licensed copy, I also appreciated other WinRAR features, such as the ability to save profiles with different compression settings, to automatically delete archive files after restoring their contents, to include a recovery record as a protection against file corruption, and to restore multiple selected RAR archives, each to its own individual folder, in a single operation. And for day-to-day usage, it seemed the option of excluding poorly compressible filetypes could save me a fair amount of time.
RAR vs. RAR5 and Dictionary Size
At this point, I started to become more embroiled in the differences that could result if I tweaked various settings in WinRAR and 7-Zip. I did not attempt to make a thorough study of this. I engaged in the following comparisons; I found myself starting down a slippery slope that could result in days of arcane testing and retesting; and then I backed out. As a result, the following material may be informative, but it is neither structured nor thorough.
The question on my mind, as I started into these additional investigations, was whether RAR5 (using the 1024MB maximum dictionary size) was actually faster than RAR (using the maximum 4096KB dictionary size) in WinRAR. I had seen at least one comment suggesting it might not be. I ran comparisons on one large WAV and on a set of DOC files. On the DOC files, RAR took 15 seconds and produced a RAR only 26% of the original size (77.4MB > 20.4MB), while RAR5 took 3 seconds to achieve the same compression. Clearly, RAR5 was better for compressing these DOCs. On the WAV, however, RAR took 0:29 to achieve 61% of original size (531MB > 326MB), while RAR5 took 1:37 to achieve 66% (348MB).
This WAV result suggested that I should have used RAR rather than RAR5 for some of the foregoing comparisons against 7-Zip. It seemed that 7-Zip might not be faster than WinRAR (using RAR rather than RAR5) in the WAV case specifically. To make sure the difference wasn’t just due to dictionary size, I re-ran that WAV compression, using RAR5 with a 4MB dictionary. That yielded a significant improvement in time and a slight decrease in compression, to 0:49 (531MB > 353MB = 66%), but still didn’t match RAR. It did appear that RAR was better for WAV but not DOC.
Continuing the comparison, I ran a RAR vs. RAR5 comparison, holding dictionary size constant at 4MB, for AVI. On a 398MB AVI file, RAR and RAR5 achieved equivalent results: 0:32 > 278MB (70%). With a 64MB dictionary, RAR5 was slower and gained essentially nothing in compression: 0:56 > 277MB (70%). Trying that same AVI with a 64MB dictionary (word size = 64), 7-Zip was much slower than it had been in my previous test (above): 1:48 > 262MB (66%).
I also ran a RAR vs. RAR5 comparison for MPG. On a 539MB MPG file, RAR achieved 0:41 > 528MB (98%) while RAR5 achieved the same compression just one second faster. I re-ran the test using RAR5 with a 1024MB dictionary. That yielded 1:12 > 525MB (97%). It seemed that RAR5 with a larger dictionary might achieve slightly improved MPG compression, compared to RAR, at the cost of much slower processing.
The general competitiveness of WinRAR with RAR5 against 7-Zip suggested that, for most file types, RAR would probably not be better than RAR5. But I had reached a point of uncertainty regarding other variables. One source recommended a WinRAR dictionary size of 32MB, so I went with that, pending further insight.
Taken together, these results presented a motley picture. 7-Zip compression might or might not be faster or denser than WinRAR, depending on numerous variables that I could adjust in either program, and perhaps depending as well on the numbers and sizes of files being compressed and, of course, on their filetype.
These were matters that an individual could investigate manually, as I had done. But it seemed obvious that computers could do the job better and faster. What I lacked was a program capable of consulting an online database and/or sampling the files to be compressed, and then choosing among the compression tools it might detect on my computer (e.g., 7-Zip, WinRAR, WinZip) and/or in the cloud. This program might parcel out the files to be compressed, using RAR5 for one subset and LZMA2 for another, before pulling the subparts together into a single ultimate archive.
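The sampling part of that idea can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: it uses the zlib, bz2, and lzma modules from Python's standard library as stand-ins for real archivers (it does not call WinRAR or 7-Zip), it looks only at the first part of a file, and it judges by compressed size alone, ignoring speed. The names choose_codec and sample_file, the 1MB sample size, and the 5% minimum-saving threshold are all my own arbitrary choices.

```python
import bz2
import lzma
import zlib

# Stand-in compressors from the standard library, each at its highest setting
CODECS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lambda data: lzma.compress(data, preset=9),
}


def sample_file(path: str, size: int = 1 << 20) -> bytes:
    """Read up to `size` bytes (default 1MB) from the start of a file."""
    with open(path, "rb") as f:
        return f.read(size)


def choose_codec(sample: bytes, min_saving: float = 0.05) -> str:
    """Return the codec giving the smallest output on the sample,
    or 'store' if none saves at least min_saving (default 5%)."""
    if not sample:
        return "store"
    sizes = {name: len(compress(sample)) for name, compress in CODECS.items()}
    best = min(sizes, key=sizes.get)
    if sizes[best] > len(sample) * (1 - min_saving):
        return "store"
    return best
```

A call like choose_codec(sample_file("somefile.wav")) would then suggest either one of the three codecs or "store" for that file. A more serious tool would also time each codec and would sample from several places in the file, since the first megabyte may not be representative.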
On a separate point, this post limits itself to 7-Zip and WinRAR. Among other possibilities, I had become acquainted with PeaZip as an alternative available across multiple platforms. PeaZip handled multiple formats, and offered a benchmarks webpage that was especially enthusiastic about what it described as the “experimental” ARC format. This ARC format was apparently distinct from an old ARC format used in the 1980s. The contemporary ARC format seemed to be supported, not only by PeaZip, but also by IZArc. Both PeaZip and IZArc drew relatively good ratings from Gizmo. If ARC established itself as a reliable compression format offering both high compression and high speed, then eventually, it seemed, WinRAR and/or 7-Zip would need to offer that format or be replaced.