I had a set of old MP3s, bearing the same names as files in my current set. For instance, both sets might have a copy of “Dead Kennedys–Take This Job and Shove It.mp3.” But DoubleKiller said they were not byte-for-byte identical. Did they differ only because, for instance, I might have run MP3Gain to adjust the volume on one set but not the other? Or did one of them have some deep problem that I could detect only by listening to each of them (and hundreds if not thousands of others) in their full length? Could I safely get rid of the old one, or should I hold onto it, in case someday I would finally hear or otherwise detect some irreparable problem in the current version (in which case I would be glad I held onto the old one)?
Previously, I had explored several programs to test MP3 audio files on my Windows 7 system. That effort was inconclusive. A few years had passed since then. I thought I might try again. Ideally, I would get a definitive analysis of all my MP3s, with an understandable and readily verifiable report as to what was wrong with any of them, and a clear sense of whether I could and should fix those problems.
This post consists of two parts. In Part One, I describe in detail the explorations I pursued, as I attempted to understand and use the results of various MP3-testing programs, when applied to a test set of MP3s. In Part Two, I use the learning from Part One in a more organized manner, to examine a “real” set of MP3s that I actually cared about. Readers who are not interested in the details of my struggles may find all they need in Part Two. (Part Two does not attempt to link to each relevant phase of the discussion, but of course a search within the text of this post should lead the user to the more detailed information in Part One.)
Note that I did not attempt to test all of these MP3s across a variety of players. For playback, I used IrfanView and, sometimes, Windows Media Player. These appeared to be relatively robust players. But MakeUseOf indicated that others might have very different results, depending on what they were using:
You’d be surprised how many of your files are already affected. Luckily, many of these errors stay below the radar, and these files will continue to play as if nothing were wrong. Most applications don’t rely too heavily on a particular MP3 tag. But if a file suddenly has choppy sound or doesn’t play at all, and if a collection of songs can’t be added to iTunes, then you know what’s happening.
Users having those sorts of problems may want to change their hardware and/or software, or may instead want to try some of the following analyses in their own situations. I could only explore my own situation.
There was also a question of what I should look for. For instance, a Bliss article introducing MP3 Diags says,
While the music files could still be played, an audible pop could be heard at the start of each track, which is pretty horrible to listen to . . .
Maybe some of my MP3s had that problem. But I wasn’t looking for file improvement. As described below, that could get complicated and risky. I was just trying to figure out if these MP3s would play.
Checkmate MP3 Checker
First Look at MP3 Diags
Starting Over: The Big Test
MP3val Command Line
Preparing an MP3 Diags Spreadsheet
Translating into a Common Language
Addendum: Tag Cleanup
Detecting Major Problems
Comparing the Other Programs
Repairing MP3s with MP3val
Repairing MP3s with MP3 Diags
Review of MP3 Diags Results
Checkmate MP3 Checker
I began this inquiry by trying to find out whether any important new MP3 testing programs or techniques had emerged since my previous effort. AlternativeTo didn’t yield anything obvious. A search led to a Lifewire article (Harris, 2016) recommending Checkmate MP3 Checker. I decided to retry that. I downloaded a copy of Checkmate version 0.19 (last updated in October 2009) from Softpedia, noticing that it was rated only 2.9 stars by 13 users.
Checkmate ran as a standalone (i.e., portable); no installation. I navigated to a folder containing a bunch of MP3s and used Ctrl-A to select them all. There seemed to be a bug in the file selection mechanism. For some reason, some files were not highlighted. I had to scroll down the list with the mouse and use Ctrl-Click to select the ones that were not highlighted. (Eventually I would see what the problem was. The non-selected files were WAVs, not MP3s. Since I insisted, Checkmate tried testing them, but it correctly concluded that they were not valid MP3s. Note that IrfanView would play them regardless of filename extension.)
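A rough way to separate WAVs from MP3s without trusting the filename extension is to look at the first few bytes of each file. The following is only a sketch: the function names are made up for illustration, and it assumes that a WAV begins with a RIFF/WAVE header and that an MP3 begins either with an ID3v2 tag or with an MPEG frame sync. An MP3 with junk at its start would come back as "unknown" here, so this is no substitute for a real checker.

```python
from pathlib import Path


def sniff_audio_type(header: bytes) -> str:
    """Guess a file's type from its first bytes, ignoring the extension."""
    # WAV: a RIFF container whose form type (bytes 8-11) is WAVE.
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # Many MP3s begin with an ID3v2 tag, which starts with the text "ID3".
    if header[:3] == b"ID3":
        return "id3v2"
    # Otherwise look for an MPEG audio frame sync: eleven set bits.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mpeg-frame"
    return "unknown"


def sniff_file(path: Path) -> str:
    """Read the first 12 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_audio_type(handle.read(12))
```

Calling sniff_file() on each file in the folder would give a quick list of the ones that are really WAVs, whatever their names say.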
With all files selected, I right-clicked on them and chose Scan. Checkmate ran through the list of MP3s pretty quickly, indicating for each its status (Broken, OK, or No MP3), MP3 version and layer (e.g., MP3 version 1.0 layer 3), bitrate (e.g., 56k), VBR/CBR, sample rate (e.g., 32000 Hz), frames, duration, and size. It appeared to allow multiple sessions to run concurrently. A few spot checks suggested its file information was accurate when compared to the results provided by MediaTab.
Checkmate was very basic. It did not recurse subdirectories; I had to scan each subdirectory in a separate operation. It did not seem to have a way to stop a scan; I had to kill the program or else wait for it to finish. It did not allow scrolling and did not indicate progress while running; the window was essentially frozen until the job was done. It did not offer a way to filter the results; instead, I had to click on the Result column heading a couple of times to bring the items that were not OK to the top of the list. It did not give me an option to export the results as a file; instead, I used a screen scraper (namely, Aqua Deskperience, purchased ten years earlier) to scrape text from that portion of the screen into a text file, for comparison against other MP3 checkers.
Checkmate detected a total of three files it labeled as No MP3 and 18 files it labeled as Broken. Right-clicking on an MP3 gave me options to open, scan, delete, rename, or view its properties. Of the first three, IrfanView (i.e., my default MP3 player) was unable to play two, but the third was merely very quiet. IrfanView was able to play all of the 18 files labeled Broken. I didn’t listen all the way through all of those 18, so possibly they did contain defects of some sort.
First Look at MP3 Diags
Softpedia indicated that MP3 Diags had been updated since my last detailed effort to use it. I downloaded and unzipped it. Its MP3DiagsWindows.exe file ran as a portable. It did not offer a Help feature, so I could not easily determine what version I was running. The changelog.txt file appeared to say I was running version 1.2.03 (July 29, 2016).
The MP3 Diags main page listed a number of reviews, mostly from 2009. Many of these (and also some more recent sources) agreed that MP3 Diags was for people who knew what they were doing and/or were going to devote quite a bit of time and attention to the project of making sure their MP3s had no errors. But there were also a few who seemed willing to just let the program fix everything it found wrong. This was not what the MP3 Diags developer recommended. He said, “If you like your files and they don’t bother you, then you probably shouldn’t change them.” So that was really the question: do I like my files? If I had time to listen through them all, I would have the answer to that question.
I ran MP3 Diags against a set of MP3s. Actually, several sets: MP3 Diags allowed me to select multiple folders for analysis. Then it presented me with the results. It seemed that virtually all of my MP3s had at least one problem. To see what sorts of things MP3 Diags would look for, I clicked on the Configuration icon at the upper right corner of the screen (i.e., the crossed wrench and screwdriver) > Ignored Notes tab. It appeared that, altogether, MP3 Diags currently checked for a total of 110 different problems.
I saw that, by default, MP3 Diags looked for things that I was already aware of and didn’t necessarily consider a problem (e.g., “Low quality MPEG audio stream” — I knew I had some poor MP3s), as well as some problems that the developer himself considered trivial (e.g., error code bh, about which he said, “This is probably best ignored”). (As an aside, WWWalter offers a tutorial for those who worry that they have poor-quality MP3s upscaled to higher bitrates without any improvement in actual sound quality.)
Starting Over: The Big Test
I was fairly confident that most of my MP3s would play. In that sense, MP3 Diags seemed to be overkill. So I decided to try comparing a few more programs. To do a proper job of it, I started over, this time creating a test set of 5,133 MP3 files. These were assorted real, backup, and castoff MP3s, from the Windows system drive and from present and old backup sets and sets that I had tinkered with previously. The set included short and long MP3s, old and new, ranging from less than 40kbps to more than 256kbps. There was a lot of redundancy in this collection, in the sense that the same file might be included several times, before and after conversion to different bitrates, in some cases with some editing changes.
I wanted them all in one folder, so that simple programs like Checkmate could test them all. To do this, I used the Everything file finder to produce a single list of MP3s from various drives, and then pasted them into the desired folder using Q-Dir instead of Windows Explorer, so as to be given the option to copy identically named files with a number appended, producing non-conflicting filenames. Then I ran DoubleKiller to ensure that there were no identical duplicate MP3s in the set.
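For anyone without Q-Dir, the same flattening step could be scripted. This is a minimal sketch, not what I actually ran: the function names and the " (1)" numbering style are my own assumptions, and Q-Dir's naming may differ.

```python
import shutil
from pathlib import Path


def unique_target(dest_dir: Path, name: str) -> Path:
    """Return a path in dest_dir that does not collide with an existing file."""
    candidate = dest_dir / name
    stem, suffix = Path(name).stem, Path(name).suffix
    counter = 1
    # Append " (1)", " (2)", ... until the name is free.
    while candidate.exists():
        candidate = dest_dir / f"{stem} ({counter}){suffix}"
        counter += 1
    return candidate


def flatten_copy(sources, dest_dir: Path):
    """Copy every source file into one folder, renaming on name conflicts."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sources:
        target = unique_target(dest_dir, Path(src).name)
        shutil.copy2(src, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

The list of sources could come from something like Path("D:/").rglob("*.mp3"), where the drive letter is only an example.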
I hoped that this large test set would contain at least a few MP3s with real problems. But if it didn’t, in a way that would answer my question. If I could pull together this many MP3s, some of which were nearly 20 years old, and if almost none of them had problems, that would probably explain why the world did not seem to be desperately seeking the latest and greatest MP3 file checking tool.
I started the big comparison by running Checkmate and MP3 Diags again. Checkmate ripped right through the list. With nothing else seriously competing for CPU and disk attention, on a Windows 7 x64 system with a Core i7-4790 CPU, 16GB of RAM, and with all MP3s stored on a SATA III SSD, both of these programs processed those MP3s within just a few minutes. I was not watching the clock, and the programs produced no statement of scan results, but the scans ran in both cases at a rate of perhaps 3-5GB per minute.
In terms of results, for this new set of files, Checkmate detected 3 “No MP3” files and 64 Broken MP3s.
The MP3 Diags scan produced a full list of all 5,119 MP3 files. (The remaining 14 files of the total of 5,133, I would soon discover, were (as before) WAVs that I had included by accident.) The online User’s Guide told me that I could export that list by going into Configuration > Others tab > Misc group > Show “Export” button. (While I was there, I also resized the icons to a less annoying 32 pixels and unchecked the Auto-size box.) After exiting Configuration, the Export button appeared near the top left corner of the screen. The button gave me the option of exporting in XML, TXT, or M3U (i.e., playlist) format. Regrettably, the export (in any format) failed to include the MP3 Diags error codes. That would make things harder for me when I tried to analyze the results in a spreadsheet (see Conclusion).
In my previous explorations in 2012 and 2014, I had found a number of free MP3 testing programs. The 2014 evaluation noted that the ones with the most positive user ratings seemed to be MP3 Diags, MP3-Check, and MP3val. I revisited the 2014 Softpedia links for various programs, as provided in my 2014 post; reviewed the alternatives that users mentioned at AlternativeTo; and also looked at a few sites that came up in response to a search. Collectively, these sources suggested that it might also be worth looking at MP3 Tester, MP3Utility, and MP3Test.
I downloaded and installed MP3-Check (last updated May 2014, version 1.0.41, 3.6 stars from 69 ratings on Softpedia). The developer offered a tutorial. It appeared that MP3-Check offered checks for ID3v1 and ID3v2 tags, for quality (i.e., above or below specified bitrates and sampling rates), for minimum acceptable channels (i.e., 1 or 2 mono channels, joint stereo, or stereo), and for gain (i.e., determining whether the MP3Gain tag is present, specifying the desired playback volume). Alternately, clicking on the tool button at the upper right corner of the window opened a dialog offering me the opportunity to invoke my own preferred tools for ID3 tags, bit- and sample rates, and gain volume. I did not have any other tools for these purposes, so I selected nothing there.
For purposes of determining basic playability, none of these checks really mattered to me. I saw, however, that the tutorial and the program window indicated that MP3-Check would also determine whether the MP3s were “non-standard or broken.” If so, they would be described as “unordinary MP3s,” without further elaboration. That seemed to place the program’s capabilities at least on a par with Checkmate.
I decided to try it. I designated the folder containing my test set of 5,133 files. MP3-Check went right to work, and was done in 43 seconds. I clicked on the blue disk icon in the upper right corner, expecting to get a Save As dialog. Instead, the program immediately wrote its results to one .log file and five different .m3u files in the folder containing my test MP3s. I was not interested in the .m3u files containing the results of its bitrate, sample rate, or channel mode tests, but I was interested in examining the lists of tag, log, gain, and unordinary files. It appeared that the .m3u files all just contained lists of the files that had failed the test in question — so, for example, the unordinary file listed only the three MP3s that the program considered unordinary (which happened to be the same three that Checkmate had labeled as “No MP3”). But the log file seemed to combine the information contained in the .m3u files, so I would be focusing on that when it came time to examine the program’s findings (below). (Note: the option to “Create status logfile” did produce specific file-by-file numbers on bitrate, sample rate, and gain volume.)
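Comparing one program's list of flagged files against another's can be done mechanically. The sketch below assumes only that an .m3u playlist is plain text with one file path per line and that lines starting with "#" are comments or directives. The function names are hypothetical, and matching on lowercase filename alone is a simplification that works here because all test files sat in one folder.

```python
import ntpath


def m3u_entries(text: str) -> set:
    """Return the file paths listed in the text of an .m3u playlist."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "#" comment/directive lines.
        if line and not line.startswith("#"):
            entries.add(line)
    return entries


def bare_names(paths) -> set:
    """Reduce Windows-style paths to lowercase filenames for comparison."""
    return {ntpath.basename(p).lower() for p in paths}


def flagged_by_both(m3u_text: str, other_flagged) -> set:
    """Filenames that appear in the playlist and in another program's list."""
    return bare_names(m3u_entries(m3u_text)) & bare_names(other_flagged)
```

Feeding flagged_by_both() the text of the unordinary .m3u file and the names scraped from Checkmate's "No MP3" rows would show whether the two programs agree.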
Softpedia indicated that MP3val (3.6 stars from 16 raters) had not been updated since 2009 (version 0.1.8). I verified that at the program’s homepage. It had been slow in my 2014 test; I wondered how it would fare on my newer hardware. Having already written up its general functionality in that previous post, this time I focused on its file processing and results.
Using the wxMP3val frontend (February 2015, version 3.7), I pointed the program at the desired folder. It really was slow, compared to the other programs I had tested so far: it took a few minutes even to display the 5,119 MP3s it detected, and when I clicked the Actions > Scan menu pick, it proceeded slowly. I had already run it once, using the older version 2.4.3, and frankly I liked the older version’s progress indicator better: it estimated and continued to report the number of files and minutes elapsed and remaining. The older version 2.4.3 estimated it would need 13:28, and actually finished in 12:12. The newer version 3.7 only provided a statement that it had “Processed 857 files of 5119,” or whatever the numbers might be at any moment. I didn’t time it, but the newer version seemed to run at roughly the same speed as the older one.
When the scan was done, the graphical user interface (a/k/a GUI, i.e., the program window) provided only a cryptic indication that the state of a specific MP3 was either OK or PROBLEM. Right-clicking on a PROBLEM file offered only the same options available through the menu. Those options did not include any way to get more information on the problems identified in any particular file. There was also no option to export the list of results, nor did I see a count of how many PROBLEM files had been detected. It was not possible to copy and paste the information shown in the program window to a separate text file. Clicking on the State column heading did not re-sort the files, so I could not bring all of the PROBLEM files to the top of the list. In short, to get an accurate count, it seemed I would have to page down through the 5,119 files, counting manually. A quick visual scan suggested that MP3val had probably identified a few hundred problem files.
There did not seem to be much documentation on the GUI, and the Help menu pick only pointed toward the websites I had already visited. The basic idea seemed to be that, once the scan was complete, I could choose an icon, or the Actions > Repair menu pick, to let MP3val start fixing the problems it had detected.
It appeared, generally, that wxMP3val would flag a file as PROBLEM if it had any of the specific errors listed in the documentation for the command line version (i.e., for MP3val, without the frontend). (A copy of the documentation was also included with the program files, as manual.html.) That documentation seemed to indicate that the program was actually detecting a number of errors — not remotely as many as MP3 Diags, but some of the same kinds of errors at least (involving e.g., headers and frames).
There was a different frontend, called MP3val-frontend. This was included in the MP3val download. I tried again with that. Like most other programs in this review, it was done within just a minute or two. Like the wxMP3val frontend, unfortunately, it did not offer options for list sorting or exporting. It did at least provide copy-and-pasteable text in a window at the bottom of the screen, detailing the errors found in the single file highlighted in the list window.
MP3val Command Line
Those results made clear that, if I wanted MP3val to give me a list of problematic files and more detailed information on those files, I would have to try the command line. To do that, I copied mp3val.exe to the folder containing the MP3 files being tested, opened a command window in that folder, and ran this command:
mp3val *.* -lMP3val.log
It balked at a file whose name began with a hyphen (e.g., “–filename.mp3”). When I changed that to start the filename with a letter, it ran. It was done in a minute or so. The resulting log file was a mess, when viewed in Notepad (as usual, with text wrapping turned off): its text ran together, without any line breaks. But when I copied and pasted it into Microsoft Word, apparently some hidden line break characters came alive, because now each warning was on its own line. I copied and pasted from Word to Excel.
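One possible explanation for the run-together text, which I did not verify, is that the log used line endings other than the CR+LF pair that older versions of Notepad expect. If so, the detour through Word could be replaced by a few lines of script. This is a sketch under that assumption; the function name is made up.

```python
def to_crlf(raw: bytes) -> bytes:
    """Rewrite any mix of LF, CR, or CR+LF line endings as CR+LF."""
    # bytes.splitlines() treats \n, \r, and \r\n each as a single line break.
    lines = raw.splitlines()
    return b"".join(line + b"\r\n" for line in lines)
```

Reading the log with open("MP3val.log", "rb"), passing its contents through to_crlf(), and writing the result to a new file should give something Notepad displays one message per line, if the assumption about line endings holds.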
The text was not configured for spreadsheet analysis: instead of providing all information for one file on one line, there might or might not be several different error messages for a single MP3, each on its own line. Each line began with either INFO: or WARNING:. It appeared that every one of my 5,133 files had at least an INFO line, stating the filename and path and providing basic information (number of frames, MPEG layer, CBR, ID3v1 or ID3v2 tag).
Altogether, there were 11,670 lines. Subtracting the 5,134 INFO lines (one of which was presumably for the mp3val.exe program in that folder) left 6,536 WARNING lines. I saw that, as one would expect, the WAV files were producing a number of warnings. MP3val had not limited itself to MP3 files, but instead had followed my command (above) and had attempted to process all files in the folder. I revised the command to stop the program from looking at WAVs:
mp3val *.mp3 -lMP3val.log
That gave me a total of 5,119 INFO lines, one for each of the 5,119 MP3s in the folder. Now there were 5,161 WARNING lines. Some files had no WARNINGs, and some had only one. The maximum number of WARNINGs for any file was only 6. I would be analyzing these results in more detail when I prepared my spreadsheet (below).
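The counting I did in Excel could also be done directly on the log. The sketch below assumes a line layout of LEVEL: "path": message, with an optional (offset ...) note between the path and the message. That layout is my assumption about the log, so check it against a real file before relying on it. The function name is hypothetical.

```python
import re

# Assumed layout:  INFO: "C:\path\file.mp3": message
#             or:  WARNING: "C:\path\file.mp3" (offset 0x1234): message
LINE_RE = re.compile(
    r'^(?P<level>INFO|WARNING): "(?P<path>[^"]+)"(?: \(offset [^)]*\))?: (?P<msg>.*)$'
)


def warnings_per_file(lines):
    """Return {path: number of WARNING lines} for every file named in the log."""
    counts = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # ignore anything that does not fit the assumed layout
        path = match.group("path")
        counts.setdefault(path, 0)  # files with only an INFO line count as 0
        if match.group("level") == "WARNING":
            counts[path] += 1
    return counts
```

With the per-file counts in hand, max(counts.values()) would give the largest number of WARNINGs for any one file, and sum(counts.values()) the total number of WARNING lines. The earlier subtraction works the same way: 11,670 total lines minus 5,134 INFO lines leaves 6,536 WARNING lines.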
MP3 Tester’s homepage did not say what version was current, or when it had been last updated, but I did get a hint from the fact that its list of compatible operating systems ranged from Windows 95 to Windows XP. But Softpedia said its current version was 1.05 beta 5, last updated February 2016, so maybe the situation wasn’t that bad. Softpedia’s rating was confusing — it seemed to say the program had only one rater, but that his/her rating had somehow managed to give the program 3.4 stars — but, at any rate, Softpedia’s reviewer gave it 3.5. I downloaded and installed the free demo version, rather than pay $8 to buy the full version.
The Options button offered these tabs: Frame Structure, Rates, Length/Size, DSP, Numbering, Tags, and Duplicates. By default, only Frame Structure > Report files with bad frame structure was selected. That tab offered the option of ignoring errors at the beginning or ends of files, and of reporting only the first problem (within a file, I assume it meant). The Rates tab offered options of reporting files with sample or bit rates under or over specified values. The Length/Size tab offered to report files under or over specified durations (e.g., 60 seconds), and under or over specified sizes (e.g., 10MB). The DSP tab allowed me to report files that end suddenly. The Numbering tab would look out for files whose names did not start with a number, or that had duplicate or missing numbers. I assumed this was for sequential files (e.g., 001Recording.mp3, 002Recording.mp3, etc.). The Tags tab offered checks for ID3 v1 and v2 tags (no tag, no track name, no album name, no artist name), and for pictures (does or does not have an attached picture) and lyrics. The Duplicates tab was apparently going to see whether files had similar names, ID3 tags, or contents.
In several of those regards, MP3 Tester offered options that not even MP3 Diags offered. In the interests of comparison, I ran the program with the default option (frame structure) selected, and also with a checkmark in the DSP box regarding files that ended suddenly. In addition, I selected all of the boxes on the Tags tab. Unfortunately, this produced an error message, telling me that the demo version would allow no more than two tests to be conducted simultaneously. So I clicked Reset and just went with the Frame Structure and DSP options. Then I clicked the Start button. This gave me another notice, indicating that, in the demo version, the selected tests would run on only about 70% of the files that I had selected. I said OK.
The MP3 Tester testing approach was exhaustive. If it encountered an error, it seemed to report every second in which that error appeared. For instance, it would say Frame Error (3:42/5:36), apparently indicating that there was an error at minute 3, second 42, of a file whose full duration was 5:36; and then its next error message would be 3:43/5:36, followed by 3:44/5:36. It seemed to find that a number of MP3s ended suddenly. It ran slowly, but I was impressed by its detail.
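If those per-second messages could be captured as text, consecutive seconds could be collapsed into ranges, which would shrink the report considerably. This is only a sketch: it assumes each message contains a stamp in the (m:ss/m:ss) form shown above, and the function name is invented.

```python
import re

# Matches stamps such as "(3:42/5:36)": error position / total duration.
STAMP_RE = re.compile(r"\((\d+):(\d{2})/(\d+):(\d{2})\)")


def collapse_error_seconds(messages):
    """Turn per-second error messages into (first, last) ranges of seconds."""
    seconds = set()
    for message in messages:
        match = STAMP_RE.search(message)
        if match:
            seconds.add(int(match.group(1)) * 60 + int(match.group(2)))
    ranges = []
    for second in sorted(seconds):
        if ranges and second == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], second)  # extend the current run
        else:
            ranges.append((second, second))  # start a new run
    return ranges
```

For the three messages in the example above, this would yield a single range running from second 222 (i.e., 3:42) through second 224 (3:44).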
There were some options to explore and save the details of what MP3 Tester had found. If I right-clicked on a specific error message, I got options to conduct various file operations (i.e., play, delete, move), and also to produce output (i.e., clear, save, or print the report). Unfortunately, choosing Save gave me a blank error message, containing nothing other than a red X mark that led me to understand nothing would be saved. Possibly that was another feature available only in the full version. There seemed to be no built-in capability to copy and paste error messages from the screen.
I was able to print the output as a PDF. The volume was stunning. I had aborted after the program had analyzed only about 700 files; but already, by that point, the resulting PDF was more than 200 pages long. I was able to convert the PDF’s contents to text in Adobe Acrobat, but for some reason the resulting text file contained only a few dozen file names.
There did not appear to be a way to get a summary report that would be comparable to the output of other programs checked above. The program also did not seem to offer any file repair options. Given the limited features of the free version, the program did not appear helpful for my purposes.
Softpedia indicated that MP3Utility, currently offering version 1.80 beta / 1.72 Build 1 (last updated May 2013), had an average of 4.0 stars from six raters. The program’s About button indicated that version 1.72 dated from 2000.
The Options button gave me options to recursively test subdirectories, move files with errors to a specified directory, ignore errors in the last 1% of the file, specify the number of bytes to search, “attempt resync,” and specify a maximum field length for the ID3v2 tag. I went with its default values, loaded the directory containing my 5,133 test files, and ran it.
It ran pretty quickly, taking about two minutes to scan the 5,119 files it detected as MP3s. I clicked the Save Log button and saved its results. As with other programs discussed above, I would be reviewing the results in more detail (below).
Softpedia reported that MP3Test version 1.7.0 (updated January 2016) was shareware costing €15.00. It had a rating of 3.3 stars from five raters. I downloaded and installed it.
I took a look at its Options. These included a choice of CPU priority, to keep it from hogging system resources; an option to integrate into the Windows shell; options to bulk-rename files, and to move or delete bad and/or good files; and to ignore files with less than my threshold (e.g., 1%) of errors, and to narrow the list only to Resync errors, whatever that meant. I decided to stick with its default settings.
When I tried the program’s Select Files button, it choked for a few seconds on my list of 5,133 files; then it came back and said, “Couldn’t open file Ô! 1 (read) errors occured while analyzing files! Done.” No files were listed.
I tried instead using its Search Folder option. I designated the folder containing my 5,133 files and clicked Go. It ran. Like MP3 Tester (above), it appeared to be accumulating a vast number of error messages. After a few minutes, it had processed only a few hundred files; it estimated that it would need another 56:18 to finish. I stopped it at that point and took a look at its results. Out of 193 files, it had found that 171 were damaged and 22 were error-free. These sets were visible in separate tabs.
With the All Songs tab selected, I right-clicked on a file and saw that I had file-handling options (e.g., move, delete), exporting options, and a Show Song Infos option. The Show Song Infos choice opened a dialog summarizing not only the file’s information (including its percentage and number of errors, filesize, bytes OK, number of frames, bitrate, etc.) but also showing the contents of its ID3v1, ID3v2, and APE tags. I tried the Copy List to Clipboard option, and pasted the results into Excel. This turned out to be just a summary statement of the same data (especially percent of errors), without details. I tried the Excel Export option. That had the same effect, but this spreadsheet was clearer and more organized than the clipboard-paste approach. I could see, now, that the program had identified anywhere from zero to nearly 6,000 errors within a single file. There was no detail as to the nature or significance of the alleged errors; just a summary indication of the number of errors, with a percentage figure that seemed to indicate how much of the file was free from errors.
The program’s Help feature opened a colorful page summarizing what the program could do. The Help said that the log window would be cleared when it exceeded 3200 lines. I was not sure what that would imply for the idea of keeping a log of the program’s output when processing 5,133 files.
When I closed the program, I received a dialog informing me that this shareware version would be free to use for 30 days. It was not clear whether using it after that period would produce anything more than that dialog, reminding the user to register the software.
Without more detailed information about the alleged errors, or any way to repair them (as opposed to moving or deleting them), I did not believe this program would be very useful for my purposes. That said, it was appealing, well-designed, and seemed to work well.
Preparing an MP3 Diags Spreadsheet
My explorations of several MP3 testing programs (above) had given me several lists of allegedly problematic MP3s, with some details on what might be the matter with those files. Now it was time to interpret and apply what those programs seemed to be saying. To do that, I developed a spreadsheet in Microsoft Excel. I decided to structure this spreadsheet around the results from MP3 Diags, since they were so much more detailed than what I got from any other tool. Note that the Conclusion offers links to download copies of the spreadsheet; this discussion largely seeks to provide background information.
MP3 Diags provided two sorts of information, for my purposes. First, there was the list of diagnostic codes that MP3 Diags displayed onscreen. Second, there was the log file containing a list of errors and information messages that MP3 Diags produced, regarding my test set of 5,133 files. In both cases, it was a challenge to get this information into a spreadsheet-friendly form.
In the case of the diagnostic codes, I did not initially realize that MP3 Diags was not really showing me “all” of its codes, or notes, when I clicked the All Notes button at the left side of its main screen. Its User’s Guide said that these were just the notes pertaining to the file currently selected, in the list at the top of the main screen. But that proved to be incorrect. For example, when I looked at a file called Taste of Rurality.mp3, the grid at the top right corner of the main window showed that code fa was the only one applicable to this particular file — and yet the All Notes list displayed at least twenty different codes. I was not sure what that list was. It was not the list of all notes applicable to any of the 5,133 files.
To see the full list of all possible MP3 Diags diagnostic codes, including those that did not apply to the current set of files, I had to go into the Configuration icon (crossed wrench and screwdriver, near the upper right-hand corner) > Ignored Notes tab. Unfortunately, it was not possible either to export that list or to copy and paste its contents. For the benefit of future explorers, I developed another post containing the full list of MP3 Diags diagnostic codes, with an explanation of how I produced that list.
Meanwhile, the MP3 Diags log file resulting from its scan of my 5,133 files was quite verbose. I wanted to reduce its contents, so that the important information about each file could be summarized on a single line in the spreadsheet. That would enable me to compare the MP3 Diags output against the results from other MP3 testing programs.
The log’s contents were largely structured around the MP3 files being analyzed. First, there was a line naming a file, followed by lines stating what MP3 Diags had observed in that file. The first line(s) following the filename would report the file’s structural details (e.g., “MPEG-1 Layer III, Stereo, 32000Hz”). At this point, if MP3 Diags was not able to figure out what sort of file it was dealing with, it would produce one or more long messages, essentially describing the problem(s) it had encountered.
After those file information lines, there was a dashed line, and then the log would list diagnostic messages. In most cases, a diagnostic message would consist largely of the textual description of a diagnostic code. For example, after stating the name of an MP3 file and describing its structure (e.g., MPEG-1 Layer III, etc.), the log might contain a line saying this:
W No ID3V2.3.0 tag found, although this is the most popular tag for storing song information.
Except for the beginning letter W, that was simply the description of diagnostic code fa. So, as described in the other post, mostly my spreadsheet just needed to look up that text in a table, and produce the corresponding diagnostic code. In a few cases, I had to invent codes, to accommodate certain messages for which MP3 Diags did not yet offer two-letter codes. (Those invented codes have three letters, the last of which is “x.”) With those lookups, the spreadsheet succeeded in capturing the MP3 file’s name and diagnostic codes on a single line.
In the example just given, the log’s diagnostic message began with a solitary letter W. It would turn out that this was short for “warning.” Other diagnostic messages began with a solitary E (“error”) or, in a few cases, S (“support”). The MP3 Diags list of diagnostic codes did not clearly indicate which codes involved errors, which would presumably be more serious than warnings; but these solitary letters appeared to provide that additional information. The User’s Guide explained that the difference was also visible in the MP3 Diags screen itself: errors were in red print, warnings were in black print, and support notes were in blue print. Presumably a user would want to focus especially on errors.
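For readers who would rather script this lookup than build it in a spreadsheet, here is a minimal Python sketch of the idea. It is not the spreadsheet I used. The log layout it assumes (a line ending in .mp3 names the file, and each diagnostic line starts with W, E, or S followed by a space) is a simplification of what is described above, and the two-entry lookup table is a hypothetical stand-in for the full list of descriptions and codes.

```python
import re

# Hypothetical excerpt of the description-to-code table. The real list has
# more than a hundred entries; the exact wording in a real log may differ.
DESCRIPTION_TO_CODE = {
    "No ID3V2.3.0 tag found, although this is the most popular tag "
    "for storing song information.": "fa",
    "No MPEG audio stream found.": "ac",
}

# Meaning of the solitary leading letter, as described in the text.
SEVERITY = {"W": "warning", "E": "error", "S": "support"}


def summarize_log(lines):
    """Return {filename: sorted unique (severity, code) pairs}.

    Assumed layout: a line ending in '.mp3' starts a new file; diagnostic
    lines look like 'W some description'. Everything else is ignored.
    """
    summary = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.lower().endswith(".mp3"):
            current = line
            summary[current] = set()
            continue
        match = re.match(r"^([WES]) (.+)$", line)
        if match and current is not None:
            letter, text = match.groups()
            # Unknown descriptions get a placeholder instead of a real code.
            code = DESCRIPTION_TO_CODE.get(text, "??")
            summary[current].add((SEVERITY[letter], code))
    return {name: sorted(notes) for name, notes in summary.items()}


sample = [
    "C:\\Test\\Taste of Rurality.mp3",
    "MPEG-1 Layer III, Stereo, 32000Hz",
    "----------",
    "W No ID3V2.3.0 tag found, although this is the most popular tag "
    "for storing song information.",
    "W No ID3V2.3.0 tag found, although this is the most popular tag "
    "for storing song information.",
]
print(summarize_log(sample))
```

Using a set per file means that repeated messages collapse automatically, which is the same effect as the de-duplication step described next.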
To boil down the log file’s information about an MP3 to a single line in my Excel spreadsheet, I began (after doing the code lookup) by examining those files that had produced a large number of messages. For instance, one file produced 217 messages. There were only 118 possible error codes, so plainly some codes were being repeated. I re-sorted the list so that diagnostic codes would appear in alphabetical order, under each file name, and then eliminated redundancies. I filtered the column containing the diagnostic codes for unique values. Now I could see that, altogether, the log had reported 46 different types of errors (including the eight partially redundant ones I had invented). In other words, fewer than half of the problems that MP3 Diags could detect had been identified in my 5,133 files. Strangely, the MP3 Diags main screen showed 40, not 38 (i.e., 46 – 8), code columns.
Anyway, now that I had my list of codes that actually applied to these 5,133 files, I alphabetized it and converted it to a horizontal list, which I then used as a set of column headings. I was able to mark the diagnostic code summarizing each row. (Again, the spreadsheets are available for download in the Conclusion. The spreadsheets themselves contain more detailed instructions. These comments are intended just to help clarify how they work. The design of the MP3val spreadsheet, below, was better. If I were going to be doing a lot of this work, I would redesign this one to correspond to that one.)
The results indicated that codes were distributed as follows:
- Codes appearing in fewer than 5 MP3s: ag, ai, aox, ba, bc, bd, da, dfx, dg, dj, eb, ec, ee, fax, hg, hh (16 total)
- Codes appearing in 5 to 19 MP3s: ad, al, bh, di, hd, ia, kd, kdx (8 total)
- Codes appearing in 20-49 MP3s: aa, ae, dkx, ib, ja, kb, of (7 total)
- Codes appearing in 50-99 MP3s: ac, bg, cb, iax, jax, jb, kc (7 total)
- Codes appearing in 100-300 MP3s: ak, ea, ha, kbx (4 total)
- Codes appearing in >4000 MP3s: ab, an, fa, ob (4 total)
As suggested especially by that last line, there seemed to be a lot of overlap. That is, files that triggered certain codes were likely to trigger certain others. To close in on that issue, it was time to collapse all of the diagnostic codes pertaining to an MP3 onto the single row containing the MP3’s filename. Since I could now see that no MP3 had more than 14 diagnostic codes, I achieved that collapse by using 14 columns, each looking at the code in the next lower row below the filename, and producing nothing once another filename was encountered (using __ as the code assigned to a filename).
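The collapse just described can also be sketched in a few lines of Python. This is an illustration under stated assumptions, not a copy of the spreadsheet formulas: the rows are assumed to be (code, text) pairs in sheet order, with __ marking a filename row as in the text, and the sample filenames are made up.

```python
FILENAME_MARKER = "__"  # the code assigned to filename rows, as in the text


def collapse_rows(rows):
    """Turn (code, text) rows into one (filename, [codes]) entry per file.

    A row whose code is '__' starts a new file; every following row adds
    its code to that file until the next filename row is reached.
    """
    collapsed = []
    for code, text in rows:
        if code == FILENAME_MARKER:
            collapsed.append((text, []))
        elif collapsed:  # skip any stray code rows before the first filename
            collapsed[-1][1].append(code)
    return collapsed


def widest_row(collapsed):
    """Largest number of codes on any one file (14 in my test set)."""
    return max((len(codes) for _, codes in collapsed), default=0)


# Hypothetical rows, already sorted alphabetically under each filename.
rows = [
    ("__", "a.mp3"), ("ab", ""), ("fa", ""),
    ("__", "b.mp3"), ("fa", ""),
]
print(collapse_rows(rows))
print(widest_row(collapse_rows(rows)))
```

The widest row tells you how many code columns the one-line-per-file layout needs.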
Now that I had all of the codes pertaining to each file on the same row with the filename, I could delete the rows that did not have the filenames. Then (remembering, here, that the rows containing codes had previously been sorted alphabetically), I set up columns to capture all nonredundant pairings among the codes applicable to each file.
Once that was done, I set up another sheet to create all possible pairings and to count the occurrences of each pairing in my main sheet. Ultimately, though, I left this part of the analysis undeveloped.
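For anyone who wants to pick up the pairing analysis where I left it, a rough Python sketch follows. It assumes the codes have already been collapsed to one entry per file; the dictionary shown is invented sample data, not my results.

```python
from collections import Counter
from itertools import combinations


def count_pairs(codes_by_file):
    """Count, for every pair of codes, the number of files showing both."""
    pairs = Counter()
    for codes in codes_by_file.values():
        # sorted(set(...)) makes each pair appear once, in alphabetical order
        for pair in combinations(sorted(set(codes)), 2):
            pairs[pair] += 1
    return pairs


# Hypothetical sample: three files and their diagnostic codes.
codes_by_file = {
    "a.mp3": ["ab", "an", "fa"],
    "b.mp3": ["an", "fa"],
    "c.mp3": ["fa"],
}
for pair, n in count_pairs(codes_by_file).most_common():
    print(pair, n)
```

Pairs that occur in nearly as many files as either code alone would be candidates for the kind of overlap discussed above.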
Translating into a Common Language
As described above, I had useful results from MP3 Diags, Checkmate, MP3-Check, MP3val, and MP3Utility. I placed the results from each such program into a different tab in my spreadsheet, and began to compare their results.
MP3 Diags offered by far the largest number of diagnostic codes. Although there appeared to be some redundancy in those codes (e.g., ed & ee; kb & kc), plainly its set of 110 codes put it into a class by itself. Coming in second, the MP3val documentation listed 16 different errors that the program could detect. I decided to try to translate between those two programs.
It quickly developed that the problems identified by MP3val error messages were not necessarily identical to those identified by MP3 Diags codes. Attempting a different approach, I compared the lists of files they considered problematic. To assist in this comparison, I assigned one-letter codes, A through P, as shorthand for the 16 types of errors that MP3val could detect. The resulting list was as follows:
The numbers of files in which individual errors appeared were also potentially informative: A (3 files), B (2), C (38), D (148), E (34), F (4702), G (2), H (0), I (71), J (3), K (114), L (30), M (0), N (2), O (0), P (0).
I decided to start with code F, since it was most common and often occurred by itself. MP3val had marked a total of 4,519 files with nothing other than code F. All but 50 of those files were accounted for with just two sets of MP3 Diags codes: 587 were coded with the an-fa-ob combination, and 3,882 added ab to that combination. The ab code reported a low-quality audio stream; the an code reported the absence of MP3Gain-style normalization information; fa reported the absence of an ID3v2.3.0 tag; and ob reported the absence of any supported tag capable of storing song information.
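The tally of MP3 Diags code combinations among the F-only files is the sort of thing that is easy to script. Here is a minimal sketch, assuming a dictionary that maps each filename to its MP3 Diags codes; the three files shown are invented, and only the figures in the final check come from the paragraph above.

```python
from collections import Counter


def combo_counts(codes_by_file, files):
    """Count how often each combination of codes occurs among `files`.

    A combination is written as its codes in alphabetical order joined
    by hyphens, e.g. 'an-fa-ob'.
    """
    return Counter(
        "-".join(sorted(set(codes_by_file[name]))) for name in files
    )


# Hypothetical MP3 Diags results for three files that MP3val marked F only.
codes_by_file = {
    "a.mp3": ["an", "fa", "ob"],
    "b.mp3": ["ob", "fa", "an", "ab"],
    "c.mp3": ["ab", "an", "fa", "ob"],
}
print(combo_counts(codes_by_file, ["a.mp3", "b.mp3", "c.mp3"]))

# Arithmetic check on the figures quoted in the text.
f_only, an_fa_ob, ab_an_fa_ob = 4519, 587, 3882
print(f_only - (an_fa_ob + ab_an_fa_ob))  # files not explained by either set
```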
Those results confirmed, first, that, as expected, MP3 Diags was catching things that MP3val was not. I did not actually care about the ab code; I knew I had a number of low-quality MP3s. Likewise, I was not presently concerned with the an code; I expected to use MP3Gain later (though of course the code would be useful if I wanted to explore the possibility that MP3Gain was the reason why two sets of files were not identical). The fa code seemed to address a subset of the issues raised by the ob code. In that sense, it was redundant, though no doubt some users would appreciate its focus on the apparently preferred ID3v2.3.0 tag. In short, it seemed the MP3val F code might be functionally equivalent to the MP3 Diags ob code.
That interpretation raised the question of whether MP3val F and MP3 Diags ob were being logged for exactly the same files. The answer appeared to be, almost. Six MP3s were flagged with F but not with ob. MP3 Diags did flag those six with numerous other codes, however — including, in every case, fa and ha. Code ha indicated that the file did have the “pretty limited” ID3v1 tag. To refine the last sentence of the previous paragraph, then, and with apologies for my lack of technical knowledge of MP3 tags sufficient to provide firsthand verification, it seemed that ob would indicate the presence of no tag capable of storing song information, while F might overlook the presence of an ID3v1 tag.
It was not clear why the F and ob codes were not producing exactly the same results. I was beginning to sense that codes in MP3 Diags and MP3val might not be fully linkable.
Flipping the situation, there were also four MP3s that MP3 Diags flagged with ob but MP3val did not flag with F. Indeed, except for one occurrence of C, MP3val did not flag those four with anything at all. The other programs’ results were of limited use here: MP3-Check did agree that three of those files were missing ID3v1 tags and had gain problems, and that the fourth one was missing an ID3v2 tag and had a gain problem, whereas Checkmate and MP3Utility found nothing wrong with any of them.
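The two-way comparison in the last few paragraphs (files flagged F but not ob, and the reverse) comes down to set differences. A small sketch, with made-up filenames standing in for the real lists:

```python
def compare_flags(flagged_a, flagged_b):
    """Split two lists of flagged files into both / only-a / only-b."""
    a, b = set(flagged_a), set(flagged_b)
    return {
        "both": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


# Hypothetical lists: files MP3val coded F, and files MP3 Diags coded ob.
mp3val_f = ["one.mp3", "two.mp3", "three.mp3"]
mp3diags_ob = ["two.mp3", "three.mp3", "four.mp3"]

result = compare_flags(mp3val_f, mp3diags_ob)
print(result["only_a"])  # flagged F but not ob
print(result["only_b"])  # flagged ob but not F
```

The files in the two "only" lists are the ones worth opening by hand, as I did with the six and the four described above.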
That look at the F code suggested that MP3val and MP3 Diags were pretty close for the most part. I had flirted with the belief that MP3val provided a simpler and equally competent analysis of files, and it might. The surplus of MP3 Diags codes pertaining to tags left me vaguely confused, with a sense that there might be some slippage or inconsistency in there somewhere. But after this brief look, I continued to favor MP3 Diags overall.
Addendum: Tag Cleanup
The notes in this section are previously unpublished holdovers from a previous exploration of other MP3 files, with a particular focus on MP3 tags. These notes are not related to this post’s general focus on MP3 playability. I have decided to preserve these notes for reference. The main discussion resumes, below, following this brief excursus. Readers interested in the main discussion may want to just skip this section.
* * *
It appeared that some MP3 Diags notes described problems that might be best addressed by using other programs. In particular (a) there were a number of notes about tags, and (b) the note labeled with the letters “an” suggested that some of my MP3s should be normalized with something like MP3Gain.
In MP3 Diags, I clicked on the double green arrows pointing left, so as to move all notes to the Ignore Notes side of the screen. A search did not provide much guidance on the question of which of these notes I should now move back to the right side, so that MP3 Diags would highlight those in particular.
According to ID3.org, “An ID3 tag is a data container within an MP3 audio file.” In the currently dominant version (ID3 2.3), tags include such information as song title, artist name, album name, year, and genre. “Tagging” means putting this non-audio information into the audio file.
I had not bothered tagging my files. In the case of MP3s containing music, I had simply named the files using the format Artist–SongTitle.mp3. I had encountered media players that seemed to require the tag, not just the filename, in order to properly display what was playing. So I didn’t mind copying the filename information (i.e., artist and song title) into the corresponding tags, if that process could be automated.
Gizmo chose TagScanner as its recommended free MP3 tag editor, with Mp3tag as an alternative. AlternativeTo agreed that those two were the top tag-specific tools, but indicated that Mp3tag was far more widely known than TagScanner. (Like Gizmo, Lifehacker and MakeUseOf listed some other choices.) Softpedia confirmed both of those impressions: the latest version of Mp3tag (including its portable version) had been downloaded nearly three times as often as TagScanner, but the latter had a slightly higher user rating (4.4 stars vs. 4.2). I decided to start with TagScanner.
TagScanner’s homepage did not seem to offer any support links. There was also no top-level menu pick for Help. The top bar indicated that I was using version 6.0.15. A search led to a manual for version 5.1.594 (also on Scribd). Like the webpage, its Key Features section confirmed that I could use TagScanner to rename files based on tags or, conversely, generate tags based on file names. I was interested in the latter option, since I liked my file names, whereas MP3 Diags didn’t like my tags.
The tabs differed somewhat between TagScanner versions 5 and 6. In version 6, it appeared that the Generate tab was going to provide what I needed. Its default option (on the right side) was Generate Tag from Filename. I chose that and revised its Scheme of Filename to say %artist%–%title% so that it could interpret my filenames. I moused over the folder icon at the upper left corner. It said, Browse for Folder. That’s what I wanted. I selected the desired folder, containing several subfolders of MP3s. It detected the files, and its status bar (at the bottom) told me it was scanning them. When it was done, I hit Ctrl-A to select all of the files it had identified. Then I clicked Preview. It looked like it had got it right: except for a few cases where I had not followed my own naming rules, there were band names in the Artist column and song names in the Title column. I clicked Generate.
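To make the %artist%–%title% scheme concrete, here is a rough Python sketch of the same split. It only illustrates the parsing; it does not write tags into any file, and it is my own approximation rather than a description of how TagScanner works internally. The function name and the rule for rejecting nonconforming names are assumptions.

```python
from pathlib import PurePath


def tags_from_filename(filename, separator="–"):
    """Split 'Artist–Title.mp3' into artist and title.

    Returns None when the name does not follow the scheme, which is what
    happened with the few files where I had not followed my own rules.
    """
    stem = PurePath(filename).stem  # drop the .mp3 extension
    artist, sep, title = stem.partition(separator)
    if not sep or not artist.strip() or not title.strip():
        return None
    return {"artist": artist.strip(), "title": title.strip()}


print(tags_from_filename("Dead Kennedys–Take This Job and Shove It.mp3"))
print(tags_from_filename("Untitled.mp3"))
```

Note that the separator is the dash character used in my filenames, not an ordinary hyphen, so band names containing hyphens are left intact.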
I re-ran MP3 Diags. MP3 Diags found error “ea” for virtually every file in the list. In MP3Diags-speak, error “ea” was this: “ID3V2 tag doesn’t have an APIC frame (which is used to store images).” I wasn’t sure, but I believed that was different from the previous scan, and I wondered whether TagScanner had introduced that problem. The TagScanner user manual contained no reference to APIC frames. A search led to an indication that this particular problem would probably have to be resolved manually, one file at a time, unless you happened to know how to program in Perl.
* * *
Detecting Major Problems
The foregoing look at tag-related problems reminded me that I did not have the knowledge, time, or interest to conduct a thorough audit or comparison of the error codes offered by MP3 Diags and MP3val. Regarding tags, especially, I had survived without worrying about them, and I expected to continue to do so. It seemed advisable to return to my primary and more basic concern. I wanted to know whether any of my MP3s were bad, in the sense of being unplayable or having other significant issues due to corrupt or imperfect MP3 encoding.
I decided to start with a look at how the various programs responded to those three MP3s that Checkmate had labeled as “No MP3” and that MP3-Check had labeled as “Unordinary.” Those three, it seemed, must be really screwed up.
MP3 Diags gave each of those three files the fa and ob tag-related codes discussed above. More ominously, MP3 Diags gave all of them codes ac (“No MPEG audio stream found”) and kb (“Unknown stream found”). MP3 Diags also gave one of them the ak code (“Invalid MPEG stream. Stream has fewer than 10 frames”). MP3val gave all three its C F I J codes. For two of them, MP3Utility said, “Can’t locate first frame header within 5,000 bytes of beginning of file”; but for the third (and for all other MP3s in which MP3Utility noticed any kind of problem), it merely reported a sync error.
I wondered whether these were the only files to which MP3 Diags gave its ac and kb codes, and likewise for the MP3val C F I J codes — or, more precisely, for the seemingly most relevant I and J codes. I found that MP3 Diags gave its ac code to a total of 72 files, and labeled a total of 137 files (including all of those 72) with its kb code. MP3val gave its I code to 71 files, but its J code to only those three.
I attempted to listen to portions of those three files. The programs were right about two of them: neither IrfanView nor Windows Media Player was able to play them. The third one played, but it was muted and brief, and contained a number of clicks that may have been flaws in the MP3 code. These results suggested several conclusions:
- The MP3 Diags ac and kb codes seemed to err on the side of being hypercritical. That is, they identified what may have been real problems with many files — but in doing so, they failed to focus attention on truly unplayable files.
- The MP3val J code agreed, with Checkmate and MP3-Check, that these three files were in a special category. They still included that one false positive — that is, they still flagged one MP3 that was, in fact, playable — but it was not difficult to listen to such a small number of files and thus obtain manual verification of what the programs said.
- MP3Utility was wrong about two of those three files: it gave the playable file its rare “Can’t locate first frame header” error, and failed to give that error to one of the unplayable files.
These conclusions inclined me to think that MP3 Diags might be unmatched for purposes of comprehensive MP3 repair and cleanup, and that such a cleanup might highlight those few files that would resist being fixed. That could be a workable approach, as long as the user was willing to put in the time, did not get bogged down in other potentially minor issues, and did not find the program or its messages too confusing or complex. It tentatively appeared, though, that MP3val would draw the user’s attention more directly to severely problematic files, as long as the user knew to focus on code J. Checkmate and MP3-Check did likewise, without providing the information offered by MP3val on other problems of potential interest.
As just described, I had searched for other files to which MP3 Diags gave the ac and kb codes that it gave to those three especially screwed-up files. Now I realized that, by seeking both of those codes, I might have unwittingly excluded problematic files that had just one of them. The kb (“Unknown stream found”) seemed to mean that there was audio present, and that perhaps a robust MP3 player could play it, even if the ac (“No MPEG audio stream found”) meant that the audio didn’t qualify as MPEG. So were there, perhaps, some files that had an ac without a kb? My spreadsheet said no. Another possibility was that other codes could point toward other bad MP3s. Code ak was one possibility: MP3 Diags had found it in one of the two unplayable MP3s.
Some of the codes on the MP3 Diags list were not found in any files in my test set. Others appeared to be subsets. In particular, code ib appeared to be a subset of ia: code ia always accompanied it. I also saw that my kdx and iax codes completely overlapped with kd and ia, respectively, indicating that, at least for these test files, it had not been necessary for me to invent the kdx and iax codes to represent the additional error messages that I had found in the log file. That left four distinct codes that did appear in a number of files in my test set:
- ak (“Invalid MPEG stream”) (135 files)
- ia (“Broken stream found”) (53 files)
- kd (“File contains null streams”) (10 files)
- of (“Unsupported stream found”) (25 files)
Among those, my spreadsheet told me that ak tended to account for codes ia and of: there were only three files that had ia codes without also having ak codes, and likewise there were only three that had of codes without also having ak. (Those two sets of three were not the same; there were a total of six MP3s in those two groups.) Codes ak and kd were closest to being mutually exclusive: eight of the ten files marked kd were not also marked ak. None of those eight were included in those two other sets of three, so now I had a list of 14 files that were coded ia, of, or kd and were not coded ak.
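The filter that produced that list of 14 can be written as a short Python function. This is a sketch with invented sample data; only the closing arithmetic uses figures from the text.

```python
def flagged_without(codes_by_file, wanted, excluded):
    """Files having at least one of `wanted` codes but not `excluded`."""
    wanted = set(wanted)
    return sorted(
        name
        for name, codes in codes_by_file.items()
        if wanted & set(codes) and excluded not in codes
    )


# Hypothetical sample: five files and their MP3 Diags codes.
codes_by_file = {
    "a.mp3": ["ak", "ia"],
    "b.mp3": ["ia"],
    "c.mp3": ["kd"],
    "d.mp3": ["ak"],
    "e.mp3": ["fa"],
}
print(flagged_without(codes_by_file, ["ia", "of", "kd"], "ak"))

# The three groups described in the text did not overlap, so they add up:
print(3 + 3 + 8)  # ia without ak, of without ak, kd without ak
```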
I listened to the first few seconds of all 14 of those files, plus six files that were coded ak but not ia, of, or kd. Some were old voice recordings; others were relatively new music MP3s. The older ones weren’t great, but there didn’t seem to be any real problems with any of these files. As far as I could tell from this brief exploration, none of these MP3 Diags codes identified MP3 files that were actually nonfunctioning.
As noted above, MP3 Diags gave its ac code to 72 files, and MP3val gave its I code to 71 files. Since those two codes were present in the three especially problematic (“unordinary”) MP3s identified by multiple programs, it seemed those codes might point toward other problematic files. My spreadsheet confirmed that both the ac and the I codes were given to 69 files. The text of the I code (above) was, “This is a RIFF file, not MPEG stream.” MP3 Diags did not have an error code specifically referring to RIFF files. It appeared, then, that when the MP3 Diags ac code said, “No MPEG audio stream found,” it usually meant, “This is a RIFF file,” not that there was no audio at all. As noted above, all of the ac files were also coded as kb.
At present, then, I had an inconclusive understanding of what would count as a major problem in an MP3. There did not seem to be an MP3 Diags code that would specifically identify nonworking MP3s. The MP3val code J did seem to do that.
Comparing the Other Programs
I wondered whether those ~70 files (i.e., the 72 coded as ac by MP3 Diags, and the 71 coded as I by MP3val) were more or less the files identified by the other MP3 testing programs.
In the case of Checkmate, the answer was no. For the 64 MP3s that Checkmate considered “Broken” (i.e., excluding the three that Checkmate labeled “No MP3”), I tried all available MP3 Diags codes. The ones that did the best job of explaining why Checkmate considered those 64 files broken were, of course, the tag-related codes (e.g., fa, an), but that was surely because the large majority of all MP3s in the test set were coded as fa and/or an. Otherwise, no individual MP3 Diags code accounted for more than about half of Checkmate’s Broken files. It appeared that some combination of MP3 Diags codes might provide the best explanation. I did not attempt to figure out which possible combinations, among the many MP3 Diags codes, might best account for what Checkmate had flagged.
Turning it around, there were virtually no MP3 Diags codes that Checkmate substantially accounted for. The notable exception was code aa: all 22 of the MP3s coded aa were also included on Checkmate’s list of Broken files. That was rather odd: code aa identified the presence of two MPEG streams instead of the desired one. I supposed it was possible that Checkmate’s developer focused for some reason on that seemingly rare problem.
I also tried comparing Checkmate’s list against the lists produced by the various MP3val codes — but there, again, Checkmate’s results did not match up. The big exception was code L, which indicated a CRC problem. On that code, MP3val and Checkmate agreed for 29 out of the 30 files tagged with L by MP3val. It appeared that the Checkmate hits were best explained by MP3val codes B, C, E, I, and L — but, aside from L, Checkmate flagged only a fraction of the files flagged by those MP3val codes. For instance, MP3val coded 38 files as C, but only 13 of those were on the Checkmate list.
At some point in this process, Checkmate seemed to re-run its scan. I was pretty sure that, as recorded above, it had produced a list of 3 “No MP3” and 64 “Broken” files, when I used it to scan my test set. But now, as I returned to the Checkmate window, I saw that it listed six “No MP3” files (i.e., the previous three, plus three others), and a total of 75 “Broken” files. I did not attempt another set of screen captures and recalculations to update the foregoing information in light of this new output. I was not sure what would account for the discrepancy between what I had screen-captured and what I was now seeing. Possibly having the program open for days, while I was researching and writing this post, had affected its database or calculations somehow. I tried playing the new “No MP3” files appearing on the list. All three of these played. I closed and restarted Checkmate and re-ran its scan. Now it was back to reporting just three “No MP3” and 64 “Broken” files.
I tried playing the first dozen MP3s that Checkmate reported as Broken. I only played long enough to open the file and listen for a few seconds. All played successfully.
In sum, Checkmate appeared to do a good job of recognizing unplayable (i.e., “No MP3”) files, based on what I knew so far of this test set of files. But it was not clear what Checkmate was doing, in its determination that other files were “Broken.” Whatever it was, it appeared to be ignoring many types of problems that other programs would detect, and it also failed to detect many instances of the types of problems that it seemed to be trying to detect. Given the absence of any file-repair tool within Checkmate, there did not appear to be any reason to use this program, other than for a quick way of identifying at least some MP3s that were truly unplayable.
My experience with Checkmate affected my approach to the remaining programs. By this point, I was less patient or curious; I felt that the burden was on these programs to justify further attention to them.
Like Checkmate, MP3-Check detected those three files that it called “Unordinary”; and like Checkmate, MP3-Check had no built-in file repair capabilities. As noted above, it offered potentially useful, file-by-file data on bitrate, sample rate, and gain value, and it also offered relatively detailed tag information. As such, it appeared to be another tool that might be helpful now and then. It did not appear to provide detailed information on file problems, however, so I did not plan to explore it further at this time.
MP3Utility likewise offered no file repair capability. Also, as noted above, it failed to detect one of the two truly unplayable MP3s, and drew attention to an “unordinary” MP3 that was, in fact, playable. As such, MP3Utility did not seem as reliable as MP3-Check, for purposes of quickly identifying major problems in MP3s. Otherwise, MP3Utility flagged nothing other than the location of the file’s first sync error. Most of those errors, it said, occurred at “approx. 0:00.”
MP3 Diags and MP3val did not seem to concur. First, the only potentially relevant MP3 Diags code mentioning synchronization was gh, which MP3 Diags found to be an issue in none of the 5,119 MP3s in the test set. MP3val code E referred to “MPEG stream error, resynchronized successfully,” and that seemed consistent with my experience, which was that a half-dozen files selected from the MP3Utility list seemed to play OK. For a number of the files on the MP3Utility list, the only error detected by MP3val was F (“No supported tags in the file”). Likewise, tag-related errors were the only ones detected by MP3 Diags for many of the files listed by MP3Utility as having “sync errors.”
I lacked technical expertise to verify that there were, in fact, no sync errors. Tentatively, however, there did appear to be reason to question the quality of the information MP3Utility was providing. Moreover, unlike the other programs just reviewed, it did not appear to offer any particular features that would warrant keeping it on hand.
Other Programs Collectively
Overall, the primary contribution of these three programs, in the search for other MP3s that may have been corrupted, was that they collectively confirmed that only two or three files in this set were bad. I felt that, when it came time to scan another set of MP3s, I would want to run Checkmate and MP3-Check, primarily to obtain similar confirmation of the list of most problematic MP3s in that set.
Aside from those few bad MP3s, the results of these three programs differed. MP3-Check flagged all but three of the MP3s (5,116 altogether), as virtually all of them had problems with tags and/or with gain. The numbers of files flagged by the other two were more similar — 67 by Checkmate, 119 by MP3Utility — but their results were largely inconsistent: 40 of the 67 files flagged by Checkmate were not flagged by MP3Utility, and 92 of the 119 files flagged by MP3Utility were not flagged by Checkmate.
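Those four figures can be cross-checked against each other, since the number of files flagged by both programs has to come out the same whichever side you start from. A small worked check in Python, using only the numbers quoted above (the variable names are mine):

```python
checkmate_total, checkmate_only = 67, 40
mp3utility_total, mp3utility_only = 119, 92

# Files flagged by both programs, computed from each side.
overlap_from_checkmate = checkmate_total - checkmate_only
overlap_from_mp3utility = mp3utility_total - mp3utility_only
print(overlap_from_checkmate, overlap_from_mp3utility)

# Files flagged by at least one of the two programs.
flagged_by_either = checkmate_only + mp3utility_only + overlap_from_checkmate
print(flagged_by_either)

# Share of those files on which the two programs agreed.
agreement = overlap_from_checkmate / flagged_by_either
print(round(agreement, 2))
```

Both subtractions give 27 shared files, out of 159 flagged by either program, so the two agreed on roughly one flagged file in six.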
In short, the situation seemed to be that there was not a settled concept of what a program should bring to the user’s attention, for purposes of testing a set of MP3s. While I have sketched out some limited regards in which these other programs could be useful, it presently appeared that the user would ideally work through the codes provided by MP3 Diags and/or MP3val, so as to arrive at his/her own sense of what needs to be checked in a set of MP3 files.
Repairing MP3s with MP3val
This section describes what happened when I used MP3val to repair problems that MP3val could detect, in my test set of MP3s.
As I review this section, I see that, after the first few paragraphs, I got bogged down in trying to compare what MP3val and MP3 Diags were seeing, when they looked at the results of the MP3val repair effort. Those who are not interested in the details might read the next few paragraphs and then skim down to the last few paragraphs of this section, and likewise in some later discussions of MP3 Diags codes.
To proceed, then. The repair options offered by MP3val were pretty simple. To achieve and log a repair, the documentation seemed to suggest using a command like this:
mp3val *.mp3 -f -lMP3val_RepairLog.txt
There did not appear to be a way to fix specific problems; the -f option seemed to be designed to fix everything that MP3val considered wrong, within the targeted set of MP3s. The documentation indicated that MP3val would be capable of fixing most but not all errors that it identified. In a few areas, the developer expressed an intention to enhance the repair capability; in a few other areas, the documentation said that the only fix was to reencode the file.
It would not be necessary to use one of the tools discussed in this post, to repair a damaged MP3. For some files flagged by Checkmate or with the MP3val code J, it might be possible to make improvements with an ordinary audio editor. I used Audacity and Cool Edit 2000 to try to open the three files that some of the foregoing programs had flagged as “No MP3” or “Unordinary.” Both of those editors were able to open and make improvements to the one MP3 that I had found to be playable, but were unable to open the other two. But Windows Explorer reported that these three troubled MP3s all had contents of some sort: their file sizes were larger than zero bytes or even 1 KB. As another option, a quick search led, for example, to Lifewire’s recommendation and explanation of MP3 Repair Tool (rated 3.6 stars on Softpedia). I did not explore that option further at this point.
Instead, I proceeded with the MP3val command shown above. I had a backup of the test set of MP3s, so I could go ahead and change the test set if desired, and would be able to restore the original set later. I ran the repair command on the test set. It finished the repairs it could make, to the files it wanted to repair, in about 90 seconds.
I looked at the log file. It appeared to contain the same INFO and WARNING messages as I had seen in the previous log file, but now it also had FIXED messages. All FIXED messages consisted simply of the words, “File was rebuilt.” This implied that the repair did not involve a detailed process of examining each error that MP3val could identify. Rather, it appeared that the repair was simply a matter of rebuilding the file, and hoping that this would resolve most errors. For more information on what had been changed, I ran the command that I had run previously, without the repair option, on a fresh copy of this set of test files:
mp3val *.mp3 -lMP3val.log
The results initially seemed underwhelming. Before running the fix command, the log presented a total of 5,161 WARNING messages. After running the fix, the log presented 4,814 WARNINGs. Upon closer examination, however, it seemed the large majority of those messages were due to code F. MP3val was reporting that it had substantially taken care of a number of types of potential problems:
The asterisks indicate types of WARNINGs that, according to the documentation, MP3val was unable to fix — that, in some cases, no program could fix. For instance, apparently code A was fixable only by recoding the file, and code F would be fixed when the user stored metadata in the MP3’s tags.
I was not sure why allegedly unfixable errors (i.e., A, G, J, K) were seemingly fixed in several instances. I had restored fresh versions of the test MP3s before running the fix, so my previous tinkering should not have affected anything. Possibly MP3val was able to fix some such errors after all.
MP3val did appear to have eliminated all of the warnings that it said it could eliminate. As shown, this test set of MP3s included no instances of some codes (e.g., H, M), so it was not clear whether MP3val would have succeeded in fixing WARNINGs of those types. As shown in the foregoing list, after excluding the error types that MP3val did not claim to fix, there were 323 fixable errors before the fix, and zero fixable errors after the fix.
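A before-and-after tally like this one could be produced directly from the two log files. The sketch below is an illustration only. It assumes each log line has the form LEVEL: "path": message, optionally with a parenthesized note between the path and the second colon; I have not verified that layout against every MP3val version, so treat the regular expression as a guess to be adjusted. The sample lines are invented, though the message texts are the ones quoted in this post.

```python
import re
from collections import Counter

# Assumed line layout: LEVEL: "path"[ (extra)]: message
LOG_LINE = re.compile(
    r'^(?P<level>[A-Z]+): "(?P<path>[^"]*)"[^:]*: (?P<msg>.*)$'
)


def warning_counts(lines):
    """Count, per WARNING message, the number of distinct files showing it."""
    seen = set()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") == "WARNING":
            seen.add((m.group("path"), m.group("msg")))
    return Counter(msg for _, msg in seen)


def cleared(before, after):
    """Messages present before the fix and absent afterwards."""
    return sorted(msg for msg in before if after.get(msg, 0) == 0)


before_log = [
    'WARNING: "a.mp3": No supported tags in the file',
    'WARNING: "a.mp3" (offset 0x10): MPEG stream error, resynchronized successfully',
    'WARNING: "b.mp3": No supported tags in the file',
    'INFO: "a.mp3": 100 V1L3 frames',
]
after_log = [
    'WARNING: "a.mp3": No supported tags in the file',
    'WARNING: "b.mp3": No supported tags in the file',
    'FIXED: "a.mp3": File was rebuilt',
]
before = warning_counts(before_log)
after = warning_counts(after_log)
print(before)
print(cleared(before, after))
```

Counting distinct files rather than raw lines avoids inflating a message that MP3val repeats several times for the same file.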
I decided to get a second opinion on that. Using my spreadsheet, I calculated the numbers of files, in the original test set of MP3s, in which MP3 Diags identified various diagnostic codes, and I compared those results against the numbers of files in which MP3 Diags found those same codes after treatment by MP3val.
In that comparison, one might expect that at least things would not get worse. But four of the diagnostic codes actually appeared in more files post-treatment than pre-treatment. The numbers were very small: three of those four codes appeared in two more files post-treatment than pre-treatment, and one appeared in six more files post- than pre-treatment. The codes in question were ah, an, bh, and ob. (Ah had appeared in no files pre-treatment.) None of these codes appeared significant. I was not sure whether this increase resulted from MP3val making things slightly worse in a few instances, or whether MP3 Diags was inconsistent, or whether I had made mistakes, or what else might explain it.
Zero change was, in fact, the situation for a number of MP3 Diags codes. The 19 codes that appeared in the same number of files before and after the MP3val treatment were as follows: ab, ac, ag, al, aox, cb, dfx, dg, di, dj, dkx, ea, eb, ec, ee, fa, hg, hh, and of. Combined with the previous paragraph, then, 22 codes appeared in at least as many MP3s after treatment as before. These 22 included all four of the codes (i.e., ab, an, fa, and ob) that appeared in more than about 200 MP3 files — quite a few more, in fact: MP3 Diags found each of those four in more than 4,200 files.
Altogether, 46 of the 118 MP3 Diags codes (including the eight codes that I added to account for extra error messages appearing in the log file) did appear at least once in the pre-treatment set of MP3s. After subtracting the 22 codes that appeared at least as often in the post-treatment set, as described in the two preceding paragraphs, I was left with a list of 24 MP3 Diags codes that appeared in fewer files after treatment by MP3val.
In several instances, the improvement was minimal. MP3val eliminated code ad from only 1 of 16 files (6%), code bg from only 1 of 87 files (1%), and code ha from only 4 of 137 files (3%). On the other hand, for six diagnostic codes (namely, ai, ba, bc, bd, da, and fax), MP3val achieved 100% improvement, eliminating all appearances of the code — but those codes had appeared in only one file (presumably not all in the same file) in the test set of 5,119 MP3s.
For seven other MP3 Diags diagnostic codes (i.e., ak, ia, iax, ib, kb, kbx, and kc), MP3val managed to fix fewer than 50% of the occurrences in the pre-treatment set of MP3s. My knowledge of MP3 file structure was not sufficient to support a guess as to why certain fixes would only work sometimes. Possibly re-running MP3val would have improved the outcomes for some files.
The foregoing remarks left me with eight codes on which MP3val achieved nearly perfect results. MP3val eliminated code aa from 19 of the 22 files in which it appeared pre-treatment. MP3val removed all instances of the remaining codes: ae (from 46 files), hd (from 15 files), ja (from 31 files), jax (from 89 files), jb (from 58 files), and kd and kdx (from 10 files each). In essence, those codes addressed problems of duplicate MPEG audio streams, duplicate ID3v1 tags, truncated MPEG streams, and null streams. In a few cases, those summaries seemed to correspond to the code descriptions provided above: different MPEG versions or layers in one file (code A) and truncated file (code D). Otherwise, though, I did not see obvious links between the MP3 Diags and MP3val codes displayed after the MP3val fix.
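The sorting of codes into "more," "same," and "fewer" files, and the percentages quoted above, amount to simple arithmetic on two columns of counts. A minimal sketch follows, assuming the per-code file counts have already been extracted into two dictionaries; the function names are hypothetical and are not part of MP3 Diags or MP3val.

```python
def classify_changes(before, after):
    """Group codes by whether they appear in more, the same, or fewer files."""
    result = {"more": [], "same": [], "fewer": []}
    for code in sorted(set(before) | set(after)):
        b = before.get(code, 0)
        a = after.get(code, 0)
        key = "more" if a > b else "same" if a == b else "fewer"
        result[key].append(code)
    return result

def percent_fixed(before_count, after_count):
    """Whole-percent share of pre-treatment files from which a code was removed."""
    if before_count == 0:
        return None  # the code did not appear before treatment
    return round(100 * (before_count - after_count) / before_count)

# Figures taken from the text: ad removed from 1 of 16 files, ha from 4 of 137.
print(percent_fixed(16, 15))    # 6
print(percent_fixed(137, 133))  # 3
```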
I wondered whether MP3val made file changes that would be visible to other programs. I was in the habit of using Beyond Compare in tandem with DoubleKiller to make and check my backups. As described in another post, Beyond Compare (or, no doubt, other programs) could be made to look for byte-for-byte differences between two sets of files, and DoubleKiller could do the opposite, seeking byte-for-byte duplicates. I ran DoubleKiller first, to identify those files that remained exactly the same (in terms of date, file name, size, and byte-for-byte contents) as the original (backup) set, after MP3val’s treatment. To my surprise, that step removed 4,901 files, including the dozen or so WAV files I had inadvertently included. I re-ran DoubleKiller, this time comparing only byte-for-byte contents (ignoring names and dates), to confirm that MP3val did not change file dates or other details unless it also changed file contents. This identified 233 *.bak files created by MP3val in the test folder, as backups, before it made its changes: identical in contents, but having different names.
Those steps left 231 files in the test set whose contents, changed by MP3val, were different from those in the original set. (There should have been 232: 5,133 minus 4,901 equals 232. I did not know what happened to the missing file.) A comparison in Beyond Compare gave me the impression that these changed files had new (i.e., current) dates and times, and that their sizes also tended to be different. I used a DOS command (dir /a-d /s > dirlist.txt) to obtain the details of their sizes, and I put that information into a spreadsheet, to compare against the original files. This comparison highlighted some variation in how much the file sizes had changed. None of the 231 files were larger than the originals; 25 were of the same size as before; and the remaining 206 ranged from being 50 bytes to 39,098 bytes smaller.
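For readers without DoubleKiller, Beyond Compare, or a spreadsheet at hand, the same two questions (which files are unchanged, and by how much did the changed ones shrink?) could be approached with a short script. This is a sketch under stated assumptions: the original and treated sets sit in two separate folders with matching file names, and the function names are my own invention. It uses SHA-256 hashes as a stand-in for a true byte-for-byte comparison.

```python
import hashlib
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_folders(original_dir, treated_dir, pattern="*.mp3"):
    """Return (identical, changed, missing).

    identical: names whose contents match in both folders
    changed:   name -> size difference in bytes (treated minus original)
    missing:   names present in the original folder only
    """
    original_dir, treated_dir = Path(original_dir), Path(treated_dir)
    identical, changed, missing = [], {}, []
    for orig in sorted(original_dir.rglob(pattern)):
        rel = orig.relative_to(original_dir)
        treated = treated_dir / rel
        if not treated.is_file():
            missing.append(str(rel))
            continue
        same_size = orig.stat().st_size == treated.stat().st_size
        if same_size and file_digest(orig) == file_digest(treated):
            identical.append(str(rel))
        else:
            changed[str(rel)] = treated.stat().st_size - orig.stat().st_size
    return identical, changed, missing
```

A negative value in the changed dictionary means the treated file is smaller than the original; a value of zero means the size is the same but the contents differ.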
I listened to at least the opening few seconds of those 231 changed files. All but three of the files appeared to be playable. The exceptions were the three flagged by Checkmate and MP3-Check (above). I was not surprised, for two of those three; but as noted above, one of the three was playable in its original form. Now, however, I discovered that MP3val had made it unplayable. I also saw that MP3val had converted all three of those files to a size of zero. Those were the only files whose changed size was zero. As noted above, these were the only files to which MP3val assigned its J code. It thus tentatively appeared that an MP3 repair tool might provide a better solution than MP3val’s own repair procedure, at least for files with a J code.
There were, altogether, 11 files whose size had dropped by more than 10KB due to MP3val’s labors. Those 11 included two of the three files, just mentioned, whose sizes had been reduced to zero. (The third one was a small file; its reduction to zero did not involve a loss of many bytes.) By comparing the information provided for the remaining nine files in Windows Explorer > Properties > Details tab (or, in one case where no Details were available, by using MediaTab), I saw that the playing duration of those files had been reduced by one second for seven of them. I listened to the one whose size had been reduced the most, by 20,563 bytes, and whose duration had been shortened by one second. I did not hear any skipping, truncation, or other obvious defects. Both the original and the changed version appeared to have the same metadata (e.g., name of performer, location of their website). Windows Media Player did not display any artwork with either the original or the changed version of that file. I guessed that MP3val had probably found and removed junk data that had nothing to do with the song. I was aware that steganography allowed the concealment of data within JPG files; now a search informed me that this was possible in MP3s as well. Several of these MP3s came from the same musical band, so possibly this had something to do with their style. I didn’t attempt to figure out precisely what data had been there before MP3val did its work.
Repairing MP3s with MP3 Diags
Unlike MP3val, MP3 Diags did not offer one simple choice: either run the command to fix the MP3s, or don’t. The MP3 Diags User’s Guide offered some wise suggestions, but it seemed to assume the user would have the time and knowledge to play around at length — offering, for instance, a set of 14 steps describing what the developer did when he applied MP3 Diags to his own files. As one example from that list of steps, item 4 said, “I look at the ‘All notes’ note list and if I see anything unusual I apply a filter . . .” — which made sense, except that I wouldn’t necessarily know what’s unusual.
Another page in the Guide said, “It’s a good idea to delay any changes until they seem to be needed.” This raised the core question for my whole project: what if I am trying to test a bunch of MP3s that I rarely listen to, but that I would like to have working if I ever do decide to listen to them? Not that I blamed the developer for his philosophy; it was good that the world had this free tool. It was just frustrating to come so far and still not have a reliable way of knowing which MP3s needed to be fixed, other than the clear indication, from Checkmate and MP3-Check especially, that two of my files were nonworking.
The Guide said that MP3 Diags offered “20+ transformations.” To see the list of available transformations (i.e., repairs or changes), I tried clicking on the first of the five hammer icons near the upper left corner of the screen. That gave me a list of 17 choices (e.g., “Remove broken streams”; “Repair VBR data”). The Guide clarified: “Those that seemed quite unlikely to be needed are hidden” by default. To change what was visible in that list, I could go into Configuration > Visible Transformations tab. There, I saw that the program offered another 14 transformations, bringing the available total to 31. In Configuration > Custom Transformation Lists tab, I also had the option of changing the list of transformations that would run if I clicked on one of the four numbered hammer icons. Among those four numbered hammer icons, the list provided by No. 4 seemed the most comprehensive, with 13 of the 17 default transformations.
Choosing any of the hammer icons produced this warning:
Along the same lines, I saw that the Guide offered these four tips:
- You shoudln’t [sic] blindly use the default lists. Try to understand what each transformation does and how it may affect your files, then perhaps go to the configuration and make some changes to the lists so they better fit your particular needs.
- Review the order in which transformations get applied. Sometimes it doesn’t matter but usually it does.
- Look at what the other transformations are doing, and change the lists to better suit your needs. Then test your changes.
- The same transformation may be added multiple times, but usually there’s no need for such a thing.
The Guide further said,
[U]sually it’s better to look at all the notes instead of focusing on any particular one, because many times one issue can be seen as the “real problem”, but it causes several notes to be shown, and focusing on the “wrong one” might lead to confusion.
Further, the MP3 Diags homepage said,
[T]his program is not really meant for those who want to push a button and have everything fixed automatically, but rather for people who have some technical background, who are rather picky about what’s in their MP3 files, and who are willing to spend some time to get them right. Automatic fixing of non-ID3V2 issues can be done using the 4th custom transformation list, which is as close to “fix all” as MP3 Diags can get, but some may not be so happy with it.
If you want to use those parts of MP3 Diags functionality that alter existing files, it is highly advisable to back up your files first, because there is a real chance of data loss. It can be caused by both user error and bugs in the program.
Now if you wonder just how big this chance is, I don’t have any answer. As far as I know, there are no bugs that may cause data corruption, but there may be some that I don’t know about. I didn’t lose any data in my own files, but you’ll have to try it for yourself.
So I was beginning to get the message. The developer was emphasizing that MP3 Diags could seriously screw up files. He knew what went into the program; he was positioned to know what he was talking about. He, himself, had not lost data, but that’s not to say he hadn’t had to restore an accidentally damaged file from the backups he kept emphasizing. He kept saying that the program was not designed for one-click file fixing, and that those who used it that way might not be happy.
The developer’s warnings pushed me toward certain conclusions. One was that I was going to do exactly what he advised against: I was going to run an automatic repair, using every one of those 31 available transformations, against a clean copy of my test set of 5,119 MP3 files. The other was that I was not going to have the knowledge, or invest the time, to do the kind of perfectionist tinkering he was talking about. Pending whatever else I might learn or decide in this investigation, it presently appeared that, if I had multiple non-identical copies of a set of MP3s, I might have to keep them all, until I had time to listen to them and manually determine which ones were good — or until some superior MP3 verification program came along. Alternately, there was always the option of concluding that very few MP3s seemed to be seriously flawed, doing one’s best to arrive at a seemingly solid set, and discarding the rest.
In MP3 Diags, I went into Configuration > Custom Transformation Lists. I clicked on the hammer icon for the fourth custom list, and changed it to include all 31 available transformations. (Note: it would repeat some, thereby running a total of 40 or 50 transformations, if I didn’t clear the “Used Transformations” box first.) Back out in the main screen, I clicked that fourth hammer icon. It repeated the red-and-yellow warning shown above, and then confirmed that I wanted to run all those transformations.
The Guide said that MP3 Diags would apply only those transformations that were relevant to a particular file. It appeared, however, that MP3 Diags could get stuck in a loop. When it got to the 42nd file on the list of 5,119, it spent seven hours repeatedly cycling through several steps (especially “Repair VBR data,” “Remove all non-audio streams,” and “Rebuild VBR data”) before I finally clicked Abort. The file in question was not one that had stood out in any previous analysis. In fact, MP3val had made no changes in the file at all. MP3 Diags had identified a half-dozen problems (ab, ad, an, bg, fa, ob), but nothing remarkable. It just seemed to be one of those situations where one dog sees the other dog, and it’s war. The original file, which I confirmed was playable, had a duration of 41:42, but it looked like MP3 Diags was trying to change that, in a way that confused my audio players: IrfanView reported a playing time of 66:43, while Windows Media Player and MediaTab reported durations of only 33:21. I did not try to verify whether MP3 Diags was attempting to whittle the file down to nothing, or was instead trying to expand it to infinity. My guess was that, in fact, there was something not quite right about the file, and MP3 Diags was trying to fix it, and the result might even have been technically superior. Nonetheless, the devotion of seven hours to perfection of a single forty-minute 5MB recording of spoken text did seem less than perfect.
When I restarted the MP3 Diags fix process, it started the same process all over again. I had to remove that file from the set. With that out of the way, it marched on down the line to the 83rd MP3 on the list. Here, again, after ten minutes of looping through those same several steps (e.g., “Rebuild VBR data”), I had to abort, remove the file from the folder, and rescan the list. The same thing happened again, with the new 83rd MP3 on the list.
It was starting to look like a long road to 5,119. I replaced the files that had been changed, so that I had a fresh new test set of 5,119 MP3 files, and looked at the transformations I had selected. The color coding mentioned in the User’s Guide, visible on the main MP3 Diags screen, was not used here, so I couldn’t tell which of these transformations would be most significant; perhaps there was no necessary connection between the seriousness of a diagnostic code and the type of transformation that could fix it. I noticed that “Repair VBR data” was included in the program’s first default custom transformation list, so presumably the developer felt relatively confident of that one. I didn’t see “Rebuild VBR data” or “Remove all non-audio streams” in any of the first three custom lists, and of course I had changed the fourth one, so at this point I had no idea what its original contents might have been. The User’s Guide provided a list of transformation details, and there I saw that “Remove all non-audio streams” would remove Xing and LAME headers and was recommended “if you want to start clean.” In my newly cautious state of mind, that sounded a little radical. I decided to revise my fourth custom list to exclude that one.
Actually, as I was thinking about it, I didn’t know that I would generally want to remove headers of any sort, unless they were problematic, and likewise with other non-audio data (e.g., “Remove composer field”). I revised my custom list to exclude those transformations as well. I also removed the “No change” transformation, as well as the “Pad truncated audio” (about which the developer said, “Its usefulness hasn’t been determined yet – it might be quite low”) and the “Make the largest image ‘Front Cover’” transformations. On second glance, I decided to remove the “Rebuild VBR data” transformation after all: it, too, seemed to involve deletion of potentially valid metadata.
That left me with a list of 21 transformations in my fourth custom list. I rescanned the folder and ran MP3 Diags on it again. This time, it finished the scan in maybe five or ten minutes. This seemed to tell me that, even if MP3 Diags spent just a couple of minutes trying to repair a file, it was probably hung up and needed to start over.
It appeared that the selected transformations had made a difference. MP3 Diags was now presenting a much smaller set of diagnostic codes in the top right part of the screen. Beyond Compare told me that the transformations already applied had altered the large majority of files in the folder. There didn’t seem to be a way to get a count of the total number of code occurrences now displayed in the MP3 Diags screen, but it looked like codes ab, an, dp, and ea were the most common ones left. All were in black print, indicating they were warnings. Regarding those codes,
- I couldn’t do anything about code ab, regarding low-quality audio: the voice recordings most likely to trigger that warning were what they were.
- Code an called for the use of MP3Gain, and I could have taken care of that at this point.
- Code dp wanted me to add track information. I could have used MusicBrainz, KeepVid, or one of the other MP3 tag data sources that came up in a search. Apparently these would help to automate the detection and naming of music MP3s. But (a) I didn’t want my files renamed and (b) non-music recordings (e.g., my recording of a bird I heard on the trail) would tend not to be in those databases.
- Code ea noted the absence of an APIC frame, which is used to store images. The User’s Guide didn’t offer guidance on how to add an APIC frame. A search of the MP3 Diags website yielded nothing. A broader search led to a support page where the developer said there was no transformation for this because there was no way to do it reliably. The solution, he said, was to go into the MP3 Diags tag editor and assign an image manually.
I wanted to get rid of those commonly occurring errors so that I could see more clearly the remaining codes that might call for other fixes or transformations. Clicking on column headings, in the MP3 Diags main window, did not re-sort the listed files according to those that had a particular code. But the User’s Guide informed me that I could filter out these very common codes, so as to highlight the smaller set of files that had other codes. To do this, I had to click on one of the two funnel icons, indicating filters, at the upper left corner of the window: specifically, the yellow “note” filter, not the blue “folder” filter. In that filter, I included all notes, and then excluded the ones just mentioned.
That greatly reduced the number of files listed: I was down to only 105. I was perplexed to see that codes dp and ea were still being displayed, but then I realized that was probably because these remaining files had other codes; the window was showing all codes for all remaining files. I exported the list to a file and analyzed it in my spreadsheet. Code ac (“No MPEG audio stream found”) was the most frequent remaining code, appearing in 72 files. That was interesting. I had not previously used the number appearing at the right end of the filename line in the MP3 Diags exported text, but now I looked at that number. For example, a line in the exported text might read like this:
It seemed that number (1024 in this example) might indicate the size of the file — that, in other words, this MP3 now contained only 1,024 bytes. Most if not all of these files with code ac had that value of 1024 at the right end of their lines, in the exported text. I tried playing several of the changed files. IrfanView reported that they had a length of zero seconds. This appeared to be an example of what the MP3 Diags developer was talking about, when he said that MP3 Diags was capable of damaging files.
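Files gutted in this way could be spotted without opening MP3 Diags at all, simply by listing the MP3s whose size is implausibly small. The sketch below assumes, as I guessed above, that the damaged files are the ones reduced to about 1,024 bytes; the function name and the default threshold are my own choices, not anything MP3 Diags provides.

```python
from pathlib import Path

def find_suspiciously_small(folder, max_bytes=1024, pattern="*.mp3"):
    """List (relative name, size) for files no larger than max_bytes."""
    folder = Path(folder)
    hits = []
    for path in sorted(folder.rglob(pattern)):
        size = path.stat().st_size
        if size <= max_bytes:
            hits.append((str(path.relative_to(folder)), size))
    return hits
```

Running this on the original set as well as the treated set would show whether a tiny file was already tiny before the transformations, or became so afterwards.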
Aside from the handful of codes just discussed (i.e., ab, an, dp, ea), the files marked with code ac appeared to have no other problems. In my spreadsheet, after removing the ac files, there were only 32 files left. Of these, 15 were marked ad (“VBR with audio streams other than MPEG1 Layer III might work incorrectly”). To try fixing that, I clicked on the leftmost hammer icon, at the top of the MP3 Diags window, and from the drop-down list, I chose “Repair VBR data.” I got an error message:
The file “D:\Filename.mp3” seems to have been modified since the last scan. You need to rescan it before continuing.
I exited that and clicked on the gear icon at the top left of the MP3 Diags window to do the rescan, not checking the box that offered to “Rescan files that seem unchanged” so that the scan would run faster. I re-ran the filter and retried the “Repair VBR data” transformation. There still seemed to be a bunch of files with code ad. Maybe the window wasn’t updated? I rescanned and refiltered. No, the ad codes were still listed after a rescan, even if I checked the box to “Rescan files that seem unchanged.” I tried the “Rebuild VBR data” transformation instead. That didn’t do it either. I could have tried one of the other transformations (e.g., to remove unknown, broken, or unsupported streams). But, as we had just seen, some MP3 Diags transformations could ruin MP3s.
So now I had eliminated or given up on a number of codes: ab, ac, ad, an, dp, ea. The next most common code was cb (“VBRI headers aren’t well supported by some players. They should be replaced by Xing headers”), appearing in eight of my 5,119 MP3s. MP3 Diags didn’t seem to offer a transformation directly targeted on that particular issue. I went into Configuration > Custom Transformation Lists. The full descriptions suggested that what I needed, here, was the “Remove inner non-audio” transformation. That one was available in the hammer icon’s drop-down list. I ran it, rescanned all files, and reapplied the filter. Code cb appeared to be gone.
Now I was down to eight codes (i.e., aa, ag, ah, aox, ba, bd, bh, ec), each occurring no more than five times in my set of 5,119 MP3s. Collectively, these codes affected a total of 10 files. I could have continued to explore solutions to those codes, but I seemed to have the basic idea and, after all, this was only a practice run.
Review of MP3 Diags Results
It seemed to be time to do a general assessment of what MP3 Diags and I had achieved with these 5,133 files. I turned off the filter and exported the current state of MP3 Diags in text form. I put the exported file into my spreadsheet and compared it to the spreadsheet showing the errors found in the original, unaltered set of 5,133 files. The comparison demonstrated that the situation was improved in 39 of the 46 codes that had applied to one or more of the original files. For example, there were 22 occurrences of code aa before I took the foregoing measures with MP3 Diags, but only three occurrences afterwards. In contrast to those 46 codes found in these files before the MP3 Diags improvement effort, only 14 codes were found in these files afterwards. Other parts of the analysis of what MP3 Diags had achieved previously (above) also applied here. For instance, MP3 Diags achieved considerable success with some codes (such as code aa) but only moderate to minor success with other codes.
The situation was notably worse with respect to just two codes: dp and ea. While there were few if any occurrences of these codes before my efforts, there were nearly 5,000 occurrences of each afterwards. Both of these involved the absence of ID3v2 tags, which were probably scarce in these files to begin with and had apparently been stripped out in my tinkering. Even then, there was a tradeoff: the increase in these two codes was accompanied by the virtual elimination of codes fa and ob, each of which had been present in nearly 5,000 files before I began playing. (The frequencies of codes ab and an remained almost unchanged, each being present in more than 4,000 files before and after.)
There was, however, the problem that, with MP3 Diags, I had managed to reduce a number of my files to 1,024-byte stubs with a playing length of zero seconds; that is, I had wiped out their audio contents. That would presumably have been indicated by code ac (“No MPEG audio stream found”). And, in fact, there were 72 occurrences of ac, both before and after my tinkering, and they occurred in the same files. The problem, as I verified by some sample listening, was that before my tinkering, those files were playable. MP3 Diags may have felt that they lacked an MPEG audio stream, but they did have an audio stream of some sort. It seemed that the post-tinkering results should have produced 72 occurrences of kd (“File contains null streams”), but in fact there were ten occurrences of kd before I went to work, and zero occurrences afterwards. (Sampling indicated that those ten remained playable both before and after my tinkering.)
Undeniably, I had done damage in my reckless use of MP3 Diags. The point here is that certain key diagnostic codes seemed to be reporting problems that were either relatively minor or nonexistent, while failing to report major problems. As just illustrated, when I exported the results of the last post-transformation MP3 Diags scan after I had finished fooling around, I expected to see at least 72 more occurrences of some code that would alert me to the fact that I now had 72 more files containing no playable audio data. That warning did not seem to be present in the final report from MP3 Diags.
With additional experience, of course, I might come to understand what I was doing wrong, or might find some explanation or workaround for these results. Based upon numerous hours of working with MP3 Diags, however, and with what seemed to be a reasonable interpretation of the information that I had encountered thus far, I did not feel that I could trust MP3 Diags to alert me to potentially major problems with my MP3s. I had no doubt that MP3 Diags provided many accurate and helpful diagnoses, and that it offered valid fixes to many problems. It seemed likely that careful use of those fixes by an experienced user, with much testing and also with, perhaps, the investment of time to listen through the full length of repaired MP3s (and always keeping backups), could ultimately improve a user’s collection of MP3s. That seemed to be what the developer had been saying, and now I believed him.
The foregoing findings raised the question of whether I could rely on MP3 Diags, for purposes of identifying MP3s with potentially major problems. To explore that, I compared the MP3val reports from the original test set of MP3s against this set that MP3 Diags had changed. With a copy of mp3val.exe in the test folder, I used the same command as before:
mp3val *.mp3 -lMP3val.log
I pasted the contents of the resulting log files into spreadsheets and compared them. This comparison of MP3val before-and-after analyses indicated that, before using MP3 Diags, there were no occurrences of three MP3val diagnostic codes (i.e., H, M, O). After using MP3 Diags, two of those codes occurred in seven files. The MP3val analysis indicated that several other codes (i.e., A, B, G, N, P) occurred in between one and three files before I ran MP3 Diags, but in at most one file afterwards. The analysis further indicated that MP3 Diags achieved reductions of 97% to 100% in occurrences of a half-dozen other codes (i.e., C, D, E, F, I, K). Except for code F, which MP3val found in 4,702 files, those codes occurred in between 30 and 148 files before the treatment.
That left two codes whose numbers increased after the MP3 Diags treatment. Before the treatment, code J (“Too few MPEG frames (it’s unlikely that this is a MPEG audio file)”) occurred in only the three problematic files identified by various programs (above), and code L (“Wrong CRC”) occurred in 30 files. After the treatment, J occurred in 72 files, and L occurred in 71 files. They were not the same files; J and L occurred in completely different sets of files. As noted earlier, J seemed especially useful for spotting nonworking files. The 72 files flagged by J here were the same as the 72 flagged by MP3 Diags’s code ac after my MP3 Diags tinkering. Unfortunately, as noted above, MP3 Diags had also given the ac code to those 72 files before my tinkering, whereas MP3val had given the J code to only three. So, again, it did not appear that MP3 Diags had a code that would usefully focus on problematic MP3s that had not yet been ruined.
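The jump in code J from three files to 72 is the kind of change that a set difference makes easy to see: take the files flagged with a given warning after treatment, and subtract the files flagged with it before. The sketch below assumes the same MP3val log line layout that I saw in my own logs (WARNING: "filename" (optional offset): message), matches files by base name only, and uses hypothetical function names.

```python
import re

# Assumed shape of a warning line in the MP3val log (an assumption, not a spec).
WARNING_RE = re.compile(
    r'^WARNING: "(?P<path>.+?)"(?: \(offset [^)]*\))?: (?P<msg>.*)$'
)

def files_with_warning(log_text, fragment):
    """Base names of files having at least one WARNING that contains fragment."""
    names = set()
    for line in log_text.splitlines():
        match = WARNING_RE.match(line.strip())
        if match and fragment in match.group("msg"):
            # Strip any folder part, whether written with \ or /.
            names.add(re.split(r"[\\/]", match.group("path"))[-1])
    return names

def newly_flagged(before_log, after_log, fragment):
    """Files that carry the warning after treatment but did not carry it before."""
    return files_with_warning(after_log, fragment) - files_with_warning(before_log, fragment)
```

Calling newly_flagged with the two log texts and the fragment "Too few MPEG frames" would list the files that picked up code J only after the MP3 Diags treatment, which are the ones most worth restoring from backup.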
The foregoing results seemed to confirm that MP3 Diags could achieve significant reductions in the numbers of files flagged by MP3val. The reductions were largely similar to those achieved by MP3val itself (above). Aside from J, L, M, and O, the only significant exceptions were with codes F and K, which MP3val itself was almost invariably unable to fix but MP3 Diags almost completely eliminated. This outcome suggested that MP3val might be sufficient for most fixes, and that MP3 Diags would provide a different level of capability, for those with time and expertise sufficient to go that additional distance.
As noted above, another way of investigating what MP3 Diags had done was, optionally, to run DoubleKiller (to eliminate files that remained unchanged in the before and after file sets), and then use a DOS command (dir /a-d /s > dirlist.txt) to produce lists of files that I could compare in Excel. That approach was not promising in this case because, as DoubleKiller quickly reminded me, my use of MP3 Diags had changed every MP3 in the output folder.
DoubleKiller also highlighted the fact that I had a bunch of new duplicates because a number of my files had now been truncated to a size of 1KB (i.e., 1024 bytes, above). This raised the question of whether the duration had changed for any other files. Another post describes the steps I took to figure that out, exploring ffprobe but ultimately choosing a MediaInfo command to produce the desired duration information for the original and modified sets of MP3s. As indicated in that other post, MediaInfo reported no duration information for some files. Those included the files that my MP3 Diags tinkering had destroyed. For several dozen MP3s, MediaInfo was not able to calculate duration for the original files, but was able to do so for the post-transformation version of those files. In those cases, it appeared that MP3 Diags had made the files more recognizable to MediaInfo. (Both the pre- and post-transformation versions of these files were playable.)
My spreadsheet calculated pre- and post-transformation duration for the vast majority of MP3s in the test set; and among those calculations, duration rarely changed. There were no increases in duration. MediaInfo reported reductions in duration, apparently caused by the MP3 Diags transformations, for about 2% of the test files. Most of those reductions were measured in milliseconds; only three were longer than one second. I had played parts of those three previously: they had been flagged by some other tool or procedure, though at this point I was not sure which one. I verified once again that they were still playable. Generally, it appeared that MediaInfo could provide useful information about large numbers of files, and that, as such, it could be a useful supplement to other MP3 tools.
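For readers who would rather script the duration comparison than build it in a spreadsheet, the logic might be sketched as follows. This is an illustration only. It assumes the durations have already been extracted (by MediaInfo, ffprobe, or otherwise) into two mappings of filename to seconds, with None standing for files where the tool reported no duration. The function name and the one-millisecond tolerance are my own choices.

```python
def compare_durations(before, after, tolerance=0.001):
    """Compare per-file durations (in seconds) from two runs.

    `before` and `after` map filename -> duration, or None when the
    reporting tool gave no duration for that file. Only names present
    in both mappings are compared.
    """
    report = {"lost": [], "gained": [], "shorter": [], "longer": [], "same": []}
    for name in sorted(before.keys() & after.keys()):
        old, new = before[name], after[name]
        if old is None and new is None:
            continue  # no duration either time; nothing to compare
        if new is None:
            report["lost"].append(name)      # duration readable only before
        elif old is None:
            report["gained"].append(name)    # duration readable only after
        elif new < old - tolerance:
            report["shorter"].append((name, round(old - new, 3)))
        elif new > old + tolerance:
            report["longer"].append((name, round(new - old, 3)))
        else:
            report["same"].append(name)
    return report
```

The "lost" list would correspond to files like those that my tinkering destroyed, and "gained" to files that became recognizable only after transformation. The "shorter" list carries the size of each reduction, so the few reductions longer than one second are easy to pick out.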
* * * * * * * PART TWO: APPLICATION * * * * * * *
Part One of this post describes my detailed exploration of various ways of improving and testing MP3 playability. This Part Two attempts to distill a streamlined, working approach to the question of whether one’s MP3s have major problems.
For this purpose, some context may be helpful. In my attempts to find and repair damaged MP3s, I was pleased to see that very, very few of my MP3 files seemed to have serious problems. A person might fear otherwise, after seeing the number of diagnostic codes that the MP3 Diags tool might give to a single MP3. And I could not be absolutely positive that my MP3s were in great shape unless I listened carefully through all of them — which could be time-prohibitive even with a substantial number of MP3 song files, never mind albums, concerts, lectures, and other long recordings. Tests with a number of different tools, described in Part One, did seem to keep coming back to the same few seriously damaged files. So, in that sense, it could be pretty easy to run a scan and identify the (probably) few files that deserved a full listen. We will be covering that, but this Part Two goes beyond that, to distill what seem to be the best methods for running a variety of tests and repairs on MP3 files. In some regards, the following material expands upon certain information presented above.
The first step in MP3 analysis is, of course, backup. For present purposes, two different kinds of backup seem advisable. First, every user should have a standard backup system that regularly captures the current state of one’s files. I have my own backup system; there are many others. Backup should include an offsite copy of one’s files, presumably on encrypted hard drive(s) or optical disc(s), stored at least sometimes in the car or garage (away from sun and temperature extremes) or, better, at a friend’s house or in a safe deposit box.
In the case of these MP3s, another kind of backup seemed advisable. Since it was not feasible to obtain bulletproof certainty that every file in my primary set of MP3s was in good condition, it seemed advisable to pull together a pool of alternative copies. Suppose, for example, that I had a file called ABC.mp3, and I also had another file called ABC-Backup.mp3. Maybe ABC-Backup.mp3 came from an old backup disk. Suppose, now, that I believed that ABC.mp3 and ABC-Backup.mp3 were virtually identical copies of the same file. Could I get rid of ABC-Backup.mp3? My feeling, from this exploration, was that the state of the art was not settled enough to be entirely confident of what I really did have. The safer route seemed to be to park ABC-Backup.mp3 on a separate drive, or maybe on a Blu-Ray or DVD disc, after renaming it (maybe adding “DUP” to its name or extension) or adding a read-me file, to make clear that this file, and the others accompanying it, seemed to be mere spare copies. That way, if I found someday that my main copy of ABC.mp3 had a problem, I could try this alternate.
The problem, there, was not with byte-for-byte duplicates. Years of experience with DoubleKiller had persuaded me that it (and no doubt other similar tools) could detect and remove those. The problem was with files that had more or less the same names, sizes, and dates, but could not be verified to be virtually identical unless I took the time to listen through them. To use less disk space, I could reduce similarly named WAVs to MP3 (preferably of a high bitrate, so as not to lose quality). But ultimately, with my present level of experience and the tools available, it did seem best to keep these alternate copies on a separate disk, with clarifying notes or filenames.
That would leave me with my primary set of MP3s. These were the ones I would want to test and, where necessary, replace from the secondary set, whenever I found a bad MP3 in the primary set. The question here was, how should I proceed to test them?
The first question, in that regard, was whether I would leave them where they were, or would instead combine (or make copies of) them into one folder. A decision on that would probably depend on a choice among the options described in the next section (below). Generally, though, if I intended to compare different tools or procedures, or if I planned to do anything that would change the files (as distinct from merely testing them), I would probably want to use separate Changed and Original folders, each containing copies of the files in question.
There were various ways of copying, or moving and then restoring, the files to be tested. For instance, I could easily put copies of various MP3s into a separate folder by using the Everything file finder (searching for *.mp3) and then pasting the results into a single folder. For the pasting step, I would use a file manager like Q-Dir rather than Windows Explorer, because it would allow files with the same names, coming from two or more different folders, to coexist within a single folder by renaming them as File (1), File (2), etc. Note that Everything would balk if pathnames were too long.
Alternately, if I wanted to work with files in their original folders — or if I wanted to detect their original locations, move them to a single folder, and then return them to their original locations — I could use the DIR command or Everything to export a list of files, and use that list (with Excel or some other spreadsheet program) to create a batch file issuing the relevant commands for large numbers of files. Another post describes relevant steps, using the particular example of an effort to test MP3s by converting them to WAVs.
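As an alternative to the spreadsheet-generated batch file, the same collect-and-return idea could be sketched in Python. This is not the method described in that other post. The function names, the manifest file, and the File (1) naming are my own assumptions, chosen to mimic the behavior described above.

```python
import csv
import shutil
from pathlib import Path


def collect(source_root, test_dir, manifest_path, pattern="*.mp3"):
    """Copy matching files into one folder and record where each came from."""
    test_dir = Path(test_dir).resolve()
    test_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    for source in sorted(Path(source_root).resolve().rglob(pattern)):
        if not source.is_file() or test_dir in source.parents:
            continue  # skip folders, and files already in the test folder
        target = test_dir / source.name
        counter = 1
        while target.exists():  # same name from another folder: File (1), File (2), ...
            target = test_dir / f"{source.stem} ({counter}){source.suffix}"
            counter += 1
        shutil.copy2(source, target)
        rows.append((str(target), str(source)))
    with open(manifest_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    return rows


def restore(manifest_path):
    """Copy each test-folder file back over its original location.

    This overwrites the originals, so it should be run only after backup.
    """
    with open(manifest_path, newline="", encoding="utf-8") as handle:
        for target, source in csv.reader(handle):
            shutil.copy2(target, source)
```

The manifest is a plain two-column CSV (test copy, original location), so it can also be opened in Excel to check what would be put back where before running restore.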
For my own real files, I followed the approach I took with the test sets discussed in Part One: I used Everything to copy them into a test folder, and made a copy of that folder for backup and comparison.
Positioning MP3 Diags
In deciding how to proceed with my files, one question was whether I would be willing to use the command line, rather than be limited to graphical user interface (GUI) tools. In many computing situations, it is possible to do more with the command line. In the world of MP3 testing and repair, MP3 Diags was a key exception, offering a variety of tests and repairs in a relatively understandable and usable graphical form.
There was also the question of how much time I intended to devote to the process of testing and repairing MP3s. On one extreme, some users would want perfection in their MP3s, and would be willing to spend the requisite time and effort. On the other extreme, some users would want to press one button and let everything be fixed – and if it took much more than that, they would not be interested.
My experience, supported by the User’s Guide, was that MP3 Diags was intended much more for the perfectionist audiophile than for the casual user. This meant that the program could do things that possibly no other program could do; it also meant that those who did not know how to do those things properly might damage their MP3s.
I did have some concerns about MP3 Diags. As far as I could tell, it had failed to focus specifically on the most problematic files, or at least it did not appear to have an error code that would identify those files. An experienced user might have known the explanation behind that seeming failure. It might be, for example, that I needed to search for, not one error code, but a certain combination of codes. Yet this would tend to point toward the same conclusion: that I, personally, could not count on getting good results with MP3 Diags. Beyond that, I didn’t blame the program for the files that were ruined when I used the program recklessly – but I did conclude, from that episode, that I should hesitate to use MP3 Diags to repair MP3s unless I was going to invest the time needed to use it correctly.
Based on these considerations, I decided that MP3 Diags could play a useful supplementary role. It could provide diagnoses that would confirm or challenge the findings of other programs; it could highlight minor (or possibly major) issues that I might want to address at some point. If I did reach a point of caring more about tags than about basic playability, or if for any other reason I wanted or needed to repair technical flaws in my MP3s, it might prove irreplaceable.
With MP3 Diags largely set aside, I felt there was really no substitute for MP3val. In that program and others, I also found that the command line was far more useful than the GUI. For purposes of identifying unplayable files, then, my plan was to look for code J in the MP3val results, and to compare those results against what Checkmate and MP3-Check told me.
MP3val did not offer an option of identifying just those files with a specific code, so I would need to run its output through my spreadsheet, in order to detect files flagged with code J. Now that I had refined my spreadsheets, this would not be too difficult.
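A script could do a rough version of the same filtering. The following sketch is not a substitute for the spreadsheet, and it rests on assumptions that should be checked against an actual log: that each relevant line starts with INFO, WARNING, or ERROR, followed by the filename in double quotes, an optional offset in parentheses, and then the message. The letter codes (J, etc.) come from my spreadsheet, not from MP3val itself, so the sketch instead counts distinct WARNING and ERROR messages per file as a stand-in.

```python
import re
from collections import defaultdict

# Assumed line layout: SEVERITY: "filename" (offset ...): message
LINE = re.compile(r'^(INFO|WARNING|ERROR): "(.+?)"(?: \(offset [^)]*\))?: (.*)$')


def summarize_log(lines):
    """Group log lines by file: severities seen, and distinct non-INFO messages."""
    summary = defaultdict(lambda: {"severities": set(), "messages": set()})
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # progress lines and anything else not in the assumed layout
        severity, filename, message = match.groups()
        summary[filename]["severities"].add(severity)
        if severity != "INFO":
            summary[filename]["messages"].add(message)
    return dict(summary)


def flag_files(summary, min_messages=4):
    """Return files with any ERROR line, or with many distinct problem messages."""
    flagged = []
    for filename, entry in summary.items():
        if "ERROR" in entry["severities"] or len(entry["messages"]) >= min_messages:
            flagged.append(filename)
    return sorted(flagged)
```

To try it, one would read the log into a list of lines (for instance with open("MP3val.log", errors="replace"), where the file's encoding is another thing to verify) and pass that list to summarize_log, then pass the result to flag_files. The threshold of four messages is arbitrary and can be changed.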
As another way of identifying seriously troubled files, I could use spreadsheets to parse the output of the DIR command and of MediaInfo and/or ffprobe, as described in the other post. In that analysis, I would look especially for files of zero length or 1024 bytes in size. MediaInfo had proved imperfect, producing no data for some files, so possibly I would want to try ffprobe first, even if its two-line output would require a bit more spreadsheet tinkering to get a usable report. It was worth remembering that these two tools were capable of producing many other bits of information, such that a person interested in more detail could choose to become expert in them rather than in MP3 Diags.
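The size check, at least, does not need a spreadsheet. A minimal sketch, assuming that anything at or below 1024 bytes deserves a look (the function name and the threshold parameter are mine):

```python
import os


def find_suspicious_sizes(root, max_bytes=1024, extension=".mp3"):
    """List (path, size) for files at or below max_bytes, including empty ones."""
    hits = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if name.lower().endswith(extension):
                path = os.path.join(folder, name)
                size = os.path.getsize(path)
                if size <= max_bytes:
                    hits.append((path, size))
    return sorted(hits)
```

This catches both the zero-length and the 1024-byte cases in one pass through the folder tree. It says nothing about larger files that are broken in other ways, so it complements the duration check rather than replacing it.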
To run these programs, of course, I would need to add their folders to the computer’s PATH, or just put the executable (e.g., mediainfo.exe, ffprobe.exe, or mp3val.exe) in the Test folder where I would be examining and possibly changing files.
Now it was time to run the MP3val analysis on the files I had collected. Unlike the test situation in Part One, these were copies of files that I actually cared about. I ran the same command as before:
mp3val *.mp3 -lMP3val.log
I copied the contents of MP3val.log into a copy of my spreadsheet, and used the instructions included in that spreadsheet to highlight the seemingly most problematic files. They were of three types. First, one file was marked with the J code discussed in Part One. Also, a number of files were marked with multiple codes. The one marked with J was marked with seven codes altogether; two others had five codes each, and a dozen had four codes. Further, in a development not observed in Part One, three files were labeled with ERROR (as distinct from WARNING or INFO) messages.
Consistent with my conclusions in Part One, I ran Checkmate and MP3-Check on this set of MP3s. Each flagged the same set of four files: the three that MP3val had marked as ERROR files, and the one that had code J. I had not found that Checkmate was highly consistent with other tools, in terms of the files that it flagged as “Broken,” and I did not further explore that with this set of files. As before, MP3-Check agreed with MP3val and MP3 Diags that the large majority of my files had problems with tags and other less crucial issues. Generally, it continued to appear that these two programs offered a good quick way of finding files that, as other programs would agree, were especially likely to be unplayable, without burying the user in a long list of false positives (i.e., MP3s reported as problematic that actually were playable).
I listened to the start of each of the 18 files falling into those three categories (ERROR, code J, and/or having at least four MP3val codes). All had a reported file size, in Windows Explorer, of greater than 1KB. All played except the three labeled with ERROR. Those that played had two relatively noticeable problems. For some, the file information may have been screwed up, insofar as IrfanView was apparently unable to report their playing duration until about seven seconds into the recording. For others, the problem was that MediaTab reported they were actually WAVs, not MP3s as their filenames suggested. For the three marked as ERROR, IrfanView (unlike Windows Explorer) reported zero file length and would not play, and Windows Media Player crashed when I tried to use it to play them. So in this case, unlike the test in Part One, it seemed that it was MP3val’s ERROR label, not code J, that would identify a broken MP3.
Altogether, then, out of many thousands of audio files in the test set and in this set of “real” MP3s, the relatively extensive exploration in Part One had identified the two unplayable files and one playable but repeatedly flagged MP3 in the test set; using insights from that exploration, I had now identified three other unplayable files (two of which were from the same folder and had been created 12 years earlier); and there was one other unplayable file that I had kept aside for further testing someday. Altogether, that gave me a total of seven really bad files — out of thousands, many of which had been copied from disk to disk over a period of years. It tentatively appeared that MP3 files might contain many imperfections that a program like MP3 Diags could identify, but MP3s generally seemed fairly durable and quite unlikely to become completely unplayable.
Fixing Bad MP3s
As noted in Part One, I had managed to ruin some files by using the MP3 Diags repair tools without sufficient technical familiarity with that tool, and MP3val had rendered one previously playable if problematic MP3 unplayable, and had also wiped out the contents of two unplayable but possibly reparable MP3s. I did not have time to listen through all of the MP3s that these two programs could attempt to repair, to verify that their repairs had not caused any further damage.
So if I wanted to fix bad MP3s, it seemed I might want to target my efforts on the seven bad files I had identified (six unplayable, plus the one that was playable but repeatedly flagged). I collected them into one folder, made a backup, and went to work.
In Part One, I had not found that an ordinary audio editor (e.g., Audacity) could repair bad MP3s, but I had noted free tools that were reportedly able to fix damaged MP3s. I tried this approach with the portable MP3 Repair Tool (whose reported homepage seemed unresponsive). It turned out that MP3 Repair Tool took a very simple approach. As described by Lifewire, I could start by selecting some or all of the files in a folder, and then remove frames from the start and/or end of the file. That’s it. That’s all the tool did. Lifewire suggested starting with removal of zero frames from the start of the file. The program itself explained that this would remove “ID3v2 tags” and “corrupt ID3v2 tags,” which sounded redundant, along with perhaps other extraneous material.
So, OK, I tried that. I navigated to the test folder, containing my seven bad MP3s; I clicked “Select All”; I clicked “Remove 0 frames”; and then “Repair!” This returned an error, for the first file on the list:
Unexpected error occured [sic] while repairing the file [filename]!
The six other files seemed to go OK: I got “Selected MP3 file(s) have been repaired!” I tried playing them. I found that, still, none of them were playable, except the one from the test set that had been repeatedly flagged, which turned out to have low but audible volume.
So I tried again, this time trimming one frame from the beginning of each file, as Lifewire suggested. I got the same messages and the same results. I removed the checkmark from the option to remove additional frames from the start of the files, and instead checked the box to “Remove everything after the last frame of each file.” That produced no error message, just the “Selected MP3 file(s) have been repaired!” But an odd thing happened to the MP3 Repair Tool display: we now seemed to have at least twice as many files in the list as before. Windows Explorer concurred: in addition to my original files, I now had the same filenames with “_repaired” added to the end of each. I tried playing those repaired files. I found that none of them worked. Oddly, one of those files — the one that I had previously set aside, to hold until the day when I would be testing MP3s — originally had a file size of only 1KB, suggesting that there might be nothing left to save; but now, in Windows Explorer, I saw that the allegedly repaired version had a size of 2KB. With the exception of one file that was somewhat smaller after repairs, the others all seemed to be the same size as before.
I concluded that MP3 Repair Tool did not work, at least not for the problems found in these unplayable files. I deleted the test set and replaced it with a fresh test set from the backup folder. I tried opening them in Audacity. It said,
Audacity did not recognize the type of [filename]. If it is uncompressed, try importing it using “Import Raw”.
Audacity was able to read only the one quiet but playable file. Cool Edit 2000 tried to read them all, and gave me options to guess what settings (e.g., 32kbps) might have applied to the original files; but with the default settings, it achieved the same results as Audacity.
I was not able to invest more time in an exploration of MP3 Diags for purposes of this repair project, but I did decide to try the MP3val repair command once more. As discussed in Part One, that command was as follows:
mp3val *.mp3 -f -lMP3val_RepairLog.txt
The log file indicated that MP3val believed it had FIXED three files — the same ones that it thought it had fixed in Part One, and with the same results as before: the length of the files had been reduced to zero (including the one that had been quiet but playable), leaving them nonworking and incapable of being repaired by any future MP3 repair tool.
* * * * * * * CONCLUSION * * * * * * *
This post describes an extensive exploration and testing process. The purpose of this effort was to learn more about detecting and repairing MP3 files, especially those that were unplayable.
My travels, during this exploration, did not suggest that the world is eagerly seeking the best and easiest way of repairing MP3s. That seemed consistent with my finding, which was that, among thousands upon thousands of MP3 files, very few would be so seriously damaged as to be unplayable.
That finding raised the question of what to do about my original mission, which was to decide whether to dispose of older copies of those MP3s — not identical backups, but rather versions that might differ only slightly from my primary set. It presently appeared that I might never need any of them, as the older set did not include copies of the handful of unplayable MP3s I found in this research.
Although I did not explore tag editing and repair in much detail, the foregoing impression seemed to apply there as well. My passing observation was that many users did insert text, images, and perhaps other information into the tags of their MP3s, but that few seemed to be running into tag-related problems calling for a tool with the power and complexity of MP3 Diags.
I had no doubt that the errors identified and repaired by MP3 Diags, MP3val, or other tools would be important to some users. They might be important to me too, if I had noticed more problems with my MP3s. I did not seek, and thus may have overlooked, a tutorial that might have given me a better understanding of the significance of various problems that these tools might identify, and of the repairs they might make.
For my purposes, it seemed that these programs could have more beneficially focused upon developing reliable means of detecting and repairing the most serious MP3 problems. Unless I was willing to invest the time to learn the arcana of MP3 Diags — and even then, in my reading of the developer’s comments — I could readily find, as I did in fact find, that both that program and MP3val could actually make damaged files worse.
While exploring MP3 Diags and MP3val, I looked at a handful of other commonly mentioned MP3 testing and repair programs. I found that most were unreliable and/or unhelpful for my purposes. The principal exceptions were Checkmate and MP3-Check. Those programs might provide results that were not always consistent with other programs or helpful for my purposes. But they did seem consistent with each other, and with MP3val, for purposes of identifying the most severely damaged files. A person could run them quickly and easily to get a seemingly good sense of which, and how many, files were bad. I might have gotten the same thing from the MP3val GUI, but I wasn’t sure; I hadn’t explored that in much depth. Regardless, I would still go to the MP3val command line for reliable and detailed information.
One unexpected development, in my exploration, was the discovery of MediaInfo and ffmpeg/ffprobe, as command-line (or GUI, in the case of MediaInfo) tools with great power to test (and possibly repair) a variety of media files. Although I did not explore those programs at length, it appeared they might provide at least as much information as MP3 Diags, without the frequently helpful MP3 Diags GUI but also without some of that program’s peccadilloes.
I found that MP3 Diags and MP3val were much clearer and more useful when I put their log results into a spreadsheet. I developed spreadsheets, with instructions, for both of those programs’ output, and also for MediaInfo output. Those spreadsheets may be useful for users interested in more detailed analyses of such output. All three spreadsheets are combined in the file, “MP3 Analysis Spreadsheets.zip,” available for download at Kiwi6, Dropbox, and Google Drive. Please let me know of any download problems.