As indicated in another post, I was trying to use MP3 Diags to identify problematic MP3 files. This post describes the steps I took in that effort. This is an unfinished inquiry; these notes are for the benefit of those who are wrestling with MP3 Diags as I did.
This discussion includes suggestions that may be of interest to the developers of MP3 Diags and other similar programs. As always, pointers to other related tools are welcome, as are comments in general. It goes without saying that I knew MP3 Diags was free, and I greatly appreciated such volunteer efforts. These suggestions are offered as just that — suggestions, not demands or expectations.
So let us proceed. I had downloaded and installed MP3 Diags 1.0.12.079. As indicated in the User’s Guide, the program would scan a designated file or folder (or drive partition) and produce a list of MP3s, along with an indication of errors appearing in those MP3s. In my version, the program was able to identify a total of 108 different kinds of problems.
Those errors appeared to range in importance. In the long list of MP3s that the program examined on my computer, I was overwhelmed by the number of seemingly troubled files, and the numbers of errors found in those files. I guessed that there were probably simple ways to fix many of those errors, but I didn’t want to become sidetracked with a separate MP3 cleanup project at this point. I just wanted to know which files were in bad shape.
The specific situation was that I had a backup, and I wanted to know whether I should try to restore the backup copies of any of these files. It seemed likely that many of the errors detected by MP3 Diags had been present in the files when first created. In that case, restoring the backup copy would not help anything; it could actually be a step backwards, if it eliminated edits or other improvements that I might have made in the files since the date of the backup. It seemed better to have a specific list of files to restore from backup, and a clear idea of why I was restoring the backup in each case.
First Approach: Eliminating Frequent Errors
At the start, my feeling was that I didn’t care about quality. I knew that some of my MP3s were compressed at a low bitrate. My question was just whether the file was working or nonworking, or at serious risk of becoming nonworking. I didn’t see a page devoted specifically to critical file errors in the Table of Contents for the MP3 Diags User’s Guide, nor a discussion of which errors were most important for basic file functionality (and a search was not helpful), but I was able to get the detailed description of each of the 108 possible problems by clicking on the All Notes button in the program’s center (file information) pane.
Upon reviewing those descriptions and the scan results, I decided upon certain adjustments. First, I decided to exclude errors that, according to MP3 Diags, were pervasive throughout vast numbers of MP3s. As a practical matter, those problems did not seem to have interfered with my use of these MP3s. So I excluded error codes ab, an, fa, ha, and ob. I tried to achieve that exclusion by clicking on the funnel-and-yellow-object icon, near the left end of the MP3 Diags toolbar (tooltip: “Filter by Notes”), and moving those error codes from the right pane to the left pane of the resulting Note Filter dialog. But a strange thing happened: that Filter icon seemed to toggle among three different states: it would show all 108 error code columns; or it would show the Note Filter dialog; or it would hide the column for code ej, which I had not altered.
I was looking for a way to filter out unwanted information, which in this case would include (a) columns for error codes that I was not interested in and (b) files that exhibited no error codes other than those unwanted ones. The relevant User’s Guide page seemed to say that I was doing it wrong: apparently I should have started by putting all 108 notes into the left pane of the Note Filter dialog, and then should have moved codes ab, an, fa, ha, and ob to the right pane. When I tried that, I could see that it did change something in the display, but it still failed to produce the intended effect. Any way I tried it, I was still seeing the columns for codes ab, an, fa, ha, and ob, and I was still seeing files marked with those codes. It might have helped if I could have sorted the file list by clicking on one error code or another, but that did not seem to be an option.
Going to the bottom of the file list, I saw that, when the filter was in effect, it was dramatically reducing the number of files shown. I guessed that perhaps the program was showing all files that contained at least one of the error codes shown in the right pane. In that case, the pane’s title (“Available notes”) would be a misnomer. So then I would have been doing it right in the first place: put all notes in the right pane; move to the left those error codes that I did not want to use; and the files remaining in the list would be only those that had some other error code (even if the display did irritatingly continue to show the error code columns that I was not interested in). But no, that wasn’t right. Perhaps I had it backwards; I reversed the contents of the two Note Filter panes and tried again, though I thought I had already done that. Nope; still confusing.
At this point, I discovered another possibility: Configuration icon > Ignored notes tab. This tab indicated that four notes (he, hf, oc, and od) were already being ignored by default. Now I added ab, an, fa, ha, and ob to the Ignored Notes list. Now, back in the main screen, with or without the filter running, those columns were gone. That seemed like progress: now, with the filter on, I could see that some other notes were also appearing in large numbers of files. (Here, again, it would have been useful to sort by column, or to see a counter at the bottom of each column or available by tooltip.) I went back to the Configuration screen and marked those to be ignored as well. I was not really sure whether, or why, they might be important.
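The behavior I had been hoping for from the filter can be sketched in a few lines of Python. This is only an illustration of the intended result, not a description of how MP3 Diags works internally: the `scan` dictionary and its file names are made up, and `apply_ignore` is a hypothetical helper.

```python
# Hypothetical scan results: file name -> set of note codes reported for it.
scan = {
    "a.mp3": {"ab", "fa"},
    "b.mp3": {"ab", "ej", "ob"},
    "c.mp3": {"an", "ha", "ia"},
}

# Codes to hide: the four ignored by default plus the five pervasive ones.
IGNORED = {"he", "hf", "oc", "od", "ab", "an", "fa", "ha", "ob"}

def apply_ignore(results, ignored):
    """Drop ignored codes, then drop files left with no codes at all."""
    remaining = {}
    for name, codes in results.items():
        kept = codes - ignored
        if kept:  # keep the file only if some other code is still present
            remaining[name] = kept
    return remaining

filtered = apply_ignore(scan, IGNORED)
```

In this toy data, the first file disappears entirely because it carries only ignored codes, while the other two stay in the list with their remaining codes.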
My mission, in eliminating those codes, was just to get the list down to a point where I was dealing with an analyzable handful of MP3s, not hundreds or thousands. I was in over my head; I was just flailing around, trying to pare this mass of information down to a manageable size. It would have been handy to have an option of producing a report, indicating that 53 files suffered from error xy, telling me what xy meant, and giving me a severity rating, so that I would know whether xy meant unreadability, playback difficulty in some MP3 players, or perhaps a mere absence of perfect orderliness. Based on three different searches, it appeared that nobody had provided a detailed discussion of these error codes, and that it could be difficult even to find a simple explanation of which sorts of errors might be important, and why.
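The kind of report I had in mind might look roughly like the following sketch. It assumes the scan results were somehow available as a mapping from file names to note codes, which is not something I knew how to get out of MP3 Diags. The file names are invented, `report_lines` is a hypothetical helper, and the two descriptions are the ones the program displayed for codes ac and ak.

```python
from collections import Counter

# Hypothetical scan results: file name -> set of note codes.
scan = {
    "talk01.mp3": {"ab", "ak", "ob"},
    "talk02.mp3": {"ab", "fa"},
    "song01.mp3": {"ac", "ab"},
}

# Descriptions as shown by MP3 Diags for two of the codes.
DESCRIPTIONS = {
    "ac": "No MPEG audio stream found.",
    "ak": "Invalid MPEG stream. Stream has fewer than 10 frames.",
}

def report_lines(results, descriptions):
    """Return one line per code: how many files carry it and what it means."""
    counts = Counter()
    for codes in results.values():
        counts.update(codes)
    lines = []
    # Sort by count (descending), then by code, so the order is stable.
    for code, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        text = descriptions.get(code, "(no description available)")
        lines.append(f"{code}: {n} file(s) - {text}")
    return lines

for line in report_lines(scan, DESCRIPTIONS):
    print(line)
```

A severity column could be added in the same way, given some agreed rating for each code.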
I made several more trips back and forth, between the main screen and the Ignored Notes tab, as I continued to eliminate proliferative error codes and home in on the more unusual ones. In the end, the full list of error codes on my Ignored Notes list was aa, ab, ac, ad, ae, ak, an, bc, bg, cb, ea, fa, ha, he, hf, ja, jb, kb, kc, ob, oc, and od. But now I noticed that the codes appeared to be color-coded, for some unknown reason, and that I seemed to have eliminated some important ones along the way. For instance, error ac was “No MPEG audio stream found.” That sort of error could certainly interfere with one’s listening pleasure; not exactly a problem you’d want to ignore. And after all this, I still had a list of more than 200 MP3s flagged with various errors, each guilty of no more than a few sins among the two dozen commandments still inscribed on the stone tablets of MP3 Diags.
Second Approach: Prioritizing Error Codes
It was time to start over. I went back to the Ignored Notes tab. I intended to reset it to default, but now I saw that there was no button to restore the default settings. Fortunately, I had noted (above) that he, hf, oc, and od were ignored by default, so I restored that status quo manually.
While I was there, it occurred to me that there was probably some significance, not only in the color coding, but also in the choice of first letters (e.g., the “a” of aa, ab, ac): errors aa and ab might have something in common, whereas errors aa and ba might not. I tried a search for insight, but came up short. I found a relevant page in the User’s Guide and concluded that the first letter of each two-letter error code had approximately the following significance:
- a: involves MPEG audio streams
- b: involves Lame and Xing headers
- c: involves VBRI headers
- d: involves ID3V2
- e: involves APIC
- f: involves ID3V2.3.0
- g: involves ID3V2.4.0
- h: involves ID3V1
- i: involves broken streams
- j: involves truncated streams
- k: involves null and unknown streams
- l: involves Lyrics tags
- n: involves Ape Items
- o: involves MP3 Diags program errors and limitations
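For anyone wanting to work with these groups programmatically, the list above can be written as a simple lookup table. This is a minimal sketch: the group names are my approximate summaries from the list, and `note_group` is a hypothetical helper, not part of MP3 Diags.

```python
# First-letter groups, as approximately summarized in the list above.
NOTE_GROUPS = {
    "a": "MPEG audio streams",
    "b": "Lame and Xing headers",
    "c": "VBRI headers",
    "d": "ID3V2",
    "e": "APIC",
    "f": "ID3V2.3.0",
    "g": "ID3V2.4.0",
    "h": "ID3V1",
    "i": "broken streams",
    "j": "truncated streams",
    "k": "null and unknown streams",
    "l": "Lyrics tags",
    "n": "Ape Items",
    "o": "MP3 Diags program errors and limitations",
}

def note_group(code):
    """Return the group for a two-letter note code, or 'unknown'."""
    return NOTE_GROUPS.get(code[:1].lower(), "unknown")

print(note_group("ak"))
print(note_group("ob"))
```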
I wasn’t sure what to do with that information, so I turned to the colors. That same page in the User’s Guide said that black was for warnings, red (which looked more like brown to me) was for errors, and blue was for support. By color, the codes were as follows:
- Red (errors): aa, ac, ae, af, ag, ah, ai, aj, ak, al, ba, bc, bd, be, bf, ca, da, db, de, ef, fb, fd, fg, ga, gb, hb, hc, hd, hg, hh, hi, hj, hk, ia, ib, ja, jb, kb, kc, la, lc, na, nb, nc, nd, ne, nf, ng
- Black (warnings): ab, ad, an, bb, bg, cb, cc, dc, dd, df, dg, dh, di, dp, dq, ea, eb, ec, ed, ee, eg, fa, fc, fe, gc, gd, ge, gf, gg, gh, gi, ha, he, hf, ka, kd, oa, ob, oc, od, oe, of, og
- Blue (support): ao, dj, dk, dl, dn, do, eh, ei, ej, ff, gj, lb, ld, le, nh, ni, nj
This was interesting. It seemed that many of the most commonly occurring codes — the ones that I had previously marked to be ignored (see the end of the preceding section) — were mere warnings, not actual errors. So perhaps an analysis focused on actual errors would produce a more manageable list of problematic MP3s.
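As a sketch of that idea, the three color groups can be written down as Python sets, with the warnings and support notes together forming the ignore list. The code lists are copied from the groups above; the `severity` and `files_with_errors` helpers are hypothetical names of my own.

```python
RED = set(
    "aa ac ae af ag ah ai aj ak al ba bc bd be bf ca da db de ef fb fd fg "
    "ga gb hb hc hd hg hh hi hj hk ia ib ja jb kb kc la lc "
    "na nb nc nd ne nf ng".split()
)
BLACK = set(
    "ab ad an bb bg cb cc dc dd df dg dh di dp dq ea eb ec ed ee eg fa fc fe "
    "gc gd ge gf gg gh gi ha he hf ka kd oa ob oc od oe of og".split()
)
BLUE = set("ao dj dk dl dn do eh ei ej ff gj lb ld le nh ni nj".split())

def severity(code):
    """Map a note code to its color-based category."""
    if code in RED:
        return "error"
    if code in BLACK:
        return "warning"
    if code in BLUE:
        return "support"
    return "unknown"

# Everything that is not a red error goes on the ignore list.
IGNORE = BLACK | BLUE

def files_with_errors(results):
    """Keep only files with at least one red code, showing just those codes."""
    return {name: codes & RED for name, codes in results.items() if codes & RED}
```

The three sets hold 48, 43, and 17 codes respectively, which accounts for all 108 notes.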
I tried that, marking all of the black and blue error codes to be ignored, thus leaving only the red error codes to be displayed. But the program was still displaying over 200 MP3s. I noticed that, again, a few codes were responsible for the bulk of the errors. These included ac, ak, ib, and kc. I wondered if those, again, were trivial errors serving only to exaggerate the number of bad MP3s. I listened to one of the files marked with no errors other than those four. It was a long recording of speech, not music. The part I listened to seemed fine. I did not think that MP3 Diags was wrong when it listed about 200 places in the file where it had detected error ak (“Invalid MPEG stream. Stream has fewer than 10 frames.”). But the file was definitely playable. (I was using IrfanView as my player.) I tried listening to a few more similarly marked files. (To play files, double-clicking on the entry in MP3 Diags did not work. Instead, it seemed I had to right-click on the entry, choose the only available option (“Open containing folder”), check back to MP3 Diags to confirm which of the files in that folder was at issue (because none was highlighted), and double-click on it.) Those, too, were playable.
Since those files were playable, it appeared that errors ac, ak, ib, and kc were not identifying serious problems warranting replacement of the file with its backup. Yet here, I thought, was a possibly incorrect assumption. If I understood the MP3 Diags information correctly, it would say that an MP3 suffered from error ak (“Invalid MPEG stream. Stream has fewer than 10 frames”) when what it really meant was that one potentially tiny part of the MP3 had that error. Otherwise, why would the File Info pane for one of my files repeat that error ak had occurred at 200 different addresses within that file? Contrary to one idea I entertained, a single MP3 would apparently not contain multiple streams: a search led, for example, to a VideoLAN webpage indicating that an MPEG video file would contain one video stream and one audio stream. (A search of the MP3 Diags website yielded no hits directly on point, but did lead to a Transformations – Details page ambiguously summarizing the actions that various MP3 Diags repair tools would take.) Apparently what MP3 Diags meant was that it considered the MP3’s entire audio stream invalid because of a single error occurring somewhere in that stream.
It seemed, in this case, that MP3 Diags would benefit from a change in its File Info pane, offering both Summary and Detailed options, where the Summary might indicate how much of the file (e.g., 43%, or 27 seconds) was plagued by such errors. Without that sort of information, it seemed I was looking to the program to tell me which of my MP3s were bad, and the program was looking back at me, asking how I defined “bad.” To get past that logjam, users might ideally have an option of setting a threshold, so that MP3s would be flagged as potentially nonworking (i.e., the need I was trying to address here) only if they met certain criteria (e.g., more than 5% of the file, or 3 seconds, was subject to error ak). Going further, one would ideally click on a specific address in the Detailed presentation in order to go to a waveform displaying that specific address (or to open a related audio editing program at the desired location), so that the user could view the problem directly.
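The threshold idea can be made concrete with a small sketch. It assumes data I did not have in this form: a list of flagged regions as (byte offset, length in bytes) pairs, plus the file's size and duration. Converting a share of bytes into seconds also assumes a roughly constant bitrate. All names here are hypothetical, and the defaults are simply the example thresholds mentioned above.

```python
def affected_bytes(regions, file_size):
    """Count bytes covered by flagged regions, merging any overlaps."""
    covered = 0
    cursor = 0  # end of the last byte range already counted
    for start, length in sorted(regions):
        end = min(start + length, file_size)
        start = max(start, cursor)
        if end > start:
            covered += end - start
            cursor = end
    return covered

def is_suspect(regions, file_size, duration_s, max_share=0.05, max_seconds=3.0):
    """Flag a file if too large a share, or too many seconds, is affected.

    Assumes file_size > 0 and a roughly constant bitrate.
    """
    share = affected_bytes(regions, file_size) / file_size
    seconds = share * duration_s
    return share > max_share or seconds > max_seconds

# Illustrative numbers only: a 1000-byte "file" lasting 100 seconds.
print(is_suspect([(0, 100), (50, 100), (900, 200)], 1000, 100.0))
print(is_suspect([(0, 10)], 1000, 100.0))
```

The first call merges the two overlapping regions and clips the last one at the end of the file before applying the thresholds.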
At present, it appeared that I could not simply disregard errors ac, ak, ib, and kc. If I understood the situation correctly, the question was not whether such errors existed somewhere within an MP3. The question seemed rather to be this: how much of the file is subject to this error? I could not answer that question without manually inspecting the Detail pane for each file, so as to see whether error ak recurred two times, or two thousand times. Even then, I could not tell what a list of 2,000 addresses (presented in the form of e.g., 0xc892 and 0xb8c7d) suffering from error ak would actually mean — because, as I say, that particular file (a noisy recording of ten minutes of speech rather than music) seemed to be playing OK despite its 200 instances of error ak. In short, I could not disregard error ak, not on the basis of my existing knowledge, and that meant that I was looking at a number of potentially problematic MP3s too large to allow manual inspection.
This understanding of the situation suggested that the purpose of MP3 Diags was not exactly to identify errors within MP3s. It was not very well developed for that purpose. Rather, the purpose of the program seemed to be to correct such errors. Ciobi, the developer, had built the program to find everything that was arguably wrong with an MP3 file, not for purposes of informing the user in detail about the nature and extent of that file’s problems, but rather to give the user some incidental information about what MP3 Diags was now going to repair. Far from being incomplete for purposes of giving the user a comprehensive look at the extent of a file’s problems, MP3 Diags was providing vast amounts of information that most users would never look at, on their road to file repair. So it seemed that my strategy of narrowing down the focus to a few particularly important error codes was not compatible with the purpose and function of MP3 Diags.
Third Approach: Repairing MP3s
As noted in the companion post, there were different strategies for identifying bad MP3s. These posts were focused on the use of programs like MP3 Diags, which offered some file analysis capabilities; but there was also the strategy of attempting a file conversion, to see which files would produce error messages in the conversion process. It now appeared that MP3 Diags was implicitly proposing a third strategy: attempt to repair the files, and then see which ones are thus rendered free from error and which ones turn out to be completely unplayable.
That strategy suggested certain thoughts about a given file and its backup. First, again, it would be wise to make a backup and to use a file comparison utility (for similar purposes, I usually used Beyond Compare) to see whether a file differed from its backup before or after using the MP3 Diags repair tools. Also, it seemed that a program like MP3 Diags would ideally offer a comparison on the directory level, listing MP3s from the file directory and its backup side-by-side in two columns (à la Beyond Compare), with additional file-by-file drill-down capability, so that the user could view in detail the ways in which an MP3 differed from its backup.
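For those without a comparison utility, a rough equivalent of the directory-level check can be sketched with Python's standard library. This is a sketch under simple assumptions: the backup mirrors the live folder's layout, only files ending in lowercase `.mp3` are considered, and `differing_mp3s` is a name of my own invention. It reports only whether files differ byte for byte, not how.

```python
import filecmp
from pathlib import Path

def differing_mp3s(live_dir, backup_dir):
    """Return relative paths of MP3s that differ from, or are missing in, the backup."""
    live_dir, backup_dir = Path(live_dir), Path(backup_dir)
    changed = []
    for live in sorted(live_dir.rglob("*.mp3")):
        rel = live.relative_to(live_dir)
        backup = backup_dir / rel
        # shallow=False forces a comparison of file contents, not just metadata.
        if not backup.is_file() or not filecmp.cmp(live, backup, shallow=False):
            changed.append(rel.as_posix())
    return changed
```

Running this before and after a repair pass would show which files the repair tools had actually altered.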
Within the existing limits of MP3 Diags, having made a backup, it appeared that I would now want to go ahead and repair all errors that the program had identified, so as to shrink the list to those files that were truly troubled. Yet here, according to one discussion of MP3 Diags, “It’s not really clear which transformation [i.e., which MP3 Diags error correction tool] corresponds to which problem, so you have to do a bit of educated guesswork.” I thought I might evade that set of unknowns by simply fixing everything that could be fixed.
Then again, that prospect raised a concern. What if I did proceed to alter thousands of MP3s, based on what the program seemed to be recommending? At that point, I would do a comparison in Beyond Compare and, sure enough, I would see that thousands of MP3s on my data drive differed from their counterparts on my backup. What then? Was I going to listen through those thousands of files, to make sure that MP3 Diags had not actually screwed up any of them? After all, this was exactly the line of reasoning that I had followed in my previous look at MP3 Diags, several years earlier, and in that case I had noted this quote from the Main Window of the MP3 Diags User’s Guide:
Before going any further, I should state the obvious: if you like your files and they don’t bother you, then you probably shouldn’t change them. Anyway, if MP3 Diags finds errors, it is possible for you to have problems playing those files with other players or processing them with other tools. Still, it’s a good idea to delay any changes until they seem to be needed.
So that was baffling. MP3 Diags seemed to be designed to point the user toward mass identification and repair of large numbers of potential errors, rather than toward detailed identification of major problems on a file-by-file basis; but then, when the user started down that road, the MP3 Diags developer warned that that was probably not a good idea. What, exactly, was I supposed to do with MP3 Diags?
I gathered, from the foregoing quote, that the developer was not completely confident that the fixes incorporated into MP3 Diags would improve MP3 files. My searches had not led to any webpages in which knowledgeable people would subject MP3 Diags to tests on a wide variety of perfect and flawed MP3s, to see what the program did to those files. What we had here was a black box — what appeared to be a very capable and well-designed black box, but a black box nonetheless — into which users might feed their MP3s, hoping for improved output but not guaranteed of same.
In this situation, I wondered whether I could apply an approach somewhat opposite to the one that I had considered previously. Instead of trying to identify the errors that would prevent files from playing, could I narrow down the list by cleaning up the MP3 Diags errors and warnings that would not affect a file’s playability, and then see which files remained on the list after that? One answer to that question was that I was not sure I could figure out which errors were minor. After all, I had had a hard time figuring out which errors were major. There was also the risk that fooling around with supposedly minor errors would make a file worse — due, perhaps, to some unrecognized bug in MP3 Diags. This would be a decidedly roundabout approach to the question of whether a given file was damaged or unplayable. And, of course, I had already gotten a sense of how this would work, from the foregoing attempt to ignore certain errors: I was still looking at a longish list of flagged files, without much clarity on which of those flags I should focus on.
Fourth Approach: Testing MP3 Diags
Given the current design of MP3 Diags, the situation seemed to be as follows: I did not know (and the program did not help me decide) which files, or which kinds of errors, were worth worrying about. I did not even know whether I liked my MP3s (as the foregoing quote put it), because I did not want to listen through all of the files that had been flagged with various errors, using various MP3 player programs, to see what kinds of problems they might have. I just wanted to test my MP3s, and to fix those that needed fixing.
So I found myself considering the backwards posture of using my MP3s to test MP3 Diags, rather than using MP3 Diags to test my MP3s. That is, I was now thinking that I should run MP3 Diags on a bunch of MP3s, to see whether its repairs improved or at least did not harm them. If I could test those MP3s before and after using MP3 Diags, then perhaps I could narrow down what MP3 Diags had done to them, and whether the MP3 Diags changes had been helpful.
If I had been an audio engineer, I would probably have known how to find or create MP3s containing various kinds of errors, so as to test the many errors and warnings produced by MP3 Diags. It appeared that, so far, there had been no scholarly analysis of MP3 Diags. With my very limited knowledge, it seemed the only option was to run MP3 Diags on a large number of MP3s, and see what happened.
There were two ways to do that. I could use my own files, or I could use someone else’s. The latter seemed safer. So I ran a search, looking for reports of problems from people who had used MP3 Diags. It was reassuring, but also not very helpful, to see that that particular search produced no user complaints. Taking another tack, I looked for reviews on Softpedia, CNET, and SourceForge. Unfortunately, I was only seeing general comments by professional reviewers and private users who did not seem to have gotten into the details. They would say that the program had a complex interface best used by people with technical training, without explaining why technical training would be necessary or helpful to overcome the seeming shortages of specific information identified above; they would say that MP3 Diags had done a great job on their collection, without indicating what they had done or what problems had been fixed. No doubt many detailed accounts of personal experience with MP3 Diags would have emerged, if I had been willing to spend hours rooting through random webpages — such as the page offered by Dan of BlissHQ, who successfully used MP3 Diags to repair a specific problem with certain MP3s. Within a brief search, sad to say, I was not encountering any users who had attempted a detailed and comprehensive discussion of what MP3 Diags might have done to a large set of MP3s.
The idea of testing MP3 Diags on my own files was problematic in itself. In these posts, I was searching for a way of identifying which MP3s were problematic. Since I was not an audio engineer and was not inclined to invest the time needed to understand MP3 problems in detail, I was going to have to rely on other programs to tell me which MP3s were good and which weren’t. Yet that was precisely the mission of the companion post. I did not yet have a reliable tool for that purpose. Even a manual listen-through might not suffice, because I might not be listening for the kinds of tiny flaws that MP3 Diags might have detected and repaired (or made worse).
Fifth Approach: Backing into MP3 Diags
It occurred to me that I might be able to figure out what MP3 Diags should consider important by looking at the kinds of errors that other MP3 diagnostic programs considered relevant. If several other programs agreed that files A, D, and G were problematic, then perhaps I could look at the errors MP3 Diags would identify in those files, and see whether MP3 Diags identified those same problems in other files.
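The "several programs agree" step could be sketched as a simple vote count. The tool and file names below are placeholders, and `flagged_by_at_least` is a hypothetical helper; nothing here reflects actual output from any diagnostic program.

```python
from collections import Counter

def flagged_by_at_least(tool_reports, minimum=2):
    """Return files flagged by at least `minimum` different tools."""
    votes = Counter()
    for files in tool_reports.values():
        votes.update(set(files))  # each tool votes at most once per file
    return {name for name, n in votes.items() if n >= minimum}

# Placeholder data: tool name -> files that the tool flagged.
reports = {
    "ToolA": ["A.mp3", "D.mp3", "G.mp3"],
    "ToolB": ["A.mp3", "D.mp3"],
    "ToolC": ["D.mp3", "G.mp3", "Q.mp3"],
}
print(sorted(flagged_by_at_least(reports)))
```

The resulting shortlist would then be the set of files whose MP3 Diags codes are worth examining first.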
In the companion post, I had experimented with a handful of MP3 diagnostic programs. Collectively, these tools identified a small number of distorted or nonplaying MP3s on my system. I was running out of time for this inquiry; but when I looked at two of the nonplaying files, I noticed that MP3 Diags assigned them error (or warning) codes ac, fa, kd, and ob. Separately, I looked at MP3s produced by a certain Olympus recorder that I had used briefly — and whose legacy included a number of low-bitrate MP3s that AudioTester was now flagging as defective. It tentatively appeared that the MP3s flagged by AudioTester were also marked with codes ae and ja, whereas the Olympus MP3s not flagged by AudioTester were not marked with those codes.
I wondered how that differed from the codes that MP3 Diags had assigned to the files flagged by MP3Utility. Here, again, I just looked at a sample, not at every marked file. The situation might have been different with other MP3s flagged by MP3Utility. But for the ones I looked at, all were marked in MP3 Diags with error codes ab, an, fa, and ob, as were many other files not flagged by MP3Utility. Here, the difference seemed to be that these files, unlike others around them, were also marked with error codes bg and cb and, in some cases, ad. It appeared that the “First sync error” upon which MP3Utility seemed to fixate might be equivalent to MP3 Diags error codes bg and/or cb.
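The comparison I did by eye amounts to a set difference: codes that appear on flagged files but never on unflagged ones. A minimal sketch follows, with sample data modeled on the small sample described above rather than on a complete scan; `distinguishing_codes` is a hypothetical helper.

```python
def distinguishing_codes(flagged, unflagged):
    """Codes seen on at least one flagged file but on no unflagged file."""
    seen_flagged = set().union(*flagged)
    seen_unflagged = set().union(*unflagged)
    return seen_flagged - seen_unflagged

# Illustrative code sets, one per file, modeled on the sample I looked at.
flagged_files = [
    {"ab", "an", "fa", "ob", "bg", "cb"},
    {"ab", "an", "fa", "ob", "bg", "cb", "ad"},
]
unflagged_files = [
    {"ab", "an", "fa", "ob"},
]
print(sorted(distinguishing_codes(flagged_files, unflagged_files)))
```

With a larger sample, a code like ad that shows up only occasionally might turn out to appear on unflagged files too, in which case it would drop out of the result.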
I had to curtail the inquiry there. Hopefully these notes would prove useful to me and/or to others, on the way to a better understanding of how MP3 Diags might help in the identification and repair of unplayable MP3s.