Translation Challenge Solved

Saturday, 2006-08-19; 04:00:00

The method I used in solving a problem with D2X-XL, which is similar to the Translation Challenge

About a week ago, I posted a challenge -- kind of a riddle, actually -- about how one would go about finding all problems in two translators. I mentioned it was related to D2X-XL.

Well, as you can imagine, I tailored the nature of the challenge to closely fit a bug that I found in the program. Specifically, after saving a game, and then restoring it, D2X-XL would crash. This wasn't a save game created from a previous build of the program; saved games created with the same build would crash when trying to restore the game.

The bug came into existence because saved game files before this particular build were architecture-specific. That is, if you saved a game on a PowerPC-based Mac and tried to restore it on a Windows PC, it wouldn't work. Diedel, the main developer of D2X-XL, introduced a new universal save game format that could be read by D2X-XL on any architecture. And, of course, since he doesn't have a PPC Mac to test on, there were problems with the new savegame format; they weren't exactly universal as they were supposed to be.

There are two functions in state.c of the D2X-XL project: StateSaveUniGameData() and StateRestoreUniGameData(). I'm sure you can infer what they're supposed to do from the name -- one saves the save game to disk, and one restores it from disk. But since the program was crashing after restoring a saved game created with the same build, something was going wrong in these functions. There could have been a problem in only one of them, a problem in both of them, or there could have been multiple problems in either or both functions. I didn't know.

I also had a debugger. Of course, I could step through the code and examine values of various objects, but I wasn't exactly familiar with the code, so I didn't know exactly what to look for or what values would be wrong. (This often is the case when I'm trying to figure out problems with the Mac code -- I'm kind of shooting in the dark a lot of the time, poking around, and trying to guess at what various values should be at various points in time. This is especially true for debugging network play problems.) In essence, these two functions were like black box translators to me: I could see the code that was powering them, but I didn't really know very well what exactly was happening and what was supposed to happen.

The one thing that I did have that was useful to me was a variable called "fp", which basically was a cursor letting me know at which point in the savegame file I was. Since both the save function and the restore function would have to go through the same data, the cursor was very useful: if the save function's first step was to write an int, then the restore function's first step would have to be to read that int. In essence, "fp" allowed me to distinguish "sentences" in the save game file format.

Going back to the abstract Translation Challenge, the obvious solution was to simply divide and conquer: split the document in half, and see if there's still an error in the first half. If so, then an error gets triggered in one of the sentences in the first half of the document. Halve that half of the document again, and repeat the process until you've drilled down to find the single sentence that is causing the problem. In theory, this should allow you to find all the problems in the translators in 2*e*(ln(s) + 1) minutes, where s is the number of sentences in the document and e is the number of errors in the translators. The plus one is needed because you need to run the single sentence through one of the translators again to figure out if it's the first or the second translator that's causing the problem on each problematic sentence.

In practice, this isn't exactly the most time-saving way to figure out the problems. I wasn't exactly sure even how to compare the "initial document" to the "document" after running through both translators. I could really only verify if the whole savegame was restored correctly by seeing whether the program crashed or not, and whether I ended up having the correct amount of weapons or shields or energy. I could go through each line one by one and examine each variable that was supposed to have been written to disk, and the value after reading from disk -- essentially the "brute force" method that requires 2002 minutes to find all problems in the mythical translators. But there was no way to let just half of the game be saved and half be restored and see if there were any problems with just that half of the saved game file.

The trick was to use the "fp" variable. The analog in the Translation Challenge is the fact you know about the alien language -- one sentence in your language translates exactly to one sentence in the alien language. This allows you to basically define a "cursor" when reading through the document. Even without any other knowledge of the alien language, you know that a one thousand sentence document run through the first translator should produce a one thousand sentence document, and similarly with the second translator.

That means that whatever output is being produced by the first translator, you know that the second translator should be producing the identical number of sentences if fed the output of the first translator.

So the first step in finding problems is to check the consistency of the second translator. (You can use a binary reduction in this case, too.) To do this, you run your initial document through the first translator. Then, halve that translation and feed the first half to the second translator. If the output has the same number of sentences, you can infer that there are no errors in the second translator up to that point. If there is an error, halve the first half of the translation and feed that to the second translator. Again, comparing sentence count tells you if there's an error at that point or not. Do this until you have found all errors in the second translator.

And that's what I did. At about half way through StateSaveUniGameState(), I checked the value of fp. Then, in StateRestoreUniGameState(), I checked the value of fp at the respective line of code. If the values of fp matched, then I assumed that there was nothing wrong with the restore function up to that point. I eventually found a case of mismatched values of fp -- aha! There was an error in the restore function. After checking the value of fp at about 10 pairs of lines or so, I found the culprit: the save game function was saving the old-style AI robot parameters, and the restore function was restoring the universal AI robot parameters.

You'll note that since this was supposed to be the universal savegame format, the error, in this case, was actually in the save function (i.e.: the first translator) and not the restore function (i.e.: the second translator). Looking at it from a different perspective, though, and it's actually a restore problem -- the second translator isn't correctly reading what the first translator wrote.

Once I fixed that problem, the value of fp was consistent after all pairs of lines in the save and restore functions. But something still wasn't right -- D2X-XL no longer crashed when restoring a savegame, but the number of missiles that I had was way off mark. This was quite an amusing sight, actually -- I saved a game right after entering the mine on level 1 of Descent 1. I had 6 concussion missiles. When restoring the game, I had 1536 of them. For your amusement, this is what happens when you die while having 1536 missiles in your possession:

1536 concussion missiles

You'll also note that the Phoenix cannon is also present in that screenshot, even though it should never exist in the Descent I levels.

In any case, since there was still a problem, there was yet another error in the save function -- since fp was consistent at every corresponding line of code between the save and restore functions, I knew that any remaining problems had to be in the save function. It turned out that that was indeed the case -- although I didn't find it, because at that point I was up too late at night and in the morning I received an e-mail from Diedel telling me exactly where the problem was.

So, by analog, once you've found all the flaws in translator 2, then you simply use the standard binary reduction method on the remaining errors in the translators. I'll call this modified method the "split binary reduction method" or the "split method".

If given a large number of different scenarios, odds are that half the errors will end up in translator 1 and half will end up in translator 2. By using the split method, you'll have saved half the time in finding the errors in translator 2 (because you don't need to run the document through both translators 1 and 2 to find the errors in translator 2). Then, you use the same amount of time in the split method as in the standard method to identify all the flaws in translator 1. On average, you'll save 1/4 the amount of time by using the split method rather than the standard method.

The split method is actually a little better in terms of time used: instead of 2*(e_1 + e_2)*(ln(s) + 1) minutes, we instead have 1 + e_2*ln(s) + 2*e_1*ln(s) minutes, where e = e_1 + e_2. e_2*ln(s) comes from our binary reduction in finding how many flaws are in the second translator -- in this case, there's no coefficient of 2, because we're not running anything through the first translator, besides the first time that we translate the initial document (which is why the first term is simply a 1). The 2*e_1*ln(s) comes from the binary reduction of finding how many flaws are in the first translator, in which case we do need to run text through both translators.

In our Translation Challenge case, you can further optimize your time as compared to the standard method. That's because you can exploit the same property again: after identifying all flaws in translator 1, just run half of the original document through translator 1, and compare the original number of sentences with the number of sentences in the translated document. Again, where there's a discrepancy in the number of sentences, there's a flaw -- and you don't need to know anything beyond our one known fact about the alien language. This optimization, however, doesn't help in terms of figuring out the bug in D2X-XL: I didn't have a "cursor" for the "original document".

There are a few small caveats to this "split binary reduction" method, though.

The first is that for small documents, this method is worse. For example, suppose you have a 10-sentence document, with only 1 error in the first translator. The "split binary reduction" method goes through unnecessary steps to make sure that translator 2 is working properly. Even though e_2 is 0, the e_2*ln(s) term doesn't vanish, because it requires at least one binary reduction to prove that translator 2 doesn't have any errors.

There are also complications with this method when you have errors that just produce garbled sentences. Since this doesn't alter the sentence count, it could make you incorrectly conclude that translator 2 doesn't have any errors, when in fact it does; you just don't have enough information to figure that out. Our premise in using the split method is based upon the number of sentences -- if a flaw doesn't alter that, then the split method will waste additional time trying to figure out the garbling flaws.

But all in all, the split method usually is faster in identifying flaws, and it's more practical -- at least in this one story of a bug -- in finding problems in source code.

Kind of an interesting math problem, really. And it's now 5:15 AM, and I should go to bed.

Technological Supernova   D2X-XL   Older   Newer   Post a Comment