Help Me, it Burns!

Friday, 2007-10-05; 01:19:00



So last month, if you deigned to read it all, you'd know about the fascinating tale of the Disk Usage Ballooning, caused by Microsoft Office 2004's method of installing fonts.

Today, I'm going to relate to you the tale of the Microsoft User Data Folder.... FROM HELL!


The Background

If you use Microsoft Office, you're almost assuredly familiar with the evil Microsoft User Data folder. It's a folder that Office uses for storing settings, but Office helpfully places the folder in the standard Documents folder inside your home. Whenever I go into my Documents folder, I see that nagging folder, as if it were saying, "Heehee! Look at me! I'm not a document, but ZOMG I'M HERE ANYWAY! Muhahahaha!"

(Yes, yes, I know the secret trick to banish it from the Documents folder. Don't muddy up my story with details! Besides, it's irrelevant.)

Well, if you've ever looked inside this Microsoft User Data folder, there's one file in particular that has been wreaking havoc on the laptop carts at the elementary school where I used to work (and am currently temporarily filling in for a week). It's located inside the Microsoft User Data folder, inside "Office 2004 Identities", named "Database".

If you have Office 2004, go ahead and get info on this file. How big is it? Wait, no, don't tell me. I can guarantee you that it's bigger than 18 MiB. Why? Because a stock database file created by Office 2004 is 18 MiB. No, I'm not shitting you. Delete this file, and launch an Office 2004 application. It gets created anew, and it's 18 MiB by default.

What's weird is Office X's database file, in contrast, is only 144 KiB when first created. What the heck is Microsoft storing in this new database file that it's so bloated from the beginning?

Anyway, it turns out that this file is read by every Office 2004 application upon launch. There's an app called "Microsoft Database Daemon" that's inside the "Office" subfolder of your main Office 2004 folder. This is the helper app for Office 2004 that seems to create the "Database" file, and despite the kind of this file being listed as a "Microsoft Entourage document", the database daemon is launched and the database is read whenever you launch any of the Office 2004 applications, not just Entourage.

This is significant because it means that 18 MiB of (useless) data is read upon each launch of an Office 2004 app.


The History

Now let's rewind to my last rant about Microsoft Office. My entry caused Nathan Herring, who apparently works for the MacBU, to comment about my problems and we had an interesting discussion about the technical details of how Microsoft Office saved files. In the end, though, I concluded:

But here's the deal: when a program like Microsoft Office causes catastrophic problems with saving due to the way we've set up our clients and servers, and requires us to completely revamp our server/client setup in order to get around the bug, do you think that's gaining any good will toward Office? Is it our responsibility to know the quirks of Office when deciding upon how to structure our users and groups? Is it reasonable to require this of your users, more so than us asking Microsoft to fix esoteric problems like this?

If you haven't read the comments or the Apple knowledge base article regarding the problem I was referencing, the long-term solution to this problem was to have unique user IDs for each person that accesses a network share on which they save Microsoft Office files.

And that's what the elementary school did this year. Before, each class of about 30 kids would log on to the computers with the same username, which would automatically mount the server onto which they saved their files. The users themselves were local users to each machine, but they all had the same user ID, which caused the saving problem.

This year, we changed over: now each student in the school has a separate login that's a network user, not a local user. This means that all of the files in the home folder of the user are hosted on the server and retrieved via the network connection, so that it doesn't matter what computer a student uses to log on, he'll always have access to the same exact files. (In case you're wondering, there was no way we were going to make each student have a separate local user login. That scenario would be completely impossible to manage.)

Now, add to the mix the fact that the students sometimes use laptop carts when they do their work, which connect to the server via a wireless (presumably 802.11g) connection.

Again, I'll pause to let you, dear reader, figure out the problem.

Done yet? OK.


The Horror

When the students need to run Microsoft Office, Office will attempt to read the 18 MiB database file from the user's home folder upon launch. Remember, since these students now have network logins, these 18 MiB files will be hosted on the server, so the database file will need to be retrieved via the network, which means downloading it via the wireless connection.

Since in normal classrooms, all students work on the same project at the same time, it's not entirely uncommon (and it's in fact quite routine) for all the students in one classroom to launch Microsoft Word all at once. That means, at the same time, thirty 18 MiB files are going to be downloaded simultaneously via the WiFi connection. Assuming this is an 802.11g connection (an assumption in which I'm fairly confident), the maximum theoretical speed is 54 Mbps. Remember, there's only one wireless router, so at max, 54 Mbps of traffic can go through this router. This rate is equivalent to 6.75 MiB/s of traffic.

Thirty 18 MiB files is a total of 540 MiB, and using the max rate of traffic, it means that at a minimum, it will take 80 seconds for all 30 laptops to read the database files that Office 2004 stores in the Microsoft Office User Data folder. That's one and one third minutes that the students have to wait, just to launch Microsoft Office 2004. All of the laptops are reading data at the same time, so it's not like one computer will read the whole database file, then the next computer will read its whole database file, and so on; no, the laptops will all be competing to get data from their database files, and everyone will essentially finish at the same time.


The Burning

But wait. We're talking about real-world conditions here, so let's use some real-world inputs. Realistically, the max speed you're going to get through an 802.11g router is probably more around 20 Mbps, not 54. Also, sometimes two classrooms of 30 laptops each attempt to connect to the server. And then, of course, not only does Microsoft Office need to read the Database file, but it needs to read its preferences (also stored in the home folder, which will also need to be retrieved via the network) and load the application into memory.

And guess what? To add insult to injury, stock Office 2004 preferences come in at a whopping 3.5 MiB.

So let's do that calculation again. We have 60 x 21.5 = 1290 MiB of data to retrieve simultaneously, via a 2.5 MiB/s wireless connection, which comes out to an 8.6 minute launch, which probably translates to 9 minutes if you include the memory loading time.

This is not some theoretical thought experiment. This has happened. Many times. To the point where projects that were going to be created on these laptops have been delayed by weeks because of the inability to use Office. (One project in particular uses the Notebook functionality of Word; the only recommended product I've heard of that can compete with the notebook features of Office is Circus Ponies' Notebook.)


The Solution

What's the solution to this problem? Well, you have three options. The first: convert all your users to mobile accounts, and exclude the Microsoft User Data folder from synching. But it's never a good idea to do such an account transition in the middle of the year.

If that's not an option, there's a hack that'll get you around this problem. First, you have to delete that 18 MiB file from each of the users that has one. (IMPORTANT NOTE: If you use Entourage, then absolutely don't do this. Entourage stores records of e-mails in here, so if you delete the database, your e-mails will be gone. This option is only for those who don't use Entourage.) Then, you have to move the "Microsoft Database Daemon" helper app out of the "Office" folder so that Office 2004 can't find it and won't attempt to create or read the database file. This will cause PowerPoint to generate an error on launch, and will cause Word to also generate errors when you try to use user-specific features that require the database file, but these messages can be clicked through and the functionality of Office that we need stays intact. (This does not, however, solve the problem of the accessing of the 3.5 MiB pref files, which themselves cause a shit ton of traffic to fly across the network. Using a similar real-world calculation, just the preference files will cause a wait of 84 seconds.)

Deleting the database files from the users' folders, though, is problematic. You have to move the daemon out from the Office folder of all potential computers on which these students will run Office. Consider this: I delete the Microsoft User Data folder from a user and the daemon from the Office folder on one of the laptops, but what if he uses a different laptop the next time? If the daemon hasn't been moved on this new laptop, the database will get recreated, causing the problem again.

Your third option? Run, don't walk, away from Microsoft Office. Find apps that can replace Office's functionality. iWork seems to be a great candidate, especially now with the $250 site licensing for educational institutions. But there still may be other functionality that students or teachers need, notebook functionality being one example, and from my experience, it's so worth it to find alternatives.

The bottom line? Right now, the only feasible option at this point for me is the second one. I'm going to have to, at the very very least, go around to the three carts of 30 laptops each as well as a computer lab of 30 more computers and remove the database daemon. Then I'm going to have to go to the server and purge the Microsoft User Data folder from all users on the server. And I'll still have to worry about those stray Office 2004 installations on other laptops and/or computers in the school which the students might use on the off-chance. It won't be too bad if I use a script and Apple Remote Desktop, but just getting 30 laptops out from one cart and starting them all up is a chore, not to mention three carts.


The Insanity

In what world is this acceptable behavior for a computer program? In what world do users tolerate such a crappy product? In what world do people think that Microsoft Office for the Mac is a nice program to use? Am I the only one who encounters and feels the incredible need to vent about problems like this?

Does anyone find it ironic that one of the primary motivations for our new server setup was to get around a Microsoft Office bug, and in doing so we encounter an entirely new one which will, again, require us to rethink our server setup?

No, it's not ironic, actually. It's insane. You have no idea how much frustration Office alone has caused.


Technological Supernova   Rants   Older   Newer   Post a Comment