Wednesday, April 30, 2008

Triple-Boot Mac Part 2: MBRs Are Still Hell

A while ago I blogged about the setup process to get OS X 10.4, Windows XP, and Ubuntu 6.06 onto my MacBook Pro. I tried the same trick today on a Mac Pro running OS X 10.5, Windows Vista, and Ubuntu 8.04. To further complicate things, I created a true swap partition.

Now in the old way, Windows had to be last on the disk...fortunately Vista is a little bit more savvy about partitioning (although it still uses MBR and not GPT partitions). So the order of operations was:
  1. Install rEFIt onto the Mac.
  2. Use diskutil to partition the main drive, cutting off three new Fat-32 partitions, sized 30 GB, 30 GB, and 10 GB, which will become Vista, Linux, and a swap partition.
  3. Install ext2FSX.
  4. Reformat the second Fat-32 partition as ext2 format. (Once ext2FSX is installed, Disk Utility will let you do this.)
  5. Install Ubuntu, using manual partitioning, and using the last Fat-32 partition as a swap partition.
  6. On reboot, use rEFIt's partition tool to fix the MBR partition table, which the Ubuntu installer will trash.
  7. Install Vista onto the middle partition, which hasn't been used. Vista will want you to reformat the Fat-32 partition as NTFS, which is fine. One tricky part: since rEFIt is installed, when Vista reboots you have to pick the volume that the Vista installer thinks it is rebooting to. As far as I can tell, this is always the volume Vista is being installed on.
  8. Reboot with the Ubuntu Live CD, and run a grub shell and use "setup" to put a boot sector back on the Linux partition. I don't know why it was missing - possibly an error in setting up Ubuntu the first time.
Some notes on the sketchier steps:
  • Why ext2 on Mac? For some reason the 10.5 version of diskutil won't format volumes as Linux filesystems. The problem is that when the Ubuntu installer reformats the FAT-32 partition as ext2, OS X doesn't know about it. The next time you boot OS X, your Linux partition is mounted as a FAT-32 volume, which completely trashes it. It took me quite a few tries to figure this out, and I got myself into a state where I had to totally rebuild the startup drive from a full format and reinstall OS X itself. So converting the format to ext2 is a hack to keep the partition from getting smashed by OS X.
  • rEFIt is great - given a valid GPT partition table and a screwed up MBR table, it'll usually do what you want. It will prioritize the first four partitions, hence the Linux swap is the last partition, not Vista. (BTW Vista's partition tool sees the swap partition as "unused space" - so do not repartition in Vista.)
  • The goo to rebuild grub is more or less:
sudo grub
root (hd0,3)
setup (hd0,3)

Grub will complain a bit, but it will fix the MBR. Note that hd0,3 means the 4th partition on the first hard drive, so your mileage may vary. Tab completion in grub is a great way to scan the local file systems, e.g. type root (hd and hit tab to list drives, or root (hd0, and tab to list partitions.

Friday, April 25, 2008

What is OOP Good For?

An interesting article here got me thinking about design and the development process. First of all, I agree completely with their three lies, particularly the second and third ones. To riff on it a little bit:

Goals and Non-Goals

Code serves two purposes:
  1. It has to do what you want (and this usually means: the features you want, the performance you want, no bugs).
  2. It has to not drive you insane.
This second point is of key importance: code design paradigms need to consider maintainability. Code always lives longer than you think it will - you're going to be stuck with your work, so set up your code to be usable. I would say you can't achieve (1), or at least can't achieve it cost-effectively, without (2).

However, what's conspicuous by its absence is this: the purpose of code is not to determine your data model! And in this regard OOP has done us a bit of a disservice in how we think about design.

OOP Myths

The problem with an OOP design is that it confuses code structure with data structure by providing a set of idioms and language constructs that bind the two together in very specific ways.

When I was in school, OOP was mis-advertised as providing three really great things:
  1. Encapsulation (that is, keeping the guts of your code from leaking all over the place).
  2. Polymorphism (that is, giving you language constructs to define multiple implementations of abstract behaviors).
  3. Code reuse (via inheritance, that is, you should be able to sub-class to utilize existing code).
Turns out that's not really true. Item 1 (encapsulation) is where 99.9% of the value of OOP comes from. The number one problem with big old code bases is that everything fondles everything else, and it's impossible to work in an environment like that. To the extent that OOP encapsulation encourages programmers not to write crappy code, that's a good thing, although OOP doesn't have a monopoly on this.

Polymorphism is a neat trick, but just not that important. I think we have all of one polymorphic class in all of X-Plane (plus one more in the installer). That's in close to half a million lines of code. Again, OOP doesn't have the monopoly - we get most of our dynamic behavior from function pointers, not virtual function tables.

And code reuse via inheritance is so far off base that I'd call it a damned lie. Inheriting implementation usually results in chaos and is a bad idea, so I never believed this one.

But the idea that code design and data design serve different masters brings home exactly why this is a poor idea. Basically you are creating a very particular relationship among your data for the purpose of organizing your code! Bad idea! Data design should be driven by program needs, not code organization needs.

Design Method

When I work on X-Plane features, I think in terms of a data structure and an algorithm that go together to serve my purpose. Typically they are inseparable; it is necessary to organize the data in a certain way to run an algorithm with the performance characteristics we need. I would say you can't separate the two; is a hash table the structure, the algorithms for insert, search, etc., or both? I'd say both, because either one on its own is useless.

The actual "OOP" level code design is only determined after I have a clear picture of the data and algorithm design. Once we've decided that we're going to use a quad-tree for our scene graph (again, that's an algorithm and a data structure), then I go looking for language-specific idioms that will do this well.

(In this case we do use objects, but they aren't polymorphic, and in most cases they're just glorified structures.)

Refactor Early, Refactor Often

Finally, I always start a new feature in X-Plane with a clean work-surface by refactoring all of the code I will be touching as heavily as I can. You wouldn't know that the module in question wasn't version 1.0 (and this is before coding the new feature).

The advantage of this is that it cuts new-feature development time way down -- it's as if the app was custom coded to make the feature trivial. Most of the real work and development time is spent in the refactoring.

One reason to favor refactoring over new features for where to spend development time is that when refactoring, you can make a number of small changes and regression-test repeatedly. I find this goes a lot smoother than changing a huge amount of code and then discovering that multiple bugs cause nothing to work. With the small changes and regression tests, bugs are caught a few at a time, making them easy to stomp.

By comparison, a new feature may be complex enough that it has to be coded fairly completely before it can be tested in useful ways. (The Rapid Development folks would yell at me for not having adequate test code.) Regardless of your testing diligence, you can't deny that when you start a new feature, there are already more ways to test the old code than the new.

Thursday, April 24, 2008

The Über-Queue

In redesigning the background processing code for X-Plane, I am revisiting message queues.

Requirements on Queue Design
  • Anyone can be a worker thread. To be a worker thread, you simply pull off a task and execute it. (But see barriers below.) The important thing here is that we can have more than one worker thread for one giant queue, so that dispatch models CPU cores rather than categories of work.
  • Anyone can queue work - no requirement to be the main thread (just about any message queue design does this).
Problems Left for Client Code
  • Client code must not delete objects that are required by pending work tasks (scheduled or running). Doing so would be, um, bad.
  • Client code cannot halt running work tasks - doing this would introduce a huge amount of complexity for very little value.
  • Client code must use its own mechanism for getting "receipts" from finished tasks.
To this last point, typically while we want to schedule everybody in one big queue and have all hardware work on tasks, we want to dispatch our results to disparate locations in the app in a very client-code-specific way. In our case, we'll typically put confirmation code in the specific task objects, such as sending a message to our parent subsystem saying we're finished.

Queue Editing

While it's not strictly necessary, we provide an atomic editing operation on our message queue - basically a client can go in and reorder pending tasks. There are a few important uses for this:
  • It allows for prioritized queueing - I can queue a work task and then edit the queue to put my task first.
  • I can find "stale" work objects and remove them (replacing them with no-ops). Normally if I need to delete an object that is referenced by a pending task, I need to wait for the task (and have some way of knowing that the task has completed, specific to the task). But if I can clean the queue first I can remove unnecessary scheduled tasks and prevent work on soon-to-be-dead objects.
  • Being able to place a task in a specific place is useful for barriers.

Barriers are actually implemented as clients on top of the scheduling system - their only requirement is to know the exact number of worker threads. (That knowledge is the one bit of generality the worker thread pool gives up.)

A barrier is implemented as a pair of message queues (or even semaphores, since all we care about is counting) and one work task for each thread.

We use the barrier to halt thread processing by inserting all of the work tasks into the queue; each one effectively signals back to us and holds up that worker. With four workers, we then wait for four receipts from the work tasks, indicating that all four workers are now waiting on us. At this point, we are running and background processing is halted.

To resume processing, we send messages back to the worker threads and wait again. They acknowledge and are then finished. They send an acknowledge back to us because we may want to deallocate the resources used in the barrier and cannot do so until we know that all worker threads have woken up and finished their code.

We can use barriers when we need a true halt of processing; we can also use them as markers in the stream to indicate that we are done with a range of tasks.

If we want to wait for only one task to be done, we're better off having it signal us directly. The power of barriers is that, because they halt all processing, we can be sure that all tasks before us have finished. But the real purpose of barriers is to halt the processing of tasks that are behind us.

Sunday, April 13, 2008

An Unfair Unicode Rant

I would not be the first person to be grumpy that you can't use UTF8 on Windows, and I must admit that my desire for that option comes from pure laziness.  Perhaps it comes down to Microsoft not wanting to change their huge code base and me not wanting to change mine, but I will point out that X-Plane's code is still slightly smaller than, um, all of Windows. :-)

It turns out that the problem of dealing with Unicode isn't as bad as it seems - for reasons of pure luck (probably partly coming from X-Plane not being very file-system or text-centric) most of our text-based contact with other systems happens in only three or four translation units, so it's pretty easy to find and trap the bottlenecks.

The ugliest problem turns out to be the C runtime.  For historical reasons X-Plane on Windows is built using Metrowerks CodeWarrior 8.  The bad news is that the C runtime API uses single byte file paths, and does not support UTF8.  The good news is that they give you the source.

So it looks to me like what we're going to have to do is to put a converter code in the bottom of the C library to convert 8 bit UTF8 paths (passed in to X-Plane) into UTF16 paths that we can then pass to the "W" variants of the file routines.  Fortunately since this happens "just in time" we only have to look for Win32 file system calls inside the C runtime source, which are already partitioned into a "Win32 specific" section.

We're using UTF8 internally because it allows us to not change the size of any internal memory buffers or data structures - if we did we would have to trace every single instance to find which ones get written to a binary file format and convert there (and I can tell you off hand that we have a number of these cases).  

Converting the string path to wide characters seems like a loss - trading an annoying but survivable bug (incorrect strings) for a set of worse ones (data loss, file corruption, and crashes).

And that brings me to the grumpy rant part: a lot of the documentation and comments I found online were simply focused on the "Microsoft" process, e.g. use TCHAR for all your strings, #define UNICODE to 1, clean up the wreckage of having changed the size of a fundamental data type in your entire app, go home happy.

The truth is, the rant isn't about Microsoft, and their "churn the whole code base, redefine the world" approach to Unicode (which I am sure was a lot more compelling when the BMP was the entire character set...with wide characters, you get the joy of debugging your entire app combined with the joy of dealing with variable-length characters!) - rather this rant is about the state of documentation and how programmers use it.  And I don't think the documentation writers are to blame.  Besides their having to approach a very difficult problem without ever having enough time and budget, I think they're just giving the market what it wants.

Documentation today is built around a combination of lowest-level reference (e.g. dictionary styled definitions of what a single function's parameters are) and recipes (that is, examples of how to do common tasks).  Low level reference is of course mandatory, but it's not enough. The gap is in conceptual documentation.  The "why" of programming is under-documented, and from what I can tell, even when the why is documented (in a blog post, jammed into the remarks of a function reference, in an overview to a module), it is often ignored by programmers who are either too busy, too lazy, or too dumb to care.

The idea of programming like this is absurd - using a library whose design you don't understand is like taking a speech, running it through Google language services, and then giving it in a language you don't speak in front of foreign dignitaries.  Sure, they might look at you like they understand most of it, but fundamental things could be going wrong because you don't understand what you mean* and you wouldn't even know.

Something I have learned via the school of hard knocks is the rule of "no squirrelly behavior". That is to say, if your application does something that is surprising but innocuous, it always pays to take the time to understand what is going on (thus rendering the behavior unsurprising).

The alternative is to let the misunderstood behavior exist as "harmless", except that:
  1. Since you don't really know what your app is doing, you won't understand what you say for any of the code you write from that point on, and
  2. more importantly, the "harmless" behavior is usually the tip of an iceberg that will sink your app in a way that is both much worse (crash, data corruption) and much harder to debug.
So in an attempt to end my rant and get back to the real work of drinking coffee and cursing:
  • Always take the time to understand why a library is the way it is before using it.
  • Always make sure you understand what your own code really implies.
  • If you have unexpected behavior, fix it early while it's easy to do so.
And my plea to documentation writers is: please tell us why, even if most programmers don't know why they need to know!

* This is part of Scott Meyers's 50th rule from "Effective C++": say what you mean and understand what you say.  It could be the best one sentence of programming advice you will ever get.

Thursday, April 10, 2008

How I tell ++preincrement and postincrement++ apart.

The rules of C operators are byzantine; the best thing you can say about them is that they make some idioms very terse, e.g.
while(*p++) { }
You could go crazy trying to remember the operator precedence rules if you don't use them all the time. But the way I remember the two forms of ++ gets at a deeper issue, so it's more than just C trivia.

First, let us recall that ++a will increment the variable a and return the new value; a++ will increment a and return the old value. I suppose that Kernighan and Ritchie, while "designing C" (between bong hits), thought that having the ++ after a would help make it clear that the increment happened after, um, other stuff. In particular, while(*a++) evaluates the thing pointed to by a for not being zero before incrementing. (But then it increments no matter what. Cute.)

I suspect that I am like virtually all C programmers in that I learned to use post-increment for whatever reason, e.g.
for (n = 0; n < 5; n++)
Now step into the world of C++. Consider the implementation of ++ when our variable is not a simple integer, but rather a very expensive C++ object.

The pre-increment is easy. We change our object to be one more, and then that object is used in the next expression. But post-increment is a lot more expensive. We have to...
  1. Make a new object that is a copy of the object that is being incremented.
  2. Increment the old object.
  3. Use the new object later in the expression.
  4. The new object is in fact a temporary variable invented by the compiler, so it will get destroyed.
In fact, it's even worse...because of the way the operators are written, there will actually be two temporary variables, two copies, and two destroys. If creating or destroying copies of your object is even remotely expensive, post-increment is much more expensive than pre-increment.

And in that is both a lesson in C++ and how to remember what's going on. In three parts:
  1. Always default to using pre-increment when you can. Make it a habit to pre-increment, because post-increment can cost you and pre-increment never will.
  2. Remember that post-increment is the "expensive" one, that's how you can tell them apart.
  3. The more expensive post-increment also has the weirder signature when overloaded, e.g. operator++(int), not just operator++().