Monday, April 20, 2009

mod_rewrite for MediaWiki

In moving the X-Plane SDK Wiki to MediaWiki I had to use mod_rewrite. Getting this right took me a few tries.  This is what I ended up with:
# First: any old phpwiki witih a query is rewritten to the
# "clean" mediawiki URL. Note that we do not test for file
# existence - as of this writing the php index.php DOES exist.

RewriteCond %{QUERY_STRING} ^(.+)$
RewriteRule ^xpsdk/phpwiki/(.*)$ /xpsdk/mediawiki/%1? [R=301]

# Also, if we did not rewrite that means we had no query -
# means the default page - map it to the base root.

RewriteRule ^xpsdk/phpwiki/(.*)$ /xpsdk/mediawiki/ [R=301]

# Now we have "clean" wiki URLs, so do not make any more rewrites
# permanent. What we do next is to clean up problems.

# And finally and most importantly: because php is running in a CGI,
# it needs the "ugly" title=$1 form. So...convert the virtual path
# into CGI access.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^xpsdk/mediawiki/(.+)$ /xpsdk/mediawiki/index.php?title=$1 [L,QSA]
Probably the most useful thing I discovered was: by using R=301 you can force the rewrite to be visible in the URL bar. This isn't always what you want in the long term, but it does let you see the output of mod_rewrite, which is handy.

Friday, April 17, 2009

My Favorite MediaWiki Extensions

The X-Plane Wiki is powered by MediaWiki.  I have mixed opinions about this, but I think it was the right decision.  The one part of MediaWiki that was really difficult was re-skinning it, but that may have been due to PEBKAC - I made the boneheaded decision to build my skin off MonoBook. MonoBook is, to put it mildly, not the most obvious combination of PHP and CSS you'll ever read.  If I had started with a simpler official skin life would have been easier.  (I prefer more direct non-php template schemes like you'll find in CMSCS or phpwiki.)

Where MediaWiki shines is in its scale - because it is used so heavily by so many people, including some very high profile applications, it tends to have everything you want either built-in or as an extension.  There's a rich set of special pages for sight management, and it's been pretty easy to stop spamming with minimal effort.  The template mechanism makes creating consistent in-content markers (e.g. standard disclaimers and boilerplate) really easy.

Here are a few of my favorite extensions:
  • FCKeditor. This is the must-have extension.  It provides full rich-text WYSIWYG editing to the text editor.  The transfer between Wiki and the rich preview isn't quite perfect, but it is very, very close.  The editor has hooks for useful Wiki functions, like adding categories, templates, and browsing uploaded images.
  • EmailAddressImage. Makes an image out of email addresses, to prevent email scraping.
  • SpamBlackList. Honestly I'm not sure how well it's doing - we get very little spam.
  • NiceCategoryList. Automatically generates topic-subtopic charts that are a bit nicer than what is built into MediaWiki.
  • ImageLink. Provides a lot more flexibility in embedding images. Since I like to use images as "big icons" for site navigation, ImageLink lets me do within a Wiki page what I would normally do via HTML/CSS.

Wednesday, April 15, 2009

Sometimes You Get a Gift

Sometimes you get really lucky and a library has a routine that just does exactly what you want.

I got lucky the other day, thanks to this function: make_conforming_Delaunay_2.

The X-Plane scenery tools creates a mesh by triangulating a big pile of points, prioritizing the points that are most important to capturing the "shape" of the terrain.  It also adds in a series of edges ("constraints") that must exist, to ensure that the borders between land-water are triangle edges.

The problem is that these edges reduce the "quality" of the triangulation - a long forced edge can force a long thin triangle, which in turn looks ugly.  For example, in this picture, the red edges are forced into the triangulation, causing a lot of long thin triangles.

CGAL::make_conforming_Delaunay_2 is like a one-stop spa and resort for your mesh...it adds the necessary points to cut long thin triangles into more balanced ones, like this.

I don't expect that kind of service from a code library, but when it happens, I like it!

Why CGAL?

The X-Plane scenery tools use CGAL as a base library for most geometry calculations.  This post attempts to explain the situation.

For The Impatient

CGAL uses precise numeric types - replacements for normal floating point math that don't have rounding errors. You can treat the data types that come from CGAL as "opaque" - they aren't really opaque, but their implementation is complex enough that you're better off not looking inside it.  The main things to know:
  • Point_2 has members .x() and .y() that return the X and Y coordinates.  You need to use CGAL::to_double to convert these opaque (but precise) numeric to a double.
  • Be warned: the conversion to_double is not exact!  You may get rounding errors 'just because'.
  • You can't change a Point_2 (or any other CGAL geometric primitive) once it is created.  Instead you would construct a new Point_2 that is based on the original Point_2.
No Rounding Errors?  How?!?

Here's a very brief sketch of how you can have floating point that doesn't create rounding errors.  There are basically two cases we have to look at:
  • For multiplication, addition and subtraction, the risk is that we might run out of mantissa and have to round.  To work around this, we build an object that can dynamically allocate memory for the mantissa.  If we run out of space, we just allocate more.
  • For division, that's not good enough; 1/3 is a repeating decimal (in base 10 or binary) no matter how many digits you have.  So we store the numerator and denominator separately (both having arbitrary mantissa).
Those two techniques give us a numeric type that is capable of performing +, -, * and / without rounding errors, which is actually enough to implement most geometry algorithms.

What CGAL actually does is a lot more sophisticated and well-optimized.  Arbitrary precision math is slow (think of what happens if you replace every FPU operation with a dynamic memory operation).  CGAL can pool and reference count number "objects", use real FPU approximations (switching to slow precise math only when necessary) and has a number of wrappers that defer expensive computations, avoiding unnecessary work.

Ben, Did You Take an H-Bomb to a Knife Fight?

The big question here is perhaps: is it worth it?  CGAL is heavy-weight complex heavily templated machinery and there are penalties for using it (in terms of performance, memory footprint, ease-of-debugging, etc).  Well?

Yes - I can say unequivocally that CGAL is worth it - I know because I have tried building the scenery tools both ways, and the CGAL way is much, much easier.

There is a fundamental problem with the kinds of algorithms that the scenery tools must do (finding the union of polygons, etc.): rounding error.

An algorithm that is correct in principle according to the rules of geometry becomes wrong when coded in floating point; imagine if the mid-point between two points is not actually collinear with the original points...that kind of thing happens all the time with floating point. (Other fun phenomena include points that are not on either side of a line, nor are they on the line itself, and triangles that are neither clockwise nor counterclockwise.)

We have only two real options to deal with this:
  1. Recognize when floating point has failed us and code a reasonable alternative.
  2. Get better floating point.  (This is exactly what CGAL does.)
Now I like CGAL because it has algorithms right out of the box that I need, coded a lot better than I would ever do...examples:
  • Fast incremental constrained Delaunay triangulation.
  • Union and intersection of large numbers of polygons in optimal computation time.
But the big win of CGAL (to me) is in avoiding edge cases.  The scenery code involves a lot of "constructive" geometry - that is, code where I go in and compute some new shapes based loosely on input data in a way that "looks nice" for X-Plane.  These are big blocks of code using fundamental geometric operations...finding the intersection of lines, adding vectors to points, and testing side-of-line.

Thus option 1 (detect edge cases) isn't really viable.  It would mean asking 'how can this blow up' for pretty much every single line of code I write, and in some cases there isn't a really great answer if something goes wrong.

By using exact floating point, CGAL provides exact constructive geometry, which means that I can go write big fat piles of constructive scenery-creation code and trust that the results are what I think they should be regardless of how weird the input data is.

Tuesday, April 14, 2009

Objective C Is Completely Freaking Weird

Well, its memory management policy is, at least.

If you are like me (a C/C++ programmer who has to write small amounts of Objective C for a certain mobile platform), this document contains pretty much all of the answers you need regarding the question "am I leaking memory?"

I'll try to limit the ranting - but there are definitely some surprising things here, particularly if you are used to a more traditional reference-counted environment.  Key points:
  • Reference counting is not automatic like Java, so you absolutely can screw it up.
  • You are expected to not create cyclic reference-count dependencies - see "weak vs. strong" references.
  • Apple's naming conventions are very consistent, once you understand them.  This is important because the rules for memory allocation vary in a way that make the APIs more convenient, but less consistent.  (That is, sometimes objects are retained or auto-retained for you.  Lacking automatic reference counting of everything in the language, this is necessary to keep code from coming completely bloated.  But it means that you, the programmer, have to be able to look at an Obj-C method and go "I [don't] need to retain" and get the answer right 100% of the time.
  • autorelease basically acts as a deferred release, similar to a deferred-destroyer idiom in C++.

Monday, April 06, 2009

Compiling a 10.4u Application with GCC 4.2 on OS X

It turns out you can build an OS X 10.4 compatible application using GCC 4.2.  The trick is: you need to install GCC 4.2 Developer Preview 1, available as a download from Apple via ADC.

The developer preview has the GCC 4.2 support files for Darwin 9 (that is, OS X 10.5) in the 10.4u SDK.

Friday, April 03, 2009

XPTools - A Thing of Beauty

I'm going to start blogging about the X-Plane Scenery Tools code tree here...the source code is too technical of a subject for my scenery blog, which is aimed at authors, not programmers. Now that the repository has been cleaned up, other programmers can start working with the code pretty easily...if they Google for answers maybe some of these posts will show up.

Let me show something that I think is really quite beautiful:

http://dev.x-plane.com/cgit/cgit.cgi/xptools.git/tree/?h=master

That is the root level of the X-Plane Scenery Tools code tree, after a week of Janos and myself bashing at it. A ton of legacy code has been ripped out, file names normalized, library systems standardized across builds, etc. If you want to work with the scenery tools code, you can now do so without going crazy from the mess of random code floating around.

(You will still go crazy trying to understand how the hell the algorithms work. :-)

Here's another thing that I think looks good:


That's the cleaned version of the brains of "MeshTool".  Having it pulled out means that I can now add usability features, like more automated handling of orthophotos quickly and easily.  In fact, so could you!

One more...


Okay - that one doesn't even make sense to me.  The polygon code in the scenery tools had this horrible hack called "dominance" - Andrew can testify to how ugly it was.  The polygon code uses a DCEL as its data structure for polygons.  This means that for every "edge" (line) in the map, there are actually two "half-edges", lines in opposite directions that overlap.

The problem: which one contains the road data?  The old solution was dominance - exactly one was flagged as "dominant" and held the metadata.  The dominance flag helped code figure out where to look/store the data to avoid double-storage.

What a gross hack.  Dominance has been fully removed from the code.  And the data?  It is now possibly stored on either half-edge.  This isn't just a way to remove dominance, it's necessary. A road segment is now stored on the half-edge that goes in the direction of traffic.  So I can now import one-way street grids and the directional information is preserved.