Thursday, September 07, 2006

Programmers' Candy

Aspirin:  Candy for Programmers

Happiness is sitting in a room full of suits and Java guys and trying to explain all the different flavors of IPC you can do on Windows. You're trying to convince them to just go with sockets, because it'll have to work on Unix too in the next go-round, but the other stuff sounds so shiny and foreign and exotic to them, and they really want to have a go at it, moth-to-flame style. Or more to the point, they want you to have a go at it. Somewhere they've picked up the notion it'll be "faster" to go with something else, and not just because it'll have to be done in JNI. That may actually be true, so long as we don't mean "faster" in terms of development time. They also don't have any metrics or estimates on how fast is going to be "fast enough" for their needs. It's all just hunches and irresistible shiny objects right now.

So you're in a room with these guys, trying to explain the differences between a.) memory-mapped files, b.) shared data segments (blech!), and c.) opening another process and writing into its address space (yikes!). Plus a few other things you tossed in for the sake of absolute completeness, so they can't come back and demand to know why you didn't tell them about, say, doing it with custom window messages or NetDDE or something equally nutty. And at the end of it, you get blank looks and somebody asks you "Can't you just make an API?" Well, if you mean "Can't you hide the gory details from us with a clean, abstract interface?", sure, of course I can do that. But if you really don't want to know, why do you keep asking me?

Happiness is also explaining, for the umpteenth time, how to use the basic Windows Registry functions properly without screwing up, for really basic no-brainer stuff like listing the keys or values under a given reg key, or getting or setting a given value. It's not hard. If you don't know how, you go read MSDN and do what it says to do. Write a little test program, see if it works how you expect. If it doesn't, go back to MSDN, or just experiment a little until you figure it out. More than likely, you just didn't realize that the name buffer size is an in/out param in RegEnumKeyEx or RegEnumValue, and you have to reset it to the right value after each iteration. When someone says their registry code isn't working, that's pretty much the first thing I look for.
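
Since the in/out length is the thing that bites people, here's a rough sketch of the enumeration loop in C. To keep it runnable without a real registry, the loop is written against a made-up callback type (enum_next_fn, with made-up ENUM_* return codes) that follows the same name/length contract as RegEnumKeyEx. fake_next and sample_names are stand-ins invented for this illustration. The reg_next adapter, compiled only on Windows, is where the actual RegEnumKeyExA call would go. Treat it as an illustration of the reset, not production code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

#define NAME_CAP 256u  /* registry key names max out at 255 characters */

enum { ENUM_OK = 0, ENUM_DONE = 1, ENUM_TOO_SMALL = 2 };

/* Hypothetical callback mirroring RegEnumKeyEx's name arguments:
   on entry *len is the buffer capacity in chars (terminator included),
   on success it becomes the length of the name (terminator excluded). */
typedef int (*enum_next_fn)(void *ctx, unsigned index, char *name, unsigned *len);

/* Walks every item; returns how many were read, or -1 on any error. */
int count_items(enum_next_fn next, void *ctx)
{
    char name[NAME_CAP];
    unsigned index = 0;

    assert(next != NULL);
    for (;;) {
        unsigned len = NAME_CAP;  /* reset on EVERY pass; the call overwrites it */
        int rc = next(ctx, index, name, &len);
        if (rc == ENUM_DONE) return (int)index;
        if (rc != ENUM_OK) return -1;
        printf("%u: %s (%u chars)\n", index, name, len);
        index++;
    }
}

/* Made-up sample data and a stand-in enumerator over a NULL-terminated
   string array, so the loop can be tried without a registry. */
const char *sample_names[] = { "a", "a_much_longer_name", "b", NULL };

int fake_next(void *ctx, unsigned index, char *name, unsigned *len)
{
    const char **items = (const char **)ctx;
    size_t need;
    unsigned i;

    for (i = 0; i <= index; i++)          /* never read past the NULL */
        if (items[i] == NULL) return ENUM_DONE;
    need = strlen(items[index]);
    if (need + 1 > *len) return ENUM_TOO_SMALL;  /* like ERROR_MORE_DATA */
    memcpy(name, items[index], need + 1);
    *len = (unsigned)need;                /* capacity is gone after this */
    return ENUM_OK;
}

#ifdef _WIN32
/* The real call: ctx is an HKEY opened with KEY_ENUMERATE_SUB_KEYS. */
int reg_next(void *ctx, unsigned index, char *name, unsigned *len)
{
    DWORD cch = *len;
    LONG rc = RegEnumKeyExA((HKEY)ctx, index, name, &cch,
                            NULL, NULL, NULL, NULL);
    *len = cch;
    if (rc == ERROR_NO_MORE_ITEMS) return ENUM_DONE;
    if (rc == ERROR_MORE_DATA) return ENUM_TOO_SMALL;
    return rc == ERROR_SUCCESS ? ENUM_OK : -1;
}
#endif
```

Calling count_items(fake_next, (void *)sample_names) walks all three sample names. Move the len = NAME_CAP line above the loop and the same call gives up at the second name, because len is still 1 from the first one. On Windows, pass reg_next and an open HKEY instead.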

There are a few really obscure hangups and gotchas involving the registry, but you probably won't encounter them unless you deliberately go looking for trouble. (I'm coming at this strictly from a programming perspective, and I'm ignoring larger issues like the lack of meaningful structure, and the whole single-point-of-failure thing, just for example.)

  • Key names can contain embedded null characters. The internal Native API calls use Pascal-style UNICODE_STRING structs to represent key names, rather than the null-terminated strings you see in Win32 land. So you can have two names "Foo", length 3 wchars, and "Foo\0", length 4 wchars, and Windows will happily consider them different names. They look identical in Regedit. If you have both, selecting either will open the one with the 3-character name. If you just have the 4-character form, you'll just get an error. Since the Win32 functions expect a null-terminated string, and don't expect the null as part of the name, a name with an embedded null is literally unspeakable with the usual functions. You need to use NtOpenKey instead, but first you need to know you need to use NtOpenKey. There's very little documentation about this stuff out there. MS originally used this trick to protect some of the SAM keys under HKLM\Security\Policy\Secrets, and now it's become popular among spyware authors too. Someone at MS gets points for cleverness, but I still think this is a deeply silly and weird "feature".
  • Registry symlinks are a big botch. I have no problem with the basic idea of symlinks in the registry. Having a CurrentVersion link that points at the current version of something is fine. The problem is that there's no easy way to tell 100% for certain if a given key is a symlink or not. There's a procedure by which you can open a key as a symlink. When you do that, the link data itself lives in a value under the key called SymbolicLinkValue. But just looking for values by that name isn't good enough, because you can just as easily create a value named that under a normal key. And opening a normal key as a symlink just opens the key normally instead, rather than erroring out. There's no property you can query that tells you whether a key is a symlink or not. Which is weird, since the information has to live in the registry somewhere, internally. It just isn't exposed properly to the outside world. Bastards.
  • There also isn't any good way to know whether a given key is "volatile" (i.e. not backed by on-disk hive data) or not. There really ought to be some way of knowing whether the data you just stuck under key X will still be there after the next reboot.
  • Another Native API quirk: In the kernel namespace, the entire registry has a single root, so that HKEY_LOCAL_MACHINE\Software\Asdf is really \Registry\MACHINE\Software\Asdf, for example. \Registry has two subkeys, MACHINE for HKEY_LOCAL_MACHINE, and USERS for HKEY_USERS. (If you have auditing turned on for any registry keys, accesses will be reported under these kernel-style names.) One fun detail is that the root key itself is not visible with the Win32 functions. I haven't tried this, but it's conceivable that you could mount a registry hive directly under \Registry and it'd be basically invisible.
  • HKEY_PERFORMANCE_DATA works in a totally different way than everything else in the registry. Walking through its contents causes various performance counter DLLs to be loaded and executed, which a.) can take a while, and b.) may raise security concerns.
  • 64-bit Windows introduces a brand new layer of complexity, with separate 32-bit and 64-bit versions of the "same" key, and a goofy redirection layer that tries to give you the right one. If you're a 32-bit app on 64-bit Windows, and you want to see the 64-bit portion of the registry, you need to specify the KEY_WOW64_64KEY flag when opening keys. The 32-bit version of a key is stored in a subkey named Wow6432Node, under the 64-bit version of the key. So when a 32-bit app opens HKEY_LOCAL_MACHINE\Software, by default it's actually looking at HKEY_LOCAL_MACHINE\Software\Wow6432Node, unless the special flag is provided.

    So far, so good. But it turns out that you also need to *NOT* specify this flag if your 32-bit app wants to look at the 32-bit key, using the "real" reg path. If you pass the KEY_WOW64_64KEY flag in this case, instead of HKLM\Software\Wow6432Node, you get the real, 64-bit HKLM\Software. Which has a subkey named Wow6432Node. And if you open that subkey, again using the flag, you get HKLM\Software again, and so on, ad infinitum. Which is bad.

    The only solution I know of so far is to look for "Wow6432Node" in the key name, and take that as a sign to not use the 64-bit flag. At this point I don't know if keys named "Wow6432Node" are automatically 'magic' or not. If not, even looking at the key name won't be foolproof.
  • Did I mention there's such a thing as remote registry access? And the 32-vs-64 bit thing can crop up there, too? So you have to either know or figure out what sort of CPU the other box is running, just so you can be sure you're talking to the registry correctly. That's just not very nice at all.
  • If you need to change or validate settings for all (or arbitrary) users on a given box, you may need to manually mount their user hive under HKEY_USERS, and then manually save the hive when you're done, since HKEY_CURRENT_USER is always you, whoever you happen to be, and not the other user account. Apps really ought to have global settings under HKEY_LOCAL_MACHINE or something, but often they don't. Windows isn't in the business of enforcing stuff like this.
  • Windows isn't even in the business of enforcing a limited set of data types in registry values (the stuff you see in the Type column in Regedit, such as REG_SZ, REG_BINARY, etc.). When you set a value, you can set its type to any old 32-bit value you like. So if you have, say, a switch statement based on the type of a registry value, handling all the types listed in MSDN is not enough. You need a default case that at minimum doesn't lead to your app exploding. This is the voice of experience speaking here.
  • Similarly, Windows doesn't even enforce the types it does know about (although Regedit tries). Just because something says it's a REG_DWORD, it ain't necessarily so, and you still have to check whether it's actually 4 bytes or not. It could be zero bytes, or 5, or 4095, or whatever.
  • When you create a key, you can optionally specify a "classname", a string of arbitrary length that serves no known purpose and isn't exposed by the standard registry tools. If you're worried about people hiding stuff in your registry, or you want to hide stuff of your own in the registry (your own, not somebody else's, please), this is a good hiding place. It's only visible with RegEnumKeyEx, and then only if someone has the presence of mind to provide a classname buffer. Once the classname is set, the only way to change it is to delete the whole key and re-create it with the new classname.
  • If you're backing up a section of the registry, or you just need to be sure you can read all of it, you can open it with REG_OPTION_BACKUP_RESTORE, which ignores all those pesky key permissions and so forth. The problem is that you can't use this flag with RegOpenKeyEx, only with RegCreateKeyEx. The unhappy side effect is that if the key doesn't already exist, Windows helpfully creates it for you, so you may have to look at the "lpdwDisposition" outparam, figure out whether you just created the key or not, and act accordingly.
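
For the two bullets above about value types and sizes, the defensive habits might look something like this in C. It's a minimal sketch: type_name, dword_from_value, and query_dword are hypothetical helper names, not anything Windows provides. The REG_* numbers are the ones from winnt.h, repeated here only so the portable part builds off Windows. The query_dword wrapper around RegQueryValueExA is compiled only on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#else
/* Type codes as defined in winnt.h, repeated so the sketch builds elsewhere. */
#define REG_SZ        1u
#define REG_EXPAND_SZ 2u
#define REG_BINARY    3u
#define REG_DWORD     4u
#define REG_MULTI_SZ  7u
#endif

/* Hypothetical helper: name a value type without exploding on oddballs. */
const char *type_name(uint32_t type)
{
    switch (type) {
    case REG_SZ:        return "REG_SZ";
    case REG_EXPAND_SZ: return "REG_EXPAND_SZ";
    case REG_BINARY:    return "REG_BINARY";
    case REG_DWORD:     return "REG_DWORD";
    case REG_MULTI_SZ:  return "REG_MULTI_SZ";
    default:            return "unknown";  /* any 32-bit number can show up */
    }
}

/* Hypothetical helper: accept a DWORD only if the type AND the size agree.
   Returns 1 and fills *out on success, 0 otherwise. */
int dword_from_value(uint32_t type, const unsigned char *data,
                     size_t size, uint32_t *out)
{
    assert(out != NULL);
    if (type != REG_DWORD || data == NULL || size != 4)
        return 0;  /* zero bytes, 5 bytes, 4095 bytes: all rejected */
    /* REG_DWORD data is little-endian; assemble it byte by byte. */
    *out = (uint32_t)data[0] | ((uint32_t)data[1] << 8) |
           ((uint32_t)data[2] << 16) | ((uint32_t)data[3] << 24);
    return 1;
}

#ifdef _WIN32
/* Reads a value and trusts neither the reported type nor the size. */
int query_dword(HKEY key, const char *value_name, uint32_t *out)
{
    BYTE buf[16];
    DWORD type = 0, size = sizeof buf;
    LONG rc = RegQueryValueExA(key, value_name, NULL, &type, buf, &size);
    if (rc != ERROR_SUCCESS)
        return 0;  /* includes ERROR_MORE_DATA for oversized data */
    return dword_from_value(type, buf, size, out);
}
#endif
```

The point of splitting it this way is that the paranoid part (the default case and the size check) has no dependency on the registry at all, so it can be exercised with hand-made byte buffers, including the malformed ones you'd rather not plant in a real hive.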
