How Not to Read Code #announcement

As most everyone knows, the site has a lot of bugs. 99% of which are caused by my own poor planning, and tendency to hack together modifications to the site with code that "looks like it works". – Yahweh, October 2007

That's how I'm feeling right now, but I'm getting more competent with each mistake I make and realize, each modification I hack together and have to fix, each time I do something that "looks like it works" but doesn't and I have to figure out why.

Seems I take a more optimistic attitude to it, though. I suppose it's partially because I'm an incorrigible optimist, partially because I have experience with .NET from working with the code of Windows Forms applications, and partially because I also already have a lot of Web development experience under my belt.

But I firmly come from the other side of the tracks when it comes to that; my experience prior to FSTDT has been mostly LAMP (Linux, Apache, MySQL*, PHP), the exception to the LAMP acronym being that one of the projects I worked on professionally mercifully chose PostgreSQL. So far as Apache goes, I've never had to do much work configuring it aside from mod_rewrite.

*I hate MySQL with a passion for so many reasons that are beyond this post, and now that I no longer have to deal with it to get paid, I refuse to work with it.

In other news, voting is now open to anonymous and unregistered users! And that's actually because I programmed, compiled, and 'plugged in' a couple of classes that automatically check for suspicious voting behavior and email my FSTDT admin account an alert if they catch anything, so I don't have to manually keep an eye on it myself. They intercept parts of the QuoteComments.aspx codebehind if the current quote is either of the voting pages or the nomination page. This is also the first and only Visual Basic code in the current FSTDT codebase, though it's only going to be there temporarily.

Now to the main topic of this post: I was wrong about something I said earlier in the Site / Off-Topic discussion thread, namely that the BBCode processor is an ancient external library. In actuality, it appears to be custom-programmed, or if not, I still have the full source code all the same. The reason for my error is a combination of several coincidences (a couple almost freakishly so) and some stupid assumptions on my part. I feel like a complete fool. But I'll tell you a (not so) little story about what happened:

First, the external coincidences: There is another BBCode processor written in the stone age that has the exact same name as the class files of Distind's. Both of these store their settings and tags in a database, something I find a little unusual for a BBCode processor. Both are also written for the .NET platform, though I can't tell you whether the other BBCode processor was written in C#, VB.NET, or something else entirely. Either way, it almost certainly had to have been one of those two but that's beside the point.

Second, the internal coincidences: A few days after receiving the FSTDT source code, my walking through it led me to the BBCode class files. I skimmed over them (more on the reason for my skimming in just a bit), and a number of identifiers (variable and method/function names) stuck out to me and gave me the impression that these classes were some sort of wrapper* to the "real" external BBCode processor: frequent use of words like get, send, use, precompiled, translate, ConvertArguments, etc. all strike me as "wrapper-like" language. Wrappers are a pretty banal thing, so I lost interest pretty quickly and skimmed over that code even less than the other code in that directory. Most experienced programmers have seen a wrapper class or library at some point or another, and some have written them (me being on that list from back when I was a programmer-for-hire).

*So this makes a little more sense, a wrapper is a class or other structured code that acts as a mediator between a program's code and an outside library, i.e. all data transfer and other interaction between the program and external library are done through the wrapper class. There are several reasons to use a wrapper, but going into that is too far astray from the current topic. Even a nonprogrammer could probably think of a few useful purposes for something like that.
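
For the programmers who want something concrete, here is a minimal sketch of the wrapper idea in C#. Every name in it (IExternalBBCodeEngine, BBCodeWrapper, UppercaseEngine) is hypothetical and made up for illustration; none of it comes from the FSTDT source or from any real library.

```csharp
using System;

// Hypothetical stand-in for the API of an outside library.
public interface IExternalBBCodeEngine
{
    string Render(string bbcode);
}

// Fake "library" so the sketch runs on its own; it just upper-cases its input.
public sealed class UppercaseEngine : IExternalBBCodeEngine
{
    public string Render(string bbcode)
    {
        return bbcode.ToUpperInvariant();
    }
}

// Hypothetical wrapper: the rest of the program talks only to this class,
// never to the engine directly.
public sealed class BBCodeWrapper
{
    private readonly IExternalBBCodeEngine engine;

    public BBCodeWrapper(IExternalBBCodeEngine engine)
    {
        if (engine == null) throw new ArgumentNullException("engine");
        this.engine = engine;
    }

    // Cleans up the input, then hands it off to the wrapped engine.
    public string Translate(string input)
    {
        if (string.IsNullOrEmpty(input)) return string.Empty;
        return engine.Render(input.Trim());
    }
}
```

With the fake engine above, new BBCodeWrapper(new UppercaseEngine()).Translate("  [b]hi[/b] ") returns "[B]HI[/B]". Note the method name Translate: a class full of thin pass-through methods like this is what "wrapper-like" code tends to look like.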

I also noticed a surprising lack of words used in identifiers that I'd expect to see in a BBCode processor: lexer, tokenize, parse, expression, state, seek — basically I was expecting it to work a lot like a compiler that takes programmer-readable source code and compiles (translates) it into computer-readable binaries. I guess I had that assumption because that would be the approach I'd take to programming a BBCode processor — and is the one I'm taking for the one I am making now — since processing / converting data and compiling code are tasks suited to the same basic kinds of algorithms.
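
To show what that compiler-style approach implies, here is a minimal sketch of just the tokenizing step in C#. It is an illustration under stated assumptions, not code from FSTDT or from the processor mentioned above: the names TokenKind, Token, and BBCodeLexer are hypothetical, and it treats anything between [ and ] as a tag without checking that the tag is valid.

```csharp
using System;
using System.Collections.Generic;

public enum TokenKind { Text, OpenTag, CloseTag }

// Hypothetical token type: a kind plus the text or tag name it carries.
public sealed class Token
{
    public readonly TokenKind Kind;
    public readonly string Value;

    public Token(TokenKind kind, string value)
    {
        Kind = kind;
        Value = value;
    }
}

public static class BBCodeLexer
{
    // Splits the input into plain-text runs and [tag] / [/tag] tokens.
    public static List<Token> Tokenize(string input)
    {
        var tokens = new List<Token>();
        int pos = 0;
        while (pos < input.Length)
        {
            int open = input.IndexOf('[', pos);
            int close = open < 0 ? -1 : input.IndexOf(']', open + 1);
            if (open < 0 || close < 0)
            {
                // No complete tag remains: the rest is plain text.
                tokens.Add(new Token(TokenKind.Text, input.Substring(pos)));
                break;
            }
            if (open > pos)
            {
                // Plain text sitting in front of the tag.
                tokens.Add(new Token(TokenKind.Text, input.Substring(pos, open - pos)));
            }
            // Everything between the brackets, e.g. "b", "/b", or "url=...".
            string name = input.Substring(open + 1, close - open - 1);
            if (name.StartsWith("/", StringComparison.Ordinal))
                tokens.Add(new Token(TokenKind.CloseTag, name.Substring(1)));
            else
                tokens.Add(new Token(TokenKind.OpenTag, name));
            pos = close + 1;
        }
        return tokens;
    }
}
```

Given the input a[b]c[/b], Tokenize produces four tokens: Text "a", OpenTag "b", Text "c", CloseTag "b". A parser would then walk that list to match opening and closing tags and emit HTML, which is where identifiers like parse and state would show up.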

Upon arriving at this "realization" about the BBCode processor, I set off to find the "real" BBCode processor. I couldn't find any trace of it in the compiled binary or library folders. I thought that was a little odd (when in actuality it should have been a giant red flag I was not on the right trail — my blond comes from a bottle, but perhaps the chemicals from the dye have been somehow seeping through my scalp and skull and into my brain...) So I decided to go searching the 'net for it, and lo and behold, I find an external .NET BBCode ~~library~~ assembly whose name is identical to the BBCode class name. This must be it!!

...except it wasn't.

This first started to dawn on me when I could not find the DLL to the damned library anywhere, not even the GAC. I even checked for it on the server. Nothing. Hmm... maybe the BBCode classes in the FSTDT source code were borrowed from the source of this. Wait... there is no source code to it available. And no documentation to it. This thing must be shitty. But wait... the FSTDT BBCode classes are also kind of shitty about some things! And both access the BBCodes from a table in the database! Eventually I got the (legitimately) bright idea to just freaking read and comb through the FSTDT BBCode classes with the same thoroughness I gave the main code, which is what I should have freaking done in the first place. Lo and behold, it was the real BBCode processor code all along! It's also really, really, really bad in comparison to the code for the rest of the site, which is programmed competently enough. Its primary issue is a lack of polish.

Now for the reason why I didn't go through the BBCode files with the same attention I gave the others. "I thought it was a wrapper" is only part of the reason: why didn't I read through it and see that it wasn't? The short answer is that, as I stated earlier, I was skimming (to be generous), not reading. And the reason for that is that the BBCode class was part of a sub-project whose contents were almost entirely unimplemented, not-quite-complete code* that existed as a sort of island, having no kind of entrance point except for instantiating User and BBCode objects from their respective classes within the sub-project. I was not aware of the latter being referenced at that time, so I just skimmed through it like I did the other unreferenced code, and I arrived at the stupid-ass conclusion I did. And I stuck with that conclusion even after I realized it was referenced.

*I'll talk more about what that code was intended to be for in a later post. It's pretty interesting, and I intend on starting to incorporate some of its ideas into the FSTDT rewrite as soon as it's stable and at functional parity with the current version of FSTDT.

I mentioned earlier that the BBCode processor is, like, really, exceptionally, extravagantly awful. But if the sloppiness of this thing's code is that bad, then no better are my stupid-ass assumptions, stupid-ass self, and stupid-ass failure to actually seriously read the damn code properly to start with, instead of only doing it after I was at a complete loss as to what was going on.

Now let's look at some of the code of this, err, masterpiece.

This is just stupid on so many levels, like borderline CodeSOD stupid (sorry, Bossman):
private static Regex UrlRegex = new Regex("\\b([\\d\\w\\.\\/\\+\\-\\?\\:]*) ((ht|f)tp(s|)\\:\\/\\/|[\\d\\d\\d|\\d\\d]\\.[\\d\\d\\d| \\d\\d]\\.|www\\.|\\.tv|\\.ac|\\.com|\\.edu|\\.gov|\\.int| \\.mil|\\.net|\\.org|\\.biz|\\.info|\\.name|\\.pro| \\.museum|\\.co)([\\d\\w\\.\\/\\%\\+\\-\\=\\&\\?\\: \\\\\\"\\'\\,\\|\\~\\;]*)\\b", RegexOptions.Compiled);
Programmers in the audience: What is wrong with the above code when it interacts with the following code?
{
    input = HttpContext.Current.Server.HtmlDecode(input);
    if (Uri.IsWellFormedUriString(input, UriKind.RelativeOrAbsolute) && YCode.UrlRegex.IsMatch(input))
        return input;
    return "#";
};
Can you identify what bug this causes? If you can get it right and explain why, I'll put your name and achievement in an award at the top of this post.
