Note: this is a bit of an unfiltered ramble, but I wanted to make some sort of public statement about how our first week after PuzzleNode's launch went.
Last week, we launched a programming quiz website called PuzzleNode as a fun way to conduct new student admissions for Ruby Mendicant University. While the primary purpose of this site is to act as a sort of entrance exam for our core skills course at RMU, it is also suitable for general use. Because it is modeled in the style of Project Euler or the Internet Problem Solving Contest, people aren't limited to solving the puzzles using Ruby; any language will do.
In less than 7 days, we've had 110 people create accounts, 27 attempt at least one puzzle, and of those folks, 12 solve at least one problem. Two folks have even solved all four problems, which is no small amount of work. These numbers are modest, but better than we expected for a project that has only been mentioned indirectly through our call for participation for RMU. People seem to enjoy the problems we've posted so far, and we're still seeing new signups each day. But after a week or so, we're feeling more embarrassed by our execution on this project than proud of it.
PuzzleNode was built the same way as our original work on university-web: an hour here and an hour there between active RMU operations such as courses, alumni network activities, and other administrative work. The code is something Jordan and I hacked together whenever we got a bit of spare time. This is not how we usually like to do our work, because it results in sloppy, unreliable products. To put it bluntly, we hardly tested either the puzzles or the application at all before we went live with PuzzleNode, and that was a very bad idea.
In the first hour after launch, I detected a critical bug that we had not anticipated. We were using send_file for our file downloads, which worked fine when running the app in our local environments. But we quickly found that send_file did not work on our production server, which is Passenger based. I haven't yet had a chance to look up the root of the issue, but I had run into this problem once before and found that it's possible to work around it by using send_data instead. So I went ahead and cargo culted that old fix and it solved the problem, but likely not before lots of folks tried to download input files and got 0k file sizes.
Soon after that, bug reports started to roll in about the problems themselves. We simply hash uploads and compare them against a reference fingerprint, which means our system is very stupid about things like whitespace. This is something we're okay with because we keep the output specifications very simple, but two of our own solutions did weird things with whitespace by accident. We forgot to put a trailing newline at the end of our solution files, which is unintuitive and caused a bunch of false negatives for submitters until we fixed the files and re-uploaded them. Soon after that, the real problems began to uncover themselves.
Two of our four problems did not have unique solutions, which is a big problem when you are using hashes to verify results. One we fixed by uploading a slightly modified input file that guaranteed a unique output. Another we fixed by introducing some instructions on how to break ties between equivalent solutions. It took a long time to discover that these problems were flawed, and a fair bit of back and forth with frustrated submitters who were quite sure they had a correct answer but not sure why PuzzleNode was telling them they were wrong. Thankfully, these folks took it in stride, giving us patient, helpful bug reports until we uncovered the root issues and solved them. Fixing these problems caused solutions to start rolling in, a sign that things were looking up. But somewhere along the line, bad turned to weird.
Lots of folks successfully solved Problem #1, so we had assumed it wasn't flawed. But we were also receiving problem reports from folks who were pretty sure their solutions were right, which felt like a contradiction. Sure enough, our solution was wrong, but not for any reason that we could have predicted. We learned via a tweet that apparently the support for banker's rounding in BigDecimal in Matz's Ruby (MRI/YARV) is broken, and always has been. That forced us to implement our own rounding function and cross-check it against JRuby and 1.9.3dev to confirm our new answer.
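For the curious, a hand-rolled round-half-to-even function can be sketched in a few lines. What follows is only an illustration, not the actual code behind Problem #1: the name bankers_round and its places parameter are made up for this example, and it breaks ties with exact Rational arithmetic so that it never relies on BigDecimal's own rounding modes.

```ruby
require "bigdecimal"

# Hypothetical helper: round `value` to `places` decimal digits using
# round-half-to-even (banker's rounding). Accepts a String or a BigDecimal.
def bankers_round(value, places = 2)
  exact  = BigDecimal(value.to_s).to_r   # exact rational, no float error
  scaled = exact * (10**places)          # shift the rounding digit to the units place
  lower  = scaled.floor                  # nearest integer at or below
  diff   = scaled - lower                # fractional part, always in [0, 1)
  half   = Rational(1, 2)

  rounded =
    if diff > half
      lower + 1
    elsif diff < half
      lower
    else
      lower.even? ? lower : lower + 1    # exact tie: pick the even neighbour
    end

  # Shift the decimal point back by writing the integer with an exponent.
  BigDecimal("#{rounded}e-#{places}")
end

puts bankers_round("2.345").to_s("F")    # tie, 234 is even, so it stays 2.34
puts bankers_round("2.355").to_s("F")    # tie, 235 is odd, so it goes up to 2.36
```

Because every intermediate step is an Integer or a Rational, a sketch like this should give the same answer on any Ruby implementation, which is the property that makes cross-checking between interpreters meaningful.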
This seemed to solve the problem, but required us to post a glaring disclaimer that basically said "Be careful not to get bit by the same bug we did".
In the end, this whole thing feels like a bit of a clusterfuck, but it's worth mentioning that it was never meant to go this way, and that we've already put plans in place to prevent this from happening again. Our original plan was to run the puzzles internally within RMU's alumni network for several weeks before the entrance exam opened, to test both the app and the puzzles for correctness. I was also supposed to sanity check the rough drafts of puzzles the students wrote, and maybe even hand verify their solutions. But instead, what I ended up doing was uploading them the day of release and letting them loose on the public. Such is the life of running something like Ruby Mendicant University, which is in almost every possible way akin to a startup environment in which I'm the frazzled founder left holding the bag when the shit hits the fan. As important as PuzzleNode is to us, it was one of 100 things I needed to do for RMU that week.
Thankfully, our "Ready... FIRE! Aim..." approach has its limits, and we're actually craftsmen and professionals at heart. We know when we've botched a release, and we botched this one. We were scared that if we didn't release it on the date we promised, it might get buried under 100 other things, and bringing the site live forced us to treat it as a real thing. But we wasted a lot of people's time with badly tested infrastructure and problems, almost entirely due to my own lack of available time to do the proper QA checks. So, with that in mind, it's not enough to say we're sorry; I want to make sure something gets done about it.
Starting tomorrow, PuzzleNode is going to have a proper support structure. One of our alumni at RMU, Brandon Hays, has kindly volunteered to manage the triage, testing, production prepwork, and launch of new problems. We are also going to modify PuzzleNode so that we can pre-release problems to the RMU alumni network for testing ahead of time. In the future, every problem will be solved by at least a few testers before it goes live. This will also serve as QA testing for the app itself, which, now that we're out of crisis mode, we'll commit ourselves to working on a bit more carefully. But when things do break down from time to time, you'll have an official way to reach us, and someone ready to help you when you do. This is what we should have launched with, and I'm sorry that we didn't. But hopefully, our ability to fix things up quickly will restore any confidence we lost by doing such a sloppy launch.
Stay tuned to the announcements page on PuzzleNode for details about the support system we're putting in place. Until then, please go ahead and try the puzzles now if you haven't seen them yet, or try them again if you were having problems before. All in all, it's a cool site that our staff and students have put a lot of effort into, and I hope that it becomes a fun diversion for any hacker who enjoys a good puzzle from time to time.
PS: For those concerned about the broader problem that the velocity of the materials and tooling being produced around RMU is coming at the cost of the quality and consistency of our work, we hear you. The spring and summer are going to be dedicated to doing less, better, and we're taking lots of steps in that direction. More to come about that soon. And for what it's worth, all these corners we've been cutting are because we pour all our energy into making RMU courses themselves amazingly high quality. But there are ways we can keep that up without shipping crap for our supporting systems.