The Decisions Behind WikiLoopr

September 11, 2012

A couple of weeks ago, I created a website called WikiLoopr. Hours after it went live, the site exploded on Hacker News, made its way around Reddit and Twitter, and got featured in an NBC News tech blog. Those first 24 hours brought in over 40,000 hits to the site.

The site went live on the same day that I came up with the idea. Plus, I’ve never paid a dime for hosting. This was my least expensive project ever, both in terms of time and money invested. It is also by far my most successful. Here’s how I turned an idea into a viral success in a single night.

If you want concrete code examples for anything I talk about, check out the source code.

I satisfied my own curiosity.

The idea stemmed from this blog post by Andy Jordan, a brilliant student at the University of Chicago and fellow intern at Syndio Social. After reading his post, I spent way too much time that day looking up random Wikipedia articles and clicking links until I entered a loop. I was hooked. Curiosity got the best of me, and I decided I wanted to build something that could automatically show you the entire path to the loop without all the manual clicking. A couple hours later, I came home from work and got started.

I kept the design simple.

Look at the design. There’s nothing there that’s not necessary. I didn’t set out to create a beautifully designed website. All I wanted was to see the loop that stems from a Wikipedia article. The site has two distinct colors, plus a few shades of gray. As far as images go, there are two logos and a loading spinner. Nothing more is needed.

I didn’t use a database.

Yes, it would have been cool if I had kept track of things like what people are searching for, average length to the loop, and more. But this wasn’t part of my original plan. I didn’t care too much about analyzing the usage… I just wanted to get it out there! This saved me a bit of development time and aided performance.

All the logic is on the client side.

Speaking of performance, this one is huge. I run the site on a free Heroku dyno, which can only process one request at a time. Unfortunately, processing a single search can take quite a while. The site has to find the first link in the article, follow the link and download the next article, find the first link again, and so on. Some searches can require fetching up to 50 Wikipedia articles. With hundreds of people on the site at once, doing all of that on the server would have killed my precious free Heroku dyno.

So I took a different approach. I wrote all the logic in JavaScript (well, CoffeeScript to be more accurate). When a user visits the site, the browser essentially receives a list of instructions for what to do once the user searches for a page. It’s the ultimate form of parallel processing. Instead of one little server downloading all those pages and finding all the first links, hundreds of browsers can do that work at the same time.

The downside of this is that the site is much slower on mobile devices. But the huge benefit of running everything on the client side far outweighs this minor disadvantage.
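
To make the idea concrete, here is a minimal JavaScript sketch of the link-following procedure described above. It is not the actual WikiLoopr source: the names `findLoop`, `getFirstLink`, and `maxSteps` are hypothetical, and `getFirstLink` stands in for whatever code downloads an article and picks out its first link.

```javascript
// Follow first links from a starting article until an article repeats.
// `getFirstLink` is a hypothetical async function: it takes an article title
// and resolves to the title of the first linked article, or null if none.
async function findLoop(startTitle, getFirstLink, maxSteps = 100) {
  const seenAt = new Map(); // title -> position in `path`
  const path = [];
  let title = startTitle;

  while (title !== null) {
    if (seenAt.has(title)) {
      // We have been here before, so everything from that point on is the loop.
      const loopStart = seenAt.get(title);
      return { path, lead: path.slice(0, loopStart), loop: path.slice(loopStart) };
    }
    if (path.length >= maxSteps) break; // safety limit (an assumption)
    seenAt.set(title, path.length);
    path.push(title);
    title = await getFirstLink(title);
  }

  // Dead end or step limit reached: no loop was found.
  return { path, lead: path, loop: [] };
}
```

Because every visitor's browser runs a function like this on its own, the server never has to download a single Wikipedia article.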

I didn’t cache anything.

This one is a little controversial. Yes, I could have cached the results so that any time a new page is requested, WikiLoopr stores the first link so it doesn’t have to fetch the page from Wikipedia again. I didn’t do this because I wanted to release this thing as quickly as possible, and it simply wasn’t necessary. It may have put a bit of strain on Wikipedia’s servers, but for a site that serves 2,500 pages per second, the increase is barely noticeable.

Looking back, I’m really glad I never ended up caching anything. Why? Because the loop kept changing. When I made the site, most pages would lead you to a loop between Philosophy and Reality with nothing in between. Over the next few days, Wikipedians edited pages for Philosophy and similar concepts, which changed the loop. About 6 hours after I launched WikiLoopr, the loop contained 28 articles. It has since leveled off at 7. Andy Jordan, the author of the blog post that inspired WikiLoopr, wrote a follow-up piece in which he described the observer effect that was occurring.
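
For reference, the kind of cache I skipped could look roughly like the sketch below. It is purely illustrative and not part of WikiLoopr: `withCache` and `getFirstLink` are hypothetical names, and the cache is assumed to live in memory.

```javascript
// Wrap a first-link lookup so each article is only fetched once.
// `getFirstLink` is a hypothetical function: title -> first linked title
// (it may return the value directly or a promise for it).
function withCache(getFirstLink) {
  const cache = new Map(); // title -> promise of the first linked title

  return function cachedFirstLink(title) {
    if (!cache.has(title)) {
      const pending = Promise.resolve(getFirstLink(title));
      // Do not keep failed lookups around, so a later call can retry.
      pending.catch(() => cache.delete(title));
      cache.set(title, pending);
    }
    return cache.get(title);
  };
}
```

A cache like this never forgets an answer, so it would have kept serving the old Philosophy and Reality loop long after the articles had been edited. Any real version would need entries that expire.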

I made the site evolve with Wikipedia.

As I said, the loop was originally just two articles. In addition, the path leading up to the loop was usually around 5 to 10 articles long. Everything fit nicely on the screen at once. But as both the loop and the path leading up to it got bigger, they would no longer fit on the same screen. I started getting requests to show how big the loop is. So I added a little display on the right side of the page. Not only does this tell many users exactly what they wanted to know all along, but it makes the site easier to understand.
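
The numbers behind a display like that are easy to derive from the list of visited articles. Here is a small sketch under my own assumptions: `measureLoop` is a hypothetical name, and the input is assumed to be the chain of titles in visiting order, ending with the first title that shows up a second time.

```javascript
// Given the chain of visited titles, ending with the first repeated title,
// report the length of the path leading up to the loop and the loop's size.
function measureLoop(chain) {
  const repeated = chain[chain.length - 1];
  const loopStart = chain.indexOf(repeated); // first time we saw that title

  // If the last title never appeared earlier, there is no loop in the chain.
  if (chain.length < 2 || loopStart === chain.length - 1) {
    return { leadLength: chain.length, loopSize: 0 };
  }
  return { leadLength: loopStart, loopSize: chain.length - 1 - loopStart };
}

const sizes = measureLoop(["Cat", "Animal", "Philosophy", "Reality", "Philosophy"]);
console.log(sizes); // { leadLength: 2, loopSize: 2 }
```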

I didn’t waste time hunting down bugs.

It would have been nearly impossible to find all the bugs myself. With an input domain of 4,049,583 articles, it would have been pretty damn hard to test that every single Wikipedia article worked. So I didn’t even try. I spent about an hour testing it with all the random Wikipedia articles that popped into my head, and that’s it. Then I released it knowing that many articles would cause problems.

And they did. But people were very quick to point them out. People commented on Hacker News and Reddit. People tweeted bugs to me and submitted issues to GitHub. Within the first few hours, I knew about almost every bug that existed on the site. I fixed them later that day, and haven’t noticed any issues since the first few days. As far as I’m concerned, WikiLoopr works for 99.9% of pages and I can leave the code alone and move on to something else.

So there you have it. Feel free to reach out if you have any specific questions.