Dinner with Stroustrup – Takeaways

Recently, I had the pleasure of having dinner with Bjarne Stroustrup, the creator of the C++ programming language, in Mountain View, CA. I talked to him about JS++, and here are a few takeaways (no pun intended):

Every industry is full of buzzwords; JavaScript is no exception. In order to gain mainstream acceptance, JavaScript thought leaders maintained that JavaScript does support object-oriented programming – through prototypes. This is true; technically, JavaScript does support the pillars of OOP: inheritance, abstraction, encapsulation, and polymorphism. Historically, the problem with JavaScript isn’t the computer science concept of prototypical inheritance versus class-based inheritance; it’s the implementation of prototypes in JavaScript that results in spaghetti code.

Instead, Bjarne suggests: how does it benefit you? How does prototypical inheritance benefit you over classes? You don’t get SOLID, you don’t get custom types, etc. For instance, you’d have to really stretch the imagination to claim JavaScript supports the Liskov Substitution Principle, and some possible implementations for SOLID principles in JavaScript would – once again – result in spaghetti code. Don’t listen to “thought leaders” who have been yelling for years that prototypical inheritance is, in fact, superior to class-based inheritance (usually built on very weak arguments like “expressiveness”). Think critically. How does it benefit you? Does it make you more productive? (Certainly, spaghetti code can’t make you more productive.)

Furthermore, there is this constant re-invention of the wheel, such as trying to fit classic design patterns and paradigms into prototypes; this results in regurgitation after regurgitation of the Gang of Four Design Patterns book, and, oftentimes, these books are error-ridden, and their authors lack the depth of understanding of the design patterns in question to be writing a book about them. What’s wrong with classes and benefiting from all of the knowledge and intellectual capital that has been built up over the years for class-based OOP? The Gang of Four Design Patterns book is widely regarded as one of the classic texts on computer science and software engineering. Again, how does it disadvantage you and your work if you use classes rather than prototypes? Don’t listen to doomsday scenarios. Tune out the fearmongering. The truth is, classes have been used for decades in mission-critical scenarios like banking, telecommunications, aerospace, and rocketry.

As Bjarne suggests: what is your problem, and how can OOP or classes/prototypes solve it? And to the purists and zealots, what did he have to say? I quote: “Are you a Nazi?!” There is always going to be a crowd of people who will tell you that prototypes are the only way, that classes are bad, and only prototypes are the light. So I want to end this with: would you listen to a Nazi?

With that said, JS++ – by its very nature – is multi-paradigm like C++; it supports both prototypical inheritance and class-based inheritance.

Inefficiently Efficient

I’ve been asked about this twice these last few weeks, so it’s easier to blog about it. This blog post is written within the context of C/C++, where there is usually an (over-)obsession with performance – or at least much more performance “awareness” than in higher-level languages, where developers accept that their languages are pretty slow.

I love clean code. Clean code is both an art and science. One of the most important elements of clean code is naming. If we name our variables well, our code can be so readable that we don’t even need comments! It brings us a step closer to declarative programming. However, it takes more than just good variable naming to write clean code. Sometimes, you might just want to eat up more memory for the sake of readability. This is what I call “inefficiently efficient.” Let’s take a look…

Oftentimes, C/C++ programmers can become so obsessed with performance that they begin to micro-optimize. We’ve all heard the quote, “Premature optimization is the root of all evil.” While it’s all well and good if you aren’t overoptimizing to the point of impacting readability, programmers can become frightened to death at the thought of personally introducing inefficiency into the code. “What?! Inefficiency under my name?! With the VCS to prove it too!” This is an instinctual reaction. However, allow me to discuss why purposefully introducing inefficiencies to write clean code can do a world of good for you (and your code).

This is our if statement:

if (node->op == "-" && node->argument->is<NumericLiteral>()) {
    // ...
}

What on Earth is that doing? As you’re writing code like this, you might know in the moment exactly what it does. Come back and revisit the code in five years, and let’s see if your memory is so fresh.

I’m a fan of writing code where, upon glancing at it, you can immediately tell what it does. There is no need to inspect the code, and there is no need to jump from file to file trying to figure out how it all interconnects. Following this principle, let’s try and refactor the above code:

bool isNegativeNumber = node->op == "-" && node->argument->is<NumericLiteral>();
if (isNegativeNumber) {
    // ...
}

Now it’s perfectly clear what we were testing for in the if statement: we wanted to know if we had a negative number!

The beauty of variables is that you can use them to give a human-readable name to almost any expression. In fact, that’s what variables were designed for! What’s happening is that we consume slightly more memory to greatly enhance readability. Let’s put numbers to that: we are allocating 1 byte (on the stack in this case, not even the heap), but we are getting 2x, 10x, 100x better readability, maintainability, and scalability. Is that a trade-off you’re willing to make? In a more advanced usage, you can actually break down expressions into multiple subexpressions and baptize those subexpressions with human-readable names.

So, yes, we are introducing inefficiency. In return, we gain scalability. I’m not talking about vertical scalability and efficiency where your program is so lean, mean, and efficient it takes almost no memory. I’m talking about scaling the software development process, scaling the project, scaling where – once you add more team members – they will know exactly what your code is doing and how to use/modify it.

In the end, is it all worth 1 byte on the stack? I think it is.

Process for Major Code Rewrites

Eventually, in our careers as software developers, we are going to screw up. Some of us will screw up massively such that, after exhausting all the potential options, we come to the realization that we must rewrite our code.

I’m not referring to refactoring, which I define as changes to the code that do not result in changes to the program’s behavior. I’m referring to a proper rewrite – due to fundamental scalability problems, fundamental architectural problems, platform revisions, and so on. Even successful companies in the field of technology have faced this problem before. You’re not alone.

We finished a major code rewrite early this year. Surprisingly, I was unable to find many actionable resources on the web. There were a number of theories and lots of opinions, but real-world experience seemed scarce.

My experience relates to a major code rewrite for the JS++ compiler. We could not scale further. Compile times were O(n²). This wasn’t a micro-optimization or a single class at fault. There were fundamental errors we made in the design of the programming language, which carried over to its implementation in the compiler.

While I endeavor not to obsess over performance, this reached a level of performance degradation that would negatively impact UX. Have you ever stopped using software because it was too slow to be usable? We were in that boat. We went from 5-minute compile times generating 40 MB of cache to 7-second compile times generating 30 KB.

At the time we began the rewrite, the project exceeded 100,000 lines of code (not counting third-party libraries), spanning some three years’ worth of work. In terms of real-world experience, this should be suitably complex.

It took us just three months to complete the rewrite. This was the strategy I devised and which we successfully executed:

  1. Do NOT rewrite from scratch (without a VERY good reason).
  2. Identify what needs to be rewritten.
  3. Break the rewrite down into chunks.
  4. Break the rewrite down into testable chunks.
  5. Start small. One monolithic rewrite is just a sum of its parts.

Let’s break it down step by step:

1. Do NOT rewrite from scratch (without a VERY good reason).

As tempting as it may be, conventional wisdom dictates that we should NOT rewrite from scratch.

Whilst actionable software rewrite strategies were scarce when we set out on this, an article by Joel Spolsky was helpful and convincing. If you read nothing else, consider these words of wisdom:

“It’s important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time. First of all, you probably don’t even have the same programming team that worked on version one, so you don’t actually have ‘more experience’. You’re just going to make most of the old mistakes again, and introduce some new problems that weren’t in the original version.”

However, there are good reasons to rewrite from scratch, such as using a better programming language that better fits your problem domain. The benefits of a full rewrite need to fully outweigh the disadvantages. If you plan to rewrite from scratch in the same programming language, with mostly the same developers, and so on – it might not go as well as you think it will right now. You ought to be actively persuading yourself with all the reasons not to rewrite from scratch – not the other way around.

Remember: rewriting from scratch does not guarantee “better,” no matter what the perceptions or biases inside your head are telling you right now. You are taking a massive risk with a full rewrite. It’s a gamble when you have a sure thing that is built already.

2. Identify what needs to be rewritten.

The reason you got here in the first place was likely a lack of foresight. Don’t make this mistake again.

First, analyze the problem. In this phase, you are trying to come up with every reason NOT to rewrite at all. Is there a clever fix you can come up with? Can you do things a different way without a rewrite? Can you sweep it all under the rug and build on top of what already exists?

Poor code quality is not the reason for a rewrite. It’s the reason for refactoring. Otherwise, do you have fundamental behavioral changes you must make? The keyword is “must”. Is it like food, water, and shelter for your company, or is it more like buying a nice vase? Separate the important from the luxuries.

For commercial software, these will almost certainly be business reasons. For us, we knew the software would fail in the market if it was too slow to use. We were developing a tool to enhance developer productivity while simultaneously eating that productivity back up in compile times.

Once you’ve analyzed the problem, you must devise “how” you are going to rewrite. Which new algorithm(s) are you going to use? Which new architecture? What are the consequences? Do the positives of a rewrite outweigh the negatives? How long will this take? You must carefully assess the who, what, when, where, why, and how before you start.

3. Break the rewrite down into chunks.

This was the “Aha!” moment. As I mentioned, when researching how to move forward with a large code rewrite, there existed a dearth of actionable resources. After careful analysis, I had to create a reasonable plan of action. We were talking about a massive and daunting rewrite. Morale was low.

Chunking solved our problem (in terms of engineering complexity and psychological morale). “Chunking” is just breaking down one very large task into smaller, more manageable chunks. Generally, the more granularity you can achieve, the better.

4. Break the rewrite down into testable chunks.

So far, we’ve discussed strategy. We haven’t actually talked about the software side yet.

You’ve broken your large code rewrite into much smaller, more manageable tasks. This is from the project management perspective. Now, when you actually look at all your code, where do you start?

First, take advantage of version control. Create a new branch. Fortunately, we were using git so branching was cheap. A new branch minimized risk. If the code rewrite failed, we would just scrap the branch and revise our strategy. (Fortunately, we didn’t have to as that would have eaten up precious time.)

Starting from your new branch, you do not look at the implementation first. Instead, you’re taking a TDD-style approach from here: find a relevant test, make it fail (by encoding the new behavior), fix the code so the test passes, and repeat.

5. Start small. One monolithic rewrite is just a sum of its parts.

As mentioned in the last section, start with your first test and make it fail; fix the code so that the test passes, and repeat the process with the next test. Over time, you will have incrementally applied the fundamental changes across the entire system.

And, really, that’s it. We had A LOT of integration tests failing (naturally) with just a few small changes when we first began, and it was very daunting. I can assure you, there is light at the end of the tunnel. The beauty of breaking everything down into testable chunks is that you will have tangible progress, which will inevitably boost morale. You’re not staring into a black box and hoping to go from nothing to a fully rewritten and working final software product. So the final advice is: be patient.

I’ve worked on several large-scale projects before JS++. I’ve worked on projects large and small since 1997, in both waterfall and agile environments, and rarely have I needed to rewrite, let alone rewrite in such a massive, make-or-break, and demoralizing scenario. It is human to make mistakes; it is human to make massive mistakes. My hope is that this article will shine a light for others facing a similar challenge.