Deadlines

“You can ship all your features, or you can ship on time, but you can’t do both” – God

A Story

Pachyderm Projects (I’ll call them “namespaces” for blog purposes): all the functions in PPS identify repos with a string: the repo name. Well, with projects, we need two strings: the repo and the project. This is a major refactoring, affecting all of our functions and classes, as well as our database schema. ...
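The shape of the refactoring the post describes can be sketched like this (a hypothetical illustration, not Pachyderm’s actual API; the function names are invented):

```python
# Before projects: a repo is identified by its name alone.
def repo_key(repo: str) -> str:
    return repo

# After projects: every function that took a repo name must now also
# take a project name -- a change that ripples through every caller
# and down into the database schema.
def repo_key_with_project(project: str, repo: str) -> str:
    return f"{project}/{repo}"
```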

November 16, 2022 · Matthew Steffen

Time Management for Software Engineers

The nature of being a Software Engineer: you wake up every morning, and you have a million things you need to do, all of which will kill your startup/get your project cancelled if not done: a hundred tests to fix, a thousand messes to refactor, ten thousand features to add. This week, you have time to do two of them. Unless you decide to really push it, starting early and staying late, charging boldly into the abyss; then you might be able to start on a third. ...

October 6, 2022 · Matthew Steffen

Debugging

The epistemological structure of a piece of software and a scientific theory is the same: you can’t “prove” a theory is correct, in the same way you can’t “prove” a piece of software has no bugs, but you can do lots and lots of testing.

The logical parallel clarifies (IMO in an interesting way) a fact of scientific research that I didn’t understand for a long time: you need expertise/peer review to evaluate scientific studies. This is for the same reason that programmers need domain-specific experience to know whether a given piece of software is “well-tested” or not. “Well-tested”, as a criterion, is defined relative to other, similar software and to standard practices in that field. This is why “facts guy” is often wrong: they don’t know what works, so they don’t know if a given experiment is rigorous.

When you have any kind of non-trivial bug in your software, you have two problems: (1) the software is broken, and (2) your mental model is broken (i.e. you don’t know why the bug is happening, because you don’t know what the software is doing). It’s sort of impossible to fix (1) without fixing (2), so my debugging technique focuses on fixing (2).

Fixing broken mental models is the purpose of the scientific method, and it really works great for debugging. I just keep a Google Doc, and I use my mental model to make guesses about what the software is doing (not necessarily about the root cause of the bug; just anywhere I think I might be wrong about anything). Once I have a hypothesis about what the software is doing, I literally write out “Hypothesis”, “Experiment”, “Result” in my Google Doc. I think of one or more experiments to test the hypothesis, write them down, do them, and record their results. “Result” always starts with “disproves hypothesis” or “consistent with hypothesis” (again, you can’t “prove” a hypothesis is true, but you’re testing it, and “consistent with” means the hypothesis passed this test, so it’s not wrong yet).

I color-code each “Hypothesis-Experiment-Result” block red or gray if the hypothesis is disproven, and green if it’s not. This technique parallelizes well: if lots of people are working on a bug, each person can run an experiment in parallel. Similarly, if a manager wants to see progress on a bug investigation, this list of hypotheses and experiments is a good way to show it.
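In practice, one entry in such a log might look like this (the bug and the experiment here are invented for illustration):

```
Hypothesis: the worker crashes because the config file is missing,
  not because of the new retry logic.
Experiment: delete the config file on a build *without* the retry
  patch and restart the worker.
Result: disproves hypothesis -- the worker starts fine without the
  config file; the crash only appears with the retry patch applied.
```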

September 30, 2022 · Matthew Steffen

Developing Incrementally

Incrementality is a style of development that affects everything in a software company, from “how to structure PRs” at the bottom to “how to release and market products” at the top. I have a collection of practices that I’ve learned, whose general theme is to make software development more incremental. I use them unless I have a good reason not to. I’ve seen more failures from insufficient incrementality than from superfluous incrementality, but I’ve seen a non-zero number of failures of each type.

Starting from the top: it’s good to build MVPs. Every product is, at release time, an experiment testing the hypothesis “people will like this”. Engineers are often anxious about releasing MVPs because they have visions of being overwhelmed by operational problems. I’ve learned that, typically, no one uses a piece of software on release: you usually have several weeks to fix things before your software gets any users at all (even with good marketing), and perhaps months before you get more than a handful of users.

At a lower level of abstraction than the software product itself: I usually try to include the ability to release experimental features. I usually implement this with a single “experimental mode” feature flag, client library, or beta release series containing all experimental features, to limit combinatorial complexity. Some projects, e.g. ember.js and Google Chrome, include a set of feature flags, one per experimental feature. If you’re confident you can manage the combinatorial complexity, this is better for users, because they can use as little experimental code as they need. This way, you can release features as “experimental” as you develop them, get feedback from 1-2 interested users, iterate, and then release those features as “non-experimental” in the next major release.
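A single “experimental mode” flag can be as simple as one environment variable gating every unstable code path. A minimal sketch, with invented names (nothing here is from a real project):

```python
import os

# One switch for all experimental behavior; a flag per feature would
# multiply the combinations that need testing.
EXPERIMENTAL = os.environ.get("MYAPP_EXPERIMENTAL") == "1"

def list_repos():
    repos = ["images", "models"]
    if EXPERIMENTAL:
        # Unstable feature, shipped dark: project-qualified repo names.
        return [f"default/{r}" for r in repos]
    return repos
```

Interested users opt in with `MYAPP_EXPERIMENTAL=1`; everyone else never executes the experimental branch.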
You can greatly reduce value risk and product risk with this approach, and also deliver value sooner, since experimental users aren’t stuck waiting for “the next big release that fixes everything”.

Finally, at a lower level of abstraction than “features”: I strongly endorse writing code incrementally.

Write a design doc before writing any code, even if you don’t show it to anybody (initially). Design docs are much shorter than code, but detailed enough to reveal a lot of design problems, and iterating on the design is much, much faster when writing English. They’re also a useful piece of documentation (make sure to include why the project is needed), and they can obviate annoying status meetings: just record your implementation progress in the design doc as you go and send it to partners/managers who want to see progress. On teams with limited product vision, a common problem is that there are too many ideas; design docs serve as a crude triage mechanism by imposing a “proof of work” burden on new ones. If someone wants to take the product in yet another new direction, you can delay (or sometimes eliminate) debate by asking them to write a design doc first. This is pretty dysfunctional, but it’s better than actually changing direction every day.

Write any new persistent data structures or schemas next. Whenever writing new code, you should always write the data structures first.

When it’s time to write code, I’m a huge, huge fan of breaking up patches as much as possible. Reviewers are sometimes annoyed by the flood of patches, but in my experience, code gets merged much more quickly and safely this way, because each patch is between easy and trivial to review, so patches get reviewed immediately. To do this, I often implement each change twice: once as a monolithic patch that contains a whole prototype of the feature (which I eventually discard), and then again as a series of small patches.
I use the monolithic change as a guide for what’s left to merge (by continually rebasing it on ‘trunk’ and refactoring as I merge patches), and I try to factor out:

Any non-functional changes (e.g. updating comments, renaming variables), merged as separate patches. Since I’m trying to make reviews fast, I try to keep diffs small, and factoring out non-functional changes is critical to that goal. For example, if I move a function, rename it, and change its implementation, I’ll make that three separate patches: moving a function is trivial to review (the diff is the size of the function, but the lines are the same); renaming a function is trivial to review (the diff is 1 + the number of callers, but every diff line is a simple replace); changing the implementation is nontrivial to review, but because the function has already been moved and renamed, the diff is no larger than the function body and shows exactly what’s different.

Any new classes or internal data structures, with no implementation. This attracts a lot of design feedback that is much easier to apply before the implementation is written.

Any new methods/APIs (again, with an implementation of “error: not implemented”). The implementation and tests are added in a second, now-smaller followup patch.

Each API call’s implementation, and tests for just that API.

Other writing (specifically about breaking up patches) that I’ve done on this, which I’d like to incorporate: ...
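Concretely, the first two patches in such a series might look like this hypothetical sketch (all names invented; the schema and the stubbed API each merge alone):

```python
from dataclasses import dataclass

# Patch 1: the new data structure alone, with no behavior attached.
@dataclass(frozen=True)
class Repo:
    project: str
    name: str

# Patch 2: the new API surface, stubbed out. This patch is tiny, and
# it attracts design feedback while the design is still cheap to change.
def create_repo(project: str, name: str) -> Repo:
    raise NotImplementedError("create_repo")

# Patch 3 (not shown): the real implementation, plus tests for just
# this API call.
```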

September 30, 2022 · Matthew Steffen

Technical Risk

Software people love to debate the value of software timelines/project deadlines. I think this hand-wringing arises from an incomplete understanding of how deadlines get missed. Deadlines are really a problem when software projects are harder than you thought they’d be. Emphasis on “than you thought” rather than on “harder”: not knowing how hard a technical project is, is the definition of technical risk.

Technical risk and difficulty are totally orthogonal. Digitizing forms is high-difficulty, low-risk; “I think there’s an API for that” is low-difficulty, high-risk. Risk depends on both the nature of the project and the knowledge of the developer. Everything is higher-risk for new teammates who don’t know the tech stack well, and highest-risk for new devs who don’t know any analogous tech stacks. A challenging corollary of this is that only the implementing developer knows how risky a project is, because only they know what they don’t know. Another corollary is that senior engineers can provide a huge amount of value just by de-risking others’ projects with the knowledge they passively hold (by estimating whether an approach will work and answering questions as the work progresses).

There are lots of techniques for managing/mitigating technical risk: spikes (for the risks you know about), MVPs, or my favorite game, “what would it take to get this done in T/2?”, which forces you to reflect on what you know will work vs. what you hope will work.

https://jproco.medium.com/how-to-deliver-software-without-deadlines-872f8eb244b0 ↩︎ ...

September 30, 2022 · Matthew Steffen