Optimizing for learning vs results
Yesterday, there was an active discussion on Hacker News, sparked by a blog post from someone railing against large language models letting people do things without understanding what they're doing. On the surface, his gripes ultimately came down to edge cases. But deeper down, psychologically, this discussion happens every time a new technology arrives: someone laments that the old technology had benefits the new one lacks, or that they had to go through difficult trials, with real pain and suffering, to get where they are, while newcomers skip all that and can still get the same things done. It's always one of the two.
In particular, there's one theme that always comes up: the new technology lets you do things without understanding what you're doing. In the short term, this is fantastic. You can now do things you didn't know how to do before, without paying the heavy learning cost, which is awesome. The concern is the long term: if you keep doing things without understanding them, maybe you never actually learn the thing, and when things get too difficult, you find yourself unable to proceed. This discussion can be twisted slightly for our purposes, coding faster: when and how do you optimize for learning, and when do you optimize for results? If you use technologies that solve problems for you quickly, you're definitely optimizing for results. If you sit down and really grind things out, you're optimizing for learning, which in the long term will really come in handy. So let's dig into that discussion.
Optimizing for learning
Optimizing for learning is extremely important for fast coding in the long term. The more you understand, the better you'll be able to solve problems both by hand and with sophisticated tools. Because you understand the systems at a deep level, whenever the technology has a problem, which happens more often than you'd think, you can quickly readjust, because you can tell the technology isn't doing the right thing. This is similar to the theory that the best engineer should be promoted to manager, because the best engineer can troubleshoot the other engineers, since he understands what they're trying to do better than they do. So when they make a mistake or a misstep, he can course-correct them and keep them on the right path. And this holds whether the one being corrected is a person or an artificial intelligence agent.
The main problem with optimizing for learning is that learning is really hard. If we're lucky, we enjoy it, in which case we deeply engross ourselves in the material, it's a pleasure, and we just want to learn more, which is great. But sometimes learning is really painful, for lack of a better word, and it often involves grinding things out and dealing with finicky systems. For example, yesterday I was trying to build a program that was both exposed through ngrok and used a VPN to make its outbound web requests. The problem is that the networking becomes really, really subtle, and there's no obvious way to do it. So you end up just grinding it out: you try this and it fails, you try that and it fails, you try this other thing and it fails. All of these failures are genuinely good for learning, because you understand what not to do, which is just as important as understanding what to do. But it's a pretty miserable experience, at least for me, to try and fail, try and fail, and at the end of the day not really have anything. Honestly, I ended up using a simpler solution, a virtual private server, instead of getting the original setup to work, because it was just so finicky and so subtle. So I optimized for results and not learning. A large part of that was because I needed to get it done, but there was definitely an element of psychology: it wasn't a fun process, I didn't want to keep doing it, so I found a way to solve my problem without going further down this path of extremely painful learning.
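To make the shape of the problem concrete, here's a minimal sketch in Python of the kind of program I was after: it serves a local HTTP endpoint (exposed to the internet by an ngrok tunnel started separately, something like ngrok http 8000) while sending its own outbound requests through the VPN. The sketch assumes the VPN is reachable as a local SOCKS5 proxy on port 1080, which is purely an assumption for illustration; my actual attempt ran the pieces in Docker, which is exactly where the networking got subtle.

    # Minimal sketch, not my actual program. Assumes a VPN that exposes a
    # local SOCKS5 proxy at 127.0.0.1:1080 (hypothetical, for illustration).
    # Inbound traffic arrives via an ngrok tunnel started separately:
    #   ngrok http 8000
    from flask import Flask          # pip install flask
    import requests                  # pip install "requests[socks]"

    app = Flask(__name__)

    VPN_PROXY = {
        "http": "socks5h://127.0.0.1:1080",
        "https": "socks5h://127.0.0.1:1080",
    }

    @app.route("/fetch")
    def fetch():
        # Outbound request is routed through the VPN proxy; the response is
        # served back to whoever hit the ngrok URL.
        resp = requests.get("https://example.com", proxies=VPN_PROXY, timeout=10)
        return {"status": resp.status_code, "length": len(resp.text)}

    if __name__ == "__main__":
        app.run(port=8000)

Outside of containers this shape is straightforward; the subtlety I ran into came from trying to get the same behavior once Docker was in the mix.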
At the meta level, coming up with creative solutions, and learning what not to do, is itself part of learning. Next time I have this kind of networking issue, I'm going to immediately reach for a virtual private server instead of trying to do fancy things with Docker, because the networking is such a pain. I also now know that anything involving Docker networking may end up being a pain point, and I should avoid it if possible. But the only way I learned that was by actually trying it and going through the painful experience of finding out that none of the obvious solutions work, and that there must be something very subtle one has to do to make it happen.
Optimizing for results
So optimizing for results is a very interesting strategy, because it follows the bell curve meme. If you just naively optimize for results, you're not going to grow, and because you're not growing as quickly, you'll get less done in the long term: your long-term ability to get stuff done is dominated by your growth, not by minor optimizations around getting things working. The IQ 100 strategy of optimizing for learning takes this all into account: focus on growth and learning, and the results will take care of themselves.
But the IQ 140 move is to optimize for results while minimizing the necessary learning.
The reason it's a bell curve is that, at the end of the day, the IQ 60 and IQ 140 engineers are both optimizing for results, while the IQ 100 engineer is optimizing only for learning. But the IQ 140 play gets the results while actively avoiding the expensive cost of learning.
The IQ 140 person knows that learning and understanding are really expensive in time, and so aims to build systems that are simple to understand, without running into the IQ 60 problem of creating spaghetti code or the IQ 100 problem of having to spend tons of time learning before being able to do anything.
Sadly, there's not really a way to become an IQ 140 engineer without going through the IQ 60 and IQ 100 stages. We're using IQ here to refer to the meme, but really it maps better to seniority:
60: junior engineer
100: mid-level to senior engineer
140: staff level and above
Of course, being able to build systems that are simple to understand yet solve complicated problems is extraordinarily difficult, which is why it deserves the IQ 140 moniker, and partly why staff engineers are paid so much more than senior engineers.
You're going to have to write spaghetti code so that you appreciate why you shouldn't write spaghetti code.
You have to do the learning so that you appreciate the value of learning, and also how darn expensive it can be and how much you want to avoid it when you can.
The best way I know to get through this is not to try to skip being a junior or mid-level engineer, but to speedrun the process. I wish there were a way to avoid these steps and have everyone immediately become staff, but I haven't figured out how to do it.