Things Every Programmer Should Know: #10, #11, #12

This post continues the 97 Things Every Programmer Should Know project: pearls of wisdom for programmers collected from leading practitioners, published by O'Reilly (license link).


You can also read the previous points: #7, #8, #9.


10. Consider the Hardware by Jason P Sage

It's a common opinion that slow software just needs faster hardware. This line of thinking is not necessarily wrong but, like misusing antibiotics, it can become a big problem over time. Most developers don't have any idea what is really going on "under the hood." There is often a direct conflict of interest between best programming practices and writing code that screams on the given hardware.
First, let's look at your CPU's prefetch cache as an example. Most prefetch caches work by constantly evaluating code that hasn't executed yet. They help performance by "guessing" where your code will branch before it happens. When the cache "guesses" correctly, it's amazingly fast; when it "guesses" wrong, all the preprocessing done on the "wrong branch" is useless, and a time-consuming cache invalidation occurs. Fortunately, it's easy to start making the prefetch cache work harder for you: if you order your branch logic so that the most frequent outcome is the condition tested first, you help the CPU's prefetch cache "guess" correctly more often, leading to fewer expensive cache invalidations. The resulting code can read a little awkwardly, but applying this technique systematically will shave time off your code's execution.
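To make that concrete, here is a minimal C++ sketch (the packet types and their assumed frequencies are hypothetical): the common case is tested first, and GCC and Clang additionally accept an explicit hint via __builtin_expect (C++20 offers the portable [[likely]] attribute for the same purpose).

    #include <cstdio>

    // Hypothetical assumption: most packets seen in practice are DATA packets.
    enum class PacketType { Data, Ack, Control };

    void handle(PacketType t) {
        if (t == PacketType::Data) {          // common case tested first
            std::puts("fast path");
        } else if (t == PacketType::Ack) {    // less frequent
            std::puts("ack");
        } else {                              // rare
            std::puts("control");
        }
    }

    // GCC/Clang-specific: tell the compiler which outcome to expect.
    void handleHinted(PacketType t) {
        if (__builtin_expect(t == PacketType::Data, 1)) {
            std::puts("fast path");
        } else {
            std::puts("slow path");
        }
    }

    int main() {
        handle(PacketType::Data);
        handleHinted(PacketType::Ack);
        return 0;
    }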
Now, let's look at some of the conflicts between writing code for hardware and writing software using mainstream best practices.
Folks prefer to write many small functions instead of larger ones to ease maintainability, but all those function calls come at a price! If you follow this paradigm, your software may spend more time preparing for and recovering from work than actually doing it! The much-loathed goto or jmp is the fastest way to transfer control, followed closely by indirect-addressing jump tables in machine language. Functions are great for humans, but from the CPU's point of view they're expensive.
What about inline functions? Don't inline functions trade program size for efficiency by copying function code inline instead of jumping around? Yes, they do! But even when you specify that a function is to be inlined, can you be sure it was? Did you know some compilers turn regular functions into inline ones when they feel like it, and vice versa? Understanding the machine code your compiler creates from your source code is extremely important if you wish to write code that performs optimally on the platform at hand.
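A minimal sketch of the point, assuming a GCC/Clang or MSVC toolchain: inline is only a request, stronger compiler-specific attributes exist, and the only way to be sure is to inspect the generated machine code, for example with the compiler's assembly output (-S) or objdump -d.

    // 'inline' is only a request; the compiler may ignore it, and it may
    // also inline functions you never marked.
    inline int square(int x) { return x * x; }

    // Stronger (but still not absolute) hints, per toolchain:
    #if defined(__GNUC__)
    inline __attribute__((always_inline)) int cube(int x) { return x * x * x; }
    #elif defined(_MSC_VER)
    __forceinline int cube(int x) { return x * x * x; }
    #endif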
Many developers think abstracting code to the nth degree, and using inheritance throughout, is the pinnacle of great software design. Sometimes constructs that look great conceptually are terribly inefficient in practice. Take inherited virtual functions: they are pretty slick but, depending on the actual implementation, they can be very costly in CPU clock cycles.
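A minimal sketch of why (class names hypothetical): a virtual call is dispatched through the object's vtable, an indirect jump that generally cannot be resolved or inlined at compile time, while a non-virtual call can.

    #include <cstdio>

    struct Shape {
        virtual ~Shape() = default;
        virtual double area() const = 0;   // resolved via vtable at runtime
    };

    struct Circle : Shape {
        double r;
        explicit Circle(double r) : r(r) {}
        double area() const override { return 3.14159265358979 * r * r; }
        double areaDirect() const { return 3.14159265358979 * r * r; }  // static call
    };

    int main() {
        Circle c(2.0);
        const Shape& s = c;
        // Indirect (virtual) call vs. direct call the compiler can inline:
        std::printf("%f %f\n", s.area(), c.areaDirect());
        return 0;
    }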
What hardware are you developing for? What does your compiler do to your code as it turns it to machine code? Are you using a virtual machine? You'll rarely find a single programming methodology that will work perfectly on all hardware platforms, real or virtual.
Computer systems are getting faster, smaller, and cheaper all the time, but this does not warrant writing software without regard to performance and storage. Efforts to save CPU clock cycles and storage can pay dividends in performance and efficiency.
Here's something else to ponder: new technologies are coming out all the time to make computers greener and more ecosystem-friendly. Software efficiency may soon be measured in power consumption, and it may actually affect the environment!
Video game and embedded system developers know the hardware ramifications of their compiled code. Do you?

11. Continuous Refactoring by Michael Hunger

Code bases that are not cared for tend to rot. When a line of code is written, it captures the information, knowledge, and skill you had at that moment. As you continue to learn and improve, acquiring new knowledge, many lines of code become less and less appropriate with the passage of time. Although your initial solution solved the problem, you discover better ways to do so.
It is clearly wrong to deny your code the chance to grow along with your knowledge and abilities.
While reading, maintaining, and writing code you begin to spot pathologies, often referred to as code smells. Do you notice any of the following?
  • Duplication, near and far
  • Inconsistent or uninformative names
  • Long blocks of code
  • Unintelligible boolean expressions
  • Long sequences of conditionals
  • Working in the intestines of other units (objects, modules)
  • Objects exposing their internal state
When you have the opportunity, try deodorizing the smelly code. Don't rush; just take small steps. In Martin Fowler's Refactoring, the steps of each refactoring are outlined in such detail that they are easy to follow. I would suggest doing the steps manually at least once, to get a feeling for the preconditions and side effects of each refactoring. Thinking about what you're doing is absolutely necessary when refactoring: a small glitch can become a big deal, as it may affect a larger part of the code base than anticipated.
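As a minimal sketch of one such small step, here is Fowler's Extract Method applied to a hypothetical invoice printer (all names invented for illustration): each extracted function gets a name that says what the block of code did.

    #include <cstdio>
    #include <string>
    #include <vector>

    struct Order { std::string item; double amount; };

    // Before: one function mixes summing, headers, and line printing.
    void printInvoiceBefore(const std::vector<Order>& orders) {
        std::printf("*** Invoice ***\n");
        double total = 0.0;
        for (const auto& o : orders) {
            std::printf("%s: %.2f\n", o.item.c_str(), o.amount);
            total += o.amount;
        }
        std::printf("Total: %.2f\n", total);
    }

    // After: each concern extracted into a small, well-named function.
    double totalOf(const std::vector<Order>& orders) {
        double total = 0.0;
        for (const auto& o : orders) total += o.amount;
        return total;
    }

    void printLines(const std::vector<Order>& orders) {
        for (const auto& o : orders)
            std::printf("%s: %.2f\n", o.item.c_str(), o.amount);
    }

    void printInvoice(const std::vector<Order>& orders) {
        std::printf("*** Invoice ***\n");
        printLines(orders);
        std::printf("Total: %.2f\n", totalOf(orders));
    }

    int main() {
        std::vector<Order> orders{{"widget", 2.5}, {"gadget", 7.5}};
        printInvoice(orders);
        return 0;
    }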
Ask for help if your gut feeling does not guide you in the right direction. Pair with a co-worker for the refactoring session. Two pairs of eyes and sets of experience can have a significant effect — especially if one of these is unclouded by the initial implementation approach.
We often have tools we can call on to help us with automatic refactoring. Many IDEs offer an impressive range of refactorings for a variety of languages. They work on the syntactically sound parse tree of your source code, and can often refactor partially defective or unfinished source code. So there is little excuse for not refactoring.
If you have tests, make sure you keep them running while you are refactoring so that you can easily see if you broke something. If you do not have tests, this may be an opportunity to introduce them for just this reason, and more: The tests give your code an environment to be executed in and validate that the code actually does what is intended, i.e., passes the tests.
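A minimal, self-contained sketch of that safety net (names hypothetical): pin the current behavior with a test before you start, and rerun it after every small step, so a failure points straight at the step that broke it. A real project would use a test framework such as GoogleTest or Catch2; a plain assert keeps the sketch short.

    #include <cassert>
    #include <numeric>
    #include <vector>

    // The function under refactoring.
    double totalOf(const std::vector<double>& amounts) {
        return std::accumulate(amounts.begin(), amounts.end(), 0.0);
    }

    int main() {
        assert(totalOf({2.5, 7.5}) == 10.0);  // behavior pinned before refactoring
        return 0;
    }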
When refactoring, you often encounter an epiphany: suddenly all the puzzle pieces fall into the place where they belong, and the sum of your code is bigger than its parts. From that point it is quite easy to take a leap in the development of your system or its architecture.
Some people say that refactoring is waste in the Lean sense, as it doesn't directly contribute to business value for the customer. Improving the design of the code, however, is not meant for the machine. It is meant for the people who are going to read, understand, maintain, and extend the system. So every minute you invest in refactoring the code to make it more intelligible and comprehensible is time saved for the poor soul who has to deal with it in the future. And the time saved translates to saved costs.

When refactoring you learn a lot. I use it quite often as a learning tool when working with unfamiliar codebases. Improving the design also helps you spot bugs and inconsistencies, simply because you now see them clearly. Deleting code, a common effect of refactoring, reduces the amount of code that has to be cared for in the future.

12. Continuously Align Software to Be Reusable by Vijay Narayanan

The oft-cited reason for not being able to build reusable software is the lack of time in the development process. Agility and refactoring are your friends for reuse. Take a pragmatic approach to the reuse effort and you will increase the odds of success considerably. The strategy I have used for building reusable software is to pursue continuous alignment. What exactly is continuous alignment?
The idea of continuous alignment is very simple: place value on making software assets reusable continuously. Pursue this across every iteration, every release, and every project. You may not make many assets reusable on day one, and that is perfectly okay. The key is to align software assets ever closer to a reusable state through relentless refactoring and code reviews. Do this often, and over time you will transform your codebase.
You start by aligning requirements with reusable assets, and you do so across development iterations. Your iteration has tangible features being implemented; they become much more effective if they are aligned with your overall vision. This isn't meant to make every feature reusable or to have every iteration produce reusable assets. Quite the opposite: continuous alignment accepts that building reusable software is hard, takes time, and is iterative. You can fight that and attempt to produce perfectly reusable software the first time, but this will not only add needless complexity, it will also needlessly increase schedule risk for your projects. Instead, align assets towards reuse slowly, on demand, and in step with business needs.
A simple example will make this approach more concrete. Say you have a piece of code that accesses a legacy database to fetch customer email addresses and send email messages. The logic for accessing the legacy database is interspersed with the code that sends emails. Now a new business requirement arrives: display customer email data in a web application. Your initial implementation can't reuse the existing code to access customer data from the legacy system; the refactoring effort required would be too high, and there isn't enough time to pursue that option. In a subsequent iteration, you can refactor the email code to create two new components: one that fetches customer data and another that sends email messages. The refactored customer-data component is now available for reuse by the web application. This change can be made in one, two, or many iterations. If you cannot get it done, you can add it to your list of known outstanding refactorings along with existing tasks. When the next project comes around and you get a requirement to access additional customer data from the web application, you can work on the outstanding refactoring.
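A minimal sketch of that split, with hypothetical names: data access and messaging become two components, and the first is then reusable from the web application.

    #include <string>
    #include <vector>

    // Reusable component: legacy-database access only.
    class CustomerDataAccessor {
    public:
        std::vector<std::string> fetchEmailAddresses() {
            // ... query the legacy database ...
            return {};
        }
    };

    // Separate component: message delivery only.
    class EmailSender {
    public:
        void send(const std::string& address, const std::string& body) {
            // ... hand off to SMTP or a messaging gateway ...
        }
    };

    // The original batch job now just composes the two components; the web
    // application can reuse CustomerDataAccessor without touching email code.
    void sendBulkMail(CustomerDataAccessor& data, EmailSender& mailer,
                      const std::string& body) {
        for (const auto& addr : data.fetchEmailAddresses())
            mailer.send(addr, body);
    }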
This strategy can be used when refactoring existing code, wrapping legacy service capabilities, or building a new asset's features iteratively. The fundamental idea remains the same: Align project backlog and refactorings with reuse objectives. This won't always be possible and that is OK! Agile practices advocate exploration and alignment rather than prediction and certainty. Continuous alignment simply extends these ideas for implementing reusable assets.

The 'episodes' #13, #14, and #15 will come in the next post :)

Cheers
