Learning Objectives

  • Be able to dissect a problem into its components.
  • Be able to incrementally create a program to solve a problem.

Lecture Video

Introduction

Designing an algorithm that is both efficient and useful is why programmers get paid the big bucks. This is an art whereby a programmer takes their combined knowledge of programming, computers, architecture, and mathematics to accomplish a task.

Many times, your task comes to you as a big paragraph or by a phone call. There is no: "Step 1, do this, step 2, do that."--that'd be too easy. Instead, we have someone who probably has no idea how to code asking for something to get done. That's where you, as the programmer, come into play.

Generally, we can break down a problem by using some methods to analyze a problem: (1) break out the design "must haves" vs. "like to haves", (2) determine the outputs, (3) analyze the inputs, (4) orient the problem to your environment, (5) solve the problem incrementally.


Must Haves vs Like to Haves

One of the harder aspects of programming is to avoid mission creep also known as scope creep. This is where the original goals of the program tend to keep expanding as the finished product comes into view. The important part of designing a solution to a problem is to determine what MUST be done vs. what could be done.

The best thing to do first is to solve the must haves. This will give your program a starting point. It is very difficult to see the "forest through the trees", meaning you can't see the bigger picture until you have at least something running.


Determine the Outputs

Before we begin calculating, we need to know what is the final goal? What do we want to do? You can see we sort of develop the problem backwards. We see what the end goal is and work our way backwards. We don't really know the starting point. We really only know the ending point when we get our requirements.

It is rare to get "we want to start here." Instead, we get, "this is what we want this to do." So, we have our outputs and not necessarily anything else. This is why I recommend you work backwards.


Analyze the Inputs

Notice that we analyze the outputs before we analyze the inputs. Sometimes it looks apparent that we need certain inputs based on the generality of the problem, however, we might have to update our assumptions if certain aspects are irrelevant. For our power meter, do we care about temperature? Maybe, but not in our situation, so why make that an input?

Before you write a program, you must have knowledge of the system itself. You must be able to see, "what information must I have before I can run a calculation?" For example, if I am making an electronic power meter, I need to know time and amperage (how much electricity is flowing at a given moment). Somehow, we have to get those inputs into our program before we can even hope to write a program that calculates power usage.


Orient the Problem to your Environment

Solving a problem is a very generic phrase. I can solve a mathematical problem by grabbing a calculator or asking Siri to solve it for me. However, you need to know where this system that you're designing will be deployed. Is Python available? If so, how much memory do we have? Can we suffer the performance hit by using floating point numbers, or are we confined to the faster integers?

Knowing your environment (or platform) is important to developing the solution.


Solve the Problem Incrementally

Most programmers will try to write everything all at once from start to finish. This rarely works out in practice because sometimes many aspects of the program are not apparent until other sections have been written. Breaking up your program into smaller goals, also known as milestones, can help give you a roadmap to a solution.

Don't worry if your roadmap needs some detours, that's all part of the problem solving process.


Real World Examples

Many large-scale programmers will not work alone. Sometimes tool programmers work alone, but you must be able to communicate efficiently. Hiding details or being afraid to ask certain questions can make your program impossible to complete.

A central repository for all programmers to contribute is generally used in large-scale programming projects. Many companies have their own internal systems, however, we can see public systems, such as GitHub that allow collaboration.


Working with Bugs (errors)

Bugs WILL occur. I bolded that, italicized it, and underlined it! Sometimes bugs aren't apparent, those confounding logic errors will rear their ugly heads when you're least expecting them. There are three types of common errors you will encounter while programming: (1) syntax error, (2) runtime error, and (3) logic error.

The syntax error is an error where Python can't understand what you're trying to do. For example, missing a parentheses would be a syntax error. A runtime error passes all syntax checks within python, however when running the program, an error occurs. An example of this type of error would be trying to get an element in a list that doesn't exist, such as mylist[len(mylist)] = 100. The logic error is the toughest of all errors to fix. These are the types of errors where Python knows what you want to do, but you miscommunicated to Python what you actually wanted to do. An example of a logic error would be taking the square root of a number instead of squaring the number. Both are equally valid in the eyes of Python, but only one of those two will lead you to the correct result.

Issue tracking is a great way to work with bugs. However, testing your code and making sure you test all possible paths and edge cases, such as trying to input negative values where only positive values make sense.

When I wrote a large-scale program, I had the general public write issue tracking (also known as bug reports). Before I knew it, I had 20 issues opened!

Humility might go a long way. Egos can lead to some pretty large road blocks!


Examples

3d Printed Statues Problem

You have a single 3D printer, and would like to use it to produce n statues. However, printing the statues one by one on the 3D printer takes a long time, so it may be more time-efficient to first use the 3D printer to print a new printer. That new printer may then in turn be used to print statues or even more printers. Print jobs take a full day, and every day you can choose for each printer in your possession to have it print a statue, or to have it 3D print a new printer (which becomes available for use the next day).

What is the minimum possible number of days needed to print at least n statues?

Must Haves vs Like to Have

This problem is fairly straight forward. We want to print the minimum possible number of days needed to print at least n statues. So, we only have one requirement, and it's a must.

Determine the Outputs

Remember to work backwards. We know our goal is to print the "...minimum possible number of days needed to print at least n statues..."

Analyze the Inputs

We need to look at the problem to see how we can calculate the number of days needed to print n statues. So, what sort of information do we need before we can make that calculation?

First, we need the value n, which the problem tells us is the number of statues we need to print. This comes directly from the user--they tell us how many statues they want to print.

What other information do we need to know? Well, how much time does it take to print a statue? How much time does it take to print a printer?

Orient the Problem to your Environment

We will be using Python to solve this. This means we are given the full suite of Python constructs, such as if statements, loops, lists, and so forth.

Solve the Problem Incrementally

First things first--let's get the input from the user. We are printing whole statues, so we need to input n as an integer. No need for floats here.

def main():
   n = int(input())

if __name__ == "__main__":
   main()

This isn't much, yet, but at least now we have all the inputs we need for this problem.

Now we need to figure out how the number of statues interacts with our problem. You can see we have a choice to make: (1) print statues or (2) print printers. The end goal is to print statues, so we need to see when printing printers is no longer more effective than just printing a statue.

For example, if we only had one statue, there is no reason to print a printer. Some of you might be future-looking and say, "well yeah, but in the future another printer could help!" Yes, but that's not a consideration in our problem. Remember, do not allow the scope to creep!

Let's throw everything we have at this problem: we can try all possible combinations of statues vs. printers and then pick the minimum. Since we're not limited on processor time or memory, this is a feasible solution, even if it isn't perhaps the most efficient.

So, let's write this out in pseudo code (not real Python):

  1. Start with 1 printer (we always have at least one printer), store the number of days to print n statues.
  2. Move to 2 printers, store the number of days to print n statues.
  3. Keep going until we have n printers. We just saw above that it makes no sense to make more printers than statues.
  4. Pick the minimum value in the list....Python makes this step really easy!

That makes sense to me. Let's do a little example on the whiteboard to see if it makes sense. Let's take n = 5 and see which one we would choose and see how we figured it out ourselves:

Scenario I takes 5 days to print 5 statues, Scenario II takes 4 days, and Scenario III takes 4 days. Arrows show which printer produces what.

First, let's add the logic that tests 1 printer, then 2 printers, then 3 printers, and so forth. Since we are given 1 printer, I'm going to use a for loop that goes through the number of additional printers. So, we will start at zero:

def main():
   n = int(input())
   results = []
   for printers in range(0, n):
      pass


if __name__ == "__main__":
   main()

So, this code still doesn't do anything, but you can see we now test printers from 0 up to but not including n. This is the number of additional printers we are going to print.

We also need to account for the new number of printers. The problem tells us the newly printed printers can also print new printers! So, if I have 1 printer, I can print one printer only. If I have 2, I can print 2 printers, if I have 4 I can print 4 printers. You can see that each time we print a printer, we divide our work by the new number of printers we have.

Notice that every time we have a printer print printers, we double the number of printers that we have. This means that we take the printers as a power of two. 21 (one additional printer) = 22 (two additional printers) = 4. We can see that we double the number of printers. If we take the opposite, we halve our work every time we print more printers. So, this is a logarithm (base 2). We can take the log, base 2, by using math.log(number, base). So, let's update our program:

import math
def main():
   n = int(input())
   results = []
   for additional_printers in range(0, n):
      total_printers = additional_printers + 1
      tm = math.ceil(math.log(total_printers, 2) + n / total_printers)
      results.append(tm)

if __name__ == "__main__":
   main()
   

So, we've improved our variable names to make it completely clear what is going on. The math.log calculates the printing time of the number of printers, and the right hand side (n / total_printers) shows how much time will be spent printing statues. As you can see, there is potentially a time where for a day, not all printers are printer (see scenario III above). However, we still must count that as a day, even though some of the printers are idle. This means we have to take the ceiling, meaning always round to the next integer.

We have tested every scenario, so now we have to take the one that gives the minimum number of days. Luckily, Python has the min() function that allows us to do this.

import math
def main():
   n = int(input())
   results = []
   for additional_printers in range(0, n):
      total_printers = additional_printers + 1
      tm = math.ceil(math.log(total_printers, 2) + n / total_printers)
      results.append(tm)
   print(min(results))

if __name__ == "__main__":
   main()
   

Now we can look at the results of our problem:

Now that we've done preliminary testing, now all that is left to do is test every path--luckily we only have one!

Conclusion

Solving problems is an art form in of itself. Sometimes it takes trial-and-error and/or drawing out the problem on paper first before the solution starts to become clear. This problem has some complexity that can be seen when you draw out the problem with a pencil and paper. Keep at it and don't forget to incrementally build your program and test each individual step!