GVSU CIS 263
Week 10 / Day 1
NP
- NP means “Non-deterministic polynomial time”.
- The algorithm runs in polynomial time if it can make an optimal (lucky) guess at each stage.
- Easier to think of it as an algorithm whose answer can be verified in polynomial time.
- Of course, all algorithms that run in polynomial time (i.e., “P”) are also in NP.
- List some problems not known to be in P:
- Travelling salesman problem
- Satisfiability
- Colorability. (Not the four-color map problem!)
- Max clique
- Bin packing
- Similar problems can go from very easy to very hard
- Schedule jobs to minimize waiting time is in P.
- Minimizing the final completion time on a multi-processor is NP-complete (in its decision form); no polynomial-time algorithm is known.
- Cook-Levin theorem: Boolean satisfiability is NP-complete; every problem in NP reduces to it in polynomial time.
Decidability
- The Halting Problem
boolean will_halt(string code, string input)
- What does will_halt(troublemaker, troublemaker) return?
boolean troublemaker(string code, string input) {
    if (will_halt(code, input)) {
        while (true);    // predicted to halt, so loop forever
    } else {
        return false;    // predicted to run forever, so halt immediately
    }
}
- The Post correspondence problem (https://en.wikipedia.org/wiki/Post_correspondence_problem)
- Consider an alphabet of symbols (e.g., “a”, and “b”)
- Define two lists, A and B, of n words each (e.g., A = (a, ab, bba) and B = (baa, aa, bb)).
- Can you find a sequence of indexes i1, i2, i3, … such that the concatenations A_i1 A_i2 A_i3 … and B_i1 B_i2 B_i3 … are equal?
- In this case (3, 2, 3, 1) is a solution.
- In general, no algorithm can decide whether a solution exists; the problem is undecidable.
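As a sketch: verifying a proposed solution is easy even though finding one is undecidable. The word lists below are the standard instance from the linked Wikipedia article, which is consistent with the solution (3, 2, 3, 1) given above; the function name is my own.

```python
# Checking a proposed Post correspondence solution is easy, even though
# deciding whether any solution exists is undecidable in general.
A = ["a", "ab", "bba"]   # word list A (standard instance from the linked article)
B = ["baa", "aa", "bb"]  # word list B

def is_pcp_solution(seq, A, B):
    """True if concatenating the A-words and B-words along seq gives equal strings."""
    top = "".join(A[i - 1] for i in seq)     # indexes are 1-based, as in the notes
    bottom = "".join(B[i - 1] for i in seq)
    return top == bottom

print(is_pcp_solution((3, 2, 3, 1), A, B))  # True: both sides spell "bbaabbbaa"
```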
- Determining whether a player has a winning strategy in Magic: The Gathering is also undecidable.
Week 10 / Day 2
Greedy algorithms
- What is a “greedy” algorithm? (One that makes the locally best choice at each step.) Give examples:
- Giving change
- Job scheduling to minimize average wait time
- Just schedule shortest job first.
- Extends to a multi-processor system: given P processors, just sort by time, then schedule in batches of P.
- Note: Minimizing the final completion time (makespan), although it sounds similar, is much more difficult.
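A minimal sketch of the shortest-job-first rule on one processor (the function name and job times are my own):

```python
def average_wait(jobs):
    """Average waiting time when jobs run in the given order."""
    wait, elapsed = 0, 0
    for t in jobs:
        wait += elapsed   # this job waits for everything scheduled before it
        elapsed += t
    return wait / len(jobs)

jobs = [6, 3, 8, 1]
print(average_wait(sorted(jobs)))  # shortest job first: (0+1+4+10)/4 = 3.75
print(average_wait(jobs))          # original order: (0+6+9+17)/4 = 8.0
```

Sorting shortest-first is optimal because swapping any longer job ahead of a shorter one can only increase the total waiting time.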
- Big picture:
- Small changes to problem definition can make big changes in running time.
- Knowing which problems are NP-complete, and how to do reductions, can help you avoid spending a lot of time searching for an efficient exact solution when (in practice) none exists.
- Huffman Codes
- Using 8 bits for each ASCII character is not optimal because some letters are used much more often than others.
- Let more frequently used letters have shorter codes.
- Repeatedly combine the two trees with the lowest total frequency.
- This algorithm produces an optimal tree (minimum total encoded length).
- This is a greedy algorithm because we can make an optimal step by looking at individual nodes (rather than having
to consider the complete global picture).
- Contrast with Max Clique, where there is no “greedy” choice that works (e.g., starting from the highest-degree vertices doesn’t help).
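A sketch of the greedy merge step (the function name and dictionary-of-codes representation are my own; real implementations usually build explicit tree nodes):

```python
import heapq

def huffman_codes(freq):
    """Greedy Huffman construction: repeatedly merge the two lowest-frequency trees."""
    # Heap entries: (total frequency, tiebreaker, {symbol: code-so-far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)  # tiebreaker so equal frequencies never compare the dicts
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # lowest total frequency
        f2, _, right = heapq.heappop(heap)  # second lowest
        merged = {s: "0" + c for s, c in left.items()}         # left subtree: 0 bit
        merged.update({s: "1" + c for s, c in right.items()})  # right subtree: 1 bit
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

codes = huffman_codes({"e": 45, "t": 20, "a": 15, "q": 5})
print(codes)  # frequent "e" gets a 1-bit code; rare "q" gets a 3-bit code
```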
- Where greedy algorithms don’t work: Bin Packing
- Pack N items of given sizes into as few containers as possible.
- (Normalize containers to size 1. Package sizes must then be < 1.)
- Two versions:
- Online: Each item must be assigned a container as it arrives.
- Offline: Get to see the entire set of items before making a decision.
- Is there an online bin packing algorithm that will always give an optimal solution (i.e., a solution as good as the optimal off-line solution)?
- No. Consider M items of size 1/2 - ε followed by M items of size 1/2 + ε. To reach the optimal M bins, an online algorithm must put each of the first M items in a separate bin so that each of the next M items can be paired with one. However, on a different workload consisting of only the M small items, the optimal packing puts them two to a container. The online algorithm has no way of knowing which workload it is currently seeing.
- Any online algorithm can be forced to use at least 4/3 of the optimal number of bins. Why?
- Have students try to come up with the worst case.
- Think in terms of an “adversary”. Given any online bin packing algorithm, the adversary can construct an input that forces an output 4/3 of optimal or worse.
- Suppose the workload starts out with 100 “small” boxes (size 1/2 - ε).
What must the algorithm do so the adversary can’t just stop and call “foul”?
- If the algorithm pairs the small boxes up, the adversary can switch to large boxes, which will eventually become sub-optimal. (For example, if you stack four small boxes in two piles, the adversary will then send 4 large boxes, giving you 6 bins instead of the optimal 4. FAIL.)
- If the algorithm doesn’t pair the small boxes, the adversary just declares end of input while the current state is sub-optimal.
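The 6-versus-4 numbers in the pairing case check out with a little arithmetic (an illustration only; the bin counts come from the argument above, not from running a packing algorithm):

```python
# If the algorithm pairs the small (1/2 - eps) boxes, the adversary switches to
# large (1/2 + eps) boxes, each of which now needs a bin of its own.
small_boxes, large_boxes = 4, 4
online_bins = small_boxes // 2 + large_boxes  # 2 bins of paired smalls + 4 singles = 6
optimal_bins = 4                              # pair each small box with a large one
print(online_bins / optimal_bins)             # 1.5, already worse than 4/3
```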
- Next fit: Each item either fits in the previous box or starts a new box.
- What is the running time for N items? O(N).
- How bad can this get? 2x. Look at pairs of adjacent filled bins: the sum of each pair must exceed 1, otherwise the second bin’s first item would have fit in the first.
- Is the 2x factor “tight” (i.e., is there a sequence of package sizes that results in 2x the optimal number of bins)?
- Have students try to come up with the worst case.
- Alternate boxes of size 1/2 and 2/N. Optimal uses N/4 + 1 boxes, but next fit will use N/2 boxes.
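A sketch of next fit, run on the tight worst case above (the function name is my own):

```python
def next_fit(sizes):
    """Next-fit packing: keep only the most recent bin open."""
    bins, room = 0, 0.0
    for s in sizes:
        if s <= room:
            room -= s        # fits in the current bin
        else:
            bins += 1        # start a new bin
            room = 1.0 - s
    return bins

# The tight worst case: alternate sizes 1/2 and 2/N.
N = 16
sizes = [0.5, 2 / N] * (N // 2)
print(next_fit(sizes))  # N/2 = 8 bins; optimal is N/4 + 1 = 5
```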
- First fit: Place item into first box into which it fits.
- What is the running time for N items? Naively O(N^2). Can be done in O(N log N).
- What is the worst case?
- Have students try to come up with the worst case.
- About 1.7x optimal.
- Consider 6M items of size 1/7 + ε, followed by 6M of size 1/3 + ε, followed by 6M of size 1/2 + ε.
- First fit will use 10M bins (M for the 1/7 items, 3M for the 1/3 items, and 6M for the 1/2 items).
- Optimal is 6M: one item of each size fits in a single bin, since 1/7 + 1/3 + 1/2 = 41/42 < 1.
- Average case for random inputs is only about 1.02x optimal. Pretty good!
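A sketch of the naive first-fit scan, run on the 1/7, 1/3, 1/2 worst case above with M = 1 (the function name and the concrete ε are my own):

```python
def first_fit(sizes):
    """Naive O(N^2) first fit: put each item in the first bin with room."""
    rooms = []  # remaining space in each open bin
    for s in sizes:
        for i, room in enumerate(rooms):
            if s <= room:
                rooms[i] = room - s
                break
        else:                     # no open bin fits: open a new one
            rooms.append(1.0 - s)
    return len(rooms)

eps = 0.001
sizes = 6 * [1/7 + eps] + 6 * [1/3 + eps] + 6 * [1/2 + eps]
print(first_fit(sizes))                        # 10 bins (1 + 3 + 6)
print(first_fit(sorted(sizes, reverse=True)))  # 6 bins
```

Note that sorting the items in decreasing order first (first fit decreasing, an offline strategy) finds the optimal 6 bins on this input, which previews the online/offline gap.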
- Best fit: Place each item into the tightest space that still fits it.
- Sounds good, but it has similar worst cases.