GVSU CIS 263
Week 13 / Day 1
Exam 2
Week 13 / Day 2
Optimal binary search tree
- Set of words w_1, w_2, …, w_n with probabilities p_1, p_2, …, p_n
- Goal is to minimize access time in a search tree (i.e., keep the most common words toward the top)
- In other words, minimize Sum_{i=1}^{n} p_i (1 + d_i), where d_i is the depth of word w_i.
- Describe a greedy algorithm.
- Put most common word at the root.
- Then divide words before and after and recurse.
- Not optimal
- Balanced tree is not optimal either.
- Huffman codes can use greedy algorithm. Why can’t this?
- Tree must maintain binary search property. Can’t put nodes wherever we want them.
- We can take a dynamic programming approach similar to the one used for ordering matrix multiplications
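The dynamic programming idea can be sketched as follows. This is a minimal illustration, not code from the course: the function names `optimal_bst_cost` and `greedy_bst_cost` are made up here, and the weights are passed as a list in sorted word order. The recurrence tries every word in a range as the root; pushing a range one level down adds the total weight of that range to the cost.

```python
def optimal_bst_cost(p):
    """Minimum of sum(p[i] * (1 + depth_i)) over all BSTs on the sorted words."""
    n = len(p)
    # prefix[k] = p[0] + ... + p[k-1], so words i..j-1 weigh prefix[j] - prefix[i]
    prefix = [0]
    for weight in p:
        prefix.append(prefix[-1] + weight)
    # cost[i][j] = best cost for words i..j-1 (half-open range); empty ranges cost 0
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for size in range(1, n + 1):
        for i in range(n - size + 1):
            j = i + size
            # Try each word r as the root of the range, keep the cheapest split
            best_split = min(cost[i][r] + cost[r + 1][j] for r in range(i, j))
            cost[i][j] = best_split + (prefix[j] - prefix[i])
    return cost[0][n]


def greedy_bst_cost(p, lo=0, hi=None, depth=0):
    """Cost of the greedy tree: most common word at the root, then recurse."""
    if hi is None:
        hi = len(p)
    if lo >= hi:
        return 0
    root = max(range(lo, hi), key=lambda k: p[k])
    return (p[root] * (1 + depth)
            + greedy_bst_cost(p, lo, root, depth + 1)
            + greedy_bst_cost(p, root + 1, hi, depth + 1))
```

With the hypothetical weights `[36, 33, 31]` (percentages, in sorted word order), the greedy tree costs 195 while the optimal tree, rooted at the middle word, costs 167. The DP fills O(n^2) table entries with O(n) work each, so it runs in O(n^3) time.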
Randomized algorithms
- Use random numbers in algorithms
- Two main types
- Las Vegas:
- always gives the right answer, but the running time is random.
- Example: quicksort with randomly chosen pivots
- Monte Carlo:
- may or may not give the right answer
- Example: primality testing
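As one illustration of a Monte Carlo algorithm, here is a sketch of the Miller-Rabin primality test. The notes only name "primality testing", so the choice of Miller-Rabin, the function name `is_probably_prime`, and the default of 20 trials are assumptions made for this example. A "composite" answer is always correct; a "prime" answer can be wrong with probability at most 4^(-trials).

```python
import random


def is_probably_prime(n, trials=20, rng=random):
    """Miller-Rabin test: False means definitely composite, True means probably prime."""
    if n < 2:
        return False
    # Handle small primes and their multiples directly
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 = d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(trials):
        a = rng.randrange(2, n - 1)      # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                     # this base does not expose n as composite
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a is a witness: n is definitely composite
    return True                          # no witness found in any trial
```

A prime input is never rejected, so the randomness only affects how often a composite slips through.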
- Las Vegas algorithms are a way to defeat the “adversary”
- Quicksort has a worst-case scenario of O(n^2), even with median-of-7.
- Given a fixed algorithm, the adversary can intentionally cause bad / worst-case behavior
- Realistically, “the adversary” is patterns in real problems that happen to interact poorly with the algorithm.
- (Think about the bad cases for bin packing)
- If we choose pivots randomly for quick sort, we can still get unlucky, but that “bad luck” is not likely to recur.
- However, if a workload pattern is causing bad quicksort behavior, that bad behavior will pop up again and again.
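A minimal sketch of quicksort with random pivots, assuming a simple out-of-place version for readability (the name `quicksort` and the optional `rng` parameter are choices made for this example). The result is always correctly sorted; only the running time depends on the random choices, which is what makes it a Las Vegas algorithm.

```python
import random


def quicksort(items, rng=random):
    """Return a sorted copy of items, choosing each pivot uniformly at random."""
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]
    # Three-way partition so that repeated values do not cause deep recursion
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less, rng) + equal + quicksort(greater, rng)
```

Because the pivot does not depend on the input order, already-sorted input is no more likely to trigger the O(n^2) case than any other arrangement.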
- Need to understand (pseudo) random numbers
- Computer can’t truly choose numbers at random. Need some algorithm.
- Goal is to produce statistical properties that are common among truly random numbers.
- Why is just using part of the clock a bad idea?
- Simplest is Linear Congruential Generators
x_{i+1} = A x_i mod M
- Given M = 11, A = 7, and x_0 = 1
- 7, 5, 2, 3, 10, 4, 6, 9, 8, 1, 7, 5, 2
- What are some dangers?
- Short period. Can’t just pick any A and M.
- If A = 5 (still with M = 11 and x_0 = 1), then we get 5, 3, 4, 9, 1, 5, 3, 4
- Notice this happens even though M and A are relatively prime.
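The two small examples can be reproduced with a few lines of code. This is a sketch of the multiplicative generator x_{i+1} = A x_i mod M; the helper names `lcg` and `period` are made up for illustration.

```python
def lcg(a, m, seed, count):
    """Return the first `count` values x_1, x_2, ... of x_{i+1} = a * x_i mod m."""
    x = seed
    values = []
    for _ in range(count):
        x = (a * x) % m
        values.append(x)
    return values


def period(a, m, seed):
    """Number of steps until the seed reappears, or None if it never does."""
    x = seed
    for steps in range(1, m + 1):
        x = (a * x) % m
        if x == seed:
            return steps
    return None  # the seed is not on a cycle (possible when a and m share a factor)
```

For M = 11 and seed 1, `period(7, 11, 1)` is 10 (every nonzero value appears) while `period(5, 11, 1)` is only 5.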
- One good choice:
M = 2^31 - 1 = 2,147,483,647
A = 48,271
- Small changes can completely break the generator
x_{i+1} = (48,271 x_i + 1) mod M
has a period of 1 if the seed is 179,424,105
- Need to be careful when implementing. Overflow seems harmless, but can mess with the period.
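The points about small changes and overflow can be checked directly. The sketch below uses M = 2^31 - 1 and A = 48,271; the function names are hypothetical. Python integers never overflow, so the 32-bit wraparound is simulated with a mask, which is an assumption about how a naive fixed-width implementation would behave.

```python
M = 2**31 - 1   # 2,147,483,647
A = 48271


def next_good(x):
    """x_{i+1} = A * x_i mod M, computed with exact integer arithmetic."""
    return (A * x) % M


def next_broken(x):
    """The '+ 1' variant from the notes: x_{i+1} = (A * x_i + 1) mod M."""
    return (A * x + 1) % M


def next_overflowing(x):
    """Simulates computing A * x in an unsigned 32-bit integer before taking mod M."""
    wrapped = (A * x) & 0xFFFFFFFF   # the product silently wraps modulo 2^32
    return wrapped % M
```

Here `next_broken(179424105)` returns 179424105 again, so that seed is stuck forever. For x = 100000 the product A * x exceeds 2^32, and `next_overflowing` returns 532132704 where the exact computation gives 532132706, so the wrapped version is no longer the same generator.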
- Watch out for generators that use M = 2^B (e.g. 2^32 on a 32-bit machine).
- Outputs always alternate between even and odd
- Lower k bits have a period of 2^k or less.
- The UNIX drand48 uses a generator like this, but it keeps 48 bits of state and only returns the upper 32 bits.
x_{i+1} = (A x_i + C) mod 2^B
- Constants are A = 25,214,903,917, B = 48, C = 11
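The weakness of the low bits can be seen with a short sketch using the constants above. The helper names `step`, `states`, and `high_bits` are invented for this example, and the code follows the notes' description of the generator rather than any particular library's source.

```python
A = 25214903917
C = 11
B = 48
MASK = (1 << B) - 1   # taking "& MASK" is the same as "mod 2^B"


def step(x):
    """One step of x_{i+1} = (A * x_i + C) mod 2^B."""
    return (A * x + C) & MASK


def states(seed, count):
    """Return the internal states x_1, ..., x_count starting from x_0 = seed."""
    x = seed
    result = []
    for _ in range(count):
        x = step(x)
        result.append(x)
    return result


def high_bits(x, bits=32):
    """Keep only the top `bits` bits of a B-bit state, discarding the weak low bits."""
    return x >> (B - bits)
```

Looking at `x & 1` over successive states shows a strict 0, 1, 0, 1 alternation, and `x & 7` repeats every 8 steps. Returning only the upper bits hides these short cycles from the caller.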
- There are many better algorithms than Linear Congruential.
- Mersenne Twister is popular
- Beyond the scope of this class
- Why don’t we use random pivots in quicksort?
- Random number generation is expensive.
- Why are lottery numbers drawn using physical machines?