Endangered Random Numbers

Like infinity, randomness is as easy to misunderstand as it is useful. As an added bonus infinity and randomness are interconnected. I don’t think you can have one without the other.

I’m not a mathematician but I like to think about numbers. Take a look at this series of integers: 31415926535

It might look pretty random if your not a number geek. It’s starts with “31”— the country code for the Netherlands. And the format for international phone numbers contains 11 digits. So it could be a phone number. But actually it’s one of the most famous numbers of all: Pi (3.1415926535…)

(Maybe it’s also a phone number for mathematician in Europe. I have not tested that theory.)

Pi only looks like a random bunch of digits because we’re expressing the ratio of a circle’s circumference to its diameter in integers and integers are bad at representing ratios. Some rations are easily represented by integers (like 1/2 which evaluates to 0.5) but many important numbers (like Pi, e, and the square root of 2) are simply unworkable with integers.

Actually there is one number base where Pi can be easily represented by integers! Base-Pi! In base-Pi (where we are counting place values by powers of Pi) Pi is expressed as 10. But then the other numbers, like 4, become irrational. Yikes!

Because of Pi and how hard it is to express (outside of a formula or the greek symbol π) I have begun to doubt than any string of numbers are usefully random. If you run into 31415926535 you might say “Aha! That is the number Pi! I know what the next number is! It’s 9!”

If you can predict the next number in a series of numbers then the numbers are not random, they are well ordered and governed by some principle or function.

So what about 3958391848594819348593?

I just made up that number by typing as randomly as possible on my keyboard. Is it random?

To me 3958391848594819348593 is pretty random. But maybe it’s ratio of an aardvark to a zebra? Or it’s a prime? (nope—it can be factored to 3 x 3 x 86441 x 508811). Or maybe I can guess the probability of the next digit by looking at the frequency of the digits that I typed.

To make my number I only used 1,3,4,5,8, and 9. And most of the time after a “3” I typed a “9.” Given this small sample size I’d say there is a 2:3 chance that if I had typed another digit it would have been “9” and a 1:3 chance it would have been a “4.” It’s good thing I don’t create my passwords by playing “kitty on the keyboard”.

If you use a computer algorithm as a random number generator you get “pseudo random numbers.” That is you get numbers that look random, and are nearly random, but are produced by a non-random process, and if you know the details of that process you can generate the same numbers again. Generally the way pseudo random numbers are generated is by using a “seed” value. If you know the seed value and the formula you have the number. So it’s not great for passwords or for sampling or for simulations.

To get real random number from a computer you have to some kind of noisy system like random.org does (they use atmospheric noise). But that real random number could turn out to look non-random and be useless. For example a true random number from random.org between “1960” and “2016” is “1990.” That is definitely a year and millions of people have it as the birth year of someone in their family. It’s probably overed-used as an ATM or smart phone PIN and easily guessed.

You can’t use any number as a secure PIN that looks like a date–even if you generated it from atmospheric noise! Four digit PINs are terrible. There only 10,000 of them (0000 to 9999) and hundreds of them look like non-random dates. 1492? 2001? 1066? All famous years to just about everyone.

In the end, to be really useful, a random number has to be generated in a as random a manner as available, it has to look and feel random, it has to be statistically random, it has to be unrelated to your person, and it can’t be so long that it’s hard to remember or work with.

I have an intuition that the actual amount of useful random numbers that fit the above criteria over time is approaching zero.

 

In Defense of Bubble Sort

Bubble sort is an algorithm with a very bad reputation. Robert Sedgwick, in Algorithms in C, notes that bubble sort is “an elementary sorting method” and “it does much more work” than it needs to. Donald Knuth is much more harsh when he states “the bubble sort seems to have nothing to recommend it, except a catchy name…” in the Art of Computer Programming.

Like a an actor valued only for his good looks bubble sort is an algorithm an enterprising coder should probably not admire.

Why is bubble sort so bad? Why does decent computer society advise young and impressionable software engineers avoid it? Why am I devoting a whole blog post to the ugliest sorting algorithms?

TIL: You can learn more from the flawed than you can from the flawless.

I’m not going to tell you why bubble sort is so bad. But before you google it why not try to figure out on your own?

The truth is that modern programmers don’t usually implement any standard sorting routines on the job—even the good ones! Generally there exists a collection class or library that is well written and well tested. If you have a mentor she will tell you not to bother with your own interpretations of the standard body of algorithms. That problem as been solved.

However, I think you’re missing out. Knowing now to implement well known algorithms can help you understand when to use them, what their strengths and weaknesses are, and how to talk about them. Like in a job interview.

Bubble sort is a great place to start. In order to understand why it’s so bad you have to understand big O notation, how algorithms are classified, and how the ordering of input data impacts the performance of an algorithm.

It’s my opinion that bubble sort is only the most terribad sorting method in the platonic world of absolute randomly unordered data. Bubble sort is actually very appropriate for error detection or for data that is already mostly sorted. You know, the kind of data that you are likely to run into in real life.