This week the #puzzle is: How Long Is the River? #probability (Link at the bottom.)
| … a phenomenon known as a “river”, where spaces between words diagonally align from one line of text to the next. |
| Before getting to rivers, let’s figure out where spaces are likely to appear in the (fictional) Fiddlish language, which includes only three- and four-letter words. These words are separated by spaces, but there is no other punctuation. |
| Suppose a line of Fiddlish text is generated such that each next word has a 50 percent chance of being three letters and a 50 percent chance of being four letters. |
| Suppose a line has many, many, many words. What is the probability that any given character deep into the line is a space? |
And for extra credit:
| Fiddlish is written using a monospace font, meaning each character (including spaces) takes up the same amount of horizontal space. As before, lines of text are very, very long, and each next word has a 50 percent chance of being three letters and a 50 percent chance of being four letters. Each line begins with a new word (i.e., words at the end of a line are not hyphenated into the next line). |
| Suppose the 12th character of a specific line of text is a space. You want to know how long the river down and to the right from this space will be. For example, suppose the 13th character on the next line and the 14th character on the line after that are both spaces, but the 15th character on the very next line is not a space. In this case, the river would have a length of 3. (By this definition, the length of the river is always at least 1.) |
| On average, how long do you expect the resulting river from the given space (again, the 12th character in its line) to be? |

Highlight to reveal (possibly incorrect) solution:
First the gung-ho approach.
If we’re way out on a long line, the start-of-line effects have washed out and three- and four-letter words show up equally often. It’s therefore safe to assume we’ve landed on a position somewhere in the sequence “aaa bbbb ” (a 3-letter word, a space, a 4-letter word, a space). Of these 9 characters, 2 are spaces. The probability of landing on a space is 2/9 ≈ 0.222.
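As a quick sanity check on the 2/9 figure, here is a minimal sketch that builds one long random line and counts what share of its characters are spaces. The function name `space_fraction`, the word count and the seed are my own illustrative choices, not part of the puzzle.

```python
import random


def space_fraction(num_words=200_000, seed=0):
    """Estimate the share of characters that are spaces in one long line."""
    rng = random.Random(seed)
    letters = 0
    for _ in range(num_words):
        # Each word has 3 or 4 letters with equal probability.
        letters += rng.choice((3, 4))
    # Every word is followed by exactly one space.
    spaces = num_words
    return spaces / (letters + spaces)


print(round(space_fraction(), 3))
```

With this many words the printed value should sit very close to 2/9.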
Then a slightly more sober approach.
- p(n) is the probability the nth character in the line is a space
- p(1) = 0
- p(2) = 0
- p(3) = 0
- The first space is in position 4 or in position 5, each with probability 0.5
- p(4) = 0.5
- p(5) = 0.5
- For n > 5, there’s a space at position n if there’s a space 4 positions earlier (probability p(n-4)) followed by a 3-letter word (probability 0.5), or a space 5 positions earlier (probability p(n-5)) followed by a 4-letter word (probability 0.5)
- p(n) = p(n-4) * 0.5 + p(n-5) * 0.5
- p(6) = 0
- p(7) = 0
- p(8) = 0.5 * 0.5 + 0 = 0.25
- p(9) = 0.5 * 0.5 + 0.5 * 0.5 = 0.5
- p(10) = 0 + 0.5 * 0.5 = 0.25
- p(11) = 0
- p(12) = 0.125
- p(13) = 0.375
- p(14) = 0.375
- p(15) = 0.125
- p(16) = 0.0625
And then I plug the formula into a program. p(1000) = 0.222, which agrees with 2/9. And that’s the result.
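The program itself isn’t shown here, so the following is only a sketch of how the recurrence might be coded. It assumes a virtual space at position 0 (the start of the line), which reproduces p(4) = p(5) = 0.5 without special cases. The name `space_probabilities` is hypothetical.

```python
def space_probabilities(n_max):
    """Return a list p where p[n] is the probability that character n is a space."""
    p = [0.0] * (n_max + 1)
    p[0] = 1.0  # assumption: treat the line start as a virtual space at position 0
    for n in range(1, n_max + 1):
        if n >= 4:
            p[n] += 0.5 * p[n - 4]  # space 4 back, then a 3-letter word
        if n >= 5:
            p[n] += 0.5 * p[n - 5]  # space 5 back, then a 4-letter word
    return p


p = space_probabilities(1000)
print(p[13], round(p[1000], 3))
```

The early entries can be compared against the hand-computed table above, and p[1000] should round to 0.222.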
And for extra credit:
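Each line begins with a new word, so the lines are independent and the same p(n) applies to every line.
Probability that the river has length 1, because line 2, position 13 wasn’t a space: 1 – p(13) = 1 – 0.375 = 0.625.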
Probability … 2, because line 3, position 14 wasn’t a space: p(13) * (1 – p(14)) = 0.375 * 0.625 = 0.234.
Probability … 3, … 4, position 15 …: p(13) * p(14) * (1 – p(15)) = 0.375 * 0.375 * 0.875 = 0.123.
Probability … 4 …: p(13) * p(14) * p(15) * (1 – p(16)) = 0.375 * 0.375 * 0.125 * 0.9375 = 0.016.
And so on.
The expected length of the river is 1 * 0.625 + 2 * 0.234 + 3 * 0.123 + 4 * 0.016 + …
Or 1.526 + some more.
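The “some more” can be summed numerically. A minimal sketch, assuming the same recurrence for p(n) with a virtual space at position 0, and cutting the sum off after 200 terms (by then the remaining probability is negligible). The function names are illustrative only.

```python
def space_probabilities(n_max):
    """p[n] = probability that character n of a line is a space."""
    p = [0.0] * (n_max + 1)
    p[0] = 1.0  # assumption: virtual space at the start of the line
    for n in range(1, n_max + 1):
        if n >= 4:
            p[n] += 0.5 * p[n - 4]
        if n >= 5:
            p[n] += 0.5 * p[n - 5]
    return p


def expected_river_length(start=12, max_terms=200):
    """Sum length * P(river has exactly that length), truncated at max_terms."""
    p = space_probabilities(start + max_terms)
    expected = 0.0
    survive = 1.0  # probability the river has reached the current length
    for length in range(1, max_terms + 1):
        next_is_space = p[start + length]
        # The river stops here if the next diagonal position is not a space.
        expected += length * survive * (1.0 - next_is_space)
        survive *= next_is_space
    return expected


print(round(expected_river_length(), 4))
```

The first four terms of this loop are the same ones worked out by hand above. The rest add only a little.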
I also plug this into my program and get 1.5347. I also try to Monte Carlo the problem and get about the same result.
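For the Monte Carlo side, one way it might be done (a sketch, not necessarily the simulation used here) is to generate a fresh random line for each step of the river and check whether the next diagonal position is a space. The helper names, trial count and seed are my own assumptions.

```python
import random


def is_space(position, rng):
    """Generate one random line and report whether `position` (1-indexed) is a space."""
    cursor = 0  # position of the most recent space; 0 stands for the line start
    while cursor < position:
        cursor += rng.choice((3, 4)) + 1  # word letters plus the trailing space
    return cursor == position


def simulate_river(start=12, trials=100_000, seed=0):
    """Average river length when the river starts at a space in column `start`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        length = 1  # the given space on the first line always counts
        # Each following line is independent and begins with a new word.
        while is_space(start + length, rng):
            length += 1
        total += length
    return total / trials


print(simulate_river())
```

The estimate wobbles in the third decimal from run to run, but should stay near 1.53.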
***