This week the #puzzle is: Can You or “Cantor” You Find the Distance? #probabilities #infinity #coding #montecarlo #permutations #distance
| I start with a number line from 0 to 1, and then I remove the middle third. Then I take each of my remaining pieces (the segment from 0 to 1/3 and the segment from 2/3 to 1) and remove their middle thirds. Now I have four segments (from 0 to 1/9, 2/9 to 1/3, 2/3 to 7/9, and 8/9 to 1) and remove their middle thirds. I do this over and over again, infinitely many times. |
| The points I’m left with are collectively known as the Cantor set. The Cantor set is not empty; in fact, it contains infinitely many points on the number line, such as 0, 1, 1/3, and even 1/4. |
| It’s possible to pick a random point in the Cantor set in the following way: Start with the entire number line from 0 to 1. Then, every time you remove a middle third, you give yourself a 50 percent chance of being on the left remaining third and a 50 percent chance of being on the right remaining third. Then, when you remove the middle third of that segment, you again give yourself a 50 percent chance of being on the left vs. the right, and so on. |
| Suppose I independently pick two random points in the Cantor set. On average, how far apart can I expect them to be? |
And for extra credit:
| Suppose I independently pick three random points in the Cantor set. Each point has some value between 0 and 1. |
| What is the probability that these three values can be the side lengths of a triangle? |

Can You or “Cantor” You Find the Distance? ![]()
Solution, possibly incorrect:
Method 1:
- Monte Carlo. I try different “sensitivities”. This corresponds with how many decimals (ternimals?), I have in my ternary number. Using 30 digits and 1000000 tries, I get an average of 0.40. Really? I am baffled.
Method 2:
- Brute force. For a couple of different numbers of digits (1-12), I try all the numbers with that many digits, and try these in all combinations. My average converges nicely to 0.40 again.
Method 3:
- Let’s try and look at this using probabilities. Let’s look at all the 2 digit ternary numbers.
| Distance | 003 | 023 | 203 | 223 |
| 003 = 0 | 0 | 2/9 | 6/9 | 8/9 |
| 023 = 2/9 | 2/9 | 0 | 4/9 | 6/9 |
| 203 = 6/9 | 6/9 | 4/9 | 0 | 2/9 |
| 223 = 8/9 | 8/9 | 6/9 | 2/9 | 0 |
- The average in this tiny example is 7/18 or about 0.3888. Very close to 0.40 already.
- In fact I should be looking at intervals. Not 003, but 003-013.
| Distance | 003-013 | 023-103 | 203-213 | 223-1003 |
| 003-013 or 0/9-1/9 | 0/9-1/9 | 1/9-3/9 | 5/9-7/9 | 7/9-9/9 |
| 023-103 or 2/9-3/9 | 1/9-3/9 | 0/9-1/9 | 3/9-5/9 | 5/9-7/9 |
| 203-213 or 6/9-7/9 | 5/9-7/9 | 3/9-5/9 | 0/9-1/9 | 1/9-3/9 |
| 223-1003 or 8/9-9/9 | 7/9-9/9 | 5/9-7/9 | 1/9-3/9 | 0/9-1/9 |
| Distance interval | Distance max | Frequency | Acc. freq. |
| 0/9-1/9 | 1/9 | 4/16 | 4/16 |
| 1/9-3/9 | 3/9 | 4/16 | 8/16 |
| 3/9-5/9 | 5/9 | 2/16 | 10/16 |
| 5/9-7/9 | 7/9 | 4/16 | 14/16 |
| 7/9-9/9 | 9/9 | 2/16 | 16/16 |
- Half of the distances are 3/9 or below.
- The other half of the distances aren’t distributed the same way (that would be half of them 6/9 or above). So the average should be dragged down a little from simply being 0.5.
I trust my results. The average is 0.40.
And for extra credit:
Same program, same logic etc. Some of the same bafflement.
The ratio is 0.40.
I actually got curious and conducted the same experiments without the cantor set limitation. Then my average would have been 1/3 and my ratio would have been 1/5. So it does make a difference.

















