How you average numbers doesn’t matter

Perl 6 averages “floating point” numbers correctly. That’s an interesting statement because it seems like it should be obvious and expected. However, people who have been around computers long enough are used to round-off error. Sinan Ünür wrote “How you average numbers matters” to show the curious compounding of small effects. He shows some Perl 5 code:

use List::Util qw(sum);   # sum is not a Perl 5 builtin

my @data = (1_000_000_000.1, 1.1) x 50_000;
printf "Naive mean:                  %f\n", (sum @data) / @data;

This outputs a slightly wrong answer instead of the exact mean, 500,000,000.6:

Naive mean:                  500000000.600916

I wanted to try this in Perl 6 and had been putting it off, thinking I might have to do a lot of work to see what’s going on.

my @data = slip(1_000_000_000.1, 1.1) xx 50_000;
printf "Naive mean: %f\n", ([+] @data) / @data.elems;

But no, Perl 6 gives the right answer:

Naive mean: 500000000.600000

You can expand the example to show the type of the first object in the list. The .^name meta method will tell you that:

my @data = slip(1_000_000_000.1, 1.1) xx 50_000;
put @data[0].^name;
printf "Naive mean: %f\n", ([+] @data) / @data.elems;

You see that the first thing (and all the other things) is a Rat, Perl 6’s builtin rational number type:

Rat
Naive mean: 500000000.600000

Perl 6 stored the number as a ratio instead of a floating-point number, so it stays an exact value. Add a Rat to another Rat and you get another exact Rat. You can keep doing that for as long as Perl 6 can represent the numerator and denominator (it reduces the fraction as it goes along); a Rat’s denominator has to fit in 64 bits. Beyond that, there’s FatRat, which can take you even further.
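
Here is a minimal sketch of what “stored as a ratio” looks like in practice. The variable names are only for illustration; .nude, .numerator, and .denominator are the Rat methods that expose the parts of the ratio. The last two lines contrast the exact Rat comparison with the same comparison on floating-point Num literals (the ones written with an exponent):

```perl6
# A decimal literal without an exponent is a Rat: a numerator over a denominator.
my $big = 1_000_000_000.1;
say $big.^name;   # Rat
say $big.nude;    # (10000000001 10)

# Rat arithmetic is exact, so this comparison holds.
my $exact = 0.1 + 0.2 == 0.3;
say $exact;       # True

# With an exponent the literals are floating-point Nums, and round-off returns.
my $inexact = 0.1e0 + 0.2e0 == 0.3e0;
say $inexact;     # False
```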

You can sort the data ascending or descending, like Sinan did. In his examples, each order got a different answer in the end. You can try that in the Perl 6 version:

my @data = slip(1_000_000_000.1, 1.1) xx 5;
printf "Naive mean: %f\n", ([+] @data) / @data.elems;

my @asc  = @data.sort: &infix:«<=>»;
printf "Naive mean: %f\n", ([+] @asc) / @asc.elems;

my @desc = @asc.reverse;
printf "Naive mean: %f\n", ([+] @desc) / @desc.elems;

But every ordering gives the same mean:

Naive mean: 500000000.600000
Naive mean: 500000000.600000
Naive mean: 500000000.600000

So, there’s nothing to see here. At least it’s off my to-do list. Many of the things Sinan sends me require long reads and working out math by hand, but this was easy.

You can still have some small effects in the final operation to turn the rational number into a floating point, but that’s something unrelated to the averaging.
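
To see where that final conversion happens, here is a small sketch using the five-pair data set from above. The $mean variable is only for illustration. The mean itself is still an exact Rat; it becomes a floating-point Num only when you ask for one explicitly with .Num or implicitly by formatting it with %f:

```perl6
my @data = slip(1_000_000_000.1, 1.1) xx 5;
my $mean = ([+] @data) / @data.elems;

say $mean.^name;       # Rat: still an exact ratio
say $mean.nude;        # (2500000003 5)
say $mean.Num.^name;   # Num: the nearest double-precision value
```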


There are a few other interesting things in the simple code which I’ll cover quickly:

The reduction operator, [+], is a quick way to say that you want to apply that operation between every item in the list. You don’t have to use the addition operator. You can put almost any infix operator in there.
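
A few reductions with other operators, as a quick sketch (the @n array is only for illustration):

```perl6
my @n = 1, 2, 3, 4, 5;

say [+] @n;     # 15, the sum
say [*] @n;     # 120, the product
say [max] @n;   # 5, the largest item
say [~] @n;     # 12345, string concatenation
say [<] @n;     # True, since 1 < 2 < 3 < 4 < 5
```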


The xx is the list repetition operator. But I want each list to combine into one larger, flat list. The slip causes an inner list to lose its list structure and become separate items in the larger list. Without the slip you’d get a list of 50,000 sublists. Try it with only five sublists:

my @data = (1_000_000_000.1, 1.1) xx 5;
say @data;

The @data array has five items, all of which are lists:

[(1000000000.1 1.1) (1000000000.1 1.1) (1000000000.1 1.1) (1000000000.1 1.1) (1000000000.1 1.1)]

Now add the slip:

my @data = slip(1_000_000_000.1, 1.1) xx 5;
say @data;

The array is now flat. The sublists inserted their items into the larger list without the structure:

[1000000000.1 1.1 1000000000.1 1.1 1000000000.1 1.1 1000000000.1 1.1 1000000000.1 1.1]
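
Counting the elements makes the difference easy to check. This sketch also shows flat, which is another way to get the same result here by flattening the sublists after xx has built them; the array names are only for illustration:

```perl6
my @nested  = (1_000_000_000.1, 1.1) xx 5;
my @slipped = slip(1_000_000_000.1, 1.1) xx 5;
my @flat    = flat (1_000_000_000.1, 1.1) xx 5;

say @nested.elems;    # 5, one item per sublist
say @slipped.elems;   # 10, each sublist slipped in its two items
say @flat.elems;      # 10, the sublists were flattened afterward
```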
