48kbps AAC public test results

These are the summary results of the 48kbps AAC public listening test.

User comments are available here.

How to interpret the plots

Each plot is drawn with the seven codecs on the x axis and the ratings given (1.0 through 5.0) on the y axis. N is the number of listeners used to compute the means (average ratings) and 95% confidence intervals. The mean rating given to each codec is indicated by the middle point of each vertical line segment, and the value is printed next to it. Each vertical line segment represents the 95% confidence interval (computed with an ANOVA analysis) for that codec.
This analysis is identical to the one used in Roberto Amorim's listening tests.

One codec can be said to be rated better than another codec with 95% confidence if the bottom of its line segment is at or above the top of the competing codec's line segment.
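The comparison rule above can be sketched in a few lines of Python. This is only an illustration: the `Interval` type, the function names and the numeric values are hypothetical and are not taken from the test results or from the analysis tool used for the test.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Mean rating with the bounds of its 95% confidence interval."""
    low: float
    mean: float
    high: float


def rated_better(a: Interval, b: Interval) -> bool:
    """True if the bottom of a's segment is at or above the top of b's."""
    return a.low >= b.high


def tied(a: Interval, b: Interval) -> bool:
    """True if neither codec can be said to be rated better than the other."""
    return not rated_better(a, b) and not rated_better(b, a)


# Hypothetical values for illustration only.
codec_a = Interval(low=3.6, mean=3.9, high=4.2)
codec_b = Interval(low=2.9, mean=3.2, high=3.5)
codec_c = Interval(low=3.3, mean=3.6, high=3.9)

print(rated_better(codec_a, codec_b))  # segments do not overlap
print(tied(codec_a, codec_c))          # segments overlap
```

Two codecs whose segments overlap are what the comments below call "tied".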

Important note: These plots represent group preferences (for the particular group of people who participated in the test). Individual preferences will vary somewhat. The best codec for a given person depends on that person's own preferences and the type of music he or she listens to.

Individual sample results

Plot Comment
sample description:
"experimental" modern music

results:
Bad result for CTv1, which is tied with the low anchor.
sample description:
jazz

results:
All contenders (except anchors) are tied.
sample description:
high pitched sounds

results:
On this sample the low anchor clearly removes parts of the sound, due to its lowpass.
Although it is statistically tied with the other contenders, Nero v1 is rated a little above the others, probably because of its higher bitrate on this sample.
sample description:
latin music, transcoded from MPEG Layer II

results:
All codecs (except anchors) are tied
sample description:
acoustic guitar with applause

results:
All codecs (except anchors) are tied
sample description:
male voice (singing) with music

results:
All codecs (except anchors) are tied
sample description:
female and male spoken voice

results:
All codecs (except anchors) are tied
sample description:
classical with vocals

results:
Good result for the HE-AACv1 contenders, which are tied with the high anchor although they are using only 50% of its bitrate.
HE-AACv2 results are noticeably lower than v1 results.
sample description:
pop

results:
All codecs (except anchors) are tied
sample description:
male voice with background music

results:
All codecs (except anchors) are tied
sample description:
70's electronic

results:
Bad results for the high anchor (smeared transients). Lame, Nero v1 and v2, and CT v2 are tied by a small margin.
sample description:
piano

results:
On this sample Nero v1 is statistically better than Nero v2 and CT v2. It seems that parametric stereo is not at its best on this sample.
sample description:
wind instrument

results:
Contenders are tied, but only 3gpp is tied to the low anchor.
sample description:
hard rock

results:
The Nero and CT encoders share first place; 3gpp is second.
sample description:
violin

results:
All codecs (except anchors) are tied, but it seems that HE-AACv1 encoders are scoring better than v2 encoders.
sample description:
a cappella female voice

results:
Excellent results for the contenders. Except for 3gpp, they are tied with the high anchor, despite using only 50% of its bitrate.
3gpp, despite a respectable score, is beaten by Nero v2.
sample description:
pop music with artificial stereo separation and yelling singer

results:
HE-AAC v1 contenders are clearly better than HE-AAC v2 ones.
sample description:
rock

results:
All codecs (except anchors) are tied


This is the bitrate distribution table in kbps for the audio data:
Sample | 3GPP HE-AACv1 | CT HE-AACv1 | CT HE-AACv2 | Nero HE-AACv1 | Nero HE-AACv2 | L.A.M.E. 130 | iTunes LC-AAC
1 | 48 | 47 | 46 | 49 | 48 | 161 | 48
2 | 48 | 46 | 46 | 47 | 49 | 143 | 48
3 | 48 | 47 | 46 | 54 | 49 | 124 | 48
4 | 48 | 47 | 46 | 48 | 49 | 133 | 48
5 | 48 | 46 | 46 | 47 | 48 | 130 | 48
6 | 48 | 46 | 46 | 47 | 49 | 148 | 48
7 | 48 | 46 | 46 | 48 | 48 | 106 | 48
8 | 48 | 46 | 46 | 48 | 48 | 100 | 48
9 | 48 | 46 | 46 | 47 | 48 | 130 | 48
10 | 48 | 46 | 46 | 46 | 47 | 102 | 48
11 | 48 | 46 | 46 | 49 | 49 | 138 | 48
12 | 48 | 46 | 46 | 49 | 48 | 95 | 48
13 | 48 | 46 | 46 | 45 | 48 | 138 | 48
14 | 48 | 46 | 46 | 48 | 48 | 146 | 48
15 | 48 | 46 | 46 | 48 | 48 | 134 | 48
16 | 48 | 47 | 46 | 49 | 49 | 95 | 48
17 | 48 | 46 | 46 | 47 | 48 | 155 | 48
18 | 48 | 46 | 46 | 47 | 48 | 147 | 48
average | 48 | 46.22 | 46 | 47.94 | 48.28 | 129.17 | 48
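
The "average" row can be rechecked from the per-sample figures. The short Python sketch below copies the 18 rows from the table above and computes the mean of each column, rounded to two decimals. The variable names are only illustrative.

```python
# Column names as they appear in the bitrate table.
COLUMNS = [
    "3GPP HE-AACv1", "CT HE-AACv1", "CT HE-AACv2",
    "Nero HE-AACv1", "Nero HE-AACv2", "L.A.M.E. 130", "iTunes LC-AAC",
]

# Bitrates in kbps, one row per sample (1 to 18), copied from the table.
ROWS = [
    [48, 47, 46, 49, 48, 161, 48],
    [48, 46, 46, 47, 49, 143, 48],
    [48, 47, 46, 54, 49, 124, 48],
    [48, 47, 46, 48, 49, 133, 48],
    [48, 46, 46, 47, 48, 130, 48],
    [48, 46, 46, 47, 49, 148, 48],
    [48, 46, 46, 48, 48, 106, 48],
    [48, 46, 46, 48, 48, 100, 48],
    [48, 46, 46, 47, 48, 130, 48],
    [48, 46, 46, 46, 47, 102, 48],
    [48, 46, 46, 49, 49, 138, 48],
    [48, 46, 46, 49, 48, 95, 48],
    [48, 46, 46, 45, 48, 138, 48],
    [48, 46, 46, 48, 48, 146, 48],
    [48, 46, 46, 48, 48, 134, 48],
    [48, 47, 46, 49, 49, 95, 48],
    [48, 46, 46, 47, 48, 155, 48],
    [48, 46, 46, 47, 48, 147, 48],
]

# Transpose the rows into columns and average each one.
averages = {
    name: round(sum(column) / len(column), 2)
    for name, column in zip(COLUMNS, zip(*ROWS))
}

for name, value in averages.items():
    print(f"{name}: {value}")
```

The computed values match the "average" row of the table.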

Overall Ratings

The results for each sample were grouped together, without modifications.
Then I performed an ANOVA analysis. The results are graphed below:

Here is a zoomed version, without the anchors:

Results comments