4.6 Covariance and correlation

Two traits can be related. For example, when the value of trait 1 is high, trait 2 tends to have a high value as well (see figure below, the relationship between heart girth and live weight in cows), or just the opposite: when trait 1 is high, trait 2 tends to have a low value (see figure below, the relationship between live weight and feed conversion in pigs). The relationship can also be weak (see figure below, the weak relationship between live weight and sale price in cattle). Such a relationship may arise, for example, because the traits are (partly) influenced by the same genes. In animal breeding we frequently use the covariance, correlation or regression as a statistical description of such relationships between traits.

In statistical terms the covariance is equal to: cov(x,y) = E(xy) - E(x) * E(y)

where E stands for the expectation, which can be estimated as the mean: the sum of the observed values divided by the number of observations.
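As a minimal sketch of the formula above, the covariance can be computed directly from paired observations. The function names and the heart girth and live weight figures below are hypothetical and serve only as illustration. The expectation is taken as the plain mean (dividing by the number of observations); many statistical packages divide by n - 1 instead, which gives a slightly larger value for small samples.

```python
def mean(values):
    """Expectation estimated as the sum divided by the number of observations."""
    return sum(values) / len(values)


def covariance(x, y):
    """cov(x,y) = E(xy) - E(x) * E(y), with E estimated as the mean."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    products = [a * b for a, b in zip(x, y)]
    return mean(products) - mean(x) * mean(y)


# Hypothetical paired measurements on five cows (illustration only).
heart_girth = [150.0, 160.0, 170.0, 180.0, 190.0]  # cm
live_weight = [300.0, 350.0, 420.0, 480.0, 550.0]  # kg

cov_xy = covariance(heart_girth, live_weight)
print(cov_xy)  # positive: both traits increase together
```

The sign of the result shows the direction of the relationship, but its size depends on the units of measurement, which is why the correlation is usually preferred for comparing relationships.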

In animal breeding, the relationship between two traits x and y is mostly described by their correlation.

In statistical terms the estimated correlation is: r(x,y) = cov(x,y) / (sd(x) * sd(y)), where sd stands for the standard deviation.

The correlation is usually denoted as r and has a value between -1 and +1. A positive sign means that the two traits are positively correlated: high values of trait x coincide in most cases with high values of trait y (always, in the case of r = +1). A negative sign means that high values of x coincide with low values of y.
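The following sketch illustrates the correlation formula and the meaning of its sign. All names and numbers are hypothetical. It assumes the standard deviation is the square root of the variance computed with the same divisor as the covariance (the number of observations), so that r always falls between -1 and +1.

```python
import math


def mean(values):
    """Sum of the values divided by the number of observations."""
    return sum(values) / len(values)


def covariance(x, y):
    """cov(x,y) = E(xy) - E(x) * E(y)."""
    products = [a * b for a, b in zip(x, y)]
    return mean(products) - mean(x) * mean(y)


def correlation(x, y):
    """r(x,y) = cov(x,y) / (sd(x) * sd(y))."""
    sd_x = math.sqrt(covariance(x, x))  # the variance is cov(x,x)
    sd_y = math.sqrt(covariance(y, y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a trait does not vary")
    return covariance(x, y) / (sd_x * sd_y)


# Hypothetical trait values (illustration only).
trait_x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_positive = correlation(trait_x, [2.0, 4.0, 6.0, 8.0, 10.0])  # y rises with x
r_negative = correlation(trait_x, [10.0, 8.0, 6.0, 4.0, 2.0])  # y falls as x rises
r_weak = correlation(trait_x, [3.0, 1.0, 4.0, 1.0, 5.0])       # no clear pattern

print(r_positive, r_negative, r_weak)
```

The first two cases lie on a straight line and give correlations at the two extremes of the range, while the third gives a value much closer to zero.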

The scheme below illustrates with plots the relationship (correlation) between two traits in three different cases:

It is very important to understand that a correlation does not indicate cause and effect. A high live weight in pigs is not directly the cause of a low feed conversion ratio (third example in the scheme above), nor the reverse. The correlation only indicates that a relationship between the two traits exists. When this relationship is based, for example, on the function of the same genes, it can be used in breeding.