Models for preferences, established by ranking the available options according to isolated criteria, are developed here. Randomization of ranks allows us to compute the probability of each option being raised to the position of best choice. These partial measures of preference are finally aggregated into global measures. It is verified that, in horse races, measures of preference built through odds of the options to hold the first position according to some criterion or to a combination of criteria are more correlated with the measure of preference given by the final betting distribution than those given directly by the ranks.


Keywords: Random ranks – Modeling Preferences – Multicriteria Decision Analysis



1. Introduction

The choices of bettors in horse races constitute a collective decision-making process that aims to identify the true probability of each horse on the track winning the next race. As the chords announcing the countdown to the opening of the starting gates sound, hundreds of gamblers look for the official booths where they will place their bets. Each bettor chooses the horse that, in his opinion, offers the lowest cost/benefit ratio. In this way, bettors raise the ratio between placed bets and winning chances at the points where this ratio looks small. Thus, at the moment the race starts, the distribution of bets faithfully mirrors the group's view of the winning chances of the horses.

Obviously, this distribution need not correctly represent the true probability of each animal facing the starting gate winning the race, in the sense that only random disturbances occurring during the race, and modifying that distribution, would determine the winner. The process by which bettors form their preferences may fail to take due account of factors that systematically affect the results of the races. Such factors may lie outside the bettors' knowledge, and bettors may also form their preferences on the basis of erroneous theories.

Besides, it has been well established since Kahneman and Tversky (1979) that, in many situations, the distribution of preferences actually observed deviates from what would result from the rational application of the information and theoretical models available to the decision makers. Quantifying these deviations, thoroughly catalogued in McFadden (1999), has been an object of research in recent decades. We follow here the approach of Gomes and Lima (1992), which only tries to derive the functional form that best mirrors the distortions objectively found throughout the whole set of possible odds, without trying to explain the effect of each factor on the evaluation of each option.

In the case of horse races, in the rush of last-minute gambling, emotional factors may pull bettors, all in the same direction, away from the objective of profiting from distortions in the observed cost/benefit ratio of each possible bet. Among these factors are, on one side, the convenience of basing the choice on a simple comparison and, on the other, the unreliability inherent in the result of any rational choice procedure, given the aspects that any analytical model is forced to leave aside.

The combination of these two principles would concentrate the bets on the options of higher probability and lead bettors to evaluate the chances of the other options relative to those. If we rank the options from the least preferred to the most preferred, that is, giving rank one to the worst option, and then replace these ranks by the probability of each option presenting the highest rank, we obey both principles. We focus attention on the most probable results and strongly reduce the measures of preference for the less probable ones. Larger distances between the options of higher probability, leading to an approximately exponential probability distribution, have been observed in different contexts. Various situations of that nature are described by Lootsma (1993).

The replacement of ranks by probabilities of being the first is applied here to the results of two ordering criteria: the preferences derived from the past performance of the competitors, supplied by the Track Official Program for special races and established before the jockeys sign their riding commitments, and the preference exercised by the best jockeys in choosing their mounts.

We start modeling the preferences by ranking the animals according to each criterion. Then, we consider the rank attributed to each animal as the observed value of a random variable, whose expected value it estimates. From the joint distribution of these random variables, built by applying hypotheses about the form of the distribution and hypotheses of independence and identical dispersion that may later be checked, we derive the probability of each option being the preferred one.

If we were able to rank the options globally, this procedure might be applied directly to the global ranks to derive final preferences. But this is not usually the case and, after obtaining the probability of being the best according to each criterion, we still have to combine these partial evaluations. The principle of simplification leads us to combine them through comparison with the options in higher evidence.

In our case, the first criterion, the official preference derived from former performances, is more reliable than the preferences signaled by the jockeys. The jockeys not only may base their choices on their knowledge of the official ranking, but are also bound by long-term links with trainers and owners. In such a situation, the reference option is the one preferred according to the main criterion, that is, betting on the animal in the best position in the official ranking. Following this approach, we reduce the pair of measures of preference for each animal according to the two criteria to a final one-dimensional measure by projecting the vector of preferences for each horse on the direction given by the vector of preferences for the animal preferred according to the main criterion.

These transformations are applied here to explain the preferences of the bettors in races for which the Jockey Club of Rio de Janeiro supplies ranks for the competitors. Conclusions are then drawn from fitting models that explain the preference reflected in the final distribution of bets through the measures of preference derived from the two criteria referred to above. First, it is shown that the fit of the linear regression model improves when we replace the ranks by the odds derived from the probability that the option occupies the position of highest preference. It improves even more when we aggregate the criteria by projecting on the direction determined by the option preferred according to the main criterion.

Section 2 discusses the randomization of the ranks to generate the probabilities of winning and, subsequently, the probabilities of being the best choice. Section 3 deals with the composition of the preferences derived from distinct criteria. Section 4 presents the application to the races at the Jockey Club of Rio de Janeiro. Section 5 offers final comments.



2.  Adding a Random Component

In this section, a mechanism is developed to introduce a random component into the model for the preference. With the addition of this random component, the preference, initially postulated in a deterministic fashion, comes to be seen as an estimate of the center of a probability distribution.

The simplest form of supplying the initial measure of preference is through the ordering of the options, from the least preferable to the most preferable.  This ordering does not need to be strict, ties being admitted as well as positions left empty to allow for larger distance between some options. What matters is that, once the indications of preference are transformed into numerical values, by treating these values as observations of random variables we are able to calculate the probability of each option coming to take the position of highest preference. 

Measurement errors are usually modeled with a normal distribution. Instead, to increase the chance that options classified close together exchange ranks, we impose a uniform distribution on the random components of the preference measures.

The expected value of each preference is estimated by the position where the option is deterministically placed in the initial classification. In the uniform family, the distribution around the expected value is fully identified by a dispersion parameter. If we wish to allow any two options to invert their relative positions, the range of the distribution must be greater than or equal to the difference between the highest and lowest initial preference measures. If the preferences are given in terms of ranks, this difference equals the number of available options minus 1.

Formally, the transformation applied to the ranks will then consist of replacing the rank Rij of the j-th option according to the i-th criterion by the probability that, by this criterion, that option would be placed in the position of highest preference, under the assumption that, for all i and j, the preference by the j-th option according to the i-th criterion is a random variable uniformly distributed around the respective register Rij.

And these uniform distributions are assumed independent, all those relative to the same criterion endowed with the same range parameter, given, for the i-th criterion, by the maximum of the differences Rik – Ril, for k and l varying along all the options. 
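The transformation described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes independent uniform disturbances centred on each rank, with a common width equal to the largest rank difference, and approximates the probability that option j is the best, the integral of f_j(x) times the product of F_k(x) over k different from j, by a midpoint rule. The function name `prob_best_uniform` and the parameters `width` and `steps` are hypothetical.

```python
def prob_best_uniform(ranks, width=None, steps=20000):
    """Approximate P(option j takes the highest value) for each j.

    Assumption: option j's preference is uniform on
    [ranks[j] - width/2, ranks[j] + width/2], independently of the others.
    """
    if width is None:
        width = max(ranks) - min(ranks)  # range parameter: largest rank difference
    if width <= 0:
        raise ValueError("width must be positive")
    half = width / 2.0
    lo, hi = min(ranks) - half, max(ranks) + half
    h = (hi - lo) / steps
    probs = [0.0] * len(ranks)
    for s in range(steps):
        x = lo + (s + 0.5) * h  # midpoint of the s-th integration cell
        # Uniform CDF of every option evaluated at x, clipped to [0, 1]
        cdfs = [min(1.0, max(0.0, (x - (r - half)) / width)) for r in ranks]
        for j, r in enumerate(ranks):
            if r - half <= x <= r + half:  # density of option j is 1/width here
                others = 1.0
                for k, c in enumerate(cdfs):
                    if k != j:
                        others *= c
                probs[j] += others * h / width
    return probs


if __name__ == "__main__":
    # Five options ranked 1 (worst) to 5 (best)
    print(prob_best_uniform([1, 2, 3, 4, 5]))
```

Passing a larger `width` than the default mimics the practice, discussed later in this section, of modeling the dispersion with a range larger than the sample range.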

The ratio between the expected range of a random sample of size n drawn from a population uniformly distributed on a given interval and the length of that interval is (n-1)/(n+1). Hence, to estimate the population range from the sample range, the latter should be divided by (n-1)/(n+1), which, in the case of ordinal preferences, raises the range of each random rank to the number of available options plus 1. This correction may, however, become excessive in the present situation, where the initial attribution of preferences is not carried out randomly; on the contrary, the random disturbances just reflect the inaccuracy in the knowledge of an underlying ordering.
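For reference, the ratio (n-1)/(n+1) follows from the expected values of the extreme order statistics of a uniform sample. The short derivation below uses the standard results for a sample from the unit interval; the symbols X_(1), X_(n) and \hat{L} are notation introduced here only for illustration.

```latex
% Sample of size n from U(0,1): expected maximum and minimum
E[X_{(n)}] = \frac{n}{n+1}, \qquad E[X_{(1)}] = \frac{1}{n+1}
\quad\Longrightarrow\quad
E[X_{(n)} - X_{(1)}] = \frac{n-1}{n+1}.

% For an interval of length L the expected sample range is L (n-1)/(n+1).
% With ranks 1,...,n the sample range is n-1, so the corrected estimate is
\hat{L} = (n-1)\,\frac{n+1}{n-1} = n+1 .
```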

Analogously, the fact that the expected values of the variables in the sample are different would make estimates derived in the usual way from the sample standard deviation overestimate the dispersion. In fact, for the case of ranking n options, the discrete uniform distribution on the set of integers from 1 to n has variance (n+1)(n-1)/12, larger than the variance (n-1)^2/12 of the uniform distribution on an interval of range n-1. Thus, its relative range, given by the ratio of the range to the standard deviation, equals sqrt(12(n-1)/(n+1)), always below the constant relative range sqrt(12) of the uniform distribution on any interval.

Thus, assuming a uniform distribution for the random disturbance, it is not advisable to estimate the standard deviation of each measured preference by the sample standard deviation. But, if we assume a normal distribution, for which the standard deviation is a natural parameter and the density gradually decreases with the distance from the mean, this may suitably increase the chance of rank inversion. In fact, in the normal case, the ratio between the dispersion attributed to each measure and the dispersion observed in the initial measurements must be larger than in the uniform case if we wish non-negligible probabilities of inversion. For instance, in the case of 10 options, the expected value of the normal relative range, around 3, implies a probability of inversion between the first and the last ranked option of approximately 0.1%.

If the number of options is large, the hypothesis that every inversion is possible may be unrealistic. Nevertheless, for the purpose of calculating the probabilities of being the preferred option, for samples of size 10 or more, little difference results from assuming the range to be 10 or any number closer to the precise number of options.

In the opposite case, when the number of options is small, it may be suitable to model the dispersion with a larger range than the sample size, to more correctly represent the chances of preferences inversion. This may be performed, in practice, by adding one or two fictitious options in the extreme of lowest preference. 

We might also relax the assumption of identical dispersion and amplify or reduce the standard deviation of one or another rank, to correspond to a stronger or weaker conviction about the position of some better or worse known options. However, precisely modeling the dispersion is often difficult. 

The independence between the random components is also a simplifying assumption that may lead far from reality. As the ordering comes from comparing the options, it would be more reasonable to assume a negative correlation. To model that precisely, it would be enough to assume identical correlations and derive their value from the fact that the sum of the ranks is a constant. This correlation would, however, decrease quickly in absolute value as the number of options increases, and ceases to display considerable numerical effects in the case of 10 or more options.
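The value of this common correlation can be worked out in one line. The derivation below is a sketch under the stated assumptions only (identical variances and identical pairwise correlations, with a constant sum of ranks); the symbols \sigma^2 and \rho are notation introduced here.

```latex
% X_1,...,X_n with common variance \sigma^2 and common correlation \rho;
% the sum of the ranks is constant, so its variance is zero:
\operatorname{Var}\Big(\sum_{j=1}^{n} X_j\Big)
  = n\sigma^2 + n(n-1)\rho\,\sigma^2 = 0
\quad\Longrightarrow\quad
\rho = -\frac{1}{n-1}.

% Examples: n = 10 gives \rho = -1/9 \approx -0.11;  n = 20 gives \rho = -1/19 \approx -0.05.
```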

Since the decisions evaluated in the present study involve from 10 to 20 options, we may feel comfortable with the assumptions of independence and identical uniform distribution with range determined by the sample range. The results presented below use these hypotheses. The normal distribution with standard deviations given by the sample standard deviation was also applied and led to similar results.



3.  Combination of Multiple Criteria


3.1.   Classes of Alternatives

The determination of the preference in terms of the probability of the option being the best, starting from an initial deterministic classification, may be applied separately to simple criteria that will later be combined, or to a single criterion that possibly results from a previous combination of simpler criteria. This section presents the aggregation alternatives that will be tried below. These alternatives are classified in two groups, according to whether equal importance is initially given to all the criteria involved or a previous weighting of the criteria is applied.

There are distinct alternatives to ensure equal importance to the criteria. Two of them are developed here. Although the criteria enter on an equal footing, they may end up having very different influence on the final result. The first alternative is based on the composition of the probabilities of being the preferred option into a final probability. The second, based on Data Envelopment Analysis (DEA), measures the preference by the proximity to the convex envelope of the set of preference vectors.

After those, forms of aggregation based on weighting the criteria are listed. Of these, to keep the practice of comparing with the option of higher preference, more attention is given to a new form of composition, based on weights derived from the projection on the direction determined by the vector of preferences for the option preferred according to the most important criterion.



3.2.   Compositions with Equal Importance

Equal initial importance for all criteria may be applied by several means. The simplest of these consists in calculating the average of the measures of preference according to the various criteria, or any norm of this vector of preferences. A probabilistic way of composing that gives equal importance to the criteria consists in using as the global measure the probability of the option being preferred by at least one of the criteria. Formally, denoting by $P_{ij}$ the probability of the j-th option being the best according to the i-th criterion, the global measure of the preference for that option, and the respective odds, are then given by $1-\prod_i (1-P_{ij})$ and by $[1-\prod_i (1-P_{ij})]/\prod_i (1-P_{ij})$, with i varying, in the product, over all the available criteria.
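As a small illustration of this composition, the sketch below computes, for each option, the probability of being preferred by at least one criterion and the corresponding odds. It assumes, as the product form does, that the criteria act independently; the function name `best_for_at_least_one` and the input layout are hypothetical, and the numbers in the usage lines are made up.

```python
def best_for_at_least_one(probs_by_criterion):
    """probs_by_criterion[i][j] is the probability that option j is the best
    according to criterion i. Returns a list of (probability, odds) pairs."""
    n_options = len(probs_by_criterion[0])
    result = []
    for j in range(n_options):
        none_prefers = 1.0  # probability that no criterion places option j first
        for row in probs_by_criterion:
            none_prefers *= 1.0 - row[j]
        prob = 1.0 - none_prefers
        odds = prob / none_prefers if none_prefers > 0 else float("inf")
        result.append((prob, odds))
    return result


if __name__ == "__main__":
    # Two criteria, three options (illustrative values only)
    example = [[0.5, 0.3, 0.2],
               [0.2, 0.5, 0.3]]
    for prob, odds in best_for_at_least_one(example):
        print(round(prob, 3), round(odds, 3))
```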

Another aggregation alternative, following the same principle of attributing higher preference to the options nearer the position of preferred according to some criterion, consists in measuring the preference by the proximity to the convex envelope of the set of vectors of preference. This is the criterion of global efficiency of Farrell (1957), whose calculation can be implemented through the algorithm of Data Envelopment Analysis with Constant Returns to Scale (DEA-CRS) oriented to the minimization of the input. To formulate the problem in the language of DEA, it is enough to treat each option as an evaluated unit, taking the preferences according to the different criteria as outputs resulting from the application of a constant amount of a single input.

This last approach can be applied whether the initial preference measures according to each criterion are given in terms of ranks or of probabilities. Besides, the aggregate measure resulting from the application of this algorithm is invariant to changes of scale, that is, to changes that preserve the proportionality between the values of the preferences attributed to different options. 

The optimization problem solved to apply this concept, assuming that the options under evaluation are ranked, according to each criterion, from the least preferable to the most preferable, has the following formulation. Denoting by $R_{ij}$ the position of the j-th option according to the i-th criterion, the global preference for the o-th option is given by $e_o = \max \sum_i w_i R_{io}$, where the non-negative weights $w_i$ obey the constraints $\sum_i w_i R_{ij} \le 1$ for j varying over the whole set of compared options. The sums run over all the criteria admitted in the analysis.

Allowing for multipliers wi with null value, we permit the global preference for any option to be increased by excluding the criteria that place that option in a disadvantaged position. This can result, for instance, in the attribution of a maximal final preference to an option that has the same classification as another one in every criterion except some that receive a null weight, even if, according to these last criteria, the other option would be preferred. In the approach based on ranks, to avoid this possibility, it is enough to prohibit ties.
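The multiplier problem above can be illustrated with a small sketch restricted to two criteria, as in the application of Section 4. It assumes strictly positive preference measures, so that the feasible set of weights is a bounded polygon, and it solves the linear program by enumerating the vertices of that polygon rather than by calling a general solver. The function name `dea_scores_two_criteria` is hypothetical, and this is not presented as the authors' implementation.

```python
from itertools import combinations


def dea_scores_two_criteria(prefs, eps=1e-9):
    """prefs[j] = (R_1j, R_2j), positive measures where larger means better.

    Returns e_o = max w1*R_1o + w2*R_2o subject to
    w1*R_1j + w2*R_2j <= 1 for every option j, and w1, w2 >= 0.
    """
    # Boundary lines a*w1 + b*w2 = c: one per option, plus the two axes.
    lines = [(float(a), float(b), 1.0) for a, b in prefs]
    lines += [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    vertices = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel boundary lines do not intersect
        w1 = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        w2 = (a1 * c2 - a2 * c1) / det
        feasible = (w1 >= -eps and w2 >= -eps and
                    all(a * w1 + b * w2 <= 1.0 + eps for a, b in prefs))
        if feasible:
            vertices.append((w1, w2))
    # A bounded linear program attains its optimum at a vertex.
    return [max(a * w1 + b * w2 for w1, w2 in vertices) for a, b in prefs]


if __name__ == "__main__":
    print(dea_scores_two_criteria([(3, 1), (1, 3), (2, 2), (1, 1)]))
    # A tie in the first criterion lets the dominated option reach the top score
    print(dea_scores_two_criteria([(3, 2), (3, 1)]))
```

The second call illustrates the effect of null weights discussed above: the option (3, 1) ties with (3, 2) on the first criterion and reaches the maximal score once the second criterion receives zero weight.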

The dual formulation of the optimization problem set above corresponds to the envelope formulation of the input-oriented DEA model of Charnes, Cooper and Rhodes (1978). In this formulation, the level of efficiency of a production unit is given by the minimum of the possible quotients whose denominator is the volume of aggregate input applied by the evaluated unit and whose numerator is the volume of aggregate input that a fictitious production unit, generated by combining the hypothetical results of reducing or increasing real production units proportionally in all of their inputs and outputs, must consume to produce a volume of output at least equal to that presented by the unit under evaluation. When the preferences according to the available criteria take the place of outputs resulting from the application of a fixed input, the global preference is given by the minimal fraction of that standard amount of input that, applied to a fictitious combination of options, would result in a mixed rank at least equal to that of the option under evaluation.

Formally, the preference for the o-th option is given by the minimum value $\theta_o$ of $\theta$ such that $\theta \sum_j \lambda_j R_{ij} \ge R_{io}$ for every criterion i, with all the $\lambda_j$ nonnegative and adding to 1, and the sum carried out over the whole set of evaluated options. Since all variables represent preferences, all grow in the same direction. This makes it easy to visualize the contribution $\lambda_j$ of a general option to the composition of the fictitious aggregated option which, applying only a fraction $\theta_o$ of the hypothetical input, would match or surpass the position of the o-th option. The score $\theta_o$ corresponds to the sum of the contributions $\theta_o \lambda_j$ of the reference options.
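The link between this envelope problem and the multiplier problem of the previous paragraphs can be made explicit with a change of variables. In the sketch below, \theta denotes the score, \lambda_j the contribution coefficients, and \mu_j = \theta\lambda_j is an auxiliary variable introduced here only for illustration.

```latex
% Envelope problem as stated in the text
\min_{\theta,\lambda}\ \theta
\quad\text{s.t.}\quad
\theta \sum_j \lambda_j R_{ij} \ge R_{io}\ \ \forall i,
\qquad \sum_j \lambda_j = 1,\quad \lambda_j \ge 0 .

% Substituting \mu_j = \theta\lambda_j (so that \sum_j \mu_j = \theta):
\min_{\mu}\ \sum_j \mu_j
\quad\text{s.t.}\quad
\sum_j \mu_j R_{ij} \ge R_{io}\ \ \forall i,
\qquad \mu_j \ge 0 .

% This is the linear-programming dual of the multiplier problem
\max_{w}\ \sum_i w_i R_{io}
\quad\text{s.t.}\quad
\sum_i w_i R_{ij} \le 1\ \ \forall j,
\qquad w_i \ge 0 ,
% so the two optimal values coincide: \theta_o = e_o .
```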

The squared norm provides a simpler way of treating all the criteria equally from a global point of view while, on evaluating each particular option, giving higher importance to the criteria according to which that option receives higher preference. In fact, the squared norm measures the aggregate preference through a weighted sum, with the weight of each criterion given by the very measure of preference for that option according to that criterion. Ranking by the norm may then be thought of as a simplification of the DEA approach, in which the search for the prices that maximize the relative efficiency is replaced by prices proportional to the volumes of the outputs.


3.3.  Criteria Weighting

The need to develop the decision process starting from the set of effectively available options may lead to weights for the criteria that vary according to the set of options to be compared.  In certain cases, some criteria can even be applied to compare some of the available options but not all of them.  For other decision processes, we may be able to establish a hierarchy among the criteria, with higher weights for the criteria presenting properties such as relevance, reliability of the respective measurement tools, absence of correlation to other criteria and so on.

A composition mechanism that, before proceeding to the aggregation, defines weights giving different importance to the different criteria consists in determining the global preference by a norm of the projection of the vectors of preference on a single direction. If this direction is chosen from the set of available preference vectors, it provides an example of extracting weights from the classification of the options supplied by the criteria themselves. The direction we will choose, to keep the simplified approach of comparing with the option in higher evidence, will be that determined by the vector of preferences for the option considered best according to the most important criterion.
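The projection just described can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes non-negative preference measures, takes the first option with the highest value on the main criterion as the reference, and returns both the length of the projection and its square. The function name `projection_measures` and the parameter `main` are hypothetical.

```python
import math


def projection_measures(prefs, main=0):
    """prefs[j] is the vector of preferences for option j (one entry per criterion).

    Returns, for each option, (length, squared length) of its projection on the
    direction of the option preferred according to criterion `main`.
    """
    ref = max(prefs, key=lambda v: v[main])  # reference option (first one on ties)
    ref_norm = math.sqrt(sum(x * x for x in ref))
    out = []
    for v in prefs:
        # Scalar projection v . ref / |ref|; non-negative for non-negative inputs
        length = sum(a * b for a, b in zip(v, ref)) / ref_norm
        out.append((length, length * length))
    return out


if __name__ == "__main__":
    # (official rank, jockey rank) for three options; illustrative values only
    for length, squared in projection_measures([(3, 4), (4, 3), (1, 1)]):
        print(round(length, 3), round(squared, 3))
```

The components of the reference vector play the role of the criteria weights: the main criterion weighs more whenever the reference option is itself weaker on the secondary criterion.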

In decisions such as the gamblers' bets, besides the preferences according to each criterion, we have a global measure of preference given by the amount bet on each option. We can then derive weights by fitting a regression model of the observed global measure on the set of explanatory variables formed by the partial preferences. But the weights derived from the estimates of the coefficients of the regression model do not necessarily apply well throughout the whole set of options. To improve the fit, we may try other monotonic transformations of the explanatory variables. In the application below, we compare results obtained using the squared norm instead of the length of the projection on the direction of the option preferred according to the main criterion.

The weights obtained from fitting a regression model can also be combined with variable a priori weights associated with particular kinds of preference vectors, or made to vary according to other systems. Another alternative, if we aggregate through the probability of being the best according to at least one criterion, is to apply different exponents to each probability of not being the preferred option. To the same end, if we order according to the distance to the excellence frontier, we may bound the ratios between the shadow prices, that is, bound the weight of each criterion.
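One way of writing the exponent-weighted variant mentioned above is sketched below. The exponents \alpha_i are hypothetical symbols introduced here for illustration; the text does not specify how they would be chosen.

```latex
% Weighted version of the probability of being preferred by at least one criterion
G_j = 1 - \prod_i \left(1 - P_{ij}\right)^{\alpha_i}, \qquad \alpha_i \ge 0 .

% \alpha_i = 1 for every i recovers the unweighted measure 1 - \prod_i (1 - P_{ij});
% \alpha_i = 0 removes criterion i, and larger \alpha_i gives it more influence.
```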



4.  Application to Horse Races

This section presents the results of an empirical investigation of the influence of uncertainty, and of the matrix of preferences actually observed, on the structure of weights of the criteria eventually adopted. The data are the preferences of the bettors in big prize races held in Rio de Janeiro during the week of Big Prize Brazil. For these races we analyze, jointly and separately, the relation between the observed final bets and the distribution of preferences previously supplied by the Official Program and by the jockeys. This second ranking was determined by ordering the animals according to the number of victories of the respective jockey in the last season.

These are two important criteria for the bettors, but, instead of two, more criteria should be combined, to produce more realistic models.  Since all the modeling alternatives studied extend trivially to more than two criteria, we use in this example the simplest model. 

We consider first the 2001 meetings. Initially, a regression model is fitted having as dependent variable the vector of observed odds derived from the final betting distribution and as explanatory variables the vectors of preferences according to the two criteria: preference based on previous campaign and preference of the jockey. In the first fit, these preferences are given in the form of ranks. In a second fit, they are given in the form of odds of the option being the preferred one, calculated as indicated in Section 2.

The hypothesis investigated is whether more uniform weights to explain the odds actually determined by the bets can be identified by measuring the preferences through the probabilistic transformations of the vectors of ranks than through the original ranks. The corroboration of this hypothesis makes it possible to develop a strategy for calculating the weights attributed by the bettors to each criterion that takes into account the distributions of preference eventually observed.

Afterwards, forms of internal aggregation prior to the attribution of weights are considered. First, the criteria are aggregated attributing equal importance to both criteria and establishing the preferences in terms of closeness to the excellence frontier. Two transformations of variables, based on the two forms of aggregation developed in Section 3.2, are then examined. The first is given by the odds of the option being preferred by at least one of the criteria. The second is the distance to the DEA envelope of the vector of preferences for the option, given in terms of odds according to the two criteria separately.

Finally, another model is fitted, as described in Section 3.3, through the projection of the vectors of ranks according to the two criteria on the direction determined by the ranks of the option preferred by the main criterion. Once the L2 norm or the squared L2 norm of this projection is determined, we assume these measurements to be subject to uniformly distributed independent disturbances, with mean equal to the measurement provided and range determined by the maximal observed distance, and compute the odds of each option reaching the first preference position.

 The results of the adjustment of the regression models are presented in Table 4.1 below. 


Table 4.1.

Regression of Observed Odds Derived on Pairs of Explanatory Variables

In all the regression models, the estimates of the coefficients of the explanatory variables are significant at the 1% level, except those of the rank according to jockey choice in the first regression and of the odds derived from the L2 norm of the projection in the last one. The estimate for the coefficient of this last variable is negative. The same happens with that of the score of closeness to the excellence frontier. The p-value corresponding to this last estimate, although small, is also considerably higher than that of the other explanatory variable. This suggests that, of the two correlated explanatory variables employed, we should prefer, in the first two regressions, the official ranking and the odds derived from it; in the next, the odds of being preferred according to at least one of the two criteria; and, in the last one, the odds derived from the preference measured by the squared norm of the projection.

The analysis of the residuals of the four equations is clarifying. Examining point by point, it is easy to see that the fit improves from the first to the last regression, as the model gives up fitting the large number of points with the dependent variable near the origin, that is, with a small volume of bets. The predictions for these points in the last regressions systematically overestimate the observations. The counterpart is a better approximation to the points corresponding to options of higher preference, which substantially improves the global explanatory power.

The simple regression of the odds derived from the bets on the odds derived from the squared norm of the projection of the vector of ranks on that of the option preferred by the main criterion presents an R² of 69% and an F statistic equal to 207.6. When the explanatory variable of the simple regression is the norm of the projection itself, without computing the odds of the option being the best, the coefficient of determination falls to 14% and the F statistic to 14.5. The sample correlation coefficient falls from 83% to 37%.
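The coefficients of determination and the correlations quoted above are mutually consistent, since in a simple regression with one explanatory variable the coefficient of determination is the square of the sample correlation. The check below uses only the rounded figures reported in the text.

```latex
R^2 = r^2:\qquad
0.83^2 = 0.6889 \approx 69\%, \qquad
0.37^2 = 0.1369 \approx 14\% .
```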

Thus, we find strong indication that the transformation of the ranks into odds of the option being the preferred one increases the precision of the estimates of the coefficients of the linear models explaining the odds found in the distribution of observed bets. The application of this transformation to explanatory variables built by combining the ranks through the projection on the direction found most important also increases the precision of the linear fit. After applying this transformation, we also find statistical support for the conjecture that the mechanism of aggregation involves projection on the direction of the option preferred by the main criterion and the use of the squared norm of the vector of projected preferences.

Table 4.2 below presents, for each race examined, the correlations between the final odds offered by the bettors and each explanatory variable. As we advance from left to right in this Table, each pair of correlation columns corresponds to more complex transformations. In the first two, the preferences according to each criterion are given directly by the ranks. In the next two, they are given by the odds of the options being the preferred ones. The variables of the following columns result from the composition of the two criteria using the two algorithms developed in Section 3.2. Finally, the last two result from the composition through projection, developed in Section 3.3.

The total in the first line of this Table refers to the seven races over habitual distances for which preliminary ranks were provided, whose correlations are listed in the following lines of the Table. In the total, we see a correlation of 83% between the vector of odds derived from the bets and the vector of odds derived after projection on the direction determined by the option preferred according to the dominant criterion.

The two main big prize races of the week, Big Prize Brazil itself and the Marvelous City Prize, run one after the other, were analyzed separately because the ordering of the competitors in these races is carried out jointly. On this occasion, this resulted in two animals being assigned number 1 in the Marvelous City Prize and in doubt about the actual composition of the field for this last race until a few hours before the races. Moreover, bettors are expected to respond in a peculiar way to the official indications in those races, because they are run over the long distance of one mile and a half, a distance seldom run, where factors such as the experience of the jockey and the genetic structure of the animal may be taken into account differently by the bettor. The correlations for these races are included at the end of Table 4.2.

In Big Prize Brazil, the bettors' big favorite, Canzone, jumped inside the starting gate, injuring itself, and was withdrawn. The bets were reopened for a few minutes but, given the overcrowding of the track and the circumstances of the withdrawal, only a few bets were redirected. The correlations for this race were therefore calculated keeping Canzone among the competitors and attributing to it 30% of the finally observed bets. The correlations between bets and odds derived from squared norms of projections are, for these races too, among the highest.


Table 4.2.

Correlation of Observed Odds with other Preference Measures

Jockey Rank | Official Rank | Jockey Odd | Official Odd | DEA Score | Best for one at least | Projection Norm | Squared norm

Using natural logarithms of odds, the results obtained are similar, although the advantage of the projection approach is not so pronounced. This loss of correlation can be explained by the effect of logarithms, which increase the distances between options of low preference that the probabilistic approach had reduced. Table 4.3 presents the correlations, in the set of seven races in the first lines of Table 4.2, of the logarithms of the odds derived from the bets with the same explanatory variables of that Table, with those corresponding to odds also replaced by their natural logarithms.
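The effect of the logarithm on low-preference options can be seen in a small numeric sketch. The probabilities below are made-up values chosen only to illustrate the point; they are not taken from the betting data.

```python
import math


def odds(p):
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1.0 - p)


# Two illustrative low-preference options and two high-preference options
low_a, low_b = 0.01, 0.02
high_a, high_b = 0.30, 0.31

# Distances on the odds scale
gap_low = odds(low_b) - odds(low_a)
gap_high = odds(high_b) - odds(high_a)

# Distances on the logarithmic odds scale
log_gap_low = math.log(odds(low_b)) - math.log(odds(low_a))
log_gap_high = math.log(odds(high_b)) - math.log(odds(high_a))

if __name__ == "__main__":
    print("odds scale:", round(gap_low, 4), round(gap_high, 4))
    print("log-odds scale:", round(log_gap_low, 4), round(log_gap_high, 4))
```

On the odds scale the two low-preference options are closer together than the two high-preference ones, whereas on the logarithmic scale the order of the gaps is reversed.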


Table 4.3.

Correlation of Observed Logarithmic Odds with Other Measures

Jockey Rank | Official Rank | ln Jockey Odd | ln Official Odd | DEA Score | ln best for one at least | ln Projection Norm | ln Squared Norm


5. Conclusion

The strategy of ranking the options and then deriving the probability of each option being classified in first place led to probability distributions of each option finally being chosen that are closer to the distributions of observed bets than the vectors of ranks are. This transformation also improves prediction when applied to the projections on the direction in main evidence.

The squared norm of the projection produced the best transformation of the data, not considering the probabilistic approach.  It is interesting to notice that this conclusion would not be possible without the subsequent probabilistic transformation.

The results obtained seem impressive in the context of the formation of preferences of bettors in horse races. The two classifications provided explained most of the final preference. Extending this approach to other contexts only requires the assumption that the motivation to take the best option as reference is present. This should, in turn, be an object of empirical investigation.



