Evaluating the quality of segmentations is an important process in image processing, especially in the medical domain. Many evaluation metrics have been used in evaluating segmentation. There exists no formal way to choose the most suitable metric(s) for a particular segmentation task and/or particular data. In this pa- per we propose a formal method for choosing the most suitable metrics for evaluating the quality of segmenta- tions with respect to ground truth segmentations. The proposed method depends on measuring the bias of metrics towards/against the properties of the the seg- mentations being evaluated. We firstly demonstrate how metrics can have bias towards/against particular properties and then we propose a general method for ranking metrics according to their overall bias. We fi- nally demonstrate for 3D medical image segmentations that ranking produced using metrics with low overall bias strongly correlate with manual rankings done by an expert.