Abstract:
In inter-release software fault prediction, the data from the previous version of the software
that is used to train the classifier might not always be of the same granularity as the
testing data. The same situation can also arise in cross-project software fault prediction.
A major issue in both settings is therefore a difference in granularity, i.e., the training and
testing datasets may not have their metrics at the same level. Thus, the metrics need to be
brought to the same level. In this work, eight different aggregation techniques are explored. In addition to
Median and Summation aggregation techniques, which have been used earlier in software fault
prediction, three other aggregation techniques, i.e., Average Absolute Deviation (AAD), Median
Absolute Deviation (MAD) and Interquartile Range (IQR), which have not been used in software
fault prediction so far, are explored in this work. Three novel aggregation techniques, i.e.,
Average of Quarter Medians (QM_AVG), Median of Quarter Medians (QM_MED) and Sum of
Quarter Medians (QM_SUM), are also explored.
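
As a rough illustration of what aggregating finer-grained metric values to a coarser level can look like, the following minimal Python sketch computes the five standard statistics named above (Summation, Median, AAD, MAD and IQR) over a list of values. It is a sketch under stated assumptions, not the implementation used in this work: the function names and the sample values are hypothetical, AAD is assumed to be taken about the mean, MAD about the median, and the quartiles are assumed to follow the "inclusive" interpolation convention. The three quarter-median techniques are not sketched because their definitions are not given in this abstract.

```python
import statistics


def average_absolute_deviation(values):
    """AAD: mean of absolute deviations from the mean (assumed convention)."""
    mean = statistics.fmean(values)
    return statistics.fmean([abs(v - mean) for v in values])


def median_absolute_deviation(values):
    """MAD: median of absolute deviations from the median (assumed convention)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def interquartile_range(values):
    """IQR: third quartile minus first quartile.

    The quartile convention ("inclusive" interpolation) is an assumption;
    other conventions give different results on small samples.
    statistics.quantiles needs at least two data points.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def aggregate(values):
    """Aggregate one metric's finer-grained values into coarser-level features."""
    return {
        "sum": sum(values),
        "median": statistics.median(values),
        "aad": average_absolute_deviation(values),
        "mad": median_absolute_deviation(values),
        "iqr": interquartile_range(values),
    }


if __name__ == "__main__":
    # Hypothetical values of one metric for the finer-grained units
    # (e.g. the methods) that belong to a single coarser-grained unit.
    sample_values = [2, 4, 4, 6, 9]
    print(aggregate(sample_values))
```

Each statistic yields one value per coarser-grained unit, so training and testing data recorded at different granularities can be compared on the same footing.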