We identify fundamental issues with discretization when estimating information-theoretic quantities from data. These difficulties are theoretical in nature, arise with discrete datasets, and carry significant implications for the claims and results that rest on such estimates. Here we describe the origins of these methodological problems and illustrate their impact using the example of biological network reconstruction. We propose an algorithm (shared information metric) that corrects for the resulting biases; its improved performance demonstrates the need to account for this issue in different applications.
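The kind of discretization sensitivity described above can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper. It assumes a naive plug-in (histogram) estimator of mutual information with equal-width bins, applied to two independent Gaussian samples whose true mutual information is zero. The function name `plugin_mutual_information`, the sample size, and the bin counts are all hypothetical choices made for illustration.

```python
import numpy as np

def plugin_mutual_information(x, y, bins):
    """Naive plug-in mutual information estimate (in nats) from an
    equal-width 2D histogram. Illustrative only; not the paper's method."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()            # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)    # empirical marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # empirical marginal of y, shape (1, bins)
    nz = pxy > 0                           # skip empty cells (0 * log 0 := 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Assumed toy data: two independent samples, so the true mutual information is 0.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.normal(size=500)

# The estimate depends on the (arbitrary) number of bins.
estimates = {b: plugin_mutual_information(x, y, b) for b in (2, 5, 10, 20, 40)}
for b, mi in estimates.items():
    print(f"bins={b:2d}  estimated MI = {mi:.4f} nats")
```

With a fixed sample size, the estimate for independent variables moves away from zero as the grid becomes finer, which is one simple way a discretization choice can bias downstream conclusions such as inferred network edges.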
On the Theory and Algorithm for rigorous discretization in applications of Information Theory.pdf