IIGM is an instrumental body of the Compagnia di San Paolo Foundation

Statistical inference and computational biology


The Unit’s research program develops computational techniques inspired by statistical mechanics and applies them to biological research, from the quantitative analysis of large-scale biological databases to the inference of complex interaction networks.

Modern high-throughput experimental techniques have transformed areas of the life sciences such as genomics, proteomics, transcriptomics, and imaging, enabling a quantitative, systems-level approach to the study of biological systems. Equally characteristic of this shift is the effort to organize the resulting knowledge (ontological, functional, compartmental, structural, etc.) in shared, structured forms so that it becomes widely accessible.

Public databases such as KEGG, BioCyc, the Human Protein Atlas, Pfam, PDB, and UniProt give access to very large collections of data on elementary biological components and their interconnections. At the same time, our ability to analyze and mathematically model this knowledge is limited by the lack of appropriate algorithmic and mathematical techniques.

The Statistical Inference and Computational Biology unit develops mathematical and computational modeling techniques, inspired by the statistical physics of complex systems, for inference and optimization problems. These techniques are applied to the analysis of experimental databases and, in particular, to inference problems in complex biological systems.

The group's activity develops along the following main lines:

  • algorithms. General mathematical methodologies for statistical inference in biological systems form the infrastructure underlying the group's different applications: the study of inverse problems, artificial intelligence techniques for supervised, semi-supervised, and unsupervised classification problems, message-passing methods, and stochastic and combinatorial optimization problems;
  • co-evolutionary analysis of sequences and generative models. Inference of structural and functional properties of amino acid sequences, modeling of evolutionary landscapes, analysis of mutational scanning data (phage display, SELEX, directed evolution), generative models of coding sequences with a target function, and analysis of Rep-Seq immunological data;
  • quantitative biology. Single-cell experiments on post-transcriptional gene regulation, cancer cell physiology, and vesicular trafficking. To study these systems, the group develops innovative computational and experimental techniques for calibrating mathematical models;
  • cellular metabolism in the proliferative regime. Development of mathematical models for the prediction of aberrant phenotypes, growth laws, characterization of the functional space of regulatory networks, and modeling of cell cultures with competitive heterogeneous phenotypic/genotypic traits. (PI: A. Pagnani).