DEVELOPMENT OF COMPUTATIONAL TOOLS FOR ANALYSIS OF BIG BIOLOGICAL DATA
We routinely develop computational methods for the analysis of big biological data across several fields; two examples are community detection in single-cell RNA-Seq data and quantitative set analysis for gene expression. Our main interest lies in the analysis of antibody repertoire sequencing data. We recently published two complementary tool suites, pRESTO and Change-O, which combine many of the methods required for the analysis of lymphocyte repertoire dynamics. They provide the means to process raw antibody sequencing reads, estimate the mutability patterns of somatic hypermutation, quantify antigen-driven selection, and detect novel germline alleles. We continue to develop these tools and adapt them to address important biological questions, such as how to quantify affinity-dependent selection at the codon level, how to connect these estimates to antibody structure, how to use mutability models to reconstruct B cell lineages, and how to detect antibody motifs shared between individuals with a common clinical status.