01 Microbiome analysis: Computational techniques and challenges

Hosted by Serghei Mangul (University of California, Los Angeles)
Technological advances and the decreasing cost of ‘next-generation’ sequencing (NGS) have made it the technology of choice for many applications, including the study of the human microbiome, which is composed of bacterial, viral, fungal, and other eukaryotic communities. High-throughput sequencing has revolutionized microbiome research by enabling the study of thousands of microbial genomes directly in their host environments. This approach, which forms the field of metagenomics, avoids the biases incurred with traditional culture-dependent analysis. Metagenomics also allows the composition of microbial communities to be compared in their natural habitats across different human tissues and environmental settings. In particular, metagenomic profiling has proven useful for analyzing microbes such as eukaryotic and viral pathogens, which were previously impossible to study in an unbiased way with targeted 16S ribosomal RNA gene sequencing.

Tentative list of topics: We will discuss recent methods for studying microbial communities, the challenges in metagenomic analysis, and the limitations of current methods. The goal is to identify the best strategy for analyzing metagenomic data.

We will start by discussing the most popular ‘marker gene’ methods, which have been suggested to have poor sensitivity and may also produce false positives, such as reporting dangerous pathogens that are not present in the metagenomic sample. We will also discuss methods for studying the microbiome at the strain level and methods for studying non-bacterial organisms, including viruses and fungi.

Outcomes: One possible outcome is a joint effort to write an educational paper introducing metagenomics to researchers with no background in computational genomics or bioinformatics.

Difficulty: Intro/Intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Escobar-Zepeda, A., de León, A.V.P. and Sanchez-Flores, A., 2015. The road to metagenomics: from microbiology to DNA sequencing technologies and bioinformatics. Frontiers in Genetics, 6.
2. Simon, C. and Daniel, R., 2011. Metagenomic analyses: past and future trends. Applied and Environmental Microbiology, 77(4), pp.1153-1161.
3. Schmidt, C., 2017. Living in a microbial world. Nature Biotechnology, 35(5), p.401.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
4. Sczyrba, A., Hofmann, P., Belmann, P., Koslicki, D., Janssen, S., Droege, J., Gregor, I., Majda, S., Fiedler, J., Dahms, E. and Bremges, A., 2017. Critical Assessment of Metagenome Interpretation - a benchmark of computational metagenomics software. bioRxiv, p.099127.
5. Nayfach, S., Rodriguez-Mueller, B., Garud, N. and Pollard, K.S., 2016. An integrated metagenomics pipeline for strain profiling reveals novel patterns of bacterial transmission and biogeography. Genome Research, 26(11), pp.1612-1625.
6. Afshinnekoo, E., Meydan, C., Chowdhury, S., Jaroudi, D., Boyer, C., Bernstein, N., Maritz, J.M., Reeves, D., Gandara, J., Chhangawala, S. and Ahsanuddin, S., 2015. Geospatial resolution of human and bacterial diversity with city-scale metagenomics. Cell Systems, 1(1), pp.72-87.
7. Huffnagle, G.B. and Noverr, M.C., 2013. The emerging world of the fungal microbiome. Trends in Microbiology, 21(7), pp.334-341.

02 Statistical methods to refine and redefine phenotypes

Hosted by Andy Dahl (University of California, San Francisco)
With many multitrait datasets, such as electronic health records (EHRs), the observed traits do not parsimoniously or precisely represent the underlying biology. For downstream analysis, the observed traits would ideally be summarized by a small number of latent, unknown traits that describe distinct and clear biological mechanisms. I hope to read papers on related dimensionality reduction problems, including both relevant methodological statistics/ML papers and genetics papers that use rigorous multitrait methods.

Difficulty: Advanced

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Van Der Maaten, L., Postma, E. and Van den Herik, J., 2009. Dimensionality reduction: a comparative review. J Mach Learn Res, 10, pp.66-71.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
1. Lawrence, N., 2005. Probabilistic non-linear principal component analysis with Gaussian process latent variable models. Journal of machine learning research, 6(Nov), pp.1783-1816.
2. Cortes, A., Dendrou, C., Motyer, A., Jostins, L., Vukcevic, D., Dilthey, A., Donnelly, P., Leslie, S., Fugger, L. and McVean, G., 2017. Bayesian analysis of genetic association across tree-structured routine healthcare data in the UK Biobank. bioRxiv, p.105122. PLUS SUPPLEMENT.
3. Joshi, S., Gunasekar, S., Sontag, D. and Ghosh, J., 2016, December. Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization. In Machine Learning for Healthcare Conference (pp. 17-41).

03 Modern statistical methods with application to genomics

Hosted by Marzia Cremona (Pennsylvania State University)
The goal of this journal club is to review and discuss modern statistical approaches that have been used in genomics research, or that can potentially be applied to analyze genomics data. We will discuss both theoretical aspects and genomics applications. Topics can include functional data analysis, variable selection and sufficient dimension reduction, inference methods, methods for big data, and will be chosen on the basis of participants' interest.

Difficulty: Intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Wang, J.L., Chiou, J.M. and Müller, H.G., 2016. Functional data analysis. Annual Review of Statistics and Its Application, 3, pp.257-295.

Additional papers to potentially discuss:
2. Reimherr, M. and Nicolae, D., 2014. A functional data analysis approach for genetic association studies. The Annals of Applied Statistics, 8(1), pp.406-429.
3. Matsui, H. and Konishi, S., 2011. Variable selection for functional regression models via the L1 regularization. Computational Statistics & Data Analysis, 55(12), pp.3304-3310.
4. Kayano, M., Matsui, H., Yamaguchi, R., Imoto, S. and Miyano, S., 2016. Gene set differential analysis of time course expression profiles via sparse estimation in functional logistic model with application to time-dependent biomarker detection. Biostatistics, 17(2), pp.235-248.
5. Taylor, S. and Pollard, K., 2009. Hypothesis tests for point-mass mixture data with application to ‘omics data with many zero values. Statistical Applications in Genetics and Molecular Biology, 8(8), pp. 1-43.
6. Nye, T.M., 2011. Principal components analysis in the space of phylogenetic trees. The Annals of Statistics, pp.2716-2739.

04 Outbreak detectives in the genomics era: Computational methods in molecular epidemiology

Hosted by Pavel Skums (Georgia State University)
Molecular epidemiology is a new, computationally intensive discipline that seeks to investigate disease outbreaks and track pathogen transmission using viral genomic data sampled from infected individuals. In recent years, computational genomics methods have been used successfully for outbreaks of emerging diseases (such as Ebola and Zika), as well as for long-standing epidemics (such as HIV and HCV). The ultimate goal of computational molecular epidemiology is to develop methods that reconstruct transmission histories and answer the question of who infected whom. This task is complicated by incomplete and noisy sequencing and epidemiological data, as well as by the extreme genetic heterogeneity of many viruses, which evolve rapidly within their hosts. We plan to discuss recent computational advances in the area, and to pose and discuss open computational problems.
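
As a rough illustration of the kind of computation involved, the sketch below links two infected individuals when the closest pair of viral variants drawn from their within-host populations differs at no more than a chosen fraction of sites. It is a toy example and not the method of any paper listed here: the function names, the sample data, and the threshold value are all hypothetical, sequences are assumed to be aligned and of equal length, and the resulting links are undirected, so they do not say who infected whom.

```python
from itertools import combinations, product


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def min_distance(pop_a, pop_b) -> float:
    """Smallest distance between any variant of one host and any variant of another."""
    return min(hamming_fraction(a, b) for a, b in product(pop_a, pop_b))


def linked_pairs(samples, threshold):
    """Return host pairs whose closest variants differ by at most `threshold`."""
    return [
        (p, q)
        for p, q in combinations(sorted(samples), 2)
        if min_distance(samples[p], samples[q]) <= threshold
    ]


# Hypothetical within-host viral populations (made-up sequences).
samples = {
    "s1": ["ACGTACGT", "ACGTACGA"],
    "s2": ["ACGTACGA", "ACGTTCGA"],
    "s3": ["TTTTACGG", "TTTTACGC"],
}

# The 0.05 cut-off is an arbitrary illustrative value, not a validated one.
print(linked_pairs(samples, threshold=0.05))  # [('s1', 's2')]
```

Hosts s1 and s2 share an identical variant and are therefore linked, while s3 is far from both. Inferring the direction of transmission requires additional modeling, which is the subject of the papers below.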

Difficulty: Intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Read introductions to papers 3 and 4 (and references therein). There are no review papers in this field yet.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
1. Campo, D.S., Xia, G.L., Dimitrova, Z., Lin, Y., Forbi, J.C., Ganova-Raeva, L., Punkova, L., Ramachandran, S., Thai, H., Skums, P. and Sims, S., 2015. Accurate genetic detection of hepatitis C virus transmissions in outbreak settings. The Journal of infectious diseases, 213(6), pp.957-965.
2. Jombart, T., Cori, A., Didelot, X., Cauchemez, S., Fraser, C. and Ferguson, N., 2014. Bayesian reconstruction of disease outbreaks by combining epidemiologic and genomic data. PLoS computational biology, 10(1), p.e1003457.
3. De Maio, N., Wu, C.H. and Wilson, D.J., 2016. SCOTTI: efficient reconstruction of transmission within outbreaks with the structured coalescent. PLoS computational biology, 12(9), p.e1005130.
4. Skums, P., Zelikovsky, A., Singh, R., Gussler, W., Dimitrova, Z., Knyazev, S., Mandric, I., Ramachandran, S., Campo, D., Jha, D. and Bunimovich, L., 2017. QUENTIN: reconstruction of disease transmissions from viral quasispecies genomic data. Bioinformatics.

05 Introduction to single-cell genomics and new research directions towards a human cell atlas

Hosted by Vasilis Ntranos (University of California, Berkeley)
Our main goal in this journal club will be to familiarize ourselves with some of the key problem formulations in single-cell genomics and get exposed to the computational challenges emerging from the new types of data that are becoming available in this field [R1, R2]. After the initial overview [R3], participants will be free to choose specific papers/methods that best align with their interests and have them discussed in more detail by the group. Suggested topics include spatial reconstruction [P1], single-cell entropy in differentiation [P2] and lineage tracing by genome editing [P3]. Participants with diverse backgrounds are welcome, as we would like to engage in broad discussions about new research directions and potentially draw connections to existing methods and ideas from related fields such as phylogenetics and metagenomics.

Difficulty: Introductory/Intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
R1. Yuan, G.C., Cai, L., Elowitz, M., Enver, T., Fan, G., Guo, G., Irizarry, R., Kharchenko, P., Kim, J., Orkin, S. and Quackenbush, J., 2017. Challenges and emerging directions in single-cell analysis. Genome Biology, 18(1), p.84.
R2. Regev, A., Teichmann, S., Lander, E.S., Amit, I., Benoist, C., Birney, E., Bodenmiller, B., Campbell, P., Carninci, P., Clatworthy, M. and Clevers, H., 2017. The Human Cell Atlas. bioRxiv, p.121202.
R3. Wagner, A., Regev, A. and Yosef, N., 2016. Revealing the vectors of cellular identity with single-cell genomics. Nature Biotechnology, 34(11), pp.1145-1160.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
P1. Satija, R., Farrell, J.A., Gennert, D., Schier, A.F. and Regev, A., 2015. Spatial reconstruction of single-cell gene expression data. Nature biotechnology, 33(5), pp.495-502.
P2. Teschendorff, A.E. and Enver, T., 2017. Single-cell entropy for accurate estimation of differentiation potency from a cell's transcriptome. Nature Communications, 8, p.15599.
P3. McKenna, A., Findlay, G.M., Gagnon, J.A., Horwitz, M.S., Schier, A.F. and Shendure, J., 2016. Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science, 353(6298), p.aaf7907.

06 Genome rearrangements guided by 3D structure of chromosomes

Hosted by Nikita Alexeev (George Washington University)
Genome rearrangements are evolutionary events that shuffle genomic architectures by breaking a genome at several positions and gluing the resulting fragments back together in a new order. They have been widely studied with graph-theoretic methods. The common belief is that rearrangements are possible between fragile genome regions that are close in 3D space. In this journal club we will discuss how knowledge of the 3D structure of chromosomes, obtained with the Hi-C protocol, allows us to understand the nature of genome rearrangements and discover the transformations that happened between different genomes.
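
To make the "break and glue" picture concrete, here is a minimal sketch of one standard rearrangement, a reversal on a signed permutation, together with a count of breakpoints relative to the identity order. The representation (genes as signed integers 1..n) and the function names are assumptions made for illustration. The code does not model 3D proximity or Hi-C data.

```python
from typing import List


def reversal(genome: List[int], i: int, j: int) -> List[int]:
    """Cut out genome[i..j] (inclusive), reverse it, flip each gene's sign, and glue it back."""
    if not (0 <= i <= j < len(genome)):
        raise ValueError("need 0 <= i <= j < len(genome)")
    flipped = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + flipped + genome[j + 1:]


def breakpoints(genome: List[int]) -> int:
    """Count neighboring pairs (a, b) with b != a + 1, after framing the genome with 0 and n + 1."""
    framed = [0] + genome + [len(genome) + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


genome = [1, -3, -2, 4]
print(breakpoints(genome))      # 2
fixed = reversal(genome, 1, 2)  # reverse the segment [-3, -2]
print(fixed)                    # [1, 2, 3, 4]
print(breakpoints(fixed))       # 0
```

A model with positional constraints would additionally restrict which pairs of cut positions may take part in a rearrangement, for example to pairs that are in spatial contact.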

Difficulty: Intermediate (requires a basic understanding of graph theory)

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. The following manuscript: Guillaume Fertin, Anthony Labarre, Irena Rusu, Eric Tannier and Stéphane Vialette. Combinatorics of Genome Rearrangements. MIT press, 2009. Available at: https://mitpress.mit.edu/sites/default/files/titles/content/9780262062824_sch_0001.pdf
2. Chapters 5.1-5.2 of: Jones, N.C. and Pevzner, P., 2004. An introduction to bioinformatics algorithms. MIT press. Accessible at: https://www.dropbox.com/s/z97nf7cvk6j5cwb/Chaters5.2to5.2.pdf?dl=0

Papers to discuss (read before the meeting when they are scheduled to be discussed):
3. Swenson, K.M., Simonaitis, P. and Blanchette, M., 2016. Models and algorithms for genome rearrangement with positional constraints. Algorithms for Molecular Biology, 11(1), p.13.
4. Véron, A.S., Lemaitre, C., Gautier, C., Lacroix, V. and Sagot, M.F., 2011. Close 3D proximity of evolutionary breakpoints argues for the notion of spatial synteny. BMC Genomics, 12(1), p.303.
5. Pulicani, S., Simonaitis, P. and Swenson, K.M., 2017. Rearrangement Scenarios Guided By Chromatin Structure. bioRxiv, p.137323.

07 Computational epigenetics

Hosted by Chloe Robins (Emory University)
A survey of computational methods for the analysis of epigenetic data, from DNA methylation to chromatin-level modifications. We will start with DNA methylation and move from there.

Difficulty: Intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Jones, P.A., 2012. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nature Reviews Genetics, 13(7), pp.484-492.
2. Bock, C., 2012. Analysing and interpreting DNA methylation data. Nature Reviews Genetics, 13(10), pp.705-719.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
3. Wu, H., Xu, T., Feng, H., Chen, L., Li, B., Yao, B., Qin, Z., Jin, P. and Conneely, K.N., 2015. Detection of differentially methylated regions from whole-genome bisulfite sequencing data without replicates. Nucleic Acids Research, 43(21), p.e141.
4. Hansen, K.D., Langmead, B. and Irizarry, R.A., 2012. BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions. Genome Biology, 13(10), p.R83.
5. Slieker, R.C., van Iterson, M., Luijk, R., Beekman, M., Zhernakova, D.V., Moed, M.H., Mei, H., Van Galen, M., Deelen, P., Bonder, M.J. and Zhernakova, A., 2016. Age-related accrual of methylomic variability is linked to fundamental ageing mechanisms. Genome Biology, 17(1), p.191.
6. Bell, C.G., Xia, Y., Yuan, W., Gao, F., Ward, K., Roos, L., Mangino, M., Hysi, P.G., Bell, J., Wang, J. and Spector, T.D., 2016. Novel regional age-associated DNA methylation changes within human common disease-associated loci. Genome Biology, 17(1), p.193.
7. Houseman, E.A., Kile, M.L., Christiani, D.C., Ince, T.A., Kelsey, K.T. and Marsit, C.J., 2016. Reference-free deconvolution of DNA methylation data and mediation by cell composition effects. BMC Bioinformatics, 17(1), p.259.
8. Rahmani, E., Zaitlen, N., Baran, Y., Eng, C., Hu, D., Galanter, J., Oh, S., Burchard, E.G., Eskin, E., Zou, J. and Halperin, E., 2016. Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies. Nature Methods, 13(5), pp.443-445.

08 New genomic data and methods for inferring human population history in Eurasia

Hosted by Ilan Gronau (Cornell University)
The past couple of years have produced an extreme wealth of genome sequence data that can be used to retell the story of human population dispersal out of Africa and into Eurasia. We will review some of the main sources of data that emerged from these studies (present-day and ancient DNA), as well as the statistical methods used to produce demographic models from these data. Some of the interesting questions addressed in these studies: how many waves of migration out of Africa can we trace in present-day Eurasian populations? How was Europe populated? How do present-day populations relate to early farmers of the Middle East?

Difficulty: Intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Nielsen, R., Akey, J.M., Jakobsson, M., Pritchard, J.K., Tishkoff, S. and Willerslev, E., 2017. Tracing the peopling of the world through genomics. Nature, 541(7637), pp.302-310.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
2. Mallick, S., Li, H., Lipson, M., Mathieson, I., Gymrek, M., Racimo, F., Zhao, M., Chennagiri, N., Nordenfelt, S., Tandon, A. and Skoglund, P., 2016. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature, 538(7624), pp.201-206.
3. Pagani, L., Lawson, D.J., Jagoda, E., Mörseburg, A., Eriksson, A., Mitt, M., Clemente, F., Hudjashov, G., DeGiorgio, M., Saag, L. and Wall, J.D., 2016. Genomic analyses inform on migration events during the peopling of Eurasia. Nature, 538(7624), pp.238-242.
4. Fu, Q., Posth, C., Hajdinjak, M., Petr, M., Mallick, S., Fernandes, D., Furtwängler, A., Haak, W., Meyer, M., Mittnik, A. and Nickel, B., 2016. The genetic history of Ice Age Europe. Nature, 534(7606), pp.200-205.
5. Lazaridis, I., Nadel, D., Rollefson, G., Merrett, D.C., Rohland, N., Mallick, S., Fernandes, D., Novak, M., Gamarra, B., Sirak, K. and Connell, S., 2016. Genomic insights into the origin of farming in the ancient Near East. Nature, 536(7617), pp.419-424.
6. Malaspinas, A.S., Westaway, M.C., Muller, C., Sousa, V.C., Lao, O., Alves, I., Bergstrom, A., Athanasiadis, G., Cheng, J.Y., Crawford, J.E. and Heupink, T.H., 2016. A genomic history of Aboriginal Australia. Nature, 538(7624), pp.207-214.
7. Lipson, M. and Reich, D., 2017. A working model of the deep relationships of diverse modern human genetic lineages outside of Africa. Molecular Biology and Evolution, p.msw293.

09 (Un)breaking the chain: statistical methods to uncover the molecular cascade of genotype → molecular phenotypes → disease

Hosted by Nick Mancuso (University of California, Los Angeles)
Genome-wide association studies have been wildly successful in identifying genomic regions associated with disease risk. To date, thousands of loci have been reproducibly identified, yet most fail to provide mechanistic insight. Multiple lines of evidence have demonstrated significant enrichment of GWAS risk loci in functional regions of the genome, which suggests regulatory control of intermediate phenotypes (e.g., gene expression, splice variation, chromatin state).

Taken together, this paints a broad landscape where genetic variation influences intermediate molecular phenotypes and ultimately disease risk. Recently, several nascent computational approaches have been proposed that link genetic variation to intermediate phenotype and disease risk. This journal club will review the state-of-the-art in this area of research and prepare attendees for independent investigation.

Difficulty: Intermediate/Advanced

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Gusev, A., Ko, A., Shi, H., Bhatia, G., Chung, W., Penninx, B.W., Jansen, R., De Geus, E.J., Boomsma, D.I., Wright, F.A. and Sullivan, P.F., 2016. Integrative approaches for large-scale transcriptome-wide association studies. Nature genetics, 48(3), pp.245-252.
2. Mancuso, N., Shi, H., Goddard, P., Kichaev, G., Gusev, A. and Pasaniuc, B., 2017. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. The American Journal of Human Genetics, 100(3), pp.473-487.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
3. Gusev, A., Mancuso, N., Finucane, H.K., Reshef, Y., Song, L., Safi, A., Oh, E., McCaroll, S., Neale, B., Ophoff, R. and O'Donovan, M.C., 2016. Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. bioRxiv, p.067355.
4. Park, Y., Sarkar, A.K., Bhutani, K. and Kellis, M., 2017. Multi-tissue polygenic models for transcriptome-wide association studies. bioRxiv, p.107623.

10 Integrative analysis of multiple types of genomic data

Hosted by William Wen (University of Michigan)
The goal here is to survey the current literature on integrative analysis of multiple types of genomic data to (1) perform fine-mapping of genetic association signals; (2) understand the molecular mechanisms of complex traits; and (3) predict variant effects.

Difficulty: Beginner to intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Ritchie, M.D., Holzinger, E.R., Li, R., Pendergrass, S.A. and Kim, D., 2015. Methods of integrating data to uncover genotype-phenotype interactions. Nature Reviews Genetics, 16(2), pp.85-97.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
2. Pickrell, J.K., 2014. Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. The American Journal of Human Genetics, 94(4), pp.559-573.
3. Gusev, A., Lee, S.H., Trynka, G., Finucane, H., Vilhjálmsson, B.J., Xu, H., Zang, C., Ripke, S., Bulik-Sullivan, B., Stahl, E. and Kähler, A.K., 2014. Partitioning heritability of regulatory and cell-type-specific variants across 11 common diseases. The American Journal of Human Genetics, 95(5), pp.535-552.
4. Ionita-Laza, I., McCallum, K., Xu, B. and Buxbaum, J.D., 2016. A spectral approach integrating functional genomic annotations for coding and noncoding variants. Nature genetics, 48(2), pp.214-220.

11 Computational modeling of protein-RNA interactions

Hosted by Yaron Orenstein (Massachusetts Institute of Technology)
Protein-RNA interactions, mediated through both RNA sequence and structure, play a vital role in all cellular processes. In recent years, technologies have been developed to measure these interactions in a high-throughput manner. In this journal club, we will discuss computational approaches to modeling protein-RNA binding from these data, and focus on how RNA structure is incorporated in these models.

Difficulty: Intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Cook, K.B., Hughes, T.R. and Morris, Q.D., 2014. High-throughput characterization of protein–RNA interactions. Briefings in functional genomics, 14(1), pp.74-89.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
2. Maticzka, D., Lange, S.J., Costa, F. and Backofen, R., 2014. GraphProt: modeling binding preferences of RNA-binding proteins. Genome Biology, 15(1), p.R17.
3. Orenstein, Y., Wang, Y. and Berger, B., 2016. RCK: accurate and efficient inference of sequence- and structure-based protein–RNA binding models from RNAcompete data. Bioinformatics, 32(12), pp.i351-i359.

12 Epistasis and evolution: methods and applications

Hosted by Or Zuk (Hebrew University of Jerusalem)
The effect of many genetic variants is mediated by other variants, leading to genetic interactions, also termed epistasis. Such interactions affect the selection forces acting on individual variants and can thus leave signals of co-evolution when these variants are examined in the genomes of related species or of individuals from the same species. Consequently, these signals can be used to detect pairs of interacting variants and to computationally infer the role of individual variants, including, for example, contacts between amino acids in a protein and compensatory cis-regulatory mutations.

We will review recent papers that study the evolution of both regulatory and coding interacting variants and discuss their models and computational approaches, with the goal of proposing modifications and extensions of the new methods for large-scale genomic datasets.
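
One simple way to see what a co-evolution signal looks like is to measure the mutual information between two columns of a multiple sequence alignment: positions that tend to change together score high, while positions that vary independently (or not at all) score near zero. The sketch below uses mutual information as a deliberately simple stand-in. It is not the model used in the papers listed here, and the toy alignment and function names are hypothetical.

```python
from collections import Counter
from math import log2


def column(alignment, k):
    """Extract the k-th column (one character per sequence) from an alignment."""
    return [seq[k] for seq in alignment]


def mutual_information(xs, ys) -> float:
    """Mutual information in bits between two equally long columns."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs: p(a,b) * log2( p(a,b) / (p(a) * p(b)) ).
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


# Toy alignment: columns 0 and 1 change together, column 2 is constant.
alignment = ["AKC", "AKC", "GEC", "GEC"]

print(mutual_information(column(alignment, 0), column(alignment, 1)))  # 1.0
print(mutual_information(column(alignment, 0), column(alignment, 2)))  # 0.0
```

In real data, high mutual information can also arise from shared phylogeny or indirect coupling through a third position, which is one reason more elaborate models are used in practice.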

Difficulty: Advanced

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Phillips, P.C., 2008. Epistasis—the essential role of gene interactions in the structure and evolution of genetic systems. Nature Reviews Genetics, 9(11), pp.855-867.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
2. Jordan, D.M., Frangakis, S.G., Golzio, C., Cassa, C.A., Kurtzberg, J., Davis, E.E., Sunyaev, S.R. and Katsanis, N., 2015. Identification of cis-suppression of human disease mutations by comparative genomics. Nature, 524(7564), pp.225-229.
3. Sohail, M., Vakhrusheva, O.A., Sul, J.H., Pulit, S.L., Francioli, L.C., van den Berg, L.H., Veldink, J.H., de Bakker, P.I., Bazykin, G.A., Kondrashov, A.S. and Sunyaev, S.R., 2017. Negative selection in humans and fruit flies involves synergistic epistasis. Science, 356(6337), pp.539-542.
4. Hopf, T.A., Ingraham, J.B., Poelwijk, F.J., Schärfe, C.P., Springer, M., Sander, C. and Marks, D.S., 2017. Mutation effects predicted from sequence co-variation. Nature Biotechnology, 35(2), pp.128-135.

13 Causal inference in biology

Hosted by Michael Bilow (University of California, Los Angeles)
In this journal club, we'll cover some of the basics of causal inference as it relates to molecular biology, with a particular focus on inference on biological networks. We'll also be covering computational tools to solve biologically-focused problems, especially distributed graph processing using GraphX.

Difficulty: Intermediate

Papers covering assumed knowledge (read or know in advance of the first journal club meeting):
1. Kleinberg, S. and Hripcsak, G., 2011. A review of causal inference for biomedical informatics. Journal of biomedical informatics, 44(6), pp.1102-1112.
2. Pearl, J., 2009. Causal inference in statistics: An overview. Statistics surveys, 3, pp.96-146.

Papers to discuss (read before the meeting when they are scheduled to be discussed):
3. Şenbabaoğlu, Y., Sümer, S.O., Sánchez-Vega, F., Bemis, D., Ciriello, G., Schultz, N. and Sander, C., 2016. A multi-method approach for proteomic network inference in 11 human cancers. PLoS computational biology, 12(2), p.e1004765.
4. Djordjevic, D., Yang, A., Zadoorian, A., Rungrugeecharoen, K. and Ho, J.W., 2014. How difficult is inference of mammalian causal gene regulatory networks? PLoS ONE, 9(11), p.e111661.
5. Siahpirani, A.F. and Roy, S., 2017. A prior-based integrative framework for functional transcriptional regulatory network inference. Nucleic acids research, 45(4), pp.e21-e21.