Bioinformatics Algorithms

Bioinformatics algorithms book

ROSALIND

ROSALIND programming website

Speciation & Population Genomics: a how-to-guide

The tutorials and resources on this site are written for the Physalia Speciation genomics course that we have been teaching since 2018. We are really glad to hear that the resources are of use to others that do not participate in the class and we are happy for you to use them.

Computational Biology - Genomes, Networks, and Evolution

This text covers the algorithmic and machine learning foundations of computational biology combining theory with practice. We cover both foundational topics in computational biology, and current research frontiers. We study fundamental techniques, recent advances in the field, and work directly with current large-scale biological datasets.

MLCB24 - Machine Learning in Cell Biology 2024

Training Courses of Babraham Institute

As part of its work with the Babraham Institute, the Bioinformatics group runs a regular series of training courses on many aspects of bioinformatics. These courses are run regularly on the Babraham site but we are also able to come out and present them on other sites and also deliver them remotely. You can see the list of current Babraham dates which are available, and you can contact us to discuss options for running courses on your site.

Comp790-166: Computational Biology

Modern, high-throughput assays allow us to efficiently profile a variety of biological processes to gain a systems-level understanding of health and disease. Recent technologies and experimental assays generate an abundance of detailed information that needs to be extracted, summarized, and interpreted. In this course we will discuss the methodology used to extract signal from (e.g. process, engineer features from, combine, etc.) data generated by some of the most cutting-edge experimental paradigms, such as single-cell assays and imaging. We will go into detail about the methods and theory underlying bioinformatics algorithms, originating from numerical linear algebra, graph-signal processing, and machine learning. While computational biology is a very broad field, we will focus here on applications in single-cell biology (CyTOF, single-cell RNA sequencing), multiomics/multi-modal analysis, systems immunology, and benchmarking. For each class of algorithms introduced for some task on biological data, we will also go over necessary theory and mathematical intuition. The course covers the foundations for biomedical data science and does not assume any biological knowledge.

Bioinformatics Tutorial By Lu Zhi

Bioinformatics Tutorial By Lu Zhi Lab

Computational Genomics with R

The aim of this book is to provide the fundamentals for data analysis for genomics. We developed this book based on the computational genomics courses we are giving every year.

Modern Statistics for Modern Biology

Modern Statistics for Modern Biology, by Susan Holmes and Wolfgang Huber

Misc notes

Various notes collected over the years

The Biostar Handbook: 2nd Edition

A guide to bioinformatics data analysis

GitHub - crazyhottommy/getting-started-with-genomics-tools-and-resources: Unix, R and python tools for genomics and data science

Unix, R and Python tools for genomics and data science.

HarvardX

HarvardX Biomedical Data Science By rafalab

Introduction to Bioinformatics and Computational Biology

Introduction to Bioinformatics and Computational Biology By Xiaole Liu Lab

Orchestrating Single-Cell Analysis with Bioconductor

Bioconductor's scRNA tutorial

Single-cell best practices

GWAS ATLAS

This atlas is a database of publicly available GWAS summary statistics. Each GWAS can be browsed with the manhattan plot, risk loci, MAGMA (i.e. gene-based) results, SNP heritability and genetic correlations with other GWAS in the database. 600 GWAS were performed in this project based on UK Biobank release 2 data under application ID 16406. Full summary statistics can be downloaded from the original source following the provided links.

ChIP-Atlas

A data-mining suite for exploring epigenomic landscapes by fully integrating 428,000 ChIP-seq, ATAC-seq and Bisulfite-seq experiments.

ARCHS4

All RNA-seq and ChIP-seq sample and signature search (ARCHS4) (https://maayanlab.cloud/archs4/) is a resource that provides access to gene and transcript counts uniformly processed from all human and mouse RNA-seq experiments from the Gene Expression Omnibus (GEO) and the Sequence Read Archive (SRA).

refine.bio

refine.bio Compendia packages the data processed by refine.bio pipelines for flexible and broad use by computational biologists and data scientists

Google Docs: Single cell study database

Single-cell Pan-species atlas

SPEED is the largest pan-species single cell database at present, covering 127 species to date. SPEED also collects various scRNA-seq atlases in evolution, development and disease, providing an ecological and evolutionary perspective for the study of development and disease.

Single Cell Portal

Single Cell Portal by the Broad Institute

CELLxGENE

A collection of scRNA-seq datasets

Principles of Spatial Transcriptomics Analysis with Bioconductor

This is the website for the online book Principles of Spatial Transcriptomics Analysis with Bioconductor. This book provides interactive examples and discussion on key principles of computational analysis workflows for spatial transcriptomics data using Bioconductor in R. The book contains chapters describing individual analysis steps as well as extended workflows, each with examples including R code and datasets. The book is organized into several parts, consisting of introductory materials, and analysis steps and workflows for sequencing-based and imaging-based spatial transcriptomics platforms.

Zhonglab's 3D genome tutorial

Zhonglab's 3D genome tutorial

Ming Tang's ChIP-seq notes

Ming Tang's ChIP-seq notes

Hi-C Data Analysis Bootcamp

Hi-C Data Analysis Bootcamp

GitHub: Hi-C data analysis tools and papers

Hi-C data analysis tools and papers collection

Orchestrating Hi-C analysis with Bioconductor

Orchestrating Hi-C analysis with Bioconductor

Cooltools documentation

Cooltools tutorial

Neural networks and deep learning

Neural Networks and Deep Learning is a free online book. The book will teach you about neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data, and deep learning, a powerful set of techniques for learning in neural networks.

Dive into Deep Learning

Interactive deep learning book with code, math, and discussions. Implemented with PyTorch, NumPy/MXNet, JAX, and TensorFlow.

Distill

Machine Learning Research Should Be Clear, Dynamic and Vivid. Distill Is Here to Help.

Nature Reviews Genetics Collection: Machine learning in genomics

Machine learning has made significant contributions to the field of genetics, revolutionizing the way researchers analyse and interpret the vast amounts of genomic data. This collection brings together Reviews written by key opinion leaders in the field that explain cutting-edge machine learning methodology and how these tools are being applied in specific research areas. From predictive modelling to pattern recognition, this collection showcases the innovative ways in which machine learning is offering unprecedented insights into genetic variation, disease mechanisms and evolutionary dynamics.

Deep Learning in Genomics

Can deep learning models that have defeated gamers or recognized images better than humans also help us understand genomics? How far will this interdisciplinary research take us on our quest to cure cancer? In an era with faster-than-Moore’s-Law exponential growth of the genomics data (Berger et al. 2016), deep learning methods are finally able to assist in solving essential problems in the field. However, these exciting developments also face challenges that are unique to working with data from our DNA. As researchers trying to combine deep learning and genomics, we have to think carefully about applying these models effectively to genomics tasks. Is it appropriate to use deep learning for our application? What model should we use? Will our approach improve our understanding of the data or the problem? In this course, you will answer these questions by our coverage of recent research literature in the class. You will learn about different genomics tasks, deep learning models, and how they fit together. The course is designed to enable critical thinking and allows students to work together to apply these models.

Deep Learning in the Life Sciences

This course introduces foundations and state-of-the-art machine learning challenges in genomics and the life sciences more broadly. We introduce both deep learning and classical machine learning approaches to key problems, comparing and contrasting their power and limitations. We seek to enable students to evaluate a wide variety of solutions to key problems we face in this rapidly developing field, and to execute on new enabling solutions that can have large impact. As part of the subject students will implement solutions to challenging problems, first in problem sets that span a carefully chosen set of tasks, and then in an independent project. Students will program using Python 3 and TensorFlow 2 in Jupyter Notebooks, a nod to the importance of carefully documenting your work so it can be precisely reproduced by others.

Statlect: Fundamentals of mathematical statistics

Learn the mathematical foundations of statistics, through a series of rigorous but accessible lectures on the most frequently utilized statistical concepts.

GitHub: Awesome statistics

The repository consists of a dataset with curated links to material dealing with statistics and data. There is a total of 1918 active links in the dataset. The 289 awesome/recommended links in the dataset are listed below. Feel free to add additional links to the dataset.

Data Science Specialization on Coursera

Use R to clean, analyze, and visualize data.

Learning Statistics with R

Textbook by Danielle Navarro. Back in the grimdark pre-Snapchat era of humanity (i.e. early 2011), I started teaching an introductory statistics class for psychology students offered at the University of Adelaide, using the R statistical package as the primary tool. I wrote my own lecture notes for the class, which have now expanded to the point of effectively being a book. The book is freely available, and as of version 0.6 it is released under a creative commons licence (CC BY-SA 4.0)

Data Wrangling and Visualization with R

This is the website for the Data Wrangling and Visualization with R part of Introduction to Data Science. The website for the Statistics and Prediction Algorithms Through Case Studies part is here.

Statistics and Prediction Algorithms Through Case Studies

This book started out as part of the class notes used in the HarvardX Data Science Series.

An R companion to Statistics: data analysis and modelling

A companion book on how to use R alongside the book Statistics: Data analysis and modelling.

Likelihood Theory

Likelihood Theory (by Patrick Breheny)

Statistical Rethinking

This book means to help you raise your knowledge of and confidence in statistical modeling. It is meant as a scaffold, one that will allow you to construct the wall that you need, even though you will discard it afterwards. As a result, this book teaches the material in often inconvenient fashion, forcing you to perform step-by-step calculations that are usually automated. The reason for all the algorithmic fuss is to ensure that you understand enough of the details to make reasonable choices and interpretations in your own modeling work. So although you will move on to use more automation, it’s important to take things slow at first. Put up your wall, and then let the scaffolding fall.

Bayesian Regression: Theory & Practice

This site provides material for an intermediate level course on Bayesian linear regression modeling. The course presupposes some prior exposure to statistics and some acquaintance with R.

Statistical rethinking with brms, ggplot2, and the tidyverse

A. Solomon Kurz's notes on Statistical Rethinking

Bayes Rules! An Introduction to Applied Bayesian Modeling

Bayesian statistics?! Once an obscure term outside specialized industry and research circles, Bayesian methods are enjoying a renaissance. The title of this book speaks to what all the fuss is about: Bayes Rules! Bayesian methods provide a powerful alternative to the frequentist methods that are ingrained in the standard statistics curriculum. Though frequentist and Bayesian methods share a common goal – learning from data – the Bayesian approach to this goal is gaining popularity for many reasons: (1) Bayesian methods allow us to interpret new data in light of prior information, formally weaving both into a set of updated information; (2) relative to the confidence intervals and p-values utilized in frequentist analyses, Bayesian results are easier to interpret; (3) Bayesian methods can shine in settings where frequentist “likelihood” methods break down; and (4) the computational tools required for applying Bayesian techniques are increasingly accessible. Unfortunately, the popularity of Bayesian statistics has outpaced the curricular resources needed to support it. To this end, the primary goal of Bayes Rules! is to make modern Bayesian thinking, modeling, and computing accessible to a broader audience.

An Introduction to Bayesian Data Analysis for Cognitive Science

This book is intended to be a relatively gentle introduction to carrying out Bayesian data analysis and cognitive modeling using the probabilistic programming language Stan (Carpenter et al. 2017), and the front-end to Stan called brms (Bürkner 2024). Our target audience is cognitive scientists (e.g., linguists, psychologists, and computer scientists) who carry out planned behavioral experiments, and who are interested in learning the Bayesian data analysis methodology from the ground up and in a principled manner. Our aim is to make Bayesian statistics a standard part of the data analysis toolkit for experimental linguistics, psycholinguistics, psychology, and related disciplines.

Rafalab: Statistical Learning: Algorithmic and Nonparametric Approaches

Statistical Learning: Algorithmic and Nonparametric Approaches

Rafalab: Applied Nonparametric and Modern Statistics

Applied Nonparametric and Modern Statistics. Official title: Advanced Generalized Linear Models IV 140.753-754

Regression Modeling Strategies

All standard regression models have assumptions that must be verified for the model to have power to test hypotheses and for it to be able to predict accurately. Of the principal assumptions (linearity, additivity, distributional), this course will emphasize methods for assessing and satisfying the first two. Practical but powerful tools are presented for validating model assumptions and presenting model results. This course provides methods for estimating the shape of the relationship between predictors and response using the widely applicable method of augmenting the design matrix using restricted cubic splines. Even when assumptions are satisfied, overfitting can ruin a model’s predictive ability for future observations. Methods for data reduction will be introduced to deal with the common case where the number of potential predictors is large in comparison with the number of observations. Methods of model validation (bootstrap and cross-validation) will be covered, as will auxiliary topics such as modeling interaction surfaces, efficiently utilizing partial covariable data by using multiple imputation, variable selection, overly influential observations, collinearity, and shrinkage, and a brief introduction to the R rms package for handling these problems. The methods covered will apply to almost any regression model, including ordinary least squares, longitudinal models, logistic regression models, ordinal regression, quantile regression, longitudinal data analysis, and survival models. Statistical models will be contrasted with machine learning so that the student can make an informed choice of predictive tools.

Learning Statistical Models Through Simulation in R

Textbook on statistical models for social scientists.

Mixed Models with R

This is an introduction to using mixed models in R. It covers the most common techniques employed, with demonstration primarily via the lme4 package. Discussion includes extensions into generalized mixed models, Bayesian approaches, and realms beyond.

Free Code Camp

Community for practicing programming skills

Notes of "Data Structures and Algorithms"

Notes of "Data Structures and Algorithms" (CMPSC 465) at Penn State

Data Wrangling Functions

This repository contains examples of packages::functions() I commonly use when wrangling education research data. I am building this for myself, my team, and anyone else who comes across this site, to use as a reference for data cleaning projects.

Big Book of R

A collection of R resources

Mastering Software Development in R

The book covers R software development for building data science tools. As the field of data science evolves, it has become clear that software development skills are essential for producing useful data science results and products. You will obtain rigorous training in the R language, including the skills for handling complex data, building R packages and developing custom data visualizations. You will learn modern software development practices to build tools that are highly reusable, modular, and suitable for use in a team-based environment or a community of developers.

R Packages (2e)

Learn how to create a package, the fundamental unit of shareable, reusable, and reproducible R code.

R for Data Science

R4DS teaches you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you’ll learn how to clean data and draw plots—and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You’ll learn how to use the grammar of graphics, literate programming, and reproducible research to save time. You’ll also learn how to manage cognitive resources to facilitate discoveries when wrangling, visualising, and exploring data.

fasteR: Fast Lane to Learning R!

fasteR: Fast Lane to Learning R!

R in action

A good tutorial for R beginners

Deep R Programming

Deep R Programming is a comprehensive and in-depth introductory course on one of the most popular languages for data science. It equips ambitious students, professionals, and researchers with the knowledge and skills to become independent users of this potent environment so that they can tackle any problem related to data wrangling and analytics, numerical computing, statistics, and machine learning. This textbook is a non-profit project. Its online and PDF versions are freely available at https://deepr.gagolewski.com/.

cxli233/Online_R_learning: Online R learning for applied statistics

Online R learning for applied statistics.

Genomic Data Visualization

Course Website for EN.580.428 Genomic Data Visualization By JEFworks Lab

awesome-genome-visualization

A list of interesting genome visualizers, genome browsers, or genome-browser-like implementations

ggplot2: Elegant Graphics for Data Analysis (3e)

This is the on-line version of work-in-progress 3rd edition of “ggplot2: elegant graphics for data analysis” published by Springer. You can learn what’s changed from the 2nd edition in the Preface.

Datavis.ca

Michael Friendly's website

Psyc 6135: Psychology of Data Visualization

Information visualization is the pictorial representation of data. Successful visualizations capitalize on our capacity to recognize and understand patterns presented in information displays. Conversely, they require that writers of scientific papers, software designers and other providers of visual displays understand what works and what does not work to convey their message. This course will examine a variety of issues related to data visualization from a largely psychological perspective, but will also touch upon other related communities of research and practice related to this topic: history of data visualization, computer science and statistical software, visual design, human factors. We will consider visualization methods for a wide range of types of data from the points of view of both the viewer and designer/producer of graphic displays.

Milestones in the history of thematic cartography, statistical graphics, and data visualization.

The graphic portrayal of quantitative information has deep roots. These roots reach into histories of thematic cartography, statistical graphics, and data visualization, which are intertwined with each other. They also connect with the rise of statistical thinking up through the 19th century, and developments in technology into the 20th century. From above ground, we can see the current fruit; we must look below to see its pedigree and germination. There certainly have been many new things in the world of visualization; but unless you know its history, everything might seem novel.

Friends Don't Let Friends Make Bad Graphs

Friends don't let friends make certain types of data visualization - What are they and why are they bad. Author: Chenxin Li, Ph.D., Assistant Research Scientist at Department of Crop & Soil Sciences and Center for Applied Genetic Technologies, University of Georgia.

From Data to Viz

From Data to Viz leads you to the most appropriate graph for your data. It links to the code to build it and lists common caveats you should avoid.

Fundamentals of Data Visualization

Principles of data visualization

Vim Notes

Vim notes by Dave Tang

How Linux Works

Introduction to Linux

Python Distilled

The richness of modern Python challenges developers at all levels.

Automate the Boring Stuff with Python

Uri Alon Lab

Our group combines mathematical modelling, large-scale biomedical data and experiments, to understand human physiology, aging and disease. We are physicists, biologists, computer scientists and MDs working together to form the basic equations of hormone circuits, antibiotics, autoimmunity, cancer, mood disorders and age-related diseases. Our style emphasizes teaching good communication skills, listening and having fun being creative while going together into the unknown.

Briscoe Lab

Developmental Dynamics of Tissue Formation

Sebé-Pedrós Lab

We study the evolution of cell type programs and associated genome regulation (transcription factors, histone modifications, and more). To this end, we combine single-cell genomics, chromatin profiling, and comparative genomics methods in a phylogenetically diverse array of multicellular animals and unicellular eukaryotes.

Engreitz lab

We are mapping the regulatory wiring of the genome to understand the genetic basis of heart diseases.

Delas Lab

Our lab is interested in understanding how cis-regulatory elements control cell fate decisions during development. While we know these non-coding sequences control gene expression, how they achieve this is still not understood. We have a particular interest in repressors, and how they act via cis-regulatory elements to mediate gene silencing.

Tjian + Darzacq Group

Tij is a professor of biochemistry and molecular biology at the University of California, Berkeley. Trained as a biochemist, he has made major contributions to the understanding of how genes work during three decades at Berkeley. He was named an HHMI investigator in 1987 and served as president of the Howard Hughes Medical Institute from 2009 until 2016.

Mirny Lab

The Mirny Lab is a dynamic research group at MIT dedicated to unraveling the mysteries of the 3D genome. We combine cutting-edge experimental techniques, computational modeling, and theoretical insights to understand how the intricate folding of DNA influences gene expression, cell function, and human health.

Carmona lab

We study patterns of variation across tumors to identify general principles of immune system regulation during tumor progression and response to therapy. We combine data science and computational methods development with high-throughput single-cell and spatial omics technologies with the goal to reveal biological insight and develop predictive models of disease progression and response to treatment. ...

Andersson Lab

The Andersson lab focuses on modeling gene regulation to gain insights into molecular mechanisms by which enhancer or promoter dysregulation contributes to disease risk.

Teschendorff Lab

Our lab is focused on developing and applying advanced statistical and computational methods to enable a more meaningful interpretation of large-scale, high-throughput, multi-dimensional omic data. In particular, our long-term interest and goal is to elucidate and understand the systems-biology of oncogenesis (i.e. why do specific cells turn cancerous), and in parallel, to help develop cancer risk prediction tools that enable P4 Medicine strategies. To address these goals, we are using computational methods to (1) help map epigenetic and genetic alterations that accrue in normal cells as a function of age and exposure to major cancer risk factors, and (2) to help understand how these molecular alterations may lead to cancer development. As cell-type heterogeneity presents a major challenge, we are particularly interested in developing statistical methods to help dissect cell-type heterogeneity in both single-cell as well as bulk-sample contexts. We are adapting and pursuing methods from network/complexity science (network physics & graph theory), statistical mechanics, signal processing and machine learning, increasingly in the context of integrative multi-omic data.

Lu Lab

Lu Lab at Tsinghua University is devoted to developing bioinformatics technologies and practicing evidence-based precision medicine for diseases like cancer and immune-mediated diseases. We utilize AI technologies and noncoding RNA (ncRNA) centered multi-omics data to understand how genetic information is encoded in the structured DNA and RNA sequences, and how they interact and regulate each other in a biological system. Ultimately, this will help us understand and cure human diseases, and know and improve ourselves.

Ma'ayan Laboratory

The Ma’ayan Laboratory applies machine learning and other statistical mining techniques to study how intracellular regulatory systems function as networks to control cellular processes such as differentiation, dedifferentiation, apoptosis and proliferation.

Regulatory Genomics Lab

Regulatory Genomics Lab at Westlake, led by Kai Zhang

JEFworks Lab

Bioinformatics research lab in the Center for Computational Biology and the Department of Biomedical Engineering at Johns Hopkins University

Bioinfo Articles

Mostly blogs

Bioinfo Social Media

Bioinfo Twitter and Bluesky list

BlueSky Science-Related Feeds

List: Transcription & Chromatin II

List: Transcription & Chromatin

Scientific Journals

Bioinformatics+Genomics starter pack

Accounts and feeds to follow in bioinformatics and genomics. High signal, low politics. Prioritizing accounts with interesting original posts (not those that mostly repost). If I've forgotten you, *please* let me know and I'll add you!

List: single-cell computational methods

List: Genomics, Evolution, and More. pt1

List: Algorithmic Genomics

Peeps actively working on algorithms and data structures for omics data. NOTE: This list started off-hand, based on ppl I see actively on this platform, and is certainly highly incomplete. If you think you belong here, let me know!

List: Genomics, Evolution, and More. pt2

Genomics Technologies

List: Cancer research

A starter pack of curious cancer researchers - from immunology to genomics to cell biology and so much more

List: Metagenomics, microbial genomics

Genomicists who study microbial genomes and communities

Data Stories

A podcast on data visualization with Enrico Bertini and Moritz Stefaner, http://datastori.es/ . Interviews with over 100 graphic designers & developers.

Night Science

Where do ideas come from? In each episode, scientists Itai Yanai and Martin Lercher explore science's creative side with a leading colleague. New episodes come out every second Monday.

MirnyLab

Videos for publications of Mirny Lab @ MIT http://mirnylab.mit.edu

StatQuest with Josh Starmer

Learn statistics in an easy way. BAM!!!!

Xiaole Shirley Liu

Xiaole Shirley Liu's YouTube

3Blue1Brown

3Blue1Brown. Math YouTube channel

Manolis Kellis

Manolis Kellis' YouTube

Towards Data Science

Towards Data Science started in 2016 as a community-powered hub where data science and machine learning practitioners can share their knowledge and ideas with their peers — regardless of their location or background. This fundamental goal remains the same today.

Biostar forum

Bioinformatics forum

A. Solomon Kurz

I am a full-time clinical research psychologist at the VISN 17 Center of Excellence, and a part-time adjunct professor at The Chicago School of Professional Psychology.

Notes from a data witch

A blog by Danielle Navarro

Visualising Data

Andy Kirk's blog. Visualisingdata.com was originally launched in 2010 to serve as a blog to help continue the momentum of my learning from studying the subject via a Masters degree. I continue to publish articles and share announcements that track developments in my professional experiences as well as developments in the data visualisation field at large.

Junk Charts

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.

FlowingData

Data visualization blog

R some blog

Roman Pahl's blog

R bloggers

R news and tutorials contributed by hundreds of R bloggers

Hilary Parker

Hello, I’m Hilary Parker! I’m a Data Scientist, previously of Stitch Fix, Etsy, and the 2020 Biden for President Campaign. I'm passionate about the intersection of data science and product, from deeply understanding users to designing new experiences that depend on innovative data pipelines and client interactions. My work from Stitch Fix was featured in…

omic.ly

A weekly email newsletter on omics and clinical laboratory diagnostics.

Blogs from Briscoe Lab

Dave Tang's blog

Computational biology and genomics

Karl Broman's blog

Karl Broman's blog

Bits of DNA

Reviews and commentary on computational biology by Lior Pachter

Bioinformatics and other bits

Damien Farrell's blog

CHATOMICS

Ming Tang's blogs

Gilbert Han

Oh! This is my blog.

Paired Ends

Stephen Turner's blog

The Book of Why: The New Science of Cause and Effect

"Correlation is not causation." This mantra, chanted by scientists for more than a century, has led to a virtual prohibition on causal talk. Today, that taboo is dead. The causal revolution, instigated by Judea Pearl and his colleagues, has cut through a century of confusion and established causality -- the study of cause and effect -- on a firm scientific basis. His work explains how we can know easy things, like whether it was rain or a sprinkler that made a sidewalk wet; and how to answer hard questions, like whether a drug cured an illness. Pearl's work enables us to know not just whether one thing causes another: it lets us explore the world that is and the worlds that could have been. It shows us the essence of human thought and key to artificial intelligence. Anyone who wants to understand either needs The Book of Why.

Lewin's GENES XII

Long considered the quintessential molecular biology textbook, for decades Lewin's GENES has provided the most modern presentation to this transformative and dynamic science. Now in its twelfth edition, this classic text continues to lead with new information and cutting-edge developments, covering gene structure, sequencing, organization, and expression. Leading scientists provide revisions and updates in their respective areas of study offering readers current research and relevant information on the rapidly changing subjects in molecular biology. No other text offers a broader understanding of this exciting and vital science or does so with higher quality art and illustrations. Lewin's GENES XII continues to be the clear choice for molecular biology and genetics.

An Owner's Guide to the Human Genome: an introduction to human population genetics, variation and disease

In this book I describe the forces that govern genetic variation including mutation, drift, recombination and selection, as well as what genetics teaches us about human history, and the role of genetic variation in human phenotypes and diseases. When complete, the book will combine the three pillars of human population genetics - population genetics, population history, and trait genetics - under a single umbrella, with a focus on examples and applications in human genetics. Moreover, each section emphasizes the essential interplay between theory, statistical methods, and biological applications, with a focus on building intuition while avoiding heavy technical detail where possible.

Human Genome Variation

This is the course homepage and digital textbook for Human Genome Variation with Computational Lab