In this section, we will explore another practical application of the EM algorithm: speeding up the computation of PCA. It assumes that you have already read my introductory post on the EM algorithm.
Maximum likelihood estimation (MLE) is a way of estimating the parameters of a statistical model from observed data: it finds the parameters that maximize the likelihood of the observations under the assumed model distribution. However, many real-life problems involve quantities that we never observe directly and cannot infer straightforwardly from the limited data we have; these are called hidden (latent) variables Z. Many problems in genomics involve hidden variables. Typical examples are (i) inferring microbial community composition (Z: the different communities), (ii) inferring the ancestries of a group of individuals (Z: the different ancestries), and (iii) inferring cell type composition from sequencing data (Z: the different cell types). For problems like these, direct maximum likelihood estimation is typically hard, because the hidden variables have to be summed (or integrated) out, and the resulting log-likelihood, a log of a sum over Z, has no closed-form maximizer.
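To make this concrete, here is a minimal sketch (a hypothetical toy example with made-up parameters, not from the original post) of a two-component Gaussian mixture, where the hidden variable Z is the component, think cell type, that each observation came from. When Z is observed, the complete-data log-likelihood factorizes and is easy to maximize; when Z is hidden, the observed-data log-likelihood puts a sum over Z inside the log, which is exactly the situation the EM algorithm is designed to handle.

```python
# Toy two-component Gaussian mixture (hypothetical example for illustration).
# Z = hidden component label for each observation; x = observed value.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# True (in practice unknown) parameters: mixing weight and component means.
true_pi, true_mu = 0.3, np.array([-2.0, 3.0])

# Simulate data: Z is generated but then treated as hidden; only x is observed.
z = rng.binomial(1, 1 - true_pi, size=500)   # hidden labels, P(z = 0) = true_pi
x = rng.normal(loc=true_mu[z], scale=1.0)    # observed values

def complete_log_lik(x, z, pi, mu):
    """Log-likelihood if Z were observed: factorizes and is easy to maximize."""
    log_pz = np.where(z == 0, np.log(pi), np.log(1 - pi))
    return np.sum(log_pz + norm.logpdf(x, loc=mu[z], scale=1.0))

def observed_log_lik(x, pi, mu):
    """Marginal log-likelihood of the data we actually have: the sum over Z
    sits inside the log, so there is no closed-form maximizer -- this is
    where EM comes in."""
    dens = pi * norm.pdf(x, mu[0], 1.0) + (1 - pi) * norm.pdf(x, mu[1], 1.0)
    return np.sum(np.log(dens))

print(complete_log_lik(x, z, true_pi, true_mu))
print(observed_log_lik(x, true_pi, true_mu))
```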