Summary of PhaSepDB update

phasepdb_update_summary.jpg

1. The original intention of PhaSepDB

Phase separation (PS) regulates various biological processes, such as the assembly of membraneless organelles (MLOs), signal transduction, transcription regulation, and protein degradation. Many proteins have been shown to undergo PS, which is important for their function, and aberrant PS can lead to disease.
On 2019/10/04, we released PhaSepDB 1.0, which provided experimentally verified PS proteins and MLO-related proteins. Since then, PhaSepDB has been well acknowledged by the phase separation community, with about 1000 views per day. This work has been cited 63 times and rated as a highly cited paper in Web of Science.
Numerous phase separation-related works have been published since then, so we have continued to curate new phase-separating proteins and update the database. In July 2021, we released PhaSepDB 2.0. In June 2022, we updated the database to version 2.1. PhaSepDB 2.1 (http://db.phasep.pro/) contains 1419 phase separation entries, 770 low-throughput membraneless organelle-related entries, and 7303 high-throughput entries. We also provide more detailed annotations of phase separation-related proteins, such as material states and co-phase separation partners.

2. PS entries collection and classification

We rechecked the literature in PhaSepDB 1.0 and filtered PubMed search results obtained with the following keywords:
“(phase transition[Title/Abstract] OR phase separation[Title/Abstract] OR membraneless organelles[Title/Abstract] OR biomolecular condensates[Title/Abstract]) AND protein AND cell”
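As an illustration of how the keyword search above could be reproduced programmatically, the following minimal sketch builds a request URL for the public NCBI E-utilities ESearch endpoint. The use of E-utilities, the function name build_esearch_url, and the paging values are assumptions made for this example; the text does not state which interface was used for the original search. No network request is sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Keyword query exactly as quoted in the text.
PUBMED_QUERY = (
    "(phase transition[Title/Abstract] OR phase separation[Title/Abstract] "
    "OR membraneless organelles[Title/Abstract] "
    "OR biomolecular condensates[Title/Abstract]) AND protein AND cell"
)

# Public NCBI E-utilities ESearch endpoint (assumption: the original
# search may equally have been run through the PubMed web interface).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=100, retstart=0):
    """Return an ESearch URL for `term` (hypothetical helper).

    retmax and retstart page through the list of matching PubMed IDs.
    """
    params = {
        "db": "pubmed",
        "term": term,
        "retmode": "json",
        "retmax": retmax,
        "retstart": retstart,
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url(PUBMED_QUERY)
print(url)
```

The returned PubMed IDs would still need manual review, since the search only narrows the candidate literature.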
1419 PS entries for 868 proteins were extracted from this literature, and 280 entries for 193 proteins came from the first version. For comparison, the other two well-established phase separation databases, PhaSePro and LLPSDB v2, contain 121 and 586 PS proteins, respectively. Thus, PhaSepDB 2.1 is currently the largest manually collected PS protein database.
PS entries were classified into “PS-self” and “PS-other” classes. The former means that the protein can undergo PS in vitro by itself, while the latter means that the protein requires other partners to form biomolecular droplets in vitro, or can form biomolecular droplets in vivo.
In addition, we included phase-separated proteins identified by the high-throughput method AICAP, which was introduced in 'Quantifying the phase separation property of chromatin-associated proteins under physiological conditions using an anti-1,6-hexanediol index'. We collected proteins with AICAP < 1 before and after treatment.
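The PS-self / PS-other rule described above can be sketched as a small function. The Evidence fields and the function name classify_ps_entry are hypothetical names introduced for illustration, and giving PS-self precedence when several kinds of evidence are present is an assumption; this is not the curation code used for PhaSepDB.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical summary of the evidence reported for one entry."""
    in_vitro_alone: bool = False          # droplets in vitro without partners
    in_vitro_with_partners: bool = False  # droplets in vitro only with partners
    in_vivo: bool = False                 # droplets observed in vivo


def classify_ps_entry(ev):
    """Return "PS-self", "PS-other", or None if there is no PS evidence."""
    # Assumption: in vitro PS by the protein itself takes precedence.
    if ev.in_vitro_alone:
        return "PS-self"
    if ev.in_vitro_with_partners or ev.in_vivo:
        return "PS-other"
    return None


print(classify_ps_entry(Evidence(in_vitro_alone=True)))
print(classify_ps_entry(Evidence(in_vivo=True)))
```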

3. PS entries annotation

Each PS entry was assigned one or more of four experiment types, based on the original publication:
1) “In vitro droplet formation”, 2) “In vivo droplet formation”: the protein can form, or be recruited into, biomolecular droplets.
3) “In vitro FRAP”, 4) “In vivo FRAP”: fluorescence recovery after photobleaching (FRAP) experiments show the mobility of proteins within the droplet.
We annotated entries with the following information:
1. Phase diagrams which show the formation of droplets in different experimental conditions.
2. Material states of the PS droplets (liquid, hydrogel, solid).
3. Regions used in PS experiments, and domains which are important for the proteins to undergo PS.
4. PS partners of the protein, including proteins, RNAs and others (such as chemicals and DNA).
5. Regulation of the protein's PS ability, including post-translational modifications (PTMs), mutations, oligomerization, repeats and alternative splicing.
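The experiment types and annotation fields listed above could be represented as a simple record, as in the following minimal sketch. The class and field names (Experiment, MaterialState, PSEntry and its attributes) are hypothetical and do not reflect the actual PhaSepDB schema or download format; the example entry uses placeholder values, not real data.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Experiment(Enum):
    IN_VITRO_DROPLET = "In vitro droplet formation"
    IN_VIVO_DROPLET = "In vivo droplet formation"
    IN_VITRO_FRAP = "In vitro FRAP"
    IN_VIVO_FRAP = "In vivo FRAP"


class MaterialState(Enum):
    LIQUID = "liquid"
    HYDROGEL = "hydrogel"
    SOLID = "solid"


@dataclass
class PSEntry:
    """Hypothetical record for one annotated PS entry."""
    protein: str
    experiments: List[Experiment]
    has_phase_diagram: bool = False
    material_state: Optional[MaterialState] = None
    regions: List[str] = field(default_factory=list)      # regions/domains used
    partners: List[str] = field(default_factory=list)     # proteins, RNAs, others
    regulation: List[str] = field(default_factory=list)   # PTMs, mutations, ...

    def __post_init__(self):
        # Each entry carries one or more of the four experiment types.
        if not self.experiments:
            raise ValueError("a PS entry needs at least one experiment type")


# Placeholder values for illustration only.
entry = PSEntry(
    protein="EXAMPLE_PROTEIN",
    experiments=[Experiment.IN_VITRO_DROPLET, Experiment.IN_VITRO_FRAP],
    material_state=MaterialState.LIQUID,
    partners=["RNA"],
)
print(entry.protein, [e.value for e in entry.experiments])
```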

4. MLO and MLO related entries collection

Besides the entries with clear phase separation evidence, PhaSepDB 2.1 also contains 770 low-throughput MLO-related entries. These entries refer to proteins that were shown to localize in MLOs by in vivo experiments, but for which no clear phase separation evidence was provided. Furthermore, we collected 7303 high-throughput MLO-related entries from 20 papers, which are listed below.
For users interested in a specific MLO, we provide navigation that enables browsing by MLO location. Currently, information on 73 MLOs has been gathered from scattered literature, including both classic and newly discovered bodies.