Tsinghua Bioinformatics Shell
MOE Key Laboratory of Bioinformatics, Bioinformatics Division, TNLIST / Department of Automation, Tsinghua University
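In April 2017 the lab published a paper in Nucleic Acids Research describing an alternative splicing analysis method built mainly on shell scripts; its comparative analyses also gave good results.
It gives a clear account of how AFEs (alternative first exons) are quantified and interpreted functionally, and it optimizes and reinterprets existing statistical models.
First, we give a comprehensive overview and enumeration of all the distribution patterns of first exons.
Five methods are compared on how well they identify AFE events in strand-specific and non-strand-specific data from the nuclear and cytoplasmic fractions.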
Performance assessment of the five methods for FE identification using the reference CAGE data. (A and B) The receiver operating characteristic (ROC) curves of the five methods on non-strand-specific RNA-seq data of the nuclear (A) and cytoplasmic fractions (B) of the KhES cell line. (C and D) The ROC curves of the five methods on strand-specific RNA-seq data of the nuclear (C) and cytoplasmic fractions (D) of the H1-hES cell line. The logistic regression model has the best performance in all cases.
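To make the ROC comparison concrete, here is a minimal sketch of how one ROC curve and its area under the curve (AUC) could be computed for a single method. It assumes each candidate FE has a binary label (1 if supported by the reference CAGE data, 0 otherwise) and a score from the method. The function names and the toy data are hypothetical and are not taken from the paper or from SEASTAR.

```python
from typing import List, Tuple


def roc_curve(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points, sweeping the score threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Rank candidates by decreasing score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Candidates with tied scores are added together as one threshold step.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data: 1 = FE supported by CAGE, 0 = not supported.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(roc_curve(toy_labels, toy_scores))
```

A curve closer to the top-left corner (AUC closer to 1) means the method ranks CAGE-supported FEs above unsupported ones more consistently, which is the sense in which one method is said to perform best.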
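Identification and features of FEs across multiple cell types: FEs are identified in different cell lines, using CAGE data as the reference data.
In the correlation plot in panel A, the correlation scores are all fairly high. In the comparison in panel B, the CAGE curve around the TSS region serves as the gold standard: FEs detectable by CAGE are shown in red, those not detectable in sky blue, known FEs from the SEASTAR method in dark blue, and newly discovered ones in teal. The SEASTAR results largely follow the same trend as CAGE, with a similar average coverage.
Specific examples: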
Examples of differentially used AFEs and tandem TSSs between the GM12878 and K562 cell lines. (A) Differentially used AFEs in gene RPS6KA1 (sense strand). (B) Differentially used AFEs in gene BIN1 (antisense strand). (C) Differentially used tandem TSSs in gene ATP6V1E2 (antisense strand). (D) Differentially used tandem TSSs in gene SLC35D1 (antisense strand).
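After the routine processing steps, PSI quantification of the AFE events is complete. The PSI of the AFEs is plotted over the course of iPSC reprogramming, together with a heatmap of 126 specifically expressed transcription factors, which likewise change along with the iPSC reprogramming process. The heatmap shows the standard Pearson correlation between the PSI of the AFEs and the TFs; both show fairly clear clusters and distinct trends.
The genes labeled on the right are the top 10 most variable candidates.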
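As a rough illustration of what "PSI quantification" means for AFEs, the sketch below assumes PSI (percent spliced in) is simply the fraction of a gene's first-exon-supporting reads that are assigned to each AFE. This is a simplifying assumption for illustration only; the estimator actually used by SEASTAR may differ (for example in how reads are counted or normalized). The function name and the read counts are hypothetical.

```python
import math
from typing import Dict


def afe_psi(read_counts: Dict[str, float]) -> Dict[str, float]:
    """Relative usage of each AFE within one gene.

    read_counts maps an AFE identifier to the number of reads supporting it.
    Returns NaN for every AFE when the gene has no supporting reads at all.
    """
    total = sum(read_counts.values())
    if total == 0:
        return {afe: math.nan for afe in read_counts}
    return {afe: count / total for afe, count in read_counts.items()}


# Hypothetical gene with two alternative first exons.
example_psi = afe_psi({"AFE1": 30, "AFE2": 10})
```

Under this definition the PSI values of all AFEs in a gene sum to 1, so an increase in the usage of one first exon necessarily comes at the expense of the others.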
Finally, one example: Mycn. It was selected with a P-value of 0.00028.
We found multiple TFs known to be key regulators of reprogramming, including the top-ranked N-Myc (Mycn) gene (with a P-value of 0.00028). AFEs containing the Mycn motif were significantly enriched towards the top of the AFEs positively correlated with Mycn expression in our enrichment analysis. We further investigated the expression level of Mycn, as well as the average PSI values of AFEs that contain the Mycn motif and have strong positive correlation with Mycn expression (PCC > 0.5) (Figure 6C). The significant increase of Mycn expression during iPSC reprogramming (P-value = 8.9e-16, ANOVA test) was accompanied by an increase in the relative usage of these AFEs. The coordinated change in expression levels between Mycn and the differentially used AFEs containing the Mycn motif suggests that Mycn binds to and promotes the usage of these AFEs. Mycn is known to play an essential role in the maintenance of pluripotency. Mycn can cooperate with other TFs to reprogram adult cells into other differentiated cells or into iPS cells. Msx2, another transcription factor identified in our enrichment analysis, is a major driver of de-differentiation in mammalian muscle cells.
Collectively, these data imply that TFs with high scores from the enrichment analysis of differential AFEs play important roles in iPSC reprogramming and the regulation of the pluripotent state. (This is the summary statement.)
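The selection step described in the quoted passage (keep AFEs whose PSI correlates with Mycn expression at PCC > 0.5, then average their PSI per stage) can be sketched as follows. This is only an illustration under stated assumptions: the inputs are one TF expression value and one PSI value per reprogramming stage, the motif filtering is assumed to have been done already, and all names and numbers are hypothetical rather than taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient (PCC) of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return math.nan  # correlation is undefined for a constant series
    return cov / (sd_x * sd_y)


def mean_psi_of_correlated_afes(tf_expr: Sequence[float],
                                psi_by_afe: Dict[str, Sequence[float]],
                                min_pcc: float = 0.5) -> List[float]:
    """Average PSI per stage over AFEs with PCC > min_pcc against tf_expr."""
    kept = [psi for psi in psi_by_afe.values()
            if pearson(tf_expr, psi) > min_pcc]  # NaN never passes this test
    if not kept:
        return []
    return [sum(stage) / len(kept) for stage in zip(*kept)]


# Hypothetical data: four reprogramming stages, three motif-containing AFEs.
tf_expression = [1.0, 2.0, 3.0, 4.0]
psi_table = {
    "afe_a": [0.1, 0.2, 0.3, 0.4],   # rises with the TF, kept
    "afe_b": [0.9, 0.7, 0.5, 0.3],   # falls as the TF rises, dropped
    "afe_c": [0.2, 0.4, 0.6, 0.8],   # rises with the TF, kept
}
mean_psi = mean_psi_of_correlated_afes(tf_expression, psi_table)
```

Plotting `mean_psi` next to `tf_expression` gives the kind of side-by-side trend that the passage describes for Mycn and its positively correlated AFEs.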
A demo of how Mycn gene expression changes across developmental stages, alongside the changes in the PSI values of the FE events.