Some brief notes on transcript quantification


Some biases in standard RNA-seq analysis

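Reference-guided assembly: StringTie, Cufflinks, and Traph

These tools rely on flow-network algorithms, using maximization and minimization formulations respectively (for example, StringTie solves a maximum-flow problem, while Cufflinks looks for a minimum path cover).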
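
To make the flow-network idea concrete, here is a minimal sketch of a maximum-flow computation (Edmonds-Karp) on a toy splice graph. The graph, its node names (`s`, `e1`, `e2`, `e3`, `t`), and the edge capacities standing in for read coverage are all made up for illustration. This is not how any of the assemblers above is actually implemented.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Build residual capacities, adding reverse edges with zero capacity.
    residual = {}
    for u, neighbours in capacity.items():
        residual.setdefault(u, {})
        for v, cap in neighbours.items():
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical splice graph: exons e1..e3 between a source s and a sink t.
splice_graph = {
    "s": {"e1": 10},
    "e1": {"e2": 6, "e3": 4},
    "e2": {"e3": 6},
    "e3": {"t": 8},
}
flow_value = max_flow(splice_graph, "s", "t")
```

In this toy graph the edge `e3 -> t` limits the total, so the flow saturates at its capacity.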

De novo (reference-free) assembly: Trinity

Clustering by k-mers
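
As a rough illustration of what "clustering by k-mers" can mean, the sketch below groups reads that share at least one k-mer, using a small union-find structure. The function names `kmers` and `cluster_by_shared_kmers` and the example reads are hypothetical. Trinity's real pipeline is considerably more involved, so treat this only as a toy.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_by_shared_kmers(reads, k):
    """Group read indices so that reads sharing any k-mer land together."""
    parent = list(range(len(reads)))

    def find(i):
        # Follow parent links to the cluster root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first read that contained it
    for idx, read in enumerate(reads):
        for kmer in kmers(read, k):
            if kmer in first_seen:
                # Merge this read's cluster with the earlier read's cluster.
                parent[find(idx)] = find(first_seen[kmer])
            else:
                first_seen[kmer] = idx

    clusters = {}
    for idx in range(len(reads)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


reads = ["ACGTAC", "GTACGG", "TTTTGA", "TTGACC"]
clusters = cluster_by_shared_kmers(reads, k=4)
```

Here reads 0 and 1 share the 4-mer `GTAC` and reads 2 and 3 share `TTGA`, so two clusters come out.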

Salmon: the best-performing software as of this writing

DAG: directed acyclic graph

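Bagging reduces variance, whereas boosting reduces bias.

High variance means the model is too complex and overfits: it memorizes too much detail and noise and is strongly affected by outliers. High bias means the model underfits: it is too simple, leaving the cost function poorly minimized.
A - Bagging trains on randomly selected subsets of the data. Because outliers make up only a small fraction of the data, they are less likely to take part in training any given model, so bagging reduces the influence of outliers and noise on the model, which lowers variance.
B - Boosting: see Bright's answer. Minimizing the loss function by definition minimizes bias.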
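
The variance argument for bagging can be checked with a small simulation. In this sketch the "model" is simply the mean of a bootstrap resample, and the data set (30 Gaussian points plus one outlier at 100), the seed, and the function names are all assumptions chosen for illustration. It compares how much a single bootstrap estimate fluctuates against an average over 50 of them.

```python
import random
import statistics


def bootstrap_mean(data, rng):
    """Fit the toy 'model': the mean of one bootstrap resample of data."""
    sample = [rng.choice(data) for _ in data]
    return statistics.fmean(sample)


def bagged_mean(data, rng, n_models=50):
    """Average the toy model over n_models independent bootstrap resamples."""
    estimates = [bootstrap_mean(data, rng) for _ in range(n_models)]
    return statistics.fmean(estimates)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.gauss(10.0, 1.0) for _ in range(30)] + [100.0]  # one outlier

# Repeat each procedure 200 times and measure how much the result varies.
single = [bootstrap_mean(data, rng) for _ in range(200)]
bagged = [bagged_mean(data, rng) for _ in range(200)]
var_single = statistics.pvariance(single)
var_bagged = statistics.pvariance(bagged)
```

A single resample swings widely depending on how many times it happens to draw the outlier, whereas the bagged average smooths that out, so `var_bagged` comes out far smaller than `var_single`.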
==========================================================

Streaming fragment assignment for real-time analysis of sequencing experiments

Tips:

Abundance units are easy to misuse when estimating expression, and choosing the right one is a critical step:

RPKM: Reads Per Kilobase of exon model per Million mapped reads. It is mainly used to quantify single-end RNA-seq data.
RPKM = total exon reads / (mapped reads (millions) * exon length (kb))
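
A worked example of the RPKM formula, as a minimal sketch. The function name `rpkm`, its parameter names, and the numbers are illustrative and not taken from any particular tool.

```python
def rpkm(exon_reads, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    mapped_millions = total_mapped_reads / 1e6  # library size in millions
    length_kb = exon_length_bp / 1e3            # exon model length in kb
    return exon_reads / (mapped_millions * length_kb)


# 500 reads on a 2,000 bp exon model in a library of 10 million mapped reads.
value = rpkm(exon_reads=500, exon_length_bp=2000, total_mapped_reads=10_000_000)
```

With these numbers the denominator is 10 * 2 = 20, so the gene gets an RPKM of 25.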

FPKM:

Fragments Per Kilobase of exon model per Million mapped fragments. It is mainly used to calculate expression levels from paired-end sequencing, where the two reads of a pair count as one fragment.

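TPM:

Transcripts Per Million. Read counts are first normalized by transcript length, then scaled so that the values in a sample sum to one million. It is a refinement of RPKM and can be used to compare different tissues of the same species.
Formula for TPM (recommended software: RSEM):

TPM_i = ((N_i / L_i) * 1000000) / (N_1/L_1 + ... + N_m/L_m)

where N_i is the number of reads mapped to transcript i, L_i is the length of transcript i, and m is the number of transcripts.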
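
The TPM formula can be sketched directly from its definition. The function name `tpm` and the three-transcript example are made up for illustration. Tools such as RSEM estimate the read counts with a statistical model first, which this sketch does not attempt.

```python
def tpm(counts, lengths):
    """Transcripts per million from per-transcript read counts and lengths."""
    # Length-normalized rate N_i / L_i for every transcript.
    rates = [n / length for n, length in zip(counts, lengths)]
    total = sum(rates)
    # Scale so that the values across the sample sum to one million.
    return [rate / total * 1e6 for rate in rates]


counts = [100, 200, 300]      # reads mapped to each transcript
lengths = [1000, 2000, 1000]  # transcript lengths in bases
values = tpm(counts, lengths)
```

The first two transcripts have the same rate (0.1 reads per base) and therefore the same TPM of 200,000, even though their raw counts differ. The third gets 600,000, and the three values sum to one million.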

CPM/RPM:

Counts (or Reads) Per Million mapped reads. Unlike RPKM, there is no correction for exon length.
Formula for RPM:
RPM = total exon reads / mapped reads (millions)
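
A minimal sketch of the CPM/RPM calculation for a whole sample. The function name `cpm` and the counts are illustrative, and the library size is assumed to be the sum of the counts passed in.

```python
def cpm(counts):
    """Counts per million: each count scaled by the library size in millions."""
    mapped_millions = sum(counts) / 1e6
    return [count / mapped_millions for count in counts]


gene_counts = [100, 200, 700]  # reads per gene in a tiny made-up library
values = cpm(gene_counts)
```

Because only library size is corrected for, a long gene and a short gene with the same count get the same CPM.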


Original source: https://www.cnblogs.com/beckygogogo/p/9223911.html
