small RNA study notes (8): degradome analysis

Date:2025-07-05 13:57:52

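1. What is the degradome (degradome sequencing), and how does the analysis work?

In plants, a miRNA usually pairs tightly with its target sequence and cleaves the mRNA directly between the bases paired to positions 10 and 11 of the miRNA, thereby regulating expression of the downstream mRNA. Cleavage splits the mRNA into two fragments, one of which keeps the 3' poly(A) tail but has no cap at its 5' end. Degradome sequencing is sequencing that specifically captures these fragments.

So what does the analysis look like?

First, align the degradome reads to the transcripts and focus on the positions where the depth is markedly higher than elsewhere. Take roughly 15 bp upstream and downstream of the 5' end at each of these positions to form set A. Then compare the small RNA sequences against set A to see whether they are reverse complementary. (Equivalently: align the small RNAs to the transcripts, call the aligned sites set B, and check whether set A and set B overlap.) If they are, record the pairing between the small RNA and the transcript.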
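The matching idea above can be sketched in a few lines of shell. This is only a toy illustration, not how CleaveLand4 works internally: the function names revcomp, window and has_site are hypothetical, the match is an exact substring test (real miRNA target sites tolerate mismatches), and the transcript is assumed to use the DNA alphabet.

```shell
#!/usr/bin/env bash
# Toy sketch of the "set A vs. small RNA" comparison (exact matches only).

# Reverse-complement a sequence; U is treated like T, output is DNA.
revcomp() {
  printf '%s\n' "$1" | tr 'acgtuACGTU' 'tgcaaTGCAA' |
    awk '{ out = ""; for (i = length($0); i >= 1; i--) out = out substr($0, i, 1); print out }'
}

# window SEQ POS FLANK: FLANK bases before the 1-based position POS,
# plus FLANK bases starting at POS (clipped at the start of the transcript).
window() {
  awk -v s="$1" -v p="$2" -v f="$3" 'BEGIN {
    start = p - f
    if (start < 1) start = 1
    print substr(s, start, p + f - start)
  }'
}

# has_site WINDOW SRNA: succeeds if the reverse complement of SRNA
# occurs in WINDOW.
has_site() {
  case "$1" in
    *"$(revcomp "$2")"*) return 0 ;;
    *) return 1 ;;
  esac
}

revcomp UGAC   # prints GTCA
```

In practice you would call window with a flank of about 15 around each high-depth position and then test every small RNA against that window with has_site.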

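From this principle it follows that degradome analysis needs three main data files:

Degradome sequencing data

Transcript sequences

Small RNA sequences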

2. Installation

The following Perl modules and programs must be in place beforehand:

Getopt::Std

Math::CDF

bowtie (version 0.12.x or 1.x)

bowtie-build

RNAplex (from Vienna RNA package)

GSTAr.pl (Version 1.0 or higher -- distributed with CleaveLand4)

samtools

# Make sure all of them have been added to the PATH environment variable

For installation details see https://github.com/MikeAxtell/CleaveLand4.git; I will not repeat them here.

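3. An example

# If several Perl installations are available, call the one that the prerequisite modules were installed for, using its absolute path.

perl CleaveLand4.pl \
-e degradome.clean.fa.gz \
-u osa-miR211a.fasta \
-n xxx.fasta \
-p 0.05 -c 2 \
-t -o ./test_plot > out.txt

# -e: Path to FASTA-formatted degradome reads

# -u: Path to FASTA-formatted small RNA queries

# -n: Path to FASTA-formatted transcriptome

# -p: p-value threshold, used to filter out less reliable small RNA/target pairs

# -c: maximum degradome category to report; there are five categories, 0-4, and a smaller number means a more reliable pairing

# -t: write the output as a tab-delimited table

# -o: name of the folder that will hold the T-plots; note that it must not be a nested path with a subfolder, e.g. ./test/plot1 does not work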

A look at the results (this listing is from a run whose outputs were named out2.txt and test2_plot):

$ ls

out2.txt degradome.clean.fa.gz_dd.txt test2_plot

$ head -n 30 degradome.clean.fa.gz_dd.txt # the result of aligning the degradome reads to the transcriptome -hsy

# CleaveLand4 degradome density

# Wed Mar 6 10:19:26 CST 2019

# Degradome Reads:./degradome.clean.fa.gz

# Transcriptome:./xxx.fasta

# TranscriptomeCharacters:12257028

# Mean Degradome Read Size:20 # mean length of the degradome reads -hsy

# Estimated effective Transcriptome Size:11901028

# Category 0:8011 # these categories are explained further down -hsy

# Category 1:1616

# Category 2:9783

# Category 3:10382

# Category 4:74247

@ID:chr1A:4210110-4210858

@LN:748

77 1 4 # position, number of reads, category

78 1 4

621 1 4

626 1 4

634 2 1

649 2 1

698 1 4

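# out2.txt holds the pairings between small RNA sequences and transcript sequences (positions), plus some metrics

# The test2_plot folder holds the T-plots: one plot for each record in out2.txt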
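To pull the interesting positions out of a density file like the one shown above, a small awk filter may be enough. This is a sketch under assumptions read off the listing: header lines start with "#", each transcript starts with an "@ID:" line followed by "@LN:", data lines hold position, read count and category separated by whitespace, and the "-hsy" annotations are my comments rather than part of the file. The function name dd_filter is hypothetical.

```shell
#!/usr/bin/env bash
# dd_filter FILE MAX_CATEGORY
# Print transcript ID, position, read count and category (tab-separated)
# for every position whose category is <= MAX_CATEGORY.
dd_filter() {
  awk -v max="$2" '
    /^#/    { next }                      # header comments
    /^@ID:/ { id = substr($0, 5); next }  # remember the current transcript
    /^@/    { next }                      # other @ lines, e.g. @LN:
    NF >= 3 && $3 <= max { print id "\t" $1 "\t" $2 "\t" $3 }
  ' "$1"
}

# Usage: keep categories 0-2, in line with -c 2 in the example
# dd_filter degradome.clean.fa.gz_dd.txt 2
```

For the transcript shown above, this would keep only positions 634 and 649 (both category 1).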

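4. About the software

CleaveLand defines several modes and categories; here they are.

Modes:

1. Align degradome data, align small RNA queries, and analyze. # the mode used in my example above: the three raw input files are passed in directly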

REQUIRED OPTIONS: -e, -u, -n

DISALLOWED OPTIONS: -d, -g

2. Use existing degradome density file, align small RNA queries, and analyze. # -d: e.g. the degradome.clean.fa.gz_dd.txt file produced by my example above

REQUIRED OPTIONS: -d, -u, -n

DISALLOWED OPTIONS: -e, -g

3. Align degradome data, use existing small RNA query alignments, and analyze. # -g: the alignments of the small RNAs to the transcriptome

REQUIRED OPTIONS: -e, -n, -g

DISALLOWED OPTIONS: -d, -u

IRRELEVANT OPTIONS: -a, -r

4. Use existing degradome density file and existing small RNA query alignments, and analyze.

REQUIRED OPTIONS: -d, -g

DISALLOWED OPTIONS: -e, -u

IRRELEVANT OPTIONS: -a, -r

Categories:

Category 4: Just one read at that position

Category 3: >1 read, but below or equal to the average* depth of coverage on the transcript

Category 2: >1 read, above the average* depth, but not the maximum on the transcript

Category 1: >1 read, equal to the maximum on the transcript, when there is >1 position at maximum value

Category 0: >1 read, equal to the maximum on the transcript, when there is just 1 position at the maximum value
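The category rules can be made concrete with a short awk sketch. It is not taken from the CleaveLand4 source. Two points are my assumptions: the average depth is computed as total reads divided by the number of positions that have at least one read (the footnote behind "average*" is not reproduced here), and when a position is both at the maximum and not above the average, the Category 3 rule is applied first. The function name categorize is hypothetical.

```shell
#!/usr/bin/env bash
# categorize: read "position reads" lines for ONE transcript on stdin and
# print "position<TAB>reads<TAB>category".
categorize() {
  awk '
    BEGIN { max = 0 }
    NF >= 2 {
      k++
      pos[k] = $1; n[k] = $2 + 0
      sum += n[k]
      if (n[k] > max) max = n[k]
    }
    END {
      if (k == 0) exit
      avg = sum / k                        # ASSUMPTION: mean over covered positions
      for (i = 1; i <= k; i++) if (n[i] == max) at_max++
      for (i = 1; i <= k; i++) {
        if (n[i] == 1)        c = 4        # a single read
        else if (n[i] <= avg) c = 3        # >1 read, at or below the average
        else if (n[i] < max)  c = 2        # above the average, not the maximum
        else if (at_max > 1)  c = 1        # maximum, shared by several positions
        else                  c = 0        # unique maximum
        print pos[i] "\t" n[i] "\t" c
      }
    }'
}

# Usage: the transcript from the density file listing
printf '77 1\n78 1\n621 1\n626 1\n634 2\n649 2\n698 1\n' | categorize
```

With the read counts from the listing, positions 634 and 649 share the maximum of 2 reads and come out as Category 1, and the single-read positions as Category 4, which agrees with the density file.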

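5. Another example

What if there are several small RNAs to query?

Split the file that contains the small RNAs into one FASTA file per small RNA and run mode 1 on each in turn, as in my example above.

That is still not very economical, though, because the alignment of the degradome data to the transcriptome is identical every time. So mode 1 can be run once, and the remaining n-1 queries can all be run in mode 2.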
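The splitting step could look roughly like this. It is a sketch: the function name split_fasta is hypothetical, and the num<i>_miRNA.fasta naming simply follows the file names used in the loop in this section. Each record (a ">" header plus its sequence lines) goes into its own file, numbered in input order.

```shell
#!/usr/bin/env bash
# split_fasta INPUT.fasta OUTDIR
# Write record i of INPUT.fasta to OUTDIR/num<i>_miRNA.fasta.
split_fasta() {
  mkdir -p "$2"
  awk -v dir="$2" '
    /^>/ {
      if (out != "") close(out)            # avoid running out of file handles
      n++
      out = dir "/num" n "_miRNA.fasta"
    }
    out != "" { print > (out) }            # lines before the first header are skipped
  ' "$1"
}

# Usage (all_miRNA.fasta is a placeholder file name):
# split_fasta all_miRNA.fasta ./queries
```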

For example, with 10 small RNAs:

# path1..path6 and sample are set beforehand; the paths are assumed to be absolute and to end with "/"
for i in {1..10}; do
mkdir -p "${path6}${sample}/${i}"
cd "${path6}${sample}/${i}" || exit 1
if [ "${i}" -eq 1 ]; then
# first query: mode 1, which also writes the degradome density file next to the reads
${path1}perl ${path2}CleaveLand4.pl -e ${path3}degradome.clean.fa.gz \
-u ${path4}num${i}_miRNA.fasta -n ${path5}${sample}.xxx.fasta \
-p 0.05 -c 2 -t -o ./plot${i} > out${i}.txt
else
# remaining queries: mode 2, reusing the density file from the first run
${path1}perl ${path2}CleaveLand4.pl -d ${path3}degradome.clean.fa.gz_dd.txt \
-u ${path4}num${i}_miRNA.fasta -n ${path5}${sample}.xxx.fasta \
-p 0.05 -c 2 -t -o ./plot${i} > out${i}.txt
fi
done
