Published December 13, 2023 | Version v1
Dataset Open

Intermediate data for TE calculation

Creators

Description

This dataset contains intermediate data derived from RiboBase and used to calculate translation efficiency (TE). The code that generates the files is available at https://github.com/CenikLab/TE_model.

We uploaded demo HeLa .ribo files. Because the full dataset requires a large amount of storage, we recommend contacting Dr. Can Cenik directly to request access to the complete version of RiboBase if you need the original data.

Detailed explanation of each file:

human_flatten_ribo_clr.rda: ribosome profiling data, clr (centered log-ratio) normalized, with GEO GSM ids in columns and genes in rows in human.

human_flatten_rna_clr.rda: matched RNA-seq clr normalized data with GEO GSM ids in columns and genes in rows in human.

human_flatten_te_clr.rda: TE clr data with GEO GSM ids in columns and genes in rows in human.

human_TE_cellline_all_plain.csv: TE clr data with genes in rows and cell lines in columns in human.

human_RNA_rho_new.rda: matched RNA-seq proportional similarity data as genes by genes matrix in human.

human_TE_rho.rda: TE proportional similarity data as genes by genes matrix in human.

mouse_flatten_ribo_clr.rda: ribosome profiling clr normalized data with GEO GSM ids in columns and genes in rows in mouse.

mouse_flatten_rna_clr.rda: matched RNA-seq clr normalized data with GEO GSM ids in columns and genes in rows in mouse.

mouse_flatten_te_clr.rda: TE clr data with GEO GSM ids in columns and genes in rows in mouse.

mouse_TE_cellline_all_plain.csv: TE clr data with genes in rows and cell lines in columns in mouse.

mouse_RNA_rho_new.rda: matched RNA-seq proportional similarity data as genes by genes matrix in mouse.

mouse_TE_rho.rda: TE proportional similarity data as genes by genes matrix in mouse.
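
Several of the files above hold clr normalized values. As a reminder of what that transformation does, here is a minimal sketch in base R. The toy matrix, the sample ids, and the helper name clr_transform are made up for illustration; the actual pipeline, including how zeros are handled, is defined in the TE_model repository and may differ.

```r
# Toy count matrix: genes in rows, samples in columns (made-up values).
counts <- matrix(
  c(10, 20, 70,
    5, 15, 80),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("GSM_demo1", "GSM_demo2"))
)

# clr: log of each value minus the mean log value of its sample (column).
# Zeros must be handled before taking logs; this toy matrix has none.
clr_transform <- function(x) {
  log_x <- log(x)
  sweep(log_x, 2, colMeans(log_x), "-")
}

clr_demo <- clr_transform(counts)
colSums(clr_demo)  # each column sums to zero up to floating-point error
```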

All data passed quality control, leaving 1054 human samples and 835 mouse samples. The criteria were:
 * coverage > 0.1 X
 * CDS percentage > 70%
 * R2 between RNA and RIBO >= 0.188 (remove outliers)
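
The three thresholds can be applied as a simple filter. The sketch below assumes a per-sample QC table; the data frame, its column names (coverage, cds_pct, r2), and all values are hypothetical and are not part of the uploaded files.

```r
# Hypothetical per-sample QC table; column names and values are made up.
qc <- data.frame(
  sample = c("GSM_demo1", "GSM_demo2", "GSM_demo3", "GSM_demo4"),
  coverage = c(0.50, 0.05, 0.30, 0.20),  # coverage in X
  cds_pct = c(85, 90, 60, 75),           # CDS percentage
  r2 = c(0.40, 0.50, 0.30, 0.10),        # R2 between RNA and RIBO
  stringsAsFactors = FALSE
)

# Keep samples that satisfy all three thresholds from the list above.
passed <- subset(qc, coverage > 0.1 & cds_pct > 70 & r2 >= 0.188)
passed$sample
```

In this toy table only GSM_demo1 passes: each of the other three samples fails exactly one criterion.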

All ribosome profiling data here are non-deduplicated and winsorized, and are paired with RNA-seq data that are deduplicated and not winsorized. The word "flatten" in the file names is only a naming convention and does not describe the processing.

#### Code
To read an .rda file in R, use load("rdaname.rda").
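
Note that load() restores objects under the names they were saved with, so the object name may not match the file name. A minimal sketch, using a temporary demo file rather than one of the dataset files (demo.rda and demo_matrix are made-up names):

```r
# Write a small demo .rda so the example is self-contained.
demo_path <- file.path(tempdir(), "demo.rda")
demo_matrix <- matrix(1:4, nrow = 2)
save(demo_matrix, file = demo_path)
rm(demo_matrix)

# load() restores objects under their saved names and returns those names.
loaded_names <- load(demo_path)
loaded_names

# Fetch the first restored object by name.
restored <- get(loaded_names[1])
dim(restored)
```

The same pattern should work for the dataset files by replacing demo_path with the path to the downloaded .rda file.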

If you need to calculate proportional similarity from clr data:
library(propr)
human_TE_homo_rho <- propr:::lr2rho(as.matrix(clr_data))
rownames(human_TE_homo_rho) <- colnames(human_TE_homo_rho) <- rownames(clr_data)
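
For intuition, the proportionality measure rho between two genes a and b can be written as 1 - var(a - b) / (var(a) + var(b)), with variances taken across samples. The sketch below computes it in base R on a made-up matrix with samples in rows and genes in columns. It illustrates the definition only and is not the propr implementation; rho_matrix is a hypothetical helper, and the orientation that propr:::lr2rho expects should be checked against the propr package itself.

```r
# Toy log-ratio matrix: samples in rows, genes in columns (made-up values).
clr_toy <- matrix(
  c( 1.0,  2.0,  3.0,  4.0,
     2.0,  4.0,  6.0,  8.0,
    -1.0, -2.0, -3.0, -4.0),
  ncol = 3,
  dimnames = list(paste0("sample", 1:4), c("geneA", "geneB", "geneC"))
)

rho_matrix <- function(x) {
  v <- apply(x, 2, var)
  # Since var(a - b) = var(a) + var(b) - 2 cov(a, b), the definition
  # 1 - var(a - b) / (var(a) + var(b)) equals 2 cov(a, b) / (var(a) + var(b)).
  2 * cov(x) / outer(v, v, "+")
}

rho_toy <- rho_matrix(clr_toy)
rho_toy
```

In this toy case geneC is the exact negative of geneA, so their rho is -1, while geneB is twice geneA, which gives 0.8 rather than 1 because rho also penalizes a difference in scale.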

Files

Files (4.2 GB)

Checksum Size
md5:b1a92cb1791955616a2e4d4de79e260b 34.1 MB
md5:dcb48f58b2a44f5a9aa17d440866d46f 33.8 MB
md5:bae3ffdf91183c53cd5b8bab52aa840f 90.2 MB
md5:3e54e62877adebb0b3fdc2d51ae41309 943.8 MB
md5:45c9336ecf4ea8d12b1c81bb9304af74 15.7 MB
md5:63f7230e0fcdf64d65a5f76dd11c0648 944.1 MB
md5:7e3fc2071d65ef88a60744699f1d719a 26.4 MB
md5:1cf927f7d66409e25fce27327bfd977d 27.6 MB
md5:065d0bbb0635f98e1f7cf68a75c8ae22 73.1 MB
md5:98c3bb06ad803b0f76a115a2bbe10596 993.5 MB
md5:8cdfc2b233c46b9c704160055b913c4a 14.2 MB
md5:3991de0778b8210762f591406f819a75 993.1 MB