Intermediate data for TE calculation
Description
This dataset includes intermediate data from RiboBase used to calculate translation efficiency (TE). The code to generate the files can be found at https://github.com/CenikLab/TE_model.
We uploaded demo HeLa .ribo files. Because the full dataset requires a large amount of storage, we recommend contacting Dr. Can Cenik directly to request access to the complete version of RiboBase if you need the original data.
A detailed explanation of each file:
human_flatten_ribo_clr.rda: ribosome profiling clr (centered log-ratio) normalized data with GEO GSM ids in columns and genes in rows in human.
human_flatten_rna_clr.rda: matched RNA-seq clr normalized data with GEO GSM ids in columns and genes in rows in human.
human_flatten_te_clr.rda: TE clr data with GEO GSM ids in columns and genes in rows in human.
human_TE_cellline_all_plain.csv: TE clr data with genes in rows and cell lines in columns in human.
human_RNA_rho_new.rda: matched RNA-seq proportional similarity data as genes by genes matrix in human.
human_TE_rho.rda: TE proportional similarity data as genes by genes matrix in human.
mouse_flatten_ribo_clr.rda: ribosome profiling clr normalized data with GEO GSM ids in columns and genes in rows in mouse.
mouse_flatten_rna_clr.rda: matched RNA-seq clr normalized data with GEO GSM ids in columns and genes in rows in mouse.
mouse_flatten_te_clr.rda: TE clr data with GEO GSM ids in columns and genes in rows in mouse.
mouse_TE_cellline_all_plain.csv: TE clr data with genes in rows and cell lines in columns in mouse.
mouse_RNA_rho_new.rda: matched RNA-seq proportional similarity data as genes by genes matrix in mouse.
mouse_TE_rho.rda: TE proportional similarity data as genes by genes matrix in mouse.
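For readers unfamiliar with the clr (centered log-ratio) transform named in the file descriptions, here is a minimal sketch in base R. It assumes a matrix with genes in rows and samples in columns and applies the transform within each sample. The function name `clr_transform`, the pseudocount of 1 used to avoid log(0), and the toy counts are illustrative assumptions; the pipeline in the repository may handle zeros and filtering differently.

```r
# Centered log-ratio (clr) transform, applied per sample (column).
# Assumption: genes in rows, samples in columns.
# Assumption: a pseudocount is added so that zero counts can be logged.
clr_transform <- function(counts, pseudocount = 1) {
  log_counts <- log(as.matrix(counts) + pseudocount)
  # Subtract each sample's mean log value (the log of its geometric mean)
  sweep(log_counts, 2, colMeans(log_counts), "-")
}

# Toy counts, not real data
toy_counts <- matrix(
  c(10, 0, 5,
    3, 7, 20),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("GSM_1", "GSM_2"))
)
toy_clr <- clr_transform(toy_counts)

# After the transform, the values within each sample sum to zero
colSums(toy_clr)
```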
All samples passed quality control, leaving 1054 human samples and 835 mouse samples. The quality control criteria were:
* coverage > 0.1 X
* CDS percentage > 70%
* R2 between RNA-seq and ribosome profiling >= 0.188 (to remove outliers)
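The three thresholds above could be applied to a per-sample QC table roughly as follows. This is only a sketch: the column names (`coverage`, `cds_pct`, `r2_rna_ribo`), the sample ids, and the values are hypothetical and do not come from RiboBase.

```r
# Hypothetical per-sample QC table; names and values are made up for illustration
qc <- data.frame(
  sample      = c("GSM_A", "GSM_B", "GSM_C", "GSM_D"),
  coverage    = c(0.50, 0.05, 0.30, 0.20),  # coverage in X
  cds_pct     = c(85, 90, 60, 75),          # percentage of reads in CDS
  r2_rna_ribo = c(0.60, 0.70, 0.50, 0.10),  # R2 between RNA-seq and ribosome profiling
  stringsAsFactors = FALSE
)

# Keep only samples that satisfy all three criteria
passed <- subset(
  qc,
  coverage > 0.1 & cds_pct > 70 & r2_rna_ribo >= 0.188
)

passed$sample
```

In this toy table, only GSM_A satisfies all three criteria; each of the other samples fails exactly one of them.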
All ribosome profiling data here are non-deduplicated and winsorized, and are paired with RNA-seq data that are deduplicated and not winsorized. The word "flatten" in the file names is only a naming convention.
Code
If you need to read the .rda data, use load("rdaname.rda") in R.
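Because load() restores objects under the names they were saved with, it can help to load into a separate environment and check what the file contains. The sketch below demonstrates this on a small temporary file; the helper name `load_rda` and the toy matrix are illustrative assumptions, and the object names stored in the actual .rda files are not stated here.

```r
# Load an .rda file into its own environment and return its objects as a named list
load_rda <- function(path) {
  env <- new.env()
  object_names <- load(path, envir = env)  # load() returns the restored object names
  mget(object_names, envir = env)
}

# Demonstration with a temporary file (toy data, not from this dataset)
tmp <- tempfile(fileext = ".rda")
demo_matrix <- matrix(
  1:6,
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("GSM_1", "GSM_2", "GSM_3"))
)
save(demo_matrix, file = tmp)

loaded <- load_rda(tmp)
names(loaded)            # names of the objects stored in the file
dim(loaded$demo_matrix)  # genes in rows, samples in columns
```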
If you need to calculate proportional similarity from clr data:
library(propr)
human_TE_homo_rho <- propr:::lr2rho(as.matrix(clr_data))
rownames(human_TE_homo_rho) <- colnames(human_TE_homo_rho) <- rownames(clr_data)
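For reference, the proportionality metric rho between two clr-transformed features x and y is 1 - var(x - y) / (var(x) + var(y)). The base R sketch below computes it for every pair of columns without relying on propr internals. It assumes samples in rows and genes in columns, so a genes-in-rows matrix would need t() first. The function name `clr_to_rho` and the toy values are illustrative assumptions, not part of the repository code.

```r
# Pairwise proportionality (rho) between the columns of a clr matrix.
# Assumption: samples in rows, genes in columns.
clr_to_rho <- function(clr) {
  clr <- as.matrix(clr)
  cov_mat <- stats::cov(clr)            # gene-by-gene covariance
  v <- diag(cov_mat)                    # per-gene variance
  var_sum <- outer(v, v, "+")           # var(x_i) + var(x_j)
  var_diff <- var_sum - 2 * cov_mat     # var(x_i - x_j)
  rho <- 1 - var_diff / var_sum         # NaN if both genes have zero variance
  dimnames(rho) <- list(colnames(clr), colnames(clr))
  rho
}

# Toy values for illustration only (random numbers, not real clr data)
set.seed(1)
toy <- matrix(
  rnorm(15),
  nrow = 5,
  dimnames = list(paste0("GSM_", 1:5), c("geneA", "geneB", "geneC"))
)
toy_rho <- clr_to_rho(toy)

dim(toy_rho)   # genes by genes
diag(toy_rho)  # each gene is perfectly proportional to itself
```

The result is a symmetric genes by genes matrix, the same shape as the rho files described above.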
Files
(4.2 GB)
| Name | MD5 | Size |
|---|---|---|
| human_flatten_ribo_clr.rda | b1a92cb1791955616a2e4d4de79e260b | 34.1 MB |
| human_flatten_rna_clr.rda | dcb48f58b2a44f5a9aa17d440866d46f | 33.8 MB |
| human_flatten_te_clr.rda | bae3ffdf91183c53cd5b8bab52aa840f | 90.2 MB |
| human_RNA_rho_new.rda | 3e54e62877adebb0b3fdc2d51ae41309 | 943.8 MB |
| human_TE_cellline_all_plain.csv | 45c9336ecf4ea8d12b1c81bb9304af74 | 15.7 MB |
| human_TE_rho.rda | 63f7230e0fcdf64d65a5f76dd11c0648 | 944.1 MB |
| mouse_flatten_ribo_clr.rda | 7e3fc2071d65ef88a60744699f1d719a | 26.4 MB |
| mouse_flatten_rna_clr.rda | 1cf927f7d66409e25fce27327bfd977d | 27.6 MB |
| mouse_flatten_te_clr.rda | 065d0bbb0635f98e1f7cf68a75c8ae22 | 73.1 MB |
| mouse_RNA_rho_new.rda | 98c3bb06ad803b0f76a115a2bbe10596 | 993.5 MB |
| mouse_TE_cellline_all_plain.csv | 8cdfc2b233c46b9c704160055b913c4a | 14.2 MB |
| mouse_TE_rho.rda | 3991de0778b8210762f591406f819a75 | 993.1 MB |
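A downloaded file can be checked against the md5 value listed for it. The sketch below uses tools::md5sum from the R standard distribution; the helper name `verify_md5` is an illustrative assumption, and the demonstration runs on a temporary file rather than on a file from this dataset.

```r
# Compare a file's md5 checksum with an expected value (case-insensitive)
verify_md5 <- function(path, expected_md5) {
  actual <- unname(tools::md5sum(path))  # NA if the file does not exist
  identical(tolower(actual), tolower(expected_md5))
}

# Demonstration on a temporary file; replace with a downloaded file
# and the md5 value listed for it
tmp_file <- tempfile()
writeLines("demo", tmp_file)
expected <- unname(tools::md5sum(tmp_file))

verify_md5(tmp_file, expected)                            # TRUE
verify_md5(tmp_file, "00000000000000000000000000000000")  # FALSE
```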