0. Datasets
sORFs.org reprocesses publicly available datasets with our sORF identification pipeline. First, datasets
are manually inspected for data quality using both mapping statistics and FastQC; a summary of the data quality
is available on the datasets page. If the data quality is deemed satisfactory,
the data are processed using the original parameters described by the authors where appropriate. User-supplied datasets that are
not (yet) publicly available can be provided upon request.
Our pipeline can identify sORFs both in data where translating ribosomes are captured (snap freezing, cycloheximide [CHX], emetine [EME]) and in
data where both translating and initiating ribosomes are captured (harringtonine [HARR], lactimidomycin [LTM]). When initiating-ribosome data are available,
the pipeline that uses them is preferred, as it provides an additional layer of evidence. Multiple runs or experiments within a dataset are pooled when they share
both the same cell line and the same treatment method. When multiple treatment types are available, harringtonine is preferred over lactimidomycin for initiating-ribosome data, and snap freezing
is preferred over cycloheximide, which in turn is preferred over emetine.
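The pooling and preference rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the RIBOsORF implementation: the function names, the run record fields (run_id, cell_line, treatment), the treatment labels, and the choice to select one treatment per cell line are all assumptions made for the example.

```python
from collections import defaultdict

# Preference orders as stated in the text, most preferred first.
# The string labels themselves are hypothetical.
INITIATING_PREFERENCE = ["harringtonine", "lactimidomycin"]
ELONGATING_PREFERENCE = ["snap_freezing", "cycloheximide", "emetine"]


def pool_runs(runs):
    """Pool run identifiers that share the same cell line and treatment."""
    pools = defaultdict(list)
    for run in runs:
        pools[(run["cell_line"], run["treatment"])].append(run["run_id"])
    return dict(pools)


def pick_treatment(available, preference):
    """Return the most preferred treatment present in `available`, or None."""
    for treatment in preference:
        if treatment in available:
            return treatment
    return None


def select_pools(runs):
    """For each cell line, choose the preferred pool of each data type."""
    pools = pool_runs(runs)
    selection = {}
    for cell_line in sorted({cl for cl, _ in pools}):
        treatments = {t for cl, t in pools if cl == cell_line}
        chosen = {}
        for kind, preference in (
            ("initiating", INITIATING_PREFERENCE),
            ("elongating", ELONGATING_PREFERENCE),
        ):
            treatment = pick_treatment(treatments, preference)
            # None means no data of this type exists for the cell line.
            chosen[kind] = (
                (treatment, pools[(cell_line, treatment)]) if treatment else None
            )
        selection[cell_line] = chosen
    return selection
```

Given a list of run records, select_pools returns, per cell line, the chosen treatment and its pooled run identifiers for the initiating and elongating data types, with None where no data of that type exist.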
The navigation bar on the left represents the different steps in our RIBOsORF pipeline; click on a step for more information. Additional information can be obtained
from our sORFs.org publication or by contacting us.
The RIBOsORF pipeline is not (yet) publicly available; however, it can be provided upon request.