Repository of small ORFs identified by ribosome profiling

This repository reprocesses publicly available datasets with our sORF identification pipeline. First, datasets are manually inspected for data quality using both mapping statistics and FastQC; a summary of the data quality is available on the datasets page. If the data quality is deemed satisfactory, the data are processed using the original parameters described by the authors where appropriate. Datasets supplied by users which are not (yet) publicly available can be provided upon request.

Our pipeline can identify sORFs both in data where only translating ribosomes are captured (snap-freezing, CHX, EME) and in data where both translating and initiating ribosomes are captured (HARR, LTM). When available, the pipeline with initiating ribosomes is preferred, as it provides an additional layer of evidence. Datasets with multiple runs or experiments are pooled when they share both the same cell line and the same treatment method. When multiple treatment types are available, harringtonine (HARR) is preferred over lactimidomycin (LTM) for initiating-ribosome data, and snap-freezing is preferred over cycloheximide (CHX), which in turn is preferred over emetine (EME).
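The pooling and preference rules above can be illustrated with a short Python sketch. This is not the RIBOsORF implementation; the record fields (`run_id`, `cell_line`, `treatment`), the treatment labels, and the function names are assumptions made up for the illustration. It only shows one way the stated rules could be applied: runs are pooled per cell line and treatment, and the most preferred treatment present is then chosen for each ribosome class.

```python
from collections import defaultdict

# Preference orders as stated in the text, most preferred first.
# The string labels themselves are hypothetical.
INITIATING_PREFERENCE = ["HARR", "LTM"]
ELONGATING_PREFERENCE = ["snapfreezing", "CHX", "EME"]


def pool_runs(runs):
    """Group run IDs that share the same cell line and treatment."""
    pools = defaultdict(list)
    for run in runs:
        pools[(run["cell_line"], run["treatment"])].append(run["run_id"])
    return dict(pools)


def pick_treatment(available, preference):
    """Return the most preferred treatment that is available, or None."""
    for treatment in preference:
        if treatment in available:
            return treatment
    return None


def select_pools(runs, cell_line):
    """Choose the pooled runs to use for one cell line."""
    pools = pool_runs(runs)
    available = {t for (c, t) in pools if c == cell_line}
    elongating = pick_treatment(available, ELONGATING_PREFERENCE)
    initiating = pick_treatment(available, INITIATING_PREFERENCE)
    return {
        # An empty list means no data of that class was found.
        "elongating": pools.get((cell_line, elongating), []),
        "initiating": pools.get((cell_line, initiating), []),
    }


if __name__ == "__main__":
    # Hypothetical run records, for illustration only.
    example_runs = [
        {"run_id": "run1", "cell_line": "cellA", "treatment": "CHX"},
        {"run_id": "run2", "cell_line": "cellA", "treatment": "CHX"},
        {"run_id": "run3", "cell_line": "cellA", "treatment": "LTM"},
    ]
    print(select_pools(example_runs, "cellA"))
```

When the returned `initiating` list is non-empty, the text's preference for the pipeline variant that uses initiating ribosomes would apply; otherwise only the translating-ribosome data are available.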

The RIBOsORF pipeline is not (yet) publicly available; however, it can be provided upon request.