The latest news from The Hyve on Open Source solutions for bioinformatics


Simple data loading into cBioPortal with Docker

December 28, 2017 | By Sander Tan

In our recent blog post we discussed an easy way to set up a local instance of cBioPortal using Docker. The next step is to load a study into the database.

In cBioPortal Datahub, there are plenty of public studies that have staging files available for loading directly into a private cBioPortal instance. An up-to-date list of public studies is maintained here.

In this tutorial we will load a study using the same Docker image that was used to set up cBioPortal.

 

Requirements
Before loading the study, make sure the cBioPortal container and the database container are up and running. We will also generate a validation report, which will point out potential issues in our data. In this tutorial, we will use the staging files for the TCGA Bladder Cancer study.
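
One way to confirm that both containers are running is to compare the names reported by docker ps with the names you expect. The following is a minimal sketch, not part of the original setup: the helper name missing_containers is hypothetical, and cbio-db-container is only a placeholder for whatever your database container is called.

```shell
#!/bin/sh
# missing_containers (hypothetical helper): reads running container names
# on stdin, one per line, and prints each expected name that is absent.
# Returns 0 when every expected container is listed, 1 otherwise.
missing_containers() {
    running=$(cat)
    rc=0
    for name in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
            echo "not running: $name"
            rc=1
        fi
    done
    return $rc
}
```

It can be fed from Docker directly, for example: docker ps --format '{{.Names}}' | missing_containers cbioportal-container cbio-db-container (replace the second name with your own database container).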


Commands
# Download and extract TCGA bladder cancer staging files
wget https://github.com/thehyve/datahub/raw/master/public/blca_tcga.tar.gz
tar -xvzf blca_tcga.tar.gz

# Start a Docker container to load the study
docker run -it --rm --net cbio-net \
-v "$PWD/blca_tcga:/study:ro" \
-v "$PWD:/outdir" \
cbioportal-image \
metaImport.py -u http://cbioportal-container:8080/cbioportal -s /study --html=/outdir/report_blca.html -v -o

This runs the data validator first, which checks for common issues and writes them to the report file. If there are no errors, it continues and imports the data. The study has been loaded successfully when you see “Updating study status to : 'AVAILABLE' for study: blca_tcga”.
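
If you save the importer's console output to a file, the success message can be checked automatically instead of by eye. This is a minimal sketch, assuming the message appears in the log with the wording quoted above; the function name study_loaded and the log file name used below are illustrative and not part of cBioPortal.

```shell
#!/bin/sh
# study_loaded (hypothetical helper): succeeds if the log file contains the
# "AVAILABLE" status line for the given study ID.
# Assumes the message text matches the one quoted in this post.
# Note: this is a substring match, so it also matches longer study IDs
# that start with the given one.
study_loaded() {
    log_file=$1
    study_id=$2
    grep -qF -- "Updating study status to : 'AVAILABLE' for study: $study_id" "$log_file"
}
```

For example, study_loaded import.log blca_tcga && echo "study loaded" prints a confirmation only when the status line is present in import.log.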

The last thing to do is restart Tomcat to retrieve any updates made to the database. This is done by restarting the Docker container that runs cBioPortal:

# Restart the cBioPortal container
docker restart cbioportal-container

In your local cBioPortal instance, e.g. http://localhost:8081/cbioportal, the TCGA Bladder Cancer study is now visible.
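
After a restart the portal may need some time before it responds again. A small polling loop can wait for it, sketched here under the assumption that curl is installed on the host; wait_for_url is a hypothetical helper name, and the default attempt count and delay are arbitrary.

```shell
#!/bin/sh
# wait_for_url (hypothetical helper): polls a URL with curl until it answers
# without an HTTP error, or gives up after a number of attempts.
#   $1 = URL
#   $2 = maximum attempts (default 30)
#   $3 = seconds to sleep between attempts (default 5)
wait_for_url() {
    url=$1
    attempts=${2:-30}
    delay=${3:-5}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_for_url http://localhost:8081/cbioportal returns once the local instance answers, or fails after the last attempt.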

Further information on loading data and the parameters used can be found on GitHub and in the cBioPortal documentation. For commercial support with data loading or transformation, contact us.