The latest news from The Hyve on Open Source solutions for bioinformatics


Simple data loading into cBioPortal with Docker

December 28, 2017 | By Sander Tan

In our recent blog post we discussed an easy way to set up a local instance of cBioPortal using Docker. The next step is to load a study into the database.

cBioPortal Datahub contains plenty of public studies with staging files that can be loaded directly into a private cBioPortal instance. An up-to-date list of public studies is maintained here. In this tutorial we will load a study using the same Docker image that was used to set up cBioPortal.

Requirements

Before loading the study, make sure the cBioPortal container and the database container are up and running.
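
One way to check this requirement is to compare the names of the running containers against the ones you expect. The snippet below is a minimal sketch, not part of the official setup: the helper `all_running` is a hypothetical name, and `my-database-container` is a placeholder for whatever you called your database container.

```shell
# Succeeds only if every container name given as an argument appears
# in the newline-separated list of names read from stdin.
all_running() {
    running=$(cat)
    for name in "$@"; do
        if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
            echo "not running: $name" >&2
            return 1
        fi
    done
}

# Example (requires Docker; "my-database-container" is a placeholder name):
# docker ps --format '{{.Names}}' | all_running cbioportal-container my-database-container
```

If the function reports a missing container, start it before continuing with the commands below.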

Commands

The following commands download the TCGA BLCA (bladder cancer) study from cBioPortal Datahub and load it into the local instance. The loading step also generates a validation report, which can point out potential issues in the data.

# Download and extract TCGA bladder cancer staging files
wget https://github.com/thehyve/datahub/raw/master/public/blca_tcga.tar.gz
tar -xvzf blca_tcga.tar.gz
 
# Start a Docker container to load the study
docker run -it --rm --net cbio-net \
    -v "$PWD/blca_tcga:/study:ro" \
    -v "$PWD:/outdir" \
    thehyve/cbioportal:1.14.0 \
    metaImport.py \
        -u http://cbioportal-container:8080/cbioportal \
        -s /study --html=/outdir/report_blca.html -v -o

This will first run the data validator, which checks for common issues and writes them to the report file. If there are no errors, the process will continue and run the data importer. The study is loaded successfully when you see “Updating study status to : 'AVAILABLE' for study: blca_tcga”.
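
If you would rather not scan the console output by eye, the success message can be checked in a saved log. This is a sketch under a few assumptions: the output of the `docker run` command was captured in a file (for example by appending `2>&1 | tee import_blca.log`), the file name `import_blca.log` and the helper `import_succeeded` are hypothetical, and the message text matches the line quoted above.

```shell
# Succeeds if the import log contains the importer's final status line
# for the given study ID.
import_succeeded() {
    log_file=$1
    study_id=$2
    grep -qF "Updating study status to : 'AVAILABLE' for study: $study_id" "$log_file"
}

# Example (assumes the import output was saved to import_blca.log):
# import_succeeded import_blca.log blca_tcga && echo "study loaded"
```

The fixed-string match (`grep -F`) is used so that the quotes and colons in the message are not interpreted as a pattern.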

Finally, restart Tomcat to retrieve any updates made to the database. This is done by restarting the Docker container that runs cBioPortal:

# Restart the cBioPortal container so Tomcat picks up the database changes
docker restart cbioportal-container
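
Tomcat may need a little time to come back up after the restart. A small retry helper can poll the portal until it responds. This is only a sketch: `retry` is a hypothetical helper, and the URL in the example assumes that port 8080 of the container is published on `localhost`, which depends on how you started it.

```shell
# Run a command until it succeeds, up to a maximum number of attempts,
# sleeping a fixed number of seconds between attempts.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Example (assumes the portal is reachable on localhost:8080):
# retry 30 5 curl -sf -o /dev/null http://localhost:8080/cbioportal/
```

The example would try up to 30 times with 5 seconds between attempts and stop as soon as the page answers.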

The TCGA Bladder Cancer study is now visible in your local cBioPortal instance. Further information on loading data and the parameters used can be found on GitHub and in the cBioPortal documentation. For commercial support with data loading or transformation, contact us.