The cBioPortal Datahub hosts many public studies whose staging files can be loaded directly into a private cBioPortal instance. An up-to-date list of public studies is maintained here.
In this tutorial we will load a study using the same Docker image that was used to set up cBioPortal.
We will use the staging files for the TCGA Bladder Cancer study and also generate a validation report, which points out potential issues in the data. Before loading the study, make sure the cBioPortal container and the database container are up and running.
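Whether both containers are up can be read from the output of `docker ps`. The following is a minimal sketch, not part of cBioPortal or Docker: `all_running` is a hypothetical helper that reads container names from standard input and reports any expected name that is missing. The database container name is not given in this tutorial, so it appears as a placeholder in the usage comment.

```shell
# Hypothetical helper: reads running container names from stdin (one per
# line) and fails if any of the names passed as arguments is missing.
all_running() {
  running=$(cat)
  missing=0
  for name in "$@"; do
    # -x matches whole lines only, -F treats the name as a fixed string
    if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
      echo "not running: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (replace DATABASE_CONTAINER with the name of your database container):
#   docker ps --format '{{.Names}}' | all_running cbioportal-container DATABASE_CONTAINER
```

The helper exits with status 0 only when every listed container is running, so it can guard the import step in a script.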
# Extract the TCGA bladder cancer staging files (download blca_tcga.tar.gz first)
tar -xvzf blca_tcga.tar.gz
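# Start a Docker container to load the study.
# CBIOPORTAL_IMAGE is a placeholder: set it to the image used to set up cBioPortal.
# Adjust the /study mount if the archive extracted to a different directory.
docker run -it --rm --net cbio-net \
    -v "$PWD/blca/tcga:/study:ro" \
    -v "$PWD:/outdir" \
    "$CBIOPORTAL_IMAGE" \
    metaImport.py -u http://cbioportal-container:8080/cbioportal -s /study --html=/outdir/report_blca.html -v -o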
This runs the data validator first, which checks for common issues and writes them to the report file. If there are no errors, it continues and imports the data. The study has been loaded successfully when you see “Updating study status to : 'AVAILABLE' for study: blca_tcga”.
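In a scripted setup, the success message can be checked automatically. The sketch below assumes the importer output was saved to a file, for example by appending `| tee import.log` to the `docker run` command. Both the file name and the `study_loaded` helper are hypothetical.

```shell
# Hypothetical helper: succeeds if the saved importer output contains the
# success line for the given study ID.
# Only the tail of the message is matched, so spacing differences earlier in
# the line do not matter. Note that a longer ID starting with the same text
# would also match.
study_loaded() {
  log_file=$1
  study_id=$2
  grep -qF -- "'AVAILABLE' for study: $study_id" "$log_file"
}

# Usage:
#   study_loaded import.log blca_tcga && echo "study loaded"
```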
Finally, restart Tomcat so that it picks up the changes made to the database. Do this by restarting the Docker container that runs cBioPortal:
docker restart cbioportal-container
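After the restart, the portal may take a while before it answers requests again. The following is a minimal sketch for waiting until it does, assuming `curl` is installed. `wait_for_url` is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: polls a URL until curl reports success (following
# redirects, failing on HTTP error statuses), or gives up after a number
# of attempts.
wait_for_url() {
  url=$1
  attempts=${2:-30}   # how many times to try
  delay=${3:-5}       # seconds to wait between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsL -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    # Do not sleep after the final failed attempt
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   wait_for_url http://localhost:8081/cbioportal 30 5 && echo "portal is up"
```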
In your local cBioPortal instance, e.g. http://localhost:8081/cbioportal, the TCGA Bladder Cancer study is now visible.
Further information on loading data and the used parameters can be found on GitHub and in the cBioPortal documentation.