Quantitative Analysis of Pharmacogenomics in Cancer


Quantitative Analysis of Pharmacogenomics in Cancer (QAPC) is a web-based application to access, correlate, and reconcile the drug sensitivity data from the Cancer Cell Line Encyclopedia (CCLE), Genomics of Drug Sensitivity in Cancer (GDSC), and the Cancer Therapeutics Response Portal (CTRP). In this manual, drug sensitivity metrics are highlighted in red, and Portal controls and interface elements are shown in italic blue.


1.    Interactive graphical interface to access drug sensitivity data for 585 drugs/drug combinations and 1201 cell lines.

2.    Choice of 6 drug sensitivity metrics:

EC50 - half maximal effective concentration

IC50 - half maximal inhibitory concentration

AUC EC50 – area under the dose response curve calculated from EC50 model

AUC IC50 – area under the dose response curve calculated from IC50 model

Adjusted AUC EC50 – area under the dose response curve calculated from EC50 model adjusted for the range of tested drug concentrations

Adjusted AUC IC50 – area under the dose response curve calculated from IC50 model adjusted for the range of tested drug concentrations

3.    Option to reconcile data from more than one database using Adjusted AUC EC50 or Adjusted AUC IC50. This permits a fair comparison of the heterogeneous data for drugs analyzed by more than one project and reconciles all data for the drug (including drug/cell line combinations present in one but not both databases) into a combined list for pooled, high-power downstream analysis.

4.    Functionality to visualize dose-response curves for each drug/cell line combination to evaluate the quality of the raw data and the accuracy of the dose response model. 

5.    Graphical representation of the agreement of the drug sensitivity data between the CCLE, GDSC, and CTRP databases with 2D and 3D scatterplots.

6.    Functionality to download raw data, EC50 and IC50 regression model parameters, adjusted AUC calculations, and correlation statistics for the drug.



The systematic analysis of the drug sensitivity data from the large pharmacogenomics studies highlighted the difficulties of comparing heterogeneous pharmacologic data and choosing the best drug sensitivity metric (link to the paper when accepted). The large number of incomplete dose responses prevents the use of traditional drug sensitivity metrics such as IC50 or EC50. AUC depends on the range of drug concentrations tested (which varies significantly between studies) and therefore cannot be compared directly. Previous attempts to match drug responses from CCLE and GDSC provided discordant results (https://www.ncbi.nlm.nih.gov/pubmed/24284626; https://www.ncbi.nlm.nih.gov/pubmed/26570998).

We found that a novel drug sensitivity metric, AUC adjusted for the range of tested drug concentrations (Adjusted AUC), produces the best agreement between the existing databases and allows reconciliation of heterogeneous data for pooled analysis.

Drug sensitivity metrics calculations

1.    Raw dose-response data was modeled using 4-parameter log-logistic regression with the following formula:

where Amin and Amax are the lower and upper asymptotes of the sigmoid curve (no response and the maximal response to the drug, respectively), EC50 is the drug concentration causing an effect equal to 50% of Amax, and Hill is the Hill slope of the dose-response curve. The minimal and maximal asymptotes were constrained to the following ranges: Rmin ≤ Amin ≤ 0 and min(0, Rmin) ≤ Amax ≤ max(100, Rmax), where Rmin and Rmax are the minimal and maximal measured drug responses, respectively.

2.    The coordinate of the upper bend point of the dose response curve was calculated using the formula:

If two or more data points are present with drug concentration x > xbend, the curve was assumed to be complete and the upper asymptote to be estimated accurately. Otherwise, regression analysis was repeated with Amax = 100 (for lack of a better upper asymptote estimate for incomplete curves). Amax = 100 was also set for very low amplitude curves (Amax - Amin < 30) to avoid noise being misinterpreted as a true drug response.

3.    For IC50 estimation, Amin and Amax were set to 0 and 100, respectively (by the definition of IC50). The parameters of the sigmoid curve (IC50 or EC50 modeling) were then used for AUC calculations.

4.    Unadjusted AUC was estimated between the minimal (xmin) and maximal (xmax) tested concentrations (nM) for the curve. AUC was calculated using the following formula:

where f(x) is given by formula (1).

5.    For the purpose of comparing two or more dose-response curves, an adjusted AUC was calculated, where xmin and xmax were set to cover the range of concentrations shared by all curves.
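The calculation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the QAPC implementation. It assumes the commonly used 4-parameter log-logistic form f(x) = Amin + (Amax - Amin) / (1 + (EC50 / x)^Hill), takes the bend point xbend as an input rather than computing it, and assumes that the AUC is integrated on a log10 concentration axis and divided by the width of the range (so that it reads as a mean response between 0 and 100). All function names, parameter values and concentration ranges are hypothetical.

```python
import math

def four_pl(x, amin, amax, ec50, hill):
    """Assumed 4-parameter log-logistic response (%) at concentration x (nM)."""
    return amin + (amax - amin) / (1.0 + (ec50 / x) ** hill)

def needs_refit(concs, x_bend, amin, amax):
    """True if the curve should be refitted with Amax fixed at 100.

    Follows the two rules in the text: fewer than two tested concentrations
    above the upper bend point, or an amplitude (Amax - Amin) below 30.
    """
    points_above = sum(1 for x in concs if x > x_bend)
    return points_above < 2 or (amax - amin) < 30

def shared_range(ranges):
    """Concentration range (xmin, xmax) common to all curves, or None."""
    lo = max(r[0] for r in ranges)
    hi = min(r[1] for r in ranges)
    return (lo, hi) if lo < hi else None

def normalized_auc(params, xmin, xmax, steps=1000):
    """Mean modeled response over [xmin, xmax] by trapezoidal integration.

    Integrating on the log10 scale and normalizing by the range width are
    assumptions of this sketch.
    """
    a, b = math.log10(xmin), math.log10(xmax)
    h = (b - a) / steps
    ys = [four_pl(10 ** (a + i * h), *params) for i in range(steps + 1)]
    area = h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return area / (b - a)

# Hypothetical example: the same underlying curve tested over two ranges (nM).
params = (0.0, 100.0, 50.0, 1.0)          # Amin, Amax, EC50, Hill
ccle_range, gdsc_range = (2.5, 8000.0), (40.0, 10000.0)

raw_ccle = normalized_auc(params, *ccle_range)
raw_gdsc = normalized_auc(params, *gdsc_range)

lo, hi = shared_range([ccle_range, gdsc_range])
adj_ccle = normalized_auc(params, lo, hi)
adj_gdsc = normalized_auc(params, lo, hi)
```

In this example the unadjusted values differ only because the tested ranges differ, while the adjusted values computed over the shared range agree, which is the motivation for the adjustment.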



The QAPC layout consists of the Sidebar Panel on the left, which contains controls for what data is visualized and how, and the Main Panel on the right, which shows the data. The various functions of QAPC are accessed by navigating between the Tab Panels at the top of the Main Panel.


Tab Panels

Summary Tab shows raw or reconciled, sorted or unsorted drug sensitivity data for the drug as well as raw dose-response curves and model estimates.

Correlation Plots Tab illustrates the agreement between drug sensitivity data from CCLE, GDSC and CTRP with 2D and 3D scatterplots.

Help Tab contains this document.

Credentials Tab contains the information about the authors, the link to the companion paper and the contact information for the correspondence.


Sidebar Panel

The Choose Drug select box contains the list of all drugs in the analysis. To avoid scrolling through the long list, a drug name can also be typed into the select box. When the Correlation Plots Tab is selected, the Choose Drug select box shows only drugs that are analyzed by more than one database and can therefore be correlated.

One of the six drug response metrics may be chosen with the Choose Metric select box. Adjusted AUC IC50 and Adjusted AUC EC50 are not available in the Summary Tab. Instead, AUC is adjusted for the range of tested concentrations when the Reconcile Data checkbox is selected.

The Select Database checkboxes determine which databases the displayed data are drawn from.

Data Reconciliation. When the Reconcile Data checkbox is selected, drug sensitivity data from two or three databases are combined using the Adjusted AUC drug sensitivity metric. This option is not active if the drug is represented in only one database, if the IC50 or EC50 drug sensitivity metric is selected (reconciliation cannot be performed for IC50 or EC50 because of non-finite values from incomplete curves), or if fewer than two databases are selected. Data can only be reconciled within the Summary Tab.
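The conditions under which the Reconcile Data option is active can be summarized as a small predicate. This is only a sketch of the rules stated above; the function name, argument names and string values are hypothetical and do not reflect the Portal's actual code.

```python
def can_reconcile(metric, databases_with_drug, selected_databases, tab):
    """Return True when the Reconcile Data option would be active."""
    if tab != "Summary":
        # Data can only be reconciled within the Summary Tab.
        return False
    if metric in ("IC50", "EC50"):
        # Non-finite values from incomplete curves prevent reconciliation.
        return False
    if len(set(databases_with_drug)) < 2:
        # The drug is represented in only one database.
        return False
    if len(set(selected_databases)) < 2:
        # Fewer than two databases are selected.
        return False
    return True

# Hypothetical usage
active = can_reconcile("AUC IC50", ["CCLE", "GDSC"], ["CCLE", "GDSC", "CTRP"], "Summary")
```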

Sort Data checkbox sorts drug responses from the lowest to the highest.


Main Panel

Main Panel content varies depending on the Tab Panel selected.

When the Summary Tab is selected, the Main Panel contains a plot of drug sensitivities. Each dot represents a unique database/drug/cell line combination. The units and the scale of the y-axis depend on the chosen drug sensitivity metric. The raw data points are color coded by source database: red dots for CCLE, blue dots for GDSC, and magenta dots for CTRP. For visualization purposes only (not for analysis or reconciliation), infinite IC50 or EC50 values are capped at the maximal tested concentration from the respective database.
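The capping used for display can be illustrated as follows. This is a minimal sketch with hypothetical names and values; it only shows the idea that a non-finite IC50 or EC50 is drawn at the highest tested concentration while the stored value stays unchanged.

```python
import math

def cap_for_plot(value, max_tested):
    """Return the value to draw on the plot for an IC50 or EC50 (nM).

    Non-finite values (for example Inf from an incomplete curve) are drawn
    at the maximal tested concentration; finite values are left unchanged.
    """
    return value if math.isfinite(value) else max_tested

# Hypothetical IC50 values (nM) and a hypothetical maximal tested concentration
ic50_values = [120.0, float("inf"), 3500.0]
plotted = [cap_for_plot(v, 8000.0) for v in ic50_values]
```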

If data reconciliation is requested, the main plot shows Adjusted AUC IC50 or Adjusted AUC EC50 values, which are averaged if a drug/cell line pair is analyzed in more than one database. The Adjusted AUC values for drug/cell line combinations tested by only one study are also recalculated for the xmin and xmax shared by all databases being reconciled and may therefore differ from the raw unadjusted AUC. The colors of the reconciled data points differ from those of the raw data: dose responses averaged from one, two or three databases are shown in blue, cyan and red, respectively.
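The averaging and coloring of reconciled points can be sketched as below. The data layout, function name and numeric values are hypothetical; the sketch assumes the Adjusted AUC values have already been recalculated over the shared concentration range.

```python
from collections import defaultdict

# Color of a reconciled point by the number of contributing databases.
COLORS = {1: "blue", 2: "cyan", 3: "red"}

def reconcile(adjusted_auc):
    """Average Adjusted AUC per cell line across databases.

    adjusted_auc: {database: {cell_line: value}}
    Returns {cell_line: (mean value, number of databases, color)}.
    """
    by_cell = defaultdict(list)
    for values in adjusted_auc.values():
        for cell_line, value in values.items():
            by_cell[cell_line].append(value)
    return {
        cell_line: (sum(vals) / len(vals), len(vals), COLORS[len(vals)])
        for cell_line, vals in by_cell.items()
    }

# Hypothetical Adjusted AUC values for one drug
data = {
    "CCLE": {"A549": 40.0, "MCF7": 10.0},
    "GDSC": {"A549": 50.0},
    "CTRP": {"A549": 45.0, "MCF7": 20.0},
}
pooled = reconcile(data)
```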

Hovering the mouse pointer over a data point shows a tooltip with information about the source database and the value of the selected drug sensitivity metric. If the data point represents reconciled data, an average and up to 3 raw unreconciled AUC values may be shown in the tooltip.

Clicking on a data point brings up the Dose-Response Curve plot(s). The three plots correspond to the data from the three databases, from left to right: CCLE, GDSC and CTRP. These plots allow quick visual assessment of the raw data quality and the accuracy of the regression analysis model. Sigmoid curve parameters are also shown.

The Main Panel of the Correlation Plots Tab contains scatterplots illustrating the correlation between drug/cell line data points analyzed by more than one study. If the drug is analyzed by all 3 databases, a 3D scatterplot is shown by default, but 2D correlation plots may be seen by unselecting one of the databases in the Sidebar Panel. Non-finite EC50 and IC50 values are not plotted on scatterplots; therefore, only a few data points may be visible on EC50 and IC50 scatterplots for a drug that shows little or no activity in the proliferation assay. The Pearson and Spearman correlations, the p-values and the number of data points are shown on 2D scatterplots. The Pearson correlation cannot be calculated for the EC50 and IC50 drug sensitivity metrics due to the presence of non-finite numbers from incomplete dose responses.
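The following sketch illustrates why non-finite values block the Pearson correlation but not a rank-based one. It is not the Portal's code: the helper names and data are hypothetical, and treating all Inf values as tied at the highest rank is an assumption of this example.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation; returns NaN if any input is non-finite."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical IC50 values (nM) for five cell lines in two databases
inf = float("inf")
ccle = [10.0, 200.0, inf, 55.0, inf]
gdsc = [12.0, 150.0, inf, 300.0, 9000.0]

r_pearson = pearson(ccle, gdsc)    # NaN: Inf values break the arithmetic
r_spearman = spearman(ccle, gdsc)  # defined, because only the ordering is used
```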


Download Button

Clicking on the Download Button located at the bottom of the Sidebar Panel generates a Microsoft Excel file (drug_name.xls) that contains all raw data, reconciliation analysis and correlation statistics for the drug.

The first worksheet named “Unadjusted raw data” contains raw data as it is generated by the curve fitting algorithms.

The columns of this worksheet contain:

The name of the database (CCLE, GDSC or CTRP)

Consensus drug name

Consensus cell line name

Drug concentrations tested (nM)

Relative effect of the drug in the proliferation assay (% inhibition; 0 corresponds to no effect and 100% is the maximal effect, all cells dead)

Lowest concentration tested (nM)

Highest concentration tested (nM)

Parameters of the 4-parameter logistic regression model (EC50 curve fit)

EC50, where values outside the range of tested concentrations are set to the maximal tested concentration (used to visualize all data points in the Summary Tab Main Panel)

Normalized area under the curve calculated from the EC50 model

Parameters of the 4-parameter logistic regression model (IC50 curve fit; minimal and maximal asymptotes are set to 0 and 100, respectively)

IC50, where values outside the range of tested concentrations are set to the maximal tested concentration (used to visualize all data points in the Summary Tab Main Panel)

Normalized area under the curve calculated from the IC50 model

EC50, where values outside the range of tested concentrations are set to infinity (Inf) (to compare EC50 between databases)

IC50, where values outside the range of tested concentrations are set to infinity (Inf) (to compare IC50 between databases)


If the drug was tested by more than one study, up to four worksheets (“CCLE and GDSC Reconciled”, “CCLE and CTRP Reconciled”, “CTRP and GDSC Reconciled”, and “CCLE, GDSC, and CTRP Reconciled”) contain Adjusted AUC EC50 and Adjusted AUC IC50 values recalculated for the range of drug concentrations shared by the curves being compared.

The “Correlation” worksheet lists Pearson and Spearman correlation coefficients for the 6 drug sensitivity metrics, p-values obtained by random permutations and bootstrapping, and 95% confidence intervals.
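For readers unfamiliar with resampling statistics, the sketch below shows one common way to obtain a permutation p-value and a percentile bootstrap confidence interval for a Pearson correlation. The procedure, the number of resamples, the seeds and the data are assumptions for illustration; the exact settings used to produce the worksheet are not specified here.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation; NaN if either variable has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_perm=999, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def bootstrap_ci(xs, ys, n_boot=999, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the correlation."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([xs[i] for i in idx], [ys[i] for i in idx])
        if not math.isnan(r):  # skip degenerate resamples
            stats.append(r)
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper

# Hypothetical Adjusted AUC values for ten cell lines in two databases
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
ys = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.2, 8.9, 10.4]

p_value = permutation_p_value(xs, ys)
ci_low, ci_high = bootstrap_ci(xs, ys)
```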





Analysis Idea

Nikita Pozdeyev, MD, PhD

Bryan Haugen, MD

Rebecca Schweppe, PhD

Aik-Choon Tan, PhD


Data Analysis Pipeline Development

Nikita Pozdeyev, MD, PhD


Raw data processing

Nikita Pozdeyev, MD, PhD

Ryan Mackie


Online Portal Development

Minjae Yoo, PhD

Nikita Pozdeyev, MD, PhD

Aik-Choon Tan, PhD







Raw Data Sources

1.       The Cancer Cell Line Encyclopedia. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehár J, Kryukov GV, Sonkin D, Reddy A, Liu M, Murray L et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012; 483: 603–607. http://www.broadinstitute.org/ccle

2.       The Genomics of Drug Sensitivity in Cancer. Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, Greninger P, Thompson IR, Luo X, Soares J, Liu Q, Iorio F, Surdez D et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature. 2012; 483: 570–575. http://www.cancerrxgene.org/downloads

3.       The Cancer Therapeutics Response Portal, version 2. Basu A, Bodycombe NE, Cheah JH, Price EV, Liu K, Schaefer GI, Ebright RY, Stewart ML, Ito D, Wang S, Bracha AL, Liefeld T, Wawer M et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell. 2013; 154: 1151–1161. https://ctd2.nci.nih.gov/dataPortal


Contact Information

E-mail questions, comments, ideas, and suggestions to Nikita.Pozdeyev@ucdenver.edu