Announcements


July 10th, 2023

  • Bug fix: The "Perturbation Effects" tab of the gene page was not correctly coloring points for lines with hotspot or damaging mutations. This has been fixed.


July 6th, 2023

  • We've added additional single-point compound screening data from the PRISM Repurposing project to the portal. It appears in the portal as "PRISM Repurposing Public 23Q2" and is a superset of the data that was previously released as "Repurposing Primary".


May 9th, 2023

  • We are excited to share new CRISPR and omics data, a Chronos update and a new Target Discovery app in this 23Q2 release! You’ll also notice an update to our cell line page. We hope this new format will allow users to more easily navigate the information presented for each cell line.

    In this release, you will likely notice some changes in the data now that Chronos 2.0 integrated CRISPR data is being produced by a joint Chronos run. This change reduces the number of false positives in the integrated dataset by 5%. Please read the full announcement for more information.

    Additionally, we are aware that there is an artifact in the CRISPR data which causes background correlation in dependency between genes located on the same chromosome arm. To account for this, we've aligned the mean gene effect of each chromosome arm to be the same in every cell line following the original copy number correction. Overall we see an improvement in data quality, as well as a reduction in clustering by chromosome arm in UMAP embeddings.

    For more information about this release, see this forum post describing the full set of changes.
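
    As a rough illustration of the chromosome-arm alignment described in this announcement, the sketch below shifts gene effects so that each arm has the same mean in every cell line. It is a minimal sketch under stated assumptions, not the DepMap pipeline: the function name `align_arm_means`, the nested-dictionary input layout, and the choice of the across-line average as the common target for each arm are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def align_arm_means(gene_effect, gene_arm):
    """Shift gene effects so each arm has the same mean in every cell line.

    gene_effect: {cell_line: {gene: score}} (hypothetical layout)
    gene_arm:    {gene: chromosome arm label, e.g. "1p"}
    """
    # Mean gene effect of each arm within each cell line.
    arm_means = {}
    for line, scores in gene_effect.items():
        by_arm = defaultdict(list)
        for gene, score in scores.items():
            by_arm[gene_arm[gene]].append(score)
        arm_means[line] = {arm: mean(vals) for arm, vals in by_arm.items()}

    # Assumed target: each arm's mean, averaged over all cell lines.
    per_arm = defaultdict(list)
    for means in arm_means.values():
        for arm, value in means.items():
            per_arm[arm].append(value)
    target = {arm: mean(vals) for arm, vals in per_arm.items()}

    # Shift every gene by the offset of its arm in that cell line.
    return {
        line: {
            gene: score + target[gene_arm[gene]] - arm_means[line][gene_arm[gene]]
            for gene, score in scores.items()
        }
        for line, scores in gene_effect.items()
    }
```

    Because only a per-arm constant is added within each cell line, differences between genes on the same arm are preserved.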


December 14th, 2022

  • DepMap is excited to announce several important updates and pipeline improvements in this release. This release has been focused on changes to our processing pipelines and the format in which we release data. See this forum post for more information. Due to the large number of significant changes in this release, if you have previously used DepMap data, we strongly recommend consulting the README for this dataset before downloading the newest release.


May 9th, 2022

  • Data Release Schedule Update After this quarter, DepMap will begin releasing new data once every 6 months, as opposed to every quarter. We will continue to announce new portal features, tools, and other resources as they are released throughout the year. However, new DepMap data will only be released in Q2 and Q4 of each year. Our next data release will be in 22Q4 (November 2022).

  • Ancestry Bias Correction We have found that cell-line specific alterations in the targeted sequences of sgRNAs can preclude Cas9-mediated genome editing, introducing ancestry-related biases into the Achilles data. To account for this, we will remove data from guides with SNV-induced sequence mismatches in a given cell line (~1.5% of sgRNAs on average). This correction allows Chronos to compute gene effects without being biased by the reduced cutting efficacy produced by these sgRNA-variant interactions. Please stay tuned for a preprint describing these analyses in more detail. In the meantime, check out the related web tool for more information.

  • New Terra Workspace for CCLE BAM Files To facilitate easier access to the publicly available sequencing data from the CCLE project, we have created a Terra workspace to share these data. This also includes germline variant calls using HaplotypeCaller. For further information please see our post on the DepMap forum here.

  • Updates to Cell Line Metadata We continue to update and improve our cell line metadata which can be found in the sample_info file on the data downloads page. To more clearly annotate information about engineered cell lines we added the columns model_manipulation and model_manipulation_details, and correspondingly removed ‘engineered’ as a category in the lineage and primary disease columns so these represent the actual values. To document cell lines that come from the same patient or that are derivatives, we added patient_id and parental_depmap_id columns. Our patient IDs indicate known isogenic relationships based on SNP-fingerprinting and information reported in the Cellosaurus and may not include all relationships. Finally, we have added Cellosaurus_NCIt_disease and Cellosaurus_issues columns by pulling this information from the Cellosaurus (v41), and have updated our age and sex annotations for a large number of cell lines by gathering this data from the Cellosaurus and other sources.

  • Misidentified Cell Lines This quarter we reviewed all of the ICLAC (v11) misidentified cell lines in DepMap, in most cases verifying that DepMap has the correct sample. In cases where no known authentic stock exists (such as KPL1 which is actually MCF7), we have noted the misidentification in the depmap_public_comments column of the sample_info file and have updated the cell line metadata accordingly. For example, the age, sex, lineage, and other metadata for KPL1 has been updated to be the same as MCF7 and it has been marked as an MCF7 derivative. More details can be found in this forum post.

  • Upgraded RSEM and STAR We updated STAR to v2.7.10a, RSEM to v1.3.3, and GENCODE to version 38 (used for generating index for STAR and reference for RSEM) in our expression pipeline.

  • SNP Fingerprinting Changes Using SNP-fingerprinting, we have identified four cell lines where the stock used by DepMap is not from the expected cell line. For three of these cell lines, HSB2 (ACH-001737), COLO699 (ACH-001041), and NCIH1339 (ACH-000921), we have decided to keep the data, using the same DepMap ID, but rename the cell line and update the metadata to annotate these as derivatives of the cell line the data is from. For the fourth, HS571T (ACH-001092), we have decided to remove our data, which includes mutations from hybrid capture data and copy number from SNP array data. More details can be found in this forum post. We apologize for any impacts these errors might have had.


February 22, 2022

  • Bug fix: On the gene page, the "Explore relationships with other data" button failed to load its page. This has been fixed.


February 14, 2022

  • New Webster Mini-Site We're excited to share a new interactive tool for exploring relationships between genes and biological functions inferred from large-scale CRISPR screening datasets using the machine learning method Webster. Read more about the method here.

  • New DepMap Documentation We are excited to share that the first of many DepMap portal tutorials is now available on the portal! The document “How to Identify Genetic Dependencies Using the DepMap Portal” is available for reference on the Resources page. We will continue to develop and release new tutorials in response to common user questions to share with the DepMap community.

  • DepMap YouTube Channel DepMap is now on YouTube! As we continue to expand the documentation and resources available for the portal, all DepMap-related informational videos and video tutorials will be made available on our YouTube. Be sure to subscribe to our channel to stay up to date on the latest video content from DepMap.

  • Metadata Changes Beginning this quarter we will be moving some of the information previously contained in the "sample_info.csv" file into a separate file (called "Achilles_metadata.csv") in order to clarify the different types of metadata we’re reporting. The “sample_info.csv” file will contain information about the cancer cell lines, whereas the new “Achilles_metadata.csv” file will contain information specifically about the Broad’s Project Achilles CRISPR screens. Both files will be available on the Downloads page. In addition, we have updated the media annotations reported (under ‘culture_media’, now in the "Achilles_metadata.csv") to consistently reflect the media used for each cell line in the Broad’s CRISPR screens. This change mostly serves to fill in missing media annotations, though it also introduces some minor changes from previous versions (e.g. “EMEM” is now referred to as “MEM”).

  • Reprocessed CNV files from Sanger WES To address an apparent batch effect in the segmented copy number calls derived from Sanger’s WES data, we updated the panel-of-normals used to process Sanger WES data to use normal cell line data generated by Sanger.

  • Reorganized Downloads Section We've reorganized the downloads section so that you now can more quickly find files per DepMap data release. In addition, if there is a file you are looking for, you can enter the name in the search bar at the top of the site to be taken to that file directly.

  • Uncorrected Chronos Data Now Available Beginning this quarter, we will be providing the raw Chronos output (i.e. gene effect prior to CN correction and scaling) as part of our quarterly data releases. This file is available as "Achilles_gene_effect_uncorrected.csv" on the Downloads page.

  • DepMap LinkedIn Page DepMap is now on LinkedIn! Check out our new page, and be sure to follow us to stay up to date with latest news, job opportunities, and more from the DepMap team.


November 8, 2021

  • DepMap User Survey We're taking a look at the existing resources available on the DepMap portal as we chart a course for the upcoming 2022 releases, and we'd love to hear from you! What resources and documentation would you find most helpful to have on the DepMap portal? What new features and data would you like to see next? Let us know in this survey!

  • Releasing WES CNV as Downloadable Files The primary copy number release files are generated by prioritizing WGS data over WES when both exist for a cell line. To complement this dataset, we are also releasing the copy number data produced from WES specifically.

  • No QN in Harmonia Some researchers noticed an artifact in the combined CRISPR datasets where gene effect scores for tumor suppressors appear to be truncated at weakly positive values. We investigated and concluded this was the inevitable result of the quantile normalization steps performed while integrating Sanger’s Project Score dataset. We've removed these steps and found the artifact was resolved, with little effect on the quality of the batch correction. Users will likely notice a change in some gene effect profiles in the combined CRISPR datasets moving forward, especially in genes with positive gene effect scores.

  • Remove Genes Flipping Chronos Solutions In the Achilles data, we noticed GOLGA6A (342096) and SNRPD3 (6634) have inconsistent gene effect scores across Chronos runs. We found this is due to disagreeing guides, as both of these genes have a pair of guides whose readcounts indicate they are common essentials and a pair of guides whose readcounts indicate their knockout has no effect. As we cannot determine which Chronos solution is correct, these two genes have been removed from the Achilles gene effect and dependency matrices. However, guide-level data for both GOLGA6A and SNRPD3 can still be found in the readcount and log-fold change matrices.

  • CN Gene Name Update For gene-level copy number data we updated the version of the Ensembl gene names to be consistent with the version currently used in our gene expression data (Ensembl Archive Release 102, Nov 2020).

  • Achilles Gene Name Update We've updated gene symbols in the Achilles files to the latest NCBI Consensus CDS protein set (CCDS, downloaded on 9/5/2021). This resulted in 6 genes being dropped: C12orf74, PALM2, C17orf47, KIAA1107, GUCA1A, AKAP2.

  • Changes to Cell Lines For ACCS (ACH-001011) and KP1NL (ACH-001108) we only had Hybrid Capture data. These two lines are known to be contaminated according to Cellosaurus. We removed both lines from our omics data. Furthermore, there was only an older version of the copy number available for UCH1 (ACH-002335) with no other accompanying datasets or source file. We dropped this data to make the copy numbers harmonized across cell lines. We also dropped the CNV of MM1610113 (ACH-002512) due to low quality scores.


October 13, 2021

  • CCLE miRNA expression has been replaced with a new version in which IDs have been mapped to MIMAT IDs based on miRBase.


September 21, 2021

  • Telomere content estimates from Kevin Hu et al., eLife 2021, have been added to the downloads section as well as Data Explorer.


August 19, 2021

  • 21Q3 Data Released

    We're pleased to announce the portal has been updated with the 21Q3 data release.

  • SNP Fingerprinting & DepMap Data Issues

    This quarter we performed a systematic SNP-fingerprinting analysis of all the sequencing data used in DepMap. Based on this analysis, along with cross-referencing to external databases, we have identified a number of mislabeled omics data files, as well as a few cell line swaps and duplicates. A detailed description of these issues (and the actions we took to resolve them) can be found here. All errors have been corrected in the 21Q3 release, and we apologize for any impacts these errors might have had.

  • New Predictive Modeling Tab and Page

    We have added a new "Predictability" tile and linked predictability tab where we highlight features which are predictive of each dependency profile. See the predictability tab on gene pages for more details.

  • Chronos Becomes Default CRISPR dataset

    Last quarter, we introduced CRISPR data processed using the new Chronos algorithm for users to explore. This quarter, we're making Chronos-processed CRISPR data the default datasets used in the portal. CERES-processed data will be available for download this quarter, however going forward we will no longer produce new CERES data (though previous CERES datasets will still be available for download). Details about the Chronos method can be found in this preprint, and the differences between CERES and Chronos in this blog post. Note that you can still access previous CERES-processed datasets on the downloads page, and on Figshare.

  • CN Gene Names Update

    For gene-level copy number data we updated the version of the Ensembl gene names to be consistent with the version (i.e. Ensembl Archive Release 102, Nov 2020) currently used in our gene expression data.

  • Resolved Incorrect Entrez IDs in the Expression Data

    For 72 genes, the Entrez IDs in the expression dataset were mapped to a secondary gene with similar coordinates. One example was MCL1, which was mapped to the Entrez ID 574406 rather than 4170. We resolved this issue in the current release.

  • MetMap Data on Public Portal

    The MetMap 500 data is now available in data explorer for comparisons with other data.

  • Ability to Bulk Download Co-Dependencies

    You can now download the top 100 correlated dependency profiles by clicking on the download link on the Co-dependency tile on gene pages.

  • Temporarily Dropped Data from the Mutation Files

    ACH-000901 WES failed QC in 21Q2 and its copy number data was reverted to SNP array. This change had not propagated to the mutation data. In this quarter we dropped the data from the mutations file as well and plan to replace both with new WGS data. Additionally, the Sanger WES data for ACH-000195 was dropped from our mutation calls due to issues with the raw sequencing file. WGS is available for this line and will replace the mutation calls in the upcoming quarter.


May 7, 2021

  • Chronos CRISPR Data

    This quarter, we're introducing a successor to CERES named Chronos. Chronos is based on a cell population dynamics model and provides several improvements over CERES. You can read more about the model and how we evaluated it in our preprint, as well as on this blog post. This quarter, Chronos will be available as an option for downloading or exploring in the portal. Next quarter, we plan to make it the default CRISPR dataset in the DepMap portal for gene cards and other tools.

  • Revised Dataset Names

    This quarter we've standardized the nomenclature used for identifying dependency datasets in Data Explorer and other portions of the portal. Please be aware that these names have changed. For example, the dataset which was previously labeled "CRISPR (Avana) Public 21Q1" is now named "CRISPR (DepMap 21Q1 Public, CERES)".

  • We discovered that our existing DNA sequencing data for the cell line ACH-002010 was actually from a different cell line (ACH-002011), and similarly that the DNA sequencing data for ACH-000314 was actually from ACH-000851. We have dropped the erroneous CNV and mutation data for these two lines, and plan to regenerate correct DNA-seq data.

  • Expected Gene Counts

    We are including expected gene counts for protein coding and isoform data as downloadable files.

  • Binary Mutation Matrices

    We are adding the mutation binary matrix for download. This file contains the information on whether or not each gene is mutated for each cell line.
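
    To make the shape of such a binary matrix concrete, here is a minimal sketch that builds one from a list of mutation calls. The function name `binary_mutation_matrix`, the input layout, and the placeholder identifiers (`LINE_A`, `GENE_X`, and so on) are hypothetical; the released file's exact format and its rules for counting a gene as mutated are documented in the release README, not here.

```python
def binary_mutation_matrix(mutation_calls, cell_lines, genes):
    """Return {cell_line: {gene: 0 or 1}} from (cell_line, gene) call pairs.

    A gene is marked 1 for a cell line if at least one call exists for that
    pair; repeated calls for the same pair still yield 1.
    """
    mutated = set(mutation_calls)
    return {
        line: {gene: int((line, gene) in mutated) for gene in genes}
        for line in cell_lines
    }


# Made-up placeholder data, for illustration only.
calls = [("LINE_A", "GENE_X"), ("LINE_B", "GENE_Y"), ("LINE_A", "GENE_X")]
matrix = binary_mutation_matrix(calls, ["LINE_A", "LINE_B"], ["GENE_X", "GENE_Y"])
```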


February 1, 2021

  • Integrated CRISPR Datasets

    We received a lot of questions about whether we will ever have the Broad and Sanger CRISPR screens together in one dataset. We're happy to say the answer is yes! Beginning this quarter, all new versions of the Achilles datasets will be available with Sanger's Project Score dataset integrated. We've produced a preprint describing our methodology.

  • Changes Queued From The 20Q4 Release

    The following changes are present for the first time in the 21Q1 datasets. These changes were announced in the 20Q4 newsletter. However, they were not made available due to the 20Q4 data correction that occurred.

    • CN Data From Whole Genome Sequencing

    We are now including copy number data that was derived from WGS data into our datasets where available. We prioritize WGS data for estimating copy number when multiple data sources are available (e.g. WES, SNP array) for a cell line.

    • Y Chromosome Missing - CN Dataset

    Y chromosome CN data is new in this release; it was part of the initial 20Q4 data that was reverted. Normal male/female X chromosome copy number is now reported as 0.5/1, and normal male Y chromosome copy number as 0.5.

    • Amplification Status

    We are now adding amplification status (-, 0, +) for each segment in the segment copy number. X chromosome amplification status was removed due to its bias in female samples, caused by our purely male PoNs.

    • Removal of Cell Line CN and Mutation Datasets

    We have removed the following cell lines from our CN and mutation datasets: ACH-001189, ACH-002303, ACH-002315, ACH-002341.

  • Additional 20Q4 Fixes

    Fixed issues with initial 20Q4 data release (before files were reverted in 20Q4 v2). For more details see this post in the DepMap Public Forum.

  • Expression Dataset Changes

    • Updated Gene Annotations

    We have updated the Ensembl gene annotations used in the following files: CCLE_expression_full, CCLE_RNAseq_reads, CCLE_RNAseq_transcripts. The latest version of annotations (v102) from ensembl's biomart was used. The number of genes in these datasets will be different as compared to previous quarters.

    • Fix - Duplicated Genes

    A handful of genes were duplicated in the CCLE_expression file due to multiple transcripts mapping to the same gene name (i.e. PINX1 and TBCE plus 12 noncoding genes). We consolidated them by addition of their TPM values.

    • Fix - Mislabelled Line

    All the expression files for ACH-001321 were mistakenly labelled as being the expression profile for ACH-000561. We have resolved this issue.

  • CN Dataset Changes

    • Copy number calls for the normal sample ACH-002359 were removed in 21Q1 due to sample renaming. We plan to add the data back in 21Q2.

    • Format Change

    We have removed the 'chr' chromosome prefix in segmented CN file.

  • Fusion Dataset Changes

    • Updated Fusion Pipeline

    We have updated the fusion pipeline to use a newer version of STAR-Fusion (v1.7.0) for the new RNAseq data. The previously released fusion calls are not affected by this change.

  • Portal Bug Fixes

    • We've fixed a problem where per-dose viability compound data was incorrectly displayed when viewing GDSC and CTD^2 drug screening data.
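
    The "Fix - Duplicated Genes" item above describes consolidating rows that share a gene name by adding their TPM values. A minimal sketch of that step follows. The function name `consolidate_duplicate_genes`, the row layout, and the numbers in the example are hypothetical, and the sketch assumes the values are summed on the linear TPM scale before any log transform.

```python
def consolidate_duplicate_genes(tpm_rows):
    """Sum TPM values of rows that share a gene name.

    tpm_rows: iterable of (gene_name, [TPM per cell line]) rows, where all
    rows list cell lines in the same order (assumed layout).
    """
    totals = {}
    for gene, values in tpm_rows:
        if gene in totals:
            # Element-wise addition across cell lines for a repeated gene.
            totals[gene] = [a + b for a, b in zip(totals[gene], values)]
        else:
            totals[gene] = list(values)
    return totals


# Made-up values for two cell lines; PINX1 appears twice and is merged.
rows = [("PINX1", [1.0, 2.0]), ("TBCE", [0.5, 0.5]), ("PINX1", [3.0, 0.0])]
merged = consolidate_duplicate_genes(rows)
```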


January 6th, 2021

  • We've corrected a mistake where "DepMap Public 20Q4 v2" on the downloads page was incorrectly downloading files from the "DepMap Public 20Q4" release. The download links have been corrected.


December 21, 2020

  • We identified issues with several files in the “DepMap Public 20Q4” dataset and have loaded a new release labeled "DepMap Public 20Q4 v2", which addresses those issues, into the portal. See this post on the forum for more details.


November 15, 2020

  • New data in 20Q4

    We are happy to report that we have released CRISPR data for 19 lines this quarter to bring our total number of cell lines with Avana CRISPR data to 808.

  • Target validation protocols

    Our most commonly used target validation protocols are now public at protocols.io - an open repository of protocols.

  • Whole Genome Sequencing (WGS) in DepMap

    We have begun generating WGS rather than WES for cell lines. Although only CN and exonic mutations are reported for this release, we are working to add additional data from WGS in future releases.

  • Removing Cell Lines from all Omics datasets

    We are removing ACH-001189, ACH-002303, ACH-002315, and ACH-002341 from all the omics datasets since they are duplicate entries.

  • Y Chromosome Missing - CN dataset

    Y chromosome was missing from copy number calls in some cell lines. This issue has been resolved in the new release by using a new panel of normal samples. For some cell lines, minor changes have occurred to overall copy number calls in all chromosomes. Lines which significantly deviated in their autosomal CNV from previous release have not been incorporated and are being assessed further for future release.

  • Removal of cell lines - expression and fusion datasets

    We are removing 10 lines that did not pass QC in the expression and fusion datasets. Please see details in the README files in the Downloads page.

  • Bug fix

    Some expandable sections on tiles were broken and could not be expanded. This has been fixed.

  • Gene Pages

    On the Gene pages, tiles are now loaded asynchronously so that the page loads faster overall.

  • Download button fix

    The download button on the perturbation tab was missing in an earlier update and has been restored. The download button on the characterization tab has also been fixed.

  • CN dataset changes

    • We have changed the way we compute gene level copy number data. We previously used an average of all segments within a gene, but are now using the average based on segment length within genes. This may change the data for a small number of genes.
    • A new column in the segmented CN datasets has been added to indicate amplification (+), deletion (-), and neutral status (0) of the gene. (This is from the GATK pipeline)
    • We have fixed the sequencing type column for some lines in the segment level CN dataset from Broad WES to Broad SNP array.

  • Mutation dataset changes

    • The column ‘wgs’ in the mutation file has been renamed to ‘legacy_wgs_exon_only’ to reflect that these are legacy mutation calls from an older pipeline. The new WGS mutation data (exonic regions only) is incorporated into the ‘CGA_WES_AC’ column.
    • The field ‘SangerRecalibWES_AC’ has been removed from the mutation dataset. This was reporting mutations called by the CGA pipeline run on data generated by the Sanger Institute. This information is now incorporated into the CGA_WES_AC field, which reports mutations called with the CGA pipeline using either Broad- or Sanger-generated data.
    • Some RNA mutations were being reported twice. We have resolved this issue.
    • We have removed entries with an allelic ratio of ‘0:0’ from the mutations dataset.

  • Expression dataset changes

    • We have added some new genes in the expression dataset that were not present in previous releases.
    • We have removed some genes from the expression dataset that were not listed in HGNC and were not expressed by any cell lines.
    • In the protein-coding gene expression file, we have removed any genes that did not map to an entrez ID.

  • Fusion dataset changes

    • We are releasing fusion data that was missing in 20Q2. However, we are not releasing any fusion data for the lines that are newly added in this data release.
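
    The "CN dataset changes" item above switches gene-level copy number from a plain average of overlapping segments to an average weighted by segment length within the gene. The sketch below shows one way to compute such a weighted average. The function name `gene_copy_number`, the half-open coordinate convention, and the handling of genes with no overlapping segment are assumptions, not details taken from the DepMap pipeline.

```python
def gene_copy_number(gene_start, gene_end, segments):
    """Length-weighted mean copy number of the segments overlapping a gene.

    segments: list of (start, end, copy_number) tuples on the gene's
    chromosome, using half-open [start, end) coordinates (assumed).
    Returns None when no segment overlaps the gene.
    """
    weighted_sum = 0.0
    covered = 0
    for seg_start, seg_end, copy_number in segments:
        # Number of bases of this segment that fall inside the gene.
        overlap = min(gene_end, seg_end) - max(gene_start, seg_start)
        if overlap > 0:
            weighted_sum += copy_number * overlap
            covered += overlap
    return weighted_sum / covered if covered else None
```

    For a gene spanning 100 to 200 with segments (0, 190, 1.0) and (190, 300, 2.0), the plain average of the two segments is 1.5, while the length-weighted value is about 1.1, because only 10 of the gene's 100 bases lie in the second segment.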


August 25, 2020

  • Please note that the 20Q3 omics datasets are identical to what was released in 20Q2. There is no new omics data for this quarter; please check back in 20Q4!

  • We are very excited to announce the new DepMap Community Forum! Here you can ask DepMap questions or provide your thoughts and suggestions about the data and analyses being presented on the portal. Test it out and let us know what you think.

  • Data Explorer now includes the Custom Analysis tool for performing two-class comparisons or computing Pearson correlation across all features in a dataset.

  • You can now download subsets of datasets by cell lines, genes, and compounds. Check it out under Custom Downloads.

  • Find cell lines that most closely resemble a tumor type of interest with the new Celligner tool.

  • Arm-level CNA data from the recent pre-print, "Selective vulnerability of aneuploid human cancer cells to inhibition of the spindle assembly checkpoint." is now available on Data Explorer and Downloads.

  • The per-lineage dependency figure has been updated with an option to show/hide sublineages. The figure is also used for omics.

  • The raw readcounts and logfold change for failing Achilles replicates are now available for download in the files Achilles_raw_readcounts_failures and Achilles_logfold_change_failures.

  • Cell lines can now be labeled on violin and scatter plots in Data Explorer. Click on a point to toggle the label on or off.

  • We have modified how we compute context enrichment. Instead of running the limma R package, we now run a t-test for each gene/compound, for each context.
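
    As an illustration of the per-gene, per-context t-test mentioned in the last item, the sketch below compares scores of cell lines inside a context against all other cell lines. It is a minimal sketch, not the portal's implementation: the function names, the input layout, and the choice of Welch's unequal-variance t statistic are assumptions, since the announcement does not say which t-test variant is used. Only the statistic is computed; a p-value would additionally need the t distribution, for example from `scipy.stats.ttest_ind` with `equal_var=False`.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(in_context, out_of_context):
    """Welch's t statistic for two samples (each needs at least 2 values)."""
    n1, n2 = len(in_context), len(out_of_context)
    se = sqrt(variance(in_context) / n1 + variance(out_of_context) / n2)
    if se == 0:
        return float("nan")  # both groups constant; statistic undefined
    return (mean(in_context) - mean(out_of_context)) / se


def context_enrichment(scores, contexts):
    """Run one t-test per (feature, context) pair.

    scores:   {gene_or_compound: {cell_line: score}} (hypothetical layout)
    contexts: {context_name: set of cell lines in that context}
    """
    results = {}
    for feature, by_line in scores.items():
        for context, members in contexts.items():
            inside = [s for line, s in by_line.items() if line in members]
            outside = [s for line, s in by_line.items() if line not in members]
            # Skip pairs where a group is too small to estimate a variance.
            if len(inside) >= 2 and len(outside) >= 2:
                results[(feature, context)] = welch_t(inside, outside)
    return results
```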


June 9, 2020

  • The 20Q2 README, CCLE_RNAseq_reads, CCLE_expression_full, CCLE_expression, CCLE_RNAseq_transcripts files have been updated with changes for three cell lines' RNA seq and expression data. The new files are available as README_v2.txt, CCLE_RNAseq_reads_v2.csv, CCLE_expression_full_v2.csv, CCLE_expression_v2.csv, and CCLE_RNAseq_transcripts_v2.csv, respectively.


May 12, 2020

  • Genes without unique guides now dropped in post-CERES files.

  • We corrected a data swap between ACH-001630 and ACH-002004. Note data sets prior to 20Q2 will have this data redacted.

  • Changed the batches of pDNA measurements. pDNA now falls in batches 2, 3, and 4.


March 30, 2020

  • Want tips on how to think about DepMap data? Interested in knowing more about the decisions that go into the portal? CDS is launching its official blog to address this and more. Check it out at https://cancerdatascience.org/blog/.


Mar 19, 2020

  • On the compound overview page, the dose range for Repurposing Primary was previously incorrectly reported as 10μM. The dose range is actually 2.5μM, and this has now been corrected.


Feb 21, 2020

  • The 20Q1 sample_info file was previously missing lineage information for some cell lines. This has been corrected, and the new version is available as sample_info_v2.csv.


Feb 6, 2020

  • Corrected units on expression plots. Originally reported 'TPM (log2)' but the units are actually 'log2(TPM+1)'


Jan 21, 2020

  • Drug Repurposing data from Corsello, Nature Cancer (2020) now out, and fully incorporated in Data Explorer and explorable on the Dose Curves tab! Download the data here or explore the data on the paper landing page


Jan 13, 2020


Dec 19, 2019

  • Cell line pages now have Cellosaurus links, COSMIC links, STR profiles, and cell line aliases where applicable.

  • Selecting a single feature will show a violin plot on Data Explorer

  • Compound pages have been updated to include a new data availability tile, as well as a new compound response vs. expression correlation tile.

  • New Cell Line Selector supports adding features and filtering by cell line data.

  • Users can also select cell lines as points on a plot and define them as a group in Cell Line Selector.


Nov 18, 2019

  • The Public 19Q3 copy number file has been amended to have the log2 transform correctly applied to it. In addition, some duplicate cell lines were removed. This new version is named CCLE_gene_cn_v3.csv, and can be found here.


Oct 2, 2019

  • The Public 19Q3 copy number file CCLE_gene_cn.csv has been updated to correctly reflect released cell lines. This new version is named CCLE_gene_cn_v2.csv, and can be found here.


Aug 12, 2019

  • New drug data for the PRISM Repurposing Primary Screen now out! Data for 4,686 compounds screened against 578 cell lines can be found here.


Aug 7, 2019

  • A new Portal FAQ can be found here.

  • Expression, fusion, and copy number are now all called from hg38 realigned bams

  • Fusions have updated filtering criteria.

  • RNAseq count data is now created using RSEM and may look slightly different.

  • Cell line star-ing has been upgraded to save multiple lists! Lists can be applied to data explorer and gene/compound pages to highlight your favorite cell lines as stars. Previously saved cell lines should be automatically imported upon opening Tools -> Cell Line Selector.

  • Latest CCLE datasets loaded in the portal. Added data from the latest Ghandi et al., 2019 CCLE paper. Absolute copy number, translocation, RRBS, and RPPA datasets as shown in gene and cell line pages have been updated. Global chromatin profiling, miRNA expression, and metabolomics data are now available to plot in Data Explorer. Downloads from this paper are available under CCLE 2019, and previously shown CCLE omics data remain available under CCLE 2018.

  • CERES-processed Sanger Project Score CRISPR data is now available on gene pages, as well as for download on the Downloads page

  • We are moving away from CCLE names, due to frequent misunderstanding surrounding the latter half of the name (i.e. ‘_LUNG’ out of ‘A549_LUNG’). In an effort to provide the most accurate and easily interpreted information, we are switching to using the first half of the CCLE name (i.e. ‘A549’) combined with primary disease information (‘Lung Cancer’). New lines that are added to the projects will no longer be assigned a CCLE name.

  • New columns of metadata have been added to sample_info file, and some columns have been renamed. “Primary site” is now called “Sample collection site”, and “disease”, “disease_subtype”, and “disease_sub_subtype” are now “lineage”, “lineage_subtype”, and “lineage_sub_subtype” respectively. The columns “primary/met”, “sex”, “source”, “age”, “RRID”, and “additional_info” have been added.

  • Mutations that could be plotted as 0 or 1 along axes in Data Explorer mistakenly included hotspot mutations that were silent. This is now corrected; see the mutation annotations for definitions of mutation categories.
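
    For the move away from CCLE names described above, the sketch below shows how a stripped cell line name could be derived from an existing CCLE name and paired with primary disease information. The function name `display_name` and the output format with the disease in parentheses are hypothetical; the announcement does not specify how the two pieces are combined for display.

```python
def display_name(ccle_name, primary_disease):
    """Combine the first half of a CCLE name with the primary disease.

    CCLE names have the form '<stripped name>_<site>', e.g. 'A549_LUNG',
    so everything before the first underscore is the cell line name.
    """
    stripped_name = ccle_name.split("_", 1)[0]
    return f"{stripped_name} ({primary_disease})"


label = display_name("A549_LUNG", "Lung Cancer")  # "A549 (Lung Cancer)"
```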


May 22, 2019

  • Added 19Q1 RNAseq reads to data downloads. View it here


May 7, 2019

  • We’ve restructured data file names, consolidated cell line metadata, and written more comprehensive READMEs in an attempt to be more clear and intuitive in how we present data files.

  • The terminology ‘DepMap_ID’ is replacing ‘Broad_ID’. IDs will remain the same, but column names have changed.

  • Gene level copy number changed from log2(CN) to log2(CN+1)

  • Tools that use a binarized definition of mutations have been updated to reflect the categories we use in visualizations. See the mutation annotations for mutation categories used.

  • Look and interface of the downloads page and the cell line page have been updated.
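
    For anyone comparing gene-level copy number files from before and after the change from log2(CN) to log2(CN+1) noted above, the conversion between the two scales can be sketched as follows. The function names are hypothetical. One property of the new scale is that it is defined at CN = 0, where it equals 0.

```python
from math import log2


def log2_cn_to_log2_cn_plus_1(value):
    """Convert a value on the old log2(CN) scale to the log2(CN + 1) scale."""
    return log2(2.0 ** value + 1.0)


def log2_cn_plus_1_to_cn(value):
    """Recover the untransformed CN value from the log2(CN + 1) scale."""
    return 2.0 ** value - 1.0
```

    For example, an old value of 0 (CN = 1) corresponds to a new value of 1.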


Feb 20, 2019

  • Updated sources for certain cell lines. See the corrected dataset here.


Feb 6, 2019


Feb 1, 2019

  • New co-dependency card in gene overview.

  • Updated the Combined RNAi (Broad, Novartis, Marcotte) dataset; please see D2_combined_gene_dep_scores.csv in DEMETER2 Data v5. This fixes a minor bug with Marcotte LFC data that caused hairpins targeting multiple genes to appear multiple times in the LFC matrix. The bug created bias in the seed effect estimates for those hairpins, causing very minor differences to the resulting model parameters.

  • Updated pipelines used for expression, mutation, and copy number. See more details here.

  • Hotspot mutations are now shown in the portal. See the mutation annotations for definitions of mutation categories

  • Updated the per-gene mutations table to enable showing/hiding columns.

  • Corrected the 18Q4 transcript-level expression data, which previously was collapsed to just per gene levels. The corrected dataset has counts on a per transcript level and is available here.

  • Added CTD2 dose-level data. See dose-response curves on compound sensitivity pages by clicking on a cell line.

  • Updated the Sanger GDSC data to the March 2018 update (v17.3).

  • Added ability to search cell lines by aliases and DepMap IDs.


Dec 19, 2018

  • Updated the DepMap Public 18Q4 README to version 2.


Nov 9, 2018

  • Corrected the expression data to fix incorrect gene symbols for some genes. See the corrected dataset here.


November 5, 2018

  • Expression data now TPM instead of RPKM

  • New mutation categories, see here for details


August 31, 2018

  • Correction to 18Q1, 18Q2 and 18Q3 copy number datasets

  • 18Q1, 18Q2 and 18Q3 copy number datasets have been updated to correctly reflect released cell lines. These updated files are named "public_18Q3_gene_cn_v2.csv" and so on.


August 6, 2018

  • More data, new apps, and a new way to get an overview.

  • New 18Q3 Avana + Omics DepMap Release

  • All RNAi data is now processed by DEMETER2

  • New downloads section includes all data from Achilles and CCLE portals

  • New gene overview provides at-a-glance results for a gene (and works on mobile!)

  • New Cell STRAINER app allows users to assess genetic discordance between their cell line sample and the corresponding reference in CCLE


May 7, 2018

  • Updated DepMap data to "Public 18Q2"

  • Avana CRISPR screen dataset expanded to include 436 cell lines.

  • Mutation, indel, copy number and gene expression data also expanded to include new lines in Avana dataset

  • Added Combined Broad, Novartis, Marcotte et al. RNAi data: 713 unique cell lines screened across three large-scale RNAi datasets, with integrated gene dependencies estimated using the DEMETER2 algorithm


June 6, 2018

  • Corrected incorrect copy number data which was shown for some cell lines.