Release Notes

September, 2022

New

  • Microbiome data is now supported on OmixAtlases. The data can be searched and queried on Polly and can be downloaded in the form of a BIOM file.
  • ‘Cancer Stage’ is the newest metadata field and it has been curated for 974 datasets on GEO.
  • Reports present in workspaces can now be linked to datasets in OmixAtlases.

Update

  • ‘Select All’ is now an option for the filter results while searching and filtering datasets on the OmixAtlas UI.
  • Identify datasets that do not contain data matrices on the OmixAtlases through the new ‘Metadata-only’ labels.
  • Send requests to obtain data matrices for specific datasets.

August, 2022

New

  • The Nextflow integration with Polly CLI allows users to run any Nextflow bioinformatics pipeline with parallel processing on Polly’s scalable computational infrastructure.
  • OmixAtlases are now available as shareable links through Workspaces.
  • 797 clinical datasets from MIMIC are now available on Polly, curated for four metadata fields: Drug, Dose, Frequency, and Strength.
  • With the recent Polly Python version release, users can now:
    • Copy files/folders in workspaces from one workspace to another.
    • Add datasets to and delete datasets from OmixAtlases.
    • Link reports to a dataset on any OmixAtlas and fetch the reports linked to it.
    • Get auto-generated metadata summaries for datasets present on the GEO OmixAtlas by giving the GEO accession ID as an input. This helps to improve findability and estimate the relevance of the dataset.
    • Get cell line recommendations to select multiple related cell lines. Users can start the search with a disease, tissue or cell line and receive recommendations for related or matching cell lines.
    • Use the recommend functionality for disease, tissue and cell line in sample-level metadata queries as well.
  • New datasets from the following repositories have been added to Polly OmixAtlases in the last month:
    • DepMap OmixAtlas - 4312
    • GEO OmixAtlas - 922
    • Single-Cell OmixAtlas - 240

Update

These are the major Polly Python updates to existing functionalities:
  • The complete schema for tables in an OmixAtlas can be fetched in the form of a dictionary.
  • Schema for feature level metadata is now available.
  • Schema for single cell and GWAS data can be retrieved.
  • Ontology recommendations for sample-level queries have been enabled.

March, 2022

New

  • Users can now host Docker-based applications and add their own notebook environments using Polly CLI.
  • Users can launch open-source notebooks from GitHub directly on Polly’s compute environment.
  • Users can save datasets from an OmixAtlas to a workspace as well as upload/download files and folders to/from a workspace using polly-python.
  • Users can filter the schema in an OmixAtlas by source and data_type using polly-python.
  • Users can access the installed version of polly-python inside a Python shell or a Jupyter notebook cell.
  • Users can create cohorts for TCGA Transcriptomics and Mutation data using polly-python.
  • Users can now create an OmixAtlas using polly-python.
  • Users can query datasets, samples and features on polly-python across multiple OmixAtlases at once. Find examples here.
  • Recommended disease ontologies are displayed when a user queries the disease field in an OmixAtlas.
  • Users can now copy/move files (except analysis files) or folders across workspaces and folders on the Polly frontend.
  • Users can now launch notebooks situated within folders or sub-folders.
  • Users can also filter folders from workspace contents.

Update

  • polly-python users can now access schema functions via both repo_id and repo_name.
  • polly-python users can easily convert the .gct file format to the .maf file format in TCGA and cBioPortal repositories using a file format converter function.
  • 211 Single Cell datasets were added to OmixAtlas with cluster-level cell type annotation.
  • The platform field was updated for 3.5k GEO datasets on Polly.
  • Users can now also preview tsv files along with other file types on Polly.
  • Cell types were added for the fetal single cell atlas in Polly.

January, 2022

New

  • Users can create workspaces and fetch the list of workspaces using polly-python.
  • The authentication process has changed. Until now, users had to authenticate each class separately. This release introduces a global authentication mechanism that lets users authenticate multiple classes (such as OmixAtlas, Workspaces) in a single authentication step.
  • Users can publish personal notebooks on the Polly workspace environment and analyse VCF files using the Hail docker.
  • Preview of all standard file types is available on Polly UI. Different file types like xls/xlsx, pdf, html, csv, png/jpg/gif, ipynb can be opened directly on Polly without a third party service.
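
The global authentication change above can be pictured as one shared session that every class reads from, instead of each class holding its own login. The following is a minimal, self-contained sketch of that pattern only. `Session`, `OmixAtlasClient` and `WorkspacesClient` are hypothetical names chosen for illustration and are not the polly-python API.

```python
class Session:
    """Hypothetical process-wide holder for a single authentication token."""

    _token = None

    @classmethod
    def auth(cls, token):
        # Store the token once so that every client class can reuse it.
        if not token:
            raise ValueError("token must be a non-empty string")
        cls._token = token

    @classmethod
    def token(cls):
        if cls._token is None:
            raise RuntimeError("call Session.auth(token) first")
        return cls._token


class _Client:
    """Base class: use an explicit token if given, else the shared session."""

    def __init__(self, token=None):
        self.token = token if token is not None else Session.token()


class OmixAtlasClient(_Client):
    """Hypothetical stand-in for a class that queries OmixAtlases."""


class WorkspacesClient(_Client):
    """Hypothetical stand-in for a class that manages workspaces."""
```

With this shape, a single `Session.auth("my-token")` call is enough for both `OmixAtlasClient()` and `WorkspacesClient()` to pick up the same credentials, while passing `token=` explicitly still works for per-class authentication.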

Update

  • 2792 datasets and 74713 samples have been annotated for tissue and cell line tags.
  • Strain has been added as a queryable field at dataset-level for GEO datasets.
  • A bug in Sort by Relevance when searching over description in OmixAtlas table view was fixed. Other sort related bugs have also been fixed.
  • 360k datasets from the UK Biobank were added on Polly.
  • 149k Immport lab datasets were added on Polly.
  • 59k RCSB datasets were added on Polly.
  • Curated cell types for 229 Single Cell datasets were added to OmixAtlas.
  • Sample-level age labels were added to all datasets for TCGA and GEO.

November, 2021

New

  • Workspace contents like files, analyses, reports can now be sorted in workspaces based on name, created date, last modified, author, and type.
  • 15,000 Microarray and WES (Whole Exome Sequencing) datasets from PPMI have been curated and added to Polly.
  • Dataset Overview (containing Title, Publication, Abstract, Tags for the Data, sample information as summary plots and table, processed data as a table) for every dataset can be viewed using the “View Details” option beside datasets in OmixAtlases.
  • Dataset files (gct, h5ad, vcf) can be downloaded from the Options Menu in the Card view or from the View Details Page.
  • The ‘Request for Dataset’ option is available at multiple places within the OmixAtlases.

Update

  • Resolved a bug in El-MAVEN instance termination.
  • Curation information of 45,000 datasets from gnomAD and DepMap has been updated.

October, 2021

New

  • In addition to the Liver OmixAtlas, Polly now has GDC, GEO, cBioPortal, PharmacoDB, LINCS and Metabolomics OmixAtlases.

Update

  • Added 17,500 datasets to Polly.
  • Genotype, age and gender annotations were added to 3,900,000 GEO samples.

September, 2021

New

  • New compute machines Mi5xlarge (32 vCPU, 250GB RAM), Mi6xlarge (64 vCPU, 500GB RAM), Mi7xlarge (64 vCPU, 970GB RAM), GPUsmall (1 GPU, 8 vCPU, 60GB RAM) and GPUmedium (4 GPU, 32 vCPU, 240GB RAM) were added to El-MAVEN.
  • Introduced Polly Files (beta version), a desktop application for transferring files between a computer and Polly Workspaces.

Update

  • Resolved a bug that caused an error in SQL queries when expanding the “’” character.
  • gnomAD was enriched with 96,000 new datasets of WES and WGS type.
  • Fixed table view bugs and enhanced UI on OmixAtlas.
  • New datasets totalling 48,000 were added to Immport, HPA, CPTAC and GTex.
  • Auto-curated tags totalling 770,000 were added to Polly datasets.
  • A bug affecting folder deletion was fixed.
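
The apostrophe fix in the list above points at a common failure mode: a quote character inside a search term can end a SQL string literal early and break the query. The following is a minimal sketch of one standard remedy, doubling single quotes inside the literal. `quote_sql_literal`, `build_search_query`, and the `datasets` and `description` names are hypothetical, and this does not describe how the bug was actually fixed on Polly.

```python
def quote_sql_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any single quote inside it."""
    return "'" + value.replace("'", "''") + "'"


def build_search_query(term: str) -> str:
    """Build a LIKE query over a hypothetical `datasets` table."""
    pattern = quote_sql_literal("%" + term + "%")
    return f"SELECT * FROM datasets WHERE description LIKE {pattern}"


if __name__ == "__main__":
    print(build_search_query("Crohn's disease"))
```

Where the database driver supports it, passing the term as a bound parameter is generally preferable to building the literal by hand.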

August, 2021

New

  • Introduced the Polly Python library, which provides powerful search capabilities across dataset, sample, and feature level metadata from any computational environment through code.
  • Introduced “View Only Access” on Polly Workspaces – an enterprise grade permission giving more control to admins.
  • Enabled Voila Dashboards within Polly Notebooks.
  • Introduced an application resource monitor on El-MAVEN, enabling users to monitor the progress of a job and decide whether a bigger machine is required.

Update

  • Over 155,000 datasets were added to LINCS OmixAtlas on Polly.

July, 2021

New

  • Liver OmixAtlas released.

June 18th, 2021

New

  • Added 12,200 new curated transcriptomics and single cell datasets to various Data Lakes.

June 4th, 2021

New

  • Made the Dual Mode Visualization application and the Untargeted Pipeline Application lighter for heavy datasets to avoid memory leakage.
  • Added 3,350 new curated transcriptomics and single cell datasets to various Data Lakes.

Update

  • Resolved issue in Lipidomics, Dual Mode and Polly El-MAVEN Applications.

May 21st, 2021

New

  • Introduced Google Slide Integration with Polly Notebooks.
  • Added Reporting feature using Markdown in Dual Mode Visualization Application.
  • Added 23,350 new curated transcriptomics and single cell datasets to various Data Lakes.

Update

  • Added Scree Plot under Quality Check in Dual Mode Visualization Application.
  • Resolved forgot password issue.

May 7th, 2021

New

  • El-MAVEN latest beta version is now available on Polly.
  • Added 25,120 new curated transcriptomics and single cell datasets to various Data Lakes.

Update

  • Added two-way ANOVA capability in Dual Mode Visualization Application, along with the ability to combine multiple conditions or cohorts while performing differential expression.

April 23rd, 2021

New

  • Introduced an Admin Dashboard to provide account administrators the convenience to manage their accounts.
  • Added 10,120 new curated transcriptomics and single cell datasets to various Data Lakes.

Update

  • HTML as a data file can now be opened through Workspace itself.
  • Resolved an issue to make Dropbox work seamlessly with Workspaces.

April 9th, 2021

New

  • Created Shiny and Studio applications for feature level search of GEO Datasets.
  • Added 11,177 new curated transcriptomics datasets to TCGA Data Lake.

Update

  • PDF as a data file can now be opened through Workspace itself.
  • Resolved an issue to make Google Drive work seamlessly with Workspaces.

March 26th, 2021

New

  • TEDDY (The Environmental Determinants of Diabetes in the Young) and DEPMAP (Dependency Map) Data Lakes have been added on Polly.
  • Introduced option to directly export data to the workspace from a Studio Preset.
  • Added 37,177 new curated datasets corresponding to various omics to different Data Lakes.

Update

  • Updated Polly Login User Interface.
  • Added additional filters to TEDDY Data Lake.
  • Resolved issue with app hosting infrastructure to increase stability of apps for better user experience.

March 12th, 2021

New

  • Introduced Docker building feature on Polly CLI, which enables users to build dockers, check their build status and logs, and push dockers to Polly.
  • Added 11,470 new curated transcriptomics and single cell datasets to different Data Lakes.

Update

  • Improved access to datasets within OmixWiki through metadata filtering options.

February 26th, 2021

New

  • Introduced the functionality that enables the users to host their own application on Polly by using Polly CLI.
  • Enabled feature level querying for GEO Data Lake.
  • Added Genomics docker for variant calling and annotation.
  • Added a new notebook environment for Genomics Variant Analysis.
  • Enabled partial string search for dataset id in the search bar.
  • Added 20,096 new curated transcriptomics and single cell datasets to different Data Lakes.

February 12th, 2021

New

  • Introduced the status page for real time updates on Polly’s status, downtime, incidents, and maintenance.
  • Added auto-run feature for selected Studio Presets.
  • Enabled component updating and versioning by component creator.
  • Added 11,580 new curated transcriptomics datasets to GEO and LINCS Data Lakes.

Update

  • Updated the UI of visualization dashboard of Data Studio for better visibility.
  • Updated all notebook dockers with the latest version of discoverpy (0.0.10).
  • Added finer error and warning messages to CLI.
  • Removed the 1000 row limit on query results in CLI.

January 29th, 2021

New

  • Public sharing of the reports created within any Studio session is now available on Polly.
  • Added 14,727 new curated transcriptomics and metabolomics datasets with 9,513 transcriptomics datasets being added to the LINCS Data Lake.

Update

  • Added specific error message to indicate presence of multiple groups with the same compound name in Labeled LC-MS Workflow.
  • Added specific error message in Labeled LC-MS Workflow if isotopologues of the compound are spread over different metagroups in El-MAVEN output.

January 15th, 2021

New

  • GTEx Correlation and Enrichment Analysis preset is now available which can be used to identify enriched pathways based on the gene correlations.
  • Added TraceFinder Downstream Analysis preset with additional feature of translating the analytical insights into shareable dashboards.
  • Added 1,836 new curated transcriptomics and proteomics datasets to different Data Lakes.

Update

  • Enabled use of retention time information for metabolite identification and updated Untargeted Pipeline library to handle already identified metabolites.

January 1st, 2021

Update


December 18th, 2020

New

  • LINCS (Library of Integrated Network-Based Cellular Signatures) repository with 19,520 curated datasets has been added to the Data Lake.

Update

  • Added ANOVA Test and updated Limma Test with extra filters for the volcano plot and heatmap of differentially expressed results in the Dual Mode Data Visualization.

December 4th, 2020

New

  • We now support reactions from Chinese Hamster Ovary (CHO) for integrated pathway analysis in IntOmix.

Update

  • Resolved timeout error for opening a folder containing large number of files within a Workspace.
  • Resolved issue with Workspace root directory redirection on selection.

November 20th, 2020

New

  • Improved OmixWiki UI for better consumption.
  • Added the ability to clone Notebooks within Workspaces.

Update

  • Added granular error messages for Notebook functions and CLI jobs.
  • Resolved the issue with renaming large data files.
  • Resolved the issue with folder breadcrumb in Workspaces.
  • Fixed involuntary logout issue.

November 6th, 2020

New

  • Data transfer time limit has been extended to 8 hours, enabling transfer of 1TB of data through CLI at once.

Update

  • Updated user interface of Discover and Data Studio.
  • Added filtering interface to GEO data lake.
  • Added search functionality on Discover interface.
  • Added highlight and cumulative size feature on multiselection in Workspaces.
  • Updated collaborators icon to show number of collaborators.
  • Resolved inconsistent log2FC values for multiple comparisons in IntOmix.
  • Resolved sample name discrepancy in concentration plot of QuantFit.
  • Fixed table column resizing error on filtering interface.
  • Resolved a bug in Polly Docker Domain.

October 23rd, 2020

New

Update

  • Updated Workspaces user interface.
  • Added filtering interface to COVID-19 data lake.
  • Updated datasets searchability on dataset ID and description.
  • Fixed incorrect memory error in CLI.

October 9th, 2020

New

  • Introduced the option to make dockers on Polly public by adding public docker domain.
  • Welcome screen now displays the username.
  • Decreased launch time for applications and notebooks through horizontal pod scaling and buffering.

Update

  • Fixed an error when landing on Discover after logging in.
  • Fixed an error in priority assignment of automated jobs.
  • Fixed an error when renaming files after upload.
  • Fixed 404 error in Metabolomics Data Lake.
  • Integrated documentation to every application.

September 25th, 2020

New

  • Introduced Labeled LC-MS Analysis preset for natural abundance correction and visualization for single or dual labeled LC-MS data combined with an interactive, customizable and shareable reporting dashboard.
  • Integrated pathway visualization in Labeled LC-MS Workflow.
  • Added dilution factor and protein normalization in the Lipidomics Visualization Dashboard.

Update

  • Added warning message to prevent duplicate folder creation in Workspaces.
  • Fixed nested folder creation and notebook renaming error in Workspaces.
  • Fixed 503 error in Metabolomics Data Lake.
  • Fixed a bug associated with notebooks and shiny apps opening to a blank screen.
  • Fixed error occurring in automated jobs.

September 11th, 2020

New

  • Introduced Data Studio that brings the tools you need to create, customize, and share your analysis effortlessly with your team across the world.
  • Introduced CCLE Correlation Analysis for identification of features correlated with a gene mutation such as mutations in other genes, expression and sample level metadata.

Update

  • Updated the version of scanpy to 1.6.0 in single cell docker.
  • Fixed a bug in notebook giving error with CLI commands.

August 28th, 2020

New

Update

  • Updated discoverpy package in all the dockers to the latest version.
  • Fixed CellxGene visualization loading for specific datasets.
  • Fixed duplicate metabolite generation issue within the Dual Mode Data Visualization application.
  • Fixed minor UI issues in Workspaces.
  • Decreased Workspaces loading time.

August 14th, 2020

New

  • Introduced Workspaces on Polly, which is a new and improved version of Polly Projects.
  • Added GTEx app to process the filtered datasets from GTEx data lake.
  • Added a filtering interface for GTEx data lake that allows filtering of the data on the basis of fields within the curated dataset.
  • Integrated Discover and Dual Mode Visualization for processing and further analysis of transcriptomic, metabolomic and single cell filtered datasets.
  • Integrated Notebook to process the filtered datasets.
  • Hosted CellxGene for processing and visualization of single cell datasets.

Update

  • Enabled logs access functionality through Polly CLI.
  • Added the Python package discoverpy to all the dockers.

Deprecated

  • The Project Management Dashboard has been deprecated and replaced by Workspaces.

July 31st, 2020

New

  • Added dot plot for Gene Ontology in the Discover application.
  • Added an extra layer of security in authentication.

Update

  • Allowed internal standards and unlabeled data to pass through the Labeled LC-MS Workflow to generate output.
  • Added Phantasus, Boxplot & Whisker plot along with the bar plot in the Discover application.
  • Fixed Polly CLI auto login error in notebooks.
  • Fixed unresponsive notebook with infinite loading.

July 17th, 2020

New

  • We have released the newest version of Polly CLI, v0.1.18, enabling you to run a CLI job without the need for a "secret" key if the private docker is on Polly.

Update

  • Labeled LC-MS Workflow has N and C as indistinguishable isotopes.
  • Improved the stability of both Shiny and Desktop Applications.
  • Communication within the infrastructure is now through encrypted keys.
  • Shiny apps as well as shiny states are encrypted during transit as well as storage.
  • Added encryption for the disks running the computations.
  • Encrypted buckets containing credentials.

July 3rd, 2020

New

Update

Deprecated

  • Deprecated El-MAVEN FirstView Integration.

June 19th, 2020

New

  • We now support reactions from Drosophila melanogaster for integrated pathway analysis in IntOmix.
  • Introduced pathway enrichment and pathway view feature along with comparative analysis in Dual Mode Data Visualization.
  • DEPMAP CCLE (DEPMAP Cancer cell line expression data and dependency scores for genes) repository has been added to the Data Lake.
  • Implemented input file access from the sub-folders of a project for applications.

Update

  • The Single Cell Downstream docker is updated with these new packages: rpy2, anndata2ri (Python packages), ExperimentHub (R package).
  • Added a GPU instance for Polly CLI.

June 5th, 2020

New

  • Introduced visualization of labels in stacked plot within Labeled LC-MS Workflow.
  • Enabled least privilege access for stringent access policies.
  • Encryption of data in transit and at rest.

Update

  • Improved access logs throughout the platform.
  • Enhanced security using a secrets management service.
  • Implemented regular backups and versioning of data.

May 22nd, 2020

New

  • Introduced Polly QuantFit node in Compound Discoverer™ that allows peak picking and absolute quantification on raw data obtained from a Thermo Scientific™ Mass Spec instrument.

May 8th, 2020

New

Update

  • Changed the optimized color palette in IntOmix from a red-yellow-green scale to a more intuitive red-green scale. All upregulated metabolites or genes are represented by a shade of red and downregulated metabolites or genes by a shade of green.
  • Changed the non-optimized color palette in IntOmix from a pink-purple scale to a red-green scale to remove ambiguity.
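
The red-green convention described above can be sketched as a mapping from a log2 fold change to an RGB color, with red for upregulation and green for downregulation. This is an illustration under stated assumptions, not the IntOmix implementation: the function name `fold_change_color`, the saturation cap of 2.0, and the linear fade from white are all hypothetical choices.

```python
def fold_change_color(log2fc: float, cap: float = 2.0) -> tuple:
    """Map a log2 fold change to an (R, G, B) tuple on a red-green scale.

    Positive values shade towards red, negative values towards green,
    and zero stays white. Magnitudes at or beyond `cap` are fully saturated.
    """
    # Strength of the color in [0, 1], clamped at the cap (assumed value).
    strength = min(abs(log2fc), cap) / cap
    # The two non-dominant channels fade from 255 (white) down to 0.
    fade = round(255 * (1 - strength))
    if log2fc > 0:
        return (255, fade, fade)  # upregulated: shade of red
    if log2fc < 0:
        return (fade, 255, fade)  # downregulated: shade of green
    return (255, 255, 255)  # unchanged: white


if __name__ == "__main__":
    for value in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(value, fold_change_color(value))
```

A two-color scale like this keeps the direction of change unambiguous, because no intermediate hue can be mistaken for either direction.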

April 24th, 2020

New

  • COVID-19 (Transcriptional datasets for SARS viruses, viral infections, and therapeutics for novel coronavirus) repository has been added to the Data Lake.