Microbiome data is now supported on OmixAtlases. The data can be searched and queried on Polly and can be downloaded in the form of a BIOM file.
‘Cancer Stage’ is the newest metadata field and it has been curated for 974 datasets on GEO.
Reports present in workspaces can now be linked to datasets in OmixAtlases.
‘Select All’ is now an option for the filter results while searching and filtering datasets on the OmixAtlas UI.
Identify datasets that do not contain data matrices on the OmixAtlases through the new ‘Metadata-only’ labels.
Send requests to obtain data matrices for specific datasets.
The Nextflow integration with Polly CLI allows users to run any Nextflow bioinformatics pipeline with parallel processing on Polly’s scalable computational infrastructure.
OmixAtlases are now available as shareable links through Workspaces.
797 clinical datasets from MIMIC are now available on Polly, curated for four metadata fields: Drug, Dose, Frequency, and Strength.
With the recent Polly Python version release, users can now:
Copy files/folders in workspaces from one workspace to another.
Add datasets to and delete datasets from OmixAtlases.
Link reports to a dataset on any OmixAtlas, and fetch the reports linked to it.
Get auto-generated metadata summaries for datasets present on the GEO OmixAtlas by giving the GEO accession ID as an input. This helps to improve findability and estimate the relevance of the dataset.
Cell line recommendations are now available to help select multiple related cell lines. Users can start the search with a disease, tissue or cell line and receive recommendations for related or matching cell lines.
The recommendation functionality is also available for disease, tissue and cell line fields in sample-level metadata queries.
New datasets from the following repositories have been added to Polly OmixAtlases in the last month:
DepMap OmixAtlas - 4312
GEO OmixAtlas - 922
Single-Cell OmixAtlas - 240
These are the major Polly Python updates to existing functionalities:
The complete schema for tables in an OmixAtlas can be fetched in the form of a dictionary.
Schema for feature level metadata is now available.
Schema for single cell and GWAS data can be retrieved.
Ontology recommendations for sample-level queries have been enabled.
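As a rough illustration of working with a schema returned as a dictionary, the sketch below filters a nested dictionary by source and data type. The dictionary layout, the field names and the `filter_schema` helper are assumptions made for this example; they are not the structure or API that polly-python actually exposes.

```python
# Hypothetical schema layout: table -> source -> data_type -> {field: attributes}.
# This shape is an assumption for illustration, not polly-python's actual output.
schema = {
    "datasets": {
        "geo": {
            "transcriptomics": {
                "disease": {"type": "text", "is_array": True},
                "tissue": {"type": "text", "is_array": True},
            },
            "single_cell": {
                "cell_type": {"type": "text", "is_array": True},
            },
        },
    },
    "samples": {
        "geo": {
            "transcriptomics": {
                "age": {"type": "text", "is_array": False},
            },
        },
    },
}


def filter_schema(schema, source, data_type):
    """Return, per table, only the fields defined for one source and data type."""
    filtered = {}
    for table, sources in schema.items():
        fields = sources.get(source, {}).get(data_type)
        if fields:  # skip tables with no fields for this combination
            filtered[table] = fields
    return filtered


transcriptomics_fields = filter_schema(schema, "geo", "transcriptomics")
for table, fields in transcriptomics_fields.items():
    print(table, sorted(fields))
```

Keeping the schema as a plain dictionary means ordinary dictionary operations are enough to narrow it down before building a query.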
Users can now host Docker-based applications and add their own notebook environments using Polly CLI.
Users can launch open-source notebooks from GitHub directly on Polly’s compute environment.
Users can save a dataset from an OmixAtlas to a workspace, as well as upload/download files and folders to/from a workspace, using polly-python.
Users can filter the schema in an OmixAtlas by source and data_type using polly-python.
Users can check the installed version of polly-python from inside a Python shell or a Jupyter notebook cell.
Users can create cohorts for TCGA Transcriptomics and Mutation data using polly-python.
Users can now create an OmixAtlas using polly-python.
Users can query datasets, samples and features across multiple OmixAtlases at once using polly-python. Find examples here.
Recommended disease ontologies are displayed when a user queries the disease field in an OmixAtlas.
Users can now copy/move files (except analysis files) or folders across workspaces and folders on the Polly frontend.
Users can now launch notebooks situated within folders or sub-folders.
Users can also filter folders from workspace contents.
polly-python users can now access schema functions via both repo_id and repo_name.
polly-python users can convert files from .gct to .maf format in the TCGA and cBioPortal repositories using a file format converter function.
211 Single Cell datasets were added to OmixAtlas with cluster-level cell type annotation.
The platform field was updated for 3.5k GEO datasets on Polly.
Users can now also preview tsv files along with other file types on Polly.
Cell types were added for the fetal single cell atlas on Polly.
Users can create workspaces and fetch the list of workspaces using polly-python.
Change in authentication process: until now, users had to authenticate each class separately. This release adds a global authentication mechanism that lets users authenticate multiple classes (such as OmixAtlas and Workspaces) in a single step.
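One way to picture this change is a shared session that every class reads from. The sketch below is purely illustrative: `Session`, `OmixAtlasClient` and `WorkspacesClient` are hypothetical names rather than the polly-python API, and no real token validation is performed.

```python
class Session:
    """Holds one token shared by every client class (hypothetical design)."""

    _token = None

    @classmethod
    def authenticate(cls, token):
        if not token:
            raise ValueError("token must be a non-empty string")
        cls._token = token

    @classmethod
    def token(cls):
        if cls._token is None:
            raise RuntimeError("call Session.authenticate(token) first")
        return cls._token


class OmixAtlasClient:
    def __init__(self):
        # Picks up the shared token instead of asking for its own.
        self.token = Session.token()


class WorkspacesClient:
    def __init__(self):
        self.token = Session.token()


Session.authenticate("example-token")  # the single authentication step
atlas = OmixAtlasClient()
workspaces = WorkspacesClient()
print(atlas.token == workspaces.token)
```

With this arrangement, creating a client before authenticating fails early with a clear error instead of failing later on the first request.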
Users can publish a personal notebook in the Polly workspace environment and analyse vcf files using the Hail docker.
Preview of all standard file types is available on the Polly UI. File types such as xls/xlsx, pdf, html, csv, png/jpg/gif and ipynb can be opened directly on Polly without a third-party service.
2,792 datasets and 74,713 samples have been annotated for tissue and cell line tags.
Strain has been added as a queryable field at dataset-level for GEO datasets.
A bug in Sort by Relevance when searching over description in OmixAtlas table view was fixed. Other sort related bugs have also been fixed.
360k datasets from the UK Biobank were added on Polly.
149k ImmPort lab datasets were added on Polly.
59k RCSB datasets were added on Polly.
Curated cell types for 229 Single Cell datasets were added to OmixAtlas.
Sample-level age labels were added to all datasets for TCGA and GEO.
Workspace contents like files, analyses, reports can now be sorted in workspaces based on name, created date, last modified, author, and type.
15,000 Microarray and WES (Whole Exome Sequencing) datasets from PPMI have been curated and added to Polly.
Dataset Overview (containing Title, Publication, Abstract, Tags for the Data, sample information as summary plots and table, processed data as a table) for every dataset can be viewed using the “View Details” option beside datasets in OmixAtlases.
Dataset files (gct, h5ad, vcf) can be downloaded from the Options Menu in the Card view or from the View Details page.
The Request for Dataset option is available at multiple places within the OmixAtlases.
Resolved a bug in El-MAVEN instance termination.
Curation information of 45,000 datasets from gnomAD and DepMap has been updated.
In addition to the Liver OmixAtlas, Polly now has GDC, GEO, cBioPortal, PharmacoDB, LINCS and Metabolomics OmixAtlases.
Added 17,500 datasets to Polly.
Genotype, age and gender annotations were added to 3,900,000 GEO samples.
New compute machines Mi5xlarge (32 vCPU, 250GB RAM), Mi6xlarge (64 vCPU, 500GB RAM), Mi7xlarge (64 vCPU, 970GB RAM), GPUsmall (1 GPU, 8 vCPU, 60GB RAM) and GPUmedium (4 GPU, 32 vCPU, 240GB RAM) were added to El-MAVEN.
Introduced Polly Files (beta version), a desktop application for transferring files between a local computer and Polly Workspaces.
Resolved a bug that caused an error in SQL query on “’” expansion.
gnomAD was enriched with 96,000 new datasets of WES and WGS type.
Fixed table view bugs and enhanced UI on OmixAtlas.
New datasets totalling 48,000 were added to ImmPort, HPA, CPTAC and GTEx.
Auto-curated tags totalling 770,000 were added to Polly datasets.
A bug affecting folder deletion was fixed.
Introduced the Polly Python library, which enables search across dataset-, sample- and feature-level metadata from any computational environment through code.
Introduced “View Only Access” on Polly Workspaces – an enterprise grade permission giving more control to admins.
Enabled Voila Dashboards within Polly Notebooks.
Introduced an application resource monitor on El-MAVEN, enabling users to monitor the progress of a job and decide whether a bigger machine is required.
Over 155,000 datasets were added to LINCS OmixAtlas on Polly.
Liver OmixAtlas released.
June 18th, 2021
Added 12,200 new curated transcriptomics and single cell datasets to various Data Lakes.
June 4th, 2021
Made the Dual Mode Visualization Application and the Untargeted Pipeline Application lighter for heavy datasets to avoid memory leakage.
Added 3,350 new curated transcriptomics and single cell datasets to various Data Lakes.
Resolved an issue affecting the Lipidomics, Dual Mode and Polly El-MAVEN Applications.
May 21st, 2021
Introduced Google Slides Integration with Polly Notebooks.
Added a Reporting feature using Markdown in the Dual Mode Visualization Application.
Added 23,350 new curated transcriptomics and single cell datasets to various Data Lakes.
Added a Scree Plot under Quality Check in the Dual Mode Visualization Application.
Resolved the forgot-password issue.
May 7th, 2021
The latest beta version of El-MAVEN is now available on Polly.
Added 25,120 new curated transcriptomics and single cell datasets to various Data Lakes.
Added two-way ANOVA capability to the Dual Mode Visualization Application, along with the ability to combine multiple conditions or cohorts when performing differential expression.
April 23rd, 2021
Introduced an Admin Dashboard so that account administrators can conveniently manage their accounts.
Added 10,120 new curated transcriptomics and single cell datasets to various Data Lakes.
HTML as a data file can now be opened through Workspace itself.
Resolved an issue to make Dropbox work seamlessly with Workspaces.
April 9th, 2021
Created Shiny and Studio applications for feature-level search of GEO Datasets.
Added 11,177 new curated transcriptomics datasets to TCGA Data Lake.
PDF as a data file can now be opened through Workspace itself.
Resolved an issue to make Google Drive work seamlessly with Workspaces.
March 26th, 2021
TEDDY (The Environmental Determinants of Diabetes in the Young) and DepMap (Dependency Map) Data Lakes have been added on Polly.
Introduced an option to directly export data to the workspace from a Studio Preset.
Added 37,177 new curated datasets corresponding to various omics to different Data Lakes.
Updated Polly Login User Interface.
Added additional filters to TEDDY Data Lake.
Resolved issue with app hosting infrastructure to increase stability of apps for better user experience.
March 12th, 2021
Introduced a Docker building feature on Polly CLI which enables users to build dockers, check their build status and logs, and push dockers to Polly.
Added 11,470 new curated transcriptomics and single cell datasets to different Data Lakes.
Improved access to datasets within OmixWiki, including metadata filtering options.
February 26th, 2021
Introduced functionality that enables users to host their own applications on Polly using Polly CLI.
Enabled feature level querying for GEO Data Lake.
Added Genomics docker for variant calling and annotation.
Added a new notebook environment for Genomics Variant Analysis.
Enabled partial string search for dataset id in the search bar.
Added 20,096 new curated transcriptomics and single cell datasets to different Data Lakes.
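The partial string search for dataset ids mentioned in this release can be pictured as a case-insensitive substring match. This is a minimal sketch under that assumption; `search_dataset_ids` is a hypothetical helper, the ids are made up for the example, and Polly's actual search implementation may work differently.

```python
def search_dataset_ids(dataset_ids, fragment):
    """Return the ids that contain the fragment, ignoring case."""
    needle = fragment.strip().lower()
    if not needle:
        return []  # an empty query matches nothing rather than everything
    return [ds_id for ds_id in dataset_ids if needle in ds_id.lower()]


# Made-up ids in a GEO-like format, used only for this example.
ids = ["GSE100001_GPL570", "GSE100002_GPL96", "GSE200001_GPL570"]
print(search_dataset_ids(ids, "gse1000"))
print(search_dataset_ids(ids, "GPL570"))
```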
February 12th, 2021
Introduced the status page for real time updates on Polly’s status, downtime, incidents, and maintenance.
Added auto-run feature for selected Studio Presets.
Enabled component updating and versioning by component creator.
Added 11,580 new curated transcriptomics datasets to GEO and LINCS Data Lakes.
Updated the UI of the Data Studio visualization dashboard for better visibility.
Updated all notebook dockers with the latest version of discoverpy (0.0.10).
Introduced the option to make dockers on Polly public by adding a public docker domain.
Welcome screen now displays the username.
Decreased launch time for applications and notebooks through horizontal pod scaling and buffering.
Fixed an error with landing on Discover after logging in.
Fixed an error in priority assignment of automated jobs.
Fixed an error when renaming files after upload.
Fixed a 404 error in the Metabolomics Data Lake.
Integrated documentation into every application.
September 25th, 2020
Introduced the Labeled LC-MS Analysis preset for natural abundance correction and visualization of single or dual labeled LC-MS data, combined with an interactive, customizable and shareable reporting dashboard.
Changed the optimized color palette in IntOmix from a red-yellow-green scale to a more intuitive red-green scale. All upregulated metabolites or genes are represented by a shade of red and downregulated metabolites or genes by a shade of green.
Changed the non-optimized color palette in IntOmix from a pink-purple scale to a red-green scale to remove ambiguity.
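The red-green convention described above can be sketched as a mapping from a log fold change to an RGB colour. The `fold_change_color` function, the clipping limit of 2.0 and the white midpoint are assumptions made for this example; the exact palette IntOmix uses is not specified in these notes.

```python
def fold_change_color(log_fc, limit=2.0):
    """Map a log fold change to an (r, g, b) tuple on a red-green scale.

    Positive values (upregulated) shade towards red, negative values
    (downregulated) towards green, and zero maps to white.
    """
    clipped = max(-limit, min(limit, log_fc))
    intensity = abs(clipped) / limit       # 0.0 (no change) to 1.0 (at limit)
    faded = round(255 * (1 - intensity))   # value of the channels that fade out
    if clipped > 0:
        return (255, faded, faded)         # shades of red
    if clipped < 0:
        return (faded, 255, faded)         # shades of green
    return (255, 255, 255)


for value in (-2.0, -1.0, 0.0, 1.0, 3.0):
    print(value, fold_change_color(value))
```

Using only two hues means the direction of change is read from the colour and the magnitude from its saturation.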
April 24th, 2020
The COVID-19 repository (transcriptional datasets for SARS viruses, viral infections, and therapeutics for novel coronavirus) has been added to the Data Lake.