Support retrieval of scxa anndata files by accession #106

@dosumis

Status: draft

Question: Does this belong in VFB_connect or in a separate library? Probably the latter.

This works:

import os
import warnings

import anndata
import requests


def read_h5ad_from_scxa(accession, dir='./'):
    """
    Retrieve an anndata file from scxa by accession. Save the file to disk and return an AnnData object.

    ARGS:
    * accession: scxa experiment accession; used to build the download URL and the local filename.

    KWARGS:
    * dir: Optionally specify the directory where the anndata file should be stored.

    Returns False (after issuing a warning) if the download fails.
    """
    filename = accession + '.project.h5ad'
    url = "http://ftp.ebi.ac.uk/pub/databases/microarray/data/atlas/sc_experiments/%s/%s" % (accession, filename)
    r = requests.get(url)
    if r.status_code != 200:
        warnings.warn("request failed: %s %s" % (r.status_code, r.reason))
        return False
    # os.path.join works whether or not dir ends with a path separator
    filepath = os.path.join(dir, filename)
    with open(filepath, 'wb') as h5ad:
        h5ad.write(r.content)
    return anndata.read_h5ad(filepath)

The result is still quite far from the CxG (CELLxGENE) standard for obs and var, e.g.:

var['gene_name'] --> feature_name

authors_cell_type_-ontology_labels-_ontology_labels. --> cell_type

authors_cell_type_-_ontology_labels_ontology --> cell_type_ontology_term_id, using CURIEs in place of PURLs for the values
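
A rough sketch of how the renaming and the PURL-to-CURIE conversion could be done is below. It is not a full CxG conversion. The helper names (purl_to_curie, rename_to_cxg) are hypothetical, the column map is passed in by the caller rather than hard-coded, and it assumes the ontology values are OBO-style PURLs starting with http://purl.obolibrary.org/obo/.

```python
# Assumption: ontology term values are OBO PURLs with this prefix.
OBO_PREFIX = "http://purl.obolibrary.org/obo/"


def purl_to_curie(purl):
    """Convert an OBO PURL (.../obo/CL_0000540) to a CURIE (CL:0000540)."""
    if not isinstance(purl, str) or not purl.startswith(OBO_PREFIX):
        return purl  # leave missing values and non-OBO values untouched
    local_id = purl[len(OBO_PREFIX):]
    prefix, sep, number = local_id.partition("_")
    return prefix + ":" + number if sep else purl


def rename_to_cxg(df, column_map, curie_columns=()):
    """
    Return a copy of a pandas DataFrame (e.g. adata.obs or adata.var) with
    columns renamed according to column_map, and with the values of any
    columns listed in curie_columns (names after renaming) converted
    from PURLs to CURIEs.
    """
    out = df.rename(columns=column_map)
    for col in curie_columns:
        if col in out.columns:
            out[col] = out[col].map(purl_to_curie)
    return out


# Hypothetical usage on an AnnData object returned by read_h5ad_from_scxa:
#
# adata.var = rename_to_cxg(adata.var, {"gene_name": "feature_name"})
# adata.obs = rename_to_cxg(
#     adata.obs,
#     {"authors_cell_type_-_ontology_labels_ontology": "cell_type_ontology_term_id"},
#     curie_columns=["cell_type_ontology_term_id"],
# )
```

Passing the column map as an argument keeps the sketch usable if the scxa column names differ between experiments; the exact source column names should be checked against the downloaded file.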
