Commit e822f37

get_instances: resolve subclass closure via Owlery, not direct INSTANCEOF
The previous Cypher used a single-step (i:Individual:has_image)-[:INSTANCEOF]->(p:Class {short_form: $id}), which only sees individuals *directly* typed as the queried class. For any parent class (e.g. FBbt_00007484 'mushroom body intrinsic neuron', whose individuals are typed as Kenyon cell, gamma Kenyon cell, etc.) the query returned 0 rows even though SOLR has dozens of image entries on file. v2 (and tester) reported ListAllAvailableImages empty for mushroom body intrinsic neuron, larval Kenyon cell, neuron root, etc.

Switch to the canonical VFBquery idiom: ask Owlery for instance IDs matching the OWL class expression first, then fetch image metadata from Neo4j for those specific IDs. Owlery's reasoner handles the subclass closure with proper OWL semantics (equivalence classes, defined classes, etc.), matching how the Neurons*Here family, ImagesThatDevelopFrom, and TractsNervesInnervatingHere already work via _owlery_query_to_results(query_instances=True).

Side benefit: the per-row 'parent' column now reports the actual class each instance is typed as (often a subclass of the queried class) rather than the queried class itself, which is better for v2 display.

No test changes needed; existing tests on FBbt_00003748 (medulla) still pass, though the count may go up because subclasses are now included.
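To make the difference concrete, here is a minimal in-memory sketch of direct typing versus subclass closure. It is not the VFBquery or Owlery implementation: the class labels come from the message above, while the individual IDs (`ind_1` etc.), the two dictionaries, and all function names are hypothetical. It also only walks asserted subclass edges, whereas Owlery's reasoner additionally handles equivalence and defined classes.

```python
from collections import deque

# Toy asserted hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "gamma Kenyon cell": "Kenyon cell",
    "Kenyon cell": "mushroom body intrinsic neuron",
}

# Hypothetical individuals and the class each one is directly typed as.
INSTANCE_OF = {
    "ind_1": "Kenyon cell",
    "ind_2": "gamma Kenyon cell",
    "ind_3": "gamma Kenyon cell",
}


def direct_instances(cls):
    """Mimic the old single-edge INSTANCEOF match."""
    return sorted(i for i, c in INSTANCE_OF.items() if c == cls)


def subclass_closure(cls):
    """Return cls plus every class reachable downward via subclass edges."""
    children = {}
    for child, parent in SUBCLASS_OF.items():
        children.setdefault(parent, []).append(child)
    seen, queue = {cls}, deque([cls])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def closure_instances(cls):
    """Mimic the new behaviour: instances of cls or any of its subclasses."""
    classes = subclass_closure(cls)
    return sorted(i for i, c in INSTANCE_OF.items() if c in classes)


print(direct_instances("mushroom body intrinsic neuron"))
print(closure_instances("mushroom body intrinsic neuron"))
```

In this toy data no individual is typed directly as the parent class, so the direct match is empty while the closure match returns all three individuals. That is the same shape as the empty ListAllAvailableImages report.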
1 parent b521a0b commit e822f37

1 file changed

Lines changed: 45 additions & 14 deletions

src/vfbquery/vfb_queries.py

@@ -2051,31 +2051,62 @@ def get_term_info(short_form: str, preview: bool = True, force_refresh: bool = F
 @with_solr_cache('instances')
 def get_instances(short_form: str, return_dataframe=True, limit: int = -1):
     """
-    Retrieves available instances for the given class short form.
-    Uses SOLR term_info data when Neo4j is unavailable (fallback mode).
+    Retrieves available image-bearing instances for the given class short form.
+
+    Subclass closure is resolved via Owlery's reasoner (consistent with the
+    `Neurons*Here`, `ImagesThatDevelopFrom`, `TractsNervesInnervatingHere`
+    family that all use ``_owlery_query_to_results(query_instances=True)``).
+    Per-instance image metadata is then fetched from Neo4j in a single
+    batched query keyed on the Owlery-returned IDs.
+
+    Falls back to the SOLR ``term_info`` ``anatomy_channel_image`` extract
+    if either Owlery or Neo4j is unavailable.
+
     :param short_form: short form of the class
+    :param return_dataframe: return a pandas DataFrame if True, otherwise a formatted dict
     :param limit: maximum number of results to return (default -1, returns all results)
     :return: results rows
     """
 
     try:
-        # Try to use original Neo4j implementation first
-        # Get the total count of rows
-        count_query = f"""
-        MATCH (i:Individual:has_image)-[:INSTANCEOF]->(p:Class {{ short_form: '{short_form}' }}),
-              (i)<-[:depicts]-(:Individual)-[r:in_register_with]->(:Template)
-        RETURN COUNT(r) AS total_count
-        """
-        count_results = vc.nc.commit_list([count_query])
-        count_df = pd.DataFrame.from_records(get_dict_cursor()(count_results))
-        total_count = count_df['total_count'][0] if not count_df.empty else 0
+        # Step 1: ask Owlery for instance IDs matching the class expression.
+        # Owlery's reasoner handles the subclass closure via OWL inference,
+        # which is the canonical VFBquery idiom (same path used by the
+        # `Neurons*Here`, `ImagesThatDevelopFrom`, `TractsNervesInnervatingHere`
+        # etc. via `_owlery_query_to_results(..., query_instances=True)`).
+        #
+        # Why this matters: the previous Cypher used
+        # `(i)-[:INSTANCEOF]->(p:Class {short_form: $id})` — a single-edge
+        # match that only sees individuals *directly* typed as the queried
+        # class. For any parent class (e.g. mushroom body intrinsic neuron
+        # FBbt_00007484, whose individuals are typed Kenyon cell etc.) the
+        # query returned 0 even though SOLR had dozens of image rows on
+        # file. Asking Owlery first gives us the same subclass-aware result
+        # the legacy v2 XMI chain produced, with proper OWL semantics
+        # (equivalence classes, defined classes, etc.).
+        owl_query = f"<{_short_form_to_iri(short_form)}>"
+        instance_ids = vc.vfb.oc.get_instances(query=owl_query, query_by_label=False)
+        if not instance_ids:
+            if return_dataframe:
+                return pd.DataFrame()
+            return {
+                "headers": _get_instances_headers(),
+                "rows": [],
+                "count": 0,
+            }
 
-        # Define the main Cypher query
+        # Step 2: fetch image metadata for those instances from Neo4j.
         # Pattern: Individual ← depicts ← TemplateChannel → in_register_with → TemplateChannelTemplate → depicts → ActualTemplate
+        # The `parent` column now reports the *actual* class each instance
+        # is typed as (often a subclass of the queried class) rather than
+        # the queried class itself — more useful for v2 display.
+        total_count = len(instance_ids)
+
         query = f"""
-        MATCH (i:Individual:has_image)-[:INSTANCEOF]->(p:Class {{ short_form: '{short_form}' }}),
+        MATCH (i:Individual:has_image)-[:INSTANCEOF]->(p:Class),
               (i)<-[:depicts]-(tc:Individual)-[r:in_register_with]->(tct:Template)-[:depicts]->(templ:Template),
               (i)-[:has_source]->(ds:DataSet)
+        WHERE i.short_form IN {instance_ids!r}
         OPTIONAL MATCH (i)-[rx:database_cross_reference]->(site:Site)
         OPTIONAL MATCH (ds)-[:license|licence]->(lic:License)
         RETURN i.short_form as id,
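The added `WHERE i.short_form IN {instance_ids!r}` line relies on Python's `!r` conversion to turn the ID list into a Cypher list literal. The sketch below shows what that interpolation produces. It assumes `instance_ids` is a plain Python list of strings without quote characters; the return type of the Owlery client is not shown in this diff, and the helper name and the IDs are hypothetical.

```python
def render_where(instance_ids):
    """Render the IN filter the same way the f-string in the diff does."""
    # `!r` applies repr(), so a list of str becomes ['a', 'b'] with single quotes.
    return f"WHERE i.short_form IN {instance_ids!r}"


# Hypothetical IDs, used only to show the rendered text.
print(render_where(["VFB_a", "VFB_b"]))
```

An empty list would render as `IN []` and match nothing, but the early return on `if not instance_ids:` means the query is never built in that case.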

0 commit comments

Comments
 (0)