Skip to content

Consolidate DataFetchers#626

Merged
TheJeran merged 7 commits into
mainfrom
jp/getarray
Apr 15, 2026
Merged

Consolidate DataFetchers#626
TheJeran merged 7 commits into
mainfrom
jp/getarray

Conversation

@TheJeran
Copy link
Copy Markdown
Collaborator

@TheJeran TheJeran commented Apr 15, 2026

This addresses #602 by merging the logic of the netcdf and zarr fetchers. I dunno if I should make the fetcher objects a class or keep it as it is. Cause the current approach is a little odd.

I've left my code for a class approach in if it makes sense to switch later.

@TheJeran TheJeran changed the title Jp/getarray Consolidate DataFetchers Apr 15, 2026
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the data fetching architecture by introducing a centralized GetArray utility and dedicated fetchers for Zarr and NetCDF data, alongside adding necessary security headers for SharedArrayBuffer in the Next.js configuration. The review identifies several issues in the new implementation, including a bug where a fillValue of zero is incorrectly handled, inconsistent variable usage during chunk fetching when overrides are present, and potential crashes from missing null checks on LRU cache lookups. Further improvements include removing unnecessary await keywords on synchronous functions, deleting debug logs, and cleaning up redundant code.

Comment thread src/components/zarr/GetArray.ts Outdated
Comment thread src/components/zarr/GetArray.ts Outdated
Comment thread src/components/zarr/GetArray.ts Outdated
Comment on lines +102 to +106
const tempChunk = cache.get(`${cacheBase}_chunk_${id}`);
tempChunk.scaling = scalingFactor;
RescaleArray(tempChunk.data, delta);
cache.set(`${cacheBase}_chunk_${id}`, tempChunk);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There is no check to ensure tempChunk exists before accessing its properties. Since the cache uses an LRU policy, entries can be evicted at any time. Accessing tempChunk.scaling on an evicted entry will cause a runtime crash.

                            const tempChunk = cache.get(`${cacheBase}_chunk_${id}`);
                            if (tempChunk) {
                                tempChunk.scaling = scalingFactor;
                                RescaleArray(tempChunk.data, delta);
                                cache.set(`${cacheBase}_chunk_${id}`, tempChunk);
                            }

Comment thread src/components/plots/AnalysisWG.tsx Outdated
Comment on lines +101 to +102
console.log("thisone")
const inputArray = analysisMode ? analysisArray : await GetCurrentArray(analysisStore)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Remove the debug log and the unnecessary await on the synchronous GetCurrentArray function.

Suggested change
console.log("thisone")
const inputArray = analysisMode ? analysisArray : await GetCurrentArray(analysisStore)
const inputArray = analysisMode ? analysisArray : GetCurrentArray(analysisStore)

Comment thread src/components/plots/AnalysisWG.tsx Outdated
const is2D = outputShape.length === 2
async function Analyze(){
const dataArray = analysisMode ? analysisArray : GetCurrentArray(analysisStore)
const dataArray = analysisMode ? analysisArray : await GetCurrentArray(analysisStore)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Remove the unnecessary await on the synchronous GetCurrentArray function.

Suggested change
const dataArray = analysisMode ? analysisArray : await GetCurrentArray(analysisStore)
const dataArray = analysisMode ? analysisArray : GetCurrentArray(analysisStore)

Comment on lines +125 to +207
export class ZarrFetcher {
private outVar: zarr.Array<zarr.DataType, any> | undefined;
private variable: string;

constructor(variable: string) {
this.variable = variable;
}

async init(){
const { currentStore } = useZarrStore.getState();
const group = await currentStore;
const tempOutVar = await zarr.open(group.resolve(this.variable), { kind: "array" });
this.outVar = tempOutVar;
}

async getMetadata(): Promise<any> {
if (!this.outVar) {
await this.init();
if (!this.outVar){
throw new Error(`Init failed`);
}
}
if (!this.outVar.is("number") && !this.outVar.is("bigint")) {
throw new Error(`Unsupported data type: ${this.outVar.dtype}`);
}

const symbols = Object.getOwnPropertySymbols(this.outVar);
const contextSymbol = symbols.find(s => s.toString().includes("zarrita.context"));
const fillValue = contextSymbol && !Number.isNaN((this.outVar as any)[contextSymbol]?.fill_value)
? (this.outVar as any)[contextSymbol].fill_value
: NaN;
return {
shape: this.outVar.shape,
chunkShape: GetSize(this.outVar)[2],
fillValue,
dtype: this.outVar.dtype,
};
}

async fetchChunk({
rank,
chunkShape,
x,
y,
z,
xDimIndex,
yDimIndex,
zDimIndex,
idx4D,
}: any): Promise<FetchOutput> {
if (!this.outVar) {
await this.init();
if (!this.outVar){
throw new Error(`Init failed`);
}
}

const chunkSlice = new Array(rank).fill(0);
chunkSlice[xDimIndex] = zarr.slice(x * chunkShape[xDimIndex], (x + 1) * chunkShape[xDimIndex]);
chunkSlice[yDimIndex] = zarr.slice(y * chunkShape[yDimIndex], (y + 1) * chunkShape[yDimIndex]);
if (zDimIndex >= 0) {
chunkSlice[zDimIndex] = zarr.slice(z * chunkShape[zDimIndex], (z + 1) * chunkShape[zDimIndex]);
}
if (rank >= 4) {
chunkSlice[0] = idx4D;
}

const chunk = await fetchWithRetry(
() => zarr.get(this.outVar!, chunkSlice),
`variable ${this.variable}`,
useGlobalStore.getState().setStatus
);

if (!chunk || chunk.data instanceof BigInt64Array || chunk.data instanceof BigUint64Array) {
throw new Error("BigInt arrays not supported.");
}
return {
data: chunk.data as Float32Array,
shape: chunkShape,
stride: chunk.stride,
};
}
} No newline at end of file
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The ZarrFetcher class appears to be a redundant implementation marked as a 'possibility' and is currently unused. It should be removed to keep the codebase clean.

Comment thread next.config.ts
Comment on lines +45 to +61
async headers() { // Needed for SharedArrayBuffer. Security reasons won't let sharedArrayBuffer
return [
{
source: '/(.*)',
headers: [
{
key: 'Cross-Origin-Opener-Policy',
value: 'same-origin',
},
{
key: 'Cross-Origin-Embedder-Policy',
value: 'require-corp',
},
],
},
]
},
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is leftover from the SharedBufferArray that will be used for the workers.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should try to add this in the fetch options of the zarrita fetch store. And maybe remove for now.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm not an expert almost certain this is unrelated to Zarrita.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it is a fetch operation, then is zarrita. And we should pass headers directly there, instead of setting this globally. Similar to, https://zarrita.dev/packages/storage.html#adding-authentication-headers

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me rephrase. Although I'm not certain what this is doing. It's unrelated to zarrita.

@TheJeran TheJeran linked an issue Apr 15, 2026 that may be closed by this pull request
@TheJeran TheJeran marked this pull request as ready for review April 15, 2026 08:50
@TheJeran TheJeran merged commit 7097f4f into main Apr 15, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consolidate Datafetching Code

2 participants