
Add one-shot analysis entry point: analyze(path) or analyze(session_id, dataset_handle) #32

@rajeeja

Problem

The most natural user request, "analyze this dataset", currently requires the user (or agent) to know and orchestrate six separate tool calls:

  1. inspect_mesh
  2. validate_dataset
  3. inspect_variable
  4. calculate_area
  5. calculate_zonal_mean
  6. plot_variable

A well-designed tool should not need this much orchestration. The user should be able to make one request and get a complete answer.

Proposed tool: analyze_dataset

def analyze_dataset(
    grid_path: str | None = None,
    data_path: str | None = None,
    session_id: str | None = None,
    dataset_handle: str | None = None,
    use_remote: bool = False,
) -> dict:
    """Run the full analysis pipeline and return one structured result."""
    ...

Runs the full pipeline internally:

  • Inspect mesh topology
  • Validate dataset
  • Discover variables
  • Calculate area statistics
  • Compute zonal mean for first face-centered variable
  • Render mesh and variable plots (as base64 PNG)
  • Return everything in one structured result
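
The steps above could be chained roughly as follows. This is a minimal sketch, not the project's implementation: only the six tool names come from the list in the Problem section. The `run_pipeline` and `encode_png` helpers, the `tools` mapping, the convention that each tool takes a single `source` dict, the placeholder file names, and the choice to record a failed step and continue are all assumptions made for illustration.

```python
from __future__ import annotations

import base64
from typing import Any, Callable, Mapping

# Step order follows the tool list in the Problem section. The callables
# bound to these names below are hypothetical stand-ins for the real tools.
PIPELINE_STEPS = (
    "inspect_mesh",
    "validate_dataset",
    "inspect_variable",
    "calculate_area",
    "calculate_zonal_mean",
    "plot_variable",
)


def encode_png(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a base64 ASCII string (JSON-safe)."""
    return base64.b64encode(png_bytes).decode("ascii")


def run_pipeline(
    tools: Mapping[str, Callable[[dict[str, Any]], Any]],
    source: dict[str, Any],
) -> dict[str, Any]:
    """Run every step in order and collect outputs and errors in one dict."""
    result: dict[str, Any] = {"source": source, "steps": {}, "errors": {}}
    for name in PIPELINE_STEPS:
        try:
            result["steps"][name] = tools[name](source)
        except Exception as exc:
            # Assumption: a failed step is recorded and the rest still run.
            result["errors"][name] = f"{type(exc).__name__}: {exc}"
    return result


# Usage with stub tools; a real caller would pass the actual tool functions.
stub_tools = {name: (lambda src, n=name: f"{n} ok") for name in PIPELINE_STEPS}
# Stand-in for a rendered plot: just the 8-byte PNG signature.
stub_tools["plot_variable"] = lambda src: encode_png(b"\x89PNG\r\n\x1a\n")

# "grid.nc" and "data.nc" are placeholder file names.
report = run_pipeline(stub_tools, {"grid_path": "grid.nc", "data_path": "data.nc"})
```

Keeping the step order in one tuple makes the run deterministic, and collecting errors per step lets the caller receive a partial report instead of nothing when one stage fails.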

Why this matters

This is what makes the difference between a library wrapper and a scientific assistant. A climate scientist with no uxarray knowledge should be able to point at a file and get a complete picture of their data in one call.

Relationship to run_scientific_agent

run_scientific_agent is the autonomous LLM-driven version of this. analyze_dataset is the deterministic, always-works version. Both should exist — one for simple use, one for complex reasoning.
