| 1 | +Fetches observations for a statistical variable from Data Commons. |
| 2 | + |
| 3 | +**CRITICAL: Always validate variable-place combinations first** |
| 4 | +- You **MUST** call `search_indicators` first to verify that the variable exists for the specified place |
| 5 | +- Only use DCIDs returned by `search_indicators` - never guess or assume variable-place combinations |
| 6 | +- This ensures data availability and prevents errors from invalid combinations |
| 7 | + |
| 8 | +This tool can operate in two primary modes: |
| 9 | +1. **Single Place Mode**: Get data for one specific place (e.g., "Population of California"). |
| 10 | +2. **Child Places Mode**: Get data for all child places of a certain type within a parent place (e.g., "Population of all counties in California"). |
| 11 | + |
| 12 | +### Core Logic & Rules |
| 13 | + |
| 14 | +* **Variable Selection**: You **must** provide the `variable_dcid`. |
| 15 | + * Variable DCIDs are unique identifiers for statistical variables in Data Commons and are returned by prior calls to the |
| 16 | + `search_indicators` tool. |
| 17 | + |
| 18 | +* **Place Selection**: You **must** provide the `place_dcid`. |
| 19 | + * **Important Note for Bilateral Data**: When fetching data for bilateral variables (e.g., exports from one country to another), |
| 20 | + the `variable_dcid` often encodes one of the places (e.g., `TradeExports_FRA` refers to exports *to* France). |
| 21 | + In such cases, the `place_dcid` parameter in `get_observations` should specify the *other* place involved in the bilateral relationship |
| 22 | + (e.g., the exporter country, such as 'USA' for exports *from* USA). |
| 23 | + The `search_indicators` tool's `places_with_data` field can help identify which place is the appropriate observation source for `place_dcid`. |
| 24 | + |
| 25 | +* **Mode Selection**: |
| 26 | + * To get data for the specified place (e.g., California), **do not** provide `child_place_type`. |
| 27 | + * To get data for all its children (e.g., all counties in California), you **must also** provide the `child_place_type` (e.g., "County"). |
| 28 | + **CRITICAL:** Before calling `get_observations` with `child_place_type`, you **MUST** first call `search_indicators` with child sampling to determine the correct child place type. |
| 29 | + **Child Type Determination Logic:** |
| 30 | + 1. Use the `dcid_place_type_mappings` field from the `search_indicators` response to examine the types of sampled child places |
| 31 | + 2. Use the type that is common to ALL sampled child places |
| 32 | + 3. If more than one type is common to all child places, use the most specific type |
| 33 | + 4. If there is no common type across all sampled child places, use the majority type (50%+ threshold) if there's a clear majority |
| 34 | + 5. If there is no common type and no clear majority, this tool cannot be called in Child Places Mode; fall back to one single-place `get_observations` call per place |
| 35 | + **Note:** If you used child sampling in `search_indicators` to validate variable existence, you should still get data for ALL children of that type, not just the sampled subset. |
| 36 | + |
| 37 | +* **Data Volume Constraint**: When using **Child Places Mode** (when `child_place_type` is set), you **must** be conservative with your date range to avoid requesting too much data. |
| 38 | + * Avoid requesting `'all'` data via the `date` parameter. |
| 39 | + * **Instead, you must either request the `'latest'` data or provide a specific, bounded date range.** |
| 40 | + |
| 41 | +* **Date Filtering**: The tool filters observations by date using the following priority: |
| 42 | + 1. **`date`**: The `date` parameter is optional and accepts one of the keywords 'all', 'latest', or 'range', or a single date string in the format 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD'. |
| 43 | + 2. **Date Range**: If `date` is set to 'range', you must specify a date range using `date_range_start` and/or `date_range_end`. |
| 44 | + * If only `date_range_start` is specified, then the response will contain all observations starting at and after that date (inclusive). |
| 45 | + * If only `date_range_end` is specified, then the response will contain all observations before and up to that date (inclusive). |
| 46 | + * If both are specified, the response contains observations within the provided range (inclusive). |
| 47 | + * Dates must be in `YYYY`, `YYYY-MM`, or `YYYY-MM-DD` format. |
| 48 | + 3. **Default Behavior**: If you do not provide **any** date parameters (`date`, `date_range_start`, or `date_range_end`), the tool will automatically fetch only the `'latest'` observation. |
| 49 | + |
| 50 | +Args: |
| 51 | + variable_dcid (str, required): The unique identifier (DCID) of the statistical variable. |
| 52 | + place_dcid (str, required): The DCID of the place. |
| 53 | + child_place_type (str, optional): The type of child places to get data for. **Use this to switch to Child Places Mode.** |
| 54 | + source_override (str, optional): An optional source ID to force the use of a specific data source. |
| 55 | + date (str, optional): An optional date filter. Accepts 'all', 'latest', 'range', or single date values of the format 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD'. Defaults to 'latest' if no date parameters are provided. |
| 56 | + date_range_start (str, optional): The start date for a range (inclusive). **Used only if `date` is set to 'range'.** |
| 57 | + date_range_end (str, optional): The end date for a range (inclusive). **Used only if `date` is set to 'range'.** |
| 58 | + |
| 59 | +Returns: |
| 60 | + The fetched observation data including: |
| 61 | + - `variable`: Details about the statistical variable requested. |
| 62 | + - `place_observations`: A list of observations, one entry per place. Each entry contains: |
| 63 | + - `place`: Details about the observed place (DCID, name, type). |
| 64 | + - `time_series`: A list of `(date, value)` tuples, where `date` is a string (e.g., "2022-01-01") and `value` is a float. |
| 65 | + - `source_metadata`: Information about the primary data source used. |
| 66 | + - `alternative_sources`: Details about other available data sources. |
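
The "Child Type Determination Logic" steps in the docstring can be sketched as a small helper. This is only an illustration under stated assumptions, not part of the change itself: it assumes `dcid_place_type_mappings` is a mapping from each sampled child DCID to a list of type names, it reads the "50%+ threshold" as strictly more than half of the sampled places, and the function name `choose_child_place_type` and the `most_specific_first` ranking are hypothetical, since the docstring does not say how specificity is determined.

```python
from collections import Counter
from typing import Mapping, Optional, Sequence


def choose_child_place_type(
    mappings: Mapping[str, Sequence[str]],
    most_specific_first: Sequence[str] = (),
) -> Optional[str]:
    """Pick a child_place_type from sampled child places, or None.

    `mappings` is assumed to look like {"geoId/06001": ["County"], ...}.
    `most_specific_first` is a hypothetical caller-supplied ranking used
    only when several types are common to all sampled places.
    """
    if not mappings:
        return None
    type_sets = [set(types) for types in mappings.values()]

    # Steps 1-2: types shared by every sampled child place.
    common = set.intersection(*type_sets)
    if len(common) == 1:
        return next(iter(common))

    # Step 3: several common types, so prefer the most specific one.
    if common:
        for place_type in most_specific_first:
            if place_type in common:
                return place_type
        return None  # no ranking available; the caller has to decide

    # Step 4: no common type, so look for a clear majority
    # (assumed here: strictly more than half, with no tie at the top).
    counts = Counter(t for types in type_sets for t in types)
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    top_type, top_count = ranked[0]
    if top_count * 2 > len(type_sets):
        return top_type

    # Step 5: no usable type; fall back to single-place calls.
    return None
```

A `None` result corresponds to step 5: child-place mode is not usable, and each place would be fetched with its own single-place call.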
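
The date rules in the docstring ("Date Filtering" and "Data Volume Constraint") could be checked on the caller side before `get_observations` is invoked. The helper below is a rough sketch, not the tool's own validation: `normalize_date_args` is a hypothetical name, and rejecting range bounds when `date` is not `'range'` is an assumption drawn from the "Used only if `date` is set to 'range'" wording. The pattern checks only the shape of a date string, not whether the month or day is valid.

```python
import re
from typing import Dict, Optional

# Shape check for 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' (not calendar validity).
_DATE_RE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")


def normalize_date_args(
    date: Optional[str] = None,
    date_range_start: Optional[str] = None,
    date_range_end: Optional[str] = None,
    child_place_type: Optional[str] = None,
) -> Dict[str, str]:
    """Return the date-related arguments to send, or raise ValueError."""
    bounds = {
        name: value
        for name, value in (
            ("date_range_start", date_range_start),
            ("date_range_end", date_range_end),
        )
        if value is not None
    }

    # Default behavior: no date parameters at all means 'latest'.
    if date is None and not bounds:
        return {"date": "latest"}

    if date != "range":
        if bounds:
            # Assumption: bounds are only meaningful with date='range'.
            raise ValueError("range bounds require date='range'")
        if date == "all" and child_place_type is not None:
            raise ValueError("avoid date='all' in Child Places Mode")
        if date not in ("all", "latest") and not _DATE_RE.fullmatch(date):
            raise ValueError(f"unsupported date value: {date!r}")
        return {"date": date}

    # date == 'range': at least one bound is needed, each in a valid format.
    if not bounds:
        raise ValueError("date='range' needs a start and/or end date")
    for name, value in bounds.items():
        if not _DATE_RE.fullmatch(value):
            raise ValueError(f"{name} has an unsupported format: {value!r}")
    return {"date": "range", **bounds}
```

For example, `normalize_date_args(date="range", date_range_start="2015")` yields the arguments for an open-ended range starting in 2015, while `normalize_date_args(date="all", child_place_type="County")` raises, mirroring the data volume constraint.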