-
Notifications
You must be signed in to change notification settings - Fork 16
mesh(iot): AWS IoT provisioning hardening (CA pin + thing-name regex + scoped policy) (8/9 of #195 split) #228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
b83c8f5
e50b873
d260de2
8ef82a8
22cb7f5
cfa24cc
07bf435
e89e0f4
bd38184
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -37,7 +37,23 @@ | |
| ``STRANDS_MESH_CAMERA_S3_PREFIX`` | ||
| Optional prefix inside the bucket (defaults to ``""``). | ||
| ``STRANDS_MESH_CAMERA_PRESIGN_TTL`` | ||
| Seconds the presigned GET URL stays valid (default 3600). | ||
| Seconds the presigned GET URL stays valid. Defaults to | ||
| :data:`DEFAULT_PRESIGN_TTL_SECONDS` (60s); clamped at | ||
| :data:`MAX_PRESIGN_TTL_SECONDS` (1 hour) to prevent accidental | ||
| day- or week-long URLs. Pass ``presign_ttl=N`` to override | ||
| explicitly; the env-var fallback only applies when the kwarg is | ||
| ``None``. | ||
|
|
||
| Bucket-ownership threat model | ||
| ----------------------------- | ||
| The S3 PutObject path in :meth:`CameraOffloader._upload_frame` does | ||
| not pass an ``ACL=`` kwarg. The contract for the offload bucket is | ||
| that the operator configures it with object-ownership control | ||
| ``BucketOwnerEnforced`` (and a bucket policy that denies public | ||
| ACLs); that enforcement is out of scope for this library because | ||
| deployments differ on whether the bucket is shared with non-mesh | ||
| producers. A future code-side ``ACL="private"`` + ``ChecksumAlgorithm`` | ||
| hardening is tracked in #249. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
@@ -50,7 +66,13 @@ | |
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| DEFAULT_PRESIGN_TTL_SECONDS = 3600 | ||
| # Lifetime of the presigned GET URL we hand out for each camera frame. | ||
| # Kept deliberately short -- anyone who briefly captures a /ref MQTT message | ||
| # can use the URL inside this window. ``STRANDS_MESH_CAMERA_PRESIGN_TTL`` | ||
| # overrides for higher-latency consumers, clamped to MAX_PRESIGN_TTL_SECONDS | ||
| # (1 hour) to prevent accidental day- or week-long URLs. | ||
| DEFAULT_PRESIGN_TTL_SECONDS = 60 | ||
| MAX_PRESIGN_TTL_SECONDS = 3600 | ||
|
|
||
|
|
||
| class CameraOffloader: | ||
|
|
@@ -70,9 +92,37 @@ def __init__( | |
| ) -> None: | ||
| self.bucket = bucket or os.getenv("STRANDS_MESH_CAMERA_S3_BUCKET", "") | ||
| self.prefix = (prefix or os.getenv("STRANDS_MESH_CAMERA_S3_PREFIX") or "").strip("/") | ||
| self.presign_ttl = presign_ttl or int( | ||
| os.getenv("STRANDS_MESH_CAMERA_PRESIGN_TTL", str(DEFAULT_PRESIGN_TTL_SECONDS)) | ||
| ) | ||
| if presign_ttl is not None: | ||
| ttl_raw = presign_ttl | ||
| else: | ||
| raw_env = os.getenv("STRANDS_MESH_CAMERA_PRESIGN_TTL") | ||
| if raw_env is None or raw_env == "": | ||
| ttl_raw = DEFAULT_PRESIGN_TTL_SECONDS | ||
| else: | ||
| try: | ||
|
cagataycali marked this conversation as resolved.
|
||
| ttl_raw = int(raw_env) | ||
| except ValueError: | ||
| logger.warning( | ||
| "[camera_offload] STRANDS_MESH_CAMERA_PRESIGN_TTL=%r is not an integer; using default %ds", | ||
| raw_env, | ||
| DEFAULT_PRESIGN_TTL_SECONDS, | ||
| ) | ||
| ttl_raw = DEFAULT_PRESIGN_TTL_SECONDS | ||
| if ttl_raw > MAX_PRESIGN_TTL_SECONDS: | ||
|
cagataycali marked this conversation as resolved.
|
||
| logger.warning( | ||
| "[camera_offload] STRANDS_MESH_CAMERA_PRESIGN_TTL=%d > %d cap; clamping", | ||
| ttl_raw, | ||
| MAX_PRESIGN_TTL_SECONDS, | ||
| ) | ||
| ttl_raw = MAX_PRESIGN_TTL_SECONDS | ||
| if ttl_raw < 1: | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Negative-clamp warning is asymmetric: env-var triggers WARNING, kwarg-supplied negative is silent. The R7 fix made the env-var path log a WARNING when But Two options:
Option 2 preserves the R1 invariant. Either is better than the current asymmetry; the current code lets |
||
| if presign_ttl is None: | ||
| logger.warning( | ||
| "[camera_offload] STRANDS_MESH_CAMERA_PRESIGN_TTL=%d < 1 floor; clamping to 1s", | ||
| ttl_raw, | ||
| ) | ||
| ttl_raw = 1 | ||
| self.presign_ttl = ttl_raw | ||
| self.region = region or os.getenv("AWS_REGION", os.getenv("AWS_DEFAULT_REGION")) | ||
| self._s3: Any | None = None | ||
|
|
||
|
|
@@ -120,7 +170,7 @@ def upload_frame(self, peer_id: str, cam_name: str, jpeg_bytes: bytes, ts: float | |
| Body=jpeg_bytes, | ||
| ContentType="image/jpeg", | ||
| ) | ||
| except Exception as exc: | ||
| except Exception as exc: # noqa: BLE001 -- boto3 raises ClientError, EndpointConnectionError, NoCredentialsError, etc.; offload is best-effort | ||
|
cagataycali marked this conversation as resolved.
|
||
| logger.debug("[camera_offload] put_object %s failed: %s", key, exc) | ||
| return None | ||
|
|
||
|
|
@@ -130,7 +180,7 @@ def upload_frame(self, peer_id: str, cam_name: str, jpeg_bytes: bytes, ts: float | |
| Params={"Bucket": self.bucket, "Key": key}, | ||
| ExpiresIn=self.presign_ttl, | ||
| ) | ||
| except Exception as exc: | ||
| except Exception as exc: # noqa: BLE001 -- boto3 ClientError / NoCredentialsError; presign is best-effort | ||
| logger.debug("[camera_offload] presign %s failed: %s", key, exc) | ||
| url = None | ||
|
|
||
|
|
@@ -181,7 +231,7 @@ def _publish_cameras_once_with_offload() -> None: | |
| # call it to preserve any user customisation that might have been added). | ||
| try: | ||
| original() | ||
| except Exception as exc: | ||
| except Exception as exc: # noqa: BLE001 -- original is user-customised; offload must not block on user code | ||
| logger.debug("[camera_offload] original _publish_cameras_once raised: %s", exc) | ||
|
|
||
| # Now do the S3 offload + ref publish per camera. | ||
|
|
@@ -195,13 +245,14 @@ def _publish_cameras_once_with_offload() -> None: | |
|
|
||
| try: | ||
| obs = inner.get_observation() | ||
| except Exception: | ||
| except Exception as exc: # noqa: BLE001 -- LeRobot get_observation() may raise hardware-specific errors | ||
| logger.debug("[camera_offload] get_observation failed: %s", exc) | ||
| return | ||
|
|
||
| try: | ||
| import cv2 | ||
| except Exception: | ||
| logger.debug("[camera_offload] cv2 unavailable — skipping S3 upload") | ||
| except ImportError: | ||
| logger.debug("[camera_offload] cv2 unavailable -- skipping S3 upload") | ||
| return | ||
|
|
||
| transport = current_transport() | ||
|
|
@@ -231,8 +282,17 @@ def _publish_cameras_once_with_offload() -> None: | |
| if ref is None: | ||
| continue | ||
| ref["shape"] = list(shape) | ||
| # Publish the /ref topic via the transport layer. | ||
| # On ``iot`` the IoT Policy's AllowOwnTopics statement | ||
| # bounds writes to ``strands/<ThingName>/*`` (covers | ||
| # ``camera/*/ref`` via the trailing wildcard); on | ||
| # ``bridge`` the Zenoh ACL adds a LAN-side gate on top. | ||
| # ``enable_for_mesh`` early-returns unless the active | ||
| # backend is one of those two, so the publish reaches | ||
| # the wire only when at least one of these gates is in | ||
| # force. | ||
| transport.put(f"strands/{mesh.peer_id}/camera/{cam_name}/ref", ref) | ||
| except Exception as exc: | ||
| except Exception as exc: # noqa: BLE001 -- numpy / cv2 / transport.put can raise diverse errors per frame; offload is best-effort | ||
| logger.debug( | ||
| "[camera_offload] %s/%s offload failed: %s", | ||
| mesh.peer_id, | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc-code drift on the negative/zero clamp behaviour.
This row says 'a
0is clamped to 1s' -- accurate for both kwarg and env-var paths. But the R7 fix incamera_offload.py:118-124introduced an asymmetric WARNING posture: env-var path of0(or any< 1) emits a WARNING, while explicitpresign_ttl=0kwarg silently clamps to 1 (preserving the R1 sentinel for the deliberate-zero case).The negative-value behaviour is similarly split: env
=-99-> WARN+clamp; kwarg=-99-> silent clamp. R8 deferred this asymmetry to #259, which is fine, but the README is the operator-facing source of truth -- an operator who sees=-99typo'd into the env file gets a WARNING and finds the row here. They'll be surprised that passingpresign_ttl=-99from a config-loader sees no warning.Suggest tightening this row to:
Capped at 3600s (1h); env-var values < 1 (including non-integer) fall back to 1s with a WARNING; explicit kwarg of 0 is silently clamped to 1s (deliberate sentinel for 'always-expired'). Negative kwarg values clamp silently -- tracked in #259.Non-blocking; tracked. Flagging because the README is shipped to operators and the asymmetry is exactly the kind of surprise that wastes 20 minutes of debugging.