feat(aws): add AWS Security integration by vish4004 · Pull Request #173 · Autohive-AI/autohive-integrations

vish4004 · 2026-02-17T22:40:09Z

Summary

New AWS Security monitoring integration with 20 actions across 5 AWS services:
- Security Hub (4): get_findings, get_finding_details, update_finding_workflow, get_insights
- GuardDuty (4): list_detectors, list_guardduty_findings, get_guardduty_finding_details, archive_findings
- CloudWatch (5): list_metrics, get_metric_data, describe_alarms, get_alarm_history, set_alarm_state
- CloudWatch Logs (3): describe_log_groups, filter_log_events, get_log_events
- CloudTrail (4): lookup_events, describe_trails, get_trail_status, get_event_selectors
Custom auth with AWS access key, secret key, and region
Follows humanitix multi-file pattern with actions/ subdirectory
Includes comprehensive config.json schemas, test suite, README with IAM policy, and integration icon
Tested and verified working on the beta site

Test plan

All 20 actions registered and matching between config.json and code
Python syntax validation passes on all files
Integration uploaded and tested on Autohive beta site
Security Hub, GuardDuty, CloudWatch, CloudTrail actions confirmed working
Auth credential flow validated end-to-end

…dWatch, and CloudTrail

…ries description

- Add missing autohive-integrations-sdk to requirements.txt - Validate AWS credentials before creating boto3 client - Parallelize get_insights result fetches with asyncio.gather - Batch update_finding_workflow lookups in chunks of 100 - Add error_code to all 20 output schemas in config.json - Add docstrings to all action handler classes

- Remove root __init__.py (not used by working integrations) - Rename integration variable to 'aws' matching entry point name - Update all action imports to 'from aws import aws' - Update all decorators to @aws.action() - Remove required array from auth fields schema

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 982b0a353a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

TheRealAgentK · 2026-02-18T00:57:41Z

Can we please run philbot against it, too?

phillip-haydon · 2026-02-18T03:41:56Z

🤖 Automated Code Review

Overall Assessment

Verdict: ✅ Patch is correct - The code is functional and free of blocking issues. The integration follows security best practices and demonstrates good memory management. However, there are performance optimizations that should be addressed.

Confidence Score: 0.92/1.0

📊 Review Summary

Category	Status	Critical Issues	Notes
Security	✅ Pass	0	Excellent credential handling and error management
Performance	⚠️ Needs Improvement	1 High, 1 Medium	Client creation overhead and sequential batching
Memory	✅ Pass	0	Good pagination and bounded data handling
General Code Quality	✅ Pass	0	Well-structured and maintainable

🔴 High Priority Issues

[P1] Boto3 client created on every action execution

File: aws/helpers.py (Lines 11-22)
Impact: Increased latency and resource overhead per request. Each action creates a new boto3 client, which involves connection setup, credential resolution, and HTTP session initialization (~50-200ms overhead).

Current code:

def create_boto3_client(context: ExecutionContext, service_name: str):
    credentials = context.auth.get("credentials", {})
    access_key = credentials.get("aws_access_key_id")
    secret_key = credentials.get("aws_secret_access_key")
    if not access_key or not secret_key:
        raise ValueError(
            "AWS credentials are missing: aws_access_key_id and aws_secret_access_key are required"
        )
    return boto3.client(
        service_name,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        region_name=credentials.get("aws_region", "us-east-1")
    )

Recommendation: Consider caching boto3 clients per (credentials_hash, service_name, region) tuple. Since boto3 clients are thread-safe for making API calls, they can be reused across requests with the same credentials:

import hashlib

# Simple client cache - in production, consider TTL-based eviction
_client_cache: dict[tuple[str, str], Any] = {}

def _get_credentials_key(context: ExecutionContext) -> str:
    """Generate a cache key from credentials."""
    credentials = context.auth.get("credentials", {})
    key_parts = (
        credentials.get("aws_access_key_id", ""),
        credentials.get("aws_secret_access_key", ""),
        credentials.get("aws_region", "us-east-1")
    )
    return hashlib.sha256(":".join(key_parts).encode()).hexdigest()

def create_boto3_client(context: ExecutionContext, service_name: str):
    credentials = context.auth.get("credentials", {})
    access_key = credentials.get("aws_access_key_id")
    secret_key = credentials.get("aws_secret_access_key")
    if not access_key or not secret_key:
        raise ValueError(
            "AWS credentials are missing: aws_access_key_id and aws_secret_access_key are required"
        )
    
    cache_key = (_get_credentials_key(context), service_name)
    if cache_key not in _client_cache:
        _client_cache[cache_key] = boto3.client(
            service_name,
            aws_access_key_id=access_key,
            aws_secret_access_key=secret_key,
            region_name=credentials.get("aws_region", "us-east-1")
        )
    return _client_cache[cache_key]

Expected improvement: Eliminates client initialization overhead for subsequent calls with the same credentials.

🟡 Medium Priority Issues

[P2] Sequential API calls in update_finding_workflow could be parallelized

File: aws/actions/security_hub.py (Lines 68-82)
Impact: When updating many findings across multiple batches, each batch lookup is performed sequentially, adding latency proportional to the number of batches.

Current code:

# Look up findings in batches of 100 (AWS API limit) to get ProductArn
findings = []
for i in range(0, len(finding_arns), 100):
    batch = finding_arns[i:i + 100]
    lookup_kwargs = {
        "Filters": {
            "Id": [
                {"Value": arn, "Comparison": "EQUALS"}
                for arn in batch
            ]
        },
        "MaxResults": len(batch)
    }
    lookup_response = await run_sync(client.get_findings, **lookup_kwargs)
    findings.extend(lookup_response.get("Findings", []))

Recommendation: Parallelize the batch lookups using asyncio.gather, similar to the pattern already used in get_insights:

# Look up findings in batches of 100 (AWS API limit) to get ProductArn
async def lookup_batch(batch):
    lookup_kwargs = {
        "Filters": {
            "Id": [
                {"Value": arn, "Comparison": "EQUALS"}
                for arn in batch
            ]
        },
        "MaxResults": len(batch)
    }
    lookup_response = await run_sync(client.get_findings, **lookup_kwargs)
    return lookup_response.get("Findings", [])

batches = [finding_arns[i:i + 100] for i in range(0, len(finding_arns), 100)]
batch_results = await asyncio.gather(*[lookup_batch(batch) for batch in batches])
findings = [f for batch_findings in batch_results for f in batch_findings]

Expected improvement: For N batches, reduces latency from O(N × API_latency) to O(API_latency) (parallel execution).

[P2] Unbounded finding lookup in update_finding_workflow

File: aws/actions/security_hub.py (Lines 73-90)
Impact: If a user passes a very large list of finding_arns, this could accumulate significant memory and processing time.

Recommendation: Add input validation to limit finding_arns to a reasonable maximum (e.g., 1000):

if len(finding_arns) > 1000:
    return error_result(ValueError("Maximum 1000 finding ARNs allowed per request"))

🟢 Low Priority Issues

[P3] Missing batch size handling for >50 finding IDs in GuardDuty

File: aws/actions/guardduty.py (Lines 48-58)
Impact: The GuardDuty GetFindings API has a limit of 50 finding IDs per request. If a user passes more than 50 IDs, the API will return an error.

Recommendation: Either validate the input length or batch the requests:

async def execute(self, inputs: Dict[str, Any], context: ExecutionContext):
    try:
        client = create_boto3_client(context, "guardduty")
        finding_ids = inputs["finding_ids"]
        
        # GuardDuty GetFindings has a limit of 50 IDs per request
        if len(finding_ids) > 50:
            # Batch the requests
            all_findings = []
            for i in range(0, len(finding_ids), 50):
                batch = finding_ids[i:i + 50]
                kwargs = {
                    "DetectorId": inputs["detector_id"],
                    "FindingIds": batch
                }
                response = await run_sync(client.get_findings, **kwargs)
                all_findings.extend(response.get("Findings", []))
            return success_result({"findings": all_findings})
        
        kwargs = {
            "DetectorId": inputs["detector_id"],
            "FindingIds": finding_ids
        }
        response = await run_sync(client.get_findings, **kwargs)

Alternatively, add "maxItems": 50 to the finding_ids schema in config.json.

[P3] list_metrics missing pagination limit parameter

File: aws/actions/cloudwatch.py (Lines 14-27)
Impact: The CloudWatch ListMetrics API can return up to 500 metrics per page by default. Without a limit, large AWS accounts could return excessive data.

Recommendation: Add a max_results parameter to the config.json schema and implementation for consistency with other actions.

[P3] Inconsistent pagination limits across actions

File: aws/config.json
Impact: Some actions have max_results with defined limits, others don't, creating inconsistent user experience.

Current state:

Action	Has max_results	Default	Maximum
get_findings	✅	20	100
get_insights	✅	20	-
list_detectors	✅	50	-
list_guardduty_findings	✅	50	-
list_metrics	❌	-	-
describe_alarms	✅	50	-
describe_log_groups	✅	50	50
filter_log_events	✅	50	10000
lookup_events	✅	50	50

Recommendation: Add max_results to list_metrics for consistency and to prevent unbounded responses.

✅ Positive Observations

Security Best Practices

Excellent credential handling: AWS credentials are retrieved securely from the execution context and passed directly to boto3 without being logged or stored
Proper input validation: User inputs are properly typed and passed to boto3 methods as structured parameters, mitigating injection risks
Safe error handling: Exceptions are caught and returned as structured error results, preventing exposure of stack traces or sensitive internal state
No data leakage: No sensitive data (findings, logs, credentials) is logged to stdout/stderr

Memory Management

Good pagination support: All list operations return next_token for pagination instead of fetching all results
Bounded defaults: All actions have sensible max_results/limit defaults (20-50)
No static caches: No module-level caches or singletons that could grow unbounded
Proper async execution: Uses run_in_executor for blocking boto3 calls, preventing thread pool exhaustion

Code Quality

Well-structured: Follows the humanitix multi-file pattern with actions/ subdirectory
Good async patterns: Correctly uses run_in_executor for blocking boto3 calls
Comprehensive testing: Includes test suite with good coverage
Good documentation: README includes IAM policy and usage examples

Excellent Pattern: Parallel Insight Fetching

File: aws/actions/security_hub.py (Lines 117-143)

The get_insights action correctly uses asyncio.gather to fetch insight results in parallel:

enriched_insights = await asyncio.gather(
    *[fetch_insight_result(insight) for insight in insights]
)

This pattern should be replicated in other actions that perform batch operations (like update_finding_workflow).

📋 Summary of Findings

{
  "findings": [
    {
      "title": "[P1] Boto3 client created on every action execution",
      "priority": 1,
      "confidence_score": 0.95,
      "file": "aws/helpers.py",
      "lines": "11-22"
    },
    {
      "title": "[P2] Sequential API calls in update_finding_workflow could be parallelized",
      "priority": 2,
      "confidence_score": 0.90,
      "file": "aws/actions/security_hub.py",
      "lines": "68-82"
    },
    {
      "title": "[P2] Unbounded finding lookup in update_finding_workflow",
      "priority": 2,
      "confidence_score": 0.85,
      "file": "aws/actions/security_hub.py",
      "lines": "73-90"
    },
    {
      "title": "[P3] Missing batch size handling for >50 finding IDs in GuardDuty",
      "priority": 3,
      "confidence_score": 0.80,
      "file": "aws/actions/guardduty.py",
      "lines": "48-58"
    },
    {
      "title": "[P3] list_metrics missing pagination limit parameter",
      "priority": 3,
      "confidence_score": 0.75,
      "file": "aws/actions/cloudwatch.py",
      "lines": "14-27"
    },
    {
      "title": "[P3] Inconsistent pagination limits across actions",
      "priority": 3,
      "confidence_score": 0.70,
      "file": "aws/config.json",
      "lines": "N/A"
    }
  ],
  "overall_correctness": "patch is correct",
  "overall_explanation": "The code is functional, secure, and follows best practices. The identified issues are optimization opportunities rather than bugs. The integration will work correctly in production, but addressing the P1 and P2 issues will significantly improve performance.",
  "overall_confidence_score": 0.92
}

🎯 Recommended Action Items

Before Merge:

Consider implementing boto3 client caching (P1 issue)
Parallelize batch lookups in update_finding_workflow (P2 issue)

Post-Merge (Nice to Have):

Add input validation for finding ARN limits
Add batch handling for GuardDuty finding IDs >50
Add max_results parameter to list_metrics
Standardize pagination limits across all actions

This is an automated review from Mrs. Reviewer in Autohive. The review was conducted by specialized AI agents focusing on security, performance, memory management, and general code quality.

vish4004 · 2026-02-18T03:50:36Z

Thanks for the thorough review! The patch confidence score is appreciated.

P1 (client caching): Noted for a future optimization pass. For v1, per-call client creation keeps things simple and avoids credential/region caching edge cases on Lambda.

P2/P3 items: Most are already handled (chunked batching, next_token pagination) or are edge cases unlikely in practice (e.g. >50 detector IDs). We'll revisit if performance becomes a concern at scale.

Overall a clean bill of health — thanks!

TheRealAgentK · 2026-02-19T21:35:54Z

P1 (client caching): Noted for a future optimization pass. For v1, per-call client creation keeps things simple and avoids credential/region caching edge cases on Lambda.

[P1] Boto3 client created on every action execution

@vish4004 - could you please create a GH issue for this future change.

Edit: As we've discussed - in lambdas this is prob not required, but if this was used in a container, it might make sense. I think a ticket should be created to be looked at at a later stage.

Another edit: new information has been surfaced (implication of caching in the context of AWS orgs) that have led us towards not doing this at all for now.

TheRealAgentK · 2026-02-19T22:02:13Z

The P2 items, I'd not deal with right now

The P3 (max boundaries) - items are low hanging fruit, I'd say. Having validation around the AWS limits would stop problems earlier in the flow before requests even hit the AWS APIs.

TheRealAgentK

Left some comments re my views.

I'd prefer the P3s to be addressed, but it's not a hill I'm going to die on 😄

vish4004 · 2026-02-19T23:36:07Z

@TheRealAgentK I don't think any of the issues are P1 in terms of impact it has, but an interpretation of an over eager bot trying to please.

Re caching, while the bot thinks it as an edge case, however given how AWS sub orgs and regions work wherein a single access key and secret can have cross org/region access, so caching would generate other issues such as getting locked into a region or org. Right now AWS default's to us-east or root org and user can ask for a org or region in each tool call. This would have been different if were using OIDC auth that AWS supports.

As for the limits AWS Api changes on a continuous pace and in our case it's the boto3 client that abstracting away the calls, which handlers the errors gracefully, and will also pass that to an the agent which can then handle then actively limits in runtime. I think that's better approach than handling the limits for each call manually. We are also not locking the boto3 sdk version as well which or for any other packages as well which would be needed as well if we are setting up limits.

TheRealAgentK · 2026-02-19T23:54:21Z

Re caching, while the bot thinks it as an edge case, however given how AWS sub orgs and regions work wherein a single access key and secret can have cross org/region access, so caching would generate other issues such as getting locked into a region or org. Right now AWS default's to us-east or root org and user can ask for a org or region in each tool call. This would have been different if were using OIDC auth that AWS supports.

Ok, that's a good reason for not caching then. Let's scrap the idea of the P1 for now.

…ials

TheRealAgentK · 2026-03-05T23:39:35Z

@vish4004

I pushed a few more commits from changes due to the new integration tooling into the branch. Please check that they are still ok - most are rule exclusions for valid code that linters and security checkers don't interpret correctly to 💯

TheRealAgentK · 2026-03-05T23:40:25Z

Tooling checks now:

----------------------------------------
Checking: /home/kai/code/autohive/integrations-autohive/aws/
----------------------------------------

📦 Installing dependencies...

🐍 Checking Python syntax...
   ✅ Syntax OK

📥 Checking imports...
   ✅ Imports OK

📄 Checking JSON files...
   ✅ JSON files OK

🔍 Linting with ruff...
   ✅ Lint OK

🎨 Checking formatting with ruff...
   ✅ Formatting OK

🔒 Scanning for security issues with bandit...
   ✅ Security OK

🛡 Checking dependencies for vulnerabilities with pip-audit...
   ✅ Dependencies OK

🔗 Checking config-code sync...
   ⚠  Action 'describe_trails': parameter 'include_shadow_trails' is optional in schema but accessed with inputs["include_shadow_trails"] (will raise KeyError if not provided)
   ✅ Config-code sync OK

and

Validating 1 integration(s)...

============================================================
Integration: aws
============================================================

Errors (1):
  ❌ icon.png must be 512x512 pixels (found 256x256)

Warnings (1):
  ⚠ Integration should use 'Integration.load()' to load the integration

============================================================
SUMMARY
============================================================
Integrations validated: 1
Total errors: 1
Total warnings: 1

TheRealAgentK · 2026-03-05T23:41:00Z

Last thing to do in my view (apart from functional testing on your end) is getting a 512x512 icon. @vish4004

The __init__.py caused a circular import chain: context.py -> aws.aws -> actions -> security_hub -> aws.aws (still initializing) This integration is loaded as an entry point, not as a package, so __init__.py is not needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…bility Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-03-11T22:53:06Z

🔍 Integration Validation Results

Commit: f07b338540ac5db0dc78e342710d19a9c83dba42 · fix(aws): resolve CI failures - add noqa F401, ruff format, fix describe_trails input access
Updated: 2026-03-12T00:01:13Z

Changed directories: aws

Check	Result
Structure	⚠️ Passed with warnings
Code	✅ Passed
README	✅ Passed

⚠️ Structure Check output

Validating 1 integration(s)...

============================================================
Integration: aws
============================================================

Warnings (1):
  ⚠️ Integration should use 'Integration.load()' to load the integration

============================================================
SUMMARY
============================================================
Integrations validated: 1
Total errors: 0
Total warnings: 1

⚠️ Validation passed with warnings - please review

✅ Code Check output

----------------------------------------
Checking: aws
----------------------------------------

📦 Installing dependencies...

🐍 Checking Python syntax...
   ✅ Syntax OK

📥 Checking imports...
   ✅ Imports OK

📄 Checking JSON files...
   ✅ JSON files OK

🔍 Linting with ruff...
   ✅ Lint OK

🎨 Checking formatting with ruff...
   ✅ Formatting OK

🔒 Scanning for security issues with bandit...
   ✅ Security OK

🛡️ Checking dependencies for vulnerabilities with pip-audit...
   ✅ Dependencies OK

🔗 Checking config-code sync...
   ✅ Config-code sync OK

========================================
✅ CODE CHECK PASSED
========================================

✅ README Check output

========================================
✅ README CHECK PASSED
========================================

…ibe_trails input access

TheRealAgentK

LGTM now

vish4004 added 9 commits February 13, 2026 15:06

feat(aws): scaffold directory structure, helpers, and entry point

e93cf84

feat(aws): add config with auth and all 20 action schemas

8e0dd01

feat(aws): implement all 20 actions for Security Hub, GuardDuty, Clou…

72772da

…dWatch, and CloudTrail

feat(aws): add tests and README

fd8ba5f

fix(aws): clean up imports, use get_running_loop, fix metric_data_que…

1383c06

…ries description

feat(aws): add integration icon

daed130

docs: add AWS Security integration to repo README

982b0a3

vish4004 requested review from ProRedCat and TheRealAgentK February 17, 2026 22:41

github-code-quality Bot found potential problems Feb 17, 2026

View reviewed changes

Comment thread aws/aws.py Dismissed

Comment thread aws/tests/context.py Fixed

chatgpt-codex-connector Bot reviewed Feb 17, 2026

View reviewed changes

Comment thread aws/actions/security_hub.py

Comment thread aws/tests/context.py Outdated

Comment thread aws/tests/test_aws.py

fix(aws): update stale import in test context

60b736d

github-code-quality Bot found potential problems Feb 17, 2026

View reviewed changes

Comment thread aws/tests/context.py Fixed

TheRealAgentK reviewed Feb 19, 2026

View reviewed changes

vish4004 and others added 6 commits March 6, 2026 09:59

Merge branch 'master' into feature/aws-security-integration

ccd1827

Add missing __init__.py to aws integration

e9d3bbb

Pin autohive-integrations-sdk version in requirements.txt

581c26a

Add noqa: F401 to side-effect imports

087f529

Add E402 to noqa for intentional late import and apply ruff formatting

ccd13b8

Add nosec B105 to suppress false positive on test placeholder credent…

361b2df

…ials

TheRealAgentK and others added 3 commits March 6, 2026 13:59

Remove aws/__init__.py to fix circular import

7a9a587

The __init__.py caused a circular import chain: context.py -> aws.aws -> actions -> security_hub -> aws.aws (still initializing) This integration is loaded as an entry point, not as a package, so __init__.py is not needed.

fix(aws): update integration icon to new AWS logo

68575a5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(aws): alias integration variable in test context for test compati…

60ad4f5

…bility Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-code-quality Bot found potential problems Mar 6, 2026

View reviewed changes

Comment thread aws/tests/context.py Dismissed

fix(aws): update integration icon to 512x512 as required

3b459d6

fix(aws): resolve CI failures - add noqa F401, ruff format, fix descr…

f07b338

…ibe_trails input access

TheRealAgentK approved these changes Mar 12, 2026

View reviewed changes

TheRealAgentK merged commit 4544a50 into master Mar 12, 2026
3 checks passed

TheRealAgentK deleted the feature/aws-security-integration branch March 12, 2026 00:32

Uh oh!

Conversation

vish4004 commented Feb 17, 2026

Summary

Test plan

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

TheRealAgentK commented Feb 18, 2026

Uh oh!

phillip-haydon commented Feb 18, 2026

🤖 Automated Code Review

Overall Assessment

📊 Review Summary

🔴 High Priority Issues

[P1] Boto3 client created on every action execution

🟡 Medium Priority Issues

[P2] Sequential API calls in update_finding_workflow could be parallelized

[P2] Unbounded finding lookup in update_finding_workflow

🟢 Low Priority Issues

[P3] Missing batch size handling for >50 finding IDs in GuardDuty

[P3] list_metrics missing pagination limit parameter

[P3] Inconsistent pagination limits across actions

✅ Positive Observations

Security Best Practices

Memory Management

Code Quality

Excellent Pattern: Parallel Insight Fetching

📋 Summary of Findings

🎯 Recommended Action Items

Uh oh!

vish4004 commented Feb 18, 2026

Uh oh!

TheRealAgentK commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TheRealAgentK commented Feb 19, 2026

Uh oh!

TheRealAgentK left a comment

Choose a reason for hiding this comment

Uh oh!

vish4004 commented Feb 19, 2026

Uh oh!

TheRealAgentK commented Feb 19, 2026

Uh oh!

TheRealAgentK commented Mar 5, 2026

Uh oh!

TheRealAgentK commented Mar 5, 2026

Uh oh!

TheRealAgentK commented Mar 5, 2026

Uh oh!

Uh oh!

github-actions Bot commented Mar 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔍 Integration Validation Results

Uh oh!

TheRealAgentK left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

TheRealAgentK commented Feb 19, 2026 •

edited

Loading

github-actions Bot commented Mar 11, 2026 •

edited

Loading