Bug: xlsx output always appends None for response_time_s when query_time is missing #2891

@mango766

Description

In sherlock.py, the --xlsx output path has a logic bug when building the response_time_s list. The condition checks whether response_time_s is None, but response_time_s is initialized as an empty list ([]) earlier in the same block, so the check is always False and the empty-string fallback is never used.

This means that when a site's query_time is None (e.g. timed-out or errored requests), the raw None value gets appended to the list instead of "". When pandas then writes this to Excel, the cell contains None rather than a blank, which can cause downstream issues when processing the xlsx output.

Steps to reproduce

Run sherlock with --xlsx on a username. Sites that fail to respond (timeout/error) will have None in the response_time_s column instead of an empty string.

Root cause

# Line ~912 in sherlock.py
if response_time_s is None:   # BUG: response_time_s is a list, never None
    response_time_s.append("")
else:
    response_time_s.append(results[site]["status"].query_time)  # appends None

Compare with the CSV block just above it (lines ~882-884), which correctly does:

response_time_s = results[site]["status"].query_time
if response_time_s is None:
    response_time_s = ""

Fix

Change the condition to check the actual value:

if results[site]["status"].query_time is None:
    response_time_s.append("")
else:
    response_time_s.append(results[site]["status"].query_time)
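
The difference between the two conditions can be checked in isolation. The following is a minimal sketch, not the actual sherlock.py code: the results mapping, the site names, and the collect_buggy / collect_fixed helpers are hypothetical stand-ins, and only the query_time attribute is modeled.

```python
from types import SimpleNamespace

# Hypothetical stand-in for sherlock's results mapping (assumption):
# each site maps to a dict whose "status" object exposes query_time.
results = {
    "SiteA": {"status": SimpleNamespace(query_time=0.42)},
    "SiteB": {"status": SimpleNamespace(query_time=None)},  # e.g. a timed-out request
}


def collect_buggy(results):
    """Mirror the reported xlsx logic: the check tests the list, not the value."""
    response_time_s = []
    for site in results:
        if response_time_s is None:  # always False: the list itself is never None
            response_time_s.append("")
        else:
            response_time_s.append(results[site]["status"].query_time)
    return response_time_s


def collect_fixed(results):
    """Apply the proposed fix: test the per-site query_time instead."""
    response_time_s = []
    for site in results:
        query_time = results[site]["status"].query_time
        response_time_s.append("" if query_time is None else query_time)
    return response_time_s


buggy = collect_buggy(results)
fixed = collect_fixed(results)
print(buggy)  # [0.42, None]
print(fixed)  # [0.42, '']
```

Reading query_time into a local variable once also avoids repeating the results[site]["status"] lookup in both branches.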
