
Different sizing methods can cause  #260

Description

@BennettJames

Problem

Messages can be clamped with the max_message_length parameter. When set, every message is truncated to at most that many characters.

However, the truncated message is then validated against its bytesize, which is different than its string length (character count). This means that even with max_message_length set to, say, 250k, well under the ~260k limit, a message can still exceed the limit if it contains many characters longer than 1 byte.
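To make the mismatch concrete, here is a small sketch of how far the two measurements can drift apart. The 250,000 figure and the test character (U+2764, which takes 3 bytes in UTF-8) are illustrative assumptions, not values taken from the plugin:

```ruby
# Illustrative only: a character-based clamp does not bound the byte size.
limit_chars = 250_000            # assumed max_message_length, in characters
msg = "\u2764" * 300_000         # U+2764 encodes to 3 bytes in UTF-8

clamped = msg.slice(0, limit_chars)  # slice counts characters, not bytes

puts clamped.length    # character count after clamping
puts clamped.bytesize  # byte count after clamping, three times larger here
```

The clamped string satisfies the character limit but is three times larger when measured in bytes.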

Steps to replicate

Here's a simplified reproduction of the issue:

def truncate_and_validate(msg, truncation, limit)
    if truncation
        msg = msg.slice(0, truncation)
    end

    if msg.bytesize > limit
        raise "message too long"
    end

    return msg
end

puts truncate_and_validate("abcd", 2, 2) # outputs 'ab'
puts truncate_and_validate("❤️❤️❤️❤️", 2, 2) # raises 'message too long'

Expected Behavior or What you need to ask

Line length clamping and validation should use the same underlying values. E.g. do this for truncation:

if @max_message_length
  message = message.byteslice(0, @max_message_length)
end

One downside is that this can output broken UTF-8 when the cut falls in the middle of a multi-byte character. In that case, some validation may be needed to ensure the truncation does not produce an invalid string.
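One possible way to handle that, sketched below, is to cut on bytes and then drop any partial character left at the end. The helper name truncate_to_bytes is hypothetical and not part of the plugin. The sketch assumes the input is valid UTF-8 and max_bytes is non-negative; because it relies on String#scrub, invalid bytes elsewhere in the message would be removed as well.

```ruby
# Hypothetical helper (not from the plugin): truncate by bytes, keep valid UTF-8.
def truncate_to_bytes(msg, max_bytes)
  # Nothing to do if the message already fits within the byte budget.
  return msg if msg.bytesize <= max_bytes

  # Cut on a byte boundary; this may split a multi-byte character.
  truncated = msg.byteslice(0, max_bytes)

  # Remove the incomplete trailing sequence (and any other invalid bytes).
  truncated.scrub("")
end

puts truncate_to_bytes("abcd", 2)                    # fits after a plain cut
puts truncate_to_bytes("\u00E9\u00E9", 3).bytesize   # partial second character is dropped
```

The result can be shorter than max_bytes by up to three bytes, since a split character is dropped entirely rather than emitted as a broken sequence.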

Using Fluentd and CloudWatchLogs plugin versions

  • OS version
  • Bare Metal or within Docker or Kubernetes or others?
  • Fluentd v0.12 or v0.14/v1.0
    • paste result of fluentd --version or td-agent --version
  • Dependent gem versions
    • paste boot log of fluentd or td-agent
    • paste result of fluent-gem list, td-agent-gem list or your Gemfile.lock
