Bug Report

Describe the bug
I have a scenario where I can consistently reproduce log loss without any errors in the Fluent Bit logs.

To Reproduce
With the Fluent Bit config that I have (attached below), I consistently see log loss without any error logs from Fluent Bit. My client script sends 10 requests of 10,000 logs each to the HTTP input (each log is ~2 KB, so roughly 20 MB per request) and gets back a 201 success response, which means the HTTP input should have the logs in its buffer. However, this volume is more than my rewrite_tag filter can handle with the memory it is given (10 MB), so it pauses and resumes later. Since all the client requests get a 201 response, I would expect the logs to start getting processed once rewrite_tag resumes. But somehow, the logs are never delivered to S3.
This is my Fluent Bit config (please ignore the unoptimized values; I have been trying to create a scenario where the HTTP input would pause):
[SERVICE]
    Flush                     3
    Grace                     30
    Log_Level                 info
    Daemon                    off
    Parsers_File              parsers.conf
    storage.path              /Users/ashish/fluent-bit/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.backlog.mem_limit 5M
    storage.max_chunks_up     1

[INPUT]
    name                              http
    listen                            0.0.0.0
    port                              9890
    buffer_max_size                   10M
    buffer_chunk_size                 2M
    mem_buf_limit                     10MB
    storage.pause_on_chunks_overlimit on

[FILTER]
    Name                  rewrite_tag
    Match                 application.*
    Rule                  $log .*s3LogGroup.* console_log_s3 false
    Emitter_Mem_Buf_Limit 20M

# actual logs are wrapped in a 'log' key, get the actual log
[FILTER]
    Name         parser
    Match        console_log_s3
    Key_Name     log
    Parser       json
    Reserve_Data False

# filter out logs that have s3LogGroup key
[FILTER]
    Name  grep
    Match console_log_s3_http
    Regex s3LogGroup .+

# parse timestamp and add year, month, day, hour to the record
[FILTER]
    Name   lua
    Match  console_log_s3_http
    Script parse_timestamp.lua
    Call   parse_timestamp

[FILTER]
    Name                  rewrite_tag
    Match                 console_log_s3_http
    Rule                  $s3LogGroup ^(.*)$ console_log_s3_http.$1.$year.$month.$day.$hour false
    Emitter_Mem_Buf_Limit 10M

# all other input processing should ignore logs that have s3LogGroup in it
[FILTER]
    name    grep
    match   application.*
    exclude log /.*s3LogGroup.*/

[OUTPUT]
    Name             s3
    Match            console_log_s3_http.*
    bucket           some_bucket_name
    region           us-west-1
    json_date_key    timestamp
    json_date_format iso8601
    total_file_size  10M
    upload_timeout   10s
    use_put_object   On
    s3_key_format    /$TAG[1]/$TAG[2]/$TAG[3]/$TAG[4]/$TAG[5]/%M:%S-$UUID.log
    retry_limit      5
    store_dir        /tmp/fluent-bit/s3
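The parse_timestamp.lua script referenced by the lua filter is not included above. A minimal sketch of what it presumably does, based on the comment in the config and on the $year, $month, $day and $hour fields used by the second rewrite_tag rule (everything beyond those field names is an assumption, not the original script):

-- Hypothetical sketch of parse_timestamp.lua, not the script from the report.
-- Derives year/month/day/hour from the event timestamp so that the second
-- rewrite_tag rule can reference $year, $month, $day and $hour.
function parse_timestamp(tag, ts, record)
    local t = os.date("!*t", math.floor(ts))
    record["year"]  = string.format("%04d", t.year)
    record["month"] = string.format("%02d", t.month)
    record["day"]   = string.format("%02d", t.day)
    record["hour"]  = string.format("%02d", t.hour)
    -- return code 2: record was modified, keep the original timestamp
    return 2, ts, record
end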
This is the script that I use to generate the data and send the logs:
#!/bin/bash

X=10                                  # Number of request iterations to send
TOTAL_LOGS=10000                      # Number of logs per request
PAYLOAD_FILE="/tmp/log_payload.json"  # File to store the JSON payload
MAX_RETRIES=10                        # Maximum number of retries per iteration
RETRY_DELAY=5                         # Delay in seconds between retries
LOG_LINE='a 2KB json log that has key "s3LogGroup" set'

# Function to create the payload file with N log entries
create_payload_file() {
    local num_logs=$1
    echo "Creating payload file with $num_logs log entries..."

    # Start with opening bracket
    echo "[" > $PAYLOAD_FILE

    # Append log entries
    i=1
    while [ $i -le $num_logs ]; do
        echo -n "$LOG_LINE" >> $PAYLOAD_FILE
        if [ $i -lt $num_logs ]; then
            echo "," >> $PAYLOAD_FILE
        fi
        i=$((i + 1))
    done

    # Close with closing bracket
    echo "]" >> $PAYLOAD_FILE
    echo "Payload file created, size: $(du -h $PAYLOAD_FILE | cut -f1)"
}

# Function to send a request and retry until status code 201
send_request_with_retry() {
    local retry_count=0
    local success=false

    while [ $retry_count -lt $MAX_RETRIES ] && [ "$success" = false ]; do
        # Send the request using the file
        echo "Attempt $((retry_count + 1)): Sending request with $TOTAL_LOGS logs from file"
        START_TIME=$(date +%s.%N)
        HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -d @$PAYLOAD_FILE -X POST -H "content-type: application/json" http://localhost:9890/console_log_s3_http)
        END_TIME=$(date +%s.%N)

        # Calculate time taken in seconds
        TIME_TAKEN=$(echo "$END_TIME - $START_TIME" | bc)

        echo "HTTP Status Code: $HTTP_STATUS"
        echo "Request took: ${TIME_TAKEN} seconds"

        if [ "$HTTP_STATUS" -eq 201 ]; then
            echo "Success! Received status code 201."
            success=true
        else
            retry_count=$((retry_count + 1))
            if [ $retry_count -lt $MAX_RETRIES ]; then
                echo "Status code was not 201. Retrying in $RETRY_DELAY seconds..."
                sleep $RETRY_DELAY
            else
                echo "Maximum retry attempts reached. Moving to next iteration."
            fi
        fi
    done

    return $([ "$success" = true ] && echo 0 || echo 1)
}

# Create the payload file with all logs (do this only once)
create_payload_file $TOTAL_LOGS

count=0
successful_iterations=0
TOTAL_START_TIME=$(date +%s.%N)

while [ $count -lt $X ]; do
    count=$((count + 1))
    echo "======================================="
    echo "Iteration $count of $X"

    # Send request with retry logic
    if send_request_with_retry; then
        successful_iterations=$((successful_iterations + 1))
    fi
done

TOTAL_END_TIME=$(date +%s.%N)
TOTAL_TIME=$(echo "$TOTAL_END_TIME - $TOTAL_START_TIME" | bc)

echo "======================================="
echo "Summary:"
echo "Total iterations: $X"
echo "Successful iterations (status 201): $successful_iterations"
echo "Failed iterations: $((X - successful_iterations))"
echo "Total time taken: ${TOTAL_TIME} seconds"
echo "Average time per iteration: $(echo "$TOTAL_TIME / $X" | bc -l) seconds"

# Clean up at the end
rm -f $PAYLOAD_FILE
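One way to observe whether the input actually pauses and whether chunks stay buffered after it resumes is Fluent Bit's built-in monitoring server. It is not enabled in the config above, so the rough sketch below assumes HTTP_Server On, HTTP_Listen 0.0.0.0, HTTP_Port 2020 and storage.metrics on have been added to the [SERVICE] section:

# Poll the monitoring endpoints while the load script runs to watch per-plugin
# record/byte counters and the storage layer's chunk state.
while true; do
    curl -s http://127.0.0.1:2020/api/v1/metrics    # input/filter/output record and byte counters
    echo
    curl -s http://127.0.0.1:2020/api/v1/storage    # chunks up/down and memory usage per input
    echo
    sleep 2
done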
Expected behavior
Either the HTTP input should return a non-success status code if the pipeline is paused or the buffers are full, or all the logs should be delivered, or there should be some error.

Screenshots
Your Environment
Running it locally on my Mac.
Version used: 4.0
Configuration:
Environment name and version (e.g. Kubernetes? What version?):
Server type and version:
Operating System and version:
Filters and plugins:
Additional context