Adopting ELK Stack for blog -Fin(?)-
2025-03-05
As I mentioned in the last post, there was a JSON parsing error, but it turned out to be more than a parsing error: the logs weren't reaching Kibana at all. After checking again, I found that the logs generated inside FastAPI weren't being sent to Logstash. There are many ways to ship them, but I added a handler that sends them to Logstash over a TCP socket.
Sending logs using TCP
import logging
import socket

class TCPLogstashHandler(logging.Handler):
    """Logging handler that ships formatted records to Logstash over TCP."""

    def __init__(self, host, port):
        super().__init__()
        self.host = host
        self.port = port
        self.sock = None
        self._connect()

    def _connect(self):
        try:
            self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.sock.connect((self.host, self.port))
        except (socket.error, socket.timeout) as e:
            self.sock = None
            print(f"Failed to connect to Logstash: {e}")

    def emit(self, record):
        try:
            # Reconnect lazily if the previous connection was dropped.
            if self.sock is None:
                self._connect()
            if self.sock is not None:
                msg = self.format(record) + '\n'
                self.sock.sendall(msg.encode('utf-8'))
        except (BrokenPipeError, socket.error) as e:
            self.sock = None
            print(f"Failed to send log: {e}")
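For the json_lines codec (set on the Logstash side below) to parse these events, each line the handler sends has to be a standalone JSON object. The post doesn't show the formatter it used; here is a minimal sketch of one, with field names of my own choosing:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line, for json_lines."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

# In the blog's setup this would be attached to the TCP handler via
# handler.setFormatter(JsonFormatter()); here we just format one record.
record = logging.LogRecord("fastapi-app", logging.INFO, __file__, 1,
                           "user %s logged in", ("alice",), None)
line = JsonFormatter().format(record)
```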
However, I still got JSON parsing errors, so I explicitly set the codec of the Logstash pipeline to json_lines.
input {
  tcp {
    port => 5044
    codec => json_lines {
      charset => "UTF-8"
    }
  }
}
filter {
  # filter logic...
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "fastapi-logs-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
After doing this, I checked Kibana and the logs from FastAPI were being collected normally, but the problem was that the log time was displayed in UTC. The reason was that the ELK stack was set to UTC by default, but I solved it by changing the Kibana settings and host settings to KST.
Kibana configuration path: in Stack Management > Advanced Settings, set "Timezone for date formatting" to Asia/Seoul.
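For the record, Elasticsearch still stores every timestamp in UTC; the Kibana setting only changes how they are displayed. KST is UTC+9, which is easy to sanity-check:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A log written at 03:00 UTC should display as 12:00 in Kibana
# once the display timezone is switched to Asia/Seoul (UTC+9).
utc_ts = datetime(2025, 3, 5, 3, 0, tzinfo=timezone.utc)
kst_ts = utc_ts.astimezone(ZoneInfo("Asia/Seoul"))
```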
Because of an exam, I only checked the logs in Kibana a day or two after setting the timezone, and they were a mess: bots were probing paths that don't even exist on the server, leaving tons of useless 404 logs.

My first response was to return a 444 instead of a 404 to give the bots no leeway.
server {
    listen 443 ssl default_server;
    listen [::]:443 ssl default_server;
    server_name _;
    ssl_certificate /etc/nginx/ssl/default.crt;
    ssl_certificate_key /etc/nginx/ssl/default.key;
    return 444;
}

server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name _;
    return 444;
}

server {
    listen 80;
    listen [::]:80;
    server_name cliche.life;

    if ($host = cliche.life) {
        return 301 https://$host$request_uri;
    }
    return 444;
}
I changed it so that only requests with a legitimate Host header get through, and chose to block direct IP or port connections with 444 instead of 404. Returning 444 makes nginx drop the TCP connection outright, so the browser just shows a disconnection, and most bots treat the server as unresponsive and don't try again.
But that wasn't all: they were also probing every sensitive path imaginable, starting with .env, then .git, wp-login, admin, .bak, config/, and so on.
Of course I'm not an idiot, and I don't leave sensitive settings exposed to the outside world, but these guys were relentless. I couldn't stand watching my logs get polluted by requests for paths that have nothing to do with this blog, like wp-login and PHP configs, so I decided to set up fail2ban.
Introducing Fail2ban
What is Fail2ban?
Fail2ban is an intrusion prevention framework that protects computer servers from brute force attacks. It is written in the Python programming language and can run on POSIX systems with interfaces to packet control systems or locally installed firewalls (iptables or TCP wrappers).
I decided to adopt Fail2ban because it seemed like the best fit for the log trails I was seeing.
sudo apt install fail2ban
On Ubuntu environments like mine, you can install fail2ban with this command.
After that, you create a custom filter; in my case it looked like this:
[Definition]
failregex = ^<HOST> .* "(?:GET|POST|HEAD) /(?:\.git/|\?XDEBUG|\?a=|\.env|wp-login|admin/|administrator/|xmlrpc\.php|config/|\.well-known/|console/|_ignition/|\.sql|\.bak|\.php\?|cgi-bin/|\.asp|myadmin|phpmyadmin|adminer|manager/|jenkins/|solr/|%00|%27|%20select%20|%20or%201=1|%20and%201=1|%20from%20|/etc/passwd).*" .*$
ignoreregex =
To build it, I exported the actual Kibana logs to a CSV, analyzed them, and wrote the pattern to block every malicious path that had actually been requested.
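A failregex like this is easy to check standalone before deploying. The sketch below uses a simplified subset of the pattern above, and approximates fail2ban's `<HOST>` placeholder (which fail2ban normally substitutes with its own host-matching group) with an IPv4 regex; the log lines are made-up examples:

```python
import re

# Simplified subset of the filter's failregex, for testing purposes.
failregex = r'^<HOST> .* "(?:GET|POST|HEAD) /(?:\.git/|\.env|wp-login|admin/|xmlrpc\.php).*" .*$'

# fail2ban substitutes <HOST> itself; approximate it with an IPv4 group.
pattern = re.compile(failregex.replace('<HOST>', r'(?P<host>\d{1,3}(?:\.\d{1,3}){3})'))

bad = '203.0.113.7 - - [05/Mar/2025:12:00:00 +0900] "GET /.env HTTP/1.1" 404 162 "-" "curl/8.0"'
good = '198.51.100.3 - - [05/Mar/2025:12:00:01 +0900] "GET /posts/1 HTTP/1.1" 200 512 "-" "Mozilla/5.0"'

hit = pattern.search(bad)    # malicious probe: matches, host captured for banning
miss = pattern.search(good)  # legitimate path: no match
```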

You can see the logs like this.

In fact, just looking at the number of bans over the last two or three days shows how many of these requests are coming in.
However, to see these logs I had to run

sudo tail -f /var/log/fail2ban.log

on the host, and I realized I should have added fail2ban to the ELK stack so everything could be managed in one place. (I'm not going to lie, I thought this would be a good idea.)
The whole reason I adopted the ELK stack in the first place was that viewing logs on the host was unmanageable and annoying, so it made no sense to open a terminal and connect to the host every time I wanted to check the fail2ban logs.
But the problem was that the ELK stack runs in Docker containers while Fail2ban runs on the host, so the solution was to use Filebeat to ship Fail2ban's logs to Logstash separately.
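The post doesn't show the Filebeat configuration, but a minimal filebeat.yml for this setup would look something like the following; the fail2ban tag and port 5045 are the ones used later, while the input type and host are my assumptions (newer Filebeat versions prefer the filestream input over the deprecated log input):

```yaml
filebeat.inputs:
  - type: filestream
    id: fail2ban
    paths:
      - /var/log/fail2ban.log
    tags: ["fail2ban"]   # used to route events in the Logstash pipeline

output.logstash:
  # Logstash container port published on the host
  hosts: ["localhost:5045"]
```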
There were many twists and turns along the way. At first I set Filebeat's port to 5044, which collided with the Logstash input already carrying the FastAPI logs and took Logstash down. I then moved Filebeat to port 5045, but the logs still couldn't reach Logstash because port 5045 wasn't open on the Docker container.
Also, since I hadn't tagged the logs separately, the Fail2ban logs got mixed in with the existing FastAPI logs.
I solved each of these problems by modifying docker-compose.yml to add the following lines.
logstash:
  image: docker.elastic.co/logstash/logstash:8.12.0
  volumes:
    - ./elk/logstash/pipeline:/usr/share/logstash/pipeline
  ports:
    - "5044:5044" # FastAPI
    - "5045:5045" # Fail2Ban
  depends_on:
    - elasticsearch
  networks:
    - elk
This opened a separate port for Filebeat on the Logstash container. Then, in the pipeline output,
output {
  if "fail2ban" not in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "fastapi-logs-%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
  }
}
I modified the existing output so that events go to the fastapi-logs index only when the fail2ban tag is absent.
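The post doesn't show the fail2ban side of the pipeline. Assuming a Beats input on 5045 and a separate index (both the input type and the index name are my guesses), it might look like this:

```
input {
  beats {
    port => 5045
  }
}

output {
  if "fail2ban" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "fail2ban-logs-%{+YYYY.MM.dd}"
    }
  }
}
```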
This let me use the ELK stack to collect and visualize logs from both Fail2ban and FastAPI, while adding extra security settings along the way. I could see that I was getting far more malicious requests than I thought, and I learned that security is not something that ends at dev -> deploy.
