How to deal with spam

Hi,

I have been running web services for a few years, and over that time I have faced a problem that can be handled in several ways. Modern web applications face so many threats that you have to pay attention, and there are plenty of services and SaaS products that will do the work for you. The usual advice is not to reinvent the wheel when it comes to security. I agree, but depending on the size of your architecture, service or business, I’m sure you can handle at least part of it on your own. Honestly, when I faced network attacks for the very first time I had almost no way to deal with them, so I paid a well-known service that did the job for me. Later, once I was no longer under attack, I thought about a solution of my own that could at least reduce the number of attacks and the amount of spam on my web services.

DoS Attacks

There are many network threats on the internet. The best known are DoS (Denial of Service) attacks. Basically, attackers target your web apps and try to exhaust your resources (memory, CPU, bandwidth). You are then no longer able to deliver the service, or it runs so slowly that the QoS (quality of service) is not acceptable. There are many types of DoS attack and it’s hard to cover all of them, but as a network engineer you have to deal with them in order to keep your services up and running.

Spam

Spam could be considered a form of DoS attack, but for me it’s different. To me, a DoS attack is more network-oriented: you receive a ton of network requests (SYN/ACK) at the packet level that you have to handle, and they are not aimed at your web app in particular. Spam targets your running web services and abuses them, e.g. registrations, sign-in attempts, file uploads, etc.

Both situations have happened to me. The first one came a few years ago. I was running https://simply-debrid.com , a SaaS for downloading from multiple file-sharing services. Once a day I received a blackmail message from a group asking me to pay to make the DoS stop, and promising to monitor my site and cover me if I paid them monthly. Of course, I did not answer these emails, and I started dealing with the issue since my website had become slow and was going down from time to time. To deal with it I worked with Cloudflare (see also the review I wrote a few years ago). It is a security SaaS that you plug in on top of your web server; based on rules, it is supposed to keep threats away and challenge bots or bad users. Honestly, I had to set it up in a chaotic atmosphere, and I remember that it did not go well and was not that easy, but I managed it and Cloudflare was up and running in front of my website. Cloudflare acts as a proxy and supercharges your website: it serves as a CDN (Content Delivery Network), filters visitors and blocks threats.

I used Cloudflare for a while and then stopped. Why? Simply because I was no longer a target, and I thought I could handle things myself for free, at least for spam, which lately accounts for 90% of the attacks on my services.

Nowadays web apps, and even mobile ones, are well structured, with:

  • Front-end
  • Back-end

Generally, the front end connects to the back end and retrieves data from it through REST APIs. This sounds pretty good, but you also have to take care of security on both sides. It is important to check inputs on the front end before sending requests to the back end: on the one hand it saves bandwidth by not sending queries that are malformed or unexpected, and on the other hand you might avoid some security breaches or injections. The checks do not end on the front end, though, because anyone can ‘spoof’ or ‘bypass’ it by talking directly to your APIs. That’s why you HAVE to check your input data again on the back end. In my opinion this is the MINIMUM for security on this kind of architecture.
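To make the back-end check concrete, here is a minimal sketch in Node.js. The function name, the field names (email, password) and the limits are assumptions chosen for illustration; adapt them to whatever your own API expects.

```javascript
// Hypothetical back-end validator for a registration payload.
// Field names and limits are assumptions, not taken from a real API.
function validateRegistration(body) {
  if (body === null || typeof body !== 'object') {
    return { ok: false, errors: ['body must be a JSON object'] };
  }
  const errors = [];
  const { email, password } = body;

  // Reject anything that is not a reasonably shaped e-mail string.
  const emailShape = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  if (typeof email !== 'string' || email.length > 254 || !emailShape.test(email)) {
    errors.push('email is missing or malformed');
  }

  // Enforce a type and a length range before the value goes any further.
  if (typeof password !== 'string' || password.length < 8 || password.length > 128) {
    errors.push('password must be 8 to 128 characters');
  }

  return { ok: errors.length === 0, errors };
}
```

The same rules can run on the front end to save a round trip, but the back end has to run them regardless, since a client talking directly to the API never executes the front-end code.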

Then you can monitor attempts and failed calls on your APIs and critical inputs. This is what I have worked on over the past few years while trying to deal with spam.

Anti-spam strategy

I am not a security expert, but in all cases you should keep your ideas, schemas and architectures simple. Let them grow and become more complex later, once you need more advanced handling of your problem. Programming is all about this: building a simple solution to solve a problem, and then adjusting or combining solutions to solve bigger problems. There is also the common CAPTCHA, but first, it will kill your user experience, and second, a CAPTCHA is not a complete solution when your app is built around APIs, unless your API also checks a response token proving that whoever is sending the request is a valid user (one who completed the CAPTCHA challenge).

My strategy comes down to four steps:

  1. Create a function that temporarily logs those attempts
  2. Identify critical endpoints
  3. Periodically check your logged attempts against rules, mark them as blocked or not, and clean up the ones that do not match your criteria
  4. Update your infra/web app architecture to dynamically read your blocked items and deny them access to your resources for a while (depending on how long you want to block people)

Create a function that logs and identifies people. I recommend logging extra data so that you can identify the kind of error or failed attempt. On my side, as I said, it was pretty simple. I was logging this:
IP / user agent / number of attempts / type of resource (API, login, registration, etc.)

When my logging function is triggered, it works this way:

Has this guy who just failed to sign in on my website already tried to sign in?

  • YES - Retrieve his attempt row and add one to the number-of-attempts column
  • NO - Add him as a new logged user

Generally, the ‘unique’ identifier was IP/user agent. Even if it’s not strictly unique, it is uncommon to see one IP with several different user agents in a short period of time (at least that was the case for the bots spamming me).
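The YES/NO logic above can be sketched like this. It is a minimal illustration, not my production code: an in-memory Map stands in for the MySQL table, and the names (attempts, logAttempt, the row fields) are made up for the example.

```javascript
// In-memory stand-in for the attempts table (an assumption; the real
// data lived in a MySQL table).
const attempts = new Map();

// Records one failed attempt, keyed by IP + user agent.
// `now` can be injected to make the function easy to test.
function logAttempt(ip, userAgent, resource, now = Date.now()) {
  const key = `${ip}|${userAgent}`;
  const row = attempts.get(key);

  if (row) {
    // YES: already seen, so bump the counter on the existing row.
    row.count += 1;
    row.lastSeen = now;
    return row;
  }

  // NO: first failure for this IP/user agent pair, so create the row.
  // `resource` keeps the type of the first resource that was hit.
  const fresh = { ip, userAgent, resource, count: 1, firstSeen: now, lastSeen: now };
  attempts.set(key, fresh);
  return fresh;
}
```

With MySQL, the same "insert or increment" can usually be done in a single statement with INSERT ... ON DUPLICATE KEY UPDATE, assuming a unique key on the IP and user agent columns.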

Once your function is ready, identify your critical endpoints and call your logging function on failed attempts at login, registration or the core functions of your app. It is up to you to identify the parts of your website that trigger database queries or computations, the things that cost you money.
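As a rough illustration of where the call goes, here is a framework-free sketch of a login handler. Everything in it is hypothetical: makeLoginHandler, the shape of the req object and the two injected functions are stand-ins for whatever your own stack provides.

```javascript
// Hypothetical login handler showing where the logging call belongs.
// `checkCredentials` and `logAttempt` are injected so the sketch stays
// independent of any framework or database.
function makeLoginHandler(checkCredentials, logAttempt) {
  return function handleLogin(req) {
    const { username, password } = req.body;

    if (checkCredentials(username, password)) {
      return { status: 200, body: 'ok' };
    }

    // Only the failure path is logged: this is the event worth counting.
    logAttempt(req.ip, req.userAgent, 'login');
    return { status: 401, body: 'invalid credentials' };
  };
}
```

The same pattern applies to registration or any expensive endpoint: log in the branch that rejects the request, with a resource type that tells you later which part of the app was hit.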

Create an external job (cron or whatever) that periodically reads the logged users, blocks those whose number of attempts is greater than or equal to a given threshold, and cleans up the rest. E.g. block people who have made 100 attempts in 5 minutes, and block them for 6 hours.

I was also blocking people in bulk with another rule. If your criteria are based only on the number of attempts, you can end up cleaning up and unblocking some malicious users. While observing this, I found that some users were spamming just under my threshold but kept spamming for a long time. This is why you can also aggregate your results and group people by user agent or any other data they have in common. On my side these criteria were checked only in ‘under spam’ mode: if I had more than about 300 different rows in my table, spam mode kicked in and I grouped people by user agent. Within those groups, if the number of attempts was higher than 20, for example, everyone sharing exactly the same user agent and currently spamming my services was blocked as well. I did this in Node.js on a MySQL database and it worked pretty well.
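Here is a minimal sketch of that periodic review, using the example numbers from above. It assumes the job receives the rows logged during the last window (5 minutes in my example) as plain objects with ip, userAgent and count fields. Reading the "20 attempts" rule as the combined total per user agent is my interpretation for this sketch; a per-row check would work too.

```javascript
// Example thresholds taken from the text; tune them to your traffic.
const MAX_ATTEMPTS = 100;            // per IP/user agent row, per window
const SPAM_MODE_ROWS = 300;          // more rows than this => 'under spam' mode
const SHARED_UA_ATTEMPTS = 20;       // combined attempts per user agent (assumption)
const BLOCK_MS = 6 * 60 * 60 * 1000; // block for 6 hours

// Returns a Map of ip -> timestamp (ms) at which the block expires.
// The caller is expected to clear the reviewed rows afterwards.
function reviewAttempts(rows, now = Date.now()) {
  const blocked = new Map();
  const block = (ip) => blocked.set(ip, now + BLOCK_MS);

  // Rule 1: a single client went over the per-row threshold.
  for (const row of rows) {
    if (row.count >= MAX_ATTEMPTS) block(row.ip);
  }

  // Rule 2: only in spam mode, group rows by user agent and block every
  // IP in a group whose combined attempts exceed the shared threshold.
  if (rows.length > SPAM_MODE_ROWS) {
    const totals = new Map();
    for (const row of rows) {
      totals.set(row.userAgent, (totals.get(row.userAgent) || 0) + row.count);
    }
    for (const row of rows) {
      if (totals.get(row.userAgent) > SHARED_UA_ATTEMPTS) block(row.ip);
    }
  }

  return blocked;
}
```

Run it from cron or a setInterval, store the returned expiry times next to the blocked IPs, and drop a block once its expiry has passed.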

You are now logging users and blocking those who make too many attempts. Now you need to tell your web app and servers that those guys are not allowed to use your resources for a while. Honestly, there are so many ways to do it, and you are free to pick a solution depending on your needs.

How I handled blocked users

For now, the process is simple: I am logging, blocking and cleaning up users dynamically, and also reading blocked users dynamically. You can block people at the web app layer, meaning you check IPs in PHP, Node.js or whatever, and if an IP is among the blocked users you do not deliver the resource. You can also do it at the web server layer, with Nginx or Apache. I have been an Nginx user for quite a long time and there is a directive for this (deny). You can even load a *.conf file to do so, change that conf dynamically and reload it. On my side I decided to do it another way: since my service is spread across several servers, I decided to block people on input at the system/network layer, meaning I was blocking them directly with iptables.
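For the web server option, a small sketch of how the blocked list could be turned into an Nginx include file. I did not go this way myself, so take it as an illustration only: renderNginxDenyList and the Map of ip -> expiry timestamp are assumptions made for the example.

```javascript
const net = require('node:net');

// Renders the still-active blocks as Nginx `deny` directives.
// `blocked` is a Map of ip -> expiry timestamp in ms (an assumed shape).
function renderNginxDenyList(blocked, now = Date.now()) {
  const lines = [];
  for (const [ip, until] of blocked) {
    // Skip expired entries, and anything that is not a valid IPv4/IPv6
    // address so that bad data never ends up in the config file.
    if (until > now && net.isIP(ip) !== 0) {
      lines.push(`deny ${ip};`);
    }
  }
  return lines.join('\n') + '\n';
}
```

The output would be written to a file that your server block pulls in with an include directive, followed by a reload (nginx -s reload) so that the new list takes effect.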

That means my blocked users were shared from the database with all of my instances/servers, where they were periodically blocked and unblocked. This translated into iptables rules easily. Note that if there are too many people to block or unblock, it can take iptables a while to load all of your entries. I used ipset, which binds the blocked people to your iptables rules and is good at handling a lot of entries.
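To give an idea of the ipset side, here is a sketch that renders the blocked list as input for ipset restore. The set name spam_block and the function name are hypothetical, and the Map of ip -> expiry timestamp is an assumed shape. The sketch only handles IPv4, because a hash:ip set is IPv4 by default.

```javascript
const net = require('node:net');

// Builds a script for `ipset restore`. Each entry gets its own timeout,
// so the kernel drops it automatically when the block expires.
// `blocked` is a Map of ip -> expiry timestamp in ms (an assumed shape).
function renderIpsetRestore(blocked, setName = 'spam_block', now = Date.now()) {
  // `timeout 0` enables per-entry timeouts; `-exist` makes the line
  // harmless when the set is already there.
  const lines = [`create ${setName} hash:ip timeout 0 -exist`];

  for (const [ip, until] of blocked) {
    const seconds = Math.floor((until - now) / 1000);
    if (net.isIPv4(ip) && seconds > 0) {
      // Re-adding an existing entry with -exist refreshes its timeout.
      lines.push(`add ${setName} ${ip} timeout ${seconds} -exist`);
    }
  }
  return lines.join('\n') + '\n';
}
```

Each server would pipe this output into ipset restore on every run. The iptables rule itself is added only once, something like iptables -I INPUT -m set --match-set spam_block src -j DROP, and it keeps matching whatever the set currently contains.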

That’s it for today. This was a brief overview of how I handle spam issues on my web apps. I hope you enjoyed it and that it gives you some ideas and ways to manage your security.