A Simple Idea to Improve CAPTCHAs

A CAPTCHA is an automated test used to tell computers and humans apart. It is designed to be easy for humans but as difficult as possible for computers. For example, given an image of distorted text, a human can easily read the word, whereas a computer will struggle (how much depends on how the image is distorted).

One use for CAPTCHAs is spam prevention. The idea is that anything automated (such as a mass-spamming program) cannot pass the puzzle and therefore cannot inject its junk messages into a service. In practice, however, CAPTCHAs no longer work so well.

As demonstrated by the recent break of Gmail CAPTCHA, even really good CAPTCHA systems are vulnerable to a "mechanical turk" attack. This is an attack where, instead of computers attempting to solve the problem, the attacker "outsources" the solving to a group of real people.

For example, one way to conduct such an attack is as follows: to enter a porn site, a visitor must solve a CAPTCHA puzzle. The puzzle actually comes from a legitimate service and is merely reproduced on the porn site. People who want to see porn solve the CAPTCHA, thus giving the attacker a valid answer to the puzzle. (Of course, the scheme relies on those people giving a correct answer.)

An alternative is simply to pay people to do the work. In general, if the attacker can offer some incentive, it is possible to employ a number of people to solve CAPTCHAs, possibly without them knowing what the solved puzzles are really used for. This kind of attack is very hard to defend against.

However, there is at least one way to make the attack harder: enforce a strict time window in which the CAPTCHA must be solved. In a nutshell, the server creates a random ID, remembers it along with a timestamp, and presents the ID together with the CAPTCHA. When the user submits a solution, the ID is sent back with it, and the server notes the time (from its own clock) at which the solution arrived. The server then looks up the original timestamp by the received ID and computes how long the user took to answer. If the solution did not arrive within the allowed time window, it is rejected, even if it was correct.
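
The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a production design: the class name TimedCaptchaStore, the in-memory dictionary, the 20-second default, and the single-use rule for IDs are all choices made for this example and are not part of the original idea. The CAPTCHA image generation itself is left out; the server is assumed to already know the expected answer.

```python
import secrets
import time


class TimedCaptchaStore:
    """Hypothetical in-memory store: challenge ID -> (expected answer, issue time)."""

    def __init__(self, window_seconds=20.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock          # server-side clock only; never trust the client
        self._pending = {}

    def issue(self, answer):
        """Create a random ID, remember it with a timestamp, and return the ID."""
        challenge_id = secrets.token_urlsafe(16)
        self._pending[challenge_id] = (answer, self.clock())
        return challenge_id

    def verify(self, challenge_id, solution):
        """Accept only a correct solution that arrives inside the time window."""
        # Assumption for this sketch: each ID can be used once, so it is
        # removed from the store on the first attempt.
        entry = self._pending.pop(challenge_id, None)
        if entry is None:
            return False            # unknown or already used ID
        answer, issued_at = entry
        elapsed = self.clock() - issued_at
        if elapsed > self.window_seconds:
            return False            # too slow: rejected even if correct
        return secrets.compare_digest(solution.encode(), answer.encode())
```

A caller would invoke issue() when rendering the puzzle, embed the returned ID in the form, and pass the ID and the typed text to verify() when the form comes back.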


For example, giving the user 20 seconds to solve the CAPTCHA leaves a very tight window for an outsourcing/mechanical turk attack. The spam attempt and a visitor's arrival at the porn site would need to coincide within those 20 seconds, minus the time needed to solve the puzzle, minus the network latency. With today's traffic volumes such coincidences are certainly achievable, especially if a single spammer controls a large number of sites that can feed the mechanical turk attack. Still, there would be fewer such opportunities than there are now, when the CAPTCHA is not tied to time. Fewer opportunities mean fewer successful attacks, and therefore less spam.
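
To make the arithmetic concrete, here is a small back-of-the-envelope helper. The function name relay_slack and every number used with it are invented for illustration; real solving times and latencies will differ.

```python
def relay_slack(window_seconds, solve_seconds, latency_seconds, network_legs):
    """Time the attacker has left to find a human solver.

    window_seconds  -- time limit enforced by the protected service
    solve_seconds   -- time a person needs to read and type the puzzle
    latency_seconds -- assumed delay of one network leg
    network_legs    -- number of legs the puzzle and its answer travel
    A negative result means the relay cannot finish in time at all.
    """
    return window_seconds - solve_seconds - network_legs * latency_seconds


# Hypothetical figures: 20 s window, 8 s to solve, 0.5 s per leg, and four
# legs (service -> attacker -> lure site visitor -> attacker -> service).
slack = relay_slack(20.0, 8.0, 0.5, 4)
```

Under these assumed figures the attacker has 10 seconds in which a visitor must show up at the lure site; without a time window, that slack is effectively unlimited.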

Another aspect of the time-windowed CAPTCHA is that it demands efficient logistics for moving the puzzle to the group of solvers. This translates into increased costs for the attacker.

The presented idea is by no means perfect: it does not stop mechanical turk attacks in which real people are coupled to the spamming system and solve the puzzles in real time or near real time.

Further Notes

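There are some practical issues with the time-window approach. Most obviously, old, unused ID-to-timestamp mappings would need to be cleared from the database periodically. Otherwise an attacker could request puzzles without ever solving them and fill up the server's storage, causing a denial of service.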
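
A periodic cleanup could look roughly like the following sketch. It assumes the mappings live in a plain dictionary from ID to issue time; the function name purge_expired is made up for this example, and a real deployment would more likely use a database query or a cache with built-in expiry.

```python
def purge_expired(pending, now, window_seconds):
    """Delete every mapping older than the time window.

    pending -- dict mapping challenge ID to the time it was issued
    now     -- current server time, in the same units as the stored times
    Returns the number of entries removed.
    """
    # Collect the keys first: a dict must not change size while iterating.
    expired = [cid for cid, issued_at in pending.items()
               if now - issued_at > window_seconds]
    for cid in expired:
        del pending[cid]
    return len(expired)
```

Running this from a timer, or on every n-th request, bounds the store by the number of puzzles issued within one time window.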

Also, for practical reasons, the process would need to have two steps: in the first, the user writes the message (or fills in the registration details, or whatever else the CAPTCHA protects); in the second, the CAPTCHA is displayed and processed. This is because a user can spend a long time writing the information, and that should not be penalized. Besides, proof of having solved the puzzle is not needed until the user actually attempts to submit the information.
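
The two-step flow might be sketched as below. Everything here is an assumption made for illustration: the class TwoStepSubmission, its method names, the in-memory dictionaries, and the decision to keep the draft after a failed attempt so the user can simply request a fresh puzzle.

```python
import secrets
import time


class TwoStepSubmission:
    """Hypothetical flow: save the draft first, start the timer only afterwards."""

    def __init__(self, window_seconds=20.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock
        self._drafts = {}       # draft ID -> message text
        self._challenges = {}   # challenge ID -> (draft ID, answer, issue time)
        self.published = []

    def save_draft(self, message):
        """Step 1: the user may take as long as needed; no timer is running."""
        draft_id = secrets.token_urlsafe(16)
        self._drafts[draft_id] = message
        return draft_id

    def start_captcha(self, draft_id, answer):
        """Step 2: the clock starts only when the puzzle is displayed."""
        if draft_id not in self._drafts:
            raise KeyError("unknown draft")
        challenge_id = secrets.token_urlsafe(16)
        self._challenges[challenge_id] = (draft_id, answer, self.clock())
        return challenge_id

    def submit(self, challenge_id, solution):
        """Publish the draft only for a correct, timely solution."""
        entry = self._challenges.pop(challenge_id, None)
        if entry is None:
            return False
        draft_id, answer, issued_at = entry
        if self.clock() - issued_at > self.window_seconds:
            return False        # draft is kept; the user can request a new puzzle
        if not secrets.compare_digest(solution.encode(), answer.encode()):
            return False
        message = self._drafts.pop(draft_id, None)
        if message is None:
            return False        # draft was already published
        self.published.append(message)
        return True
```

Because the timestamp is taken in start_captcha() and not in save_draft(), the time spent writing never counts against the user.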