The problem is formulated as a grid world in which an agent learns to move the mouse and click on the reCAPTCHA button to receive a high score. The agent's performance is studied as the cell size of the world varies, and the paper shows that performance drops when the agent takes big steps toward the goal. Finally, a divide and conquer strategy is used to defeat the reCAPTCHA system for any grid resolution.
The researchers formalize the problem as a Markov Decision Process (MDP) that can be solved using advanced RL algorithms. They then introduce a new environment that simulates the user experience on websites that have the reCAPTCHA system enabled. Finally, they analyze how RL agents learn or fail to defeat Google reCAPTCHA.
To pass the reCAPTCHA test, a human user must move the mouse from an initial position and perform a sequence of steps until reaching the reCAPTCHA check-box and clicking on it. Based on how the interaction goes, the reCAPTCHA system rewards the user with a score.
As shown in the figure, the starting point is the initial position of the mouse and the goal is the position of the reCAPTCHA check-box. A grid is constructed in which every pixel between these two points is a possible position for the mouse. The paper assumes that a normal user will not necessarily move the mouse pixel by pixel; hence, a cell size is defined as the number of pixels between two consecutive positions.
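The role of the cell size can be illustrated with a minimal sketch. The four-direction action set, the function names, and the clamping at the screen border are assumptions made for illustration and are not taken from the paper's code.

```python
# Hypothetical action set: each action moves the mouse by exactly one cell.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def grid_shape(width_px, height_px, cell_size):
    """Return (columns, rows) needed to cover a width x height pixel area."""
    cols = -(-width_px // cell_size)   # ceiling division
    rows = -(-height_px // cell_size)
    return cols, rows


def step(pos_px, action, cell_size, width_px, height_px):
    """Move the mouse by cell_size pixels, clamped to the screen area."""
    dx, dy = ACTIONS[action]
    x = min(max(pos_px[0] + dx * cell_size, 0), width_px - 1)
    y = min(max(pos_px[1] + dy * cell_size, 0), height_px - 1)
    return x, y
```

A larger `cell_size` means fewer cells and bigger jumps per action, which is the setting in which the article reports that performance drops.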
Agent’s mouse movement
After this, a browser page is opened at each episode with the user's mouse at a random position. The agent then takes a sequence of actions until it reaches the reCAPTCHA or the time limit. Once the episode is complete, the agent receives feedback from the reCAPTCHA algorithm, as any normal human user would.
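The episode structure described above (random start, a sequence of actions, termination at the goal or the time limit) can be sketched with a toy stand-in. `ToyCaptchaGrid`, its goal cell, and the greedy policy are all hypothetical; no browser or real reCAPTCHA is involved, and the final score that the real system would return is replaced by a simple "reached the goal" flag.

```python
import random


class ToyCaptchaGrid:
    """Toy stand-in for the browser environment (assumption, not the paper's code)."""

    MOVES = [(0, -1), (0, 1), (-1, 0), (1, 0)]  # up, down, left, right

    def __init__(self, size=10, max_steps=50, seed=None):
        self.size = size
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.goal = (size - 1, size - 1)  # assumed position of the check-box

    def reset(self):
        # Start the mouse at a random cell that is not already the goal.
        self.pos = self.goal
        while self.pos == self.goal:
            self.pos = (self.rng.randrange(self.size), self.rng.randrange(self.size))
        self.steps = 0
        return self.pos

    def step(self, action):
        dx, dy = self.MOVES[action]
        x = min(max(self.pos[0] + dx, 0), self.size - 1)
        y = min(max(self.pos[1] + dy, 0), self.size - 1)
        self.pos = (x, y)
        self.steps += 1
        reached = self.pos == self.goal
        done = reached or self.steps >= self.max_steps
        return self.pos, reached, done


def make_greedy_policy(goal):
    """Hand-written policy: move right until aligned with the goal, then down."""
    def policy(state):
        return 3 if state[0] < goal[0] else 1
    return policy


def run_episode(env, policy):
    """Run one episode and return (reached_goal, number_of_steps)."""
    state = env.reset()
    reached, done = False, False
    while not done:
        state, reached, done = env.step(policy(state))
    return reached, env.steps
```

In the setting the article describes, the hand-written policy would be replaced by a learned one, and the reward would come from the reCAPTCHA score at the end of the episode.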
The researchers trained a REINFORCE agent on a grid world of a specific size. The results presented in the paper are success rates across 1000 runs. A run counts as successful if the agent defeats the reCAPTCHA and obtains a score of 0.9. With a discount factor of 0.99 in the experiments, the agent successfully defeated the reCAPTCHA.
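Two quantities mentioned here can be made concrete with a short sketch: the discounted return that a REINFORCE-style agent uses to weight its actions, and the success rate over a batch of runs. The function names are hypothetical, and treating "a score of 0.9" as "at least 0.9" is an assumption.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def success_rate(scores, threshold=0.9):
    """Fraction of runs whose score meets the threshold (assumed to be >=)."""
    return sum(s >= threshold for s in scores) / len(scores)
```

With a discount factor close to 1, a reward received only at the end of the episode still contributes strongly to the return of early steps, which matters when the score arrives only after the final click.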
“Our proposed method achieves a success rate of 97.4% on a 100 × 100 grid and 96.7% on a 1000 × 1000 screen resolution”, state the researchers.
For more information, check out the official research paper.