reCAPTCHA Privacy — Is it an Oxymoron Now?

Google’s reCAPTCHA is by far the most popular CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). According to BuiltWith, it is currently used by more than 15 million websites, and according to Slintel, it has a market share of 97.08%.

Given that reach, the recent statement by the French privacy commission CNIL that reCAPTCHA uses excessive personal data for purposes other than security comes as a wake-up call. It means that website owners who want to guarantee the safety of their users’ personal data may struggle to do so while relying on it.

reCAPTCHA Privacy Concerns

This isn’t the first time that Google has run into trouble with the French regulator. CNIL fined the company 150 million euros in 2021 (90 million for Google LLC and 60 million for Google Ireland Limited) because users of google.fr and youtube.com could not refuse cookies as easily as they could accept them.

It’s a similar story this time, although CNIL wasn’t initially looking at Google. It only discovered the reCAPTCHA privacy concerns (the tool was sending European users’ data to Google’s US servers) as part of its investigation into an e-scooter company called Cityscoot. The firm was using reCAPTCHA on its website and app, but it did not ask for users’ consent and did not tell them what was happening to their data.

It should have done both, because the latest version of reCAPTCHA relies on cookies, and under the EU ePrivacy Directive you need to tell users which cookies you are using and why, as well as obtain their consent.

The reCAPTCHA Privacy Cookie

There are exceptions to this rule, though, and Cityscoot tried to argue that it wasn’t responsible for the reCAPTCHA privacy issue because its use of cookies was “strictly necessary (to provide a) service explicitly requested by the user…” However, there is a caveat to this exception, which says that “The act of authentication must not be taken as an opportunity to use the cookie for other secondary purposes…”

Since reCAPTCHA sends application and device data to Google for analysis, Cityscoot couldn’t claim this exception. In the opinion of the regulator, the act of authentication was taken as an opportunity to use the cookie for additional purposes. CNIL found that Cityscoot should have been informing its users about the reCAPTCHA privacy cookie and giving them the chance to opt out.

This creates quite a problem for reCAPTCHA as a security tool. If it requires consent, as CNIL says, then spambots are free to decline it, effectively turning it into a security door that opens for anyone.
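The consent requirement described above can be sketched on the server side. The following is a minimal illustration, not Cityscoot’s or Google’s implementation: it assumes a hypothetical consent record with a "recaptcha" key and only emits the reCAPTCHA script tag when the visitor has agreed. The function name and the consent structure are assumptions made for this example; the script URL is Google’s public loader address.

```python
from html import escape
from typing import Mapping

# Google's public reCAPTCHA loader URL.
RECAPTCHA_SRC = "https://www.google.com/recaptcha/api.js"


def render_form_scripts(consent: Mapping[str, bool]) -> str:
    """Return script markup for a form page, honouring the visitor's choices.

    `consent` is a hypothetical mapping of purpose name -> True/False,
    e.g. read from a first-party consent cookie.
    """
    if consent.get("recaptcha", False):
        # Consent given: the third-party script (and its cookie) may load.
        return f'<script src="{escape(RECAPTCHA_SRC)}" async defer></script>'
    # No consent: do not contact Google at all for this visitor.
    return "<!-- reCAPTCHA not loaded: consent not given -->"


if __name__ == "__main__":
    print(render_form_scripts({"recaptcha": True}))
    print(render_form_scripts({}))
```

Note that the second branch is exactly where the dilemma above appears: a visitor (or bot) who declines gets no reCAPTCHA check, so a site taking this approach would need some other first-party safeguard for that path.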

Website Tracking Technologies

The table below details the privacy issues of reCAPTCHA and of similar essential tools with potentially problematic behaviors.

Name: reCAPTCHA
Description: Automatically differentiates between humans and internet bots.
Additional Information: Often includes image challenges that humans can complete but bots struggle with.
Privacy Risks:
- Data collection, including IP addresses, user agent strings, and browser info.
- Can be used to put together a profile of users’ online activities.
- Google can track user browsing habits to construct behavior profiles.
- If users are signed into their Google accounts while using reCAPTCHA, Google could link collected data to their personal profiles.
- Google’s reCAPTCHA data collection practices are not made clear to users.

Name: Cookie
Description: Small text files placed on a user’s device when they visit a website. The website server creates and uses them to track user behavior and store information.
Additional Information: Session cookies are a temporary type. They are deleted when the browser is closed. Persistent cookies remain even after the browser is closed. They remember user settings and preferences for the next visit. Third-party cookies are set by external domains (not the ones being visited). Often used for tracking and advertising.
Privacy Risks:
- Can be used to track users’ online activities across different websites.
- Builds a profile of browsing behavior, preferences and interests.
- Can share info with third parties, including advertisers for targeted ads.
- Can collect IP addresses, device info, browsing history, demographic information and more.

Name: Pixel
Description: Tiny transparent images or snippets of code embedded in emails or web pages. When a user visits a site or opens an e-mail, the pixel triggers a request to a server, sending info about the visitor’s interaction with the content.
Additional Information: Often used for tracking how effective ad campaigns are, gathering analytics data, and measuring conversions. Good for gathering user metrics such as page views, clicks, and conversions. Pixels are often used for retargeting, which means readvertising a product or service to visitors who showed an interest in it before.
Privacy Risks:
- Pixels can be embedded in multiple websites and used to create a detailed profile of user browsing behavior.
- This cross-site tracking can help build detailed profiles that grow into digital fingerprints of their interests that may intrude on privacy.
- Pixels can collect IP addresses, device info, browsing history, and interactions with particular content.
- May lead to the collection of personally identifiable information without explicit user consent.

Name: Tag
Description: Also referred to as UTM codes, tracking tags or script tags. Code snippets embedded in a webpage’s HTML or placed in its header or footer. Frequently used to collect data, track analytics, serve ads, integrate with social media, third-party tools and more.
Additional Information: JavaScript, HTML, and other tags allow a website to communicate with external platforms or services. Tags can be used to track user behavior, measure website performance, personalize content, and perform various marketing and analysis roles.
Privacy Risks:
- Can potentially gather sensitive or personally identifiable information without users’ explicit knowledge or consent.
- Can share data with ad networks, analytics providers, and social media platforms, potentially without their explicit consent.
- Can track user behavior and interactions across platforms and websites, potentially creating detailed profiles with users’ online activities, interests, and preferences.

Name: iFrame
Description: Creates a region on a webpage where the content of another site can be displayed.
Additional Information: Stands for ‘inline frame’.
Privacy Risks:
- Can be exploited for Cross-Site Scripting (XSS) attacks, to access or manipulate sensitive information within the host page or perform unauthorized actions on behalf of the user.
- Embedded content could gather user cookies, IP addresses, and browsing patterns for tracking and profiling without users’ knowledge.
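To make the cookie types in the table concrete, here is a rough sketch that classifies a single `Set-Cookie` header as session or persistent, and as first-party or third-party. It uses Python’s standard `http.cookies` module. The function name, the example hostnames, and the simple suffix comparison used to decide "same site" are assumptions for illustration; a production check would consult the Public Suffix List rather than compare suffixes.

```python
from http.cookies import SimpleCookie


def classify_cookie(set_cookie_header: str, response_host: str, page_host: str) -> dict:
    """Classify one Set-Cookie header value.

    response_host: host that sent the header (hypothetical input).
    page_host: host of the page the user is actually visiting.
    """
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    if not jar:
        raise ValueError("no cookie found in header")
    name, morsel = next(iter(jar.items()))

    # A cookie with Max-Age or Expires outlives the browser session.
    persistent = bool(morsel["max-age"] or morsel["expires"])

    # Without a Domain attribute the cookie belongs to the responding host.
    cookie_domain = (morsel["domain"] or response_host).lstrip(".").lower()
    page = page_host.lower()
    # Simplified same-site test (assumption): exact match or subdomain.
    first_party = page == cookie_domain or page.endswith("." + cookie_domain)

    return {
        "name": name,
        "lifetime": "persistent" if persistent else "session",
        "party": "first" if first_party else "third",
    }


if __name__ == "__main__":
    print(classify_cookie("sid=abc; Path=/", "shop.example", "shop.example"))
    print(classify_cookie(
        "uid=42; Max-Age=31536000; Domain=tracker.example",
        "px.tracker.example",
        "shop.example",
    ))
```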

Your website may rely on all these tools to provide essential functionality, but you don’t want them to be misused, so how do you square that circle? The best answer may be external monitoring. Unlike embedded solutions, external monitoring can’t be blinded to the behaviors of third-party website components.
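One simple form of monitoring from the outside is to record every request a page makes (for example from a headless browser session or a HAR export) and compare the destination hosts with a list of approved third parties. The sketch below illustrates only that comparison step. The function name, hostnames, and allowlist are hypothetical, and this is not a description of how any particular monitoring product works.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def unexpected_hosts(
    request_urls: Iterable[str],
    page_host: str,
    allowed_third_parties: Iterable[str],
) -> List[str]:
    """Return third-party hosts contacted by the page that are not approved."""
    allowed = [a.lower() for a in allowed_third_parties]
    page = page_host.lower()
    flagged = set()
    for url in request_urls:
        host = (urlsplit(url).hostname or "").lower()
        if not host:
            continue  # relative or malformed URL: nothing to check
        if host == page or host.endswith("." + page):
            continue  # first-party request
        if any(host == a or host.endswith("." + a) for a in allowed):
            continue  # approved third party
        flagged.add(host)
    return sorted(flagged)


if __name__ == "__main__":
    observed = [
        "https://shop.example/app.js",
        "https://www.google.com/recaptcha/api.js",
        "https://px.tracker.example/p.gif",
    ]
    print(unexpected_hosts(observed, "shop.example", {"google.com"}))
```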

A case in point: the Reflectiz platform recently detected the TikTok pixel trying to access the login forms on a financial services company’s website. The pixel was attempting to pass sensitive user input to TikTok’s servers. The Reflectiz investigation team immediately sent the company clear steps to remedy this behavior, saving it the financial, legal, and reputational damage of a data breach.
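A behavior like the one in this case can, in principle, be spotted by inspecting what outgoing requests carry. The sketch below is a simplified assumption of such a check, not the Reflectiz method: given a captured request URL and form-encoded body, it reports any sensitive field names being sent to a host other than the page’s own. The field names and hostnames are hypothetical.

```python
from typing import Set
from urllib.parse import parse_qs, urlsplit

# Hypothetical names of form fields that should never leave the site.
SENSITIVE_FIELDS = {"password", "username", "email", "card_number"}


def leaks_sensitive_fields(url: str, body: str, page_host: str) -> Set[str]:
    """Return sensitive field names sent to a third-party host, if any."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    page = page_host.lower()
    if host == page or host.endswith("." + page):
        return set()  # first-party submission, e.g. the real login request

    # Collect parameter names from both the query string and the body.
    sent = set(parse_qs(parts.query, keep_blank_values=True))
    sent |= set(parse_qs(body, keep_blank_values=True))
    return {name for name in sent if name.lower() in SENSITIVE_FIELDS}


if __name__ == "__main__":
    print(leaks_sensitive_fields(
        "https://analytics.tracker.example/collect?event=input",
        "username=alice&password=hunter2",
        "bank.example",
    ))
```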