Requests for Comment/Replace reCAPTCHA with another CAPTCHA

It's recently come to my attention that Miraheze is currently using reCAPTCHA for account creation, and is attempting to replace that reCAPTCHA with a new version of reCAPTCHA.

This is a major problem. reCAPTCHA has absolutely no place on Miraheze for several reasons:

reCAPTCHA is owned by Google, which is a massive tracking/surveillance company. The purpose of reCAPTCHA is for Google to crowdsource their work onto other people and to put trackers and other nasties onto every page that uses it. For the most part, Miraheze is free of these. All requests stay within Miraheze and don't depend on a third-party server. The exception is reCAPTCHA, which requires both Miraheze's backend to depend on an external server and users to send requests to Google servers. This breaches users' privacy: instead of their data staying only on Miraheze, simply using Miraheze sends information to Google in the background without their knowledge. Fandom can keep those trackers and other nasties on their site; Miraheze is supposed to be free of that stuff. There are no ads, no clutter, and there certainly shouldn't be any of this business.

This also breaks MediaWiki's commitment to w:unobtrusive JavaScript, which MediaWiki has a lot of information on. As of right now, it is impossible to create an account without enabling JavaScript. This is a major problem for unobtrusive JavaScript, which MediaWiki commits to for a reason. Being unable to perform actions, especially basic ones like creating an account, without enabling JavaScript is not okay. Even more damning, reCAPTCHA's JavaScript is proprietary, so users must run proprietary code to make an account, blatantly violating Miraheze's free software commitment.

There are other, better ways to prevent spam, including many self-hosted captchas that do not require running proprietary JavaScript, are less of a burden to fill out, and stop spam better. Miraheze can discuss these and choose a new one to switch to.

Thanks for your time, Naleksuh (talk) 18:03, 7 February 2022 (UTC)

New Captcha
Sign your name on what you think should be the new captcha.

SimpleCaptcha
A simple captcha (that's the name) which requires you to answer a very basic math problem to pass. No dependencies or manual maintenance and works easily.
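As a rough illustration of the idea only (this is not ConfirmEdit's actual code), a plain-text arithmetic captcha can be sketched in Python as below. The names `make_challenge`, `check_answer` and `bot_solve` are hypothetical, and the sketch assumes addition-only questions. The last function shows what a script could do with a question that is sent as plain text.

```python
import random
import re

def make_challenge(rng):
    """Return (question_text, expected_answer) for a basic sum."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"{a} + {b}", a + b

def check_answer(expected, submitted):
    """Accept the submission only if it parses to the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False

def bot_solve(question_text):
    """Hypothetical bot: parse the plain-text sum and compute the answer."""
    match = re.fullmatch(r"\s*(\d+)\s*\+\s*(\d+)\s*", question_text)
    return str(int(match.group(1)) + int(match.group(2)))

# Generate one question and let the "bot" answer it.
rng = random.Random(0)
question, expected = make_challenge(rng)
print(question, "->", check_answer(expected, bot_solve(question)))
```

Because the question arrives as machine-readable text, the same few lines that check the answer are enough to compute it, which is the trade-off to weigh against the low maintenance cost.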

MathCaptcha
Like SimpleCaptcha, but displays an image with the problem instead of directly outputting text. A bit more maintenance for SRE, but may be more effective at stopping spam.
 * 1) Naleksuh (talk) 19:03, 7 February 2022 (UTC)

FancyCaptcha
A more traditional type of captcha with skewed characters that must be identified. May be annoying to solve and requires some dependencies, but may be more effective at stopping spam than SimpleCaptcha.

QuestyCaptcha
Displays a question. Extremely effective at stopping spam (maybe even more than ReCaptcha), but requires tech debt on questions and users may not know the answer to them.
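To make the maintenance cost concrete, here is a minimal Python sketch of question-and-answer checking. It is not the extension's real configuration format; the `QUESTIONS` table, its example questions, and the helpers `normalize` and `is_correct` are all hypothetical. Each question lists every accepted spelling, and submissions are normalized before comparison so that case and stray whitespace do not reject a real user.

```python
# Hypothetical question table: each question maps to its accepted answers,
# stored in normalized (lowercase, single-spaced) form.
QUESTIONS = {
    "What is the name of this wiki host?": {"miraheze"},
    "How many days are in a week?": {"7", "seven"},
}

def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())

def is_correct(question, submitted):
    """Return True if the normalized submission is an accepted answer."""
    accepted = QUESTIONS.get(question, set())
    return normalize(submitted) in accepted

print(is_correct("How many days are in a week?", "  Seven "))
```

Every accepted variant (and every translation) has to be listed by hand, and the table has to be replaced whenever bots learn the answers, which is where the upkeep comes from.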

Discussion

 * ConfirmEdit has some interesting ones, the most popular one of which is QuestyCaptcha but has a bit more free form style. Any thoughts on those? Naleksuh (talk) 18:04, 7 February 2022 (UTC)
 * To my knowledge, support for visual editor is poor or nonexistent in all alternatives. ~ RhinosF1 - (chat)· acc· c -  18:13, 7 February 2022 (UTC)
 * It's my understanding that captchas are only being used to create accounts. Are they elsewhere too? That is even worse. Naleksuh (talk) 18:14, 7 February 2022 (UTC)
 * login, account creation and adding external URLs for new users. This has never been different. ~ RhinosF1 - (chat)· acc· c -  18:16, 7 February 2022 (UTC)
 * That's weird, I don't get any requests to Google servers when on the login screen. I certainly hope there aren't more wiretaps there. I will look into CAPTCHAs on Visual editor (although I'm not 100% sure it would have the desired result- I've never seen a spambot try to use the visual editor before). Naleksuh (talk) 18:19, 7 February 2022 (UTC)
 * Just noting that this comment is made in my capacity as a Miraheze user. From the start, I would note that in order to change to another captcha, an alternative or alternatives should be proposed, and ideally this should have been done by the proposer. Without being presented with a viable alternative, I don't see how we can discuss removing ReCaptcha. Or if the purpose of this RfC is for people to propose their own captchas, that's fine, but how will it be clear which captcha is the preferred option which the community will propose to SRE? Reception123 (talk) ( C ) 18:25, 7 February 2022 (UTC)
 * My belief is that this is a request for comment, and people can comment on the new captcha. I even gave my own suggestion, which is currently being discussed. "Replace X captcha with Y captcha" would be fundamentally flawed for the same reason this is. What if people want to replace X, but oppose Y? That's one of the purposes of this RFC, to establish that. Naleksuh (talk) 18:29, 7 February 2022 (UTC)
 * In that case I think it would be more appropriate to have different sections for each proposal for a new Captcha, so it's clear who supports which Captcha, rather than have a long discussion where everything gets confused. I would've rather phrased this as 1) Do you wish to change from ReCaptcha? 2) Which Captcha would you prefer? Reception123 (talk) ( C ) 18:32, 7 February 2022 (UTC)
 * Fair enough. Let me experiment a bit with types of captcha and then I will add sections for each captcha type. Also, we need to discuss what we think about VisualEditor etc. My personal opinion is that 1) it not supporting VisualEditor is not a huge problem especially as bots usually use source 2) Even if that is a problem, the problems with ReCaptcha outweigh it so still worth switching 3) Problems are to be fixed, not avoided, unless they are out of our control like whatever Google is doing with ReCaptcha. Naleksuh (talk) 18:37, 7 February 2022 (UTC)


 * Don't you think this RfC should be Draftified so that it can be properly formatted as per suggestions from our recently closed RfC regarding RfCs? --  Joseph  TB  CT  CA   18:42, 7 February 2022 (UTC)
 * No, this is a discussion, and requiring draft RFCs failed. Naleksuh (talk) 18:51, 7 February 2022 (UTC)
 * I've just tried SimpleCaptcha, and it seems it works with VisualEditor perfectly fine. I have no idea what you were talking about. In fact, it's exactly the opposite: the page warns ReCaptcha may break with VisualEditor. Yet another reason to switch. Naleksuh (talk) 18:51, 7 February 2022 (UTC)
 * What indications do you have that SimpleCaptcha/MathCaptcha are effective and can't be easily cracked by bots? I would note that I would oppose this based on MW.org which states that "Note that the display of a trivial maths problem as plaintext yields a captcha which can be trivially solved by automated means; as of 2012, sites using SimpleCaptcha are receiving significant amounts of spam and many automated registrations of spurious new accounts. Wikis currently using this as the default should therefore migrate to one of the other CAPTCHAs." Reception123 (talk) ( C ) 19:16, 7 February 2022 (UTC)
 * I never said SimpleCaptcha can't be cracked. It's called simple for a reason, and some of the other options are better. MathCaptcha is like SimpleCaptcha, but sends an image, which would require OCR to defeat. QuestyCaptcha is thought to be extremely effective, even more so than reCAPTCHA, but does have some drawbacks. Either way, even if it is less effective, it is my belief that reCAPTCHA is something Miraheze cannot use as it goes against its core principles, and that anything else is better. Naleksuh (talk) 19:55, 7 February 2022 (UTC)
 * Yes, the new CAPTCHA is terrible. I don't know if it's still like this, but according to AbuseLog I still see spambots almost at work. Any good alternative is welcome. --YellowFrogger ( talk ) ( ✔ ) 19:40, 7 February 2022 (UTC)
 * I haven't researched them too deeply for background issues that may preclude them here, but what about hCaptcha? From what I'm aware they're at least preferable to reCaptcha at an ideological level, and seem effective as well as high profile enough. --Raidarr (talk) 20:27, 7 February 2022 (UTC)
 * hCaptcha is just a clone of reCAPTCHA, it has most of the same problems as reCAPTCHA and doesn't really solve anything. Naleksuh (talk) 20:34, 7 February 2022 (UTC)
 * This is just silly. What's the point of replacing reCAPTCHA with another CAPTCHA, if it could cause problems later on? --DarkMatterMan4500 (talk) (contribs) 20:47, 7 February 2022 (UTC)
 * Read this to find out. Also, what problems? Naleksuh (talk) 20:49, 7 February 2022 (UTC)
 * The reason ReCAPTCHA (out of all CAPTCHAs) has been implemented on Miraheze is because AIs have become too smart and can very easily decipher any other CAPTCHA systems. Simple, Math and FancyCaptcha all have "scarce effectiveness" per mw:Extension:ConfirmEdit. As for QuestyCaptcha, it'd be difficult to maintain a question that can be translated into different languages, accepts different variations in words in different languages, and that can be updated quickly if cracked. We don't use ReCAPTCHA because we're not aware of alternatives, we use it because AIs can easily crack all other CAPTCHAs so we need an AI-based CAPTCHA to catch another AI. It's confusing but none of the proposed CAPTCHAs will be effective at all and if anything, we might as well have no CAPTCHA. Agent Isai  Talk to me! 21:13, 7 February 2022 (UTC)
 * "we might as well have no CAPTCHA": Sounds good to me, not like reCAPTCHA actually stopped spambots from signing up.
 * Okay, on a more solution-oriented note. My belief is that Miraheze cannot continue to use reCAPTCHA; it is violating Miraheze's founding statement and core standards. If another solution is found which works better, great. But continuing to use reCAPTCHA is a problem.
 * I'm curious what Wikimedia is doing to stop spambots. They have no reCAPTCHA and have *even less* of a spambot problem than Miraheze. Naleksuh (talk) 21:31, 7 February 2022 (UTC)
 * At this current moment, we're working to fix our ReCAPTCHA setup but during the first few weeks of ReCAPTCHA v3, we saw a dramatic decline in spambot registrations. We're attempting to migrate to ReCAPTCHA Enterprise to see why that changed but I believe ReCAPTCHA v3 is very effective. Additionally, could you point to the "founding statement"? I am unable to find it. Anyway, the only strong alternative at this moment seems to be hCaptcha which touts itself as a privacy-oriented CAPTCHA service. Do you have any other ones in mind potentially? Agent Isai  Talk to me! 21:40, 7 February 2022 (UTC)
 * All tasks relating to CAPTCHAs should be stalled until this RFC is closed. There is certainly no reason to be upgrading reCAPTCHA when it is up in the air if it should be there at all. hCaptcha does not solve most of the problems laid out in the OP. The only advantage is that people may trust hCaptcha more than Google. If a better solution is found, good, but that is not the primary purpose. I will try to see what exactly Wikimedia sysadmins have to say about the matter. Naleksuh (talk) 21:48, 7 February 2022 (UTC)
 * Being that SRE cannot be coerced or bound to anything by Requests for Comments (which therefore means this RfC is technically out-of-scope), they are under no obligation to stall tasks until after RfCs are closed; however, I will make sure SRE closely follows this RfC and implements its outcome where possible. If this RfC were to close in favor of replacing ReCAPTCHA, SRE would likely consider the alternatives; however, no suitable alternative has been proposed so far, and getting rid of ReCAPTCHA completely without a strong alternative is off the table. I ask that you please propose alternatives to ReCAPTCHA which are strong and privacy-friendly if that's your main concern and SRE will look into them. Agent Isai  Talk to me! 21:55, 7 February 2022 (UTC)
 * hCaptcha has many of the same problems that reCAPTCHA has: it requires a dependency on an external server, users must enable JavaScript, users must run proprietary code, and it is one of the most annoying types to solve of all. The only thing it solves is removing Google trackers and other nasties from Miraheze which have no place here. Wikimedia is currently using FancyCaptcha, and not only has not seen an increase in spam, but actually has less spam than Miraheze. So either there is another factor to it or it is not as easy to crack as you claim. Naleksuh (talk) 22:07, 7 February 2022 (UTC)
 * If FancyCaptcha is serving MediaWiki well, it is surely worth investigating how to duplicate that performance. Otherwise in the event of an unusable/inconclusive RfC or where options presented otherwise are downgrades, I'd consider hCaptcha a solid step even if it does not address the full spirit of this proposal or if it isn't perfect on the whole. De-googling's benefit should not be understated. --Raidarr (talk) 01:25, 8 February 2022 (UTC)
 * FancyCaptcha is probably the least effective of the bunch and the only reason Wikimedia sees so little spam is because they have CheckUsers available at all times to globally rangeblock spambot IP ranges. It seems Naleksuh wants an in-house solution which I sadly don't see happening any time soon as all current CAPTCHA extensions are extremely ineffective and as we also don't have the resources to run an in-house CAPTCHA system. It would have to be an external service backed by AIs. Agent Isai  Talk to me! 06:02, 8 February 2022 (UTC)
 * I never suggested an "in-house solution", although that would certainly be better than continuing to use 🤮reCAPTCHA🤮. Yes, actually using rangeblocks is another alternative. Naleksuh (talk) 06:53, 8 February 2022 (UTC)

Can't we make our own captcha that is made specifically for Miraheze's MediaWiki? It would be lots of work but it would be worth it.  Anpang 📨 01:08, 8 February 2022 (UTC)
 * What would that captcha look like, and what would make it any different from the existing ones? It seems like it could only be inferior to the existing ones or would just be something that could be options for Questy. And would need to be open source. Naleksuh (talk) 01:16, 8 February 2022 (UTC)
 * I am not very knowledgeable in these more technical fields but my understanding is that simple captchas like the SimpleCaptcha or MathCaptcha are ineffective against smart artificial-intelligence powered bots which are able to easily figure out the 'answers' to these problems proposed. Because of this I have a difficult time understanding why the proposer would promote those ones instead of looking for an alternative to the ReCaptcha that is also more effective rather than less effective. I also believe that the concerns about Google can be played down especially due to the fact that I am sure that the very vast majority of the users on Miraheze use Google or Microsoft to conduct their daily business anyway and the users who especially avoid these two large companies I am sure are very rare. Regarding the QuestyCaptcha it is said on the MediaWiki.org Extension:QuestyCaptcha page that "Image-based CAPTCHAs have a few vulnerabilities. Bots using optical character recognition can crack them, and the only defense is to make the images harder to read for humans and computers alike". In consequence, without a captcha proposed that is superior to ReCaptcha I do not agree to switch ReCaptcha simply for the reasons given. --DeeM28 (talk) 05:26, 8 February 2022 (UTC)
 * Designing a CAPTCHA system is very difficult and most existing selfhosted CAPTCHA solutions seem to be easily crackable by bots. Agent Isai  Talk to me! 06:02, 8 February 2022 (UTC)
 * Special:Diff/235096 Naleksuh (talk) 06:46, 8 February 2022 (UTC)
 * Special:Diff/235282 Agent Isai  Talk to me! 06:49, 8 February 2022 (UTC)

All of the above options are worse for usability and worse for spam prevention. Maths questions are likely easily bypassed and it just takes a few tries to program in the answers to the questions. It is at least commonly believed that AIs are better than humans at image-text CAPTCHAs.

All of them require far more human interaction. QuestyCaptcha would probably be the worst as it is difficult for a real user to know in exactly what format the answer is wanted (for instance I ran into one of these and was stumped because it wanted the answer “d4” not “1d4”). FancyCaptcha appears not to have any option for visually impaired users. Maths CAPTCHAs are mainly just annoying, but this is still worse than the current situation.

Three reasons are given for this - dislike of Google, dislike of proprietary software, and dislike of JavaScript. All of them are bad and most of them are simply extreme philosophical ideas incomprehensible to the average person.

First of all, Google. The comment simply makes broad claims about Google as a company and says nothing about ReCAPTCHA. Google has stated that they do not use ReCAPTCHA for tracking or surveillance. They state, “The information collected in connection with your use of the service will be used for improving reCAPTCHA and for general security purposes. It will not be used for personalized advertising by Google.”

But what if they are lying? Well I for one would rather see advertisements for products that I want than for products I don’t - I like stuff that informs me of things that I might actually want to buy. Is that really so evil? In short, I don’t just not care about Google tracking me, I actively want it, because it is good for me.

Secondly, proprietary software. I don’t care whether or not you can read the code. It’s as simple as that. Your ability or lack thereof to read the code must have an impact on me for me to care about whether or not something is open source. In fact it would be worse if the code was public because it would be easier to program spambots to bypass it.

Thirdly, JavaScript. It would be ideal if it wasn’t required but seriously JavaScript has been everywhere for decades now. I think the only place where you won’t find it is certain feature phone browsers. In contrast with basically any other consideration it is meaningless. ~ El Komodos Drago (talk to me) 11:09, 8 February 2022 (UTC)
 * So, since it has basically been confirmed that reCAPTCHA is here because global sysops are not blocking proxy ranges, how about blocking proxies, for example like I am doing here? Then we won't have this problem in the first place and Miraheze's free software and MediaWiki design standard will no longer be tarnished. Naleksuh (talk) 05:56, 10 February 2022 (UTC)
 * What do you propose? That we block proxies as we already do per the No open proxies policy and convention of blocking spam ranges? We're already blocking them so I can't see what else we can do short of nominating more active Stewards to check spammers' IPs every day and rangeblock them. Sure you can accuse Stewards and Global Sysops of not doing their job but that's not going to change anything unless more people step up to the CVT batting plate. Agent Isai  Talk to me! 06:06, 10 February 2022 (UTC)
 * Well, there has clearly been insufficient blocking, and I'm not convinced that this means reCAPTCHA is okay, it's frankly the most damning thing I've ever seen here. I will send you my plans for a new CAPTCHA over IRC. We'll see if that works any better. Naleksuh (talk) 06:29, 10 February 2022 (UTC)
 * As a web dev, I agree on the third point: there can't be a good captcha without any JavaScript. Why oppose JavaScript? It's everywhere on the web. PHP is inefficient, Lua and Python have limitations and are hard to implement into websites, and HTML is just a markup language, it can't do anything.  Anpang 📨 06:37, 10 February 2022 (UTC)
 * Please read w:mw:No-Javascript notes. MediaWiki is designed to have full functionality without JavaScript, and for good reason, which reCAPTCHA is violating. Naleksuh (talk) 00:51, 11 February 2022 (UTC)
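
For the rangeblock approach raised earlier in this thread, the core check is whether a client address falls inside any blocked network. Below is a minimal Python sketch using the standard `ipaddress` module. It is not how MediaWiki or any Miraheze tooling implements blocks; the `BLOCKED_RANGES` list (filled with reserved documentation ranges) and the function `is_rangeblocked` are hypothetical.

```python
import ipaddress

# Hypothetical block list, using ranges reserved for documentation.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def is_rangeblocked(address):
    """Return True if the address falls inside any blocked network."""
    ip = ipaddress.ip_address(address)
    # Compare only against networks of the same IP version.
    return any(ip.version == net.version and ip in net
               for net in BLOCKED_RANGES)

print(is_rangeblocked("192.0.2.45"))
print(is_rangeblocked("198.51.100.7"))
```

The check itself is cheap; the ongoing work described above is in finding and maintaining the ranges, which is why it depends on how many people are available to do it.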