Evaluate less exploitative solutions to bypassing captcha #1256
An immediately actionable step that could be taken is to note the use of anti-captcha.com in Invidious somewhere. Currently the only way to find out about this is to look through the source code itself, and I think many users who want to avoid google poison also care about issues such as this and would appreciate being notified that Invidious is using them. |
Hello, I am the person who suggested the use of anti-captcha. I really care about these issues, and it's something we checked carefully beforehand. We spent some time analyzing the company and came to the conclusion that this is a real company (it's not run by one guy from his garage) that provides "acceptable" work with an acceptable-to-okay salary for people who live in countries with really low salaries. From what we gathered, the salary they offer their workers is on par with the average salary of those countries, but the work is physically "easier" than what those countries could otherwise offer them. Most of those countries are third world countries where the only work available is in mining or food production for first world countries (literally us); anti-captcha allows some of those people to at least avoid this by getting an okay salary doing "easy" work. In my opinion this is okay. Yes, it could be better, but I think it is better for people to solve captchas than to produce stuff for a company that doesn't care about their lives. This is a far deeper issue than just Invidious and anti-captcha, and I don't think that no longer giving them this work will make things better. |
I've successfully used a program like uncaptcha2 with the help of puppeteer to solve the audio version of Google reCAPTCHA. This is certainly another possible way to avoid using anti-captcha, and because Invidious solves a reCAPTCHA only every 3 hours, Google may not detect that it's not a human solving the audio reCAPTCHA. The main issue is that it's not 100% reliable and has a major annoyance:
I plan on building a distributed audio reCAPTCHA solving solution and offering it as a clone of the anti-captcha API, so that every Invidious instance owner can just change the API URL in the source code and enjoy a CAPTCHA-free experience. |
I would love to see that, @unixfox! While it might not be viable for a large instance like the flagship, I imagine it could work well for people hosting their own instances. |
@TheFrenchGhosty I appreciate the time that you and those involved took to look into the company. That said, I do personally still believe this is an issue. Looking at their site, anti-captcha.com reads as an incredibly stressful job to perform, with the metaphorical guillotine of losing your livelihood hanging over your head at all times if you fill out too many captchas wrong (as we all know, something very easy to do). They seem to take particular pride in banning workers quickly when they see that they're "cheating". I agree that there's no good answer here. I wish there was something that I could open up with and say, "let's replace anti-captcha.com with this solution I have that isn't as exploitative!", but at the very least I want Invidious to be open about using such services for now instead of it just staying tucked away in the source code where your average end-user will never hear about it. |
May I ask what exactly makes paying individuals to fulfill a market demand and make your life easier exploitative? Are you suggesting free humans shouldn't have the freedom of voluntary employment, thereby pushing people further into economic distress? |
people are getting exploited because they are being tricked or pressured, not because they enjoy it. the only kind of people who seem to love it are those who see themselves as temporarily embarrassed millionaires and advocate exploitative behavior just because "choice" or "freedom". in reality this is just instinctive submission to what they perceive as the alpha male, even if it's just a company or "the market" |
I’m sorry, everything you just said is abstract, illogical, unsound and confusing. I’ve got no idea what you’re talking about. You’re not helping your cause. All I see is a lot of thumbs down and no responses of substance to what looks to be a peaceful and logical/rational/free market solution, if I understand this ‘issue’ properly.
How are people being tricked?
Who are these people being tricked?
Have you surveyed people working in this field?
How do you know all of this?
Where can I find out more about this crusade against people filling out captchas?
Why is the captcha completion industry the focus of your attention?
Do captcha crusaders understand Austrian economics or freedom of individual choice?
What makes this job of completing captchas exploitative, over any other job that someone might do, like say a toilet or hazmat cleaner?
Is the concern about how much money people are being paid? If so, how do you claim to know what someone is worth, or should be paid, more than they themselves know what they’re worth?
This whole thing is very confusing to me. If an entrepreneur has come along and created a new industry within a third world country (as this example may be), and then hired two hundred people to complete captchas, how has he exploited those he employed by providing an income they otherwise wouldn’t have had without this opportunity? Hasn’t the entrepreneur actually improved the lives of 200 people by granting them another source of financial income?
They would be worse off without it, or they wouldn’t agree to trade their time for money, right? Assuming criminality like slavery or coercion isn’t involved.
What if captcha solver X, Y or Z hears that EthicalCaptchaCo just opened up down the road and offers a higher pay rate and better conditions? Could he or she leverage the experience gained at the current job and seek out a better standard?
|
I don't know; people who work there are not forced to work there.
They are not.
No, but I registered as a worker on their website and "worked" for around an hour to test this out.
Online reviews, own experience.
Crusade?!
Google blocks access to youtube.com with a captcha if it sees too many requests from the same IP. Therefore we need to solve a captcha to access the youtube.com data again. This only hits big instances.
I can't answer that; I don't know if they do.
I think it is way better than being a cleaner in this country.
Yes, it is all about money. We don't claim anything; we just compare our first world standard with theirs and understand that they get less than us. We did not decide that. We can't change it with a snap of the fingers.
Most likely, but just because it is worse somewhere else doesn't mean it is good here. |
Thanks for the response @Perflyst. I feel somewhat more confident I'm not losing my mind now. I say crusader because this is the second anti-captcha sentiment I've seen lurking in open source projects. The other was in the searx issues, where the social justice warriors there deleted my comment out of the thread for merely questioning the same exploitation logic. Who doesn't love a bit of censorship of opposing opinions? So is anyone able to shed some light on this whole captcha-completion-is-exploitation virtue signalling? My guess is these well-intentioned people don't realize that the very people they think they're helping are actually getting screwed if their jobs/income go away. |
This comment has been minimized.
Thank you for the discussion, but please stay on topic and actually evaluate an alternative to anti-captcha services. |
The above discussion is relevant. The fundamentals of the initial premise are being challenged, due to being based on incorrect or unsound logic; that there is in fact no need to seek alternatives. That's what the discussion intends to (and appears to have) showcase(d). |
How you think my comment is off topic is bewildering! |
The fact is that no worker from a third world country should have to work for us so that we can consume more or less trivial things. In my opinion the aim is to find a technical solution as @unixfox proposed in #1256 (comment) |
I asked Perflyst to hide some comments because this issue is moving away from its initial subject, which is alternatives to anti-captcha, not whether anti-captcha is an ideal solution for Invidious. If you would like to talk about whether the main solution for solving the Google reCAPTCHA is good or bad, feel free to open another issue. My main concern is that this discussion is discouraging people who might otherwise talk about alternatives, because the issue is too focused on the ethics of using anti-captcha. I'm really interested to see whether someone else comes up with another good solution, and I don't want to unsubscribe from this issue because it's no longer talking about alternatives to anti-captcha. |
I'm not certain I understand why Perflyst's "people are not being tricked or forced to work there" comment and m4teh's comments where they quiz me on how this could possibly be exploitative remain while mine that responds to them and attempts to explain the opposite side of the discussion are hidden. I can understand the wish to hide the discussion on the nature of the ethics of this situation so we can focus on the technical side, but completely hiding one side of the argument while only hiding the late rebuttals on the other side seems a little disingenuous to me. |
As one of the thumbs-down-ers: @m4teh, I don't think you're going to get a substantive response. As soon as you opened with your very first reply with
The answer should have been "no". It's been a massive derail into a pseudo-ethical discussion on what should be a technical issue with a very reasonable, limited request by @lambdadog:
and it seems everyone else is on-board with the idea of giving the user the freedom (and we know how you feel about freedom) to personally choose a technical alternative, if one can be found. Everyone else seems to recognize that, independent of each of our internal motivations, perhaps offering this choice would be a net positive for everyone, even if an individual person is not motivated to opt-in. Perhaps let the good hardworking folks create this opt-in alternative option. And then after it launches in Edit: With my sincerest apologies to the |
going back to the technical side of the issue i think there are the following ways and probably many more to tackle it
many of these points reflect ideas and suggestions spread across many different issues. unfortunately i don't have those at hand but i'm confident they will resurface if there is interest. possibilities are plentiful. the difficulty is to decide the most effective path. |
Another solution is to use dynamic multiport mobile proxies, which make every request come from a new 3G/4G IP address. A multiport proxy costs ~$41 per month. (I have tested this service: https://airsocks.in/en#tariffs. You can test any proxy type for free for a few hours.) We talked about that with Omarroth, and I have tested everything locally; it worked perfectly. I was surprised when I saw that Invidious is using anti-captcha to solve the issue of YouTube request limits from one IP. When I ran my tests, I found the line in the source code where the HTTP client is initialized and hardcoded the proxy address and port there. I asked in the issues some time ago for a config option for proxy host and port, but it is still not implemented 😅 I am not familiar with the Crystal language. It would be great if somebody could make a pull request and implement these simple config options for the HTTP client. |
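A minimal sketch of the kind of config option requested above, written in Python rather than Crystal purely for illustration. The `proxy_host` and `proxy_port` keys and both helper names are hypothetical; they are not existing Invidious settings.

```python
import urllib.request
from typing import Optional


def proxy_url_from_config(config: dict) -> Optional[str]:
    """Build a proxy URL from two optional (hypothetical) config keys."""
    host = config.get("proxy_host")
    port = config.get("proxy_port")
    if not host or not port:
        return None  # no proxy configured
    return f"http://{host}:{port}"


def build_opener_from_config(config: dict) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS traffic through the proxy, if one is set."""
    proxy_url = proxy_url_from_config(config)
    if proxy_url is None:
        # An empty mapping disables proxying, including proxies taken from the environment.
        return urllib.request.build_opener(urllib.request.ProxyHandler({}))
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)
```

With something like this, the proxy address lives in the configuration instead of being hardcoded where the HTTP client is created.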
while expensive it seems to be a useful piece of the puzzle. mobile ips are clearly labeled as consumer devices and usually many people share one ip so they might go easy on the rate limiting. multiport could however be a problem. most captchas are prevented by users having a google cookie. multiport only makes sense if you don't use a cookie because you have a new ip for each connection. that could mean that you have to solve an initial captcha every time. no idea if google is currently asking for this but they could demand it without hurting regular users. would be a lot better if you could change ip manually. |
From my experience, Google doesn't relax its rate limit for consumer IPs. It's the same rate limit for everyone.
You don't really have to find proxies specifically from home or cellular connections; you just need a bunch of clean IPs. Google supports IPv6 and there are far more IP blocks in IPv6, so if you find a provider of IPv6 proxies with clean IPs, you have found a much less expensive solution. The big issue with your solution is that the bandwidth is very low (max 15 Mbit/s), and bandwidth is crucial for a service like Invidious that proxies actual videos, not just text. Imagine using that service with a lot of users watching videos: the Invidious instance would have huge playback issues. |
A lot of developers are using an api-only branch for fetching YouTube data; bandwidth is not a problem for that use case.
There are a lot of dynamic proxy services with APIs that let you rotate the IPv4/IPv6 address whenever you need, with or without a delay depending on the tariff options. |
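The rotation idea discussed above can be sketched as a small round-robin pool that temporarily skips any proxy that has just been served a captcha. This is only an illustration: the `ProxyPool` class is hypothetical, and the cooldown length is a guess that has to be supplied by the caller, since nobody in this thread knows how long Google keeps an IP flagged.

```python
import time
from typing import Dict, List, Optional


class ProxyPool:
    """Round-robin over proxy addresses, skipping ones that are cooling down."""

    def __init__(self, proxies: List[str], cooldown_seconds: float):
        self._proxies = list(proxies)
        self._cooldown = cooldown_seconds  # assumed value, not a known Google limit
        self._blocked_until: Dict[str, float] = {}
        self._next = 0

    def mark_blocked(self, proxy: str, now: Optional[float] = None) -> None:
        """Record that this proxy was served a captcha and should rest for a while."""
        now = time.monotonic() if now is None else now
        self._blocked_until[proxy] = now + self._cooldown

    def pick(self, now: Optional[float] = None) -> Optional[str]:
        """Return the next usable proxy, or None if every proxy is cooling down."""
        now = time.monotonic() if now is None else now
        for offset in range(len(self._proxies)):
            index = (self._next + offset) % len(self._proxies)
            candidate = self._proxies[index]
            if self._blocked_until.get(candidate, 0.0) <= now:
                self._next = (index + 1) % len(self._proxies)
                return candidate
        return None
```

When `pick` returns `None`, the caller would have to fall back to something else, such as solving a captcha.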
Can you explain what exactly the api-only branch is doing, @artshevtsov? Is it using the youtube API for searching, etc, but still using non-API mechanisms to fetch the actual video and proxy it?
The biggest concern with this solution to me is that the proxy services mentioned are likely used for a lot of spammy and malicious behavior and you may be likely to get a captcha right off the bat when rotating a good percentage of the time. I'd love to see some actual metrics from using this method though that might prove me wrong. |
With regard to needing "clean IPs", I'm curious about how Google actually handles "dirty IPs" in this case. Do we know how long it takes for them to be considered clean again? |
You can read about the API functionality here: https://github.com/omarroth/invidious/wiki/API
This project doesn’t use the YouTube API.
|
I just found this service that doesn't involve humans for solving Google reCAPTCHA: https://capmonster.cloud/en/, it's cheaper than anti-captcha and compatible with the anti-captcha API, so technically everyone can start using it thanks to #1473.
captcha_api_url: https://api.capmonster.cloud
captcha_key: captcha key from capmonster dashboard |
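To show why a compatible API makes the provider swappable, here is a hedged Python sketch of a client that only needs a base URL and a key. It is not Invidious's implementation. The `createTask` / `getTaskResult` endpoints, the `NoCaptchaTaskProxyless` task type and the JSON field names follow the anti-captcha-style API as I understand it and should be treated as assumptions; `solve_recaptcha` and `http_post_json` are names made up for this example.

```python
import json
import time
import urllib.request
from typing import Callable

Transport = Callable[[str, dict], dict]


def http_post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def solve_recaptcha(api_url: str, api_key: str, website_url: str, website_key: str,
                    post: Transport = http_post_json,
                    poll_interval: float = 5.0, max_polls: int = 60) -> str:
    """Create a solving task, then poll until the provider returns a token."""
    base = api_url.rstrip("/")
    created = post(f"{base}/createTask", {
        "clientKey": api_key,
        "task": {
            "type": "NoCaptchaTaskProxyless",  # assumed task type name
            "websiteURL": website_url,
            "websiteKey": website_key,
        },
    })
    if created.get("errorId", 0) != 0:
        raise RuntimeError(f"createTask failed: {created}")
    task_id = created["taskId"]
    for _ in range(max_polls):
        result = post(f"{base}/getTaskResult", {"clientKey": api_key, "taskId": task_id})
        if result.get("errorId", 0) != 0:
            raise RuntimeError(f"getTaskResult failed: {result}")
        if result.get("status") == "ready":
            return result["solution"]["gRecaptchaResponse"]
        time.sleep(poll_interval)  # task still processing, wait before polling again
    raise TimeoutError("captcha task did not finish in time")
```

Switching providers then only means passing a different `api_url` and `api_key`, which is what the two config keys above express.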
@FireMasterK Let's move here. That's not normal. Are you running multiple instances of Invidious at the same time? Do you often restart Invidious? |
Nope, just one instance of Invidious. Also, is anyone aware if Invidious saves the cookies obtained so there's no need to get new ones next time on restart? |
Yes, Invidious should store the cookies inside the config.yml. If it doesn't, then there is a permission issue. |
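One quick way to check for the permission issue mentioned above is to test whether the config file is writable by the user the process runs as. The Python helper below is a hypothetical illustration, not part of Invidious, and the path you pass in depends on your own setup.

```python
import os


def config_is_writable(path: str) -> bool:
    """Return True if the file exists and the current user may write to it."""
    return os.path.isfile(path) and os.access(path, os.W_OK)
```

Run it as the same user (or inside the same container) as Invidious; a `False` result would mean the cookies cannot be saved to that file.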
does anyone know how this specific service works in more detail? |
Is this true even with docker? Isn't the config overridden by the environment variable? |
Yes. I'm using the Docker image and you just need to mount the config.yml into the container. |
@FireMasterK do you have fewer reCAPTCHA attempts now that you keep the config.yml between restarts? |
Well, the images that Google gives for the captcha can be recognized by, for example, an image recognition system. There are a lot of systems like this on the market; here is one from Microsoft: https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/ By combining a system like this with an automated browser, the company (capmonster) is able to automate reCAPTCHA solving without any human interaction. You can find a demo of what I explained on my Twitter: https://nitter.snopyta.org/unixf0x/status/1075068461720702979 (sorry, it's in French, but just play the video). |
I have not tinkered with my docker-compose file yet; I'll do so later today. |
This issue has been automatically locked since there has not been any activity in it in the last 30 days. If this is still applicable to the current version of Invidious feel free to open a new issue. |
Invidious is currently using https://anti-captcha.com to solve and bypass captchas.
I and many others feel negatively about supporting an exploitative Mechanical Turk-like company to enable Invidious to continue to function. I'd very much like to start a discussion around more ethical alternatives for handling Google captchas.
There will certainly have to be tradeoffs to make this happen, but I'd like to see Invidious be as ethical a piece of software as possible, and I think a big first step is finding a way to stop supporting harmful and exploitative services like anti-captcha.com.
Ideally I'd like to see this issue left open as an open discussion about alternatives, even if they're just configurable for personal instances.