Noisy alerts about 401s without auth challenge #158
Comments
kris-sigur
pushed a commit
to kris-sigur/heritrix3
that referenced
this issue
May 2, 2016
This was referenced May 2, 2016
nlevitt
pushed a commit
that referenced
this issue
May 2, 2016
* Fixes issue #158 : Noisy alerts about 401s without auth challenge
* Update test to account for non-fatal-error log not being empty on non-auth 401s.
nlevitt
added a commit
to nlevitt/heritrix3
that referenced
this issue
Jun 7, 2016
* origin/master:
  Setup TravisCI
  Set fetch status on curis when testing link extraction
  No link extraction on URI not successfully downloaded
  Fixes issue internetarchive#158 : Noisy alerts about 401s without auth challenge (internetarchive#159)
  Make Content-Location header url INFERRED not REFFER hop type since Content-Location is not for redirection (internetarchive#151)
  fixes for kafka 0.9 (?)
  upgrade to kafka 0.9
  somewhat ugly fix to handle exceptions from the bean browser like java.lang.RuntimeException: not implemented at org.archive.modules.fetcher.BdbCookieStore$RestrictedCollectionWrappedList.get(BdbCookieStore.java:92)
nlevitt
added a commit
to vonrosen/heritrix3
that referenced
this issue
Jun 7, 2016
* fix-test-errors:
  hopefully fix remaining serialization tests in oraclejdk8 by using ConcurrentSkipListMap instead of ConcurrentHashMap
  hopefully fix serialization tests in oraclejdk8 by using TreeSet instead of HashSet in KeyedProperties.java
  clear the history store at the beginning of testBasics(), because the other test might have run first
  yeesh... "cd .." to get back to the right place to see the failure reports
  let me see what failed, travis
  Setup TravisCI
  Set fetch status on curis when testing link extraction
  No link extraction on URI not successfully downloaded
  Fixes issue internetarchive#158 : Noisy alerts about 401s without auth challenge (internetarchive#159)
  Make Content-Location header url INFERRED not REFFER hop type since Content-Location is not for redirection (internetarchive#151)
A 401 response is supposed to include an auth challenge but in practice a lot of sites erroneously use 401 without it (they should really be using 403s).
When Heritrix encounters such a response, it logs the error in such a way that it is added to the alerts log. Because this is not a problem with the crawler, these alerts are not useful, and the flood of them may hide other, more serious and actionable errors.
Example entry from the alerts log:
Suggest we modify how these errors are handled and log them in the nonfatal-errors.log only.
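To make the proposal concrete, here is a minimal sketch of the check it implies: detect a 401 that carries no `WWW-Authenticate` challenge and route it away from the alerts log. This is not Heritrix code. The class `UnauthorizedResponseTriage`, the `LogTarget` enum, and the method names are hypothetical, and the headers are modelled as a plain map rather than any Heritrix or HttpClient type.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; none of these names exist in Heritrix.
final class UnauthorizedResponseTriage {

    /** Assumed destinations: either this check does not apply, or log to nonfatal-errors.log only. */
    enum LogTarget { NOT_APPLICABLE, NONFATAL_ERRORS_ONLY }

    /** Returns true if the headers contain a non-blank WWW-Authenticate value (name matched case-insensitively). */
    static boolean hasAuthChallenge(Map<String, List<String>> headers) {
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            String name = entry.getKey();
            if (name == null || !name.equalsIgnoreCase("WWW-Authenticate")) {
                continue;
            }
            for (String value : entry.getValue()) {
                if (value != null && !value.trim().isEmpty()) {
                    return true;
                }
            }
        }
        return false;
    }

    /** A 401 without any challenge is the server's mistake, so it should not raise a crawler alert. */
    static LogTarget triage(int statusCode, Map<String, List<String>> headers) {
        if (statusCode == 401 && !hasAuthChallenge(headers)) {
            return LogTarget.NONFATAL_ERRORS_ONLY;
        }
        // Other statuses, and 401s with a challenge, are left to the normal handling.
        return LogTarget.NOT_APPLICABLE;
    }

    public static void main(String[] args) {
        Map<String, List<String>> noChallenge = Collections.emptyMap();
        Map<String, List<String>> basicChallenge = Collections.singletonMap(
                "www-authenticate", Collections.singletonList("Basic realm=\"example\""));

        System.out.println("401 without challenge: " + triage(401, noChallenge));
        System.out.println("401 with challenge:    " + triage(401, basicChallenge));
    }
}
```

A real fix would presumably live where the fetcher currently handles 401s and would use the crawler's own logging facilities. The sketch only shows the decision, not how Heritrix wires its log files.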