Paging based on 'continue' links #12

Open. hempels wants to merge 9 commits into base: master.


@hempels hempels commented Aug 24, 2012

Commit 154a8bc adds logic to the paging heuristic that gives heavy weight to links containing the word "continue".

All existing tests pass, and a new passing test has been added to cover that scenario. The example article is from http://www.ilr.cornell.edu/trianglefire/story/introduction.html.

Ignore the other commits in this pull request; they have all been submitted and reviewed previously.

Also removed the arbitrary exclusion of links over 25 characters long.
The exclusion seems to contradict the weighting approach and isn't
necessary for any of the test cases.
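
To illustrate why a hard length cutoff can work against a weighting approach, here is a minimal Python sketch. It is not the NReadability code (which is C#); the function name `candidate_links`, the sample links, and the assumption that the 25-character limit applies to the link text are all hypothetical. The point is only that a link dropped by the cutoff never reaches the scoring step, however well it would have scored.

```python
# Hypothetical cutoff mirroring the "links over 25 characters" rule.
# Assumption: the limit is applied to the visible link text.
MAX_LINK_TEXT_LENGTH = 25


def candidate_links(links, apply_length_cutoff):
    """Return the (text, href) pairs that remain eligible for scoring."""
    kept = []
    for text, href in links:
        if apply_length_cutoff and len(text) > MAX_LINK_TEXT_LENGTH:
            # Dropped before any weighting is applied.
            continue
        kept.append((text, href))
    return kept


# Made-up sample links for illustration only.
links = [
    ("Continue reading: The Fire, part two", "/story/part-two.html"),
    ("Home", "/"),
]

with_cutoff = candidate_links(links, apply_length_cutoff=True)
without_cutoff = candidate_links(links, apply_length_cutoff=False)
```

With the cutoff enabled, the long "Continue reading" link is discarded and only the "Home" link remains as a candidate; without it, both links go on to be weighted.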
Adds constructor overloads in NReadabilityWebTranscoder to enable
changing the page separator from the default.
Conflicts:
	NReadability.build
	NReadability.nuspec
	Src/NReadability/NReadability.Console/Properties/AssemblyInfo.cs
	Src/NReadability/NReadability.Tests/NReadability.Tests.csproj
	Src/NReadability/NReadability.Tests/NReadabilityWebTranscoderTests.cs
	Src/NReadability/NReadability.Tests/Properties/AssemblyInfo.cs
	Src/NReadability/NReadability.Tests/SampleWebInput/SampleInput_09_1.html
	Src/NReadability/NReadability.Tests/SampleWebInput/SampleInput_09_2.html
	Src/NReadability/NReadability/NReadability.csproj
	Src/NReadability/NReadability/NReadabilityTranscoder.cs
	Src/NReadability/NReadability/NReadabilityWebTranscoder.cs
	Src/NReadability/NReadability/Properties/AssemblyInfo.cs
	Src/NReadability/NReadability/UrlFetcher.cs

Manually merged changes from marek-stoj/NReadability

Links containing the word 'continue' usually imply a context of the
current page/article, so they're given a higher weight. All current
tests pass; new test added to cover this case.
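
The weighting described above can be sketched roughly as follows. This is a minimal Python illustration, not the C# code from commit 154a8bc: the names `score_link` and `pick_next_page`, the bonus values, and the list of generic "next page" words are assumptions chosen for the example, since the pull request does not state the actual weights.

```python
import re

# Hypothetical weights; the actual values used in the commit are not given.
NEXT_WORD_BONUS = 50
CONTINUE_BONUS = 100

# Assumed patterns for generic "next page" wording and for "continue".
NEXT_RE = re.compile(r"\b(next|more|older)\b", re.IGNORECASE)
CONTINUE_RE = re.compile(r"\bcontinue\b", re.IGNORECASE)


def score_link(link_text):
    """Score a link's text as a next-page candidate."""
    score = 0
    if NEXT_RE.search(link_text):
        score += NEXT_WORD_BONUS
    if CONTINUE_RE.search(link_text):
        # 'continue' usually refers to the current article, so weight it heavily.
        score += CONTINUE_BONUS
    return score


def pick_next_page(links):
    """Return the href of the best-scoring (text, href) pair, or None."""
    best = max(links, key=lambda link: score_link(link[0]), default=None)
    if best is None or score_link(best[0]) <= 0:
        return None
    return best[1]
```

For example, given `[("More stories", "/stories"), ("Continue to page 2", "/p2")]`, this sketch prefers `/p2`, because the "continue" link outscores the generic "more" link.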