THIS REPOSITORY HAS MOVED TO https://github.com/ace-ecosystem/urlfinderlib

urlfinderlib

Python library for finding URLs in documents and arbitrary data and checking their validity.

Basic usage

from urlfinderlib import find_urls

with open('/path/to/file', 'rb') as f:
    print(find_urls(f.read()))

base_url usage

If you are trying to find URLs inside an HTML file, the URLs it contains are often relative to the file's location on the server that hosts it. In that case, pass the base_url parameter so that these relative URLs can be resolved and extracted.

from urlfinderlib import find_urls

with open('/path/to/file', 'rb') as f:
    print(find_urls(f.read(), base_url='http://somewebsite.com/'))
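
To illustrate what resolving a relative URL against a base URL means, here is a minimal sketch using only the Python standard library. It does not show how urlfinderlib implements base_url internally; the link values below are hypothetical and exist only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical base URL and relative links, for illustration only.
base_url = 'http://somewebsite.com/'
relative_links = ['index.html', '/images/logo.png', 'docs/page.html']

# Resolve each relative link against the base URL.
resolved = [urljoin(base_url, link) for link in relative_links]

for url in resolved:
    print(url)
```

Without a base URL, a link such as docs/page.html has no scheme or host, which is presumably why it cannot be reported as a complete URL on its own.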
