Releases: medialab/hyphe

Early 2024

31 Jan 09:52

boogheta

ChangeLog:

Give access to detailed crawl logs within the frontend (#452)
Various small UI fixes and improvements in the frontend (#482, #483, #485, #486, #488, #494)
Complete adaptation of web archives handling to INA's (#484)

Full Changelog: v1.10.9...v1.11.0

Back-to-school papercuts

25 Aug 17:31

boogheta

ChangeLog:

Add a button to export metadata from all pages of a webentity (#318)
Explicitly separate startpage warnings about redirected pages from those about faulty ones (#379)
Allow setting a specific User-Agent per crawl within the web interface (#461)
Display hints on the meaning of the different possible statuses of a crawl (#474)
Highlight corresponding webentities when hovering a status or a tag in the network legend (#459)
Switch the User-Agents list used within crawls to rely on https://www.useragents.me/ (#453)
Various improvements: cleaner backend logs, removal of empty traphs directories (#475), updated heuristics for how often webentity links are recalculated, visual fixes (#476, #477)
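
As an illustration of the per-crawl User-Agent option (#461), here is a minimal sketch, not Hyphe's actual code: a crawl-specific value, when given, takes precedence over a default pool. The names `DEFAULT_USER_AGENTS`, `choose_user_agent` and `build_request` and the sample strings are hypothetical.

```python
import random
import urllib.request

# Hypothetical default pool; the real list is sourced elsewhere.
DEFAULT_USER_AGENTS = [
    "ExampleBrowser/1.0 (X11; Linux x86_64)",
    "ExampleBrowser/2.0 (Windows NT 10.0; Win64; x64)",
]


def choose_user_agent(crawl_user_agent=None, pool=DEFAULT_USER_AGENTS):
    """Return the crawl-specific User-Agent if set, else a random default."""
    if crawl_user_agent:
        return crawl_user_agent
    return random.choice(pool)


def build_request(url, crawl_user_agent=None):
    """Build (but do not send) a request carrying the chosen User-Agent."""
    headers = {"User-Agent": choose_user_agent(crawl_user_agent)}
    return urllib.request.Request(url, headers=headers)


request = build_request("https://example.org/", "MyResearchCrawler/1.0")
# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```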

Hot Summer '23

21 Aug 10:34

boogheta

ChangeLog:

Migrated the caching of WELinks from mongo to (working) files in order to handle huge corpora
Allow setting the archives pass as an ENV variable for Docker instances
Display the time required by links indexation on the overview

Summer '23

21 Jul 17:27

boogheta

ChangeLog:

Added handling of more webarchives as sources (Arquivo.pt + INA DLWeb) + fixed various webarchives frontend info (#469, #471,
Added a corpus setting "ignore internal links" that still crawls links within the currently crawled webentity but does not record them, which drastically speeds up the indexation of entities with huge numbers of links. This comes at a cost in functionality: the network of internal pages is then not available, and entities that are split after a crawl will need to be recrawled (cf #371, #378, #433)
Better handle frontend warning on pending actions when trying to close a tab (#465, #466)
Minor fixes (#448, #460, #467, #468, #470, 50d97e8, 85decf2)
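
The idea behind "ignore internal links" can be sketched as follows. This is a simplified illustration under stated assumptions, not Hyphe's implementation: a URL's hostname stands in for its webentity (Hyphe defines webentities differently), and the names `entity_of` and `filter_links` are hypothetical.

```python
from urllib.parse import urlsplit


def entity_of(url):
    """Hypothetical stand-in: treat the hostname as the webentity."""
    return urlsplit(url).hostname


def filter_links(links, ignore_internal=True):
    """Keep (source, target) pairs, dropping those that stay inside one entity."""
    kept = []
    for source, target in links:
        if ignore_internal and entity_of(source) == entity_of(target):
            continue  # internal link: followed by the crawl but not recorded
        kept.append((source, target))
    return kept


links = [
    ("https://a.example/page1", "https://a.example/page2"),  # internal
    ("https://a.example/page1", "https://b.example/"),       # external
]
print(filter_links(links))
```

With the setting off (`ignore_internal=False`) every pair is kept, which is why the internal page network is only available in that mode.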

Better, faster, stronger traph, there it is!

29 Nov 18:11

boogheta

ChangeLog:

Switched to the new, backward-incompatible version 2.1 of hyphe-traph, which should help speed up indexation on big networks but requires rebuilding corpora from scratch
Made iterator traph calls less frequent to give priority to quick user actions
Fixed stack on calling empty callback in List Webentities
Upgraded urllib3 to handle SSL deprecation
Froze dependencies to maintain python2.7 compat

Summer '22

19 Aug 11:48

boogheta

ChangeLog:

Upgraded User Agents list
Added extra default WebEntity CreationRules for Github, Instagram, TikTok, Reddit and a bunch of blog platforms
Added perma.cc to list of default autofollowlinks
Various fixes and extra features for webarchives (links to archive permalinks, etc.)
Minor bugfixes

Spring '22

30 Mar 15:50

boogheta

ChangeLog:

Added a distinction between successful and errored crawled pages to identify Suspicious crawls (#425)
Fixed frontend compatibility within Hyphe-Browser (medialab/hyphe-browser#212)
Fixed WebArchives crawling interface (#431) and behavior from BNF's archives (#426)
Improved interaction on the network page using the latest sigma.js v2.2 (node highlighting, etc. & #367)
Allowed the frontend to automatically restart a closed corpus when it is opened directly on a specific corpus link (#440)
Allowed checking contiguous checkboxes in the frontend's lists of webentities using the Shift key (#438)
Allowed tuning the frontend's header color from the config (#430)
Published Hyphe on Zenodo & Software Heritage
Minor fixes (#397, #388, #432, #429, #437, #343, #341, #444, #325)

Robots sensitive crawls (stabilized)

15 Nov 14:55

boogheta

ChangeLog:

Fixed environment variable OBEY_ROBOTS for Docker instance
Added explanation helpers in frontend
Fixed undeletable corpora
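
For readers wiring a boolean option such as OBEY_ROBOTS through the environment, here is a minimal sketch of how a flag of this kind might be read. It is not Hyphe's code: the helper name `env_flag` and the set of accepted truthy strings are assumptions.

```python
import os

# Assumed truthy spellings; the values a real deployment accepts may differ.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name, default=False, environ=None):
    """Read a boolean flag from the environment, falling back to a default."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


# Example with an explicit mapping instead of the real environment.
print(env_flag("OBEY_ROBOTS", environ={"OBEY_ROBOTS": "true"}))
```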

Robots sensitive crawls

25 Oct 15:09

boogheta

ChangeLog:

Optional support for crawls respecting robots.txt (added by @stijn-uva #376 #421)
Minor fixes (sigma.js upgrade to v2.0beta, #370, #395, #423, #284, dba5721, ...)
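
To show what respecting robots.txt means in practice, the sketch below uses Python's standard `urllib.robotparser` to test URLs against a set of rules. It illustrates the general mechanism only and is not how Hyphe's crawler implements the option; the rules, the user agent name and the helper `is_allowed` are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Example rules, parsed from a string so that no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in ("https://example.org/public/page", "https://example.org/private/page"):
    print(url, is_allowed(ROBOTS_TXT, "examplebot", url))
```

A crawl that obeys these rules would skip the second URL; one that does not would fetch both.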

WebArchives powered crawls

23 Sep 11:00

boogheta

ChangeLog:

Allow starting crawls on web archives to browse past versions of webentities that have since disappeared or been modified (#372)
Allow setting up advanced settings for individual crawls (using a specific cookie, adjusting the depth, using a web archive...)
Allow displaying only crawled pages in a webentity's webpages list
Upgraded fake user agents dependency for more recent UAs
Add an API route to collect the content of a crawled webentity's webpages as clear text instead of zipped base64
Minor fixes (#397, #416, #418, 8b8f73f, 3b48755, 6aea48a, f3c1e85, e97b9d0, b05d470, 01aac8a, ...)
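
For context on the "zipped base64" format that the new clear-text route avoids, here is a minimal round-trip sketch. It assumes zlib compression followed by base64 encoding; the changelog does not specify the exact scheme the API uses, and the helper names `encode_page` and `decode_page` are hypothetical.

```python
import base64
import zlib


def encode_page(text):
    """Compress text with zlib (assumed scheme), then base64-encode it."""
    return base64.b64encode(zlib.compress(text.encode("utf-8"))).decode("ascii")


def decode_page(blob):
    """Reverse encode_page: base64-decode, then decompress to text."""
    return zlib.decompress(base64.b64decode(blob)).decode("utf-8")


html = "<html><body>Bonjour, médialab</body></html>"
blob = encode_page(html)
print(decode_page(blob) == html)
```

A clear-text route spares API clients this decoding step, at the price of larger responses.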
