April93/Kaffiene

Kaffiene

A popular ZeroNet search engine. Visit Kaffiene.bit or 1Mr5rX9TauvaGReB4RjCaE6D37FJQaY5Ba to access it on ZeroNet.

Experimental 0Git repo here: 1DUP5JRszgVVZbc2nq4B57qB6ZXk4sStKD


Python Management Tools

grab.py

grab.py's job is pretty simple, but it's the most complicated tool to set up and run. The script requires selenium to be installed for Python ('pip install selenium' should work). It also requires PhantomJS; I downloaded the Mac version from phantomjs.org. With that in place, edit the Python script to point to your PhantomJS binary. You can also optionally change which proxy you're grabbing from. Once all that setup is done, simply run the script. After a short time, 'Done?' will appear. Wait a few seconds (I wait about 3-5) and hit Enter. Three numbers should appear. These are the counts of grabbed addresses, names, and peers, and they should all be equal. grab.py then writes the peerlist.txt that merge.py expects.
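
The "three equal counts" check and the peerlist output can be sketched as follows. This is not grab.py's actual code: the scraping step (selenium plus PhantomJS) is left out entirely, the function names build_peerlist and write_peerlist are hypothetical, and the column order (address, siterank, name) is taken from the merge.py description below.

```python
def build_peerlist(addresses, names, peers):
    """Return peerlist lines in 'address siterank name' order.

    Hypothetical helper: the three lists stand in for whatever
    grab.py scraped. Their lengths are the three counts it prints.
    """
    counts = (len(addresses), len(names), len(peers))
    if len(set(counts)) != 1:
        # Unequal counts mean the grab was incomplete; do not write a file.
        raise ValueError(
            "grabbed counts differ: %d addresses, %d names, %d peers" % counts
        )
    return ["%s %s %s" % (a, p, n) for a, n, p in zip(addresses, names, peers)]


def write_peerlist(lines, path="peerlist.txt"):
    """Write one site per line to the given path (overwrites the file)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
```

Refusing to write the file when the counts differ is a design choice of this sketch; it keeps a half-finished grab from reaching merge.py.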

merge.py

This is the new merge.py tool. It's used to update the siterank entries. It takes an existing site index along with a new 'peerlist' that contains the addresses and peer counts of sites, as generated by grab.py. The peerlist.txt needs one site on each line: the address, followed by a space, then the siterank, followed by another space and the name of the site. merge.py will then find each of those sites in the index and combine the new siterank with the old one. If one of the two is missing, it will use the one available. If neither is available, it will keep the '-'. Any sites that are unique to the peerlist are written to a newsites.txt file, for easier tagging. Running the tool multiple times will append to newsites.txt, not overwrite it. New sites are still added to data.txt by hand at the moment (once automated tag generation is in place, this will no longer be necessary).
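
The merge rules above can be sketched roughly like this. It is an illustration under assumptions, not merge.py itself: the index is modelled as a plain dict of address to old siterank (the real data.txt layout is not described here), all function names are hypothetical, and averaging is only a stand-in for however merge.py actually combines the two values.

```python
MISSING = ("-", "", None)


def parse_peerlist_line(line):
    """Split 'address siterank name' into three fields (name may hold spaces)."""
    parts = line.strip().split(" ", 2)
    parts += [""] * (3 - len(parts))  # pad if siterank or name is absent
    return parts[0], parts[1], parts[2]


def combine_siterank(old, new, combine=lambda a, b: (a + b) // 2):
    """Apply the fallback rules; the default averaging is an ASSUMPTION."""
    if old in MISSING and new in MISSING:
        return "-"          # neither available: keep the dummy value
    if old in MISSING:
        return str(new)     # only the new value exists
    if new in MISSING:
        return str(old)     # only the old value exists
    return str(combine(int(old), int(new)))


def merge_siteranks(index, peerlist_lines):
    """Return (merged index dict, peerlist lines for sites not in the index)."""
    peers = {}
    for line in peerlist_lines:
        if line.strip():
            address, siterank, _name = parse_peerlist_line(line)
            peers[address] = (siterank, line.strip())
    merged = {
        address: combine_siterank(old, peers.get(address, ("-", ""))[0])
        for address, old in index.items()
    }
    new_sites = [raw for addr, (_rank, raw) in peers.items() if addr not in index]
    return merged, new_sites


def append_new_sites(lines, path="newsites.txt"):
    """Append rather than overwrite, matching the behaviour described above."""
    with open(path, "a", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")
```

Passing a different combine function (for example, one that keeps the larger value) swaps the combination rule without touching the fallback logic.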

mergeold.py

mergeold.py takes an existing site index, like the one Kaffiene has previously used, along with a new 'peerlist' that contains the addresses and peer counts of sites (as obtained from /Stats). The peerlist.txt needs one address on each line with no descriptive info, followed by a space and then the number of peers, or any other value you wish to assign to the site: a 'siterank', if you will. mergeold.py will then find each site in the index and append the siterank to the end of its entry (again, with a space separator). If a site is not in the peerlist, it gets assigned a dummy '-' value. Any sites that are unique to the peerlist are written to a newsites.txt file, for easier tagging.

It's worth noting that mergeold.py only works with a data.txt that does not yet have the siterank applied. Also, the new index.html expects a data.txt with siterank. Please keep that in mind.
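
As a rough sketch of what mergeold.py is described as doing, the function below appends a siterank to each index entry. It assumes that the address is the first whitespace-separated token of every index entry, which is a guess about the data.txt layout, and append_siteranks is a hypothetical name rather than anything taken from the script.

```python
def append_siteranks(index_lines, peerlist_lines):
    """Return (index entries with siterank appended, sites only in the peerlist).

    ASSUMPTION: each index entry starts with the site address.
    """
    ranks = {}
    for line in peerlist_lines:
        parts = line.split()
        if len(parts) >= 2:
            ranks[parts[0]] = parts[1]  # address -> peer count / siterank

    updated = []
    seen = set()
    for entry in index_lines:
        entry = entry.rstrip("\n")
        if not entry.strip():
            continue  # skip blank lines in the index
        address = entry.split()[0]
        seen.add(address)
        # Sites missing from the peerlist get the dummy '-' value.
        updated.append(entry + " " + ranks.get(address, "-"))

    new_sites = [addr + " " + rank for addr, rank in ranks.items() if addr not in seen]
    return updated, new_sites
```

Running this twice on the same index would append a second siterank to every entry, which matches the warning above that the input data.txt must not already carry siteranks.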

check.py

This is the original duplicate-checking tool. It works great and even allows for file selection without modifying the script. It simply reports how many unique sites are in the index and what the duplicates are, if there are any. It's a simple and straightforward tool.
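
A duplicate check of this kind might look like the sketch below. It is not check.py's code: find_duplicates is a hypothetical name, and it again assumes that the address is the first whitespace-separated token of each index entry. How check.py selects its input file is not covered here.

```python
from collections import Counter


def find_duplicates(index_lines):
    """Return (number of unique addresses, sorted list of duplicated addresses).

    ASSUMPTION: each non-blank index entry starts with the site address.
    """
    addresses = [line.split()[0] for line in index_lines if line.strip()]
    counts = Counter(addresses)
    duplicates = sorted(addr for addr, seen in counts.items() if seen > 1)
    return len(counts), duplicates
```

Calling it with the lines of an index file (for example, the result of readlines() on data.txt) gives both numbers the tool is described as reporting.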