
lessons learned over past 15 years #143

Open
synctext opened this issue May 16, 2013 · 22 comments
Labels: component: documentation, type: memo (Stuff that can't be solved)

@synctext (Member)

Goal: publish about issue tracking, unit tests, software repositories, continuous integration and test frameworks.

Publish at this industry track conference where they accept 'experience reports describing problems (and their solutions) encountered in real applications'.

Notes ToDo process

Possible storyline: slowly evolving from experimental prototyping (science-first phase) to production-level code (users-first phase). The team shifted over the years from scientists towards scientific software developers.

Berkeley DB towards SQLite (a one-year struggle). Support of swarmplayer, plugins and social networking (2008).

156 tickets from the P2P-Next project; we tried, but it did not work yet.

Cleanup of the core May 2013:

  • New design principle: Dispersy communities on top, the rest is core.
  • Internal tracker removed in favour of Libtorrent, deleting roughly 8k LOC.
  • Cleaner API.
  • get_peers exposed in Libtorrent; second DHT, rawserver and torrent collecting deleted.

4th generation file sharing: the 10 years journey towards 1 million users

@synctext (Member Author)

Document the Bartercast pivot.

Dispersy history and evolution.

ABN/TKI/ShadowI/Tranwall

Struggle of proxies from 2006 onwards.

  • donate to friend, 2006 CVS checkin
  • upb code base
  • first socks5 code
  • e2e
  • still not OK after 13 years

Finding external talent, just teach/make them ourselves

P2P seminar, hacking lab, blockchain.

Leaving Android, MythTV and Kodi

All code replaced!

@synctext (Member Author) commented Jun 8, 2019

We have made continuous improvement our cardinal organisational capability.

Relentless incremental improvement is difficult to sustain within a university environment, which is defined by strong pressure to publish elegant ideas. Publications that are reproducible and based on real-world complex problems are resource-intensive. Conducting reproducible science or actually solving problems can easily take 10x or 100x more time than publishing ideas. Trying to solve a problem is a risky career path. To fund our scientific work we successfully tapped into a reliable revenue stream: roughly 35% of our time is spent helping our government solve some of their most challenging ICT problems. Those problems have been selected to be closely aligned with our research interests, for instance, passport-grade online authentication.

In 2005 we published the following conclusion, after measuring Bittorrent for a few years:

One of our main conclusions is that within P2P systems a tension exists between availability, which is improved when there are no global components, and data integrity, which benefits from centralization.

The final sentences of our 2005 article mention our work on distributed accounting systems:

Another future design challenge for P2P file sharing is creating incentives to seed. For example, peers that seed files should be given preference to barter for other files.

14 years have passed since we started working on distributed accounting systems (now called blockchain or utility tokens). After relentless incremental improvements our online token economy within Tribler is nearly complete. Tribler still evolves. We recently replaced 50,000 lines of code with 5000 lines.

Tribler has grown, or even evolved, into a complex system. It is not top-down designed; that has become impossible. After nearly 20 years of building systems with special properties, only successful design patterns survive. Inspired by Darwin himself, human-made systems often follow the same design principles governing natural systems. The key design principle is evolution by natural selection.

When do we call it https://en.wikipedia.org/wiki/Continual_improvement_process and when is it natural selection?
We speculatively deploy semi-random features and see if they stick.
Our decentralised social network deployed a decade ago failed. Our wiki-style trust-less editing failed in 1999, but might finally catch on.

Our long-term direction was published in February 2006. We republish it here in full:

our social-based P2P network, TRIBLER, addresses all five grand challenges. The most difficult research challenge is the decentralization of the functionality of a P2P system across the various peers. Full decentralization eliminates the need for central elements in the system, which must be set up and maintained by some party and which may form serious bottlenecks, point of failures, or security threats. In particular, connecting to the network and validating accounts are difficult to implement without any central element. To date, no P2P file-sharing system exists which fully decentralizes all functionality efficiently and without loss of integrity. Social groups form a natural method to efficiently decentralize P2P systems, due to the fact that communication is mostly localized amongst group members.

The second challenge is to guarantee the availability of a P2P system as a whole. The operation of such a system should not depend on the availability of any particular participating peer, or of any central component, if the latter exists. Given the short availability of peers (in [14] we found less than 4% of the peers to have an uptime of over 10 hours), the availability problem is critical. Proven social incentives such as awards and social recognition could stimulate users to leave their P2P software running for longer periods, thus improving the overall availability of the network.

The third challenge is to maintain the integrity of the system and to achieve trust amongst peers. By definition, P2P systems use donated resources. However, donors cannot always be trusted, and maintaining system integrity has proven to be difficult in operational systems [7]. Data can be attacked at several levels in a P2P system, namely system information (e.g., pointers to content), meta data, and the actual content itself. This significant problem, often ignored by P2P system designers, can be solved with a social-based network, in which users actively help to clean polluted data and users can select trustworthy representatives.

The performance of a P2P system highly depends on peers donating resources. Even though the resource economy is by definition balanced (e.g., every MByte downloaded corresponds to a MByte uploaded), autonomous peers are free to decide whether to donate resources or not. Hence, providing proper incentives is vital to induce cooperation and to achieve good performance [3]. Again, social recognition can help to alleviate this problem.

The fifth challenge in P2P systems is to achieve network transparency by solving the problems caused by dynamic IP addresses, NAT boxes, and firewalls. The fundamentals of the Internet have changed due to the wide-spread use of these three technologies. Peers no longer have the freedom to send anything anywhere, without the help of another peer acting as a mediator between them. Social networks enable communicating peers to automatically select trusted mediators from the members of their social proximity, who are still online; hence, the need for fixed mediators is eliminated.

After several million downloads of Tribler and a growing user community, we are also becoming more ambitious. With the successful growth of our YouTube-like system, our ambition is to solve the problem of online trust. We believe our trust-inducing mechanism can be applied in most online social platforms and economic settings. We believe social proximity is suitable to become a cornerstone of the online world.

@synctext (Member Author) commented Nov 4, 2019

draft storyline

Permissionless innovation

Delft University of Technology has created highly disruptive, pioneering innovations within social media, the medical domain, finance, and identity.
Our first permissionless innovation is a passport-grade identity and web-of-trust ecosystem. We empower each citizen to own their own digital identity. Self-sovereign identity is a much-discussed topic in some circles, but not yet achieved in the real world. Our prototyping and relentless incremental improvement work has matured this emerging field. Within our trial we accomplished the first essential act of true self-sovereignty in the digital domain: a democratic government recognizing the public key chosen by a citizen as their digital identity and their legally binding electronic signature.
We created a permissionless open banking infrastructure. We did not ask permission from the largest banks of Europe to create this open standard and operational system. Our transparent infrastructure offers real-time payments across Europe and competitive currency conversion. Our truly open banking infrastructure enables further optimisation and automation of business logic in many parts of our economy, without the cost and delay induced by countless lawyers.
In April 2005 we created the first video streaming platform designed to spread videos with privacy and security. Social capital and integrated reputation mechanisms are key to spreading videos of demonstrations and protest. Protesters are not required to ask government permission to spread videos of protests, prevent fake news, and promote local news footage.
The medical domain is a part of our society which is highly regulated. Internet marketplaces and direct-to-patient shipping of medication may bypass some of these checks and balances. With the arrival of cheap whole-genome sequencing you can now download your own DNA profile. With the right inputs, software is now capable of detecting disease risks, without any oversight or permission from medical professionals. In close collaboration with medical ethicists we are creating machine learning algorithms which guarantee privacy, are GDPR-compliant, and can determine medical risks.
We are actively working on achieving the first software-only direct-to-patient medical trial. Scientists make software updates available for continuously running and incrementally improving detectors of disease risks. We believe patient-initiated medical trials represent a new phenomenon. By putting the patient fully in charge we are creating a permissionless ecosystem for medical research.
We believe our four permissionless innovation ecosystems are key drivers for superior innovation spaces. We are now evolving our complex systems constructed over the past 15 years into a single ecosystem with permissionless flow of information and value between leaderless organisations of unbounded size.

@synctext (Member Author)

Scientific principle behind our work on identity, trust, cooperation, and trade:

  • create a collection of autonomous agents which can scale indefinitely
  • each autonomous agent directly observes the state around it and has long-term memory
  • agents are truly autonomous and never receive instructions from others, nor tell others what to do
  • reputations are calculated only through direct observations or friend-of-friend chains
  • agents should be able to conduct any economic activity in general
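The direct-observation and friend-of-friend rule above can be sketched in a few lines; all names, weights and data below are illustrative assumptions, not Tribler's actual reputation algorithm:

```python
# Hypothetical sketch: an agent scores peers only from its own observations,
# optionally extended one hop via friends' observations (friend-of-friend).

def reputation(agent, target, observations, friends, fof_weight=0.5):
    """observations: dict mapping (observer, subject) -> score in [0, 1]."""
    direct = observations.get((agent, target))
    if direct is not None:
        return direct  # first-hand evidence always wins
    # Fall back to friends' direct observations, discounted by fof_weight.
    indirect = [observations[(f, target)]
                for f in friends.get(agent, [])
                if (f, target) in observations]
    if indirect:
        return fof_weight * sum(indirect) / len(indirect)
    return 0.0  # no evidence at all: no reputation

obs = {("alice", "carol"): 0.9, ("bob", "dave"): 0.6}
friends = {"alice": ["bob"]}
print(reputation("alice", "carol", obs, friends))  # direct: 0.9
print(reputation("alice", "dave", obs, friends))   # via bob: 0.3
```

The key property, matching the principle above, is that no score ever arrives as hearsay beyond the friend-of-friend hop.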

Our approach to science is:

  • relentless focus on solving real societal problems
  • we don't play the publish-or-perish citation game (but high-impact papers are nice)
  • share everything as an open lab
  • first-hand observation only, no reliance on stories told in other papers
  • improve our ideas until they have proven to work with real users
  • understanding the fundamentals of identity and trust requires combining theory and experiments

@synctext synctext changed the title documentation: lessons learned over past 8 years documentation: lessons learned over past 15 years Jun 17, 2020
@synctext (Member Author) commented Jul 6, 2020

For 7 years we have been developing credit mining. The goal is to automate donating resources by identifying under-seeded swarms (which need more replicas) and joining them. This is a key step towards an autonomous YouTube-like system without servers.
The first research result was Investment Strategies for Credit-Based P2P Communities in 2013, the first paper to describe the Bandwidth Investment Problem.
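The credit-mining idea of joining under-seeded swarms can be sketched as a simple ranking policy; this is an illustrative toy under assumed field names, not the algorithm from the 2013 paper:

```python
# Toy sketch of credit mining: rank swarms by how under-seeded they are
# (few seeders per leecher = few replicas where demand exists) and join
# the neediest first within a disk budget. All fields are assumptions.

def pick_swarms(swarms, budget):
    """swarms: list of dicts with 'name', 'seeders', 'leechers', 'size_mb'.
    Returns names of swarms to join, neediest first, within budget (MB)."""
    ranked = sorted(swarms, key=lambda s: s["seeders"] / max(s["leechers"], 1))
    chosen, used = [], 0
    for s in ranked:
        if used + s["size_mb"] <= budget:
            chosen.append(s["name"])
            used += s["size_mb"]
    return chosen

swarms = [
    {"name": "a", "seeders": 50, "leechers": 5, "size_mb": 700},
    {"name": "b", "seeders": 1, "leechers": 40, "size_mb": 700},
    {"name": "c", "seeders": 2, "leechers": 10, "size_mb": 700},
]
print(pick_swarms(swarms, budget=1500))  # ['b', 'c']
```

The real Bandwidth Investment Problem is harder (future demand and returns are uncertain), which is part of why the deployed feature underperformed.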

Credit mining is an example of our pathfinding methodology. During our 20 years of deploying self-organising systems we always deploy a conservative and partial system to learn and build experience. Deployment and feedback from the real world has proven to be the most effective methodology, versus getting it right the first time. Tribler is complex, involving interaction among and feedback between many parts. Our process consists of planning for the unexpected, avoiding partial failure, and steady step-wise improvement. It evolved; it was not designed. We first measured NAT hardware, for instance, before devising UDP puncturing techniques. The richest man on earth is also a fan, #GradatimFerociter.

After a master's thesis on credit mining, this feature still did not meet expectations. It was hard to use within the GUI and ineffective. After 7 years of effort the code was deleted. More research is needed. The critical missing element is a fully functional popularity community. That is known to be a hard problem; it is part of building a Google-like search engine.

We don't do many formally scheduled meetings or work breakdown structures. Formal meetings are not appreciated by lab members and are minimised. Alignment is done through coffee and Slack. ToDo: explain the 1 person, 1 project, 1 GitHub issue method (1=1=1 method).

@synctext synctext modified the milestones: V9.9: Shadow Internet, Backlog Jul 14, 2020
@ichorid ichorid added type: memo Stuff that can't be solved component: documentation labels Jul 17, 2020
@ichorid ichorid changed the title documentation: lessons learned over past 15 years lessons learned over past 15 years Jul 18, 2020
@synctext (Member Author)

"Epic Sprint": after 15 years and 5 months of Tribler development we are trying a new process.

"Epic Sprint" is a new agile work methodology to cope with our growing development team and expanding code base. Having weekly meetings about all the various development tasks is boring for most people; better to have smaller team meets. How do we divide our overall work? With three developers (@xoriole, @kozlovsky, @drew2a) we are turning our Distributed Google dream into real running code within 6 months, hopefully. This means getting big things done in a small unit, following Bezos' rule of teams that can be fed with two pizzas. (Given @drew2a's experience, let's make him responsible for the agile process.) After these 6 months we can do another epic sprint, turning an open scientific problem into a solution which is verified by observation and experience. Working in a small team increases the speed of development and peer review of protocol designs, compared to our prior lone-developer approach. In January 2021 we will evaluate and improve.

@xoriole (Contributor) commented Sep 15, 2020

Tribler Startup Model - Sandip's thoughts

Tribler should evolve from an academic project into software for millions of users. For that, we, the Tribler team, should consider it a startup with scarce resources and relentlessly focus on user growth. The month-over-month growth rate is a good metric to understand how we're doing over time. A million users will always be a distant dream if there is no obsession with growth.

Tribler's unique value proposition is its anonymity, which kind of works, but it still needs to provide its users the same level of content quality, performance and usability as its centralized counterparts. Therefore, the entire focus of the dev team for the coming months (or years) should be to develop features that actually solve the pain points or problems of the users. Only then can we reach a broader mass beyond a niche of privacy-aware users.

To achieve that, a few points that we should always consider in our development sprint:

  1. Acquisition: Is this feature or fix going to attract new users?
  2. Churn: Is this feature or fix going to reduce user churn (leaving Tribler)?
  3. Retention: Is this feature or fix going to improve user experience and keep providing value to the users? Improve content or performance?

Therefore, for any feature, we should consider the concept of a minimum viable product/feature. We test the feature first; if it is contributing to any of the above three growth measures, we should double down and put more resources into developing it further. Otherwise, discontinue swiftly and move to the next feature cycle. Measurement of the appropriate growth or usability metrics becomes key here. It is also the responsibility of the feature developer to build the mechanism to measure the metrics that indicate success.
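The month-over-month growth metric mentioned earlier is trivial to compute once monthly user counts are collected; a minimal sketch with made-up numbers:

```python
# Minimal sketch of the month-over-month growth metric.
# The monthly active user counts below are illustrative, not real data.

def mom_growth(users_by_month):
    """Return month-over-month growth rates as fractions, one per transition."""
    return [(curr - prev) / prev
            for prev, curr in zip(users_by_month, users_by_month[1:])]

mau = [10_000, 11_000, 12_650]  # hypothetical monthly active users
print([round(g, 3) for g in mom_growth(mau)])  # [0.1, 0.15]
```

A sustained positive rate here is the "obsession for growth" signal; a feature that moves none of the three measures above will not move this number either.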

Next, having a fixed release cycle is important. Great teams deliver on the promise they make to their users. The release could be monthly, quarterly, half-yearly or annual, but defining an interval and following it faithfully builds the users' trust in the dev team. Besides, predictability helps users decide when to upgrade and to what version.

...

@synctext (Member Author) commented Oct 2, 2020

Nobel Prize laureate in economics Paul Romer's research links together permissionless innovation, Big Tech monopolies, and the health of The Commons. First is the classical 1989 model for innovation and knowledge as a nonrival good. The outcome of this simple model: the equilibrium is one with monopolistic competition.


This equation captures two substantive assumptions and two functional form assumptions. The first substantive assumption is that devoting more human capital to research leads to a higher rate of production of new designs. The second is that the higher is the total stock of designs and knowledge, the higher is the productivity of an engineer working in the research sector
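The equation the quote explains is, judging from the two substantive assumptions it enumerates, Romer's knowledge production function; the reconstruction below is inferred from the surrounding text, not copied from the lost equation image:

```latex
% Romer's knowledge production function (reconstructed from the quoted text):
% the rate of new designs grows with research human capital H_A (the first
% substantive assumption) and with the existing stock of designs A (the second).
\[
  \dot{A} = \delta \, H_A \, A
\]
% \delta is a productivity parameter; H_A is human capital devoted to
% research; A is the total stock of designs and knowledge.
```

Because $A$ enters the right-hand side, knowledge is nonrival: every researcher builds on the full existing stock, which is what drives the monopolistic-competition equilibrium mentioned above.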

Recently he promoted the idea of Big Tech and the Commons in The New York Times:

It is the job of government to prevent a tragedy of the commons. That includes the commons of shared values and norms on which democracy depends. The dominant digital platform companies, including Facebook and Google, make their profits using business models that erode this commons. They have created a haven for dangerous misinformation and hate speech that has undermined trust in democratic institutions.

Conclusion: in 2005 we published the need to reward seeders for their efforts and started incrementally improving our deployed mechanism. We did not understand how fundamental and difficult our task was. We were quite naive. As of 2020, people see that due to the pandemic we need a strong government, a rich Commons, and additional digital regulation.

A mechanism to address the Tragedy of the Commons, such as indirect reciprocity or network reciprocity, would also enable democratic institutions on a global scale. Once we solve the problem of strong digital identities and secure voting, it is possible to create decision-making processes that democratically control the flow of any amount of money by a community of unbounded size. The founding of the "Global Democratic Commons" might actually be possible in the coming decades.

@synctext (Member Author) commented Apr 11, 2021

Ideas from the past and old insights have been poorly documented. The lab often re-discovers them without knowing how much "those ancients" already knew. Tribler is getting old. Really old. The idea behind Dispersy, IPv8, Allchannels and today's channels 2.0 is 6,487 days old: conceived 17 years, 9 months, 3 days ago.

The project was called THE GOD FILE. It was a strange idea: package .torrent files inside a torrent. It would scale to millions of users. How could you make changes? Well, yes, difficult. That feature is now done, after 17 years. Crowdsourcing is probably taking 18 years since this first operational prototype:

Understanding incentives and freeriding: Kazaa measurements from the 1st to the 3rd of April 2003. Download the original measurement capture from 2003.

Our first user retention measurement in 2003, seeding duration in Bittorrent. Original captured data sample from December 2003 and beyond:

847192	24.114.x.x	6881	2003_12_23__19_20	2004_01_02__14_40
810297	24.43.x.x	6882	2003_12_27__07_57	2004_01_05__17_02
773323	82.66.x.x	6882	2004_01_03__07_52	2004_01_12__06_41

The displayed capture shows 9,364 users. Each of these users is ranked by their continued usage of Bittorrent in seeding mode. The most loyal user is displayed on the left, using Bittorrent continuously for a few weeks.
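Assuming the last two columns of the sample rows above are first-seen and last-seen timestamps, and the first column the seeding duration in seconds (an interpretation of mine, not documented here), the duration can be recomputed from the timestamp format shown:

```python
# Sketch of deriving seeding duration from the capture format shown above.
# The timestamp layout (YYYY_MM_DD__HH_MM) is taken from the sample lines;
# reading the first column as seconds of seeding is my assumption.

from datetime import datetime

def seeding_seconds(first_seen, last_seen, fmt="%Y_%m_%d__%H_%M"):
    start = datetime.strptime(first_seen, fmt)
    end = datetime.strptime(last_seen, fmt)
    return int((end - start).total_seconds())

print(seeding_seconds("2003_12_23__19_20", "2004_01_02__14_40"))  # 847200
```

The recomputed 847200 is within seconds of the recorded 847192 for that row, consistent with the minute-resolution timestamps shown.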

Operational Merkle hashes inside Bittorrent, 29 May 2006. After this work by The Ancients it lay idle for 10 years.

Lesson: start documenting these high-level lessons, either here or in the docs. A single comment in one of 6000 GitHub issues and a graph in one of our 100+ scientific publications might be forgotten if it can't be found with keyword search. Somebody might want to reproduce your work 17 years, 9 months, 3 days later. Avoid nostalgia; nobody cares.

@synctext (Member Author) commented Apr 25, 2021

Gossip protocols are special. Evolution, optimisations, generational improvements and tuning are required; however, simplicity must be maintained. Experiments have shown that people have a systematic bias towards adding complexity: https://www.nature.com/articles/s41586-021-03380-y
Subtraction needs to be trained. Simplicity needs to be supported and pushed by management.

@qstokkink (Contributor) commented Jul 14, 2021

@synctext (Member Author) commented Dec 16, 2021

Creating simple systems is surprisingly difficult.

More on the culture of engineering versus self-assembly. IPv8 gossip-based communities are not based on the typical engineering methodology of piece-by-piece design; instead, they are built using evolution and emergence. We have had primitive code since 8 July 2003 (see above, "THE GOD FILE"). The concepts of the Tribler Lab have evolved for over 18 years.

We need to make new engineers in the team more aware of this: there is no clearly defined blueprint that shows the final structure of Tribler. We (/me) have failed to document all evolutionary steps and lessons of the past. We need to learn collectively, but we don't have any formally defined support process for this collective intelligence. Therefore our key knowledge exchange happens at the ☕ making 🤖; at the next coffee-machine meetup a volunteer will be appointed to take minutes. ✍️

@synctext (Member Author) commented Apr 14, 2022

DAO engineering with a "one-size-fits-all" model is wrong.

Policies or approaches written in immutable code and not tailored to individual needs are probably wrong. Instead of a "one DAO to rule the investment world" approach, we need a collection of narrow-purpose DAOs in a composable architecture. Each DAO is a fully autonomous system with a stable API, a dedicated purpose, care around breaking changes, and a conservative governance model. Governance problems are greatly reduced when the purpose of a DAO is stable, the interface is stable, and only maintenance-mode decisions are required (there is still a risk of repeating the "Bitcoin civil war"). {Credits: brainstormed "swarms-of-DAOs" on 14 March 2022 in Amsterdam.}

Permissionless innovation within a zero-trust DAO stack therefore follows the UNIX philosophy (like the cloud above). It states that everybody should get along with others: be an efficient specialist, not ineffective at everything. Functional decomposition of a composable DAO architecture yields: identity, trust, data, money, markets, and AI. We aim to build a DAO for each of them.
Over-engineering warning 🛑. Let's first make a single DAO work, deeply integrated with a single application: Tribler. When that is successful we can continue our engineering dreams of tech utopia. We don't see much freeriding, sybil attacks or pollution. Let's volunteer somebody to build the first circular Bitcoin economy (inspired by our Robotic Music Industry): earn Bitcoin by offering data storage, earn Bitcoin by offering encrypted proxy services, automatically invest in a VPS, and get priority downloads by spending Bitcoin. Next audacious step: Bitcoin 2. The Bitcoin 2 DAO boosts the transaction rate to 1 Mtps by adding dynamic deterministic periodic settlement on Bitcoin 1.

@devos50 (Contributor) commented Mar 26, 2023

Now that I have left the Tribler lab, I will list below some of the insights, suggestions and take-away messages from my time in the lab. Note that I left quite a few research ideas in our private GitHub repository to assign to future BSc/MSc or PhD students. Therefore, the points below are a bit more high-level.

Tribler

  • The anonymous downloading engine is currently Tribler's main selling point.
    • In my opinion, this is really an outstanding feature compared to other torrent downloading engines. Content discovery in Tribler is not solved yet, and users are still very likely to obtain their torrent files from another website. One of the reasons why content discovery is so difficult is that it requires both a solid back-end for search and an acceptable user interface. While the team is good at designing back-ends, we lack specialized experience on the user experience side.
  • The public tends to agree with the vision of Tribler but not so much with its usability and implementation.
    • This insight is mostly based on the comments left on articles when Tribler got featured in news outlets, see for example here and here.
  • The Tribler development team currently is too understaffed to implement and maintain new functionality at a sustainable rate.
    • The latest release of Tribler (September 20, 2022) was half a year before this post.
    • There is a tendency to make a release perfect while, at the same time, it is valuable to release frequently. Every release generates some media attention (various media outlets pick up these releases - see, for example, here).
    • I would aim for quick and frequent releases, something we also outlined in our recent paper describing our deployment-first methodology.
  • End-to-end Tribler development requires developers to have expertise in many aspects related to software engineering, maintenance, release and deployment.
    • Even maintaining the TU Delft in-house servers for experiments/testing seems to be almost a full-time job that requires knowledge in security and server management.
  • Tribler's stability has improved significantly over the past few years.
    • This is mostly thanks to the great engineering efforts and insights of our developers. For example, the migration to Twisted (and later asyncio) and Python 3.
    • The switch from Dispersy to IPv8 also contributed to a more stable and developer-friendly experience, especially when designing overlay networks.
    • We have also pruned many unused features, such as the RSS feed and mugshots.
    • VLC integration has always been a point of concern. I would advise revising VLC integration later as I believe it can be a key feature of Tribler. It's just that QT support for video playback is whacky.
    • At the same time, some of these major refactoring efforts affect the entire Tribler code base, introducing new bugs or stalling the work on new features.
      • There needs to be a careful trade-off between refactoring existing code and working on new features. In my opinion, Tribler is ready to integrate new features in the coming months (perhaps even in the coming years).
      • If it ain't broke, don't fix it.
  • Rewriting Tribler in any other language than Python at this point is a futile attempt and wasted effort.
    • We have simply invested too much time in building our infrastructure and code base around Python.
    • This is also in response to the many students I met that suggest rewriting Tribler in a "decent" or "mature" language (read: a language they are currently familiar with). The "Not Invented Here syndrome" is real :)
    • I agree, though, that Python is far from perfect for building and deploying large-scale distributed systems.
    • There is a large amount of knowledge in the current system - bugs will be re-introduced when starting to write Tribler from a clean slate.
      • I found this out the hard way when rewriting the user interface in QT during my MSc thesis.
  • Ultimately, Tribler can become an open infrastructure for the deployment and evaluation of decentralized and lightweight algorithms.
    • One problem is that there is little staff to support such deployments.

Tribler Dev Process

  • We should be extremely careful that our resources are not spread too thin.
    • The world, and consequently the Tribler vision, changes quickly. The development team should be agile.
    • The Tribler development process tends to jump between different topics very fast. Tribler has many different components that all need to function in a deployment setting.
      • My suggestion: relentlessly focus on one feature, polish it to an acceptable quality, and then move on with the next one.
    • There are, unfortunately, some components of Tribler that, currently, no one has expertise in (e.g., hidden seeding).
  • Synergy between the research and development teams can be improved.
    • I have been in the unique position of being part of the research and development team. This was not always easy, as I had to carefully balance my time between Tribler development and my PhD research.
    • I would argue for more teamwork between the research and development teams.
      • Researchers can learn a lot from developers and the other way around.
      • Researchers should learn from developers about the challenges when shipping their software to end users. Developers should learn from the researchers about distributed systems and algorithm design.
      • Promoting knowledge transfer between these two teams also requires both teams to have something in common, which in this case is Tribler (or the superapp).
        • But: research should not fully dictate Tribler development and vice versa. There should be room for autonomous decision-making in both teams.
  • Randomness is a very powerful but sometimes overlooked design approach.
    • This should perhaps have been one of my thesis propositions :)
    • During my PhD and Postdoc, I've designed various mechanisms where decisions are made in a random fashion to detect fraud.
      • For example, in the matchmaking algorithm, orders are sent to random users. In TrustChain, blocks are gossiped amongst random users.
    • System designers sometimes (falsely) believe that a sophisticated, complex approach can outperform random decision-making.
      • A related insight is seen in the field of ML, where random forests often outperform complex DNNs at a fraction of the computational cost.
      • We have seen in Tribler that any bias in the system can be fatal to its performance (for example, our edge walking mechanism). Another post-mortem can be found here.
  • Unstructured overlays are much easier to engineer and maintain than structured overlays.
    • The above should not come as a surprise. In Tribler, we mostly rely on random overlays and design our mechanism around random interactions (e.g., the tunnel community, the popularity community and TrustChain).
      • One exception is the DHT community.
    • I recently worked with Skip Graphs, which require structure in the underlying network overlay. Their construction and maintenance algorithms are often complicated to reason about and implement.
    • Given our limited development time, I find structured overlays in general unsuitable for deployment in Tribler.
      • At the same time, structured overlays can significantly reduce communication costs by carefully connecting particular neighbours to each other.
  • Immediately aiming for a design that is "fully decentralized" is inefficient (and sometimes not even possible!).
    • The point I want to make here is that the task of a system designer is difficult: one needs to consider many aspects at once, including security, fault tolerance, robustness, availability, privacy, and efficiency, just to name a few. Designing a system that addresses all of these challenges is beyond the scope of a single research team or individual. Depending on the research contribution one wants to make, I see it as a viable strategy to leave particular system aspects to others with expertise in those areas. For example, if a proposal focuses on system scalability, I believe it's fine not to solve particular security issues and instead rely on existing solutions. This decision, of course, should always be motivated.
    • At the same time, no solution is more permanent than a temporary one. Promises to replace, for example, a central server later on, are rarely fulfilled. Designers should be upfront and honest about this.
    • A system with flaws is better than no system at all.
      • The token economy is a good example of this. We knew there were security issues, yet we got useful insights and data from it and learned how to do better.
    • The systems that we design often contain many moving parts. Devising a design without any central component is very difficult. Instead, I would argue it's often simpler to drop some security guarantees, as we did with our token economy: we knew it was not perfect but deployed it nonetheless.
  • Publicly document as much as possible.
    • So much knowledge has been lost because we didn't bother to document meeting notes or design decisions. During meetings, we would try to recollect memories from the past. This is inefficient.
    • Even when we did document things, the knowledge is spread thin across 366 GitHub issues.
      • As a side note, I very much like our practice of open science and public accountability.
    • We should be honest about our successes but, more importantly, about our mistakes.
    • Code that didn't work tends to be removed from Tribler and forgotten about.
      • For example, there are definitely lessons learned from credit mining, even though it didn't gain the traction we would have wanted.
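
The random-interaction pattern that recurs above (orders sent to random users, TrustChain blocks gossiped to random peers) can be sketched in a few lines of Python. The `Peer` class and the `fanout` parameter are illustrative assumptions, not Tribler's actual API:

```python
import random

class Peer:
    """Toy stand-in for a network peer (hypothetical, for illustration)."""
    def __init__(self, name):
        self.name = name
        self.inbox = []

    def send(self, message):
        self.inbox.append(message)

def gossip(message, peers, fanout=5):
    """Forward a message to a uniformly random subset of known peers.

    Uniform sampling keeps the overlay unbiased: no peer is structurally
    favoured, avoiding the kind of bias that hurt the edge-walking
    mechanism mentioned above.
    """
    targets = random.sample(peers, min(fanout, len(peers)))
    for peer in targets:
        peer.send(message)
    return targets
```

The whole design decision fits in one `random.sample` call, which is precisely why random mechanisms are so cheap to engineer and reason about.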

TrustChain and the Token Economy

  • TrustChain is a powerful primitive for lightweight accounting (a bit of self-promotion here 😄).
  • I consider the bandwidth token economy the biggest "failure" during my PhD journey.
    • We had a poor understanding of the dynamics in real-world systems and the things we wanted to achieve. Were we too naive here?
    • There was probably also a mismatch with the anonymous downloading mechanism.
    • There were bugs, e.g., the double accounting bug and obvious security flaws, like token minting (the "Quinten" attack).
    • A key takeaway of this experiment is that users very much cared about their token balance and started to question their download behaviour.
  • Despite the above, I found it very rewarding to deploy all components of the token economy and work on it.
    • A failed experiment is better than no experiment. We learned a lot from it. Now the main task is to avoid repeating the same mistakes. See our recent paper.
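
As a rough illustration of why TrustChain is a lightweight accounting primitive, here is a minimal sketch of a pairwise, hash-linked half-block. Field names are simplified, and signatures plus the counterparty's linked half are omitted; this is not the deployed implementation:

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash for the first block in a chain

def block_hash(block: dict) -> str:
    """Deterministic SHA-256 hash over the block's contents."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def make_half_block(public_key, sequence_number, previous_hash,
                    link_public_key, transaction):
    """One half of a pairwise TrustChain-style block.

    Each party appends its half to its own chain and links to the
    counterparty; real TrustChain blocks are signed by both parties.
    """
    return {
        "public_key": public_key,
        "sequence_number": sequence_number,
        "previous_hash": previous_hash,
        "link_public_key": link_public_key,
        "transaction": transaction,
    }

# Alice records bandwidth transactions with Bob on her own chain.
b1 = make_half_block("alice", 1, GENESIS_HASH, "bob", {"up": 100, "down": 0})
b2 = make_half_block("alice", 2, block_hash(b1), "bob", {"up": 50, "down": 0})
```

Because each agent only maintains its own hash chain and pairwise links, there is no global consensus step, which is what keeps the accounting lightweight.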

Content Organization in Tribler (Tags/Knowledge Graphs)

  • This is a topic that I’ve been working on after my PhD. Detailed information and findings can be found on this dedicated GitHub issue.
  • (also outlined in the issue) I believe that the key to improving content organization and navigation in Tribler is by building and maintaining a knowledge graph.
  • We outline in our recent paper a potential mechanism to fulfil the vision of the global brain.
    • It uses a Skip Graph data structure to replicate data amongst different peers.
    • Unfortunately, this requires a structured overlay which is difficult to implement and maintain.
  • There still is a very large gap between what we have published in articles and what is currently implemented in Tribler.
    • However, many improvements can be implemented in an incremental manner. But if we decide to implement a particular idea from a paper in Tribler, we should stick to it and not be distracted by other matters during development.
  • The most difficult aspect will be maintaining this knowledge graph using crowdsourcing solutions.
    • This should probably be achieved by autonomous agents that continuously inject new information and edit or remove invalid information.
    • Open challenge: how can we make these metadata agents accountable for their actions and punish misbehaviour?
      • Note that I did some preliminary work on this which can be found on this issue. I was unable to devise a reputation mechanism that would be effective at countering false information. Maybe MeritRank might be more helpful here.
  • Ultimately, we can start devising more advanced approaches to obtain recommendations from this knowledge graph.
    • Disclaimer: This will likely still take a while.
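
To make the knowledge-graph idea concrete, a toy in-memory triple store over torrent metadata might look as follows. The infohash, predicates, and tags are invented for illustration and do not reflect Tribler's actual schema:

```python
from collections import defaultdict

class KnowledgeGraph:
    """Minimal in-memory (subject, predicate, object) triple store.

    Illustrative only; a deployed version would need persistence,
    crowdsourced edits, and accountability for the agents editing it.
    """

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        self.triples.add(triple)
        self.by_subject[subject].add(triple)

    def objects(self, subject, predicate):
        """All objects linked to `subject` via `predicate`."""
        return {o for (s, p, o) in self.by_subject[subject] if p == predicate}

kg = KnowledgeGraph()
kg.add("infohash:abc123", "has_tag", "linux")
kg.add("infohash:abc123", "has_tag", "iso")
kg.add("infohash:abc123", "title", "Ubuntu 24.04")
```

Recommendations could then be derived by traversing shared tags between items, which is where the more advanced approaches mentioned above would come in.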

The Bumpy Road to ML Deployment in Tribler

  • There currently is no deployed, decentralized ML approach.
    • Decentralized ML is a relatively new subfield of ML. Most work so far has been conducted in a data-center setting, where hardware and network capacities are assumed to be homogeneous and there is no Byzantine behaviour.
  • There are many research opportunities where we can re-use our labs' prior work.
    • For example, accounting ML model updates with TrustChain or rewarding users for their work during collaborative training with MeritRank.
    • The focus on privacy by Tribler aligns well with approaches like Federated/Decentralized learning in which data remains on users' devices.
  • Tribler is NOT ready for the integration of DL models, for the following three reasons:
    1. Currently, no one has enough expertise to integrate an ML framework in Tribler.
      1. While I think everyone in the current development team can learn to do this, it would require a full-time investment.
    2. There currently is no clear need to integrate DL training in Tribler. More lightweight solutions, for example, linear models, might be explored and deployed as a first step.
    3. It is extremely easy to do ML incorrectly (e.g., using the wrong models or hyperparameters).
  • In summary, as the Tribler team is understaffed, maintaining an ML solution will likely be too time-consuming.
  • Approaches like Gossip Learning align very well with the architecture of Tribler.
    • They even show lower time-to-accuracy compared to training on fixed topologies (D-SGD).
  • Decentralized Learning is a great research task for MSc/PhD students, though.
    • But deploying these approaches is probably too difficult for now since our infrastructure is not ready for it.
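
The core loop of Gossip Learning is simple enough to sketch: every node trains locally and periodically averages its model with one received from a random peer. The flat-vector model representation and learning rate below are illustrative assumptions, not a production recipe:

```python
import random

def local_update(model, gradient, lr=0.1):
    """One SGD-style step on a model represented as a list of floats."""
    return [w - lr * g for w, g in zip(model, gradient)]

def merge(own_model, received_model):
    """Gossip Learning merge step: average the two models element-wise."""
    return [(a + b) / 2 for a, b in zip(own_model, received_model)]

def gossip_round(models):
    """Each node sends its start-of-round model to one random peer,
    which merges it into its own model."""
    n = len(models)
    merged = list(models)
    for sender in range(n):
        receiver = random.choice([i for i in range(n) if i != sender])
        merged[receiver] = merge(merged[receiver], models[sender])
    return merged
```

Note that this fits Tribler's architecture well: the only communication primitive needed is sending a model to a random peer, exactly the unstructured-overlay pattern Tribler already uses.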

Advice for MSc/PhD Students

  • The “no running code, no passing grade” rule is a great quality standard for MSc/PhD students.
    • It helps to distinguish the goals and expertise of the lab from that of others.
  • In my opinion, every PhD thesis produced by the lab should contain at least one chapter with results obtained from deployment (e.g., from Tribler or the Superapp).
    • Now, this opinion might be a bit more controversial. I learned a lot from deploying my algorithms, and I think it should always be the goal of any line of research to get your ideas deployed and add value to end users. Tribler has made that possible for previous students and myself.
    • At the same time, a deployment can take a very long time to complete, and there might be external factors outside the student's control that make it hard to successfully obtain and analyse results. I suggest planning a deployment as early as possible in the PhD process.
    • The only exception I see would be cum-laude potential students, where there should be more focus on presenting a particular idea/extensive evaluation in a controlled setting to obtain a scientific publication.
    • In the eyes of some reviewers/conferences, a deployment is unfortunately not adding much value to a paper.
      • The exception, perhaps, is top-quality systems conferences where deployment is considered valuable.
  • It is difficult at times to keep focus as a researcher in the Tribler lab.
    • The Tribler lab works on almost all aspects of the decentralized systems stack, from the network layer to the user interface.
    • My advice is to identify your expertise/interest in a particular sub-component early in your PhD and focus on it. For me, I focussed specifically on the application layer and designing applications with TrustChain.
      • Yet, always remain open-minded to learn about other components and to obtain new insights from other parts of the stack.
  • PhD students in the Tribler lab are faced with a publish-deploy trade-off.
    • There is pressure to publish papers in conferences/journals of sufficient quality within the period of their contract (usually 4 years), but at the same time, there is pressure from the lab to deploy our ideas in Tribler and to actively contribute to Tribler.
    • Our DICG workshop is getting more mature and can eventually act as a target venue for our publications.
    • Our deployment methodology is what makes the lab unique.
    • One solution: salami slicing; publish one paper explaining the idea and another with some deployment results.
  • Determine one or two "main" conferences/journals that align with your interests/expertise and try to get involved with them.
    • Even visiting a conference without having a presentation is a valuable experience.
    • Don't hesitate to reach out to other people that work on similar topics as you.
    • Selling ideas to a scientific community you are not familiar with is very challenging and requires a co-author that is familiar with the target community. I learned this the hard way by submitting some articles to security-oriented conferences.
  • Don't be afraid of submitting to a high-quality venue.
    • The review process is ultimately a numbers game.
      • My hypothesis: with enough dedication and effort, any paper will eventually be accepted in a good conference or journal.
  • Keep a (public) journal/research log.
    • As an MSc/PhD student, you work on many different things. I recently started to maintain a research log and found it extremely helpful in organizing my thoughts and ideas.
    • This also gives other lab members and your supervisor an idea of what you are working on.

synctext commented May 3, 2023

Lesson: Focus on your one true core? After 18 years and 1 month of Tribler we are still making the search & download core production-ready, stable, efficient, and fast. IPFS attempted from 2019 onwards to make two clients, the reference implementations of the IPFS Protocol (Go & JS), production-ready. Within the 2023 IPFS ecosystem there are 17 implementations, various libraries, and multiple networks, so they felt compelled to define more clearly what IPFS is. For Tribler, the one true reference core implementation is the specification. Different choices.

synctext commented Sep 27, 2023

Lesson: stability matters
After 18 years and 5 months of Tribler development, we still don't have a stable core.
With our recent 7.13 release we focused on stability and the core features of search & download. We now have 58 bugs reported by our volunteers. These 58 unique bugs are registered inside Sentry through our automated bug-reporting pipeline, with detailed debugging info, duplicate bundling, and automatic stripping of private info.
The "connecting to core" problem took from April to August 2023 to understand and hopefully fix. We still have issues with the GUI-Core connection. Maybe the root cause of failure is a process blocking the main thread for a few seconds; unknown.
Even though this is a lot of bugs, and especially nasty bugs, it is better than before. Many of them seem to be of the easy class; one developer can fix 5 of those per day. Two years ago we were in much worse shape, with a lot of technical debt. Our Sentry setting around 2021 was to hide and ignore all bugs that were reported by fewer than 10 volunteers.

Complexity is our enemy
Stability, overengineering and complexity are our problems. COPIED from blog
We, engineers, naturally react to hype. We get obsessed with the idea of learning something new and building complex, all-powerful solutions. No surprise, AutoGPT included a vector DB at the very beginning. BarterCast, Libswift, Dispersy, LevelDB, etc. But as time goes by, good engineers focus on what’s really important. The hype is over; now that some value needs to be delivered to the actual users, complexity becomes our enemy.
Tribler is known on The Internet :-) We are cited as a bad example: a warning that adding trust kills the lightness of old tit-for-tat. COPIED from YC forum
I’d be wary of creating something that looks a bit like Tribler, which while an interesting project seems to have demonstrated that implementing trust, reputation and privacy at the protocol level carries too much overhead to be a compelling alternative to plain old BitTorrent, for all its imperfections

@synctext

Lesson: emergence requires decentralisation and crowdsourcing requires micro-contributions
The social process of crowdsourcing failed (e.g., GigaChannels): an ambitious roadmap for perfect metadata and enrichment, a Big Tech alternative with markdown support merging Wikipedia, Google Scholar, and YouTube/TikTok. The unit of contribution was too big. It required 20 people to create 80% of the content: too centralised, not permissionless. True decentralisation and emergence require micro-contributions. Background reading: the performance of channel volunteers is superlinear, leading to superlinear returns, winner-takes-most, and centralisation

synctext commented Nov 3, 2023

Bug hunting speed

Lesson: keep track of bug inventory
Paused improving tags, switched to bug-hunting mode for the remainder of 2023. The speed of closing bugs last week: 3 bugs in 1 full-time week for 1 developer. How much time are we spending on getting experimental features stable versus fixing bugs in Tribler in general? Stop working on features and first do the boring or painful fixing chores? Software contains many defects. This is especially the case with fresh experimental university software like Tribler: our world-first decentralised trust, decentralised AI, self-organising overlays, etc. come with numerous bugs. Currently 199 unresolved bugs in Sentry, 103 marked as unresolved in the current 7.13 release. One particularly nasty bug has still not been taken seriously for a few years: huge torrents don't work in Tribler. Speculation is that we might have had this bug for 15 years, 11 months and 14 days (the dark old days of 11 threads; confirming would require extensive, useless checking).
Bugs can be nasty to fix. A pull request tries to fix 1 of the 2 underlying causes we believe might trigger the race condition around CoreConnectionError: "The connection to the Tribler Core was lost". This possible fix is only 253 extra lines of code: a {too} complex fix. It still uses a database to store which Tribler instance is primary, to avoid starting Tribler twice. It fixed multiple bugs which crashed the core. Crashing cores have taken us over a year to fix.

3 bugs fixed in 1 full-time week for 1 developer

synctext commented Jan 9, 2024

We're stuck, see thread!

  • ✔️ Huge codebase
  • ✔️ Slow unit tests
  • ✔️ Huge amount of time for 1-line fixes
  • ✔️ Huge changes feel more efficient
  • ✔️ Huge Pull Request for productivity
  • ✔️ Huge discussions on each PR
  • ✔️ Unclear project guidelines on code practices
  • ✔️ Huge discussions on minor side issues

synctext commented Jul 15, 2024

anti-competitive collectives for post-capitalism {brainstorm}

Key lessons from collectives are that they need a clear vision. This enables the gathering of social capital and the establishment of trust. Thus the above initial sketch of reasoning from "first principles". The next step is a roadmap for relentless monotonic growth. The cardinal milestone is defeating an entrenched business model. Show the world that cooperation can beat monopoly capitalism. Establish a de-facto non-profit monopoly based on openness, sharing, and kindness. Kinda challenging in today's toxic Internet 💀.

@qstokkink qstokkink removed this from the Backlog milestone Aug 23, 2024
drew2a commented Aug 26, 2024

I wrote and rewrote this post many, many times. First, I made a long list of things that could be improved, then I spent a lot of time choosing the right phrases, and then I reread the current issue and removed the points that had already been mentioned in one way or another.

In the end, I decided to keep it brief, because all my points could be addressed with the same piece of advice: if we want to transform Tribler from a scientific prototype into a product, we need to hire a manager. This is the only and main advice I will leave. Self-organization without a manager or without a clear common goal leads to what in Russian fables is described as "the swan, the pike, and the crawfish" https://allpoetry.com/Swan,-Pike-And-Crawfish.

I will also leave the thought that sole ownership of a repository, feature, or project is a form of centralization.

@synctext

Keep it simple in space:
https://www.reddit.com/r/ASTSpaceMobile/comments/p0m1yo/the_popup_array_unfolded_analyzing_an_ast_space/
A huge unfolding satellite, with the complexity kept in the on-Earth phase.
