
IETF Research Questions

Nick Doty edited this page Mar 26, 2018 · 8 revisions

Research questions of interest for the IETF101 Hackathon

Feel free to add.

  • how many participants total in IETF work?
  • how "sticky" is participation?
    • what's the attrition rate?
    • what's the distribution of length of participation?
  • who has participated longest? across the most groups?
    • is there a core across working groups?
    • how many groups does the typical participant join?
  • What does participation look like per affiliation?

  • How did participation per affiliation and affiliation category develop over time?

  • What is the relationship between mailing list participation and RFC authorship?

  • How did the usage of specific words evolve (across all lists combined, and on specific lists)?

    • How do certain topical words move between mailing lists? Are there central nodes for this?
    • What are the 'trending topics' for a given period on a given mailing list?
  • What is the gender distribution of participants?
    • Does the gender distribution of conversation differ from the gender distribution of the participants?
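
As a rough illustration of the stickiness questions above, here is a minimal sketch of how tenure and attrition could be computed from (sender, date) pairs. The records, the function names `tenure_by_sender` and `attrition_rate`, and the cutoff-based definition of attrition are all assumptions made for this example; they are not taken from the hackathon notebooks.

```python
from datetime import date

# Hypothetical toy records: (sender address, date of message).
messages = [
    ("alice@example.org", date(2010, 3, 1)),
    ("alice@example.org", date(2014, 6, 15)),
    ("bob@example.org", date(2012, 1, 10)),
    ("carol@example.org", date(2011, 5, 5)),
    ("carol@example.org", date(2012, 5, 4)),
]


def tenure_by_sender(records):
    """Return {sender: days between that sender's first and last message}."""
    first, last = {}, {}
    for sender, day in records:
        first[sender] = min(first.get(sender, day), day)
        last[sender] = max(last.get(sender, day), day)
    return {sender: (last[sender] - first[sender]).days for sender in first}


def attrition_rate(records, cutoff):
    """Share of senders active before `cutoff` who never post on or after it.

    This is one possible definition of attrition, chosen for illustration.
    """
    before = {sender for sender, day in records if day < cutoff}
    after = {sender for sender, day in records if day >= cutoff}
    if not before:
        return 0.0
    return len(before - after) / len(before)


tenures = tenure_by_sender(messages)
rate = attrition_rate(messages, date(2013, 1, 1))
```

The distribution of participation length is then just the distribution of the values in `tenures`. A sender with a single message gets a tenure of 0 days.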

Existing notebooks / work

  • word analysis

    • single word trends, multi word trends across list [DONE]
    • influential words (that move between mailing lists?) [DONE]
    • recurring words from particular senders [DONE]
    • use of words by a user on multiple lists [DONE]
    • first occurrence in a list [DONE]
  • interaction graphs

    • replies between people
    • threads and thread lengths
    • overlap between mailing lists/groups
  • entity resolution / consolidation

    • affiliation
  • basic descriptive statistics on a mailing list

  • cohort visualization [DONE]

Presentation

Slides presented at IETF101 Hackathon

Issues:

  • x-unknown unknown encoding in message Fri, 32 Jan 2008 19:05:18 +0900

Statistics:

  • Total number of emails: 1,944,019

  • Number of email addresses these were sent from: 19,204

  • Number of mailing lists: 1,016 downloaded, 955 analyzed

  • One email address sent 80,533 messages

  • Most contributors sent 1 or 2 messages (spam?)

  • Cohorts / tenure

Network Analysis

  • Clustering vs one community
  • Showing clusters
  • Centrality of top posters
  • What percentage of emails was sent by the top 1% of contributors?
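
The top-contributor question can be sketched as follows. This is a minimal example under stated assumptions: the input is a flat list of sender addresses, one entry per message, and the function name `top_share` and the toy data are hypothetical. The number of top senders is rounded up, so at least one sender is always counted.

```python
import math
from collections import Counter


def top_share(senders, fraction=0.01):
    """Return the share of messages sent by the top `fraction` of senders."""
    ranked = sorted(Counter(senders).values(), reverse=True)
    if not ranked:
        return 0.0
    # Round up so that a small population still yields at least one sender.
    k = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical toy data: one entry per message.
senders = ["a@example.org"] * 7 + ["b@example.org"] * 2 + ["c@example.org"]
share = top_share(senders)
```

With only three distinct senders, the top 1% rounds up to a single sender, so `share` here is that sender's fraction of all messages.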

Words

  • uses of IPv6, middlebox/es, catenet, DECnet on ietf@ietf
  • influential words
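
A single-word trend of this kind could be computed roughly as below. This is a sketch, not the notebook implementation: it assumes messages are already available as (year, body text) pairs, and the function name `yearly_word_counts`, the simple tokenizer, and the sample messages are all made up for illustration.

```python
import re
from collections import Counter, defaultdict


def yearly_word_counts(messages, terms):
    """Count case-insensitive occurrences of each tracked term per year.

    messages: iterable of (year, body text) pairs.
    terms: words to track. Matching is on whole tokens, so plural forms
    such as "middleboxes" must be listed separately.
    """
    wanted = {term.lower() for term in terms}
    counts = defaultdict(Counter)
    for year, body in messages:
        for token in re.findall(r"[a-z0-9]+", body.lower()):
            if token in wanted:
                counts[year][token] += 1
    return counts


# Hypothetical toy messages.
sample = [
    (1995, "IPv6 and DECnet; ipv6 again"),
    (2005, "A middlebox breaks IPv6"),
]
trend = yearly_word_counts(sample, ["IPv6", "middlebox", "catenet", "DECnet"])
```

Terms that never occur in a year simply read as zero, because `Counter` returns 0 for missing keys.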

Threads

  • Longest thread on ietf@ietf
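
One way to find the longest thread is to follow each message's `In-Reply-To` header back to a root message and count messages per root. The sketch below assumes the headers have already been extracted into (message ID, parent ID) pairs. The function name `thread_sizes` and the sample IDs are hypothetical, and a reply whose parent is missing from the archive is treated as the start of its own thread.

```python
from collections import Counter


def thread_sizes(messages):
    """Return Counter {root message ID: number of messages in that thread}.

    messages: iterable of (message_id, in_reply_to) pairs, where
    in_reply_to is None for a message that starts a thread.
    """
    parent = dict(messages)

    def root_of(message_id):
        seen = set()  # guards against malformed reply loops
        while (
            parent.get(message_id) in parent
            and message_id not in seen
        ):
            seen.add(message_id)
            message_id = parent[message_id]
        return message_id

    return Counter(root_of(message_id) for message_id in parent)


# Hypothetical toy data: (Message-ID, In-Reply-To).
sample = [
    ("m1", None),
    ("m2", "m1"),
    ("m3", "m2"),
    ("m4", None),
    ("m5", "m4"),
    ("m6", "not-in-archive"),
]
sizes = thread_sizes(sample)
longest_root, longest_size = sizes.most_common(1)[0]
```

Counting by root gives thread size rather than reply depth. Measuring the deepest reply chain would need a separate traversal.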