Women Data ~ Labels to train NLP, NLU, Image Recognition AI with Support, Defense, Offense (YouTube & Reddit)
Texts and photos labeled for Support, Abuse, and Defense of Women, for building Equality AI, Natural Language Understanding, and image-recognition protections for women on social platforms.
The aim is to accurately classify defense of women as "Support," and both offense against women and defense against women as "Offense."
The dataset uses four labels: support_women, defense_women, offense_women, defense_against_women
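As a minimal sketch of how rows with these four 0/1 label columns might look (the column names come from the labels above; the texts and values here are invented examples, not rows from the actual dataset):

```python
import pandas as pd

# Hypothetical rows illustrating the four binary label columns.
rows = [
    {"text": "She deserves equal pay for equal work.",
     "support_women": 1, "defense_women": 0,
     "offense_women": 0, "defense_against_women": 0},
    {"text": "Stop attacking her for speaking up.",
     "support_women": 0, "defense_women": 1,
     "offense_women": 0, "defense_against_women": 0},
]
df = pd.DataFrame(rows)

# Count positives per label column.
print(df[["support_women", "defense_women",
          "offense_women", "defense_against_women"]].sum())
```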
Previously, little data was collected about women, and existing open-source abuse labels did not classify texts about women accurately. E.g., NLP and sentiment models built on open-source datasets labeled women's defenders and supporters as "offenders."
- Part of Anti-CyberAbuse projects and Rescue Social Tech - Women (Natural Language Processing)
Examples: We examined specific cases of abuse against women, e.g., targeting actresses, women driving, domestic abuse, women's rights, and other female-focused topics.
This is a sample, prepared by an intern from the University of Chicago for Worldie - Social Media for Good. You will need to edit it for your own purposes. This 2K+ sample is not perfect; a robust model would require tens of thousands of texts.
- Collect the Data
- Label the Data (0 or 1 in the label columns)
- Train a model using publicly available NLP packages
- Test the results with an NLU tester page (example attached) or via APIs
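The label-then-train steps above could be sketched with a standard text-classification pipeline. This is one possible approach, not the project's actual training code; the toy texts and labels below are invented stand-ins for the real labeled dataset:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

# Invented toy texts; real training uses the labeled dataset described above.
texts = [
    "we fully support women's rights",
    "stand up for women everywhere",
    "leave her alone, she did nothing wrong",
    "stop harassing women in the comments",
    "women should not be allowed to drive",
    "she attacked him first, he was only defending himself",
]
# Columns: support_women, defense_women, offense_women, defense_against_women
labels = np.array([
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
])

# TF-IDF features feeding one binary classifier per label column.
model = make_pipeline(
    TfidfVectorizer(),
    MultiOutputClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(texts, labels)

# Predict returns one 0/1 value per label column for each input text.
preds = model.predict(["equal rights for women now"])
print(preds.shape)
```

Because the four labels are not mutually exclusive, a multi-label setup (one classifier per column) fits more naturally here than a single multi-class classifier.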
Important:
- Machines learn patterns of words, so the more data you train on, the more patterns the model will recognize
- Consider the female perspective and women's well-being
- Look at context (previous texts, texts afterwards, images, situation)
- Look at symbolism
- Look at semantics (meaning)
- Test your processing results with scientific methods, in context
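For the testing step above, one scientific approach is to compare model predictions against held-out gold labels with per-label precision and recall. The gold labels and predictions below are invented for illustration; the two columns stand for any two of the dataset's label columns:

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical gold labels vs. model predictions for two label columns
# (rows are texts, columns are labels; all values invented).
y_true = np.array([[1, 0], [0, 1], [1, 0], [0, 0]])
y_pred = np.array([[1, 0], [0, 1], [0, 0], [0, 1]])

# average=None yields one precision/recall/F1 score per label column.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average=None, zero_division=0
)
print(precision, recall)
```

Reporting metrics per label (rather than one overall accuracy) shows whether, for example, defense of women is still being confused with offense.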
The data is from Reddit and YouTube.