A possible strategy to create a simple language classifier that will rate social media posts as homophobic or not.
Note: I am currently on step 6 and things are looking up. I’ll be posting my code to my GitHub soon.
Warning!
These are not nice posts you'll be working with. You will run across foul language and hatred, so proceed with these datasets only with caution. BUT this process might also work for other data sets, like marketing posts or sports posts.
1. Make your own tiny, crummy data set by looking up horrific social media threads that contain both homophobic and non-homophobic language.
I believe iteration is important. You will learn so many things by piecing this together, and keeping your first data set simple will make you muuuch more comfortable moving to the next step.
2. Split these into training and validation data sets labeled as homophobic or non-homophobic.
You will likely have to label these yourself, and it's a bit daunting, but do it anyway. This is for practice; just get your first iteration moving.
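Here is a minimal sketch of one way to do that split in plain Python, assuming your hand-labeled posts are already in memory as (text, label) pairs. The function name `stratified_split`, the 20% validation share, and the placeholder rows are my own choices for illustration. It splits each label separately so both sets keep roughly the same mix of homophobic and non-homophobic posts.

```python
import random
from collections import defaultdict


def stratified_split(rows, valid_frac=0.2, seed=42):
    """Split (text, label) pairs into train and validation lists per label."""
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, valid = [], []
    for items in by_label.values():
        rng.shuffle(items)
        # Hold out at least one example per label when the label has 2+ rows.
        n_valid = max(1, round(len(items) * valid_frac)) if len(items) > 1 else 0
        valid.extend(items[:n_valid])
        train.extend(items[n_valid:])
    rng.shuffle(train)
    rng.shuffle(valid)
    return train, valid


# Placeholder rows only; real posts would come from your hand-labeled file.
rows = [
    (f"post {i}", "homophobic" if i % 4 == 0 else "non-homophobic")
    for i in range(20)
]
train, valid = stratified_split(rows)
print(len(train), len(valid))
```

With a data set this tiny the validation numbers will be noisy, which is fine for a first iteration.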
3. Load a Wikipedia-trained model into your Jupyter Notebook and use the Fast.AI library to train it on your dataset.
Essentially, you are taking a model that can predict the next word in Wikipedia text and training it to predict the next word in the social media threads you’ve found. It is important to note that this will not be a good model, but it will run quickly and allow you to find errors in your setup quickly.
4. Fine-tune your model so that it is over 30% accurate at predicting the next word in a post. Save it as an encoder (last layer unfrozen).
The closer you can get to 40% accuracy, the more accurate the model will be in later steps.
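Steps 3 and 4 might look roughly like this. This is a sketch that assumes fastai v2 (`fastai.text.all`) and a pandas DataFrame with a `text` column, not necessarily the exact code or library version I used. The function name, the `lm_encoder` file name, the epoch counts, and the learning rates are placeholders.

```python
def finetune_language_model(df, text_col="text", encoder_name="lm_encoder"):
    """Fine-tune a Wikipedia-pretrained AWD_LSTM on posts and save its encoder."""
    # Imported here so this file can be loaded without fastai installed.
    from fastai.text.all import (
        AWD_LSTM,
        Perplexity,
        TextDataLoaders,
        accuracy,
        language_model_learner,
    )

    # is_lm=True builds next-word-prediction batches; labels are not used.
    dls_lm = TextDataLoaders.from_df(
        df, text_col=text_col, is_lm=True, valid_pct=0.1
    )
    learn = language_model_learner(
        dls_lm, AWD_LSTM, drop_mult=0.3, metrics=[accuracy, Perplexity()]
    )

    # The learner starts with only its last layer group trainable.
    learn.fit_one_cycle(1, 2e-2)

    # Optional extra pass with every layer unfrozen (placeholder settings).
    learn.unfreeze()
    learn.fit_one_cycle(3, 2e-3)

    # Saves the encoder weights under the learner's models/ directory.
    learn.save_encoder(encoder_name)
    return learn, dls_lm
```

The `accuracy` metric reported during training is the next-word accuracy that the 30% and 40% targets refer to.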
5. Fine-tune this same model to be able to classify each text as homophobic or non-homophobic.
See if you can get it to 75% accurate, but don't worry about this. You've now iterated and are familiar with the process for making a model worth fine-tuning well.
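A sketch of the classification step, again assuming fastai v2 and a DataFrame with `text` and `label` columns (both column names are assumptions). It reuses the vocabulary and the saved encoder from the language model so the classifier starts from what was already learned. The gradual unfreezing schedule and learning rates are placeholders borrowed from common fastai practice, not tuned values.

```python
def train_classifier(df, lm_vocab, encoder_name="lm_encoder",
                     text_col="text", label_col="label"):
    """Train a homophobic / non-homophobic classifier on top of a saved encoder."""
    # Imported here so this file can be loaded without fastai installed.
    from fastai.text.all import (
        AWD_LSTM,
        TextDataLoaders,
        accuracy,
        text_classifier_learner,
    )

    # text_vocab must match the language model's vocab for the encoder to fit.
    dls_clas = TextDataLoaders.from_df(
        df, text_col=text_col, label_col=label_col,
        valid_pct=0.2, text_vocab=lm_vocab,
    )
    learn = text_classifier_learner(
        dls_clas, AWD_LSTM, drop_mult=0.5, metrics=accuracy
    )
    learn.load_encoder(encoder_name)

    # Train the new classifier head first, then unfreeze gradually.
    learn.fit_one_cycle(1, 2e-2)
    learn.freeze_to(-2)
    learn.fit_one_cycle(1, slice(1e-2 / (2.6 ** 4), 1e-2))
    learn.unfreeze()
    learn.fit_one_cycle(2, slice(1e-3 / (2.6 ** 4), 1e-3))
    return learn
```

If you pass in the data loaders object returned by a language-model run, `lm_vocab` would be its `.vocab` attribute.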
6. Redo steps 3-5, training your model on the Kaggle dataset that uses flagged Wikipedia posts.
7. Manually go through the entries marked True in the "identity_hate" column, label which ones are homophobic, and rerun step 5 with your saved encoder (model).
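To cut down the manual work, you can first pull out only the flagged rows. This sketch uses Python's built-in `csv` module and assumes the file has `id`, `comment_text`, and `identity_hate` columns, with the flag stored as `1` or `True`. Check those names and values against your copy of the data. The in-memory sample is placeholder text, not real data.

```python
import csv
import io


def flagged_rows(csv_file, column="identity_hate"):
    """Yield rows whose flag column is set, ready for manual relabeling."""
    truthy = {"1", "true"}
    for row in csv.DictReader(csv_file):
        if (row.get(column) or "").strip().lower() in truthy:
            yield row


# Tiny in-memory stand-in for the Kaggle CSV (placeholder text only).
sample = io.StringIO(
    "id,comment_text,identity_hate\n"
    "a1,placeholder comment one,0\n"
    "a2,placeholder comment two,1\n"
    "a3,placeholder comment three,1\n"
)
to_label = [(row["id"], row["comment_text"]) for row in flagged_rows(sample)]
print(to_label)
```

For the real file, you would pass `open(path, newline="", encoding="utf-8")` instead of the `StringIO` sample.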
8. Run steps 6 & 7 on this Twitter dataset from Kaggle.
9. Use the Wayback Machine (Archive.org) to create a much more thorough dataset of homophobic and non-homophobic posts from social media threads that have been removed from the internet because of their unacceptable content.
10. Use this model to scour massive numbers of social media posts and create a timeline and map of homophobia flare-ups in social media.
11. Find ways to help experts use this system to intervene, educate, and de-escalate social media as these flare-ups happen, in hopes of fostering more tolerance and understanding instead of divisiveness and hatred.
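The timeline idea could start as something very simple: count how many posts the model flags per day and look for days that cross a threshold. This is only a sketch of that bookkeeping. The function names, the `(timestamp, is_flagged)` input shape, and the threshold are all hypothetical, and the predictions here are made-up placeholders rather than model output.

```python
from collections import Counter
from datetime import date, datetime


def daily_flag_counts(predictions):
    """Count flagged posts per calendar day from (iso_timestamp, is_flagged) pairs."""
    counts = Counter()
    for timestamp, is_flagged in predictions:
        if is_flagged:
            counts[datetime.fromisoformat(timestamp).date()] += 1
    return dict(sorted(counts.items()))


def flare_ups(counts, threshold):
    """Return the days whose flagged-post count reaches the threshold."""
    return [day for day, n in counts.items() if n >= threshold]


# Placeholder predictions; real ones would come from the trained classifier.
predictions = [
    ("2020-01-01T09:00:00", True),
    ("2020-01-01T10:30:00", False),
    ("2020-01-02T08:15:00", True),
    ("2020-01-02T12:00:00", True),
    ("2020-01-02T19:45:00", True),
]
counts = daily_flag_counts(predictions)
print(counts)
print(flare_ups(counts, threshold=3))
```

A map would need location data attached to each post, which this sketch leaves out.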