This is an NLP project that can detect misogyny (i.e. derogatory comments aimed at women) in sentences.
It includes multiple cleaning and tokenisation methods, uses a GloVe model for vectorisation, and uses a BERT model to classify statements as misogynistic or not.
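As a rough illustration of what a cleaning and tokenisation step can look like, here is a minimal sketch. It is not the notebook's actual code: the function name `clean_and_tokenise` and the specific rules (lowercasing, dropping URLs and @-mentions, keeping alphanumeric tokens) are assumptions chosen for illustration.

```python
import re

# Hypothetical cleaning rules; the notebook may use different ones.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def clean_and_tokenise(text):
    """Lowercase the text, strip URLs and @-mentions, and return word tokens."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return TOKEN_RE.findall(text)


print(clean_and_tokenise("@user Check this out: https://example.com It's GREAT!!"))
```

Lowercasing matters here because the Twitter GloVe vocabulary is lowercase, so mixed-case tokens would otherwise miss their embeddings.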
The project is built as a Kaggle notebook and needs the following input data:
Download the GloVe embeddings pre-trained on Twitter data from: https://www.kaggle.com/datasets/bertcarremans/glovetwitter27b100dtxt (file name: glove.twitter.27B.100d.txt)
Upload the sexism-annotated data CSV file to the environment: https://www.kaggle.com/datasets/pes2ug20cs532/sexism (file name: sexism_data.csv)
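Once the GloVe file is available, it can be read as plain text: each line holds a token followed by its vector components, separated by spaces. The sketch below shows one way to load it and turn a token list into a single vector by averaging. The helper names `load_glove` and `sentence_vector` are hypothetical, and averaging is an assumption made for illustration; the notebook may combine the word vectors differently.

```python
def load_glove(lines, dim=100):
    """Parse GloVe text lines into a {token: [float, ...]} dictionary.

    `lines` can be an open file or any iterable of strings.
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) <= dim:
            continue  # skip blank or truncated lines
        # Take the last `dim` fields as the vector and the rest as the token.
        token = " ".join(parts[:-dim])
        embeddings[token] = [float(x) for x in parts[-dim:]]
    return embeddings


def sentence_vector(tokens, embeddings, dim=100):
    """Average the vectors of known tokens; return zeros if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Usage with the downloaded file (path is an assumption about your setup):
# with open("glove.twitter.27B.100d.txt", encoding="utf-8") as f:
#     glove = load_glove(f, dim=100)
```

Plain lists keep the sketch dependency-free; in a notebook, converting the vectors to NumPy arrays would be the more usual choice.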
The notebook can be run in a regular Python environment.