Cosine similarity: Top pca component vs she-he #7

Open
santoshbs opened this issue Jun 30, 2020 · 0 comments
santoshbs commented Jun 30, 2020

Hello Debiaswe Research Team, thank you for making the code for your paper available. It is very helpful! I am writing to ask for clarification on how to analyze gender bias in the word vectors of professions.

In your paper, you suggest using the cosine similarity between a given profession vector and the top PCA component. I am trying to replicate this in the wiki context. Unfortunately, I am getting the opposite of the expected results.

For example, when I compute the cosine similarity between the waitress vector (or the nurse vector) and the top gender principal component, I get a negative score. However, when I compute the cosine similarity between the same profession vector and the she - he vector (as you show in the example here), I get a positive score.

I am confused about why the sign flips between the PCA component and the plain she - he gender vector. Could you help me understand this?
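
For reference, here is a minimal sketch of the comparison described above. It uses small synthetic vectors rather than the actual embeddings, and none of the names (`cosine`, `top_component`, `align`, `toy`) come from the debiaswe code; they are hypothetical helpers for illustration only. It also assumes the top component is computed from pair-centered vectors such as she/he and woman/man, which may not match the exact setup. One general fact it illustrates: a principal component is only defined up to sign, so a PCA routine may return either `pc` or `-pc`, and the cosine similarity flips accordingly unless the component is oriented against a reference such as she - he.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def top_component(pairs):
    """Top principal direction of pair-centered vectors (its sign is arbitrary)."""
    rows = []
    for a, b in pairs:
        center = (a + b) / 2.0
        rows.append(a - center)
        rows.append(b - center)
    # Rows come in +/- pairs, so their mean is already zero.
    _, _, vt = np.linalg.svd(np.array(rows), full_matrices=False)
    return vt[0]

def align(direction, reference):
    """Flip `direction` if needed so it points the same way as `reference`."""
    return direction if np.dot(direction, reference) >= 0 else -direction

# Toy data (assumption): a planted "gender" axis plus small random noise.
rng = np.random.default_rng(0)
dim = 50
axis = np.zeros(dim)
axis[0] = 1.0

def toy(weight):
    """Synthetic word vector with the given weight on the planted axis."""
    return weight * axis + 0.05 * rng.standard_normal(dim)

she, he = toy(1.0), toy(-1.0)
woman, man = toy(1.0), toy(-1.0)
her, his = toy(1.0), toy(-1.0)
nurse = toy(0.6)  # stands in for a profession vector

pc = top_component([(she, he), (woman, man), (her, his)])
she_he = she - he

raw = cosine(nurse, pc)                      # sign depends on the SVD's arbitrary choice
aligned = cosine(nurse, align(pc, she_he))   # oriented to agree with she - he
direct = cosine(nurse, she_he)

print(f"raw PCA: {raw:+.3f}  aligned PCA: {aligned:+.3f}  she - he: {direct:+.3f}")
```

In this toy setup the raw and aligned scores always have the same magnitude, and only the aligned score is guaranteed to share the orientation of she - he. Whether this explains the flip seen with the real embeddings would need to be checked against the actual component returned there.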

Thank you!
sbs
