Refine the database of organizations #170

Open
Ly0n opened this issue Sep 14, 2023 · 2 comments

Comments

Member

Ly0n commented Sep 14, 2023

The database of active open source organizations in the field of environmental sustainability is more important to many users than the list of projects themselves. Therefore, here are several thoughts on how we could expand this database:

  1. Add the topics or fields in which the organizations are active.
  2. Cluster namespaces based on legal organizations. Many larger organizations such as Google, NOAA, NASA, or LF Energy have multiple namespaces that we could link together.
  3. We could generate activity scores for organization namespaces.
  4. Using the social media accounts of the various organizations, it is easy to create a "news" feed about open source in the field of environmental sustainability. We will certainly have to blacklist some very large, generic social media accounts and create a simple way to quickly review such posts.

I think we'll come up with a lot more here in the near future.
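
A minimal sketch of how points 2 and 3 could fit together, assuming a hand-maintained mapping from namespace to legal organization and a per-namespace commit count as the activity signal. The names `NAMESPACE_TO_ORG`, `cluster_namespaces`, `activity_scores`, and all sample values are hypothetical and not part of the current data-mining script:

```python
from collections import defaultdict

# Hypothetical, hand-maintained mapping: namespace (lowercase) -> legal organization.
NAMESPACE_TO_ORG = {
    "example-org": "Example Org",
    "example-org-labs": "Example Org",
}


def cluster_namespaces(namespaces, namespace_to_org):
    """Group namespaces under their legal organization.

    Namespaces without a mapping entry form their own single-member cluster.
    """
    clusters = defaultdict(list)
    for namespace in namespaces:
        org = namespace_to_org.get(namespace.lower(), namespace)
        clusters[org].append(namespace)
    return {org: sorted(members) for org, members in clusters.items()}


def activity_scores(clusters, commits_last_year):
    """Sum an assumed per-namespace commit count into one score per organization."""
    return {
        org: sum(commits_last_year.get(namespace, 0) for namespace in members)
        for org, members in clusters.items()
    }


# Illustrative input only; these are not real namespaces or measurements.
clusters = cluster_namespaces(
    ["example-org", "example-org-labs", "other"], NAMESPACE_TO_ORG
)
scores = activity_scores(
    clusters, {"example-org": 10, "example-org-labs": 5, "other": 3}
)
```

Summing commits is only one possible activity signal; any other per-namespace metric could be aggregated the same way.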

@Ly0n Ly0n assigned Ly0n and unassigned Ly0n Sep 14, 2023

jmertic commented Sep 14, 2023

We can probably do this easily from the Crunchbase data of projects; we capture the primary organization and then look up all the relevant organizational data in Crunchbase from there.

Member Author

Ly0n commented Nov 5, 2023

The organization data has been updated and can be found here as a spreadsheet:
https://docs.getgrist.com/gSscJkc5Rb1R/OpenSustaintech/

Original CSV file:
https://github.com/protontypes/AwesomeCure/blob/main/csv/github_organizations.csv

I cleaned and labeled the organization names, websites, countries, and organizational forms of about 200 new organizations.
Possible next steps could be:

  1. Map project topics to organizations so that we can group organizations by topic.
  2. Improve the "activity" value of organizations. The current data-mining script does not update it correctly.
  3. Map organizations based on the domains of their URLs.
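
Steps 1 and 3 could be sketched roughly as follows, using only the Python standard library. The function names and the input shapes (a dict of organization to website, and pairs of organization and project topics) are assumptions for illustration; they are not the actual columns of the CSV file:

```python
from collections import defaultdict
from urllib.parse import urlparse


def website_domain(url):
    """Return the lowercased host of a URL without a leading 'www.'."""
    if "://" not in url:
        url = "https://" + url  # assume bare hosts such as "example.org"
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def group_by_domain(org_websites):
    """Group organization names that share the same website host."""
    groups = defaultdict(list)
    for org, url in org_websites.items():
        domain = website_domain(url)
        if domain:
            groups[domain].append(org)
    return {domain: sorted(orgs) for domain, orgs in groups.items()}


def topics_by_organization(projects):
    """Collect the topics of all projects per organization.

    `projects` is an iterable of (organization, topics) pairs.
    """
    topics = defaultdict(set)
    for org, project_topics in projects:
        topics[org].update(topic.lower() for topic in project_topics)
    return {org: sorted(found) for org, found in topics.items()}
```

This compares full host names, so `labs.example.org` and `example.org` would land in different groups. Reducing hosts to their registrable domain would need a public suffix list, which is left out here.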

@Ly0n Ly0n added the good first issue Good for newcomers label Nov 17, 2023