definition of favorites, listens and interest in track features #14

Open
yiyichen opened this issue Nov 28, 2017 · 2 comments

@yiyichen

My team is conducting an academic research project using your dataset, and we were wondering if you could help us clarify what each of the following three columns means:

  • track_favorites
  • track_listens
  • track_interest

Specifically, we would like to understand how these columns are generated and whether they are a good measure of popularity, or whether other columns should be used instead.

Thank you!

@mdeff
Owner

mdeff commented Nov 29, 2017

All this data comes from the Free Music Archive API. The documentation is unfortunately rather scarce... (I am preparing a table to explain each field in greater detail.)

  • track_favorites is the number of users who starred the track. If you look at the screenshot below, a logged in user can star a track by clicking on the star icon.
  • track_listens is the number of users who listened to the track. Anybody can increase the count by playing the track online with the play icon.
  • I don't know exactly what track_interest is. I guess it's a measure of how popular a track is that they use for recommendation. I asked the administrators but unfortunately did not get an answer. Let me know if you can figure it out!
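One way to probe what track_interest measures is to check how closely it ranks tracks in the same order as track_favorites and track_listens. The following is a minimal sketch, not the dataset's own tooling: it computes a Spearman rank correlation in plain Python. The sample rows are made up for illustration, and the helper names (ranks, pearson, spearman) are hypothetical; in practice the three columns would be read from the dataset's track metadata.

```python
# Hypothetical sample rows: the values below are invented for illustration
# and do not come from the dataset.
tracks = [
    {"track_favorites": 2, "track_listens": 1293, "track_interest": 4656},
    {"track_favorites": 1, "track_listens": 514, "track_interest": 1470},
    {"track_favorites": 6, "track_listens": 1151, "track_interest": 1933},
    {"track_favorites": 178, "track_listens": 50135, "track_interest": 54881},
    {"track_favorites": 0, "track_listens": 361, "track_interest": 978},
]


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if std_x == 0 or std_y == 0:
        return float("nan")
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    interest = [t["track_interest"] for t in tracks]
    for column in ("track_favorites", "track_listens"):
        values = [t[column] for t in tracks]
        print(column, "vs track_interest:", round(spearman(values, interest), 3))
```

A rank correlation close to 1 on the real data would suggest that track_interest orders tracks much like the other two counts; it would not reveal how the value is actually computed.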

I'd say the first two are solid measures of popularity. The third might be, but we should first establish what it is. You might also consider:

  • track_comments, the number of comments users wrote about a song.
  • favorites, listens and comments at the album and artist levels. A user may, e.g., star a whole album instead of every individual track. If you look at the histograms here, you'll see, for example, that users listen much more to whole albums than to individual tracks.
  • The number of downloads, as seen on the screenshot below. However, I did not collect this because it's not available through the API. It can be scraped from the website, though.
  • Another measure might be how often tracks appear in public mixes. These are playlists created by registered users. I did not collect this either, but it can again be scraped from the website.

Please let us know if you collect additional data and I will mention it here. :)

[Screenshot: 2017-11-29-15 43 11]

@mdeff
Owner

mdeff commented Feb 20, 2018

Todo: add an appendix to the paper that describes where each piece of information comes from, similar to the above.
