Support half-length bars when tracking downbeats / bars #413

Open
declension opened this issue Jan 29, 2019 · 5 comments

@declension
Contributor

declension commented Jan 29, 2019

Background

Some songs, whilst regular in tempo and metre, have an "extra" bar that is half as long (or arguably have one bar that is 50% longer).
Understandably this confuses DBNDownBeatTrackingProcessor, which I've found, if allowed, will often choose 6-time (for a 4-4 piece) to account for these bars if they happen regularly enough. This means that half the time the bars are in the wrong place, and all the time they're in the wrong meter. In contrast, if it chooses the generally correct meter (e.g. 4-4) then it will be 50% out for many important bars (start of chorus etc).

Feature request

I don't know how the processors / RNN work (still a very new area to me) but it seems if they can accommodate BPM changes (the transition_lambda I think) perhaps there is a way to accommodate arbitrary or at least half-length bars, so that the beat part of the tuples emitted read 1, 2, 3, 4, 1, 2, 1, 2, 3, 4....
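To make the requested numbering concrete, here is a minimal sketch of how bar lengths could be read back from the beat part of such tuples. The helper name `bar_lengths` is hypothetical (it is not part of madmom), and it assumes only that every bar starts with beat number 1.

```python
def bar_lengths(beat_numbers):
    """Split a sequence of beat numbers into bars and return each bar's length.

    A new bar is assumed to start whenever the beat number is 1.
    (Hypothetical helper for illustration, not a madmom function.)
    """
    lengths = []
    for number in beat_numbers:
        if number == 1 or not lengths:
            lengths.append(0)  # open a new bar
        lengths[-1] += 1       # count this beat in the current bar
    return lengths


# The numbering from the request: a 4-4 piece with one half-length bar.
beats = [1, 2, 3, 4, 1, 2, 1, 2, 3, 4]
print(bar_lengths(beats))  # [4, 2, 4]
```

A tracker that supported half-length bars would show up here as an occasional 2 among the 4s, rather than a constant 6 or a constant 4.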

I wouldn't expect much luck with truly complex time signature changes (e.g. Master of Puppets) but half-bars feels like a sweet spot perhaps?

Example songs

I found a list of some better known ones.
Also on a heavier note: One Step Closer.

@superbock
Collaborator

superbock commented Jan 29, 2019

The problem is that the DBN part of DBNDownBeatTrackingProcessor models only a single bar length at a time, i.e. it does not allow meter changes. This choice was made in order to make everything computationally efficient.

There's code around which can deal with varying meter, but it operates on the beat level instead of the frame level; please have a look at DBNBarTrackingProcessor.

Of course, this does not solve your issue, but it might be a way to deal with it. If not, you could alter the transition model of DBNDownBeatTrackingProcessor to allow transitions from the middle of a bar to its start. In order to do so, you have to alter BarTransitionModel and add transitions with an adjustable probability.

Probably adding only transitions from the middle of a bar works better than what I tried back when working on the downbeat tracking system. I experimented with a single state space modelling up to N beats per bar and added "escape" connections, i.e. allow the bar to end at any beat position. Unfortunately this led to degraded performance since most musical pieces have a fixed time signature.
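As a rough illustration of the "middle of a bar back to its start" idea, here is a toy beat-level transition matrix with one adjustable shortcut. This is only a sketch: `bar_transitions` and `shortcut_prob` are hypothetical names, and madmom's BarTransitionModel works on a much larger frame-level state space, so this is not how the library implements it.

```python
def bar_transitions(beats_per_bar=4, shortcut_prob=0.05):
    """Toy beat-level transition matrix for one bar with a mid-bar shortcut.

    Row i holds the probabilities of moving from beat i + 1 to every beat.
    Illustration only; names and structure are assumptions, not madmom API.
    """
    n = beats_per_bar
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][(i + 1) % n] = 1.0  # normal case: advance to the next beat
    if n % 2 == 0 and n > 2:
        middle = n // 2 - 1  # index of the last beat in the first half
        # shortcut: end the bar after its first half and restart at beat 1
        matrix[middle][middle + 1] = 1.0 - shortcut_prob
        matrix[middle][0] = shortcut_prob
    return matrix


for row in bar_transitions(4, 0.05):
    print(row)
```

Only one row changes, which is the appeal compared with "escape" connections at every beat position: a fixed time signature stays the overwhelmingly likely path as long as `shortcut_prob` is small.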

EDIT: fixed the confusion of DBNDownBeatTrackingProcessor and DBNBarTrackingProcessor

@declension
Contributor Author

Thanks! Those are some useful tips.

I realised I actually meant DBNDownBeatTrackingProcessor, which is the one I've been using, although I didn't / don't really understand the difference. Actually AFAICT the docs get them mixed up in one place. They seem to end up accomplishing the same thing in my mind, but I only managed to get the DownBeat one working originally.

@superbock
Collaborator

As if I knew what you were referring to, I made the same mistake; I corrected my answer above.

As mentioned above, the difference lies in the "level" of signal it is working on:

  • DBNDownBeatTrackingProcessor works on a frame level, i.e. operating with 100 fps;
  • DBNBarTrackingProcessor works on the beat level, i.e. it requires beats as additional inputs in order to compute beat-synchronous features. It thus usually operates with 50 to 200 bpm (or whatever range gets detected by the beat tracker).

In the case of DBNDownBeatTrackingProcessor, modelling two bar lengths of 3 and 4 beats with the default tempo range leads to a total of 11157 + 14876 = 26033 states. The goal is to minimise transitions between these states as much as possible, since every transition into one of these states needs to be computed. With default settings there are 14876 and 21648 transitions, respectively. Thus we decided not to allow meter changes, since this would multiply the number of transitions (and require even more memory).

In case of DBNBarTrackingProcessor, there are only 3 + 4 = 7 states, since it uses beat-synchronous features. Thus it is computationally feasible to have transitions between all these states.
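The size difference can be checked with some quick arithmetic on the figures quoted above. Both frame-level counts divide evenly by the number of beats per bar, which suggests 3719 states per beat with the default tempo range; that per-beat figure is an inference from the numbers, not something stated in this thread.

```python
# Frame-level state counts per bar length, as quoted above.
frame_level = {3: 11157, 4: 14876}

for beats, states in frame_level.items():
    # Inferred, not documented here: states per beat position.
    print(beats, "beats per bar ->", states // beats, "states per beat")

total_frame_level = sum(frame_level.values())  # 11157 + 14876
total_beat_level = sum(frame_level.keys())     # one state per beat: 3 + 4
ratio = total_frame_level // total_beat_level

print("frame level:", total_frame_level)
print("beat level:", total_beat_level)
print("ratio:", ratio)
```

With only 7 states, connecting every state to every other costs at most 49 transitions, which is why meter changes are affordable at the beat level.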

Hope this explains the difference. Having said this, it is not impossible to add "shortcut" connections from the middle of a bar back to its start, since this would only double the number of transitions required. If no tempo change is modelled, this can basically be reduced to only one additional transition, but I guess the whole system will perform worse than one with tempo changes between beats.

P.S. the mixup is due to copy paste of docstrings I guess.

declension changed the title from "Support half-length bars in DBNBarTrackingProcessor" to "Support half-length bars when tracking downbeats / bars" on Jan 30, 2019
@declension
Contributor Author

declension commented Jan 30, 2019

Hehe - synchronised typos 😆

Thanks for the explanation, I think that makes a lot more sense now.

Maybe I'll start by trying to get the DBNBarTrackingProcessor working. (An end-to-end example, i.e. .wav to beats data, would be really helpful here for newcomers. Perhaps I'll document it if I get it all working...) Then I'd just give it the half-length metres in beats_per_bar (e.g. [2, 3, 4, 6], with the 2 in particular allowing an uneven bar), without having to add shortcut transitions at all. So long as the probability can be tweaked for a good trade-off, the transitions between full bars of different metres should allow this.... maybe..

This reminds me, I had a related but more general point about higher-level feature extraction, but I'll not muddy this issue with that.

@superbock
Collaborator

I planned to add automatic beat tracking to BarTracker in case no beats are given. This should be easy to accomplish.

But back to the topic: you can combine the beat and downbeat activations of RNNDownBeatProcessor to get a combined activation function (which also includes the downbeats) and use this with DBNBeatTrackingProcessor to track the beats. Then combine the extracted beat positions with the downbeat probability in a 2D array and feed it into DBNBarTrackingProcessor (which models the half-length bars). A weighting for the bars similar to #408 would probably make sense as well.
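A rough sketch of that pipeline might look like the following. It has not been run end-to-end: summing the two activation columns, sampling the downbeat probability at the nearest frame, and the helper names `downbeat_prob_at_beats` and `track_bars` are all assumptions made for illustration, and the #408-style weighting is left out. The `beats_per_bar` default is the [2, 3, 4, 6] suggested earlier in the thread.

```python
def downbeat_prob_at_beats(beat_times, downbeat_act, fps=100):
    """Pair each beat time with the downbeat activation of its nearest frame.

    Hypothetical helper; nearest-frame sampling is an assumption.
    """
    last = len(downbeat_act) - 1
    rows = []
    for t in beat_times:
        frame = min(max(int(round(t * fps)), 0), last)  # clamp to valid frames
        rows.append([t, downbeat_act[frame]])
    return rows


def track_bars(audio_file, beats_per_bar=(2, 3, 4, 6), fps=100):
    """Untested sketch of the steps described above (requires madmom)."""
    import numpy as np
    from madmom.features.beats import DBNBeatTrackingProcessor
    from madmom.features.downbeats import (DBNBarTrackingProcessor,
                                           RNNDownBeatProcessor)

    act = RNNDownBeatProcessor()(audio_file)  # columns: beat, downbeat
    combined = act.sum(axis=1)                # assumption: sum = any beat
    beats = DBNBeatTrackingProcessor(fps=fps)(combined)
    # 2D array of (beat time, downbeat probability at that beat)
    data = np.array(downbeat_prob_at_beats(beats, act[:, 1], fps))
    return DBNBarTrackingProcessor(beats_per_bar=list(beats_per_bar))(data)
```

Calling `track_bars("song.wav")` (file name is a placeholder) should then return rows of beat time and beat number, where a short run of 1, 2 followed by another 1 would indicate a half-length bar.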
