Feature request: set directions by datapoint as vector c("north", "south-east", "west") #148

Open
peeter-t2 opened this issue Feb 5, 2020 · 11 comments

@peeter-t2

Feature request

The locations of annotations can have predictable problems. For example, when some points are too close together, it would be useful to choose more precisely the direction in which each label is repelled (e.g. in degrees or as cardinal directions). nudge_x and nudge_y work. For example, in the image below, taken from another example, I might want all of the annotations to sit outside the line, which might be easiest to do by simply giving a direction to repel towards for each of the data points, such as c(180, 245, 325, 360).

image

@slowkow
Owner

slowkow commented Feb 5, 2020

Thanks for the request. This is similar to previous discussions in #11 and #25.

These are the features that ggrepel supports right now:

# During the repulsion simulation, only permit motion along x or y axis
geom_text_repel(..., direction = "x")
geom_text_repel(..., direction = "y")

# Before running the repulsion simulation, move the labels slightly
geom_text_repel(..., nudge_x = 0.1)
geom_text_repel(..., nudge_x = 0.1, nudge_y = 0.1)
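
As a minimal, self-contained sketch of how these two existing controls can be combined (the data frame and the specific values are made up for illustration):

```r
library(ggplot2)
library(ggrepel)

# Made-up data: four points stacked closely, so their labels would collide.
df <- data.frame(
  x = c(1, 1, 1, 1),
  y = c(1.00, 1.05, 1.10, 1.15),
  label = c("alpha", "beta", "gamma", "delta")
)

p <- ggplot(df, aes(x, y, label = label)) +
  geom_point() +
  # Start every label 0.3 units to the right of its point, then let the
  # repulsion simulation move labels only along the y axis.
  geom_text_repel(direction = "y", nudge_x = 0.3, min.segment.length = 0) +
  xlim(0.5, 2)
```

Printing `p` draws the plot. Note that every label gets the same constraint here; neither argument expresses a per-label preferred direction, which is what this request is about.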

I need help to see clearly if there is a nice way to generalize this interface in order to give the user more control over how the labels move.

Could I please ask if you might have any comments or suggestions on what the ideal user interface would look like? Do any of the ideas below satisfy your needs?

Idea 1 is copied from #25. Idea 2 tries to describe what you have mentioned. Idea 3 is already available to ggrepel users (but I forgot which issue I copied this from).


Idea 1: constraints for up, right, down, left

A matrix (or data frame), with one row per label, that constrains how far ggrepel is allowed to push each label in each direction.

> direction
      up right down left
[1,] 0.0     1    0    1
[2,] 0.5     1    0    1
[3,] 0.0     1    1    1
[4,] 0.0     1    0    1
[5,] 0.0     1    1    1
[6,] 0.0     1    1    1
  • In this case, our plot has 6 labels (1, 2, 3, 4, 5, 6).
  • We want to allow the algorithm to freely push all of the labels to the right (1) and to the left (1).
  • However, labels 1, 2, and 4 should not move down at all (0).
  • All labels except 2 should not move up. Label 2 can move up, but it will resist upward displacement (0.5).
  • Labels 1 and 4 should not move up (0) or down (0) at all.

We could pass the data frame to ggrepel like this:

ggplot(...) + geom_label_repel(..., direction = direction)
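
One way to read these weights is as multipliers on whatever displacement the simulation proposes for a label. The sketch below is only an illustration of that reading: `constrain_displacement()` is a hypothetical helper, not part of ggrepel, and ggrepel does not currently accept a matrix for `direction`.

```r
# Hypothetical helper (not ggrepel code): scale a proposed displacement
# (dx, dy) for each label by the weight of the direction it points in.
constrain_displacement <- function(dx, dy, weights) {
  stopifnot(length(dx) == nrow(weights), length(dy) == nrow(weights))
  data.frame(
    dx = ifelse(dx >= 0, dx * weights[, "right"], dx * weights[, "left"]),
    dy = ifelse(dy >= 0, dy * weights[, "up"], dy * weights[, "down"])
  )
}

# The constraint matrix from the example above (scalars are recycled).
direction <- cbind(
  up    = c(0, 0.5, 0, 0, 0, 0),
  right = 1,
  down  = c(0, 0, 1, 0, 1, 1),
  left  = 1
)

# Suppose the simulation wants to push every label right by 0.2 and up by 0.4.
moved <- constrain_displacement(rep(0.2, 6), rep(0.4, 6), direction)
moved
```

Under this reading, all six labels keep their rightward motion, only label 2 moves up (by half the proposed amount), and the other labels do not move vertically.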

Idea 2: preference for angle of motion

Based on your comments, maybe there is another way to achieve a similar result.

Instead of giving a dataframe with 4 columns that constrain the 4 directions, we might instead give a numeric vector of angles (in degrees, radians, or English words like "southeast") to indicate our preference for where we want each label to be pushed.

> direction
       deg
[1,]   0.0
[2,]  90.0
[3,]   0.0
[4,] 180.0
[5,]   0.0
[6,]   0.0

In the example above:

  • labels (1, 3, 5, 6) would prefer to move to the right
  • label 2 would prefer to move up
  • label 4 would prefer to move to the left

Users might want a more expressive syntax than an angle in order to express their preferences for where the labels go.
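
To make the angle convention concrete, here is a small sketch that turns such angles into x and y offsets. It assumes degrees measured counter-clockwise from "right", which matches the example above (0 = right, 90 = up, 180 = left). `angle_to_nudge()`, its `amount` argument, and the `compass` lookup are hypothetical names used only for illustration.

```r
# Hypothetical helper: convert a preferred angle in degrees
# (counter-clockwise, 0 = right) into x and y offsets of a given size.
angle_to_nudge <- function(deg, amount = 0.1) {
  rad <- deg * pi / 180
  data.frame(nudge_x = amount * cos(rad), nudge_y = amount * sin(rad))
}

# Assumed vocabulary for English compass words, mapped to the same angles.
compass <- c(east = 0, northeast = 45, north = 90, northwest = 135,
             west = 180, southwest = 225, south = 270, southeast = 315)

# The six angles from the example above.
deg <- c(0, 90, 0, 180, 0, 0)
nudges <- angle_to_nudge(deg)
round(nudges, 3)
```

The resulting columns could be passed as per-label `nudge_x` and `nudge_y` vectors today, but that only sets where each label starts; it does not make the repulsion itself prefer that direction.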

Idea 3: add extra unlabeled data points (already available in ggrepel)

This is an idea about using an existing feature.

image

As shown in this figure, we can indirectly manipulate where labels will be pushed by adding unlabeled (red) data points. Those additional data points will push the labels away.

It might be laborious to figure out how to add the additional data points in the right places. On the other hand, this approach is highly flexible, and anyone can already make use of this idea without any new ggrepel code.
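
A minimal sketch of this approach, using made-up data: the extra points get an empty-string label, so no text is drawn for them, but they still take part in the repulsion. The line the blockers follow is hypothetical.

```r
library(ggplot2)
library(ggrepel)

# Made-up labelled points.
labelled <- data.frame(
  x = c(1, 2, 3),
  y = c(1, 2, 1.5),
  label = c("a", "b", "c")
)

# Extra "blocker" points along a hypothetical line that labels should avoid.
xs <- seq(1, 3, length.out = 20)
blockers <- data.frame(x = xs, y = 1 + 0.25 * xs, label = "")

all_points <- rbind(labelled, blockers)

p <- ggplot(all_points, aes(x, y, label = label)) +
  geom_point(data = labelled) +
  # Drawn in red here only to show where the blockers are.
  geom_point(data = blockers, colour = "red") +
  geom_text_repel(min.segment.length = 0)
```

Printing `p` draws the plot. Dropping the red `geom_point()` layer hides the blockers while keeping their effect on the labels.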

@peeter-t2
Author

peeter-t2 commented Feb 6, 2020

I would go for "Idea 2": an approximate angle of motion, specified as a vector if you want some labels to move differently, or as a single value if you don't. Specifying one value per label seems easier to understand than specifying four, and it would not be limited to horizontal or vertical motion. And I guess splitting the data points into groups and creating dummy lines can be tedious.

I guess creating dummy data points can be an option too, but this can still be difficult to control, and might be easier to do manually outside R after all. E.g. the letter 'a' is a bit too close to the line on the right plot there. Adding data points might help, but it will probably take a lot of trial and error.

Thanks a lot for being so quick and thorough to respond!

@r2evans

r2evans commented Aug 13, 2020

Up front, three thoughts for this discussion:

  • thanks for ggrepel!
  • for idea 2, I'd suggest that an NA direction means "unconstrained";
  • is there a reason that nudge_x, nudge_y, and direction cannot be dynamic within aes? It doesn't work presently (ignoring unknown aesthetics: direction), but that could go a long way towards satisfying some per-point directionality constraints.

(I thought my issue/FR is mostly related to this issue, so I added it as a comment. If you would prefer this as a break-out issue, I can move it.)


I have a similar scenario, where ultimately I was hoping to nudge points in a direction suggested by another geom: in this case, using the red-line as an axis of sorts, push the labels outward along that axis. In this sample plot, the top "Y" and bottom "I" (among others) would be pushed further outwards left/right.

image

This is programmatic, so the orientation (north/south here, this is a coord_map) is not easily adjusted, and it is just as often diagonal.

My first thought was to repel the labels away from some other geom (i.e. the red line), but that won't be easy. (And it is not always two-dots-wide, sometimes more.)

My second thought was similar to idea 1 above, where in this case I'd effectively say left=1, right=1, up=0, down=0 for all points. If this is the thought, then I'd think making it an aesthetic (well, four) would be better than a static one, so that it can take advantage of the data provided to ggplot2.

Idea 2 (direction of nudging for each point) isn't quite right here, as it can be in either direction based on the orientation/side. And determining which side it should be on a priori kind-of breaks the intent of ggrepel, a package that does that for me (with sane defaults/logic). (I'm sure we could abuse complex numbers to garner an axis of preference, in a way, but ... too complex, I believe.)

I tried idea 3, adding ... a lot more otherwise-invisible points along the red line, and was not able to get much lateral offset difference.

@slowkow
Owner

slowkow commented Aug 14, 2020

@r2evans Thanks for the comment and example! I think there is room for improvement in ggrepel, and I'd be happy to review pull requests that make it better.

@r2evans

r2evans commented Aug 14, 2020

Nothing critical intended in my comment, I understand it's not a trivial feature to implement. Thanks!

@slowkow
Owner

slowkow commented Jan 6, 2021

@aphalo Could I ask if you might be interested to share any thoughts on Ideas 1, 2, 3 above? I think you mentioned that this is a new feature you might be interested in.

@aphalo
Contributor

aphalo commented Jan 7, 2021

@slowkow Sorry, I had not read this issue earlier. Broadly, I can think of three different approaches: a) more sophisticated nudging, b) a directional repulsion force, and c) defining a region that can have a more complex shape than the rectangle delimited by ylim and xlim. For b), your idea 2 seems best to me, as it needs only a single vector and allows the force to be set locally perpendicular to any curve or shape edge. Still, we would need some way to specify when we allow repulsion in opposite directions, say 90 and 270 degrees, versus only one, say 90 degrees. A new idea would be to allow a range of angles, as this would make finding space for the labels easier.

There is, I think, an important caveat: if we end up manually setting a nudge or force angle for each individual label, repulsion is no longer automatic, and we gain almost nothing compared to not using repulsion. We can achieve this manual positioning by using nudging without repulsion (with the 'ggrepel' geoms to get the segments).

I have been trying to work out if any of this could be automated in some way, as this would be much more useful than manual tweaking.

Approach a) (sophisticated automatic nudging, or manual nudging of individual points) seems the easiest to implement, as it does not require fiddling with the repulsion algorithm. Approach b) seems doable for idea 2 without big changes to the algorithm. However, repulsion is already quite sensitive to the supplied forces and other repulsion parameters, and could become even more dependent on tweaking. Of course, a) and b) can complement each other and can even be developed independently. I guess a) would work just fine with smooth lines or boundaries, but b) or a) + b) would work better with ragged lines like the solar spectrum.

These are mostly random thoughts... I will try to give a go to approach a) myself in the next few weeks.

@r2evans

r2evans commented Jan 7, 2021

Perhaps I misunderstand, but I think I disagree with the premise of one sentence:

"if we end up manually setting a nudge or force angle for each individual label, repulsion is no longer automatic, and we gain almost nothing compared to not using repulsion"

While I don't know the mechanism being used for the "nudging", I imagine that it is currently either unconstrained (both x and y nudging are available) or constrained in one axis. What is the difference between constraining one axis (x or y) and constraining the forced angle to be 90/270 or 0/180? While I understand that having different angles for each label is far from trivial, it's still a constraint on one axis (each).

If I misunderstand, my apologies. Thanks for the discussion and work!

@slowkow
Owner

slowkow commented Jan 7, 2021

@r2evans I think the idea is:

  • direction = "x" allows the label to move left and right
  • direction = 0 allows the label only to move to the right
  • direction = 180 allows the label only to move to the left
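
The distinction in the list above can be sketched as a projection of a proposed displacement onto an allowed angle. This is only an illustration of the semantics, not how ggrepel works internally: `project_displacement()` and its `both_ways` argument are hypothetical, and treating NA as "unconstrained" follows the suggestion made earlier in this thread.

```r
# Hypothetical helper (not ggrepel code): keep only the part of a proposed
# displacement (dx, dy) that lies along the allowed angle `deg`.
#   both_ways = TRUE  -> motion along the whole axis (like direction = "x")
#   both_ways = FALSE -> motion only in the direction of `deg`
#   deg = NA          -> unconstrained
project_displacement <- function(dx, dy, deg, both_ways = FALSE) {
  if (is.na(deg)) return(c(dx = dx, dy = dy))
  u <- c(cos(deg * pi / 180), sin(deg * pi / 180))  # unit vector for `deg`
  along <- dx * u[1] + dy * u[2]                     # signed length along u
  if (!both_ways) along <- max(along, 0)             # forbid moving backwards
  c(dx = along * u[1], dy = along * u[2])
}

# A displacement that points left and slightly up.
right_only <- project_displacement(-0.3, 0.2, deg = 0)
x_axis     <- project_displacement(-0.3, 0.2, deg = 0, both_ways = TRUE)
left_only  <- project_displacement(-0.3, 0.2, deg = 180)
free       <- project_displacement(-0.3, 0.2, deg = NA)
```

Here `right_only` is zero in both components because the label may not move left, `x_axis` and `left_only` keep the leftward part (up to floating-point rounding), and `free` is returned unchanged. An axis at any angle, such as one perpendicular to a line drawn by another layer, falls out of the same projection with `both_ways = TRUE`.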

Idea 4: Masks

Wouldn't it be great if ggrepel knew which pixels in the plotting area have already been occupied by all previous layers (texts, columns, points, curves, shapes, etc.)?

The ggwordcloud package is doing something like this with the idea of masks (black and white PNG images indicating which areas are occupied or not).

Please see this code for details: https://github.com/lepennec/ggwordcloud/blob/48b37ac9c2adc31a9ebc641d01c544c9c20c7eb3/R/geom-text-wordcloud.R#L529-L547

I don't fully understand what is happening in the ggwordcloud code, but I would encourage anyone with interest to play with that code and see how far it can be pushed.

For example:

  • Is it possible to convert a ggplot2 object with multiple layers (lines, points, shapes, columns, etc.) into a simple black-and-white mask.png file? I don't know.

Screenshot 2021-01-07 at 11 09 52 AM


Just sharing ideas! I don't have concrete plans to implement any of these ideas right now. Please feel free to try anything you wish and open a pull request if you think users will benefit.

For anyone following along with this discussion, let me take this opportunity to share the Related Work link:

https://ggrepel.slowkow.com/articles/related-work.html

Many people have tried various ideas, and some ideas are even written up as academic papers. At the Related Work page, perhaps you can find an existing R package that suits you (e.g., directlabels)? Or an established idea that can be copied into R?

@aphalo
Contributor

aphalo commented Jan 7, 2021

Idea 4-simplified: Let the user provide the matrix as an argument (say 50 x 50 cells), as this would allow keeping the rendering and repulsion computations independent of other plot layers. Or at least one could think of this as two separate problems: repulsion constrained by a matrix, and the generation of the matrix.
Just what came to my mind when reading your comment. I will have a look at "related work".
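
The "repulsion constrained by a matrix" half of the problem could start from a simple occupancy test like the one sketched below. Everything here is an assumption made for illustration: the unit-square coordinates, the made-up diagonal band, and the helper names `to_cell()` and `box_is_blocked()`, none of which exist in ggrepel.

```r
# Hypothetical sketch: a 50 x 50 logical matrix over the unit square,
# TRUE where the plot area is already occupied by another layer.
n <- 50
mask <- matrix(FALSE, nrow = n, ncol = n)  # rows index y cells, columns x cells

# Mark a made-up diagonal band as occupied (e.g. a line from another layer).
for (i in seq_len(n)) {
  mask[i, max(1, i - 1):min(n, i + 1)] <- TRUE
}

# Convert a coordinate in [0, 1] to a cell index in 1..n.
to_cell <- function(v, n) pmin(n, pmax(1, floor(v * n) + 1))

# Does a label box (limits given in [0, 1]) touch any occupied cell?
box_is_blocked <- function(mask, xmin, xmax, ymin, ymax) {
  rows <- to_cell(ymin, nrow(mask)):to_cell(ymax, nrow(mask))
  cols <- to_cell(xmin, ncol(mask)):to_cell(xmax, ncol(mask))
  any(mask[rows, cols])
}

on_band  <- box_is_blocked(mask, 0.45, 0.55, 0.45, 0.55)  # box on the diagonal
off_band <- box_is_blocked(mask, 0.05, 0.15, 0.80, 0.90)  # box in a free corner
```

A repulsion step could reject or penalise candidate positions for which such a test returns TRUE; generating the matrix from the other plot layers remains the separate, harder problem.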

@aphalo
Contributor

aphalo commented Jan 7, 2021

I had a quick look at position_nudge_repel(): nudge_x and nudge_y arguments given as vectors of a suitable length just work, with no change to the code. One could just add an example to the documentation.

library(ggplot2)
library(ggrepel)

# Four points; each label is the point's y value.
df <- data.frame(
  x = c(1, 3, 2, 5),
  y = c("a", "c", "d", "c")
)

ggplot(df, aes(x, y)) +
  geom_point() +
  geom_text_repel(
    aes(label = y),
    min.segment.length = 0,
    # One x nudge per data point; the single y nudge is recycled.
    position = position_nudge_repel(x = c(-0.1, 0, 0.1, 0.1), y = 0.15)
  )

Rplot
