
Normalize variables #451

Open
bonh opened this issue Apr 5, 2024 · 4 comments

bonh commented Apr 5, 2024

I get the feeling that normalizing the variables to be around 1 greatly improves the sampling. Specifically, it prevents the "point is inside the hull" error. Might be worth adding to the tutorial somewhere?

basnijholt (Member) commented

@bonh thanks for your post. Could you perhaps share some more details? Are you talking about the Learner2D? If so, the values should already be rescaled automatically.

It would be great if you could share some code!

bonh commented Apr 9, 2024

I use LearnerND with 5 inputs ranging from $\mathcal{O}(1)$ to $\mathcal{O}(10^{-6})$, mapping to one output in the range of $\mathcal{O}(10^{-3})$.

The function is quite complex and I'm not able to share it (yet). I'll try to find an MWE.

What I noticed was that, in the non-scaled problem, the values chosen by adaptive varied only in the third or fourth digit after the decimal point from one iteration to the next (until it failed).
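To illustrate the kind of rescaling being discussed, here is a minimal sketch that samples on the unit cube and maps points back to the physical ranges inside the wrapped function. The function `expensive_sim` and the parameter ranges are hypothetical placeholders, not the actual problem from this thread:

```python
import adaptive

# Hypothetical ranges spanning many orders of magnitude, standing in
# for the real five inputs (which were not shared in this thread).
ranges = [(0.0, 1.0), (0.0, 1e-2), (0.0, 1e-4), (0.0, 1e-5), (0.0, 1e-6)]

def expensive_sim(x):
    # Placeholder for the real costly function.
    return sum(x) ** 2

def rescaled(u):
    # Map a point from the unit cube back to the physical ranges,
    # so the learner only ever sees O(1) coordinates.
    x = [lo + ui * (hi - lo) for ui, (lo, hi) in zip(u, ranges)]
    return expensive_sim(x)

# Sampling on the unit cube keeps the triangulation well-conditioned,
# which is what seems to prevent the "point is inside the hull" error.
learner = adaptive.LearnerND(rescaled, bounds=[(0.0, 1.0)] * len(ranges))
adaptive.runner.simple(learner, goal=lambda l: l.npoints >= 100)
```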

basnijholt (Member) commented

At this exact moment I have no time to take a detailed look.

However, from a quick look I am led to believe that the problem is that we're not using the `value_scale` parameter in the loss functions:

```python
def default_loss(simplex, values, value_scale):
    """
    Computes the average of the volumes of the simplex.

    Parameters
    ----------
    simplex : list of tuples
        Each entry is one point of the simplex.
    values : list of values
        The scaled function values of each of the simplex points.
    value_scale : float
        The scale of values, where ``values = function_values * value_scale``.

    Returns
    -------
    loss : float
    """
    if isinstance(values[0], Iterable):
        pts = [(*x, *y) for x, y in zip(simplex, values)]
    else:
        pts = [(*x, y) for x, y in zip(simplex, values)]
    return simplex_volume_in_embedding(pts)


@uses_nth_neighbors(1)
def triangle_loss(simplex, values, value_scale, neighbors, neighbor_values):
    """
    Computes the average of the volumes of the simplex combined with each
    neighbouring point.

    Parameters
    ----------
    simplex : list of tuples
        Each entry is one point of the simplex.
    values : list of values
        The scaled function values of each of the simplex points.
    value_scale : float
        The scale of values, where ``values = function_values * value_scale``.
    neighbors : list of tuples
        The neighboring points of the simplex, ordered such that simplex[0]
        exactly opposes neighbors[0], etc.
    neighbor_values : list of values
        The function values for each of the neighboring points.

    Returns
    -------
    loss : float
    """
    neighbors = [n for n in neighbors if n is not None]
    neighbor_values = [v for v in neighbor_values if v is not None]
    if len(neighbors) == 0:
        return 0

    s = [(*x, *to_list(y)) for x, y in zip(simplex, values)]
    n = [(*x, *to_list(y)) for x, y in zip(neighbors, neighbor_values)]

    return sum(simplex_volume_in_embedding([*s, neighbor]) for neighbor in n) / len(
        neighbors
    )
```

This should be a relatively easy fix.
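Until such a fix lands, one possible user-side workaround is to pass a custom loss through LearnerND's `loss_per_simplex` argument and apply `value_scale` there explicitly. This is only a sketch based on a reading of the docstring above, not an official recommendation: `scaled_default_loss` is a hypothetical name, it assumes the values arrive effectively unscaled (which is what this issue suggests), and it assumes `simplex_volume_in_embedding` is importable from `adaptive.learner.learnerND`, where the quoted code lives.

```python
from collections.abc import Iterable

from adaptive.learner.learnerND import simplex_volume_in_embedding

def scaled_default_loss(simplex, values, value_scale):
    # Same as default_loss above, but multiplies the values by
    # value_scale explicitly before embedding the simplex, on the
    # assumption that the values are passed in unscaled.
    if isinstance(values[0], Iterable):
        pts = [(*x, *(value_scale * v for v in y)) for x, y in zip(simplex, values)]
    else:
        pts = [(*x, value_scale * y) for x, y in zip(simplex, values)]
    return simplex_volume_in_embedding(pts)

# learner = adaptive.LearnerND(f, bounds, loss_per_simplex=scaled_default_loss)
```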

@bonh, unrelated to this issue, how is your experience with sampling a 5D space? Does Adaptive produce good results, better than random or uniform sampling? Personally, I have not even tried running real simulations beyond 3D, always thinking that "the curse of dimensionality" would bite me.

bonh commented Apr 10, 2024

> we're not using the value_scale parameter in the loss functions

That'd explain my observations, thanks!

I just started sampling a 5D space; before that it was 3D, too. My goal is to train a surrogate approximating my complex, costly function. However, the function is not so costly that I cannot sample 4000 points in a reasonable time. My guess is that I would get similar results with different sampling procedures because I'm filling the parameter space very well. So I didn't do a detailed analysis, but I think I need about 30% fewer samples with adaptive compared to uniform sampling (this is for 3D) to get comparable predictive accuracy with the trained surrogates.
