Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compare ceres with scipy curve_fit #1042

Open
insar-dev opened this issue Jan 12, 2024 · 8 comments
Open

Compare ceres with scipy curve_fit #1042

insar-dev opened this issue Jan 12, 2024 · 8 comments

Comments

@insar-dev
Copy link

insar-dev commented Jan 12, 2024

I compared ceres and scipy's curve_fit and found that ceres did not perform as well as curve_fit. The range calculated by ceres is 0, while the range calculated by curve_fit is more reasonable. What could be the possible reasons for this?
Code used for curve_fit:

import numpy as np
from scipy.optimize import curve_fit

def spherical_variogram_model(d,nugget,range_,psill):

#if h < range, use spherical equation, else (h>=range) adopted sill
return np.where(d<range_,nugget+(psill-nugget)*(1.5*d/range_-0.5*d**3/(range_**3)),psill)

Input data

lag = np.array([175.65234, 390.07074, 617.2337, 846.20544, 1079.8468, 1312.8428,
1545.525, 1777.5623, 2009.1091, 2239.874, 2472.7234, 2709.9663,
2941.5889, 3174.672, 3406.3713, 3639.4817, 3873.5212, 4107.823,
4341.223, 4571.714, 4805.605, 5041.677, 5271.1084, 5503.067,
5737.291, 5970.111, 6202.1523, 6437.138, 6670.4116, 6892.423])
gamma = np.array([20.815123, 21.075365, 19.551971, 20.397446, 22.835442, 24.360342,
28.06185, 27.736881, 28.389353, 27.340208, 30.65644, 31.15702,
29.316727, 28.009079, 26.655233, 24.42093, 20.318785, 23.195473,
22.581009, 23.794619, 22.329227, 22.553814, 21.483736, 20.519196,
21.407993, 18.173586, 21.122591, 25.138948, 24.998138, 34.060234])

nlag=len(lag)
#Pick a random initial value for the nugget
Nugget=(gamma[1]*lag[0]-gamma[0]*lag[1])/(lag[0]-lag[1])
if Nugget<0: Nugget=gamma[0]
#Pick a random initial value for the sill
Sill=(gamma[nlag-3]+gamma[nlag-2]+gamma[nlag-1])/3.0
#kick the starting value for the range
Range=lag[int(nlag/2)]

#Array of Initial Values. Also used in Gold Rule Fit
init_vals = [Nugget, Range, Sill]

#define the maximum values
maxlim=[max(gamma),max(lag),max(gamma)]

check=True
[Nugget,Range,Sill], _ = curve_fit(spherical_variogram_model, lag, gamma,method='trf', check_finite = check, p0=init_vals ,bounds=(0, maxlim) )
print("Estimated parameters:")
print("Range:", Range)
print("Sill:", Sill)
print("Nugget:", Nugget)

ceres :

using namespace ceres;
ceres::Problem problem;

template
class SphericalCovariance
{

public:
SphericalCovariance(T range, T sill, T nugget)
: _nugget(nugget)
, _range(range)
, _sill(sill)
{
_psill = _sill - _nugget;

}

public:
T compute(double d) const
{

	if (d <= _range) 
	{
		return _psill * ((3.0 * d) / (2.0 * _range) - (d * d * d) / (2.0 * _range * _range * _range)) + _nugget;
	}
	else {
		return _psill + _nugget;
	}
}

private:
T _sill;
T _range;
T _nugget;
T _psill;

};

struct ModelResidual
{
ModelResidual(double lag, double gamma)
: _lag(lag), _gamma(gamma) {}

template <typename T>
bool operator()(const T* const range,
				const T* const nugget, 
				const T* const sill, 
				T* residual) const 
{


	residual[0] = _gamma - SphericalCovariance( 
											   range[0], 
											   sill[0],
											   nugget[0]
											   ).compute(_lag);

	//residual[0] = y_ - exp(m[0] * x_ + c[0]);
	return true;
}

private:
// Observations for a sample.
const double _lag;
const double _gamma;
};

//solve

double nugget = guessNugget;
double sill = guessSill;
double range = guessRange;
for (size_t i = 0; i < lag.size(); ++i)
{
ceres::CostFunction* cost_function =
new ceres::AutoDiffCostFunction<ModelResidual, 1, 1, 1,1>(
new ModelResidual{lag[i], gamma[i]});

//SoftLOneLoss CauchyLoss HuberLoss
ceres::LossFunction* loss = nullptr;

problem.AddResidualBlock(cost_function, 
						 loss,
						 &range,
						 &nugget,
						 &sill);

}

problem.SetParameterLowerBound(&nugget, 0, 0.0);
problem.SetParameterUpperBound(&nugget, 0, maxNugget);
problem.SetParameterLowerBound(&range, 0, 0.0);
problem.SetParameterUpperBound(&range, 0, maxRange);
problem.SetParameterLowerBound(&sill, 0, 0.0);
problem.SetParameterUpperBound(&sill, 0, maxSill);

ceres::Solver::Options options;
options.linear_solver_type = ceres::DENSE_QR;
options.minimizer_progress_to_stdout = false;

ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);

@insar-dev
Copy link
Author

curve_fit result:
99cec6eff9c9efe07d79eaa16282631
ceres result:
image

@sandwichmaker
Copy link
Contributor

scipy and ceres use different optimization algorithms, especially for how they deal with bounds, so I am not surprised that they give different results. Can you share the log from ceres solver iterations and the output of Solver::Summary::FullReport()?

@insar-dev
Copy link
Author

@sandwichmaker

However, I find that curve_fit seems more reasonable in this case.

Solver Summary (v 2.1.0-eigen-(3.4.0)-no_lapack-no_openmp)

                                 Original                  Reduced

Parameter blocks 3 3
Parameters 3 3
Residual blocks 30 30
Residuals 30 30

Minimizer TRUST_REGION

Dense linear algebra library EIGEN
Trust region strategy LEVENBERG_MARQUARDT

                                    Given                     Used

Linear solver DENSE_QR DENSE_QR
Threads 1 1
Linear solver ordering AUTOMATIC 3

Cost:
Initial 3.105900e+02
Final 2.238625e+02
Change 8.672749e+01

Minimizer iterations 3
Successful steps 3
Unsuccessful steps 0
Line search steps 0

Time (in seconds):
Preprocessor 0.000296

Residual only evaluation 0.000058 (3)
Line search cost evaluation 0.000000
Jacobian & residual evaluation 0.000994 (6)
Line search gradient evaluation 0.000387
Linear solver 0.000500 (3)
Line search polynomial minimization 0.000000
Minimizer 0.002030

Postprocessor 0.000010
Total 0.002336

Termination: CONVERGENCE (Function tolerance reached. |cost_change|/cost: 9.231992e-10 <= 1.000000e-06)

@sandwichmaker
Copy link
Contributor

ty, I'd like to run your code myself. I think you have most of the code pasted above but not all of it, including your initial guesses etc. Can you post your entire ceres code please?

@insar-dev
Copy link
Author

@sandwichmaker I have already uploaded the code. Thank you.

//#include "FitTest.h"
#include <glog/logging.h>
#include <ceres/ceres.h>

template
class SphericalCovariance
{

public:
SphericalCovariance(T range, T sill, T nugget)
: _nugget(nugget)
, _range(range)
, _sill(sill)
{
_psill = _sill - _nugget;

}

public:
T compute(double d) const
{

	if (d <= _range)
	{
		return _psill * ((3.0 * d) / (2.0 * _range) - (d * d * d) / (2.0 * _range * _range * _range)) + _nugget;
	}
	else {
		return _psill + _nugget;
	}
}

private:
T _sill;
T _range;
T _nugget;
T _psill;

};

struct ModelResidual
{
ModelResidual(double lag, double gamma)
: _lag(lag), _gamma(gamma) {}

template <typename T>
bool operator()(const T* const range,
	const T* const nugget,
	const T* const sill,
	T* residual) const
{


	residual[0] = _gamma - SphericalCovariance(
		range[0],
		sill[0],
		nugget[0]
	).compute(_lag);

	//residual[0] = y_ - exp(m[0] * x_ + c[0]);
	return true;
}

private:
// Observations for a sample.
const double _lag;
const double _gamma;
};

void ceres_test()
{
using namespace ceres;
ceres::Problem problem;

std::vector<double> lag = { 175.65234, 390.07074, 617.2337, 846.20544, 1079.8468, 1312.8428,
			1545.525, 1777.5623, 2009.1091, 2239.874, 2472.7234, 2709.9663,
			2941.5889, 3174.672, 3406.3713, 3639.4817, 3873.5212, 4107.823,
			4341.223, 4571.714, 4805.605, 5041.677, 5271.1084, 5503.067,
			5737.291, 5970.111, 6202.1523, 6437.138, 6670.4116, 6892.423 };
std::vector<double> gamma = { 20.815123, 21.075365, 19.551971, 20.397446, 22.835442, 24.360342,
			  28.06185, 27.736881, 28.389353, 27.340208, 30.65644, 31.15702,
			  29.316727, 28.009079, 26.655233, 24.42093, 20.318785, 23.195473,
			  22.581009, 23.794619, 22.329227, 22.553814, 21.483736, 20.519196,
			  21.407993, 18.173586, 21.122591, 25.138948, 24.998138, 34.060234 };


double nugget = 20.601930259621913;
double sill = 28.062121873701631;
double range = 3639.4816871947096;

double maxRange = 6892.4227592761508;
double maxNugget = 34.060233524360626;
double maxSill = 34.060233524360626;

for (size_t i = 0; i < lag.size(); ++i)
{
	ceres::CostFunction* cost_function =
		new ceres::AutoDiffCostFunction<ModelResidual, 1, 1, 1,1>(
			new ModelResidual{lag[i], gamma[i]});

	//SoftLOneLoss CauchyLoss HuberLoss
	ceres::LossFunction* loss = nullptr;

	problem.AddResidualBlock(cost_function, 
							 loss,
							 &range,
							 &nugget,
							 &sill);
}

problem.SetParameterLowerBound(&nugget, 0, 0.0);
problem.SetParameterUpperBound(&nugget, 0, maxNugget);
problem.SetParameterLowerBound(&range,  0, 0.0);
problem.SetParameterUpperBound(&range,  0, maxRange);
problem.SetParameterLowerBound(&sill,   0, 0.0);
problem.SetParameterUpperBound(&sill,   0, maxSill);

ceres::Solver::Options options;
options.linear_solver_type = ceres::DENSE_QR;
options.minimizer_progress_to_stdout = false;

ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);

std::cout << summary.FullReport() << std::endl;
std::cout << "Fitted parameters:" << std::endl;
std::cout << "Nugget = " << nugget << std::endl;
std::cout << "Range = " << range << std::endl;
std::cout << "Sill = " << sill << std::endl;

}

@sandwichmaker
Copy link
Contributor

Ty, I will take a look at this in a bit.

@sandwichmaker
Copy link
Contributor

@insar-dev

in the python version of the code you have

def spherical_variogram_model(d,nugget,range_,psill):
   #if h < range, use spherical equation, else (h>=range) adopted sill
   return np.where(d<range_,nugget+(psill-nugget)*(1.5*d/range_-0.5*d**3/(range_**3)),psill)

which seems a bit different from what you have in the c++ version of the code

if (d <= _range)
	{
		return _psill * ((3.0 * d) / (2.0 * _range) - (d * d * d) / (2.0 * _range * _range * _range)) + _nugget;
	}
	else {
		return _psill + _nugget;
	}

In particular in the python code you are using psill - nugget when multiplying to (1.5*d/range_-0.5*d**3/(range_**3)) and the else clause only returns psill.

Are the c++ and python version of psill the same or different?

That said, I substituted the values found by scipy into the objective function and it is indeed a better minimum than the one found by ceres.

@insar-dev
Copy link
Author

insar-dev commented Jan 12, 2024

@sandwichmaker

There are issues with the variable naming in the code. You can use the following code instead:

def spherical_variogram_model(d,nugget,range_,sill):

#if h < range, use spherical equation, else (h>=range) adopted sill
return np.where(d<range_,nugget+(sill-nugget)*(1.5*d/range_-0.5*d**3/(range_**3)),sill)

sill = nugget + psill

Under the Gaussian and Exponential models, their results are similar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants