Sparse weight solvers #1552

Open

wants to merge 2 commits into main
Conversation

arvoelke (Contributor) commented Jun 28, 2019

Motivation and context:
Currently, `Sparse` transforms are usable only when they are manually specified. This PR allows certain solvers, such as `LstsqL1` and `LstsqDrop`, to signal to the backend that the returned weights should be implemented using a sparse representation. This currently works only for `weights=True`; when `weights=False`, dense decoders continue to be used.
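For example (a minimal usage sketch assuming this PR's behavior; `LstsqL1` requires scikit-learn):

import nengo

with nengo.Network() as net:
    pre = nengo.Ensemble(100, dimensions=1)
    post = nengo.Ensemble(100, dimensions=1)
    # LstsqL1 yields mostly-zero weights; with this PR its sparse flag
    # also hints to the backend that the solved weight matrix should be
    # stored in a scipy.sparse format (only for weights=True).
    conn = nengo.Connection(pre, post,
                            solver=nengo.solvers.LstsqL1(weights=True))

with nengo.Simulator(net) as sim:
    weights = sim.data[conn].weights  # sparse when scipy is installed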

Interactions with other PRs:
Might conflict with some documentation changes in #1540.

How has this been tested?
Verified that an existing test now uses sparse weight matrices when scipy is installed. Added a test for each new warning.

How long should this take to review?

  • Average (neither quick nor lengthy)

Where should a reviewer start?
Start in solvers.py with the changes to existing solvers. Then see how this is handled in the connection builder.

Types of changes:

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have read the CONTRIBUTING.rst document.
  • I have updated the documentation accordingly.
  • I have included a changelog entry.
  • I have added tests to cover my changes.
  • I have run the test suite locally and all tests passed.

Still to do:
I've done a bit of profiling to see when it makes sense to do this. On my Ubuntu machine with a conda + scipy install and Python 3.6, it helps when using a Lasso solver with < 20% sparsity (fraction of non-zero coefficients) and 1000 x 1000 weight matrices.

[Figure: speed_improvement]

Accuracy seems to be consistent with LstsqL2 when within 10-20% sparsity:

[Figure: accuracy_difference]

Benchmark script used to produce the plots above:

import time
import warnings

import numpy as np

from nengo.params import IntParam, NumberParam
from nengo.solvers import Solver, format_system, rmses
from nengo.utils.numpy import scipy_sparse


class Lasso(Solver):
    """Least-squares with L1-regularization to enforce sparsity."""
    
    compositional = False

    reg = NumberParam('reg', low=0)
    max_iter = IntParam('max_iter', low=1)
    tol = NumberParam('tol', low=0, low_open=True)

    def __init__(self, weights=False, reg=0.1, max_iter=5000, tol=1e-4,
                 sparse=True):
        """
        .. note:: Requires
                  `scikit-learn <https://scikit-learn.org/stable/>`_.

        Parameters
        ----------
        weights : bool, optional
            If False, solve for decoders. If True, solve for weights.
        reg : float, optional
            Amount of regularization, as a fraction of the neuron activity.
        max_iter : int, optional
            Maximum number of iterations for the scikit-learn solver.
        tol : float, optional
            Required tolerance for the solver to converge.
        sparse : bool, optional
            Hints to the backend that the matrix is sparse.

        Attributes
        ----------
        max_iter : int
            Maximum number of iterations for the scikit-learn solver.
        reg : float
            Amount of regularization, as a fraction of the neuron activity.
        sparse : bool
            Hints to the backend that the matrix is sparse.
        tol : float
            Required tolerance for the solver to converge.
        weights : bool
            If False, solve for decoders. If True, solve for weights.
        """
        import sklearn.linear_model  # import here too to throw error early
        assert sklearn.linear_model
        self.reg = reg
        self.max_iter = max_iter
        self.tol = tol
        super().__init__(weights=weights, sparse=sparse)

    def __call__(self, A, Y, rng=np.random):
        from sklearn.linear_model import Lasso

        tstart = time.time()
        Y, _, _, _, matrix_in = format_system(A, Y)

        # Limitation: for all-to-all weights we could set
        # fit_intercept=True and then roll them into the
        # biases of the postsynaptic neurons.
        subsolver = Lasso(
            alpha=self.reg * A.max(),
            fit_intercept=False,
            copy_X=False,
            max_iter=self.max_iter,
            tol=self.tol,
            random_state=rng,
            selection='random',  # faster when tolerance is high
        )
        subsolver.fit(A, Y.copy())  # Y is read-only
        assert np.allclose(subsolver.intercept_, 0)
        X = subsolver.coef_.T
        if X.ndim == 1:
            X = X[:, None]

        t = time.time() - tstart
        weights = X if matrix_in or X.shape[1] > 1 else X.ravel()
        info = {
            'rmses': rmses(A, X, Y),
            'time': t,
            'sparsity': np.count_nonzero(weights) / weights.size,
        }
        return weights, info


# Whitelist the solver so that decoders solved by Lasso can be stored in
# Nengo's decoder cache.
from nengo.cache import Fingerprint

Fingerprint.whitelist(Lasso)
assert Fingerprint.supports(Lasso())


import nengo
from nengo.builder.ensemble import get_activities
from nengo.utils.numpy import rmse, scipy_sparse

LABEL_SEED = 'Seed'
LABEL_SOLVER = 'Solver'
LABEL_RMSE = 'RMSE'
LABEL_SPARSITY = 'Sparsity'
LABEL_TIME = 'Real Time / Simulation Time'

def trial(solver, seed, n_pre=1000, n_post=1000,
          t_sim=2.0, neuron_type=nengo.LIFRate()):

    with nengo.Network(seed=seed) as model:
        u = nengo.Node(output=lambda t: 2*t/t_sim - 1)
        x = nengo.Ensemble(n_pre, 1, neuron_type=neuron_type)
        y = nengo.Ensemble(n_post, 1, neuron_type=neuron_type)

        nengo.Connection(u, x, synapse=None)
        conn = nengo.Connection(x, y, solver=solver)
        p = nengo.Probe(y, synapse=None)

    with nengo.Simulator(model, progress_bar=None) as sim:
        start_time = time.time()
        sim.run(t_sim, progress_bar=None)
        end_time = time.time()
    
    assert conn.solver.weights

    if conn.solver.sparse:
        assert isinstance(sim.data[conn].weights, scipy_sparse.csr_matrix)
    
    # Ideal output: the unit ramp input sampled at the probe times
    p_u = np.linspace(-1, 1, len(sim.trange()))
    
    return {
        LABEL_RMSE: rmse(sim.data[p].squeeze(axis=1), p_u),
        LABEL_SPARSITY: sim.data[conn].solver_info.get('sparsity', None),
        LABEL_TIME: (end_time - start_time) / t_sim,
    }        


from collections import defaultdict

num_trials = 25
lasso_kwargs = {'max_iter': 500}

solvers = [
    #nengo.solvers.LstsqL1(weights=True),
    #nengo.solvers.LstsqDrop(drop=0.25, weights=True),
    #nengo.solvers.LstsqDrop(drop=0.90, weights=True),
    Lasso(reg=1e-0, weights=True, **lasso_kwargs),
    Lasso(reg=1e-1, weights=True, **lasso_kwargs),
    Lasso(reg=1e-2, weights=True, **lasso_kwargs),
    Lasso(reg=1e-3, weights=True, **lasso_kwargs),
    nengo.solvers.LstsqL2(weights=True),
]

data = defaultdict(list)

for seed in range(num_trials):
    for solver in solvers:
        print(seed, solver)
        res = trial(solver, seed=0)  # use same decoders/sparsity but get different sim times
        if res[LABEL_SPARSITY] is None:
            assert not solver.sparse
            S = (0, 0.2)  # hack to create horizontal band
        else:
            assert solver.sparse
            S = (res[LABEL_SPARSITY],)
        for sparsity in S:
            data[LABEL_SOLVER].append(type(solver).__name__)
            data[LABEL_SEED].append(seed)
            data[LABEL_SPARSITY].append(sparsity)
            data[LABEL_RMSE].append(res[LABEL_RMSE])
            data[LABEL_TIME].append(res[LABEL_TIME])


import matplotlib.pyplot as plt
import seaborn as sns
from pandas import DataFrame

df = DataFrame(data)

plt.figure()
sns.lineplot(data=df, x=LABEL_SPARSITY, y=LABEL_RMSE, hue=LABEL_SOLVER)
plt.show()

plt.figure()
sns.lineplot(data=df, x=LABEL_SPARSITY, y=LABEL_TIME, hue=LABEL_SOLVER)
plt.show()

Review comments:

On the `sparse` parameter docstring in the diff:

> This hints to the backend that it should use a sparse matrix representation to implement the transform. Currently only utilized for the `weights=True` case, and kept `Dense` otherwise.
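A hypothetical sketch of what this hint implies on the builder side (not the PR's actual code; the helper name and the choice of csr_matrix are assumptions, the latter matching the test assertion above):

import scipy.sparse


def apply_sparse_hint(solver, weights):
    # Hypothetical helper: honor the solver's sparse hint for full
    # weight matrices; otherwise keep the dense ndarray.
    if getattr(solver, "sparse", False) and solver.weights:
        return scipy.sparse.csr_matrix(weights)
    return weights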
On `pytest.skip("Test requires no Scipy")`:

> This wouldn't have worked anyway, because the import has already happened at this point (during the solver's `__init__`).

On:

    if solver_cls is LstsqL1:
        pytest.importorskip('sklearn')
arvoelke (Contributor, Author): Since sklearn requires scipy, this test doesn't actually do anything in this case. The test only makes sense for LstsqDrop.
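A small runnable check of that dependency chain (hypothetical, for illustration only):

import importlib.util

# sklearn depends on scipy, so any environment that can import sklearn
# can also import scipy; a "no scipy" branch is unreachable for LstsqL1.
has_sklearn = importlib.util.find_spec("sklearn") is not None
has_scipy = importlib.util.find_spec("scipy") is not None
assert not has_sklearn or has_scipy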
