Sparse weight solvers #1552

Open

wants to merge 2 commits into main
Conversation

arvoelke (Contributor) commented Jun 28, 2019

Motivation and context:
Currently, `Sparse` transforms are usable only when they are manually specified. This PR allows certain solvers, such as `LstsqL1` and `LstsqDrop`, to signal to the backend that the returned weights should be implemented using a sparse representation. This currently works only for `weights=True`; when `weights=False`, dense decoders continue to be used.
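For example (a minimal usage sketch assuming this PR's behavior; `LstsqL1` requires scikit-learn):

import nengo

with nengo.Network() as net:
    pre = nengo.Ensemble(100, dimensions=1)
    post = nengo.Ensemble(100, dimensions=1)
    # LstsqL1 yields mostly-zero weights; with this PR its sparse flag
    # also hints to the backend that the solved weight matrix should be
    # stored in a scipy.sparse format (only for weights=True).
    conn = nengo.Connection(pre, post,
                            solver=nengo.solvers.LstsqL1(weights=True))

with nengo.Simulator(net) as sim:
    weights = sim.data[conn].weights  # sparse when scipy is installed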

Interactions with other PRs:
Might conflict with some documentation changes in #1540.

How has this been tested?
Verified that an existing test now uses sparse weight matrices when scipy is installed. Added a test for each new warning.

How long should this take to review?

  • Average (neither quick nor lengthy)

Where should a reviewer start?
Start in solvers.py with the changes to existing solvers. Then see how this is handled in the connection builder.

Types of changes:

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have read the CONTRIBUTING.rst document.
  • I have updated the documentation accordingly.
  • I have included a changelog entry.
  • I have added tests to cover my changes.
  • I have run the test suite locally and all tests passed.

Still to do:
I've done a bit of profiling to see when it makes sense to do this. On my Ubuntu machine with a conda + scipy install and Python 3.6, it helps when using a Lasso solver with < 20% sparsity (fraction of non-zero coefficients) and 1000 x 1000 weight matrices.

[Figure: speed_improvement]

Accuracy seems to be consistent with LstsqL2 when within 10-20% sparsity:

[Figure: accuracy_difference]

Benchmark script used to produce the plots above:

import time
import warnings

import numpy as np

from nengo.params import IntParam, NumberParam
from nengo.solvers import Solver, format_system, rmses
from nengo.utils.numpy import scipy_sparse


class Lasso(Solver):
    """Least-squares with L1-regularization to enforce sparsity."""
    
    compositional = False

    reg = NumberParam('reg', low=0)
    max_iter = IntParam('max_iter', low=1)
    tol = NumberParam('tol', low=0, low_open=True)

    def __init__(self, weights=False, reg=0.1, max_iter=5000, tol=1e-4,
                 sparse=True):
        """
        .. note:: Requires
                  `scikit-learn <https://scikit-learn.org/stable/>`_.

        Parameters
        ----------
        weights : bool, optional
            If False, solve for decoders. If True, solve for weights.
        reg : float, optional
            Amount of regularization, as a fraction of the neuron activity.
        max_iter : int, optional
            Maximum number of iterations for the scikit-learn solver.
        tol : float, optional
            Required tolerance for the solver to converge.
        sparse : bool, optional
            Hints to the backend that the matrix is sparse.

        Attributes
        ----------
        max_iter : int
            Maximum number of iterations for the scikit-learn solver.
        reg : float
            Amount of regularization, as a fraction of the neuron activity.
        sparse : bool
            Hints to the backend that the matrix is sparse.
        tol : float
            Required tolerance for the solver to converge.
        weights : bool
            If False, solve for decoders. If True, solve for weights.
        """
        import sklearn.linear_model  # import here too to throw error early
        assert sklearn.linear_model
        self.reg = reg
        self.max_iter = max_iter
        self.tol = tol
        super().__init__(weights=weights, sparse=sparse)

    def __call__(self, A, Y, rng=np.random):
        from sklearn.linear_model import Lasso

        tstart = time.time()
        Y, _, _, _, matrix_in = format_system(A, Y)

        # Limitation: for all-to-all weights we could set
        # fit_intercept=True and then roll them into the
        # biases of the postsynaptic neurons.
        subsolver = Lasso(
            alpha=self.reg * A.max(),
            fit_intercept=False,
            copy_X=False,
            max_iter=self.max_iter,
            tol=self.tol,
            random_state=rng,
            selection='random',  # faster when tolerance is high
        )
        subsolver.fit(A, Y.copy())  # Y is read-only
        assert np.allclose(subsolver.intercept_, 0)
        X = subsolver.coef_.T
        if X.ndim == 1:
            X = X[:, None]

        t = time.time() - tstart
        weights = X if matrix_in or X.shape[1] > 1 else X.ravel()
        info = {
            'rmses': rmses(A, X, Y),
            'time': t,
            'sparsity': np.count_nonzero(weights) / weights.size,
        }
        return weights, info


# Whitelist the solver so that decoders solved by Lasso can be stored in
# Nengo's decoder cache.
from nengo.cache import Fingerprint

Fingerprint.whitelist(Lasso)
assert Fingerprint.supports(Lasso())


import nengo
from nengo.builder.ensemble import get_activities
from nengo.utils.numpy import rmse, scipy_sparse

LABEL_SEED = 'Seed'
LABEL_SOLVER = 'Solver'
LABEL_RMSE = 'RMSE'
LABEL_SPARSITY = 'Sparsity'
LABEL_TIME = 'Real Time / Simulation Time'

def trial(solver, seed, n_pre=1000, n_post=1000,
          t_sim=2.0, neuron_type=nengo.LIFRate()):

    with nengo.Network(seed=seed) as model:
        u = nengo.Node(output=lambda t: 2*t/t_sim - 1)
        x = nengo.Ensemble(n_pre, 1, neuron_type=neuron_type)
        y = nengo.Ensemble(n_post, 1, neuron_type=neuron_type)

        nengo.Connection(u, x, synapse=None)
        conn = nengo.Connection(x, y, solver=solver)
        p = nengo.Probe(y, synapse=None)

    with nengo.Simulator(model, progress_bar=None) as sim:
        start_time = time.time()
        sim.run(t_sim, progress_bar=None)
        end_time = time.time()
    
    assert conn.solver.weights

    if conn.solver.sparse:
        assert isinstance(sim.data[conn].weights, scipy_sparse.csr_matrix)
    
    # Ideal output: the unit ramp input sampled at the probe times
    p_u = np.linspace(-1, 1, len(sim.trange()))
    
    return {
        LABEL_RMSE: rmse(sim.data[p].squeeze(axis=1), p_u),
        LABEL_SPARSITY: sim.data[conn].solver_info.get('sparsity', None),
        LABEL_TIME: (end_time - start_time) / t_sim,
    }        


from collections import defaultdict

num_trials = 25
lasso_kwargs = {'max_iter': 500}

solvers = [
    #nengo.solvers.LstsqL1(weights=True),
    #nengo.solvers.LstsqDrop(drop=0.25, weights=True),
    #nengo.solvers.LstsqDrop(drop=0.90, weights=True),
    Lasso(reg=1e-0, weights=True, **lasso_kwargs),
    Lasso(reg=1e-1, weights=True, **lasso_kwargs),
    Lasso(reg=1e-2, weights=True, **lasso_kwargs),
    Lasso(reg=1e-3, weights=True, **lasso_kwargs),
    nengo.solvers.LstsqL2(weights=True),
]

data = defaultdict(list)

for seed in range(num_trials):
    for solver in solvers:
        print(seed, solver)
        res = trial(solver, seed=0)  # use same decoders/sparsity but get different sim times
        if res[LABEL_SPARSITY] is None:
            assert not solver.sparse
            S = (0, 0.2)  # hack to create horizontal band
        else:
            assert solver.sparse
            S = (res[LABEL_SPARSITY],)
        for sparsity in S:
            data[LABEL_SOLVER].append(type(solver).__name__)
            data[LABEL_SEED].append(seed)
            data[LABEL_SPARSITY].append(sparsity)
            data[LABEL_RMSE].append(res[LABEL_RMSE])
            data[LABEL_TIME].append(res[LABEL_TIME])


import matplotlib.pyplot as plt
import seaborn as sns
from pandas import DataFrame

df = DataFrame(data)

plt.figure()
sns.lineplot(data=df, x=LABEL_SPARSITY, y=LABEL_RMSE, hue=LABEL_SOLVER)
plt.show()

plt.figure()
sns.lineplot(data=df, x=LABEL_SPARSITY, y=LABEL_TIME, hue=LABEL_SOLVER)
plt.show()

Review comments:

On the `sparse` parameter docstring in the diff:

> This hints to the backend that it should use a sparse matrix representation to implement the transform. Currently only utilized for the `weights=True` case, and kept `Dense` otherwise.
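A hypothetical sketch of what this hint implies on the builder side (not the PR's actual code; the helper name and the choice of csr_matrix are assumptions, the latter matching the test assertion above):

import scipy.sparse


def apply_sparse_hint(solver, weights):
    # Hypothetical helper: honor the solver's sparse hint for full
    # weight matrices; otherwise keep the dense ndarray.
    if getattr(solver, "sparse", False) and solver.weights:
        return scipy.sparse.csr_matrix(weights)
    return weights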
On `pytest.skip("Test requires no Scipy")`:

> This wouldn't have worked anyway, because the import has already happened at this point (during the solver's `__init__`).

On:

    if solver_cls is LstsqL1:
        pytest.importorskip('sklearn')
arvoelke (Contributor, Author): Since sklearn requires scipy, this test doesn't actually do anything in this case. The test only makes sense for LstsqDrop.
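A small runnable check of that dependency chain (hypothetical, for illustration only):

import importlib.util

# sklearn depends on scipy, so any environment that can import sklearn
# can also import scipy; a "no scipy" branch is unreachable for LstsqL1.
has_sklearn = importlib.util.find_spec("sklearn") is not None
has_scipy = importlib.util.find_spec("scipy") is not None
assert not has_sklearn or has_scipy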
