[BUG] param_shift with broadcast=True does not work properly with shot vectors #5598

Open
dwierichs opened this issue Apr 30, 2024 · 0 comments · May be fixed by #5667
Labels: bug 🐛 Something isn't working
@dwierichs
Contributor

Expected behavior

The example below should work.

Actual behavior

It raises a shape-mismatch error.

Additional information

The example works if len(shots) == 2, which is not a good sign: it means we are wrongly contracting results across the shots axis, which only succeeds as long as that axis happens to have the same length as the shift-rule coefficients.
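To illustrate the suspected mechanism, here is a minimal NumPy sketch (not PennyLane internals; the shapes are assumptions inferred from the traceback below). A two-term shift rule yields two coefficients, so contracting the shots axis against the coefficients only succeeds, wrongly, when len(shots) == 2:

import numpy as np

coeffs = np.array([0.5, -0.5])  # two-term shift rule -> two coefficients

# Broadcasted execution results with one row per shot-vector component:
res_ok = np.random.rand(2, 2)   # len(shots) == 2: shapes happen to match
res_bad = np.random.rand(3, 2)  # len(shots) == 3, as in the example below

# Contracting over axis 0 (the shots axis) instead of the broadcast axis:
np.tensordot(res_ok, coeffs, axes=[[0], [0]])   # runs, but sums over shots (wrong)
np.tensordot(res_bad, coeffs, axes=[[0], [0]])  # ValueError: shape-mismatch for sum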

Source code

import pennylane as qml
from pennylane import numpy as pnp

param = pnp.array(0.5, requires_grad=True)
shots = (10, 100, 10)
dev = qml.device("default.qubit", wires=2, shots=shots)

@qml.qnode(dev)
def circuit(param):
    qml.RX(param, wires=0)
    return qml.expval(qml.Z(0))

qml.gradients.param_shift(circuit, broadcast=True)(param)
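For reference, the same gradient can presumably be computed via the default non-broadcast path (an assumption; this issue only reports the broadcast=True path as broken):

# Workaround sketch: use the non-broadcast code path (the default).
qml.gradients.param_shift(circuit, broadcast=False)(param)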

Tracebacks

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[36], line 11
      8     qml.RX(param, wires=0)
      9     return qml.expval(qml.Z(0))
---> 11 qml.gradients.param_shift(circuit, broadcast=True)(param)

File ~/repos/pennylane/pennylane/workflow/qnode.py:1088, in QNode.__call__(self, *args, **kwargs)
   1085 self._update_gradient_fn(shots=override_shots, tape=self._tape)
   1087 try:
-> 1088     res = self._execution_component(args, kwargs, override_shots=override_shots)
   1089 finally:
   1090     if old_interface == "auto":

File ~/repos/pennylane/pennylane/workflow/qnode.py:1042, in QNode._execution_component(self, args, kwargs, override_shots)
   1039 full_transform_program.prune_dynamic_transform()
   1041 # pylint: disable=unexpected-keyword-arg
-> 1042 res = qml.execute(
   1043     (self._tape,),
   1044     device=self.device,
   1045     gradient_fn=self.gradient_fn,
   1046     interface=self.interface,
   1047     transform_program=full_transform_program,
   1048     config=config,
   1049     gradient_kwargs=self.gradient_kwargs,
   1050     override_shots=override_shots,
   1051     **self.execute_kwargs,
   1052 )
   1053 res = res[0]
   1055 # convert result to the interface in case the qfunc has no parameters

File ~/repos/pennylane/pennylane/workflow/execution.py:798, in execute(tapes, device, gradient_fn, interface, transform_program, config, grad_on_execution, gradient_kwargs, cache, cachesize, max_diff, override_shots, expand_fn, max_expansion, device_batch_transform, device_vjp)
    793 else:
    794     results = ml_boundary_execute(
    795         tapes, device, execute_fn, gradient_fn, gradient_kwargs, _n=1, max_diff=max_diff
    796     )
--> 798 return post_processing(results)

File ~/repos/pennylane/pennylane/transforms/core/transform_program.py:88, in _apply_postprocessing_stack(results, postprocessing_stack)
     65 """Applies the postprocessing and cotransform postprocessing functions in a Last-In-First-Out LIFO manner.
     66 
     67 Args:
   (...)
     85 
     86 """
     87 for postprocessing in reversed(postprocessing_stack):
---> 88     results = postprocessing(results)
     89 return results

File ~/repos/pennylane/pennylane/transforms/core/transform_program.py:58, in _batch_postprocessing(results, individual_fns, slices)
     32 def _batch_postprocessing(
     33     results: ResultBatch, individual_fns: List[PostProcessingFn], slices: List[slice]
     34 ) -> ResultBatch:
     35     """Broadcast individual post processing functions onto their respective tapes.
     36 
     37     Args:
   (...)
     56 
     57     """
---> 58     return tuple(fn(results[sl]) for fn, sl in zip(individual_fns, slices))

File ~/repos/pennylane/pennylane/transforms/core/transform_program.py:58, in <genexpr>(.0)
     32 def _batch_postprocessing(
     33     results: ResultBatch, individual_fns: List[PostProcessingFn], slices: List[slice]
     34 ) -> ResultBatch:
     35     """Broadcast individual post processing functions onto their respective tapes.
     36 
     37     Args:
   (...)
     56 
     57     """
---> 58     return tuple(fn(results[sl]) for fn, sl in zip(individual_fns, slices))

File ~/repos/pennylane/pennylane/gradients/parameter_shift.py:433, in expval_param_shift.<locals>.processing_fn(results)
    430     res = results[start : start + num_tapes] if batch_size is None else results[start]
    431     start = start + num_tapes
--> 433     g = _evaluate_gradient(tape, res, data, r0)
    434     grads.append(g)
    436 # g will have been defined at least once (because otherwise all gradients would have
    437 # been zero), providing a representative for a zero gradient to emulate its type/shape.

File ~/repos/pennylane/pennylane/gradients/parameter_shift.py:213, in _evaluate_gradient(tape, res, data, r0)
    211 for i in range(len_shot_vec):
    212     shot_comp_res = [r[i] for r in res]
--> 213     shot_comp_res = _single_meas_grad(shot_comp_res, coeffs, unshifted_coeff, r0[i])
    214     g.append(shot_comp_res)
    215 return tuple(g)

File ~/repos/pennylane/pennylane/gradients/parameter_shift.py:164, in _single_meas_grad(result, coeffs, unshifted_coeff, r0)
    162 result = qml.math.stack(result)
    163 coeffs = qml.math.convert_like(coeffs, result)
--> 164 g = qml.math.tensordot(result, coeffs, [[0], [0]])
    165 if unshifted_coeff is not None:
    166     # add the unshifted term
    167     g = g + unshifted_coeff * r0

File ~/repos/pennylane/pennylane/math/multi_dispatch.py:151, in multi_dispatch.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    148 interface = interface or get_interface(*dispatch_args)
    149 kwargs["like"] = interface
--> 151 return fn(*args, **kwargs)

File ~/repos/pennylane/pennylane/math/multi_dispatch.py:398, in tensordot(tensor1, tensor2, axes, like)
    372 """Returns the tensor product of two tensors.
    373 In general ``axes`` specifies either the set of axes for both
    374 tensors that are contracted (with the first/second entry of ``axes``
   (...)
    395     tensor_like: the tensor product of the two input tensors
    396 """
    397 tensor1, tensor2 = np.coerce([tensor1, tensor2], like=like)
--> 398 return np.tensordot(tensor1, tensor2, axes=axes, like=like)

File ~/venvs/dev/lib/python3.10/site-packages/autoray/autoray.py:80, in do(fn, like, *args, **kwargs)
     31 """Do function named ``fn`` on ``(*args, **kwargs)``, peforming single
     32 dispatch to retrieve ``fn`` based on whichever library defines the class of
     33 the ``args[0]``, or the ``like`` keyword argument if specified.
   (...)
     77     <tf.Tensor: id=91, shape=(3, 3), dtype=float32>
     78 """
     79 backend = choose_backend(fn, *args, like=like, **kwargs)
---> 80 return get_lib_fn(backend, fn)(*args, **kwargs)

File ~/venvs/dev/lib/python3.10/site-packages/numpy/core/numeric.py:1099, in tensordot(a, b, axes)
   1097             axes_b[k] += ndb
   1098 if not equal:
-> 1099     raise ValueError("shape-mismatch for sum")
   1101 # Move the axes to sum over to the end of "a"
   1102 # and to the front of "b"
   1103 notin = [k for k in range(nda) if k not in axes_a]

ValueError: shape-mismatch for sum

System information

pl dev

Existing GitHub issues

  • I have searched existing GitHub issues to make sure the issue does not already exist.
@dwierichs dwierichs added the bug 🐛 Something isn't working label Apr 30, 2024
@dwierichs dwierichs self-assigned this Apr 30, 2024
dwierichs added a commit that referenced this issue May 1, 2024
**Context:**
Shot vectors together with the `broadcast=True` feature of `param_shift`
suffer from bug #5598.
For some shot-vector lengths, this bug causes silently wrong results; for
other lengths, it raises an incomprehensible error.

**Description of the Change:**
This PR explicitly disallows this combination and raises a
`NotImplementedError`.

A proper fix is in the works.

**Benefits:**
No silently wrong results, and a more comprehensible error message.

**Possible Drawbacks:**

**Related GitHub Issues:**
#5598

Co-authored-by: Astral Cai <astral.cai@xanadu.ai>
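A guard of the kind this PR describes might look roughly as follows. This is a sketch: Shots.has_partitioned_shots is a real property of PennyLane's Shots container, but the exact placement and message inside param_shift are assumptions.

# Hypothetical guard inside param_shift (placement and wording assumed):
if broadcast and tape.shots.has_partitioned_shots:
    raise NotImplementedError(
        "param_shift with broadcast=True does not yet support shot vectors."
    )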
dwierichs added a commit that referenced this issue May 24, 2024
**Context:**
`param_shift` uses the internal method `_evaluate_gradient`, which
mostly consists of logic that contracts tape execution results with the
parameter-shift rule coefficients and maps this contraction over tuple
axes. It also needs to respect batching of execution results if
`broadcast=True` is used in `param_shift`.

**Description of the Change:**
This PR cleans up `_evaluate_gradient`, extends it to multi-measurement
and shot vector scenarios when broadcasting is used, and recycles helper
methods from `gradient_transform.py` to reduce the code.
We also add unit tests for this method, allowing us to reduce the
integration test count in the future.

**Benefits:**
Prepare `_evaluate_gradient` for multi-measurement and shot vector
support with `broadcast=True`.
Improve testing and code quality.

**Possible Drawbacks:**

**Related GitHub Issues:**
prepares a bug fix for #5598 

[sc-62283]

---------

Co-authored-by: Mudit Pandey <mudit.pandey@xanadu.ai>
Co-authored-by: Astral Cai <astral.cai@xanadu.ai>
Co-authored-by: Vincent Michaud-Rioux <vincentm@nanoacademic.com>
Co-authored-by: lillian542 <38584660+lillian542@users.noreply.github.com>
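The intended per-shot-component contraction could be sketched as follows (illustrative NumPy code with assumed shapes, not the actual _evaluate_gradient implementation):

import numpy as np

def contract_per_shot_component(res, coeffs):
    """Contract the broadcast axis of each shot-vector component with the
    shift-rule coefficients, leaving the shots axis untouched."""
    # res: one array of shape (batch_size,) per shot-vector component
    return tuple(np.tensordot(r, coeffs, axes=[[0], [0]]) for r in res)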