[BUG] param_shift with broadcast=True does not work properly with shot vectors #5598

Open
dwierichs opened this issue Apr 30, 2024 · 0 comments · May be fixed by #5667
Labels: bug 🐛 Something isn't working
@dwierichs
Contributor

Expected behavior

The example below should work.

Actual behavior

It raises a shape-mismatch error.

Additional information

The example works if len(shots) == 2, which is not a good sign: it means we are wrongly contracting results across the shots axis, which only succeeds as long as that axis happens to have the same length as the shift-rule coefficients.
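To illustrate the suspected mechanism, here is a minimal NumPy sketch (not PennyLane internals; the shapes are assumptions inferred from the traceback below). A two-term shift rule yields two coefficients, so contracting the shots axis against the coefficients only succeeds, wrongly, when len(shots) == 2:

import numpy as np

coeffs = np.array([0.5, -0.5])  # two-term shift rule -> two coefficients

# Broadcasted execution results with one row per shot-vector component:
res_ok = np.random.rand(2, 2)   # len(shots) == 2: shapes happen to match
res_bad = np.random.rand(3, 2)  # len(shots) == 3, as in the example below

# Contracting over axis 0 (the shots axis) instead of the broadcast axis:
np.tensordot(res_ok, coeffs, axes=[[0], [0]])   # runs, but sums over shots (wrong)
np.tensordot(res_bad, coeffs, axes=[[0], [0]])  # ValueError: shape-mismatch for sum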

Source code

import pennylane as qml
from pennylane import numpy as pnp

param = pnp.array(0.5, requires_grad=True)
shots = (10, 100, 10)
dev = qml.device("default.qubit", wires=2, shots=shots)

@qml.qnode(dev)
def circuit(param):
    qml.RX(param, wires=0)
    return qml.expval(qml.Z(0))

qml.gradients.param_shift(circuit, broadcast=True)(param)
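For reference, the same gradient can presumably be computed via the default non-broadcast path (an assumption; this issue only reports the broadcast=True path as broken):

# Workaround sketch: use the non-broadcast code path (the default).
qml.gradients.param_shift(circuit, broadcast=False)(param)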

Tracebacks

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[36], line 11
      8     qml.RX(param, wires=0)
      9     return qml.expval(qml.Z(0))
---> 11 qml.gradients.param_shift(circuit, broadcast=True)(param)

File ~/repos/pennylane/pennylane/workflow/qnode.py:1088, in QNode.__call__(self, *args, **kwargs)
   1085 self._update_gradient_fn(shots=override_shots, tape=self._tape)
   1087 try:
-> 1088     res = self._execution_component(args, kwargs, override_shots=override_shots)
   1089 finally:
   1090     if old_interface == "auto":

File ~/repos/pennylane/pennylane/workflow/qnode.py:1042, in QNode._execution_component(self, args, kwargs, override_shots)
   1039 full_transform_program.prune_dynamic_transform()
   1041 # pylint: disable=unexpected-keyword-arg
-> 1042 res = qml.execute(
   1043     (self._tape,),
   1044     device=self.device,
   1045     gradient_fn=self.gradient_fn,
   1046     interface=self.interface,
   1047     transform_program=full_transform_program,
   1048     config=config,
   1049     gradient_kwargs=self.gradient_kwargs,
   1050     override_shots=override_shots,
   1051     **self.execute_kwargs,
   1052 )
   1053 res = res[0]
   1055 # convert result to the interface in case the qfunc has no parameters

File ~/repos/pennylane/pennylane/workflow/execution.py:798, in execute(tapes, device, gradient_fn, interface, transform_program, config, grad_on_execution, gradient_kwargs, cache, cachesize, max_diff, override_shots, expand_fn, max_expansion, device_batch_transform, device_vjp)
    793 else:
    794     results = ml_boundary_execute(
    795         tapes, device, execute_fn, gradient_fn, gradient_kwargs, _n=1, max_diff=max_diff
    796     )
--> 798 return post_processing(results)

File ~/repos/pennylane/pennylane/transforms/core/transform_program.py:88, in _apply_postprocessing_stack(results, postprocessing_stack)
     65 """Applies the postprocessing and cotransform postprocessing functions in a Last-In-First-Out LIFO manner.
     66 
     67 Args:
   (...)
     85 
     86 """
     87 for postprocessing in reversed(postprocessing_stack):
---> 88     results = postprocessing(results)
     89 return results

File ~/repos/pennylane/pennylane/transforms/core/transform_program.py:58, in _batch_postprocessing(results, individual_fns, slices)
     32 def _batch_postprocessing(
     33     results: ResultBatch, individual_fns: List[PostProcessingFn], slices: List[slice]
     34 ) -> ResultBatch:
     35     """Broadcast individual post processing functions onto their respective tapes.
     36 
     37     Args:
   (...)
     56 
     57     """
---> 58     return tuple(fn(results[sl]) for fn, sl in zip(individual_fns, slices))

File ~/repos/pennylane/pennylane/transforms/core/transform_program.py:58, in <genexpr>(.0)
     32 def _batch_postprocessing(
     33     results: ResultBatch, individual_fns: List[PostProcessingFn], slices: List[slice]
     34 ) -> ResultBatch:
     35     """Broadcast individual post processing functions onto their respective tapes.
     36 
     37     Args:
   (...)
     56 
     57     """
---> 58     return tuple(fn(results[sl]) for fn, sl in zip(individual_fns, slices))

File ~/repos/pennylane/pennylane/gradients/parameter_shift.py:433, in expval_param_shift.<locals>.processing_fn(results)
    430     res = results[start : start + num_tapes] if batch_size is None else results[start]
    431     start = start + num_tapes
--> 433     g = _evaluate_gradient(tape, res, data, r0)
    434     grads.append(g)
    436 # g will have been defined at least once (because otherwise all gradients would have
    437 # been zero), providing a representative for a zero gradient to emulate its type/shape.

File ~/repos/pennylane/pennylane/gradients/parameter_shift.py:213, in _evaluate_gradient(tape, res, data, r0)
    211 for i in range(len_shot_vec):
    212     shot_comp_res = [r[i] for r in res]
--> 213     shot_comp_res = _single_meas_grad(shot_comp_res, coeffs, unshifted_coeff, r0[i])
    214     g.append(shot_comp_res)
    215 return tuple(g)

File ~/repos/pennylane/pennylane/gradients/parameter_shift.py:164, in _single_meas_grad(result, coeffs, unshifted_coeff, r0)
    162 result = qml.math.stack(result)
    163 coeffs = qml.math.convert_like(coeffs, result)
--> 164 g = qml.math.tensordot(result, coeffs, [[0], [0]])
    165 if unshifted_coeff is not None:
    166     # add the unshifted term
    167     g = g + unshifted_coeff * r0

File ~/repos/pennylane/pennylane/math/multi_dispatch.py:151, in multi_dispatch.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    148 interface = interface or get_interface(*dispatch_args)
    149 kwargs["like"] = interface
--> 151 return fn(*args, **kwargs)

File ~/repos/pennylane/pennylane/math/multi_dispatch.py:398, in tensordot(tensor1, tensor2, axes, like)
    372 """Returns the tensor product of two tensors.
    373 In general ``axes`` specifies either the set of axes for both
    374 tensors that are contracted (with the first/second entry of ``axes``
   (...)
    395     tensor_like: the tensor product of the two input tensors
    396 """
    397 tensor1, tensor2 = np.coerce([tensor1, tensor2], like=like)
--> 398 return np.tensordot(tensor1, tensor2, axes=axes, like=like)

File ~/venvs/dev/lib/python3.10/site-packages/autoray/autoray.py:80, in do(fn, like, *args, **kwargs)
     31 """Do function named ``fn`` on ``(*args, **kwargs)``, peforming single
     32 dispatch to retrieve ``fn`` based on whichever library defines the class of
     33 the ``args[0]``, or the ``like`` keyword argument if specified.
   (...)
     77     <tf.Tensor: id=91, shape=(3, 3), dtype=float32>
     78 """
     79 backend = choose_backend(fn, *args, like=like, **kwargs)
---> 80 return get_lib_fn(backend, fn)(*args, **kwargs)

File ~/venvs/dev/lib/python3.10/site-packages/numpy/core/numeric.py:1099, in tensordot(a, b, axes)
   1097             axes_b[k] += ndb
   1098 if not equal:
-> 1099     raise ValueError("shape-mismatch for sum")
   1101 # Move the axes to sum over to the end of "a"
   1102 # and to the front of "b"
   1103 notin = [k for k in range(nda) if k not in axes_a]

ValueError: shape-mismatch for sum

System information

pl dev

Existing GitHub issues

  • I have searched existing GitHub issues to make sure the issue does not already exist.
@dwierichs dwierichs added the bug 🐛 Something isn't working label Apr 30, 2024
@dwierichs dwierichs self-assigned this Apr 30, 2024
dwierichs added a commit that referenced this issue May 1, 2024
**Context:**
Shot vectors together with the `broadcast=True` feature of `param_shift`
suffer from bug #5598.
For some shot-vector lengths, this bug causes silently wrong results; for
other lengths, it raises an incomprehensible error.

**Description of the Change:**
This PR explicitly disallows this combination and raises a
`NotImplementedError`.

A proper fix is in the works.

**Benefits:**
No silently wrong results, and a more comprehensible error message.

**Possible Drawbacks:**

**Related GitHub Issues:**
#5598

Co-authored-by: Astral Cai <astral.cai@xanadu.ai>
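A guard of the kind this PR describes might look roughly as follows. This is a sketch: Shots.has_partitioned_shots is a real property of PennyLane's Shots container, but the exact placement and message inside param_shift are assumptions.

# Hypothetical guard inside param_shift (placement and wording assumed):
if broadcast and tape.shots.has_partitioned_shots:
    raise NotImplementedError(
        "param_shift with broadcast=True does not yet support shot vectors."
    )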
dwierichs added a commit that referenced this issue May 24, 2024
**Context:**
`param_shift` uses the internal method `_evaluate_gradient`, which
mostly consists of logic that contracts tape execution results with the
parameter-shift rule coefficients and maps this contraction over tuple
axes. It also needs to respect batching of execution results if
`broadcast=True` is used in `param_shift`.

**Description of the Change:**
This PR cleans up `_evaluate_gradient`, extends it to multi-measurement
and shot vector scenarios when broadcasting is used, and recycles helper
methods from `gradient_transform.py` to reduce the code.
We also add unit tests for this method, allowing us to reduce the
integration test count in the future.

**Benefits:**
Prepare `_evaluate_gradient` for multi-measurement and shot vector
support with `broadcast=True`.
Improve testing and code quality.

**Possible Drawbacks:**

**Related GitHub Issues:**
prepares a bug fix for #5598 

[sc-62283]

---------

Co-authored-by: Mudit Pandey <mudit.pandey@xanadu.ai>
Co-authored-by: Astral Cai <astral.cai@xanadu.ai>
Co-authored-by: Vincent Michaud-Rioux <vincentm@nanoacademic.com>
Co-authored-by: lillian542 <38584660+lillian542@users.noreply.github.com>
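The intended per-shot-component contraction could be sketched as follows (illustrative NumPy code with assumed shapes, not the actual _evaluate_gradient implementation):

import numpy as np

def contract_per_shot_component(res, coeffs):
    """Contract the broadcast axis of each shot-vector component with the
    shift-rule coefficients, leaving the shots axis untouched."""
    # res: one array of shape (batch_size,) per shot-vector component
    return tuple(np.tensordot(r, coeffs, axes=[[0], [0]]) for r in res)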