[CPU] Bump test_complex_2d thresholds for LBFGS on complex64 #126358

Closed
wants to merge 1 commit into from

@eqy eqy commented May 16, 2024

Is this supposed to be bitwise identical? Wasn't sure how to interpret the comment but it seems to be giving mismatches like:

Mismatched elements: 1 / 2 (50.0%)
Greatest absolute difference: 4.6372413635253906e-05 at index (1,) (up to 1e-05 allowed)
Greatest relative difference: 3.4600801882334054e-05 at index (1,) (up to 1.3e-06 allowed)

To execute this test, run the following from the base repo dir:
     python test/test_optim.py -k test_complex_2d_LBFGS_cpu_complex64

on Neoverse-N2 SBSA ARM CPUs.

cc @vincentqb @jbschlosser @albanD @janeyx99 @crcrpar @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @ezyang @anjali411 @dylanbespalko @mruberry @lezcano @nikitaved @amjames

@eqy eqy requested a review from janeyx99 May 16, 2024 00:18
pytorch-bot bot commented May 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126358

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 3900c22 with merge base 636e799 (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@eqy eqy added open source module: optimizer Related to torch.optim module: cpu CPU specific problem (e.g., perf, algorithm) module: complex Related to complex number support in PyTorch topic: not user facing topic category labels May 16, 2024

lezcano commented May 16, 2024

I reckon that one vectorises and the other one doesn't, hence the error, but would be good to confirm it.
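For context, here is a minimal sketch of why a vectorised and a scalar reduction can disagree: float32 addition is not associative, so accumulating in lanes changes where rounding happens. This is a pure-Python emulation with hypothetical helper names (`f32`, `sum_sequential`, `sum_lanes`), not the actual LBFGS or CPU kernel code path, and the inputs are contrived to make the effect large.

```python
import struct


def f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_sequential(values):
    """Scalar-style reduction: one accumulator, left to right."""
    acc = 0.0
    for v in values:
        acc = f32(acc + f32(v))
    return acc


def sum_lanes(values, lanes):
    """SIMD-style reduction: one accumulator per lane, combined at the end."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] = f32(partial[i % lanes] + f32(v))
    return sum_sequential(partial)


# Contrived inputs: 1.0 is below float32 resolution next to 1e8.
values = [1e8, 1.0, -1e8, 1.0]
scalar_result = sum_sequential(values)
vector_result = sum_lanes(values, lanes=2)
print(scalar_result, vector_result)
```

The two orderings give different float32 results for the same inputs. Real kernels differ by far less per operation, but the mechanism is the same.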

Comment on lines +1462 to +1481
rtol=4.5e-5,
atol=5e-5,

What is the default tolerance for complex64? Sounds like it should be at that level already?

Oh right! Sorry, I confused myself with halving the bit length. This is expected to match fp32.
The general rule here is that for a single op, we expect these tolerances to hold. If you test a bunch of ops chained one after the other, then yes, we might have to increase the tolerance.
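As a rough sanity check on the numbers in this PR, the sketch below applies the usual closeness criterion `|actual - expected| <= atol + rtol * |expected|` to the reported mismatch. The helper `is_close` is hypothetical, and the magnitude of the expected value is an assumption: it is inferred from the ratio of the reported absolute and relative differences, which the log places at the same index.

```python
def is_close(abs_diff, expected_mag, rtol, atol):
    """Closeness criterion: |actual - expected| <= atol + rtol * |expected|."""
    return abs_diff <= atol + rtol * expected_mag


# Values copied from the failure message in the PR description.
abs_diff = 4.6372413635253906e-05
rel_diff = 3.4600801882334054e-05

# Assumption: both differences refer to the same element, so
# |expected| is approximately abs_diff / rel_diff.
expected_mag = abs_diff / rel_diff

# Tolerances reported as "allowed" in the failure message.
default_ok = is_close(abs_diff, expected_mag, rtol=1.3e-6, atol=1e-5)
# Tolerances proposed in this PR.
bumped_ok = is_close(abs_diff, expected_mag, rtol=4.5e-5, atol=5e-5)

print(f"|expected| ~ {expected_mag:.3f}")
print(f"passes with default tolerances: {default_ok}")
print(f"passes with bumped tolerances:  {bumped_ok}")
```

Under that assumption, the mismatch exceeds the allowed tolerances and falls within the bumped ones, which is consistent with loosening the thresholds for a test that chains many ops.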

eqy commented May 22, 2024

@pytorchmergebot rebase

@pytorchmergebot

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot

Successfully rebased eqy-patch-3 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout eqy-patch-3 && git pull --rebase)

@janeyx99

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 22, 2024
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot

Merge failed

Reason: Not merging any PRs at the moment because there is a merge blocking https://github.com/pytorch/pytorch/labels/ci:%20sev issue open at:
#126896

Details for Dev Infra team Raised by workflow job

eqy commented May 23, 2024

@pytorchmergebot merge

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


Labels
ciflow/trunk Trigger trunk jobs on your pull request Merged module: complex Related to complex number support in PyTorch module: cpu CPU specific problem (e.g., perf, algorithm) module: optimizer Related to torch.optim open source topic: not user facing topic category

6 participants