Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request: Scan code for dangerous uses of "auto" #196

Open
KenBirman opened this issue Apr 15, 2021 · 3 comments
Open

Request: Scan code for dangerous uses of "auto" #196

KenBirman opened this issue Apr 15, 2021 · 3 comments

Comments

@KenBirman
Copy link
Contributor

KenBirman commented Apr 15, 2021

I ran into a bug in my own C++ code that got me thinking about whether we might have the same issue anywhere in Derecho or Cascade. I think we should quickly do a visual scan of our code.

The issue: suppose you have a std::vector or some other C++ STL type that offers an indexed getter but has a structure under the covers. Now consider these three lines:

std::vector<int> my_vec {11111, 22222, 33333}; auto item = my_vec[1]; ++item;

Compare with these three:

std::vector<int> my_vec {11111, 22222, 33333}; int item = my_vec[1]; ++item;

At a glance you would expect them to be the same. In the "auto" case you might assume that item is going to be of type int and that this code is the same as if you explicitly declared it to be an int. But in fact you would be wrong! The indexed getter is a const method that returns an lvalue reference, template-expanded by the std::vector class, and this can be constexpr evaluated statically. So, at compile time, the auto-declared item becomes a reference to the underlying element of the original vector. When you increment item, the contents of my_vec[1] will change, even though the code is written as if item was a separate copy of that vector element!

This bug seems to be fundamental to the way that C++ 17 is aggressively auto-expanding and constexpr-evaluating code these days. When you think about it, the approach makes sense -- but unless someone anticipated this when building the STL and used the && postfix operator to explicitly make this index operator an rvalue-only type, we won't get an error and we will get the wrong behavior.

So, I think we should code-review Derecho and Cascade to ensure that we don't have any errors of this form in our code.

@mpmilano
Copy link
Member

I need to point out here that this isn't actually the case.

Expanding Ken's code example a bit:

#include <vector>
#include <iostream>

int main(){

    std::vector<int> my_vec {11111, 22222, 33333};
    auto item = my_vec[1];
    ++item;

    for (const auto& item : my_vec){
	std::cout << item << std::endl;
    }
}

You'll find that this prints:
11111
22222
33333

The rules governing auto are found here and haven't changed significantly since C++11. A bare auto (as opposed to auto& or auto []) will never deduce a reference.

So then where did the mysterious behavior Ken was originally bitten by come from? I can't be sure, but there are three cases where auto might behave differently than you expected:

  • The type deduced by auto was an unexpected non-reference with implicit conversion operator to the expected type. For example, std::reference_wrapper<T> is an object (not a reference), which behaves like a reference, and converts to a T on demand when explicitly requested. auto will also never call a conversion/coercion operator, so it wouldn't generate that T.
  • A special case of the above: the container or object for which auto is deducing a type has been specialized, returning something very odd rather than the expected. An example of this is vector<bool>, whose operator[] returns a std::vector<bool>::reference, not a bool&. This std::vector<bool>::reference turns out to be a proxy object, and follows the same rules as the above case.
  • An even more specialized case of the above -- auto with structured binding. aka auto [x,y] = std::pair{3,4}. I don't recommend ever using structured-binding-auto without using a reference, as it's free to deduce a reference use in weird cases.

Further reading: https://blog.petrzemek.net/2017/12/08/when-auto-seemingly-deduces-a-reference-in-cpp/

@KenBirman
Copy link
Contributor Author

Matthew, you are exactly correct -- I tried to strip my actual example down and didn't in fact test that the my simple version would malfunction! On the other hand, it did arise with a std::vector, and had the "effect" I described.

Is there any kind of quick check we can do to confirm that we don't have that sort of issue in our code base? The bug was pretty annoying to me when it occurred. Had I not used auto, it wouldn't have happened.

@mpmilano
Copy link
Member

Frustratingly there's nothing "quick" that would help here. The fastest check I can imagine is pretty similar to the one you're already proposing:

  • audit any uses of destructuring bind (auto [x...]), and just make sure they're all explicitly using refs or const refs. There's probably gonna be, like, two uses of this construct in the entire codebase
  • audit uses of non-ref auto to make sure you now what category of object the RHS of the binder is returning (ref vs. value vs. wrapper)

This is really just an instance of "surprise copy constructor" not doing what you want, which is a source of tons of C++ bugs overall... it's frustrating.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants