-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed #27452 -- Added serial fields to contrib.postgres. #18123
base: main
Are you sure you want to change the base?
Conversation
65d6140
to
fd0e9b4
Compare
e892da4
to
97cbc23
Compare
fccac26
to
2b48193
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a great start!
In addition to the specific comments I added, we'll need some migrations tests to cover adding/removing serial fields and converting between serial and auto fields.
tests/schema/tests.py
Outdated
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_serial_to_integer(self): | ||
from django.contrib.postgres.fields import SerialField | ||
|
||
self.serial_to_integer_test(SerialField, IntegerField) | ||
|
||
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_serial_to_small_integer(self): | ||
from django.contrib.postgres.fields import SerialField | ||
|
||
self.serial_to_integer_test(SerialField, SmallIntegerField) | ||
|
||
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_serial_to_big_integer(self): | ||
from django.contrib.postgres.fields import SerialField | ||
|
||
self.serial_to_integer_test(SerialField, BigIntegerField) | ||
|
||
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_small_serial_to_integer(self): | ||
from django.contrib.postgres.fields import SmallSerialField | ||
|
||
self.serial_to_integer_test(SmallSerialField, IntegerField) | ||
|
||
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_small_serial_to_small_integer(self): | ||
from django.contrib.postgres.fields import SmallSerialField | ||
|
||
self.serial_to_integer_test(SmallSerialField, SmallIntegerField) | ||
|
||
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_small_serial_to_big_integer(self): | ||
from django.contrib.postgres.fields import SmallSerialField | ||
|
||
self.serial_to_integer_test(SmallSerialField, BigIntegerField) | ||
|
||
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_big_serial_to_big_integer(self): | ||
from django.contrib.postgres.fields import BigSerialField | ||
|
||
self.serial_to_integer_test(BigSerialField, BigIntegerField) | ||
|
||
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_big_serial_to_integer(self): | ||
from django.contrib.postgres.fields import BigSerialField | ||
|
||
self.serial_to_integer_test(BigSerialField, IntegerField) | ||
|
||
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific") | ||
@isolate_apps("schema") | ||
@tag("serial") | ||
def test_big_serial_to_small_integer(self): | ||
from django.contrib.postgres.fields import BigSerialField | ||
|
||
self.serial_to_integer_test(BigSerialField, SmallIntegerField) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like a good use case for subTest
to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the issue is we would either need to generate the test models with a dynamic name, or clean up the test models after the subtests - IMO it's more trouble than worth it.
e7fdd21
to
8052e52
Compare
@LilyFoote , thanks for the review! Could you point me to some tests I could copy to get started? I'm not sure how to test the migrations other than with the |
8052e52
to
42c53c4
Compare
For database defaults I added tests in tests/migrations/test_autodetector.py and tests/migrations/test_operations.py. |
569c960
to
0ebc30b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the updates @csirmazbendeguz
I think there are a few tests/asserts we need to add 👍
Thanks @sarahboyce , fair enough, I made some adjustments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't really read many of the tests yet; not much has caught my eye that isn't already being discussed.
def _check_default(self): | ||
if self.default is NOT_PROVIDED: | ||
return [] | ||
return [ | ||
checks.Error( | ||
f"{self.__class__.__name__} does not accept default values.", | ||
obj=self, | ||
id="fields.E014", | ||
), | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I would expect most users to operate this way; it feels overly prescriptive to me.
Suppose for some reason I only wanted a subset of objects to be part of the sequence, and the rest to default to 0, I might want to set default
on the field and pass DatabaseDefault()
only when I want to use the sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @InvalidInterrupt !
I don't think this is an issue, it's possible to create an object without using the sequence:
class Dummy(Model):
serial = SerialField()
Dummy.objects.create(serial=0).serial # 0
Dummy.objects.create(serial=0).serial # 0
Dummy.objects.create(serial=0).serial # 0
Dummy.objects.create().serial # 1
Dummy.objects.create().serial # 2
Dummy.objects.create().serial # 3
Dummy.objects.create(serial=1).serial # 1
Dummy.objects.create().serial # 4
Is this what you're suggesting?
class Dummy(Model):
serial = SerialField(default=0)
Dummy.objects.create().serial # 0
Dummy.objects.create().serial # 0
Dummy.objects.create().serial # 0
Dummy.objects.create(serial=DatabaseDefault()).serial # 1
Dummy.objects.create(serial=DatabaseDefault()).serial # 2
Dummy.objects.create(serial=DatabaseDefault()).serial # 3
Dummy.objects.create(serial=1).serial # 1
Dummy.objects.create(serial=DatabaseDefault()).serial # 4
I think the first one is better because it's consistent with how AutoField
works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been testing on a little project, if we add a SerialField
to an existing model I cannot make it nullable or assign a default and so I have to give a temporary default in order to create a migration and add the field.
However, whatever I give here is ignored and the field is populated everywhere with as a serial starting from 1
I feel like it shouldn't prompt us in that case (forcing my input and then ignoring it to me is a bug).
Also considering this, a default option might be nice 🤔 (cc @LilyFoote)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see 🤔 Thanks for catching that @sarahboyce ! I tested it too but I didn't notice the the input was ignored. 😱 I'll investigate.
Yes, default
makes sense in the context of migrations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I think the minimal change feature-wise would be to adapt the migration generation code to know that SerialField
can be created without an explicit default.
For a user-chosen default my question would be "why a SerialField
and not an IntegerField
?".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining @InvalidInterrupt !
I think the only place this could be useful is in migrations. So when adding a serial field, it could be pre-filled with a default number.
I would find this API confusing otherwise, users may confuse it with the start
parameter for example.
I don't think it's too restrictive since users can still opt out of using the sequence by providing an explicit value.
That said, it's always easier to remove a restriction than to add one.
In my opinion, the use-case of migrations is an edge-case and it can be solved by first creating an IntegerField
then changing it to a SerialField
, so it wouldn't justify enabling this.
If someone can provide a good use-case this restriction could be removed in the future, but for now I have to disagree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My description of the features potential use may have been confusing; but using default
in this manner would be just like (every?) other field type, so I'm sceptical it would be confused for something like start
outside of the context of this thread. In regards to API design, as it stands SerialField
will be the exception. Eliminating that exception would probably also fix the migrations without adding a kludge to the migration code for this.
I've just finished mocking up the removal of this restriction to confirm that it works as expected and doesn't break anything else. I'm going to finish cleaning up the code and then push it to my fork for reference as a PoC (if anyone is curious: https://github.com/InvalidInterrupt/django/tree/ticket_27452).
That being said, I don't think this is a big issue; and it shouldn't hold up this PR. If I'm not persuading anyone that we should remove this guardrail and trust the Django users, then we should just move along for now (Sarah's migration issue notwithstanding).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, even AutoField
allows you to set a default
, even though that means that you'll only be able to save a single instance of the model without specifying an explicit value for the PK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked and AutoField
allows a default
too, it will cause integrity errors of course but it allows it. So yes you are raising a valid point, this would be an exception, thank you for pointing that out. It may be worth enabling for consistency's sake. @LilyFoote , @sarahboyce what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, there's a problem with migrating a SerialField
with a default
value. It's impossible to do this in Postgres, since serial
already implies the field has a default sequence. As far as I can tell, there's no trivial way to add a serial
field to a table with an initial default value other than the sequence.
So I think the simplest option would be to disable default
after all. Or to not use the built-in serial
type when creating a field.
Thanks for the review @InvalidInterrupt , it's much appreciated! 🙏 |
@@ -75,6 +76,7 @@ def handle_inspection(self, options): | |||
"field names." | |||
) | |||
yield "from %s import models" % self.db_module | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@csirmazbendeguz these are minor clean up changes that will happen on the PR either by you or by a merger
- we want to limit the changed lines to keep the diff really tight to what's important, occasionally you have added a new line or removed a line, I will probably revert some of this (you can too if you like). This is considered formatting changes
- you have added
@tag("serial")
for your tests, this is not how we organise tests generally and needs to be removed - the commits need to be squashed. This will likely be 1 commit (or if we can think of a nice way of splitting them we will)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this for informational purposes 👍
I forgot to say that I have tested this with a local project and tried out a number of things to try and see if I can find any issues (had a look in the admin, reversed migrations, used them in GeneratedFields
) and this comment is the only thing I'm concerned about. It looks really good ⭐
5212ef3
to
f521fdd
Compare
I realized instead of adding new system checks, I could just raise a I think it's better, because:
|
So I would like to summarize what I figured out today for both myself and others to see. We have two options: either enable serial fields to define a If
If
|
some notes so I don't forget: EDIT: |
So I decided to stop fighting against what's already in place and try to conform to the existing code. I managed to achieve the following:
|
def test_serial_to_integer(self): | ||
from django.contrib.postgres.fields import SerialField | ||
|
||
self.serial_to_integer_test(SerialField, IntegerField) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of wish we tested all these in something like postgres_tests/test_schema
but given this is a pre-existing pattern here import testing against django.contrib.postgres
this doesn't have to be done in this PR.
table = self.field.model._meta.db_table | ||
column = self.field.column | ||
return f"nextval(pg_get_serial_sequence('{table}', '{column}'))", [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be an SQL injection vector and will crash if any table or column name includes a quote.
table = self.field.model._meta.db_table | |
column = self.field.column | |
return f"nextval(pg_get_serial_sequence('{table}', '{column}'))", [] | |
table = self.field.model._meta.db_table | |
column = self.field.column | |
return "nextval(pg_get_serial_sequence(%s, %s))", [table, column] |
I would personally use Func
instead to avoid this error prone boilerplate
class NextSerialSequence(Func):
template = "nextval(pg_get_serial_sequence(%(expressions)s))"
And override SerialFieldMixin._db_default_expression
to return
@cached_property
def _db_default_expression(self):
return NextSerialSequence(
Value(self.model._meta.db_table),
Value(self.column),
output_field=self,
)
Which should also avoid the db_default
injection in __init__
and removal in deconstruct
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @charettes , fair enough. We should really set db_default
to something though, overriding _db_default_expression
is not enough because this variable is checked during migrations.
def get_prep_value(self, value): | ||
value = super().get_prep_value(value) | ||
if value is None: | ||
value = self.get_default() | ||
return value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised this is required given we set default = DatabaseDefault()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for supporting None
arguments:
SerialModel.objects.create(serial=None)
Since it's possible to provide None
to auto fields, it should also be possible to provide it for serial fields.
If that's not necessary, we can remove it.
sql_create_sequence = ( | ||
"CREATE SEQUENCE IF NOT EXISTS %(sequence)s " | ||
"AS %(data_type)s " | ||
"OWNED BY %(table)s.%(column)s" | ||
) | ||
sql_reset_sequence = ( | ||
"SELECT setval(pg_get_serial_sequence(%s, %s), " | ||
"coalesce(max(%s), 1), max(%s) IS NOT NULL) FROM %s;" | ||
) | ||
sql_rename_sequence = "ALTER SEQUENCE %s RENAME TO %s;" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking more about it isn't there something fundamentally wrong here about not using the native serial
types?
Why are we going the extra length of manually emulating through schema definition and explicitly using nextval(pg_get_serial_sequence(...))
on INSERT
when simply using serial
and friends does it by itself?
I could be missing something here but it seems that most of the complexity of this patch stems from trying to emulate what Postgres does very well already by itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@charettes , it's because Postgres doesn't support it very well. serial
is not a true type, and it's not possible to alter a field's type to serial
. we have to emulate it because that's the only way it can work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on how Postgres doesn't support serial
very well? It has been used to power AutoField
for over a decade before we switched to IDENTITY
so I'm surprised to learn that it poses problems.
From my understanding the main reason for the manual sequence management and db_default
assignment is to support altering back and forth between integer, auto, and serial fields. Is that really going to be a common enough use case that it warrants all the complexity it requires?
I think it's a question worth asking because there are no mentions of supporting this use case in the accepted ticket-27452 and I personally don't think that it will be common enough operation to warrant that much extra complexity to the schema editor.
Don't get me wrong, it is impressive that you got it working and I'm sure you learned a ton doing so. I'm just doubtful that the code complexity tradeoff is worth it (1.4k additional line) considering that this could be simply implemented using mixins including db_returning = True
and using get_db_type = "serial"
at first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use serial
when creating a table or when adding a field, but you can't use it for anything else, e.g. you can't alter a column's type to serial and you can't cast to serial. But do let me know if I'm wrong, it didn't work when I tested it.
I agree it adds some complexity, but the majority of that 1.4k lines is just tests, the schema editor received 160 additions and 44 removals only. I assumed the functionality to transition between integer fields <-> serial fields <-> auto fields would be a requirement because of the similarities between these fields. If all we wanted to support was creating serial fields, I'm not sure this would need any changes to core, it could be a package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding the main reason for the manual sequence management and db_default assignment is to support altering back and forth between integer, auto, and serial fields.
Yes this is correct. What I meant was Postgres can't do this out of the box with serial fields.
Trac ticket number
ticket-27452
Branch description
Added
SmallSerialField
,SerialField
,BigSerialField
tocontrib.postgres
.Previous PR
Checklist
main
branch.