[Bug]: Lack of handling for L0 segments in binlog import #33157
Comments
What does -1 mean?
When we do bulk insertion, do L0 segments also need a handoff?
Currently, logs of L0 segments are stored with the prefix
Does "replay" mean parsing the delta log and converting it to delete messages?
We need to ensure that after users invoke load(refresh=true), all bulk-inserted L0 segments are loaded by the query node. Is this currently achievable? @congqixia
Yep, L0 segments will be generated instead of applying deletes to L1 and L2 segments.
Backup L0 segments -> new L0 segments? So why do we need to import L1/L2 segments and L0 segments in one request? I mean, we can import a partition's L0 segments after all L1/L2 segments in that partition, and import non-partition L0 segments after all partitions are restored.
The proposed approach above does have a drawback: when users enable partitionKey (e.g., with 64 partitions), the backup tool will invoke import once for each partition. Consequently, all data under the
Proposal 2
To distinguish L0 imports from L1 and L2 imports, a new flag needs to be added in the options:
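As a rough illustration of how such a flag could be checked, here is a minimal sketch. The option key `l0_import` and both function names are assumptions made for this example; the thread does not show the actual flag.

```python
from typing import Dict

# Hypothetical option key, chosen only for illustration.
L0_IMPORT_KEY = "l0_import"


def is_l0_import(options: Dict[str, str]) -> bool:
    """Return True when the (assumed) L0 flag is set to "true"."""
    return options.get(L0_IMPORT_KEY, "false").strip().lower() == "true"


def import_kind(options: Dict[str, str]) -> str:
    """Pick which import path a request would take under this sketch."""
    return "l0" if is_l0_import(options) else "l1_l2"
```

With this shape, a request without the flag keeps the existing L1/L2 behavior, so callers that never set it would be unaffected.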
"-1" means all partitions, FYI.
IMHO, this should be done after the #32990 change.
Why not add a partition tag?
Abstract Execute interface for import/preimport task, simplify import scheduler. issue: #33157 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #33157 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Current Behavior
The current binlog import process does not handle L0 segments (#27349), so users' delete data is lost on import. See also: zilliztech/milvus-backup#316
Expected Behavior
During the backup-restore process, we need to consider L0 segments and import them into the collection. The strategy is to replay the L0 segments rather than merge their delete data into L1 and L2 segments, because users might be performing incremental backups, and the delete data in L0 segments could date from long ago.
We will define the order of files passed in during binlog import. The first input file will be the prefix for the insert log, the second input file will be the prefix for the L1 and L2 segments delta log, and the third and subsequent input files will be the prefixes for the L0 segments delta log:
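The ordering rule above can be sketched as a small helper that splits an ordered list of prefixes into its three roles. This is only an illustration: the names `BinlogImportPaths` and `split_import_paths` are hypothetical, and requiring at least two entries is an assumption of this sketch rather than something stated in the issue.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BinlogImportPaths:
    insert_prefix: str            # first entry: insert log prefix
    delta_prefix: str             # second entry: L1/L2 delta log prefix
    l0_delta_prefixes: List[str]  # third and later entries: L0 delta log prefixes


def split_import_paths(paths: List[str]) -> BinlogImportPaths:
    """Interpret an ordered list of prefixes by position."""
    # Assumption of this sketch: the first two prefixes are always present.
    if len(paths) < 2:
        raise ValueError("expected an insert log prefix and a delta log prefix")
    return BinlogImportPaths(paths[0], paths[1], list(paths[2:]))
```

For example, calling it with an insert log prefix, "files/delta_log/col_0/part_0", and one further prefix would yield a single L0 delta log prefix, while a two-element list yields none.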
To distinguish between segments under the same prefix (for example, "files/delta_log/col_0/part_0" as mentioned above) and determine whether they are L0 or non-L0, it is necessary to pass a list of L0 segment IDs in the options:
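A minimal sketch of how a list of L0 segment IDs in the options might be used to split the segments found under one prefix. The option key `l0_segment_ids`, its comma-separated encoding, and the function names are all assumptions for illustration; the issue does not specify them.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical option key and encoding (comma-separated integers).
L0_SEGMENT_IDS_KEY = "l0_segment_ids"


def parse_l0_segment_ids(options: Dict[str, str]) -> Set[int]:
    """Parse the assumed comma-separated list of L0 segment IDs."""
    raw = options.get(L0_SEGMENT_IDS_KEY, "")
    return {int(tok) for tok in raw.split(",") if tok.strip()}


def split_segments(
    segment_ids: Iterable[int], l0_ids: Set[int]
) -> Tuple[List[int], List[int]]:
    """Return (L0 segments, non-L0 segments), preserving input order."""
    l0: List[int] = []
    non_l0: List[int] = []
    for seg in segment_ids:
        (l0 if seg in l0_ids else non_l0).append(seg)
    return l0, non_l0
```

Segments not named in the option fall through to the non-L0 list, so an empty or missing option treats every segment under the prefix as L1/L2.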