Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: bitmap detection is incorrect for MD arrays that use journaling #20381

Closed
clockfort opened this issue Apr 26, 2024 · 6 comments · Fixed by #20404
Closed

storage: bitmap detection is incorrect for MD arrays that use journaling #20381

clockfort opened this issue Apr 26, 2024 · 6 comments · Fixed by #20404
Assignees
Labels

Comments

@clockfort
Copy link

clockfort commented Apr 26, 2024

Explain what happens

MD arrays that use write journaling ( https://www.kernel.org/doc/Documentation/md/raid5-cache.txt for a brief introduction, if one is unfamiliar) rather than write-intent bitmaps should not receive this warning. The warning for such users is both misleading and isn't even actionable, as bitmaps and journals are mutually exclusive concepts in MD.
image

Version of Cockpit

315

Where is the problem in Cockpit?

Storage

Server operating system

Fedora

Server operating system version

40

Extra Info

[clockfort@clockfort-nas]$ sudo mdadm --detail /dev/md128
/dev/md128:
           Version : 1.2
     Creation Time : Sun Apr 21 22:16:24 2024
        Raid Level : raid6
        Array Size : 70312507392 (65.48 TiB 72.00 TB)
     Used Dev Size : 11718751232 (10.91 TiB 12.00 TB)
      Raid Devices : 8
     Total Devices : 9
       Persistence : Superblock is persistent

       Update Time : Fri Apr 26 09:57:01 2024
             State : clean
    Active Devices : 8
   Working Devices : 9
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : journal

              Name : clockfort-nas.clockfort.com:128  (local to host clockfort-nas.clockfort.com)
              UUID : 01e5f004:2e5d965f:7cb0a87e:79545d47
            Events : 65113

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1
       4       8       65        4      active sync   /dev/sde1
       5       8       81        5      active sync   /dev/sdf1
       6       8       97        6      active sync   /dev/sdg1
       7       8      113        7      active sync   /dev/sdh1

       8     259        5        -      journal   /dev/nvme0n1p5

@clockfort clockfort added the bug label Apr 26, 2024
@clockfort clockfort changed the title storage: bitmap detection logic is incorrect for MD arrays that use journaling storage: bitmap detection is incorrect for MD arrays that use journaling Apr 26, 2024
@jelly
Copy link
Member

jelly commented Apr 29, 2024

@mvollmer can you take a look?

@mvollmer mvollmer self-assigned this Apr 29, 2024
@mvollmer
Copy link
Member

Thanks! Let's see if UDisks2 can give us information about the journal.

@mvollmer
Copy link
Member

@clockfort, do you think we should add journal support to Cockpit in general? Would you be able to help?

@clockfort
Copy link
Author

I'd be happy with just enough logic so that it doesn't show up as a warning.

Honestly, I think it's only quite rarely used outside of probably Meta's datacenters. It is kind of strictly better than bitmaps, and maybe the reason why people don't use it is that high-level tooling like cockpit doesn't support it 😄

If cockpit is interested in fuller support, yes, I could help out, but all I really want is to not have to mentally gloss over cockpit warning flags on some of my servers.

mvollmer added a commit to mvollmer/cockpit that referenced this issue Apr 30, 2024
A mdraid can have different consistency policies, and "bitmap" is only
one of them. We should only tell people to add a bitmap when the
policy is "resync".

UDisks2 and Cockpit don't understand consistency policy at all, so we
do the bare minimum here to detect a journal.

Fixes cockpit-project#20381
@mvollmer
Copy link
Member

I'd be happy with just enough logic so that it doesn't show up as a warning.

Yes, that needs to be fixed quickly -> #20404

If cockpit is interested in fuller support, yes, I could help out, but all I really want is to not have to mentally gloss over cockpit warning flags on some of my servers.

Absolutely.

UDisks2 and Cockpit seemed to have missed the whole consistency policy thing. MDRaid support is very old and a bit neglected, I am afraid. I think we should at least expose the consistency policy in the UDisks2 API, and disable the "Remove" action for journal disks. That kind of thing. It would be great if you could help us find the right balance there.

@mvollmer
Copy link
Member

@clockfort, fyi, I have started hacking on UDisks2 a bit: storaged-project/udisks#1274, feel free to join in! :-)

mvollmer added a commit to mvollmer/cockpit that referenced this issue Apr 30, 2024
A mdraid can have different consistency policies, and "bitmap" is only
one of them. We should only tell people to add a bitmap when the
policy is "resync".

UDisks2 and Cockpit don't understand consistency policy at all, so we
do the bare minimum here to detect a journal.

Fixes cockpit-project#20381
mvollmer added a commit to mvollmer/cockpit that referenced this issue May 3, 2024
A mdraid can have different consistency policies, and "bitmap" is only
one of them. We should only tell people to add a bitmap when the
policy is "resync".

UDisks2 will have a ConsistencyPolicy property soonish, and we
approximate its value until then.

Fixes cockpit-project#20381
jelly pushed a commit that referenced this issue May 6, 2024
A mdraid can have different consistency policies, and "bitmap" is only
one of them. We should only tell people to add a bitmap when the
policy is "resync".

UDisks2 will have a ConsistencyPolicy property soonish, and we
approximate its value until then.

Fixes #20381
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants