
FEATURE: DRBD_META_SIZE configurable at cluster level and at instance creation time #1641

Open
zen2 opened this issue Jan 24, 2022 · 2 comments

zen2 commented Jan 24, 2022

The current limit of DRBD_META_SIZE = 128 in /lib/_constants.py allows a maximum disk size of 4 TB.
On our cluster we change this limit to be able to create disks bigger than 4 TB and, especially, to be able to grow disks beyond that 4 TB hard limit.

A few years ago we used a hook script that calculated an adapted DRBD_META_SIZE from the DRBD disk size we needed, and then replaced the value directly in _constants.py. We later understood that DRBD_META_SIZE has to be a globally fixed value so that disks can still be grown later. So currently we use a bash script that modifies DRBD_META_SIZE on all nodes, run once and again at node creation.

The hardcoded 128M DRBD_META_SIZE is clearly a strong limitation for a Ganeti cluster.
We manage volumes of up to 32 TB on our cluster, so I think this limit should be configurable at cluster level.

The maximum DRBD disk size depends on the DRBD meta size, so it could be more intuitive to make the maximum DRBD size configurable and then compute the corresponding DRBD meta size:
DRBD_META_SIZE = 1 + DRBD_MAX_SIZE / 32768
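For illustration, the formula above can be sketched as a small shell function (meta_size_mb is a hypothetical helper name, not part of Ganeti; sizes are in MiB):

```shell
#!/bin/bash
# Compute the external DRBD meta size (in MiB) needed for a given
# maximum DRBD disk size (in MiB), per the formula above:
#   DRBD_META_SIZE = 1 + DRBD_MAX_SIZE / 32768
meta_size_mb ()
{
  local max_size_mb=$1
  echo $(( max_size_mb / 32768 + 1 ))
}

# Examples: 4 TiB needs 129 MiB of metadata, 32 TiB needs 1025 MiB
meta_size_mb $(( 4 * 1024 * 1024 ))   # prints 129
meta_size_mb $(( 32 * 1024 * 1024 ))  # prints 1025
```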

A good addition (or alternative) would be the ability to specify a maximum DRBD disk size (and therefore a DRBD meta size) at instance creation time:
gnt-instance add -o image -t drbd --disk 0:size=20G --disk 1:size=2T,maxsize=8T -n node1:node2 my.instance

This instance would have:

  • the cluster default DRBD meta size for disk 0
  • a specific DRBD meta size for disk 1 that allows growing the disk up to 8 TB:
    disk_drbd_meta_size = 8 * 1024^2 / 32768 + 1 = 257 MB

Being able to modify the DRBD meta size at cluster level and at instance creation time would make Ganeti more scalable at the DRBD disk level.

@zen2 zen2 changed the title DRBD_META_SIZE should be configurable at cluster level and at instance creation time FEATURE: DRBD_META_SIZE configurable at cluster level and at instance creation time Jan 24, 2022

rbott commented Jan 24, 2022

I remember we discussed this problem during the last GanetiCon. Simply changing the value to something bigger was deemed the right solution. However, I recall it was unclear how to handle the upgrade path (or more specifically: how to handle a possible downgrade path, once existing devices have been resized during the upgrade).

Does anyone have any ideas on how to handle that gracefully?


zen2 commented Jan 25, 2022

Since we modify the DRBD_META_SIZE constant regularly on our cluster, I can share our experience:

  • the DRBD meta size at DRBD creation time is defined by the master node's DRBD_META_SIZE value
  • the constant is only used at creation time and doesn't impact existing DRBD instances
  • DRBD manages the meta size of its resources on its own, so resync and replacing the primary/secondary keep the same size from the remaining DRBD sibling
  • you can add DRBD disks with a different DRBD meta size to existing instances
  • the instance's DRBD meta size is stored (or deduced from the DRBD resource?) in the instance definition (disks child 1)
  • the only thing the DRBD meta size constrains is the possible size of the DRBD volume:
    you can't create a large volume if the DRBD meta size constant is not big enough to handle the volume size

Our production cluster, in service for 10 years, currently has many different DRBD meta sizes:

   Nb DRBD                        META SIZE     DRBD MAX SIZE
     45         - child 1: plain, size 128M         <   4 Tb
     16         - child 1: plain, size 1.0G         <  32 Tb
     11         - child 1: plain, size 129M         <=  4 Tb
     10         - child 1: plain, size 513M         <= 16 Tb
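The MAX SIZE column follows from inverting the meta-size formula; a sketch (max_drbd_size_mb is a hypothetical helper name; sizes are in MiB):

```shell
#!/bin/bash
# Maximum DRBD volume size (in MiB) that a given external meta size
# (in MiB) can address: invert DRBD_META_SIZE = DRBD_SIZE / 32768 + 1
max_drbd_size_mb ()
{
  local meta_mb=$1
  echo $(( (meta_mb - 1) * 32768 ))
}

# From the table: 129M of metadata covers exactly 4 TiB,
# 513M covers 16 TiB, and the 128M default stops just short of 4 TiB
max_drbd_size_mb 129  # prints 4194304 (= 4 TiB)
max_drbd_size_mb 513  # prints 16777216 (= 16 TiB)
max_drbd_size_mb 128  # prints 4161536 (= 4064 GiB)
```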

We regularly add disks with different DRBD meta disk sizes.
We regularly use gnt-instance grow-disk on instance disks.
We regularly relocate instances with several disks using different DRBD meta disk sizes from one node to another.

So I think we don't need a specific upgrade/downgrade path because:

  • DRBD resources keep their own meta size definition
  • the Ganeti cluster keeps the DRBD meta size at instance level (if not deduced from DRBD resources)
  • the master node's DRBD_META_SIZE is only used at DRBD creation time

The only things to take into consideration are:

  • the relationship between DRBD volume size and DRBD metadata size: we need a check that rejects oversized volumes up front, instead of letting drbdsetup fail because the requested metadata size is not big enough for the volume size
  • keeping the same DRBD meta disk size on all cluster nodes, for consistency in case of master failover
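A hedged sketch of such a size check (check_drbd_fit is a hypothetical helper name; the real check would live in Ganeti's disk-creation path; sizes are in MiB):

```shell
#!/bin/bash
# Reject a requested DRBD volume size that the configured external
# meta size cannot address, instead of letting drbdsetup fail later.
# Usage: check_drbd_fit REQUESTED_SIZE_MB META_SIZE_MB  -> exit 0/1
check_drbd_fit ()
{
  local req_mb=$1 meta_mb=$2
  local max_mb=$(( (meta_mb - 1) * 32768 ))
  if [ "$req_mb" -gt "$max_mb" ]; then
    echo "error: ${req_mb} MiB exceeds the ${max_mb} MiB addressable" \
         "with ${meta_mb} MiB of metadata" >&2
    return 1
  fi
}

# With the 128M default, an 8 TiB volume must be refused:
check_drbd_fit $(( 8 * 1024 * 1024 )) 128 || echo "refused"
```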

Notes:

  • we only use external DRBD metadata
  • DRBD meta logical volumes are on an LVM VG backed by a RAID 1 of SSD disks
  • DRBD storage logical volumes are on an LVM VG backed by a RAID 5 of SAS disks
  • Ganeti version: 2.15.2
  • DRBD version: 8.4.10 (api:1/proto:86-101)

This is the script that we use on our cluster, if you want to run some lab tests:

gnt-drbd-meta-disk-size

#!/bin/bash
#
# gnt-drbd-meta-disk-size get | set SIZE
#
# get or set ganeti DRBD meta disk size computed from DRBD disk size
#
# Author: zentoo ( b4b1 @ free.fr )
# 2014-2022
#

SOURCE_PATH="/usr/share/ganeti/default/ganeti/_constants.py"
TAG="DRBD_META_SIZE"

usage ()
{
  cat << EOF
usage: $(basename "$0") get | set DISK_SIZE[MGT]

get or set ganeti DRBD meta disk size computed
          from DRBD disk size given as parameter

- SIZE is the DRBD disk size is expressed in MiB by default
- M/G/T suffixes can be used for MiB/GiB/TiB
- If DRBD meta disk size computed is < 128 MiB,
  128 MiB ganeti default will be used

This script can be used to set temporary DRBD meta disk size
for DRBD disk size creation > 4064 GB.

Note:
  - Master node DRBD meta disk size set the DRBD meta disk size
    of any DRBD disk created on cluster
  - So it's advised to set the same DRBD meta disk size on all cluster nodes
    for consistency and master failover usage
EOF
  exit 1
}

check ()
{
  if [ ! -f "$SOURCE_PATH" ]; then
    echo "Error: $SOURCE_PATH doesn't exist"
    exit 2
  fi
  if ! grep -q "$TAG" "$SOURCE_PATH"; then
    echo "Error: $TAG is not a valid TAG"
    exit 2
  fi
}

get_size ()
{
  DRBD_META_SIZE=$(grep "$TAG" "$SOURCE_PATH" | sed "s|.* = ||")
  DRBD_SIZE=$(( ( $DRBD_META_SIZE - 1 ) * 32768 ))
  echo "Actual DRBD_META_SIZE = $DRBD_META_SIZE MB"
  echo -n "Ideal for a maximum DRBD disk size of $DRBD_SIZE MB"
  [ $DRBD_SIZE -ge 1024 ] && echo -n ", $(( $DRBD_SIZE / 1024 )) GB"
  [ $DRBD_SIZE -ge 1048576 ] && echo -n ", $(( $DRBD_SIZE / 1048576 )) TB"
  echo
}

set_size ()
{
  case $2 in
    ''|'M') POWER=0 ;;
       'G') POWER=1 ;;
       'T') POWER=2 ;;
  esac
  DRBD_SIZE=$(( $1 * 1024 ** POWER ))
  DRBD_META_SIZE=$(( $DRBD_SIZE / 32768 + 1 ))
  echo "For a $1$2 DRBD disk: ideal DRBD_META_SIZE is $DRBD_META_SIZE MB"

  if [[ $DRBD_META_SIZE -lt 128 ]]; then
    DRBD_META_SIZE=128
    echo "Using DRBD_META_SIZE = $DRBD_META_SIZE MB (ganeti default)"
  fi

  [ -f "${SOURCE_PATH}c" ] && rm "${SOURCE_PATH}c"
  [ -f "${SOURCE_PATH}o" ] && rm "${SOURCE_PATH}o"
  sed -i "s|^$TAG = .*|$TAG = $DRBD_META_SIZE|g" "$SOURCE_PATH"

  echo
  echo "-> $SOURCE_PATH"
  echo "   have been modified with DRBD_META_SIZE=$DRBD_META_SIZE"
  echo
  echo "-> IMPORTANT:  Don't forget to do it on both DRBD nodes."
}

# Main

check
case "$1" in
  "get") [ $# != 1 ] && usage
         get_size ;;
  "set") [ $# != 2 ] || [[ ! "$2" =~ ^[0-9]+[MGT]?$ ]] && usage
         SIZE=$(echo "$2" | sed "s|^\([0-9]*\).*|\1|g")
         UNIT=$(echo "$2" | sed "s|[0-9]*||g")
         set_size "$SIZE" "$UNIT" ;;
  *) usage ;;
esac
exit 0
