
Feature: Hardware Devices

Hugo Josefson edited this page Nov 28, 2019 · 9 revisions

Relevant links

Goal:

Basic management of HW devices.

User stories

View a list of recognized hardware (by Class, by Driver)

Adam is an IT administrator in a mid-size company. He is part of a small team responsible for the company's IT infrastructure, including Linux setup and server/workstation HW upgrades, which usually consist of adding or replacing devices such as graphics or network cards, or hot-plugging and unplugging disks. Once a HW device is installed or removed, he uses command-line utilities to check whether the OS recognizes the change. He would like a graphical tool that lists devices for this verification.

View a list of recognized hardware (by IOMMU Groups)

Frank is a Virtualization System Administrator in a large enterprise, responsible for the oVirt environment. Part of his job is to manage VMs with host device passthrough via IOMMU Groups. To assign the right IOMMU Group to a VM, he needs to know which IOMMU Group a particular device belongs to.

Change device driver

James works in the same team as Frank. He has been asked to reuse the host machine for tasks other than virtualization, but he is not allowed to reboot it. He knows that devices used by a VM are bound to the VFIO driver (either manually or, for example, by oVirt). To fulfill his task, he needs to unbind the device (for example, an i915 graphics card) from VFIO and bind it to its corresponding driver so that the device can operate normally for other applications.

Configure device-specific functions, like spawning SR-IOV virtual functions

Suzan is the newest colleague of Frank and James. The company bought new servers to host virtual machines. The servers are equipped with SR-IOV (single root input/output virtualization) devices, especially network cards. As part of the new host setup, she needs to spawn multiple virtual functions on these devices. Currently she uses command-line utilities, but she would prefer a graphical interface for configuring virtual functions. She would also appreciate an easy way to see the maximum allowed value. She performs a similar task for vGPU.

View NUMA topology

B.J. works as a Senior Virtualization Administrator in an investment company. The company runs high-performance VMs that usually require access to specialized HW. For high-performance computing, it is crucial to align the VM topology with the topology of the underlying host. Devices are associated with a VM by IOMMU Groups, each of which belongs to a NUMA node or a particular CPU. B.J. is responsible for fine-tuning the computational environment. To properly pin a particular VM in a virt management console to one or more NUMA nodes, B.J. needs to understand the host's topology well.

Some of the tools used so far:

  • lspci, hwloc, lshw, lstopo, modinfo
    • a tool that presents this information graphically, including how the output of these tools interconnects, is missing

Use Cases

By Adam: Review attached HW devices from the OS perspective (Read-only)

Prior: add/remove a HW device (either hot (un)plug or with a cold reboot)

  • see list of available devices
  • search for the [added|removed] device
  • if found:
    • verify (see) device details (bus, address, name, vendor, product, driver, IOMMU Group etc)
    • see what other devices are associated with the driver
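
The last step of Adam's flow, finding the other devices that share a driver, can be sketched against sysfs as follows. This is only an illustration in Python (the plugin itself is planned in React/Redux); the function name `devices_for_driver` and the `sysfs_root` parameter are hypothetical and exist so the lookup can be tried against a copy of the tree.

```python
import re
from pathlib import Path

# Matches a full PCI address such as 0000:00:1f.6 (domain:bus:device.function).
PCI_ADDRESS = re.compile(r"^[0-9a-f]{4,}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def devices_for_driver(driver, sysfs_root="/sys"):
    """Return the sorted PCI addresses currently bound to `driver`.

    Hypothetical helper: reads <sysfs_root>/bus/pci/drivers/<driver>/.
    """
    driver_dir = Path(sysfs_root) / "bus" / "pci" / "drivers" / driver
    if not driver_dir.is_dir():
        return []
    # The directory also holds control files (bind, unbind, new_id, ...),
    # so keep only the entries whose names look like PCI addresses.
    return sorted(
        entry.name for entry in driver_dir.iterdir()
        if PCI_ADDRESS.match(entry.name)
    )


if __name__ == "__main__":
    print(devices_for_driver("e1000e"))
```

An unknown driver name yields an empty list rather than an error, which keeps a read-only view simple.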

By Frank: Review IOMMU Group(s) (Read-only)

  • select IOMMU Group
  • check list of its devices
  • check details for particular device - driver, address, type, etc.
    • see what other devices belong to IOMMU Group
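
A minimal sketch of the data behind Frank's view, assuming the groups are read from `/sys/kernel/iommu_groups/<N>/devices/`. The helper name `iommu_groups` and its `sysfs_root` parameter are hypothetical, and Python is used for illustration only.

```python
from pathlib import Path


def iommu_groups(sysfs_root="/sys"):
    """Map each IOMMU Group number to the sorted addresses of its devices.

    Hypothetical helper; returns an empty dict when the host exposes
    no IOMMU Groups (for example, when the IOMMU is disabled).
    """
    groups_dir = Path(sysfs_root) / "kernel" / "iommu_groups"
    if not groups_dir.is_dir():
        return {}
    groups = {}
    for group_dir in groups_dir.iterdir():
        if not group_dir.name.isdigit():
            continue  # group directories are named by their number
        devices_dir = group_dir / "devices"
        members = sorted(p.name for p in devices_dir.iterdir()) if devices_dir.is_dir() else []
        groups[int(group_dir.name)] = members
    # Sort numerically so that group 10 comes after group 2.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for number, members in iommu_groups().items():
        print(number, ", ".join(members))
```

Looking up the siblings of one device is then a matter of finding the group whose member list contains its address.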

By James: Change device driver

  • select a device by its class
  • review device's details, esp. driver
  • unbind from the driver
  • bind to new driver
    • TBD: automatic device-to-driver matching would be nice
    • if automatic driver preselection is not possible, user selects one from all available drivers
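
One possible way to express James's unbind/bind steps is as an ordered list of sysfs writes, using the kernel's `unbind`, `driver_override` and `drivers_probe` files. The sketch below is an assumption about how this could be done, not the chosen implementation; `rebind_plan`, `apply_plan` and `sysfs_root` are hypothetical names. Passing no new driver clears the override, so the kernel picks a matching driver itself, which is one candidate for the automatic matching mentioned above.

```python
from pathlib import Path


def rebind_plan(address, current_driver, new_driver, sysfs_root="/sys"):
    """Return the ordered (path, value) writes that move a PCI device
    from `current_driver` to `new_driver`.

    Hypothetical helper. `current_driver` may be None for an unbound
    device; `new_driver` may be None to let the kernel choose.
    """
    pci = Path(sysfs_root) / "bus" / "pci"
    steps = []
    if current_driver:
        # Step 1: detach the device from its current driver.
        steps.append((pci / "drivers" / current_driver / "unbind", address))
    # Step 2: restrict matching to the wanted driver; an empty line
    # clears any earlier override.
    steps.append((pci / "devices" / address / "driver_override", new_driver or "\n"))
    # Step 3: ask the PCI core to probe the device again.
    steps.append((pci / "drivers_probe", address))
    return steps


def apply_plan(steps):
    """Perform the writes; on a real host this needs root privileges."""
    for path, value in steps:
        path.write_text(value)


if __name__ == "__main__":
    for path, value in rebind_plan("0000:00:02.0", "vfio-pci", "i915"):
        print(f"{value!r} -> {path}")
```

Keeping the plan separate from its execution lets a UI show the user exactly what will be written before anything changes.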

By Suzan: Configure device-specific functions

  • select a device
  • a device type-specific operation is offered, meaning:
    • Enter count of Virtual Functions to be spawned for SR-IOV
    • Enter count of vGPUs
    • TBD: other type-specific operations
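
For the SR-IOV case, the allowed maximum and the current count are exposed per device in the `sriov_totalvfs` and `sriov_numvfs` sysfs files. The following is a minimal sketch under that assumption; the function names and the `sysfs_root` parameter are hypothetical, and vGPU is not covered here.

```python
from pathlib import Path


def _device_dir(address, sysfs_root):
    return Path(sysfs_root) / "bus" / "pci" / "devices" / address


def sriov_status(address, sysfs_root="/sys"):
    """Return (current, maximum) Virtual Function counts for a device.

    Raises FileNotFoundError when the device has no SR-IOV files.
    """
    dev = _device_dir(address, sysfs_root)
    current = int((dev / "sriov_numvfs").read_text().strip())
    maximum = int((dev / "sriov_totalvfs").read_text().strip())
    return current, maximum


def set_virtual_functions(address, count, sysfs_root="/sys"):
    """Spawn `count` Virtual Functions, validating against the maximum."""
    current, maximum = sriov_status(address, sysfs_root)
    if not 0 <= count <= maximum:
        raise ValueError(f"count must be between 0 and {maximum}")
    if count == current:
        return
    numvfs = _device_dir(address, sysfs_root) / "sriov_numvfs"
    if current != 0 and count != 0:
        # The kernel rejects changing one non-zero count to another,
        # so the existing Virtual Functions are removed first.
        numvfs.write_text("0")
    numvfs.write_text(str(count))
```

A form in the UI could call `sriov_status` to show the maximum next to the input field and reject out-of-range values before any write happens.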

By B.J.: Review the NUMA topology

  • In Cockpit: understand the HW NUMA topology (nodes, CPUs, IOMMU Groups, devices), Read Only
    • see available NUMA Nodes
    • see CPUs associated with a particular node
    • see what IOMMU Groups are associated with the node
    • see what devices are in an IOMMU Group
  • Outside Cockpit:
    • pin a VM to particular node(s)
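
The read-only part of B.J.'s flow could be fed from sysfs roughly as follows, assuming node CPUs come from `/sys/devices/system/node/node<N>/cpulist` and each PCI device reports its node in a `numa_node` file. `parse_cpulist`, `numa_topology` and `sysfs_root` are hypothetical names for this sketch, and IOMMU Groups are left out to keep it short.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel CPU list such as '0-3,8-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list (node without CPUs) yields nothing
        first, _, last = part.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def numa_topology(sysfs_root="/sys"):
    """Return {node: {'cpus': [...], 'devices': [...]}} for the host."""
    root = Path(sysfs_root)
    topology = {}
    node_base = root / "devices" / "system" / "node"
    if node_base.is_dir():
        for node_dir in node_base.glob("node[0-9]*"):
            cpus = parse_cpulist((node_dir / "cpulist").read_text())
            topology[int(node_dir.name[len("node"):])] = {"cpus": cpus, "devices": []}
    pci_base = root / "bus" / "pci" / "devices"
    if pci_base.is_dir():
        for dev_dir in sorted(pci_base.iterdir()):
            numa_file = dev_dir / "numa_node"
            if not numa_file.is_file():
                continue
            node = int(numa_file.read_text().strip())
            # -1 means the kernel knows no node for this device; skip it.
            if node in topology:
                topology[node]["devices"].append(dev_dir.name)
    return dict(sorted(topology.items()))
```

The per-node CPU and device counts needed for the NUMA Node cards are then just the lengths of the two lists.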

Design

The design will be similar to the 'Services' screen, meaning:

  • On top of the page, exactly 1 bus can be selected to be displayed (PCI, USB, SCSI, ...)
    • A bus-specific view is rendered:
    • PCI: The Group-By Selector (Options: Class | IOMMU Group | NUMA Topology)
      • by Class:
        • panels listing basic device classes with a device count >0 (like 'Network controller (3)' or 'Display controller (1)')
        • panel body lists all devices of the class with details and buttons
        • expanding a row shows further details
        • a special class is reserved for Unclassified devices (might happen)
      • by IOMMU Group:
        • table with sorting and searching, sorted by IOMMU Group by default
        • on row click: display detail
      • by NUMA Topology:
        • grid of cards for NUMA Nodes with basic stats (Count of CPUs, and devices)
        • on NUMA Node click, detail is displayed:
          • Tree of CPUs and IOMMU Groups
          • on IOMMU Group click:
            • IOMMU Group detail is displayed
    • USB:
      • read-only tree-grid showing USB devices
    • other buses:
      • TBD: but the presentation will be bus-specific

Screenshots: TBD

Data Example

  • Device Class: Network controller, Display controller
  • IOMMU Group: numeric 0, 1, 2, ...
  • Device: RTL-8100/8101L/8139 PCI Fast Ethernet Adapter
  • Device Detail: vendor, product, driver, IOMMU Group, class, PCI address, capabilities, and more
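
To make the data shape above concrete, here is a hedged sketch that reads these details for PCI devices from sysfs and groups them by base class. The helper names, the `sysfs_root` parameter and the deliberately short class table are assumptions for illustration; vendor and product are returned as the raw hexadecimal IDs found in sysfs, not as human-readable names.

```python
import os
from pathlib import Path

# Base class codes from the PCI specification (upper byte of the class
# value). Only a few entries are listed; this table is illustrative.
PCI_BASE_CLASSES = {
    0x00: "Unclassified device",
    0x01: "Mass storage controller",
    0x02: "Network controller",
    0x03: "Display controller",
}


def _link_name(path):
    """Return the last component of a symlink target, or None if absent."""
    return os.path.basename(os.readlink(path)) if path.is_symlink() else None


def read_device(dev_dir):
    """Collect the detail fields for one device directory."""
    base_class = int((dev_dir / "class").read_text().strip(), 16) >> 16
    return {
        "address": dev_dir.name,
        "vendor": (dev_dir / "vendor").read_text().strip(),
        "product": (dev_dir / "device").read_text().strip(),
        "class": PCI_BASE_CLASSES.get(base_class, f"Other (0x{base_class:02x})"),
        "driver": _link_name(dev_dir / "driver"),          # None when unbound
        "iommu_group": _link_name(dev_dir / "iommu_group"),
    }


def devices_by_class(sysfs_root="/sys"):
    """Return {class label: [device details, ...]} for all PCI devices."""
    base = Path(sysfs_root) / "bus" / "pci" / "devices"
    grouped = {}
    for dev_dir in sorted(base.iterdir()):
        device = read_device(dev_dir)
        grouped.setdefault(device["class"], []).append(device)
    return grouped


if __name__ == "__main__":
    for label, devices in devices_by_class().items():
        print(f"{label} ({len(devices)})")
```

Printing the label with the list length produces panel titles in the 'Network controller (3)' style described in the Design section.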

Data dump of a system

I (Garrett) have made an HTML processor that converts the output from lshw to a structured YAML file that lists hardware on the machine.

The command that needs to be run (assuming lshw is installed on the system) is:

sudo lshw -sanitize -html > hardware-`hostname`.html

I will release the simple script to convert the output from above to YAML next week. (It's EOB for the week for me right now.)

The YAML file can then be used in an HTML mockup I'm working on. Building the design around the data means I can check it out and make quick changes based on real data and see how well it scales across many computers (without needing access to the hardware myself and/or building a custom version of Cockpit).

The generated HTML will eventually be exported to SVG for use as a mockup that will be included on this page (as a PNG), and it could possibly be used as reference HTML within Cockpit.

How To

  • sysfs as primary datasource
  • react/redux, probably no other libs are needed
  • a new Cockpit plugin will be added, accessed via 'Devices' in Cockpit's main menu
  • refresh:
    • based on monitoring udev events using udevadm
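
As a rough sketch of the refresh idea, the event lines printed by `udevadm monitor` can be parsed and used as a trigger to re-read sysfs. The regular expression reflects the line format as commonly printed (source, timestamp, action, device path, subsystem) and should be treated as an assumption; `parse_event` and `watch` are hypothetical names, and Python is used only for illustration.

```python
import re
import subprocess

# Example line:
# UDEV  [1234.567890] add      /devices/pci0000:00/0000:00:1c.0/0000:02:00.0 (pci)
EVENT = re.compile(
    r"^(?P<source>KERNEL|UDEV)\s*\[(?P<time>\d+\.\d+)\]\s+"
    r"(?P<action>\w+)\s+(?P<devpath>\S+)\s+\((?P<subsystem>[^)]+)\)\s*$"
)


def parse_event(line):
    """Return the event as a dict, or None for banner and blank lines."""
    match = EVENT.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["time"] = float(event["time"])
    return event


def watch(subsystem="pci"):
    """Yield parsed events from a running `udevadm monitor` process."""
    command = ["udevadm", "monitor", "--udev", f"--subsystem-match={subsystem}"]
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as process:
        for line in process.stdout:
            event = parse_event(line)
            if event is not None:
                yield event


if __name__ == "__main__":
    for event in watch():
        print(event["action"], event["devpath"])
```

In this sketch an event only signals that something changed; the device details themselves would still be read from sysfs, in line with it being the primary data source.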

Implementation Roadmap

  • list of PCI devices in the by-Class hierarchy
  • add IOMMU Group view for PCI devices
  • add active actions, namely bind/unbind to a driver
  • add support for other buses (like USB or SCSI)
  • add NUMA Topology view