U2 Background Removal

Deep-learning-based background removal built with U2-Net ("U Square Net") by Xuebin Qin, Zichen Zhang, Chenyang Huang, Masood Dehghan, Osmar R. Zaiane, and Martin Jagersand.

Read the original paper »

Table of Contents
  1. Introduction
  2. Model
  3. Next Steps
  4. Citation

Introduction

Why U2-Net?

Removing the background from a picture is an old problem, but traditional computer vision algorithms such as image thresholding fall short without intensive pre- and post-processing, and even then, the task is very difficult when the foreground object's colors are similar to the background's.

With recent advances in the deep learning literature, Salient Object Detection (SOD) has emerged as one of the predominant ways to separate foreground from background. In short, SOD is the task of segmenting the most visually attractive objects in an image, typically by producing a saliency map that distinguishes the important foreground from the background.

Most SOD networks rely on features extracted by existing backbones such as AlexNet, VGG, ResNet, ResNeXt, and DenseNet. The problem is that all of these backbones were originally designed for image classification, meaning they "extract features that are representative of semantic meaning rather than local details and global contrast information, which are essential to saliency detection."

In Pattern Recognition 2020, Qin et al. proposed a novel network for SOD called U2-Net that can be trained from scratch and achieves comparable or better performance than models based on existing pre-trained backbones.

Model

ReSidual U-Block (RSU)

[Figure: structure of the ReSidual U-block RSU-L, from the original paper]

Qin et al. proposed a novel block called the RSU, consisting of three parts (a minimal sketch follows this list):

  1. An input convolution layer, which transforms the input feature map into an intermediate feature map
  2. A U-Net-like symmetric encoder-decoder structure, which takes the intermediate feature map as input and learns to extract and encode multi-scale contextual information
  3. A residual connection, which fuses the local features with the multi-scale features
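
Below is a minimal Keras sketch of an RSU-L block following these three parts. The helper names, layer choices, and hyperparameters here are assumptions for illustration; the actual implementation in this repository's notebooks may differ.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_bn_relu(x, filters, dilation=1):
    """The paper's basic unit: 3x3 convolution + batch norm + ReLU."""
    x = layers.Conv2D(filters, 3, padding="same", dilation_rate=dilation)(x)
    x = layers.BatchNormalization()(x)
    return layers.Activation("relu")(x)

def rsu_block(x, height, mid_filters, out_filters):
    """ReSidual U-block RSU-L, where L = height."""
    # (1) Input convolution: transforms x into the intermediate map F1(x)
    f1 = conv_bn_relu(x, out_filters)

    # (2) U-Net-like symmetric encoder-decoder on the intermediate map:
    # one full-resolution stage followed by (height - 2) downsampling stages
    skips = [conv_bn_relu(f1, mid_filters)]
    for _ in range(height - 2):
        down = layers.MaxPooling2D(pool_size=2)(skips[-1])
        skips.append(conv_bn_relu(down, mid_filters))
    # Deepest stage: a dilated convolution instead of further downsampling
    h = conv_bn_relu(skips[-1], mid_filters, dilation=2)
    # Decoder: concatenate the mirrored encoder feature, convolve, upsample
    for i, skip in enumerate(reversed(skips)):
        last = i == len(skips) - 1
        h = layers.Concatenate()([h, skip])
        h = conv_bn_relu(h, out_filters if last else mid_filters)
        if not last:
            h = layers.UpSampling2D(size=2, interpolation="bilinear")(h)

    # (3) Residual connection: fuse local features F1(x) with U(F1(x))
    return layers.Add()([f1, h])

# Example: an RSU-7 block on a 320x320 RGB input
# (spatial dimensions must be divisible by 2^(height - 2))
inputs = tf.keras.Input(shape=(320, 320, 3))
outputs = rsu_block(inputs, height=7, mid_filters=32, out_filters=64)
```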

Architecture

In encoder stages En_1, En_2, En_3, and En_4, the paper uses residual U-blocks RSU-7, RSU-6, RSU-5, and RSU-4, respectively. Here, "7", "6", "5", and "4" denote the heights (L) of the RSU blocks; L is usually configured according to the spatial resolution of the input feature maps.
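
As a rough sketch (reusing the rsu_block helper above), the first four encoder stages could be wired up as follows. The filter counts are assumptions for illustration, and the paper's En_5 and En_6 stages use a dilated variant (RSU-4F) that is not shown here.

```python
# Illustrative wiring of encoder stages En_1..En_4; filter counts are
# assumptions, not the paper's exact configuration.
x = tf.keras.Input(shape=(320, 320, 3))
h = x
for stage_height, filters in [(7, 64), (6, 128), (5, 256), (4, 512)]:
    h = rsu_block(h, height=stage_height,
                  mid_filters=filters // 2, out_filters=filters)
    h = layers.MaxPooling2D(pool_size=2)(h)  # downsample between stages
```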

[Figure: U2-Net architecture, from the original paper]

Loss Function

[Figure: training loss definition, from the original paper]
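
For reference, the loss shown above is defined in the paper as a weighted sum of binary cross-entropy terms, one per side-output saliency map plus one for the final fused output:

```latex
\mathcal{L} = \sum_{m=1}^{M} w_{\mathrm{side}}^{(m)}\, \ell_{\mathrm{side}}^{(m)} + w_{\mathrm{fuse}}\, \ell_{\mathrm{fuse}}
```

where each term ℓ is the standard binary cross-entropy between the predicted saliency map P_S and the ground-truth mask P_G:

```latex
\ell = -\sum_{(r,c)} \Big[ P_G(r,c)\,\log P_S(r,c) + \big(1 - P_G(r,c)\big)\,\log\big(1 - P_S(r,c)\big) \Big]
```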

Model Comparison

Comparison of model size and performance of U2-Net with other state-of-the-art SOD models:

[Figure: model size vs. performance of U2-Net and other SOD models]

Credit: original paper

Next Steps

  • Data augmentation
    • The original paper augmented the training set by horizontal flipping (sketched below)
  • Evaluation metrics
    • Precision-recall curve, F-measure, Mean Absolute Error (MAE) (sketched below)
  • Add support for video
  • Set up a web version with TensorFlow.js
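
As an illustrative sketch of the first two items (not code from this repository; the function names are hypothetical), horizontal-flip augmentation and the MAE metric could look like the following:

```python
import tensorflow as tf

def augment(image, mask):
    """Randomly flip an (image, mask) pair horizontally, keeping them aligned."""
    flip = tf.random.uniform(()) < 0.5
    image = tf.cond(flip, lambda: tf.image.flip_left_right(image), lambda: image)
    mask = tf.cond(flip, lambda: tf.image.flip_left_right(mask), lambda: mask)
    return image, mask

def mae(pred, mask):
    """Mean Absolute Error between a predicted saliency map and its mask."""
    return tf.reduce_mean(tf.abs(pred - mask))
```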

Citation

@article{Qin_2020_PR,
  title   = {U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection},
  author  = {Qin, Xuebin and Zhang, Zichen and Huang, Chenyang and Dehghan, Masood and Zaiane, Osmar and Jagersand, Martin},
  journal = {Pattern Recognition},
  volume  = {106},
  pages   = {107404},
  year    = {2020}
}
