
Binary classification to filter and block unsolicited NSFW content from annoying coworkers... --- ...

lucylow/salty-wet-man

Salty Wet Man (SWM)

The goal of Salty Wet Man is to flag inappropriate online content to make the internet a safer and more inclusive space for everyone.




Motivation: Online Safety

A chessboard starts with 16 pieces per player in 6 types. Each type has its own moves, and the goal of the game is to trap the opponent's King in "checkmate". What is the most powerful piece on the chessboard? Many people will say the King or the Queen because they are the highest rank. However, I believe the most powerful are the eight Pawns, the lowest rank. Through pawn promotion, each Pawn can become a Queen, Rook, Bishop, or Knight. We therefore need to nurture and protect them throughout the game, as they are the seeds of the future.

Being online can astronomically magnify threats and risks that vulnerable children already face offline.

Children are increasingly exposed to digital media and online technology at an early age. They are going online to do schoolwork, play games, and socialize with over 4 billion people (1 in 3 children) connected to the internet. Around 60% of fourth to eighth graders have access to phones or tablets and almost half of them have access to a computer in their bedrooms.

Access to the internet exposes children to risks including online predators, online sexual abuse and exploitation, cyberbullying, harmful or inappropriate content, and the use and sharing of personal data. The COVID-19 global pandemic, with its lockdown measures, has led to widespread school closures and physical distancing, increasing our dependence on technology to connect. Law enforcement authorities and reporting agencies have seen a statistically significant increase in the amount of child sexual abuse material being shared online, of which an ever-increasing percentage involves self-generated content.


Computer Vision Technical Solution

Innovation at UNICEF is about doing new things to solve problems and improve the lives of children around the world. Technological solutions like Online Protection Tools are key to responding efficiently to the digital risks children face. UNICEF defines four categories of digital risk: Content, Contact, Conduct, and Contract Risks (https://www.unicef.org/innovation/apply-ChildOnlineSafety). Salty Wet Man focuses on Content Risks, defined as exposure to harmful or age-inappropriate content, such as pornography, child sexual abuse material, hate speech and extremism, discriminatory or hateful content, disinformation, online games, gambling, content that endorses risky or unhealthy behaviours, and violent content which may be upsetting or show criminal activity.

  • Defining NSFW material is subjective, and identifying these images is a non-trivial task

  • Salty-Wet-Man treats image identification as a binary classification problem with two classes:

    • [SFW] positive class, trained on neutral images that are safe for work

    • [NSFW] negative class, trained on inappropriate images that are not safe for work
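
As a minimal sketch of the binary decision described above, the snippet below maps a model's output probability to one of the two labels. The function name `classify` and the default threshold of 0.5 are assumptions made for illustration; they are not taken from the project's code.

```python
def classify(nsfw_probability, threshold=0.5):
    """Map a predicted NSFW probability to a binary label.

    Both the name and the 0.5 default threshold are hypothetical;
    a stricter filter would simply use a lower threshold.
    """
    if not 0.0 <= nsfw_probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return "NSFW" if nsfw_probability >= threshold else "SFW"


if __name__ == "__main__":
    for p in (0.05, 0.5, 0.93):
        print(p, classify(p))
```

Lowering `threshold` trades more false alarms on safe images for fewer missed NSFW images, which may be the preferable trade-off for a child-safety filter.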


Convolutional_Neural_Networks

Image Datasets

  • In theory, CNNs are well suited to large image datasets because of their large learning capacity and model complexity
  • They assume stationarity of image statistics
  • They assume locality of pixel dependencies

NSFW Images

  • Static images
  • Uncontrolled backgrounds
  • Multiple people and partial figures
  • Different camera angles

GPU Implementation

  • Heavy computation required: the size of the CNN is limited by the GPU memory available
  • Highly optimized implementation of 2D convolutions
  • Solution: spread the network over multiple GPUs via parallel processing

Object_Recognition

Deep Learning's Impact on Computer Vision

Labeled Image-Training Datasets

  • Small image datasets (on the order of tens of thousands of images), e.g. MNIST digit recognition, where the best reported error rates are already very low
  • Large image datasets (hundreds of thousands to millions of images), e.g. ImageNet

ImageNet used for Large Scale Object Recognition

  • Dataset of over 15 million labeled images
  • Variable-resolution images, commonly down-sampled to a fixed 256x256 resolution
  • Training, validation, and testing images
  • Benchmark - ImageNet Large-Scale Visual Recognition Challenge (ILSVRC)

NSFW_Object_Recognition:_Content-Based_Retrieval_via_Localization

Image Location with Large Areas of Skin-colored Regions

  • Skin region properties: image, color, and texture

  • Input RGB values (skin pixels) are converted to a log-opponent representation, where n is a small random noise value

    • L(x) = 105 * log10(x + 1 + n)
    • I = L(G)
    • Rg = L(R) - L(G)
    • By = L(B) - (L(G) + L(R)) / 2
  • The image intensity I is smoothed with a median filter, then subtracted from the original image to measure texture

  • Query By Image Content (QBIC)

    • Abstraction of an image to search for colored textured regions
    • Uses image decomposition, pattern matching, and clustering algorithms
    • Find a set of images similar to a query image
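
The log-opponent formulas and the median-filter texture step listed above can be sketched in plain Python as follows. This is an illustrative sketch only: the function names, the filter `radius`, and the choice to use zero noise when no random generator is passed are assumptions, not details taken from the project.

```python
import math
import statistics


def log_transform(x, noise=0.0):
    """L(x) = 105 * log10(x + 1 + n) for a channel value x (e.g. 0..255)."""
    return 105.0 * math.log10(x + 1.0 + noise)


def log_opponent(r, g, b, rng=None):
    """Return (I, Rg, By) for one RGB pixel.

    rng is an optional random.Random; when it is None the noise term n
    is 0, which keeps the result deterministic (an assumption made here
    so the example is easy to check).
    """
    def noise():
        return rng.random() if rng is not None else 0.0

    lr, lg, lb = (log_transform(c, noise()) for c in (r, g, b))
    return lg, lr - lg, lb - (lg + lr) / 2.0


def texture_amplitude(intensity, radius=1):
    """Subtract a median-smoothed copy of a 2D intensity grid from the grid.

    intensity is a list of equal-length rows; the window is clipped at
    the image borders. The radius value is a hypothetical choice.
    """
    h, w = len(intensity), len(intensity[0])
    result = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                intensity[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(intensity[y][x] - statistics.median(window))
        result.append(row)
    return result
```

A grey pixel such as `log_opponent(100, 100, 100)` gives zero for both opponent channels, while smooth regions give a texture amplitude of zero, which is why low texture combined with a skin-like colour is a useful cue for skin regions.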

Elongated Regions Grouping

  • Group 2D and 3D constraints on body/skin regions
  • Model the human body as cylindrical parts arranged within a skeleton geometry
  • Identify region outline

Classify Regions into Human Limbs

  • Geometric grouping algorithms: match a view to a collection of images of an object
  • Hypothesize that the object is present, and estimate its appearance via a feature vector from the compressed image
  • Minimum distance classifier to match feature vectors
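
A minimum distance classifier of the kind mentioned above can be sketched as below: each class is summarised by the mean of its training feature vectors, and a new vector is assigned to the class with the nearest mean. The labels and feature values in the usage example are made up for illustration.

```python
import math


def class_means(samples):
    """Compute one mean feature vector per class.

    samples maps a label to a non-empty list of equal-length vectors.
    """
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in samples.items()
    }


def classify_min_distance(feature, means):
    """Return the label whose mean is closest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(feature, means[label]))


if __name__ == "__main__":
    # Hypothetical 2D feature vectors (e.g. elongation, skin fraction).
    training = {
        "limb": [[4.0, 1.0], [6.0, 1.0]],
        "background": [[1.0, 1.0], [1.0, 3.0]],
    }
    means = class_means(training)
    print(classify_min_distance([4.5, 1.2], means))
```

Euclidean distance is used here for simplicity; a real system might prefer a distance that accounts for the variance of each feature.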

NSFW_Object_Recognition:_Detection,_and_Segmentation

  • Object Image Segmentation

    • Group together skin pixels
    • Normalized cut
  • Label each pixel of the input image with a category

    • For every pixel, check whether it is [skin or not-skin]
  • If at least 30% of the image area is skin, the image is identified as passing the skin filter

  • Training data for this is very expensive: it requires images with every pixel labeled
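
The 30% rule above can be sketched as a small check over a per-pixel skin mask. The mask format (a list of rows of booleans) and the function names are assumptions for illustration; how the mask itself is produced is outside this sketch.

```python
def skin_fraction(mask):
    """Return the fraction of pixels marked as skin in a 2D boolean mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    skin = sum(1 for row in mask for pixel in row if pixel)
    return skin / total


def passes_skin_filter(mask, min_fraction=0.30):
    """True when at least min_fraction of the image area is skin."""
    return skin_fraction(mask) >= min_fraction


if __name__ == "__main__":
    mask = [
        [True, True, False, False, False],
        [True, True, False, False, False],
    ]
    print(skin_fraction(mask), passes_skin_filter(mask))
```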


NSFW_Object_Recognition_Image_Cropping


Neural_Network_Classifier_Model

  • VGG16 is a CNN for large-scale image recognition
  • Model achieves 92.7% top-5 test accuracy on ImageNet
  • Implemented in this project with Keras on a TensorFlow backend

Architecture

  • Fixed input of a 224 x 224 RGB image
  • Three fully-connected (FC) layers
    • 4096, 4096, and 1000 channels respectively
  • Max pooling layers
  • Hidden layers use ReLU rectification
  • Final layer is a soft-max layer
  • 16 weight layers in total (13 convolutional and 3 fully-connected)
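
To make the layer count concrete, the sketch below walks the standard VGG16 layout (3x3 convolutions that preserve spatial size, 2x2 max pooling, then the three FC layers listed above) and counts weight layers and trainable parameters. The constant and function names are chosen here for illustration and do not come from the project.

```python
# Output channels of the 13 convolutional layers; "M" marks a 2x2 max pool.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
# Units of the three fully-connected layers.
VGG16_FC = [4096, 4096, 1000]


def vgg16_summary(input_size=224, in_channels=3):
    """Return (weight_layers, parameter_count) for the layout above."""
    size, channels = input_size, in_channels
    params = 0
    weight_layers = 0
    for item in VGG16_CONV:
        if item == "M":
            size //= 2  # pooling halves height and width
        else:
            params += 3 * 3 * channels * item + item  # weights + biases
            channels = item
            weight_layers += 1
    features = size * size * channels  # flattened input to the first FC layer
    for units in VGG16_FC:
        params += features * units + units
        features = units
        weight_layers += 1
    return weight_layers, params


if __name__ == "__main__":
    layers, params = vgg16_summary()
    print(layers, params, round(params * 4 / 1e6), "MB as float32")
```

Most of the parameters sit in the first FC layer, which is the main reason the saved model is so large.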

Disadvantages

  • Very slow to train: training from scratch takes weeks
  • Large disk and bandwidth footprint: the network architecture weighs over 533 MB
  • Consider the VGG19 classifier variant

Keras Implementation

keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

Full Keras VGG Code


Neural_Network_Errors_and_Overfitting

Data Augmentation

  • Label-preserving transformations

  • RGB channel intensities

    • Add a transformation (derived from the covariance matrix) to each RGB image pixel
    • Object identity is invariant to changes in the intensity/colour of images
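
One common way to realise the RGB intensity augmentation above is to shift every pixel of an image along the principal components of the RGB covariance matrix. The sketch below assumes the eigenvectors and eigenvalues have already been computed elsewhere and are passed in; the function name, the `sigma` default of 0.1, and the placeholder values in the usage example are assumptions, not values from the project.

```python
import random


def shift_rgb(pixels, eigenvectors, eigenvalues, rng, sigma=0.1):
    """Add one random colour offset to every pixel of an image.

    pixels:       list of (r, g, b) float tuples
    eigenvectors: three 3-element vectors p1, p2, p3
    eigenvalues:  three floats lambda1, lambda2, lambda3
    rng:          a random.Random instance

    The offset is sum_i alpha_i * lambda_i * p_i with alpha_i ~ N(0, sigma),
    drawn once per image so the label-carrying structure is preserved.
    """
    alphas = [rng.gauss(0.0, sigma) for _ in range(3)]
    offset = [
        sum(alphas[i] * eigenvalues[i] * eigenvectors[i][c] for i in range(3))
        for c in range(3)
    ]
    return [tuple(px[c] + offset[c] for c in range(3)) for px in pixels]


if __name__ == "__main__":
    # Placeholder components: identity axes with made-up eigenvalues.
    axes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    image = [(10.0, 20.0, 30.0), (200.0, 100.0, 50.0)]
    print(shift_rgb(image, axes, [5.0, 3.0, 1.0], random.Random(0)))
```

Because the same offset is applied to the whole image, relative colour differences between pixels are unchanged while the overall illumination varies from one training pass to the next.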

Dropout Rates

  • ReLU neurons
  • Dropout is used for first two fully-connected (FC) layers (4096 and 4096)
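
As a rough illustration of what dropout does to the FC activations, the sketch below implements "inverted" dropout, the variant that rescales surviving activations at training time so nothing changes at inference. The rate of 0.5 in the usage example is an assumption; the rate used by the project is not stated here.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` during training.

    Surviving values are divided by (1 - rate) so the expected sum is
    unchanged; at inference the input is returned untouched.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(42)
    print(dropout([0.2, 1.5, 0.0, 3.1], rate=0.5, rng=rng))
```

Randomly silencing neurons this way discourages the two 4096-unit layers from co-adapting, which is the overfitting problem this section is concerned with.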

Technical_Installations

Requires heavy computation

  1. Install Python dependencies and packages (Keras, TensorFlow, and TensorFlow.js) - best to run from virtualenv

  2. Download and convert the VGG16 model to TensorFlow.js format

  3. Launch Node.js script to load converted model and compute maximally-activating input images for convnet's filters using gradient ascent in the input space. Save image files under dist/filters directory

  4. Launch Node.js script to calculate internal convolutional layers' activations and gradient-based Class Activation Map (CAM). Save image files under dist/activation directory

  5. Compile. Launch web view at https://lucylow.github.io/salty-wet-man/
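
The gradient-based Class Activation Map from step 4 can be illustrated with a small Grad-CAM style sketch in plain Python. It is not the project's Node.js script: it assumes the feature maps of one convolutional layer and the gradients of the class score with respect to them are already available as nested lists, and the function name is hypothetical.

```python
def grad_cam(feature_maps, gradients):
    """Combine K feature maps into one heat map.

    feature_maps, gradients: lists of K grids (H rows of W floats each).
    Each channel weight is the mean of its gradient grid; the heat map is
    ReLU(sum_k weight_k * feature_map_k), scaled to a maximum of 1.
    """
    weights = [
        sum(sum(row) for row in grad) / (len(grad) * len(grad[0]))
        for grad in gradients
    ]
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [
        [
            max(0.0, sum(w * fmap[y][x] for w, fmap in zip(weights, feature_maps)))
            for x in range(width)
        ]
        for y in range(height)
    ]
    peak = max(max(row) for row in cam)
    if peak > 0.0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


if __name__ == "__main__":
    maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
    grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
    print(grad_cam(maps, grads))
```

In practice the resulting grid is upsampled to the input image size and overlaid as a heat map to show which regions drove the classification.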


Technical_Visualizations

yarn visualize

Increase the number of filters to visualize per convolutional layer from the default of 8 to a larger value (e.g. 18):

yarn visualize --gpu --filters 18

The default image used for internal-activation and CAM visualization is "nsfw.jpg". Switch to another image with the "--image" flag, e.g. "--image waifu-pic.jpeg" 👀

yarn visualize --image waifu-pic.jpeg

Technical_User_Privacy_Considerations

  • HTML5 Local Storage Data

    • Salty Wet Man caches data on the user's local device
    • Data.js information is removed when the user clears the cache
    • localStorage.setItem('game_state', JSON.stringify(gameState));
  • User.js File

    • User.js file added for user privacy
    • Template for configuring privacy and security
    • Reduce exposure to web analytics, tracking, fingerprinting, and shoulder surfing
    • Harden browser settings against data disclosure or code execution vulnerabilities

References