This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

My own solutions for Computer Vision! This is a sub-project of Mergen; all the Python code is meant to be translated to C++. I used Python and libraries like OpenCV (previously Pillow), Matplotlib and NumPy in order to debug the algorithms faster and more easily.


fulcrum6378/mycv


MyCV

This is a Python subproject of Mergen IV that helps with faster debugging of computer vision algorithms. These methods are then translated to C++ for the main project. MyCV was started on 10 July 2023, and after a successful image analysis, the translation began on 24 September.

The project divides the process of image analysis into the following steps:

1. /vis/

Tools for reading output from vis/camera.cpp

  • rgb_to_bitmap.py : extracts RGB image frames from a single big file named "vis.rgb" and saves them as Bitmap (*.bmp) images, with BMP metadata (from /vis/metadata) at the beginning of each file.

  • test_yuv.py : displays a raw YUV bitmap image (test.yuv) using OpenCV and Matplotlib (not to be confused with a YUYV *.yuv image).
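
The frame extraction described for rgb_to_bitmap.py can be sketched roughly as follows. This is only an illustration, not the repository's code: it assumes tightly packed 24-bit RGB frames of a known size, and it builds a standard 54-byte BMP header with `struct` instead of reading the prepared bytes from /vis/metadata. The function names are hypothetical.

```python
import struct


def bmp_header(width: int, height: int) -> bytes:
    """Build the 54-byte header of an uncompressed 24-bit BMP."""
    row_size = (width * 3 + 3) // 4 * 4  # BMP rows are padded to 4 bytes
    image_size = row_size * height
    file_header = struct.pack("<2sIHHI", b"BM", 54 + image_size, 0, 0, 54)
    # A negative height marks the rows as stored top-down.
    info_header = struct.pack("<IiiHHIIiiII", 40, width, -height, 1, 24,
                              0, image_size, 2835, 2835, 0, 0)
    return file_header + info_header


def split_frames(raw: bytes, width: int, height: int) -> list:
    """Cut one big RGB dump into complete frames (assumed layout)."""
    frame_size = width * height * 3
    return [raw[i:i + frame_size]
            for i in range(0, len(raw) - frame_size + 1, frame_size)]


def frame_to_bmp(frame: bytes, width: int, height: int) -> bytes:
    """Prefix one RGB frame with a BMP header, converting it to BGR rows."""
    bgr = bytearray(frame)
    bgr[0::3], bgr[2::3] = frame[2::3], frame[0::3]  # swap R and B channels
    stride = width * 3
    padding = b"\x00" * (-stride % 4)
    rows = (bytes(bgr[y * stride:(y + 1) * stride]) + padding
            for y in range(height))
    return bmp_header(width, height) + b"".join(rows)
```

Each value returned by `frame_to_bmp` could then be written to its own *.bmp file in binary mode.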

=> Output: BMP image frames


config.py defines tweaks used in most steps.

2. /segmentation/: Image Segmentation

  • Region-growing methods (succeeded)
    • region_growing_1.py : this method focuses on a pixel and analyses its neighbours. I left it incomplete and moved on to the 2nd method, but then I realised this method was much better!
    • region_growing_2.py : this method focuses on a neighbour, then determines whether it fits in the same segment. It is more object-oriented than the previous one and contains more boilerplate code! It takes ~22 seconds here; in C++ it takes ~5 seconds, plus ~3.3 seconds in /tracing/ because the data structure of "segments" is a map rather than a vector, making ~9 seconds in total!
    • region_growing_3.py : an improved and completed version of the 1st method; it takes ~20 seconds here.
    • region_growing_4.py : same as the 3rd method, but without recursion because of C++ restrictions, and segment IDs start from 0, not -1. It takes ~30 seconds here, but only ~2 to ~4 seconds in C++ on a Samsung Galaxy A50 phone!
    • region_growing_5.py : same as the 4th method, but it compares the colours of all pixels of a segment with the very first pixel it finds. The results were like a KMeans filter, which I didn't like!
  • Clustering methods
    • clustering_1d.py : clustering all pixels in a 1-dimensional way... left incomplete; not logically suitable!
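
A minimal sketch of non-recursive region growing, in the spirit of what is described for region_growing_4.py, might look like this. It is not the project's code: the image is assumed to be a nested list of colour tuples, the similarity test (a per-channel tolerance) is an assumption, and `grow_regions`, `similar` and `compare_to_seed` are hypothetical names. Setting `compare_to_seed=True` imitates the 5th method, which compares every pixel with the first pixel of its segment.

```python
def similar(a, b, tolerance):
    """True if every channel of the two colours differs by at most tolerance."""
    return all(abs(x - y) <= tolerance for x, y in zip(a, b))


def grow_regions(image, tolerance=10, compare_to_seed=False):
    """Label every pixel with a segment ID, starting from 0, without recursion."""
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]  # -1 means "not assigned yet"
    next_id = 0
    for y0 in range(height):
        for x0 in range(width):
            if labels[y0][x0] != -1:
                continue
            labels[y0][x0] = next_id
            stack = [(y0, x0)]  # explicit stack replaces the recursive calls
            while stack:
                y, x = stack.pop()
                reference = image[y0][x0] if compare_to_seed else image[y][x]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and labels[ny][nx] == -1
                            and similar(reference, image[ny][nx], tolerance)):
                        labels[ny][nx] = next_id
                        stack.append((ny, nx))
            next_id += 1
    return labels
```

On a smooth gradient, the neighbour-based comparison keeps the whole gradient in one segment, while the seed-based comparison splits it into bands, which is consistent with the KMeans-like look mentioned above.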

=> Output: segmentation data extracted using pickle library


3. /tracing/: Image Tracing

This step tries to interpret image segments as vector graphics instead of raster images, while keeping the vectors easy to compare with others of their kind. It must:

  • Detect Shapes: it calculates path points in relative percentage-like numbers, just like a vector image. Vector paths can be stored in 2 different types:
    1. 8-bit: position of each point will range from 0 to 255 (uint8_t, unsigned byte).
    2. 16-bit: position of each point will range from 0 to 65,535 (uint16_t, unsigned short).
  • Detect Gradients: for now, it only computes an average colour of all pixels.
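
As an illustration of the two points above, the sketch below scales path points into the 8-bit or 16-bit range and averages the colours of a segment. Scaling relative to the segment's bounding box is an assumption made for this example, and both function names are hypothetical.

```python
def normalise_path(points, bits=8):
    """Scale (x, y) points into 0..255 (8-bit) or 0..65535 (16-bit).

    Positions are taken relative to the bounding box of the points,
    which is an assumption of this sketch.
    """
    max_value = (1 << bits) - 1
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    width = max(max(xs) - min_x, 1)   # avoid division by zero
    height = max(max(ys) - min_y, 1)
    return [(round((x - min_x) * max_value / width),
             round((y - min_y) * max_value / height))
            for x, y in points]


def average_colour(pixels):
    """Integer mean of each channel over all pixels of a segment."""
    count = len(pixels)
    return tuple(sum(channel) // count for channel in zip(*pixels))
```

Because every shape ends up in the same numeric range regardless of its size in the frame, two paths can be compared point by point.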

Because the outputs of the segmentation methods are not the same, each method must have its own implementation of tracing, and those implementations carry a suffix referring to that method (e.g. "_rg4").

Tracing methods:

  • Surrounder: finds a random border pixel, then navigates through its neighbours until it detects all border pixels of a segment. It messes up when a shape has inner borders.
  • Comprehender: checks every pixel to determine whether it is a border pixel.
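
The Comprehender idea can be sketched as below. This is a simplified illustration, assuming the segmentation result is a grid of segment IDs and that a border pixel is one with at least one 4-connected neighbour outside the segment or outside the image; `border_pixels` is a hypothetical name.

```python
def border_pixels(labels, segment_id):
    """Return the (y, x) positions of all border pixels of one segment."""
    height, width = len(labels), len(labels[0])
    border = []
    for y in range(height):
        for x in range(width):
            if labels[y][x] != segment_id:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or labels[ny][nx] != segment_id:
                    border.append((y, x))
                    break  # one foreign neighbour is enough
    return border
```

Unlike a walk along the outline, this check visits every pixel, so inner borders (holes) are found as well, at the cost of scanning the whole grid.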

Because of the limited stack size in C++ (risk of stack overflow), surrounder_rg4.py was forked from surrounder_rg3.py with the recursion removed.

=> Output: vector data in JSON files (good for debugging, instead of unreadable pickle dumps)


4. /storage/

Shapes and their details need to be temporarily stored in non-volatile memory (SSD/hard disk/SD) in a way that enables super-fast searching and makes it easy to find similar shapes. This is effectively a kind of Short-Term Memory. At first I wanted to put the data in a 4+ dimensional array, making a Datacube, but it was a bad idea. Then...

  1. feature_database.py : I wanted to separate features/details of shapes into separate small databases and this code was intended to be a super-fast mini-DBMS, but due to limitations of writing/appending into files, I realised this method was not even practical!

  2. sequence_files_1.py : in this method we save shapes and their details in storage, much more separately than in the previous method. Each feature has its own folder (resembling a table in a database), and the shapes are stored in a separate folder. The values of each feature are clustered, and the IDs of the shapes in each cluster are put into separate files.

  3. sequence_files_2.py : same as the previous one, except that the ratio index no longer stores the exact float number; it stores only shape IDs.

    Eventually I concluded that this method is better suited to a Long-Term Memory than to a Short-Term Memory!

  4. volatile_indices_1.py : the structure is the same as Sequence Files 2, but all data except /shapes/ reside in RAM, as Short-Term Memory. It uses the pickle library in order to export from volatile memory.

  • datacube.cpp : I figured the idea of a Datacube might be useful for a Long-Term Memory rather than a short-term one. But this time, our datacube is a 4-dimensional dict/map rather than an array.
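
To make the idea of an in-memory index of shape IDs more concrete, here is a minimal sketch. It is not the layout used by volatile_indices_1.py: the bucketing of a feature value by a fixed step, the class name `VolatileIndex` and its methods are all assumptions for illustration. Only the use of pickle for exporting comes from the description above.

```python
import pickle
from collections import defaultdict


class VolatileIndex:
    """RAM-only index from a quantised feature value to a set of shape IDs."""

    def __init__(self, step=0.05):
        self.step = step                 # bucket width (hypothetical tweak)
        self.buckets = defaultdict(set)

    def _bucket(self, value):
        return round(value / self.step)

    def add(self, shape_id, value):
        """Store only the shape ID; the exact float value is not kept."""
        self.buckets[self._bucket(value)].add(shape_id)

    def find_similar(self, value, radius=1):
        """Collect the IDs in the value's bucket and its nearby buckets."""
        centre = self._bucket(value)
        found = set()
        for bucket in range(centre - radius, centre + radius + 1):
            found |= self.buckets.get(bucket, set())
        return found

    def export(self):
        """Serialise the index so it can leave volatile memory."""
        return pickle.dumps(dict(self.buckets))
```

Looking up similar shapes then only touches a few buckets instead of scanning every stored shape, which is the property the storage step is after.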

Helper files:

  • xxx_global.py : holds code shared by all implementations of this storage method.

  • xxx_extractor.py : extracts indexes from volatile mode (pickle) to non-volatile mode (e.g. Sequence Files).

  • xxx_forgetter.py : forgets some shapes from storage. (it's now implemented only in C++)

  • xxx_summariser.py : reads all indexes and displays a summary.

  • xxx_validator.py : validates current indexes.

  • shape_x.py : holds global tools for manipulating xth version of shape files.

  • shape_x_viewer.py : renders a shape file, plus a summary of its details.

=> Output: data properly and efficiently structured and stored in memory


5. /perception/

Trying to make sense of the resulting segments:

  1. tracking.py : tracks visual objects from the previous frame using their details and positions, and measures their differences.

There were also two discontinued approaches; their work was either merged into this section or turned out to be unnecessary:

  • /comparison/ : it extracts a shape from /storage/output/ and looks for similar items in the same directory, using the databases of the previous step. Therefore, every database will have its own implementation of comparison.

  • /resegmentation/ (Object Tracking) : visual objects must be tracked across frames. This method is a continuation of Segmentation and must be integrated into it. Each method should have its own implementation of Segmentation.
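
The tracking step of tracking.py could be sketched along these lines. This is an assumption-heavy illustration rather than the actual algorithm: shapes are represented as dicts with hypothetical "centre" and "colour" keys, matching is done greedily by nearest centre, and `track` and `max_distance` are invented names.

```python
import math


def track(previous, current, max_distance=50.0):
    """Map each current shape ID to its closest previous shape and their differences."""
    matches = {}
    unused = dict(previous)  # previous shapes that are not matched yet
    for cur_id, cur in current.items():
        best_id, best_dist = None, max_distance
        for prev_id, prev in unused.items():
            dist = math.dist(prev["centre"], cur["centre"])
            if dist <= best_dist:
                best_id, best_dist = prev_id, dist
        if best_id is None:
            continue  # nothing close enough: treat it as a new object
        prev = unused.pop(best_id)
        matches[cur_id] = {
            "previous_id": best_id,
            "distance": best_dist,
            "colour_diff": tuple(c - p for p, c in zip(prev["colour"], cur["colour"])),
        }
    return matches
```

The returned dict keeps, for every tracked shape, its previous ID, its current ID (the key) and the measured differences.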

=> A simple dict/map containing a shape's previous and current ID along with their differences


/debug/

This section provides you with server-client tools for easily debugging the C++ implementations over a network.

run.ps1 executes main.py, which accepts the command codes listed in its header. Most commands require your Android phone and your PC to be connected to the same network (such as Wi-Fi). A few of them also require your phone to be listed in ADB [$ adb devices].


License

Copyright © Mahdi Parastesh - All Rights Reserved.
