Summary

VIAME is a computer vision application designed for do-it-yourself artificial intelligence including object detection, object tracking, image mosaicing, stereo measurement, image/video search, image/video annotation, rapid model generation, and tools for the evaluation of different algorithms. Originally targeting marine species analytics, it now contains many common algorithms and libraries, and is also useful as a generic computer vision library. The core infrastructure connecting different system components is currently the KWIVER library, which can connect C/C++, python, and matlab nodes together in a graph-like pipeline architecture. Alongside the pipelined image processing system are a number of standalone tools for accomplishing the above. Both a desktop and a web version exist for deployments in different types of environments.

This manual is synced to the VIAME 'master' branch and is updated frequently. You may have to press Ctrl-F5 to see the latest updates and avoid your browser's cached version of this page.


Documentation Overview

There are 5 types of documentation within VIAME:

  1. A quick-start guide meant for first time users using the desktop version

  2. An overview presentation covering the basic design of VIAME

  3. The VIAME Web and DIVE Desktop docs and in-GUI help menu

  4. Our YouTube video channel (work in progress)

  5. This manual, meant for more advanced users and developers


Installing from Pre-Built Binaries

First, download the binaries for your operating system from the main github page:

https://github.com/VIAME/VIAME

Next, use the offline quick start guide located at the below link to complete the installation:

VIAME Offline Installation

Building VIAME From Source

See the platform-specific guides below, though the process is similar for each. This document corresponds to the example located online here and also to the building_and_installing_viame example folder in a VIAME installation.

Building on Linux

These instructions are designed to help build VIAME on a fresh machine. They were written for and tested on Ubuntu 16.04. Other Linux machines will have similar directions, but some steps (particularly the dependency install) may not be exactly identical. VIAME has also been built on: CentOS/RHEL 6+, Fedora 19+, and Ubuntu 16.04+ at a minimum.

Install Dependencies

Different Linux distributions may have different packages already installed, or may use a different package manager than apt, but on Ubuntu this should help to provide a starting point:

sudo apt-get install git zip wget curl libcurl4-openssl-dev libgl1-mesa-dev libexpat1-dev \
  libgtk2.0-dev libxt-dev libxml2-dev libssl-dev liblapack-dev openssl g++ zlib1g-dev

And on CentOS 7:

sudo yum -y groupinstall 'Development Tools'
sudo yum install -y zip git wget openssl openssl-devel zlib zlib-devel freeglut-devel \
  mesa-libGLU-devel lapack-devel libXt-devel libXmu-devel libXi-devel expat-devel readline-devel \
  curl curl-devel atlas-devel file which

If using VIAME_ENABLE_PYTHON, Python version 3.6 or above is required, with development packages and pip installed; 3.6, 3.8, and 3.10 are the most tested versions. For example, Anaconda3 2021.05 (https://repo.anaconda.com/archive/) could be used, or alternatively a native python distribution, e.g. installing python3, python3-dev, and numpy (or whatever other python distribution you want to use):

sudo apt-get install python3.8 python3.8-dev python3-pip

If using VIAME_ENABLE_CUDA for GPU support, you should install CUDA. Version 10.0 or above is required, with 11.7 and 11.6 being the most tested versions; other versions may work depending on your build settings but are not officially supported yet:

https://developer.nvidia.com/cuda-toolkit-archive

Install CMAKE

Depending on the OS, the version of cmake you get with your local package manager (apt/yum/dnf) is sometimes too old to build VIAME (you currently need at least CMake 3.13), so you may or may not need to do a manual install of CMake. First try using the package manager and then running 'cmake --version' to see if it is recent enough. If a manual install is required, go to the cmake website, https://cmake.org/download, and download the appropriate binary distribution (for Ubuntu, this would be something like cmake-3.27.1-Linux-x86_64.sh, though newer versions will be out by the time you read this; for Windows, the .msi or .zip installer). Lastly, the source version could be built using the below instructions, though this is usually not necessary if a binary version is available for your platform.

cd ~/Downloads
tar zxfv cmake-3.27.1.tar.gz
cd cmake-3.27.1
./bootstrap --system-curl --no-system-libs
make
sudo make install
sudo ln -s /usr/local/bin/cmake /bin/cmake

These instructions build the source code into a working executable, install the executable into a personal system directory, and then let the operating system know where that directory is, in case /usr/local/bin isn't in your PATH variable by default.

Clone the Source Code

With all our dependencies installed, we need to build the environment for VIAME itself. VIAME uses git submodules rather than requiring the user to grab each repository separately. To prepare the environment and obtain all the necessary source code, use the following commands. Note that you can change 'src' to whatever you want to name your VIAME source directory.

git clone https://github.com/VIAME/VIAME.git src
cd src
git submodule update --init --recursive

Build VIAME

VIAME may be built with a number of optional plugins (VXL, PyTorch, OpenCV, Scallop-TK, and Matlab), each with a corresponding option called VIAME_ENABLE_[option], in all caps. For each plugin to install, you need a cmake build flag setting the option. The flag looks like -DVIAME_ENABLE_OPENCV:BOOL=ON, changing OPENCV to match the plugin. Multiple plugins may be used, or none. If uncertain what to turn on, it's best to just leave the default enable and disable flags, which will build most (though not all) functionality. At a minimum, these are the core components we recommend leaving turned on:

Flag                      Description
VIAME_ENABLE_OPENCV       Builds OpenCV and basic OpenCV processes (video readers, simple GUIs)
VIAME_ENABLE_VXL          Builds VXL and basic VXL processes (video readers, image filters)
VIAME_ENABLE_PYTHON       Turns on support for using python processes (multiple algorithms)
VIAME_ENABLE_PYTORCH      Installs all pytorch processes (detectors, trackers, classifiers)

And a number of flags which control which system utilities and optimizations are built, e.g.:

Flag                        Description
VIAME_ENABLE_CUDA           Enables CUDA (GPU) optimizations across all processes (OpenCV, Torch, etc.)
VIAME_ENABLE_CUDNN          Enables CUDNN (GPU) optimizations across all processes
VIAME_ENABLE_VIVIA          Builds VIVIA GUIs (tools for making annotations and viewing detections)
VIAME_ENABLE_KWANT          Builds KWANT detection and track evaluation (scoring) tools
VIAME_ENABLE_DOCS           Builds Doxygen class-level documentation for projects (puts in install share tree)
VIAME_BUILD_DEPENDENCIES    Build VIAME as a super-build, building all dependencies (default behavior)
VIAME_INSTALL_EXAMPLES      Installs examples for the above modules into install/examples tree
VIAME_DOWNLOAD_MODELS       Downloads pre-trained models for use with the examples and training new models

And lastly, a number of flags which build algorithms with more specialized functionality:

Flag                         Description
VIAME_ENABLE_TENSORFLOW      Builds TensorFlow object detector plugin
VIAME_ENABLE_DARKNET         Builds Darknet (YOLO) object detector plugin
VIAME_ENABLE_BURNOUT         Builds Burn-Out based pixel classifier plugin
VIAME_ENABLE_SMQTK           Builds SMQTK plugins for image/video search
VIAME_ENABLE_SCALLOP_TK      Builds Scallop-TK based object detector plugin
VIAME_ENABLE_SEAL            Builds Seal Multi-Modality GUI
VIAME_ENABLE_ITK             Builds ITK cross-modality image registration
VIAME_ENABLE_UW_CLASSIFIER   Builds UW fish classifier plugin
VIAME_ENABLE_MATLAB          Turns on support for and installs all matlab processes
VIAME_ENABLE_LANL            Builds an additional (Matlab) scallop detector

VIAME can be built either in the source directory tree or in a separate build directory (recommended). Replace "[build-directory]" with your location of choice, and run the following commands:

mkdir [build-directory]
cd [build-directory]
cmake [build_flags] [path_to_source_tree]
make -j8 # or just make for an unthreaded build

Depending on which enable flags you have set and your system configuration, you may need to set additional cmake variables to point to dependency locations. An example is shown in the screenshot below for a system with CUDA, Python, and Matlab enabled, though the versions shown are old; please do not use CUDA versions below 10 or Python 2.7 anymore.
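As a rough command-line sketch of passing such flags (the paths and flag values here are placeholders to adapt to your system, not exact requirements):

cmake ../src \
  -DCMAKE_BUILD_TYPE:STRING=Release \
  -DVIAME_ENABLE_OPENCV:BOOL=ON \
  -DVIAME_ENABLE_VXL:BOOL=ON \
  -DVIAME_ENABLE_PYTHON:BOOL=ON \
  -DVIAME_ENABLE_PYTORCH:BOOL=ON \
  -DVIAME_ENABLE_CUDA:BOOL=ON \
  -DVIAME_ENABLE_CUDNN:BOOL=ON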

http://www.viametoolkit.org/wp-content/uploads/2017/03/cmake-options.png

Building on Mac OSX

Building on Mac is very similar to Linux, minus the dependency install stage. Currently, we have only tested VIAME with OSX 10.11.5 and Clang 7.3.0, but other versions may also work. Make sure you have a C/C++ development environment set up, install git, install cmake either from source or using a binary installer, and lastly, follow the same Linux build instructions above.

Building on Windows

Building on Windows can be very similar to Linux if using a shell like cygwin (https://www.cygwin.com/), though if not you may want to grab the GUI versions of CMake (https://cmake.org/) and TortoiseGit (https://tortoisegit.org/). Currently Visual Studio 2019 is the supported and most tested version.

First do a Git clone of the source code for VIAME. If you have TortoiseGit this involves right clicking in your folder of choice, selecting Git Clone, and then entering the URL to VIAME (https://github.com/VIAME/VIAME.git) and the location of where you want to put the downloaded source code.

Next, do a git submodule update to pull down all required packages. In TortoiseGit right click on the folder you checked out the source into, move to the TortoiseGit menu section, and select Submodule Update.

Next, install any required dependencies for items you want to build. If using CUDA, version 11.0 or above is desired, along with Python 3.6+. Other versions have yet to be tested extensively, though may work. On Windows it can also be beneficial to use Anaconda to get multiple python packages. Boost Python (turned on by default when Python is enabled) requires Numpy and a few other dependencies.

Finally, create a build folder and run the CMake GUI (https://cmake.org/runningcmake/). Point it to your source and build directories, select your compiler of choice, and set up any build flags you want.

The biggest build issues on Windows arise from building VIAME as a super-build and exceeding the Windows maximum folder path length. This will typically manifest as build errors in the kwiver python libraries. To bypass these errors you have 2 options:

  1. Build VIAME in as high-level a directory as possible (e.g. C:/VIAME) or, alternatively

  2. Set the VIAME_BUILD_KWIVER_DIR path to something short outside of your super-build location, e.g. C:/tmp/kwiver, to bypass path length limits. This is performed, for example, in the nightly build server cmake script: https://github.com/VIAME/VIAME/blob/master/cmake/build_server_windows.cmake

Updating VIAME

If you already have a checkout of VIAME and want to switch branches or update your code, it is important to re-run:

git submodule update --init --recursive

after switching branches, to ensure that you are on the correct hashes of sub-packages within the build (e.g. fletch or KWIVER). Very rarely you may also need to run:

git submodule sync

in case the address of a submodule has changed. You only need to run this command if you get a "cannot fetch hash #hashid" error.

Build Tips ‘n Tricks

Super-Build Optimizations:

When VIAME is built as a super-build, multiple solutions or makefiles are generated for each individual project in the super-build. These can be opened up if you want to experiment with changes in one and not rebuild the entire superbuild. VIAME places these projects in [build-directory]/build/src/* and fletch in [build-directory]/build/src/fletch-build/build/src/*. You can also run ccmake or the cmake GUI in these locations, which can let you manually change the build settings for sub-projects (say, for example, if one doesn’t build).
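For example, to reconfigure and rebuild just the KWIVER sub-project on Linux, something like the following can work (the exact sub-directory name is an assumption and may differ for your build):

cd [build-directory]/build/src/kwiver-build
ccmake .   # optionally adjust sub-project settings
make -j8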

Python:

The default Python used is 3.8, though other versions may work as well. It depends on your build settings, operating system, and which dependency projects are turned on.

Known Build Issues

Issue:

When compiling with CUDA turned on:

nvcc fatal   : Visual Studio configuration file 'vcvars64.bat' could not be found for
installation at 'Microsoft Visual Studio XX.0/VC/bin/x86_amd64/../../..'

or similar.

Solution:

Express/Community versions of Visual Studio don't ship with a file called vcvars64.bat. You can add one manually by placing a bat file called 'vcvars64.bat' in the folder 'Microsoft Visual Studio XX.0\VC\bin\amd64' for your version of Visual Studio. This file should contain just a single line:

CALL setenv /x64

Issue:

Boost fails to build early with error in *_out.txt:

c++: internal compiler error: Killed (program cc1plus)

Solution:

You are likely running out of memory and your C++ compiler is crashing (common on VMs with a small amount of memory). Increase the amount of memory available to your VM or, if not running a VM, use a machine with at least 1 GB of RAM.

Issue:

On VS2015 with Python enabled: error LNK1104: cannot open file 'python27_d.lib'

Solution:

If you want to link against python in debug mode, you'll have to build Python itself to enable debug libraries, as the default python distributions do not contain them. Alternatively, switch to Release or RelWithDebInfo build modes.

Issue:

ImportError: No module named numpy.distutils

Solution:

You have python installed, but not numpy. Install numpy.

Issue:

cannot find cublas_v2.h or linking issues against CUDA

Solution:

VIAME contains a VIAME_DISABLE_GPU_SUPPORT flag due to numerous issues relating to building GPU code. You can either set that flag, or debug the issue itself; common causes are incorrect CUDA drivers for OpenCV, Torch, etc., or CUDA headers not being on your include path.

Issue:

CMake Error at CMakeLists.txt:200 (message):
  Unable to locate CUDNN library

Solution:

You have enabled CUDNN but the system is unable to locate CUDNN, as the message says.

Note CUDNN is installed separately from CUDA; they are different things.

You need to set the VIAME flag CUDNN_LIBRARY to something like /usr/local/cuda/lib64/libcudnn.so. Alternatively you can set CUDNN_ROOT to /usr/local/cuda/lib64 manually if that’s where you installed it.
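For example, when configuring from the command line (the library path shown is just the common default install location):

cmake [path_to_source_tree] -DCUDNN_LIBRARY:FILEPATH=/usr/local/cuda/lib64/libcudnn.so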

Issue:

When VIAME_ENABLE_DOCS is turned on and doing a multi-threaded build, sometimes the build fails.

Solution:

Run make -jX multiple times, or don’t run make -jX when VIAME_ENABLE_DOCS is enabled.

Issue:

CMake says it cannot find MATLAB

Solution:

Make sure your matlab CMake paths are set to something like the following:

Matlab_ENG_LIBRARY:FILEPATH=[matlab_install_loc]/bin/glnxa64/libeng.so
Matlab_INCLUDE_DIRS:PATH=[matlab_install_loc]/extern/include
Matlab_MEX_EXTENSION:STRING=mexa64
Matlab_MEX_LIBRARY:FILEPATH=[matlab_install_loc]/bin/glnxa64/libmex.so
Matlab_MX_LIBRARY:FILEPATH=[matlab_install_loc]/bin/glnxa64/libmx.so
Matlab_ROOT_DIR:PATH=[matlab_install_loc]

Examples Folder Overview

In the '[install]/examples' folder, there are a number of subfolders, each corresponding to a different core functionality. The scripts in each of these folders can be copied to and run from any directory on your computer, the only item requiring change being the 'VIAME_INSTALL' path at the top of each run script. These scripts can be opened and edited in any text editor to point the VIAME_INSTALL path to the location of your installed (or built) binaries. This is true on Windows, Linux, and Mac.
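For example, after editing, the top of a copied Linux run script might look like the following (the install path shown is just a placeholder):

export VIAME_INSTALL=/opt/noaa/viame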

The 'examples' folder is one of two core entry points into running VIAME functionality. The other is to copy the project files for your operating system, '[install]/configs/prj-linux' or '[install]/configs/prj-windows', to a directory of your choice and run things from there. Not all functionality is in the default project file scripts, but they are a good entry point if you just want to get started on object detection and/or tracking.

Each example is run in a different fashion, but there are 3 core commands you need to know in order to run them on Linux:

‘bash’ - for running commands, e.g. ‘bash run_annotation_gui.sh’ which launches the application

‘ls’ - for making file lists of images to process, e.g. ‘ls *.png > input_list.txt’ to list all png image files in a folder

‘cd’ - go into an example directory, e.g. ‘cd annotation_and_visualization’ to move down into the annotation_and_visualization example directory. ‘cd ..’ is another useful command which moves one directory up, alongside a lone ‘ls’ command to list all files in the current directory.
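Putting these together, a typical Linux session to annotate a folder of png images might look like the below (the image path is a placeholder, and whether a given script consumes input_list.txt depends on the example):

cd [install]/examples/annotation_and_visualization
ls ~/data/my_survey/*.png > input_list.txt
bash run_annotation_gui.sh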

To run the examples on Windows, you just need to run (double click) the .bat scripts in the given directories. Additionally, knowing how to make a list of files on the Windows command line, e.g. 'dir > filename.txt', can also be useful for processing custom image lists.

Key Toolkit Capabilities

Object Detection

http://www.viametoolkit.org/wp-content/uploads/2018/02/many_scallop_detections_gui.png

Measuring Fish Lengths Using Stereo

http://www.viametoolkit.org/wp-content/uploads/2018/02/fish_measurement_example.png

Image and Video Search for Rapid Model Generation

http://www.viametoolkit.org/wp-content/uploads/2018/01/search_ex.png

GUIs for Visualization and Annotation

http://www.viametoolkit.org/wp-content/uploads/2018/02/annotation_example.png

Illumination Normalization and Color Correction

http://www.viametoolkit.org/wp-content/uploads/2018/09/color_correct.png

Detector and Tracker Evaluation

http://www.viametoolkit.org/wp-content/uploads/2018/02/scoring-2.png

Image Enhancement and Filtering

http://www.viametoolkit.org/wp-content/uploads/2018/09/color_correct.png

This document corresponds to this example online, in addition to the image_enhancement example folder in a VIAME installation. This directory stores assorted scripts for debayering, color correction, illumination normalization, and general image contrast enhancement.

Build Requirements

These are the build flags required to run this example, if building from the source.

In the pre-built binaries they are all enabled by default.

VIAME_ENABLE_OPENCV set to ON (required)
VIAME_ENABLE_VXL set to ON (optional)

Code Used in Example

plugins/opencv/ocv_debayer_filter.cxx
plugins/opencv/ocv_debayer_filter.h
plugins/opencv/ocv_image_enhancement.cxx
plugins/opencv/ocv_image_enhancement.h
packages/kwiver/arrows/burnout/burnout_image_enhancement.h
packages/kwiver/arrows/burnout/burnout_image_enhancement.cxx

Object Detection Examples

Detection Overview

http://www.viametoolkit.org/wp-content/uploads/2018/02/skate_detection.png

This document corresponds to this example online, in addition to the object_detection example folder in a VIAME installation.

This folder contains assorted examples of object detection pipelines running different detectors such as YOLOv2, ScallopTK, Faster RCNN, and others. Several different models are found in the examples, trained on a variety of different sensors. It can be useful to try out different models to see what works best for your problem.

Running the Command Line Examples

Each run script contains 2 calls: a first ('source setup_viame.sh') which runs a script configuring all paths required to run VIAME, and a second to 'kwiver runner' running the desired detection pipeline. For more information about pipeline configuration, see the pipeline examples. Each example processes a list of images and produces detections in various formats as output, as configured in the pipeline files.
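As a rough sketch of what such a run script contains (the pipeline filename here is illustrative only):

export VIAME_INSTALL=/opt/noaa/viame        # placeholder path to your VIAME install
source ${VIAME_INSTALL}/setup_viame.sh      # configure all required paths
kwiver runner detector_example.pipe         # run the desired detection pipeline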

Each pipeline contains 2-10 nodes, including an imagery source (in this case an image list loader), the actual detector, detection filters, and detection writers. In the habcam example an additional split process is added early in the pipeline, as habcam imagery typically has stereo pairs encoded in the same png.

Running Examples in the GUI

The annotation GUI can also be used to run object detector or tracking pipelines. To accomplish this, load imagery using the annotation gui, then select Tools -> Execute Pipeline and select a pipeline to run, see below. Special notes: the ‘habcam’ pipeline only processes the left sides of images, assuming that the image contains side-by-side stereo pairs, and the ‘svm’ pipeline requires ‘.svm’ model files to exist in a ‘category_models’ directory from where the GUI is run. New pipelines can be added to the GUI by adding them to the default pipelines folder, with the word ‘embedded’ in them by default.

http://www.viametoolkit.org/wp-content/uploads/2018/08/vpview_run_det.png

Build Requirements

These are the build flags required to run these examples, if building from the source. In the pre-built binaries they are all enabled by default.

Minimum:

VIAME_ENABLE_OPENCV (default) (for image reading) or alternatively VIAME_ENABLE_VXL if
you set :image_reader:type to vxl in each .pipe config.

Per-Example:

run_habcam - VIAME_ENABLE_OPENCV, VIAME_ENABLE_DARKNET, VIAME_ENABLE_SCALLOP_TK
run_scallop_tk - VIAME_ENABLE_SCALLOP_TK
run_yolo - VIAME_ENABLE_DARKNET
run_lanl - VIAME_ENABLE_MATLAB

Running Detectors From C++ Code

We will be using a Hough circle detector as an example of the mechanics of implementing a VIAME detector in C++ code.

In general, detectors accept images and produce detections. The data types that we will need to get data in and out of the detector are implemented in the Vital portion of KWIVER. For this detector, we will be using an image_container to hold the input image and a detected_object_set to hold the detected objects. We will look at how these data types behave a little later.

Vital provides an algorithm to load an image. We will use this to get the images for the detector. The image_io algorithm provides a method that accepts a file name and returns an image.

kwiver::vital::image_container_sptr load(std::string const& filename) const;

Now that we have an image, we can pass it to the detector using the following method on hough_circle_detector and get a list of detections.

virtual vital::detected_object_set_sptr detect( vital::image_container_sptr image_data ) const;

The detections, for example, can be drawn on the original image to see how well the detector is performing.

The following program implements a simple single object detector.

#include <arrows/ocv/image_container.h>
#include <arrows/ocv/image_io.h>
#include <arrows/ocv/hough_circle_detector.h>

#include <iostream>
#include <string>

int main( int argc, char* argv[] )
{
  // get file name for input image
  std::string filename = argv[1];

  // create image reader
  kwiver::vital::algo::image_io_sptr image_reader( new kwiver::arrows::ocv::image_io() );

  // Read the image
  kwiver::vital::image_container_sptr the_image = image_reader->load( filename );

  // Create the detector
  kwiver::vital::algo::image_object_detector_sptr detector( new kwiver::arrows::ocv::hough_circle_detector() );

  // Send image to detector and get detections.
  kwiver::vital::detected_object_set_sptr detections = detector->detect( the_image );

  // See what was detected
  std::cout << "There were " << detections->size() << " detections in the image." << std::endl;

  return 0;
}

This sample program implements the essential steps of a detector.

Now that we have a simple program running, there are two concepts supported by vital that are essential for building larger applications: logging and configuration support.

Logging

Vital provides logging support through macros that are used in the code to format and display informational messages. The following piece of code implements a logger and generates a message.

// Include the logger interface
#include <vital/logger/logger.h>

// get a logger or logging object
kwiver::vital::logger_handle_t logger( kwiver::vital::get_logger( "test_logger" ));

float data = 0.0f; // example value to include in the message

// log a message
LOG_ERROR( logger, "Message " << data );

The vital logger is similar to most loggers in that it needs a logging object to provide context for the log message. Each logger object has an associated name that can be used when configuring what logging output should be displayed. The default logger does not provide any logger output control, but there are optional logging providers which do.

There are logging macros that produce a message with an associated severity: error, warning, info, debug, or trace. The log text can be specified as an output stream expression, allowing type-specific output operators to provide formatting. The output line in the above example could have been written as a log message.

kwiver::vital::logger_handle_t logger( kwiver::vital::get_logger( "detector_test" ));
LOG_INFO( logger, "There were " << detections->size() << " detections in the image." );

Note that log messages do not need an end-of-line at the end.

Refer to the separate logger documentation for more details.

Detector Configuration Support

In our detector example we just used the detector in its default state without specifying any configuration options. This works well in this example, but there are cases and algorithms where the behaviour needs to be modified for best results.

Vital provides a configuration package that implements a key/value scheme for specifying configurable parameters. The config parameters are used to control an algorithm, and in later examples they can be used to select the algorithm. The usual approach is to create a config structure from the contents of a file, but the values can be programmatically set as well. The key for a config entry has a hierarchical format.

The full details of the config file structure are available in a separate document.
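As a minimal sketch of setting values programmatically (the keys mirror the hough_circle_detector parameters documented below; exact header locations may vary slightly between KWIVER versions):

#include <vital/config/config_block.h>

// create an empty config block and set values by key
kwiver::vital::config_block_sptr config =
  kwiver::vital::config_block::empty_config();

config->set_value( "dp", 2 );
config->set_value( "min_dist", 120 );

// the block can then be handed to an algorithm via its
// set_configuration() method, as shown in the next example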

All algorithms support the methods get_configuration() and set_configuration(). The get_configuration() method returns a structure with the expected configuration items and default parameters. These parameters can be changed and sent back to the algorithm with the set_configuration() method. For the hough_circle_detector, the configuration is as follows:

dp = 1

Description: Inverse ratio of the accumulator resolution to the
image resolution. For example, if dp=1 , the accumulator has the same
resolution as the input image. If dp=2 , the accumulator has half as
big width and height.

max_radius = 0

Description: Maximum circle radius.

min_dist = 100

Description: Minimum distance between the centers of the detected
circles. If the parameter is too small, multiple neighbor circles may
be falsely detected in addition to a true one. If it is too large,
some circles may be missed.

min_radius = 0

Description: Minimum circle radius.

param1 = 200

Description: First method-specific parameter. In case of
CV_HOUGH_GRADIENT , it is the higher threshold of the two passed to
the Canny() edge detector (the lower one is twice smaller).

param2 = 100

Description: Second method-specific parameter. In case of
CV_HOUGH_GRADIENT , it is the accumulator threshold for the circle
centers at the detection stage. The smaller it is, the more false
circles may be detected. Circles, corresponding to the larger
accumulator values, will be returned first.

Let's modify the preceding detector to accept a configuration file.

#include <vital/config/config_block_io.h>
#include <arrows/ocv/image_container.h>
#include <arrows/ocv/image_io.h>
#include <arrows/ocv/hough_circle_detector.h>

#include <iostream>
#include <string>

int main( int argc, char* argv[] )
{
  // (1) get file name for input image
  std::string filename = argv[1];

  // (2) Look for name of config file as second parameter
  kwiver::vital::config_block_sptr config;
  if ( argc > 2 )
  {
    config = kwiver::vital::read_config_file( argv[2] );
  }

  // (3) create image reader
  kwiver::vital::algo::image_io_sptr image_reader( new kwiver::arrows::ocv::image_io() );

  // (4) Read the image
  kwiver::vital::image_container_sptr the_image = image_reader->load( filename );

  // (5) Create the detector
  kwiver::vital::algo::image_object_detector_sptr detector( new kwiver::arrows::ocv::hough_circle_detector() );

  // (6) If there was a config structure, then pass it to the algorithm.
  if (config)
  {
    detector->set_configuration( config );
  }

  // (7) Send image to detector and get detections.
  kwiver::vital::detected_object_set_sptr detections = detector->detect( the_image );

  // (8) See what was detected
  std::cout << "There were " << detections->size() << " detections in the image." << std::endl;

  return 0;
}

We have added code to handle the optional second command line parameter in section (2). The read_config_file() function converts a file to a configuration structure. In section (6), if a config block has been created, it is passed to the algorithm.

The configuration file is as follows. Note that parameters that are not specified in the file retain their default values.

dp = 2
min_dist = 120
param1 = 100

Configurable Detector Type

To further expand on our example, the actual detector algorithm can be selected at run time based on the contents of our config file.

#include <vital/algorithm_plugin_manager.h>
#include <vital/config/config_block_io.h>
#include <vital/algo/image_object_detector.h>
#include <vital/logger/logger.h>
#include <arrows/ocv/image_container.h>
#include <arrows/ocv/image_io.h>

#include <iostream>
#include <string>

int main( int argc, char* argv[] )
{
  // (1) Create logger to use for reporting errors and other diagnostics.
  kwiver::vital::logger_handle_t logger( kwiver::vital::get_logger( "detector_test" ));

  // (2) Initialize and load all discoverable plugins
  kwiver::vital::algorithm_plugin_manager::load_plugins_once();

  // (3) get file name for input image
  std::string filename = argv[1];

  // (4) Look for name of config file as second parameter
  kwiver::vital::config_block_sptr config = kwiver::vital::read_config_file( argv[2] );

  // (5) create image reader
  kwiver::vital::algo::image_io_sptr image_reader( new kwiver::arrows::ocv::image_io() );

  // (6) Read the image
  kwiver::vital::image_container_sptr the_image = image_reader->load( filename );

  // (7) Create the detector
  kwiver::vital::algo::image_object_detector_sptr detector;
  kwiver::vital::algo::image_object_detector::set_nested_algo_configuration( "detector", config, detector );

  if ( ! detector )
  {
    LOG_ERROR( logger, "Unable to create detector" );
    return 1;
  }

  // (8) Send image to detector and get detections.
  kwiver::vital::detected_object_set_sptr detections = detector->detect( the_image );

  // (9) See what was detected
  std::cout << "There were " << detections->size() << " detections in the image." << std::endl;

  return 0;
}

Since we are going to select the detector algorithm at run time, we no longer need to include the hough_circle_detector header file. New code in section (2) initializes the plugin manager which will be used to instantiate the selected algorithm at run time. The plugin architecture will be discussed in a following section.

The following config file will select and configure our favourite hough_circle_detector.

# select detector type
detector:type =   hough_circle_detector

# specify configuration for selected detector
detector:hough_circle_detector:dp =           1
detector:hough_circle_detector:min_dist =     100
detector:hough_circle_detector:param1 =       200
detector:hough_circle_detector:param2 =       100
detector:hough_circle_detector:min_radius =   0
detector:hough_circle_detector:max_radius =   0

First you will notice that the config file entries have a longer key specification. The ':' character separates the different levels or blocks in the config and enables scoping of the value specifications.

The “detector” string in the config file corresponds with the “detector” string in section (7) of the example. The “type” key for the “detector” algorithm specifies which detector is to be used. If an alternate detector type “foo” were to be specified, the config would be as follows.

# select detector type
detector:type =             foo
detector:foo:param1 =       20
detector:foo:param2 =       10

Since the individual detector (or algorithm) parameters are effectively in their own namespace, configurations for multiple algorithms can be in the same file, which is exactly how more complicated applications are configured.
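As an illustrative sketch (the 'filter' block and its parameter names are hypothetical, shown only to demonstrate the scoping):

# configuration for the detector algorithm
detector:type = hough_circle_detector
detector:hough_circle_detector:dp = 1

# configuration for a second, hypothetical algorithm in the same file
filter:type = example_filter
filter:example_filter:threshold = 0.5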

Sequencing One or More Algorithms in a Pipeline

In a real application, the input images may come from places other than a file on the disk and there may be algorithms applied to precondition the images prior to object detection. After detection, the detections could be overlaid on the input imagery or compared against manual annotations.

Ideally this type of application could be structured to flow the data from one algorithm to the next, but if it is written as one monolithic application, changes become difficult and time consuming. This is where another component of KWIVER, sprokit, can be used to simplify creating a larger application from smaller component algorithms.

Sprokit is the “Stream Processing Toolkit”, a library aiming to make processing a stream of data with various algorithms easy. It provides a data flow model of application building by providing a process and interconnect approach. An application made from several processes can be easily specified in a pipeline configuration file.

Let's first look at an example application/pipeline that runs our hough_circle_detector on a set of images, draws the detections on the image, and then displays the annotated image.

# ================================================================
process input
  :: frame_list_input
  :image_list_file    images/image_list_1.txt
  :frame_time          .3333
  :image_reader:type   ocv

# ================================================================
process detector
  :: image_object_detector
  :detector:type    hough_circle_detector
  :detector:hough_circle_detector:dp            1
  :detector:hough_circle_detector:min_dist      100
  :detector:hough_circle_detector:param1        200
  :detector:hough_circle_detector:param2        100
  :detector:hough_circle_detector:min_radius    0
  :detector:hough_circle_detector:max_radius    0

# ================================================================
process draw
  :: draw_detected_object_boxes
  :default_line_thickness 3

# ================================================================
process disp
  :: image_viewer
  :annotate_image         true
  # pause_time in seconds. 0 means wait for keystroke.
  :pause_time             1.0
  :title                  NOAA images

# ================================================================
# connections
connect from input.image
        to   detector.image

connect from detector.detected_object_set
        to   draw.detected_object_set
connect from input.image
        to draw.image

connect from input.timestamp
        to   disp.timestamp
connect from draw.image
        to   disp.image

# -- end of file --

Our example pipeline configuration file is made up of process definitions and connections. The first process handles image input and uses a configuration style we saw in the description of selectable algorithms to select an "ocv" reader algorithm. The next process is the detector, followed by the process that composites the detections and the image. The last process displays the annotated image. The connections section specifies how the inputs and outputs of these processes are connected.

This pipeline can then be run using the 'kwiver runner' app.
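For example, assuming the above pipeline was saved to a file named hough_detector.pipe (filename illustrative) and setup_viame.sh has been sourced from your install:

kwiver runner hough_detector.pipe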

Object Tracking Examples

http://www.viametoolkit.org/wp-content/uploads/2018/02/computed_track_example.png

This document corresponds to this example online, in addition to the object_tracking example folder in a VIAME installation.

This folder contains object tracking examples using an assortment of trackers. Additional tracking examples will be added in the future.

Detection File Formats and Conversions

This document corresponds to this example online, in addition to the “detection file
conversions” example folder in a VIAME installation.

This folder contains examples of different formats which VIAME supports, and additionally
how to convert between textual formats representing object detections, tracks, results,
etc. There are multiple ways to perform format conversions, either using KWIVER pipelines
with reader/writer nodes (e.g. see pipelines directory) or using quick standalone
scripts (see scripts). Conversion pipelines are simple, containing a detection input
node (reader) and output node (writer).
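A minimal conversion pipeline might look like the following sketch; the process and option names here are from memory and may differ slightly, so consult the files in the pipelines directory of this example for the exact syntax:

process reader
  :: detected_object_input
  :file_name     input_detections.csv
  :reader:type   viame_csv

process writer
  :: detected_object_output
  :file_name     output_detections.kw18
  :writer:type   kw18

connect from reader.detected_object_set
        to   writer.detected_object_set
connect from reader.image_file_name
        to   writer.image_file_name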

Integrated Detection Formats

A subset of the output ASCII formats already integrated into VIAME is listed below. New formats can be integrated into the system by implementing a derived version of the vital::detected_object_set_input or vital::read_object_track_set classes in C++ or python, which produce either detected_object_sets or object_track_sets, respectively.

VIAME CSV - System Default Comma Separated Value Detection Format

There are 3 parts to a VIAME csv. First, 9 required comma separated fields, with
a single line for either each detection, or each detection state, in a track:

- 1: Detection or Track Unique ID
- 2: Video or Image String Identifier
- 3: Unique Frame Integer Identifier
- 4: TL-x (top left of the image is the origin: 0,0)
- 5: TL-y
- 6: BR-x
- 7: BR-y
- 8: Auxiliary Confidence (how likely is this actually an object)
- 9: Target Length

Where detections can be linked onto tracks on multiple frames via sharing the
same track ID field. Depending on the context (image or video) the second field
may either be video timestamp or an image filename. Field 3 is a unique frame
identifier for the frame in the given video or loaded sequence, starting from 0
not 1. Fields 4 through 7 represent a bounding box for the target in the imagery.
Depending on the context, auxiliary confidence may represent how likely this
detection is an object, or it may be the confidence in the length measurement,
if present. If length measurement is not present, it can be specified with a
value less than 0, most commonly “-1”.

Next, a sequence of optional species <=> score pairs, also comma separated:

- 10,11+ : class-name, score (this pair may be omitted or repeated)

There can be as many class, score pairs as necessary (e.g. fields 12 and 13, 14
and 15, etc.). In the case of tracks, which may span multiple lines and thus
have multiple probabilities per line, the probabilities from the last state in
the track should be treated as the aggregate probability for the track, and it
is okay for prior states to have no probability, to avoid respecifying it. In the
class and score list, the highest scoring entries should typically be listed first.

Lastly, optional categorical values can be associated with each detection in the
file, after the species/class pairs. Attributes are given via a keyword followed
by any space separated values the attribute may have. Possible attributes are:

(kp) head 120 320 [optional head, tail, or arbitrary keypoints]
(atr) is_diseased true [attribute keyword then boolean or numeric value]
(note) this is a note [notes take no form just can’t have commas]
(poly) 12 455 40 515 25 480 [a polygon for the detection]
(hole) 38 485 39 490 37 470 [a hole in a polygon for a detection]
(mask) ./masks/mask02393.png [a reference to an external pixel mask image]

Throwing together all of these components, an example line might look like:

1,image.png,0,104,265,189,390,0.32,1.5,flounder,0.32,(kp) head 120 320
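For instance, a hypothetical two-state track sharing track ID 2 across two frames, with no length measurement and the class probability given only on the final state, might look like:

2,frame000.png,0,104,265,189,390,0.87,-1
2,frame001.png,1,110,270,195,395,0.92,-1,flounder,0.92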

This file format is supported by most GUIs and detector training tools. It can
be used by specifying the 'viame_csv' keyword in any readers or writers.

COCO JSON - Common Objects in Context

COCO JSON is a schema popularized by the COCO academic computer vision
competitions, but is now also used more widely in other applications, for
example in the CVAT annotation tool. It is defined at https://cocodataset.org

Compared to the CSV format, COCO JSON files are typically larger but much more
extensible and structured, and have more capacity for optional fields.

The COCO JSON reader/writer can be specified in config files using ‘coco’.

HABCAM - Space or comma separated annotation format used by the HabCam project

A typical habcam annotation looks like:

201503.20150517.png 527 201501 boundingBox 458 970 521 1021

which corresponds to: image_name, species_id (the species id to label mapping is
kept separately), date, annot_type [either boundingBox, line, or point], tl_x, tl_y, br_x, br_y.

For the point type, only 1 set of coordinates is provided.

An alternative format, that the reader also supports, looks like:

201503.20150517.png,527,scallop,"""line"": [[458, 970], [521, 1021]]"

which is more or less the same as the prior, just formatted differently.

The habcam reader/writer can be specified in config files using ‘habcam’.

KW18 (Deprecated) - Kitware KW18 Column Separated Track Format

KW18s are a space separated file format for representing detections or tracks.

Each KW18 file has a header stating its contents, as follows:

# 1:Track-id 2:Track-length 3:Frame-number 4:Tracking-plane-loc(x) 5:Tracking-plane-loc(y)
6:velocity(x) 7:velocity(y) 8:Image-loc(x) 9:Image-loc(y) 10:Img-bbox(TL_x)
11:Img-bbox(TL_y) 12:Img-bbox(BR_x) 13:Img-bbox(BR_y) 14:Area 15:World-loc(x)
16:World-loc(y) 17:World-loc(z) 18:timestamp 19:track-confidence

The kw18 reader/writer can be specified in config files using ‘kw18’.

KWIVER CSV (Deprecated) - Additional Comma Separated Value Detection Format

A detection only CSV format contains 1 detection per line, with each line as follows:

- 1: frame number
- 2: file name
- 3: TL-x (top left of the image is the origin: 0,0)
- 4: TL-y
- 5: BR-x
- 6: BR-y
- 7: detection confidence
- 8,9+ : class-name score (this pair may be omitted or repeated)

The kwiver reader/writer can be specified in config files using 'csv'. We recommend
you don't use it for anything.

Example Conversions

There are multiple ways to perform format conversions, either using KWIVER pipelines
with reader/writer nodes (e.g. see pipelines directory in this example directory) or
using quick standalone scripts (see scripts). Conversion pipelines are simple,
containing a detection input node (reader) and output node (writer) and can be run
with the ‘kwiver runner’ command line tool.

Length Measurement Examples

http://www.viametoolkit.org/wp-content/uploads/2018/02/fish_measurement_example.png

Running the Demo

This section corresponds to this example online, in addition to the measurement_using_stereo example folder in a VIAME installation. This folder contains examples covering fish measurement using stereo. This example is currently a work in progress.

Run CMake to automatically download the demo data into this example folder. Alternatively you can download the demo data directly.

Setup:

Make sure you build VIAME with VIAME_ENABLE_PYTHON=True and VIAME_ENABLE_OPENCV=True.

For simplicity this tutorial will assume that the VIAME source directory is [viame-source] and the build directory is [viame-build]. Please modify these as needed to match your system setup. We also assume that you have built VIAME.

Additionally this example requires an extra python dependency to be installed. On Linux or Windows, ‘pip install ubelt’.

Running via the pipeline runner

To run the process using the sprokit C++ pipeline, we use the pipeline runner:

# First move to the example directory
cd [viame-build]/install/examples/measurement_using_stereo

# The below script runs pipeline runner on the measurement_example.pipe
bash run_measurer.sh

This example runs at about 4.0Hz, and takes 13.3 seconds to complete on a 2017 i7 2.8Ghz Dell laptop.

Running via installed opencv python module

The above pipeline can alternatively be run as a python script.

# move to your VIAME build directory
cd [viame-build]
# Run the setup script to setup the proper paths and environment variables
source install/setup_viame.sh

# you may also want to set these environment variables
# export KWIVER_DEFAULT_LOG_LEVEL=debug
export KWIVER_DEFAULT_LOG_LEVEL=info
export SPROKIT_PYTHON_MODULES=kwiver.processes:viame.processes

You should be able to run the help command:

python -m viame.processes.opencv.ocv_stereo_demo --help

The script can be run on the demodata via

python -m viame.processes.opencv.ocv_stereo_demo \
    --left=camtrawl_demodata/left --right=camtrawl_demodata/right \
    --cal=camtrawl_demodata/cal.npz \
    --out=out --draw -f

Running via the standalone script

Alternatively, you can run it by specifying the path to the opencv module (if you have a python environment, you should be able to run this without even building VIAME):

# First move to the example directory
cd [viame-source]/examples/measurement_using_stereo

# Run the camtrawl module directly via the path
python ../../plugins/opencv/python/viame/processes/opencv \
    --left=camtrawl_demodata/left --right=camtrawl_demodata/right \
    --cal=camtrawl_demodata/cal.npz \
    --out=out --draw -f

Without the --draw flag, this example runs at about 2.5Hz and takes 20 seconds to complete on a 2017 i7 2.8Ghz Dell laptop.

With --draw it takes significantly longer (it runs at 0.81 Hz and takes over a minute to complete), but will output images like the one at the top of this readme as well as a CSV file.

Note that the KWIVER C++ sprokit pipeline offers a significant speedup (4Hz vs 2.5Hz), although it currently does not have the ability to output the algorithm visualization.

Calibration File Format

For the npz file format, the root object should be a python dict with the following keys and values:

R: extrinsic rotation matrix
T: extrinsic translation
cameraMatrixL: dict of intrinsic parameters for the left camera
    fc: focal length
    cc: principal point
    alpha_c: skew
cameraMatrixR: dict of intrinsic parameters for the right camera
    fc: focal length
    cc: principal point
    alpha_c: skew
distCoeffsL: distortion coefficients for the left camera
distCoeffsR: distortion coefficients for the right camera

For the mat file format, the root structure should be a dict with the key Cal whose value is a dict with the following items:

om: extrinsic rotation vector (note the rotation matrix is rodrigues(om))
T: extrinsic translation
fc_left: focal length of the left camera
cc_left: principal point
alpha_c_left: skew
kc_left: distortion coefficients for the left camera
fc_right: focal length of the right camera
cc_right: principal point
alpha_c_right: skew
kc_right: distortion coefficients for the right camera

Detector Training Examples

This document corresponds to this example online, in addition to the object_detector_training example folder in a VIAME installation.

The common detector training API is used for training multiple object detectors from the same input format, for both experimentation and deployment purposes. By default, each detector has a default training process that handles issues such as automatically reconfiguring networks for different output category labels, while simultaneously allowing for more customization by advanced users.

Future releases will also include the ability to use stereo depth maps in training, alongside additional forms of data augmentation and more easily definable data source nodes for alternative input file structures.

Input data used for training should be put in the following format:

[root_training_dir]
…labels.txt
…folder1
……image001.png
……image002.png
……image003.png
……groundtruth.csv
…folder2
……image001.png
……image002.png
……groundtruth.csv

where groundtruth can be in any file format for which a "detected_object_set_input" implementation exists (e.g. viame_csv, kw18, habcam), and labels.txt contains a list of output categories (one per line) for the trained detection model. "labels.txt" can also contain any alternative names in the groundtruth which map back to the same output category label. For example, see training_data/labels.txt and the corresponding groundtruth file in training_data/seq1. The "labels.txt" file allows the user to selectively train models for certain sub-categories or super-categories of object by specifying only the categories of interest to train a model for, plus any synonyms for the same category on the same line.
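As an illustrative sketch (the category names are hypothetical, and the exact synonym delimiter should be checked against the sample training_data/labels.txt), a labels.txt mapping several groundtruth names onto two output categories might look like:

scallop live_scallop swimming_scallop
flounder flatfish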

After formatting data, a model can be trained via the 'viame_train_detector' tool; the only modification required from the scripts in this folder is setting your .conf files to the correct groundtruth file format type.

Build Requirements

These are the build flags required to run this example, if building from the source.

In the pre-built binaries they are all enabled by default.

VIAME_ENABLE_OPENCV set to ON
VIAME_ENABLE_PYTHON set to ON
VIAME_ENABLE_DARKNET set to ON (for yolo_v2 training)
VIAME_ENABLE_SCALLOP_TK set to ON (for scallop_tk training)

Code Used in Example

plugins/core/viame_train_detector.cxx
packages/kwiver/vital/algo/train_detector.h
packages/kwiver/vital/algo/train_detector.cxx
packages/kwiver/vital/algo/detected_object_set_input.h
packages/kwiver/vital/algo/detected_object_set_input.cxx
packages/kwiver/arrows/darknet/darknet_trainer.h
packages/kwiver/arrows/darknet/darknet_trainer.cxx
plugins/core/detected_object_set_input_habcam.h
plugins/core/detected_object_set_input_habcam.cxx

Video and Image Search Examples

Overview

This document corresponds to this folder online, in addition to the search_and_rapid_model_generation example folder in a VIAME installation.

This directory contains methods to accomplish two tasks:

(a) Performing exemplar-based searches on an archive of unannotated imagery or videos
(b) Quickly training up detection models for new categories of objects on the same ingest

Video and Image Archive Search using VIAME

Video archive search can be performed via a few methods. The default includes a pipeline which generates object detections, tracks, and lastly temporal descriptors around each track. The descriptors get indexed into an arbitrary data store (typically a nearest neighbor index, locality-sensitive hashing table, or other). At query time, descriptors on a query image or video are matched against the entries in this database. A default GUI (provided via the VIVIA toolkit) allows performing iterative refinement of the results by annotating which were correct or incorrect, in order to build up a better model for the input query. This model can be for a new object category (or sub-category attribute) and saved to an output file to be reused again in future pipelines or query requests. Input regions to query against can either be full frame descriptors, regions around just object detections, or, lastly, object tracks.

Initial Setup

Building and running this example requires either a VIAME install or a build from source with:

(a) The python packages: numpy, pymongo, torch, torchvision, matplotlib, and python-tk
(b) A VIAME build with VIAME_ENABLE_SMQTK, YOLO, OPENCV, PYTORCH, VXL, and VIVIA enabled.

First, you should decide where you want to run this example from. Doing it in the example folder tree is fine as a first pass, but if it is something you plan on running a few times or on multiple datasets, you probably want to select a different place in your user space to store generated databases and model files. This can be accomplished by making a new folder and either copying the scripts (.sh, .bat) from this example into this new directory, or copying the project files located in [VIAME-INSTALL]/configs/prj-linux (or prj-windows) to this new directory. After copying these scripts to the directory you want to run them from, make sure the "VIAME_INSTALL" line at the top of each script points to the location of your VIAME installation (as shown below) if your installation is in a non-default directory or you copied the example files elsewhere. If using Windows, all '.sh' scripts mentioned below will be '.bat' scripts that you should be able to just double-click to run.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_0_new_project.png

Ingest Image or Video Data

First, create_index.[type].sh should be called to initialize a new database, and populate it with descriptors generated around generic objects to be queried upon. Here, [type] can either be ‘around_detections’, ‘detection_and_tracking’, or ‘full_frame_only’, depending on if you want to run matching on spatio-temporal object tracks, object detections, or full frames respectively (see VIAME quick start guide). If you want to run it on a custom selection of images, make a file list of images called ‘input_list.txt’ containing your images, one per line. For example, if you have a folder containing png images, run ‘ls [folder]/*.png > input_list.txt’ on the command line to make this list. Alternatively, if ingesting videos, make a directory called ‘videos’ which contains all of your .mpg, .avi, .etc videos. If you look in the ingest scripts, you can see links to these sources if you wish to change them. Next run the ingest script, as below.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_1_ingest.png

This should take a little while to run; if the process is successful, it should look like the screenshot below. If you already have a database present in your folder, it will ask you if you want to remove it.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_2_ingest.png

If your ingest was successful, you should get a message saying 'ingest complete' with no errors in your output log. If you get an error and are unable to decipher it, send a copy of your database/Logs folder and console output to 'viame.developers@gmail.com'.

Perform an Image Query

After performing an ingest, 'bash launch_search_interface.sh' should be called to launch the GUI.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_3_launch_gui.png

In this example, we will first start with an image query.

Select, in the top left, Query -> New

From the Query Type drop down, select Image Exemplar

Next select an image to use as an exemplar of what you are looking for. This image can take one of two forms, either a large image containing many objects including your object of interest, or a cropped out version of your object.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_4_new_query.png

Whatever image you give, the system will generate a full-frame descriptor for your entire image alongside sub-detections on regions smaller than the full image.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_5_query_result.png

Select the box you are most interested in.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_6_select_fish.png

Press the down arrow to highlight it (the selected box should light up in green). Press okay on the bottom right, then okay again on the image query panel to perform the query.

Optionally, the below four instructions are an aside on how to generate an image chip just showing your object of interest. They can be ignored if you don't need them. If the default object proposal techniques are not generating boxes around your object for a full frame, you can use this method and then select the full frame descriptor around the object. In the below we used the free GIMP painter tool to crop out a chip (install it using 'sudo apt-get install gimp' on Ubuntu, or from https://www.gimp.org/ on Windows).

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_7_crop_fish.png

Right click on your image in your file browser, select ‘Edit with Gimp’, press Ctrl-C to open the above dialogue, highlight the region of interest, press enter to crop.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_8_cropped_fish.png

Save out your crop to wherever you want, preferably somewhere near your project folder.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_9_select_fish_again.png

Now you can put this chip through the image query system, instead of the full frame one.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_10_initial_results.png

Regardless which method you use, when you get new results they should look like this. You can select them on the left and see the entries on the right. Your GUI may not look like this depending on which windows you have turned on, but different display windows can be enabled or disabled in Settings->Tool Views and dragged around the screen.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_11_initial_results.png

Results can be exported by highlighting entries and selecting Query -> Export Results, either in the default VIAME csv format or in several others. You can show multiple entries at the same time by highlighting them all (hold Shift, click the first entry then the last), right-clicking on them, and selecting ‘Show Selected Entries’.

Train an IQR Model
http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_12_adjudacation.png

When you perform an initial query, you can annotate results as to their correctness in order to generate a model for the query concept. This can be accomplished via a few key-presses: either right-click on an individual result and select the appropriate option, or highlight an entry and press ‘+’ or ‘-’ on your keyboard for faster annotation.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_13_feedback.png

You might want to annotate entries from both the top results list and the requested feedback list (bottom left in the above); this can improve the performance of your model significantly. After annotating your entries, press ‘Refine’ on the top left.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_14_next_n_results.png

There we go, that’s a little better, isn’t it?

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_15_next_n_results.png http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_16_next_n_results.png

Okay, these guys are a little weird, but nothing another round of annotations can’t fix.

After you’re happy with your models, you should export them (Query -> Export IQR Model) to a directory called ‘category_models’ in your project folder for re-use on both new and larger datasets.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_17_saved_models.png

The category models directory should contain only .svm model files.

Re-Run Models on Additional Data

If you have one or more .svm model files in your category_models folder, you can run the ‘bash process_list_using_models.sh’ script in your project folder. This can be run either on the same data you just processed or on new data. By default, this script consumes the supplied input_list.txt and produces a detection file called ‘svm_detections.csv’ containing a probability for each input model in the category_models directory per detection. Alternatively, this pipeline can be run from within the annotation GUI.
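
As a rough sketch, assuming the project folder is laid out with the default names used above (the .svm file name below is purely illustrative):

# project folder contents before running:
#   input_list.txt          list of images to process
#   category_models/        one or more exported .svm model files
#     my_query_concept.svm
bash process_list_using_models.sh
# output: svm_detections.csv, one probability per model per detection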

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_18_produced_detections.png

The resultant detection .csv file is in the same common format that most other examples in VIAME use. You can load this detection file up in the annotation GUI and select a detection threshold for your newly-trained detector, see here. You can use these models on any imagery; it doesn’t need to be the same imagery you trained them on.
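
For reference, a single row of that csv typically follows the standard VIAME csv column ordering (detection/track id, image or video name, frame number, box TL_x, TL_y, BR_x, BR_y, detection confidence, target length, then repeated class/score pairs); the values below are made up for illustration:

1,image001.png,0,104,50,296,182,0.89,-1,my_query_concept,0.89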

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_19_edited_detections.png
Correct Results and Train a Better Model

If you have a detection .csv file for corresponding imagery, and want to train a better (deep) model for the data, you can first correct any mistakes (either mis-classifications, grossly incorrect boxes, or missed detections) in the annotation GUI. To do this, set a detection threshold you want to annotate at, do not change it, and make the boxes as perfect as possible at this threshold. Override any incorrectly computed classification types, and create new detections for objects which were missed by the initial model. Export a new detection csv (File->Export Tracks) after correcting as many boxes as you can. Lastly, feed this into the ground-up detector training example, making sure to set, in the [train].sh script you use for new model training, the same threshold you used during annotation.

http://www.viametoolkit.org/wp-content/uploads/2018/07/iqr_20_edited_detections.png
Tuning Algorithms (Advanced)

Coming Soon….

Rapid Model Generation

Rapid model generation can be performed using the same method as image and video search (above), just saving out the resultant trained detection models after performing iterative query refinement. These models can then be used in detection pipelines, or further refined or used in future video searches.

GUIs for Visualization and Annotation

http://www.viametoolkit.org/wp-content/uploads/2018/02/annotation_example_painted.png

This document corresponds to this example online, in addition to the annotation_and_visualization example folder in a VIAME installation.

There are a number of GUIs in the system. As part of the VIVIA package, the vpView GUI, the current default desktop annotator, is useful for displaying detections and their respective probabilities, for running existing automated detectors, and for making new annotations in video. There are additionally simpler GUIs which can be enabled in .pipe files. vpView can either be pointed directly to imagery, pointed to a compressed video file (see [install-dir]/configs/prj-*/for_videos), or given an input prj file that points to the location of input imagery and any optional settings (e.g. groundtruth, computed detections, and/or homographies for the input data). If you just want to use the tool to make annotations, you don’t need to specify the latter three; just set a DataSetSpecifier or [recommended] use the File->New Project option to load imagery directly without a prj file. Also, see the below example guide and videos.

There are several default run scripts in this folder. “launch_view_interface” launches the main vpView annotation and results display GUI, while “run_display_pipe” runs the simpler in-pipeline display GUI. Lastly, “run_chip_pipe” creates image chips, and “run_draw_pipe” does the same as the display pipe, only writing the images with boxes drawn on top of them out to file.
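
For example, assuming the scripts carry the same .sh extension as the other examples in this manual:

bash launch_view_interface.sh   # main vpView annotation / results display GUI
bash run_display_pipe.sh        # simpler in-pipeline display GUI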

vpView Annotation Process Overview

Notable Annotation GUI Shortcut Keys

  • r = Zoom back to the full image

  • hold ctrl + drag = create a box in annotation mode (create detection/track)

vpView GUI Project File Overview

Examples of the optional contents of loadable prj files are listed below for quick reference; for those not familiar with the tool, downloading the above manual is best. Project files are no longer required (imagery can be opened directly via the ‘New Project’ dropdown); however, they are listed here for advanced users who may want to configure projects with multiple homographies. A short example project file is sketched after the parameter list below.

Note: the list is not complete; it currently focuses on the most used (and newest) parameters.

  • DataSetSpecifier = filename (or glob) Filename with a list of images for each frame, or a glob for a sequence of images

  • TracksFile = filename Filename containing the tracks data.

  • TrackColorOverride = r g b rgb color, specified from 0 to 1, overrides the default vpView track color for this project only

  • ColorMultiplier = x Event and track colors are scaled by the indicated value. Can be used in conjunction with the TrackColorOverride

  • EventsFile = filename Filename containing the events data.

  • ActivitiesFile = filename Filename containing the activity data.

  • SceneElementsFile = filename Filename containing the scene elements (in a json file).

  • AnalysisDimensions = W H Dimensions (in pixel/image coordinates) of the AOI. Ignored when using a mode that leverages image corner points for image transformation.

  • OverviewOrigin = U V Offset of image data such that “0, 0” occurs at the AOI origin. Should always be negative #’s. Like AnalysisDimensions, unused when image corner points are used.

  • AOIUpperLeftLatLon = lat lon Required for “Translate image” mode of corner point usage (Tools->Configure->Display); also required for displaying an AOI when the source imagery isn’t ortho-stabilized

  • AOIUpperRightLatLon = lat lon

  • AOILowerLeftLatLon = lat lon

  • AOILowerRightLatLon = lat lon If the UpperLeft and LowerRight are specified, an AOI “box” can be displayed. Depending on the nature of the homography controlling image display / transformation, additional corner points may improve the designation of the region.

  • FrameNumberOffset = N Positive value to offset imagery relative to the track/event data. A value of 3 would mean that the 1st image would correspond to track frame 2 (0-based numbering)

  • ImageTimeMapFile = filename Specifies file containing map of “filename <space> timestamp (in seconds)” one line per frame. The file can be created via File->Export Image Time Stamps

  • HomographyIndexFile = filename Specifies a file containing the frame number/timestamp/homography sequence for all frames specified by the DataSetSpecifier. If the tag is set and the number of homographies matches the image source count, the “Image-loc”s of the tracks (not the “Img-bbox”) are stored in the coordinate frame mapped to by the homographies. This enables track trails during playback (for source imagery that isn’t stored in stabilized form).

  • HomographyReferenceFrame = frame index Specifies the frame to use as the reference homography frame for stabilizing the video (if homographies are present). If, instead of stabilizing the video, the homographies should be used to stabilize the tracks, set the HomographyReferenceFrame to -1 (defaults to 0).

  • FiltersFile = filename (note, support not yet in master) Specifies file containing definitions of spatial filters for the project. The coordinate system (lat/lon or image/pixels) must be consistent between states when the filters are saved versus when the project / filter file is loaded.

  • IgnoreImageCoords = true/false (note, support not yet in master) In the world mode, ignores the image bounding box data and shows a head (point/dot) at the end of the track tail. This parameter only affects the track head and not the tail.

  • ColorWindow = W (defaults to 255) Window / range of input color values that will be mapped. The value gives the total range, not the distance from the median.

  • ColorLevel = L (defaults to 127) Input color value that will be mapped to the median output value, and also serves as the median value of the input color range.
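
For illustration, a minimal hypothetical prj file using a few of the parameters above might look like the following (file names and values are placeholders):

DataSetSpecifier = frames/*.png
TracksFile = computed_tracks.kw18
TrackColorOverride = 0.1 0.9 0.1
FrameNumberOffset = 0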

Scoring Detectors and Trackers

http://www.viametoolkit.org/wp-content/uploads/2018/02/scoring-2.png

This document corresponds to this example online, in addition to the scoring_and_roc_generation example folder in a VIAME installation.

The KWANT package provides scoring tools that can be used to calculate the probability of detecting an item, along with other scoring metrics such as ROC curves, specificity, sensitivities, etc. The input to these tools must be in the Kitware kw18 format. Several scripts are provided to convert other formats (such as habcam annotations and Scallop-tk outputs) to kw18 format. The format is very simple so additional converters can be easily created.

An example of running scoring tools can be found here. The scoring tool takes two files: the actual detections in the truth file and the computed detections. The computed detections are scored against the truth file to give a set of statistics as shown below. Additional parameters that can be passed to the tool and other options can be found in the KWANT documentation.

HADWAV Scoring Results:
   Detection-Pd: 0.748387
   Detection-FA: 8
   Detection-PFA: 0.0338983
   Frame-NFAR: not computed
   Track-Pd: 0.748387
   Track-FA: 8
   Computed-track-PFA: 0.0338983
   Track-NFAR: not computed
   Avg track (continuity, purity ): 13.693, 1
   Avg target (continuity, purity ): 20.1419, 0.748387
   Track-frame-precision: 0.947826

The tool was originally written to analyze object tracks in full motion video imagery so some of the terminology and calculated metrics may not apply.

One main metric is the probability of detection Pd. This is calculated as follows:

Pd = (num detections match truth) / (num truth)

Detection files can be written in the kw18 format by using the appropriate writer in the pipeline or by running one of these converters. One downside to using the kw18 writer in the pipeline is that the image file name is not captured. All the converters take the same set of command line options. For example:

Usage: habcam_to_kw18.pl [opts] file
  Options:
    --help                     print usage
    --write-file file-name     Write image file/index correspondence to file
    --read-file  file-name     Read image file/index correspondence to file

In order to get the best statistics, the number of images processed must be the same as the number of images in the truth set. Computed detections and truth are compared on a per-image basis, so the truth entries must be limited to the same set of images as the computed detections. The options to these converters aid in this regard.

Calculated detections are converted first and use the --out-file option to write out the list of files processed. The truth set is processed next with the --in-file option referring to the file created in the previous step. The --cache-only flag should be added to this second conversion to cause images not present in the first step to be skipped.

The score_tracks tool is run as follows:

score_tracks --computed-tracks computed_det.kw18 --truth-tracks ground_truth2.kw18

A full list of the options can be coaxed from the tool by using the -? option.

Video Archive Summarization

This document corresponds to this example online, in addition to the archive_summarization example folder in a VIAME installation.

This example covers scripts which simultaneously create a searchable index of a video archive and plots detailing different organism counts over time. The ‘summarize_and_index_videos’ script performs both of these tasks, while the ‘summarize_videos’ script only performs the latter. Plots are generated, for each video, in the ‘database’ output folder, and can alternatively be viewed via the ‘launch_timeline_viewer’ script. Queries can be performed via the ‘launch_search_interface’ script, in a fashion similar to both the ‘search_and_rapid_model_generation’ example and the binary install guide.
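
As a rough sketch, assuming the default ‘videos’ input folder convention used elsewhere in this manual and that the scripts carry a .sh extension:

mkdir videos
cp /path/to/archive/*.mpg videos/
bash summarize_and_index_videos.sh   # builds the searchable index and per-video plots
bash launch_timeline_viewer.sh       # alternative way to view the generated plots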

Frame Level Classification

Overview

This document corresponds to this online example, in addition to the frame_level_classification example folder in a VIAME installation.

Frame level classification is useful for computing properties such as whether or not an organism is present anywhere within a frame (as opposed to counting individual instances of it), or for performing techniques such as background (substrate) classification.

Two methods are provided for training: SVM models, which are useful for cases with less training data, and deep (ResNet50) models for standard deep learning classification when many training samples are available. A third option for generating full-frame classifiers is to use search and rapid model generation to produce them during video search. When there are many full-frame labels, the deep learning method generally yields the best performance.

Training data must be supplied in a similar format to object detector training, that is, the below directory structure (where ‘…’ indicates a subdirectory):

[root_training_dir]
…labels.txt
…folder1
……image001.png
……image002.png
……image003.png
……groundtruth.csv
…folder2
……image001.png
……image002.png
……groundtruth.csv

where the groundtruth can be in any file format for which a “detected_object_set_input” implementation exists (e.g. viame_csv, kw18, habcam), and labels.txt contains a list of output categories (one per line) for the trained model. “labels.txt” can also contain any alternative names in the groundtruth which map back to the same output category label; for example, see training_data/labels.txt and the corresponding groundtruth file in training_data/seq1. The “labels.txt” file allows the user to selectively train models for certain sub-categories or super-categories of object by specifying only the categories of interest to train a model for, and any synonyms for the same category on the same line.
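
For instance, a hypothetical labels.txt for two output categories, where the extra names on each line are groundtruth spellings that map back to the same trained category, might look like:

fish fish_unspecified juvenile_fish
scallop sea_scallop swimming_scallop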

New Module Creation Examples

This document corresponds to this runnable example of these simple example plugins, alongside these example plugin templates. Additionally, these can be found in the [viame-install]/examples/hello_world_pipeline folder of a VIAME installation, and the [viame-source]/plugins/hello_world and [viame-source]/plugins/templates folders of the source tree, respectively. Throughout these folders are example object detectors, image filters, and image classifier implementations written in both Python and C++.

Simple C++ Detector Plugin Example

A new detector plugin can be added by creating a class that implements the kwiver::vital::algo::image_object_detector interface. This interface is defined in an abstract base class in file vital/algo/image_object_detector.h. Similar interfaces exist for several other types of functions.

The directory plugins/templates/cxx contains files that can be used as a starting point for implementing new detectors. These files contain markers in the form @text@ that are to be replaced with the string (the title) of your new detector.

All files in that directory should be copied to a new directory and renamed as appropriate. The files template_detector.{cxx,h} should be renamed to a name that indicates the specific detector being implemented.

The CMakeLists.txt and register_algorithms.cxx files should keep their original names. Change the following placeholders in all files to personalize the detector.

@template@ - name of the detector.

@template_lib@ - name of the plugin library that will contain the detector. Can be the same name as the detector.

@template_dir@ - name of the source subdirectory containing the detector files. For example if the detector is in the directory plugins/ex_fish_detector, then ‘template_dir’ should be replaced with ‘ex_fish_detector’.

The placeholders also appear in capital letters, indicating that the replacement string should be capitalized.
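
As a rough sketch, using the ‘ex_fish_detector’ name from the example above (the sed commands below are just one way to perform the substitution; the capitalized placeholder variants should be handled the same way):

# copy the C++ template into a new plugin directory and rename the sources
cp -r plugins/templates/cxx plugins/ex_fish_detector
cd plugins/ex_fish_detector
mv template_detector.cxx ex_fish_detector.cxx
mv template_detector.h ex_fish_detector.h

# substitute the placeholders in all of the copied files
for f in *.cxx *.h CMakeLists.txt; do
  sed -i -e 's/@template@/ex_fish_detector/g' \
         -e 's/@template_lib@/ex_fish_detector/g' \
         -e 's/@template_dir@/ex_fish_detector/g' "$f"
done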

The main work that has to be done to integrate a detector into the VIAME framework is to convert the input image from the VIAME format to the format needed by the detector, and to convert the detections to a detected_object_set as needed by the framework.

Many detectors take images in OpenCV matrix format. This data structure can be extracted from the image_container_sptr that is available using the following code:

// input image is kwiver::vital::image_container_sptr image_data
// CV format image is extracted using the following line
cv::Mat cv_image = kwiver::arrows::ocv::image_container::vital_to_ocv( image_data->get_image() );

Now that you have the image in a compatible format, it can be passed to the detector. Detectors usually return a set of bounding boxes, each annotated with one or more classification labels. These boxes can be converted to a detected_object_set using code along the lines of the following (here ‘detector_output’ and its fields are hypothetical stand-ins for whatever structure your particular detector returns):

// Allocate a detected object set that we will fill with new detections
auto detected_objects = std::make_shared< kwiver::vital::detected_object_set >();

// 'detector_output' is a placeholder for whatever container of results your
// detector produces; each entry is assumed to hold box coordinates and a set
// of (label, score) classifications
for( auto const& det : detector_output )
{
    // Create a bounding box from the values returned. The new box takes
    // coordinates in the following order: left, top, right, bottom.
    // If the detector does not return exactly these values, they are
    // easy to calculate
    kwiver::vital::bounding_box_d bbox( det.left, det.top, det.right, det.bottom );

    // Create a new detected object type structure. This is used to hold the
    // classification labels and associated probabilities or scores.
    auto dot = std::make_shared< kwiver::vital::detected_object_type >();

    for( auto const& cls : det.classifications )
    {
        // Add the class name and probability to the detected object type
        dot->set_score( cls.label, cls.score );
    }

    // Now that we have processed one detected object (as defined by a bounding box)
    // it has to be added to the detected_object_set
    detected_objects->add(
      std::make_shared< kwiver::vital::detected_object >( bbox, 1.0, dot ) );
}

// When all detections have been processed, the detected object set for this input
// image is just returned from the detect() method
return detected_objects;

Python Detector Plugin

Similarly to the above C++ object detector, the python templates in the above directory can be copied into a new plugin module, and the template keywords replaced with a module name of your choosing.

External Plugin Creation

This document corresponds to this example online, in addition to the external_plugin_creation example folder in a VIAME installation.

This directory contains the source files needed to make a loadable algorithm plugin implementation external to VIAME: one which links against an existing installation or, in the case of Python, produces a loadable script. This is for cases where you might want to build a plugin against pre-compiled binaries instead of building all of VIAME itself.

The procedure is slightly different depending on whether you are developing an external C++ or Python module. C++ modules require linking your code against VIAME, the output of which is a plugin library (DLL or shared object) which can be used directly in VIAME pipelines. Python processes can be made without compilation and placed on your PYTHONPATH for use by the plugin system.
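
For example, once a Python module has been written, making it visible to the plugin system can be as simple as the following (the path is a placeholder for wherever your module lives):

export PYTHONPATH=/path/to/my_external_plugin:$PYTHONPATH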