DeepSpeech Docker image available

A new Docker image is available for training DeepSpeech models on a GPU backend. The image builds on Mozilla's DeepSpeech 0.7.4 training image and adds the following functionality:

  • support for MP3 and OGG files, not just WAV

  • automatic alphabet generation from the transcripts

  • splitting of sound files into chunks based on detected pauses

  • batch processing of audio files (optionally continuous)

  • symlinks for many of the Python utilities

More information on the Docker image is available on GitHub:

github.com/waikato-datamining/tensorflow/tree/master/deepspeech
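
The automatic alphabet generation listed above can be pictured with a minimal Python sketch. It is not the code shipped in the image; the function name build_alphabet and the lower-casing step are assumptions made for illustration. The sketch collects every distinct character from the transcripts and writes one character per line, the layout commonly used for DeepSpeech alphabet files.

```python
def build_alphabet(transcripts):
    """Return the sorted list of distinct characters used in the transcripts."""
    chars = set()
    for text in transcripts:
        # Assumption: transcripts are lower-cased and stripped before counting characters.
        chars.update(text.strip().lower())
    return sorted(chars)


# Hypothetical transcripts, for illustration only.
transcripts = ["hello world", "Good day"]
alphabet = build_alphabet(transcripts)

# One character per line; the space character becomes a line holding a single space.
alphabet_text = "\n".join(alphabet) + "\n"
```

Sorting keeps the output stable between runs, so the character-to-index mapping does not change as long as the set of characters stays the same.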

wai.annotations release 0.4.0

A new release of wai.annotations is out now: 0.4.0

This release is a major refactoring that introduces domains other than object detection in images. Another notable change is the ability to add filters to the conversion pipeline. This will, e.g., allow datasets to be augmented while they are being converted. For object detection datasets, such augmentations can be rotation, cropping, brightening/darkening, etc. Although many frameworks already support basic augmentation transformations, automatically transforming the bounding boxes rather than the object shapes can lead to strange and undesirable artifacts (e.g., overly large boxes, overlapping annotations).
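
The "overly large boxes" effect can be reproduced with a small Python sketch. It does not use the wai.annotations API; the helper names rotate, bounding_box and area, as well as the example shape, are made up for illustration. A thin diagonal object is rotated by 45 degrees, once by rotating its polygon and once by rotating only the corners of its bounding box, and the resulting axis-aligned boxes are compared.

```python
import math


def rotate(points, degrees, center=(0.0, 0.0)):
    """Rotate 2D points counter-clockwise around the given center."""
    angle = math.radians(degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cx, cy = center
    return [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in points
    ]


def bounding_box(points):
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) around the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def area(box):
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) * (y_max - y_min)


# Hypothetical thin, diagonal object; its bounding box is mostly empty space.
shape = [(0, 0), (1, 0), (10, 9), (10, 10), (9, 10), (0, 1)]
box = bounding_box(shape)
corners = [(box[0], box[1]), (box[2], box[1]), (box[2], box[3]), (box[0], box[3])]

# Box derived from the rotated object shape versus box derived from the rotated box corners.
from_shape = bounding_box(rotate(shape, 45, center=(5, 5)))
from_box = bounding_box(rotate(corners, 45, center=(5, 5)))

print(area(from_shape), area(from_box))
```

In this example the box derived from the rotated corners covers roughly ten times the area of the box derived from the rotated shape, which is the kind of artifact that augmenting at the shape level avoids.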

YOLACT/YOLACT++ Docker images available

The first Docker images are available for the YOLACT (You Only Look At CoefficienTs) image segmentation model, using the 1.2 release:

github.com/waikato-datamining/yolact

YOLACT/YOLACT++, like MMDetection, is based on PyTorch. In contrast to MMDetection, it is not a diverse framework but a single model, which can make use of various base models (e.g., ResNet50). Similar to Mask R-CNN, it generates a mask with probabilities for each identified object, which requires post-processing to determine a polygon or minimal rectangle. However, it is much faster than the Mask R-CNN implementation that TensorFlow's Object Detection API offers (roughly 3x faster on 1000x300 images).
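
The post-processing step mentioned above can be sketched in plain Python. This is an illustration only, not the code used in the Docker images; the function name mask_to_rectangle and the threshold of 0.5 are assumptions. The sketch thresholds a per-pixel probability mask and returns the smallest axis-aligned rectangle that encloses all pixels at or above the threshold.

```python
def mask_to_rectangle(mask, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) of the pixels with probability >= threshold.

    mask is a list of rows, each row a list of probabilities in [0, 1].
    Returns None if no pixel reaches the threshold.
    """
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, probability in enumerate(row):
            if probability >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical 3x4 probability mask for a single detected object.
mask = [
    [0.1, 0.2, 0.1, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.2, 0.0],
]
rectangle = mask_to_rectangle(mask)
```

Deriving a polygon instead would need a contour-tracing step on the thresholded mask, which is beyond this sketch.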

MMDetection Docker image available

The first Docker images are available for the MMDetection object detection framework, using the 2019-11-30 code base (close to 1.0rc1) of MMDetection:

github.com/waikato-datamining/mmdetection/tree/master/2019-11-30

MMDetection is based on PyTorch rather than TensorFlow and offers a larger number of pre-trained models and example configurations than TensorFlow's Object Detection API. It is developed by the Multimedia Laboratory of the Chinese University of Hong Kong.

First Releases

First release of our open-source Python library wai.annotations for converting various image annotation formats (licensed under Apache 2.0).

Our Django plugin for managing teams and their projects, called simple-django-teams, is now publicly available as well (licensed under BSD-3).