Daimo

A Labeling System for Machine Learning Applications

Introduction

In machine learning and deep learning, you need to train, validate and tune your model. In particular, you need labeled data, produced in a fast, secure and high-quality way. But labeling big datasets takes time and effort, and human labor is expensive. That's why we developed DAIMO: a semi-automated tool explicitly conceived to support the labeling phase of machine-learning projects, reducing time, costs and errors.

We know that DAIMO is not the first tool of this kind. However, it provides a set of features that set it apart from the others:

Interface

DAIMO provides a simple and effective interface to explore pre-defined collections of samples to label. Users have at their disposal a rich set of controls to search, query and summarize the collection of samples. This is a crucial requirement for any training task that involves thousands of objects.

Extension

DAIMO allows users to add labels either to an entire instance or to portions of it. This is an important advantage over tools that only support the first option.
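As an illustration of the difference between whole-instance and partial labels, here is a minimal Python sketch. It is not DAIMO's data model: the `Label` class, its fields and the use of character offsets for text samples are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Label:
    """Hypothetical label record; `span` is None for whole-instance labels."""
    sample_id: str
    value: str
    # Assumed convention: (start, end) character offsets, end exclusive.
    span: Optional[Tuple[int, int]] = None

    def covers(self, text: str) -> str:
        """Return the part of `text` this label applies to."""
        if self.span is None:
            return text
        start, end = self.span
        return text[start:end]


text = "Invoice 42 was paid late"
whole = Label("doc-1", "finance")                      # labels the entire sample
part = Label("doc-1", "invoice_number", span=(8, 10))  # labels only "42"
```

Keeping the span optional lets both kinds of label live in the same collection, so search and review controls can treat them uniformly.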

Collaboration

DAIMO is aimed at collaborative labeling by a group of experts. It provides sophisticated controls to lock, unlock, save and review labels. It also supports both a single-step process and a clerical-review process.

Vocabularies

DAIMO allows users to define label vocabularies, in order to standardize the way in which labels are assigned to samples. This matters because labels are not always known in advance: in large labeling applications they often need to be constructed along the way.
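The idea of a vocabulary that standardizes labels while still growing during the project can be sketched as follows. This is an illustrative Python example, not DAIMO's implementation: the `Vocabulary` class and its normalization rule (collapse whitespace, ignore case) are assumptions.

```python
class Vocabulary:
    """Hypothetical label vocabulary: a growing set of canonical terms."""

    def __init__(self, terms=()):
        self._terms = {}  # normalized key -> canonical spelling
        for term in terms:
            self.add(term)

    @staticmethod
    def _key(term: str) -> str:
        # Assumed normalization: collapse whitespace and ignore case.
        return " ".join(term.split()).casefold()

    def add(self, term: str) -> str:
        """Register a term if it is new and return its canonical form."""
        return self._terms.setdefault(self._key(term), " ".join(term.split()))

    def __contains__(self, term: str) -> bool:
        return self._key(term) in self._terms

    def __len__(self) -> int:
        return len(self._terms)


vocab = Vocabulary(["Invoice", "Receipt"])
canonical = vocab.add("  invoice ")  # already known, so the stored spelling is returned
vocab.add("Purchase order")          # a new term constructed along the way
```

Returning the canonical spelling from `add` means two experts who type the same label slightly differently end up assigning the same term.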

Suggestion

DAIMO provides a suggestion-based labeling process. In essence, DAIMO incorporates an engine that is able to learn labeling strategies from examples. After some initial training, it not only collects new labels from users but also suggests them, so that users only need to accept or reject DAIMO’s suggestions.
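The accept-or-reject loop described above can be sketched in a few lines of Python. The engine below is a deliberately simple word-count voting scheme chosen for illustration; it is not DAIMO's actual engine, and the names `Suggester` and `review` are hypothetical.

```python
from collections import Counter, defaultdict


class Suggester:
    """Toy suggester: proposes the label whose past examples share the most words."""

    def __init__(self):
        self._word_counts = defaultdict(Counter)  # label -> word frequencies

    def learn(self, text: str, label: str) -> None:
        self._word_counts[label].update(text.lower().split())

    def suggest(self, text: str):
        words = text.lower().split()
        scores = {
            label: sum(counts[w] for w in words)
            for label, counts in self._word_counts.items()
        }
        best = max(scores, key=scores.get, default=None)
        # With no overlapping words there is nothing sensible to suggest.
        return best if best is not None and scores[best] > 0 else None


def review(suggester, text, user_label):
    """One labeling step: the user's label is final, and the engine learns from it."""
    suggestion = suggester.suggest(text)
    accepted = suggestion == user_label
    suggester.learn(text, user_label)
    return suggestion, accepted


s = Suggester()
s.learn("payment overdue invoice", "finance")
s.learn("server crashed at night", "ops")
first = review(s, "invoice payment received", "finance")  # suggestion matches the user
second = review(s, "unknown topic", "other")              # no suggestion available yet
```

Every reviewed sample, whether the suggestion was accepted or rejected, becomes a new training example, which is why the share of accepted suggestions would be expected to grow over time.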

Classifiers

DAIMO supports both custom classifiers, i.e. classifiers written by data practitioners, and generated classifiers, i.e. those that DAIMO generates automatically.

This approach transforms the labeling process from the inside out: after a while it is DAIMO, not the user, that does most of the work!