Machine learning backend

You can connect your favorite machine learning framework to Label Studio using the Label Studio Machine Learning SDK.

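This lets you use model predictions in the labeling interface, trigger model training from Label Studio, and apply active learning, as described in the sections that follow.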

Tutorials

Create your own ML backend

See the examples in the label_studio/ml/examples directory of the repository.

Quickstart

The following steps show how to run the ML backend with a simple text classifier:

  1. Clone the repository
    git clone https://github.com/heartexlabs/label-studio
  2. Set up the environment (run these and the following commands from the repository root)
    cd label-studio
    pip install -e .
    pip install -r label_studio/ml/examples/requirements.txt
  3. Create a new ML backend
    label-studio-ml init my_ml_backend --script label_studio/ml/examples/simple_text_classifier.py
  4. Start the ML backend server
    label-studio-ml start my_ml_backend
  5. Run Label Studio, connecting it to the running ML backend:

    label-studio start text_classification_project --init --template text_sentiment --ml-backends http://localhost:9090

    You can confirm that the model has connected properly on the /model subpage of the Label Studio UI.

  6. Get predictions
    You should see model predictions in the labeling interface. For example, in an image classification task, the model
    pre-selects an image class for you to verify.

  7. Train the model

    Model training can be triggered manually by clicking the Start Training button on the /model page, or with an API call:

    curl -X POST http://localhost:8080/api/train

    In development mode, training logs are written to the console. In production mode, runtime logs are available in
    my_backend/logs/uwsgi.log, and RQ training logs in my_backend/logs/rq.log.
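The training API call above can also be issued from a script. The following is a minimal sketch using only the Python standard library. It assumes Label Studio is reachable at http://localhost:8080, as in the curl example; the helper names `build_train_request` and `trigger_training` are hypothetical and not part of Label Studio. The response body format is not described here, so the sketch returns only the HTTP status code.

```python
import urllib.request

# Assumption: Label Studio runs locally on port 8080, as in the curl example.
LABEL_STUDIO_URL = "http://localhost:8080"


def build_train_request(base_url=LABEL_STUDIO_URL):
    """Build the request equivalent to `curl -X POST <base_url>/api/train`."""
    url = base_url.rstrip("/") + "/api/train"
    # No request body is sent, mirroring the curl call.
    return urllib.request.Request(url, method="POST")


def trigger_training(base_url=LABEL_STUDIO_URL, timeout=10):
    """Send the training request and return the HTTP status code."""
    request = build_train_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With Label Studio running, calling `trigger_training()` sends the same request as the curl command. A connection failure or an HTTP error status raises an exception from `urllib`, which a calling script may want to catch.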

Start with Docker Compose

Label Studio ML scripts include everything you need to create a production-ready ML backend server powered by Docker. The server uses a uWSGI + supervisord stack and handles background training jobs using RQ.

After running this command:

label-studio-ml init my-ml-backend --script label_studio/ml/examples/simple_text_classifier.py

the my-ml-backend/ directory contains the configuration files needed to build and run a Docker image using docker-compose.

Some preliminaries:

  1. Ensure all requirements are specified in the my-ml-backend/requirements.txt file, e.g. add

    scikit-learn
  2. Ensure no services are currently running on ports 9090 and 6379 (otherwise, change the default ports in my-ml-backend/docker-compose.yml)

Then, from the my-ml-backend/ directory, run:

docker-compose up

The server starts listening on port 9090, and you can connect it to Label Studio by specifying --ml-backends http://localhost:9090
or via the UI on the Model page.
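Before running docker-compose, it can help to check that the default ports are actually free. The sketch below tries to bind a TCP socket to each port on the local loopback interface. The function names are hypothetical, and the check only covers 127.0.0.1, so a service bound to another specific interface would not be detected.

```python
import socket

# Default ports named in the preliminaries: 9090 (ML backend) and 6379.
DEFAULT_PORTS = (9090, 6379)


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding fails when another process already holds the port.
            return False
    return True


def busy_ports(ports=DEFAULT_PORTS, host="127.0.0.1"):
    """Return the ports from `ports` that are already in use."""
    return [port for port in ports if not port_is_free(port, host)]
```

If `busy_ports()` returns a non-empty list, either stop the conflicting services or change the corresponding ports in my-ml-backend/docker-compose.yml.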

Active Learning

The process of creating annotated training data for supervised machine learning models is often expensive and time-consuming. Active Learning is a branch of machine learning that seeks to minimize the total amount of data required for labeling by strategically sampling observations that provide new insight into the problem. In particular, Active Learning algorithms seek to select diverse and informative data for annotation (rather than random observations) from a pool of unlabeled data using prediction scores.

Depending on the type of prediction score, you can select a sampling strategy.

Read more about active learning sampling on the task page.
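As an illustration of score-based sampling, the sketch below implements least-confidence (uncertainty) sampling: tasks whose prediction score is lowest are selected for annotation first. This is one common strategy and not necessarily the one Label Studio applies. The function name and the `{task_id: score}` input format are assumptions made for this example.

```python
def least_confident(scores, k):
    """Return the ids of the k tasks with the lowest prediction score.

    `scores` maps a task id to the model's confidence in its top
    prediction, a number in [0, 1] (hypothetical input format).
    """
    # Sort by score ascending; ties are broken by task id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (item[1], item[0]))
    return [task_id for task_id, _ in ranked[:k]]


# Hypothetical prediction scores for five unlabeled tasks.
scores = {1: 0.98, 2: 0.51, 3: 0.75, 4: 0.55, 5: 0.91}
print(least_confident(scores, 2))  # → [2, 4]
```

Tasks 2 and 4 are chosen because the model is least sure about them, so labeling them is expected to be more informative than labeling a random pair.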