Getting started

Overview

Label Studio is a self-contained Web application for multi-typed data labeling and exploration. The backend is written in pure Python powered by Flask. The frontend part is a backend-agnostic React + MST app, included as a precompiled script.

Here are the main concepts behind Label Studio’s workflow:

Main modules

The main modules of LS are


Relations among tasks, completions and results

Here you can see relations among labeling objects: tasks, completions, results, etc.

Each user provides one completion. A completion is atomic and consists of result items. Result items can be linked by directed relations of three types: left-right, right-left, or bidirectional. Normalizations are additional information, in a custom string format, about the current result item.



Completions and predictions are very similar, but predictions are generated automatically by ML models.
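The relations described above can be sketched as a small data model. This is an illustrative sketch only, not Label Studio's actual storage format: the class and field names (ResultItem, Relation, Completion, Prediction, Task, and their attributes) are assumptions chosen to mirror the prose, and the code needs Python 3.7 or greater for dataclasses.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The three relation directions named in the text.
DIRECTIONS = ("left-right", "right-left", "bidirectional")


@dataclass
class ResultItem:
    id: str
    value: dict
    # Additional information about this item in a custom string format.
    normalization: Optional[str] = None


@dataclass
class Relation:
    from_id: str
    to_id: str
    direction: str = "left-right"

    def __post_init__(self):
        # Reject directions other than the three listed above.
        if self.direction not in DIRECTIONS:
            raise ValueError("unknown direction: %r" % self.direction)


@dataclass
class Completion:
    # One user's atomic labeling result for a task.
    result: List[ResultItem] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)


@dataclass
class Prediction(Completion):
    """Same shape as a completion, but produced by an ML model."""


@dataclass
class Task:
    data: dict
    completions: List[Completion] = field(default_factory=list)
    predictions: List[Prediction] = field(default_factory=list)


# Hypothetical example: two labeled spans linked by one directed relation.
person = ResultItem(id="r1", value={"text": "Alice", "label": "Person"})
company = ResultItem(id="r2", value={"text": "Acme", "label": "Organization"})
completion = Completion(
    result=[person, company],
    relations=[Relation("r1", "r2", "left-right")],
)
task = Task(data={"text": "Alice works at Acme"}, completions=[completion])
```

Modeling Prediction as a subclass of Completion reflects the statement that the two are very similar and differ mainly in who produces them.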

Usage statistics

Label Studio collects anonymous usage statistics about the number of page requests and the data types used in the labeling config. No sensitive information is collected. These statistics help us improve labeling quality and decide what to develop next.

Quickstart

Prerequisites

Label Studio supports Python 3.5 or greater on Linux, Windows, and macOS.

Note: on Windows, the default installation may fail to build the lxml package. In that case, consider installing lxml manually from the unofficial Windows binaries. For example, if you are running on x64 with Python 3.8, run pip install lxml-4.5.0-cp38-cp38-win_amd64.whl.

Running with pip

To install Label Studio via pip, you need Python 3.5 or greater. Run:

pip install label-studio

Then launch a new project which stores all labeling data in a local directory my_project:

label-studio start my_project --init

The default browser opens automatically at http://localhost:8080.

Running with Docker

Label Studio is also distributed as a Docker container. Make sure you have Docker installed on your local machine.

Install and start Label Studio at http://localhost:8080, storing all labeling data in the ./my_project directory:

docker run --rm -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest

Note: if the ./my_project folder already exists, an exception will be thrown. Delete this folder or use the --force option.
Note: on Windows, you have to modify the volume paths set by the -v option.
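One way to avoid shell-specific path syntax such as `pwd` is to build the docker run command from a script. The following is a minimal sketch, assuming Docker accepts an absolute host path on your platform. The function name docker_run_args is hypothetical, and the flags and image name are copied from the command above.

```python
from pathlib import Path


def docker_run_args(host_dir="my_project"):
    """Return the docker run command as an argument list.

    The host directory is resolved to an absolute path, so the command does
    not depend on shell features such as backtick substitution.
    """
    host_path = Path(host_dir).resolve()
    return [
        "docker", "run", "--rm",
        "-p", "8080:8080",
        # Host path on the left, container path (from the docs) on the right.
        "-v", "%s:/label-studio/my_project" % host_path,
        "--name", "label-studio",
        "heartexlabs/label-studio:latest",
    ]


args = docker_run_args()
```

Passing the list to subprocess.run(args, check=True) would start the container. That call is left out here so the sketch runs without Docker installed.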

You can override the default startup command by appending any of the available command line arguments:

docker run -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init --force --template image_mixedlabel

If you want to build a local image, run:

docker build -t heartexlabs/label-studio:latest .

Running from source

If you want to use nightly builds or extend the functionality, consider downloading the source code using Git and running Label Studio locally:

git clone https://github.com/heartexlabs/label-studio.git
cd label-studio
python setup.py develop

Then create a new project that stores all labeling data in a local directory my_project:

label-studio start my_project --init

The default browser will open automatically at http://localhost:8080.

Multisession mode

You can start Label Studio in multisession mode, where each browser session creates its own project named after the associated session ID.

To launch Label Studio in multisession mode and keep all projects in a separate directory session_projects, run:

label-studio start-multi-session --root-dir ./session_projects

Command line arguments

You can specify input tasks, project config, machine learning backend and other options via the command line interface. Run label-studio start --help to see all available options.

Auth with login and password

You can restrict access to an LS instance with basic HTTP auth:

label-studio start my_project --username user --password pwd

Alternatively, put the username and password in the project config.json:

{ 
 ...
 "username": "user", 
 "password": "pwd",
 ...
}

For Docker, you need to set the environment variables USERNAME and PASSWORD.

All users share the same username and password.
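A script that talks to a protected instance has to send the same credentials with each request. Here is a minimal sketch using only the Python standard library, assuming the server accepts a standard HTTP Basic Authorization header. The helper name basic_auth_request is hypothetical, and the URL and credentials are the example values used above.

```python
import base64
import urllib.request


def basic_auth_request(url, username, password):
    """Build a urllib Request carrying an HTTP Basic Authorization header."""
    credentials = "%s:%s" % (username, password)
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


req = basic_auth_request("http://localhost:8080/", "user", "pwd")
# urllib.request.urlopen(req) would send the request; it is omitted here so
# the sketch runs without a live server.
```

Basic auth only encodes the credentials and does not encrypt them, so it is best combined with HTTPS.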

WSGIServer instead of Flask

Use the --use-gevent option on start to enable the WSGI server. It replaces the default app.run with gevent’s WSGIServer so that the server handles concurrent requests better.

label-studio start test --use-gevent

HTTPS & SSL

You can enable the HTTPS protocol for Flask or WSGIServer. You need to generate an SSL certificate and key for it, e.g.:

openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 365 -out certificate.pem

Then use the --cert and --key options on start:

label-studio start test --cert certificate.pem --key key.pem

Health check

LS has a special endpoint for health checks:

/api/health
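A monitoring script could poll this endpoint. The sketch below assumes that an HTTP 200 response means the instance is healthy; the response body is not documented here, so it is ignored. The function names health_url and is_healthy are hypothetical, and the default base URL is the local address used in the Quickstart.

```python
import urllib.error
import urllib.request


def health_url(base_url):
    """Join the base URL and the health endpoint path."""
    return base_url.rstrip("/") + "/api/health"


def is_healthy(base_url="http://localhost:8080", timeout=5.0):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not healthy.
        return False
```

For example, is_healthy("http://localhost:8080") returns False while the server is still starting and the port refuses connections.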