diff --git a/site/content/en/docs/manual/advanced/automatic-annotation.md b/site/content/en/docs/manual/advanced/automatic-annotation.md
index 543e85edf1f..3805509d7ad 100644
--- a/site/content/en/docs/manual/advanced/automatic-annotation.md
+++ b/site/content/en/docs/manual/advanced/automatic-annotation.md
@@ -2,54 +2,117 @@
 title: 'Automatic annotation'
 linkTitle: 'Automatic annotation'
 weight: 16
-description: 'Guide to using the automatic annotation of tasks.'
+description: 'Automatic annotation of tasks'
 ---

-Automatic Annotation is used for creating preliminary annotations.
-To use Automatic Annotation you need a DL model that can be deployed by a CVAT administrator.
-You can find the list of available models in the `Models` section.
+Automatic annotation in CVAT is a tool that you can use
+to automatically pre-annotate your data with pre-trained models.

-1. To launch automatic annotation, you should open the dashboard and find a task which you want to annotate.
-   Then click the `Actions` button and choose option `Automatic Annotation` from the dropdown menu.
+CVAT can use models from the following sources:
+
+- [Pre-installed models](#models).
+- Models integrated from [Hugging Face and Roboflow](#adding-models-from-hugging-face-and-roboflow).
+- [Self-hosted models deployed with Nuclio](/docs/manual/advanced/serverless-tutorial/).
+
+The following table describes the available options:
+
+|                                         | Self-hosted            | Cloud                                           |
+| --------------------------------------- | ---------------------- | ----------------------------------------------- |
+| **Price**                               | Free                   | See [Pricing](https://www.cvat.ai/pricing/cloud) |
+| **Models**                              | You have to add models | You can use pre-installed models                |
+| **Hugging Face & Roboflow integration** | Not supported          | Supported                                       |
+
+See:
+
+- [Running Automatic annotation](#running-automatic-annotation)
+- [Labels matching](#labels-matching)
+- [Models](#models)
+- [Adding models from Hugging Face and Roboflow](#adding-models-from-hugging-face-and-roboflow)
+
+## Running Automatic annotation
+
+To start automatic annotation, do the following:
+
+1. On the top menu, click **Tasks**.
+1. Find the task you want to annotate and click **Actions** > **Automatic annotation**.

    ![](/images/image119_detrac.jpg)

-1. In the dialog window select a model you need. DL models are created for specific labels, e.g.
-   the Crossroad model was taught using footage from cameras located above the highway and it is best to
-   use this model for the tasks with similar camera angles.
-   If it's necessary select the `Clean old annotations` checkbox.
-   Adjust the labels so that the task labels will correspond to the labels of the DL model.
-   For example, let’s consider a task where you have to annotate labels “car” and “person”.
-   You should connect the “person” label from the model to the “person” label in the task.
-   As for the “car” label, you should choose the most fitting label available in the model - the “vehicle” label.
-   If the chosen model supports automatic attributes detection
-   (like facial expressions, for example: ``serverless/openvino/omz/intel/face-detection-0205``),
-   you can also map attributes between the DL model and your task.
-   The task requires to annotate cars only and choosing the “vehicle” label implies annotation of all vehicles,
-   in this case using auto annotation will help you complete the task faster.
-   Click `Submit` to begin the automatic annotation process.
+1. In the Automatic annotation dialog, from the drop-down list, select a [model](#models).
+1. [Match the labels](#labels-matching) of the model and the task.
+1. (Optional) If you need the model to return masks as polygons, turn on the **Return masks as polygons** toggle.
+1. (Optional) If you need to remove all previous annotations, turn on the **Clean old annotations** toggle.

    ![](/images/image120.jpg)

-1. At runtime - you can see the percentage of completion.
-   You can cancel the automatic annotation by clicking on the `Cancel`button.
+1. Click **Annotate**.
+
+CVAT will show the progress of annotation on the progress bar.
+
+![Progress bar](/images/image121_detrac.jpg)
+
+You can stop the automatic annotation at any moment by clicking **Cancel**.
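+
+If you prefer to script this step, the same operation can be started
+through the server's REST API. The snippet below is a minimal sketch
+rather than an official recipe: it assumes an instance with automatic
+annotation enabled, and endpoint paths, payload fields, and the exact
+mapping schema can differ between CVAT versions (check `/api/docs` on
+your server). The function ID, task ID, and labels are placeholders.
+
+```python
+import time
+
+import requests
+
+CVAT_URL = "http://localhost:8080"  # your CVAT instance
+AUTH = ("user", "password")         # basic auth; token auth also works
+
+payload = {
+    "function": "openvino-omz-public-yolo-v3-tf",  # hypothetical model ID
+    "task": 42,                                    # hypothetical task ID
+    "cleanup": True,                               # same as "Clean old annotations"
+    # Model label -> task label; some CVAT versions expect a nested
+    # form such as {"car": {"name": "vehicle"}} instead.
+    "mapping": {"car": "vehicle", "person": "person"},
+}
+
+# Start the automatic annotation request.
+response = requests.post(f"{CVAT_URL}/api/lambda/requests", json=payload, auth=AUTH)
+response.raise_for_status()
+request_id = response.json()["id"]
+
+# Poll until the request finishes; a DELETE on the same URL cancels it,
+# like the Cancel button in the UI.
+while True:
+    state = requests.get(f"{CVAT_URL}/api/lambda/requests/{request_id}", auth=AUTH).json()
+    if state.get("status") in ("finished", "failed"):
+        break
+    time.sleep(5)
+```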
+
+## Labels matching
+
+Each model is trained on a dataset and supports only that dataset's labels.
+
+For example:
+
+- The DL model has the label `car`.
+- Your task (or project) has the label `vehicle`.
+
+To annotate, you need to match these two labels to give
+CVAT a hint that, in this case, `car` = `vehicle`.
+
+If your task has a label that is not among the
+DL model's labels, you will not be able to
+match it.
+
+For this reason, each DL model is suitable only
+for tasks with certain labels.
+
+To check the list of labels for each model, see the [Models](#models)
+section and each model's paper or official documentation.
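+
+For illustration, here is what a label mapping can look like when
+automatic annotation is driven through the API (the simple form is
+shown; the exact schema your CVAT version expects may differ, and the
+label names are examples only):
+
+```python
+# Keys are the DL model's labels, values are your task's labels.
+# Several model labels may map to the same task label; model labels
+# left out of the mapping are ignored during annotation.
+label_mapping = {
+    "car": "vehicle",    # the model's "car" detections become "vehicle"
+    "truck": "vehicle",  # grouped under the same task label
+    "person": "person",  # names can also match exactly
+}
+
+# A task label that no model label maps to (e.g. "traffic sign" for a
+# vehicles-only model) cannot be matched and stays unannotated.
+```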
+
+## Models
+
+Automatic annotation uses pre-installed and added models.
+
+> For self-hosted solutions,
+> you need to [install Automatic Annotation first](/docs/administration/advanced/installation_automatic_annotation/)
+> and [add models](/docs/manual/advanced/models/).
+
+List of pre-installed models:
+
+| Model | Description |
+| ----- | ----------- |
+| Attributed face detection | Three OpenVINO models work together: <br><br>• [Face Detection 0205](https://docs.openvino.ai/2022.3/omz_models_model_face_detection_0205.html): a face detector based on a MobileNetV2 backbone with a FCOS head, for indoor and outdoor scenes shot by a front-facing camera. <br>• [Emotions recognition retail 0003](https://docs.openvino.ai/2022.3/omz_models_model_emotions_recognition_retail_0003.html#emotions-recognition-retail-0003): a fully convolutional network for recognition of five emotions (‘neutral’, ‘happy’, ‘sad’, ‘surprise’, ‘anger’). <br>• [Age gender recognition retail 0013](https://docs.openvino.ai/2022.3/omz_models_model_age_gender_recognition_retail_0013.html): a fully convolutional network for simultaneous age/gender recognition. The network can recognize the age of people in the 18–75 range; it is not applicable for children, since their faces were not in the training set. |
+| RetinaNet R101 | RetinaNet is a one-stage object detection model that uses a focal loss function to address class imbalance during training. Focal loss applies a modulating term to the cross-entropy loss to focus learning on hard negative examples. RetinaNet is a single, unified network composed of a backbone network and two task-specific subnetworks. <br><br>For more information, see: <br>• [Site: RetinaNet](https://paperswithcode.com/lib/detectron2/retinanet) |
+| Text detection | A text detector based on the PixelLink architecture with a MobileNetV2 backbone (depth_multiplier=1.4) for indoor and outdoor scenes. <br><br>For more information, see: <br>• [Site: OpenVINO Text detection 004](https://docs.openvino.ai/2022.3/omz_models_model_text_detection_0004.html) |
+| YOLO v3 | YOLO v3 is a family of object detection architectures and models pre-trained on the COCO dataset. <br><br>For more information, see: <br>• [Site: YOLO v3](https://docs.openvino.ai/2022.3/omz_models_model_yolo_v3_tf.html) |
+| YOLO v5 | YOLO v5 is a family of object detection architectures and models based on the PyTorch framework. <br><br>For more information, see: <br>• [GitHub: YOLO v5](https://github.com/ultralytics/yolov5) <br>• [Site: YOLO v5](https://docs.ultralytics.com/#yolov5) |
+| YOLO v7 | YOLO v7 is an advanced object detection model that outperforms other detectors in both speed and accuracy. It can process between 5 and 160 frames per second (FPS) and achieves the highest accuracy (56.8% average precision, AP) among real-time object detectors running at 30 FPS or higher on the V100 GPU. <br><br>For more information, see: <br>• [GitHub: YOLO v7](https://github.com/WongKinYiu/yolov7) <br>• [Paper: YOLO v7](https://arxiv.org/pdf/2207.02696.pdf) |
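+
+To check programmatically which of these models your instance exposes,
+and which labels each one supports, you can query the server. This is a
+sketch under the assumption that your CVAT version serves the lambda
+endpoint shown below (field names such as `labels` vs. `labels_v2` vary
+between versions; verify against `/api/docs`):
+
+```python
+import requests
+
+CVAT_URL = "http://localhost:8080"
+AUTH = ("user", "password")
+
+functions = requests.get(f"{CVAT_URL}/api/lambda/functions", auth=AUTH).json()
+for func in functions:
+    # Older versions return "labels" as a list of names; newer ones
+    # return "labels_v2" as a list of objects with a "name" key.
+    labels = func.get("labels") or [l["name"] for l in func.get("labels_v2", [])]
+    print(func["id"], labels)
+```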
+
-1. The end result of an automatic annotation is an annotation with separate rectangles (or other shapes)
-
-   ![](/images/image121_detrac.jpg)
-
+## Adding models from Hugging Face and Roboflow
+
+If you did not find the model you need, you can add a model
+of your choice from [Hugging Face](https://huggingface.co/)
+or [Roboflow](https://roboflow.com/).
+
-   ![](/images/gif014_detrac.gif)
-
+> **Note:** You cannot add models from Hugging Face and Roboflow to self-hosted CVAT.
+
-1. You can combine separate bounding boxes into tracks using the `Person reidentification ` model.
-   To do this, click on the automatic annotation item in the action menu again and select the model
-   of the `ReID` type (in this case the `Person reidentification` model).
-   You can set the following parameters:
-
-   - Model `Threshold` is a maximum cosine distance between objects’ embeddings.
-   - `Maximum distance` defines a maximum radius that an object can diverge between adjacent frames.
-
+For more information,
+see [Streamline annotation by integrating Hugging Face and Roboflow models](https://www.cvat.ai/post/integrating-hugging-face-and-roboflow-models).
+
-   ![](/images/image133.jpg)
-
+This video demonstrates the process:
+
-1. You can remove false positives and edit tracks using `Split` and `Merge` functions.
-
-   ![](/images/gif015_detrac.gif)
diff --git a/site/content/en/images/gif014_detrac.gif b/site/content/en/images/gif014_detrac.gif
deleted file mode 100644
index a80bcc54c76..00000000000
Binary files a/site/content/en/images/gif014_detrac.gif and /dev/null differ
diff --git a/site/content/en/images/gif015_detrac.gif b/site/content/en/images/gif015_detrac.gif
deleted file mode 100644
index f711ce5aaf9..00000000000
Binary files a/site/content/en/images/gif015_detrac.gif and /dev/null differ
diff --git a/site/content/en/images/image120.jpg b/site/content/en/images/image120.jpg
index c0cca3fa145..3eb964fe2f9 100644
Binary files a/site/content/en/images/image120.jpg and b/site/content/en/images/image120.jpg differ
diff --git a/site/content/en/images/image133.jpg b/site/content/en/images/image133.jpg
deleted file mode 100644
index f869057fa1d..00000000000
Binary files a/site/content/en/images/image133.jpg and /dev/null differ