Reva: Create Your First Dataset | Nabrio

Goal

You will learn how to

Create a dataset and subsets for object detection tasks
Annotate your dataset with bounding boxes and category labels
Create a new version of your dataset for training

You will need a dataset for this guide, to make it easy for you to follow along, we have prepared a sample dataset for you. You can download and unzip the sample dataset from the link below:

Sample Dataset: Cats, Dogs & Monkeys

Home Page

Reva Home page

Home page is where you can see the ongoing, finished tasks, training status, and plugins.

From here let's click on the Dataset button on the left sidebar to go to the dataset page.

Datasets

Welcome to the dataset page! This is where you can create and manage your datasets in Reva, let's create a dataset!

Create a Dataset

Dataset page

Click on the New Dataset button in the top right corner to start creating a dataset.

Create dataset dialog

In the Create a new dataset dialog, enter the following:

Name: Name of your dataset (e.g. Cats Dogs & Monkeys)
Dataset Task: The task type of your dataset (e.g. Object Detection)

Then click on the Create button to create the dataset.

Dataset settings page

Above is the dataset settings page, where you can manage your dataset information, subset and version of your dataset, and view the dataset details.

Subset

Subset is a way to organize your dataset into a smaller group. For example, you can start your dataset with a subset of 100 images, and then add more subsets to the dataset later on.

Create a New Subset

To create a new subset, click on the Add New Subset button in the Subset section.

Create subset dialog

In the Create a new subset dialog, enter the following:

Name: Name of your subset (e.g. First)

Then click on the Create button to create the subset.

Subset Page

Subset page

After creating the subset, you will be directed to the subset page, where you can manage your subset information, add data to your subset, annotate your data, export data, and more.

Note

Depending on your screen size, the layout of the dataset settings page and subset page may be different.

Importing Images Data

Let's import the sample dataset we downloaded earlier to the subset. Click on the Import Image Data button in the subset page.

Import data dialog

In the Import image data dialog, you enter the following:

Annotation format: The annotation format of your dataset (for this step we will select No Annotation since we will annotate the data later)
Source Directory: The directory of your dataset (e.g. D:\datset_cats-dogs-monkeys\raw_data)

Then click on the Import data button to start importing the data.

Importing data

Once the data is imported, you will see the images in the subset page.

Annotate Data

Since we imported the data without annotation, let's annotate the data now! Click on the Annotate data button in the subset page to start annotating the data.

Annotation page

In the annotation page for object detection task, you can draw bounding boxes on the images and label them with the corresponding class names (e.g. cat, dog, monkey).

Annotation page with bounding boxes

Let's add a category for each class in the dataset first. Under the Categories section on the right panel, click on the Add Category button to add a new category. Enter the category name (e.g. cat) and click on the Add button to create the category. Repeat this step to create categories for dog and monkey.

Note

Category names are case-sensitive. So for example, cat and Cat will be considered as two different categories.

Let's start annotating the images now! Click on the button in the top left corner (or press W on your keyboard) to start drawing a bounding box on the image.

Draw bounding box

Click on the image to place the top-left corner of the bounding box, click again to place the bottom-right corner of the bounding box, and then select the category of the bounding box (e.g. cat) to label the bounding box.

Tip

For more efficient annotation, you can pre-select the category you want to annotate before drawing the bounding box by either:

Click the category in the Categories section
Press the corresponding number key on your keyboard (e.g. 1 for cat, 2 for dog, 3 for monkey).

Annotated image

You may find the default category color is not very visible on the image. To change this:

Click on the category menu button in the category list
Click the Set Category Color button, and then select a color you like for the category.

In the category dropdown menu, you can also Rename Category, Merge Category with another category, or Delete Category (which will also delete all the annotations with this category on all subsets in the dataset).

To save the annotation, click on the button in the top right corner, or press Ctrl + S on your keyboard.

You can navigate to the next image by clicking on the button in the top right corner, or by pressing D on your keyboard. Similarly, you can navigate to the previous image by clicking on the button, or by pressing A on your keyboard.

Note

Annotation progress is automatically saved when you navigate to the next/previous image, or when you navigate back to the subset page.

Let's annotate all the images in the subset now! Take your time to annotate the images, and feel free to explore the annotation tool and its features.

Import Pre-Annotated Data in a New Subset

For object detection task, if you already have annotations for your dataset, you can also import the annotations the subset.

Let's create a new subset for the pre-annotated data first. Click on the Add New Subset button in the Subset section, enter the name of the subset (e.g. Import), and then click on the Create button to create the subset.

Import annotation button

Like before, click on the Import Annotation button in the subset page to start importing the annotations. In the Import annotation dialog:

Annotation format: The annotation format of your dataset (for this step we will select Pascal VOC since the sample dataset has annotation in Pascal VOC format)
Source Directory: The directory of your dataset (e.g. D:\datset_cats-dogs-monkeys\pascal_voc)

Then click on the Import data button to start importing the annotations.

Importing annotation

Since this imported data is much larger than the previous subset, it may take a while to import the data and annotations, you will see this in the On-going Tasks section on similar to the image above.

Once the data and annotations are imported, you can click on the Annotate data button to view the imported annotations on the images.

Pre-annotated image

As you can see, the bounding boxes are already on the images after importing the annotations.

Dataset Statistics

In the dataset settings page, you can also view the statistics of your dataset in the Extra Info section.

Click on the Refresh dataset health info button to get the latest statistics of your dataset.

Dataset health 1

For object detection task, you can see multiple statistics of your dataset, such as the number of images, number of annotations, category distribution, and more.

Dataset health 2

At the bottom of the section, you can also find the Health Checks which will advice you on how to improve your dataset.

Version

Dataset version is a way to keep track and manage the changes in your dataset. Version is where the training data for your model will be referenced from.

Create a New Version

Dataset version

To add a new version, click on the Add New Version button in the Version section.

In the Create a new version dialog for object detection, enter the following:

Name: Name of your version (e.g. 01)
Train/ Validation/ Test Split Ratio: The split ratio of your dataset for training, validation, and testing (e.g. 70/20/10)
Subsets: The subset(s) you want to include in this version, this is default to all subsets in the dataset, but you can also choose to only include certain subset(s) in the version.

Then click on the Create button to create the version.

No support for Test split yet

With the current version of Reva, we are only using the Train and Validation split for training and validating the model, the Test split is not used for now but we will support it in the future for evaluating the model performance.

Therefore, when you create a new version, you can choose to only split the dataset into Train and Validation sets (e.g. 80/20), and leave the Test split empty.

Version Details

Version details

After creating the version, you can see on the version details page the information of the version, such as the split ratio and subsets included in the version.

Goal

You will learn how to

Create a dataset and subsets for object detection tasks
Annotate your dataset with bounding boxes and category labels
Create a new version of your dataset for training

You will need a dataset for this guide, to make it easy for you to follow along, we have prepared a sample dataset for you. You can download and unzip the sample dataset from the link below:

Sample Dataset: Cats, Dogs & Monkeys

Home Page

Reva Home page

Home page is where you can see the ongoing, finished tasks, training status, and plugins.

From here let's click on the Dataset button on the left sidebar to go to the dataset page.