Dataset – ICDAR2019-ORF

How to download?

ICDAR2019-ORF training data can be downloaded below:

Labelled training data for images downloaded from the internet is available here. To obtain the password, please register for the competition via the “Contact us” page.

Description of data

Many datasets have been proposed in the literature, but they are either insufficient or unavailable for academic research. We therefore produced in-house labelled datasets for our research on floorplan analysis, and have recently started developing a large labelled dataset that can be shared with the research community.

Data source
We have thousands of floorplan images from an architectural firm that will be used for developing the dataset. However, relying on a single source limits the diversity of the dataset. For this reason, we will use floorplan images downloaded from the internet for developing the dataset. This approach adds diversity and keeps open the possibility of continuously extending the data with the help of the research community.
Below are some of the images that show the diversity of floorplan images in the dataset for the ICDAR2019-ORF competition.

Size of dataset
In the literature, the existing publicly available labelled datasets for “Object Detection & Recognition in Floorplans” have a few hundred images at most: SESYD (1000 images from only 10 templates), French-CVC (90 images), Rakuten (500 images), ROBIN-master (510 images), ROBIN-master++ (510 images), and FPLAN-POLY (38 images).
The dataset for the ICDAR2019-ORF competition will comprise at least 600 images, and we will most likely go slightly above this number to make the dataset more challenging. Labelling (ground-truthing) is in progress; it is a time-consuming task performed with an in-house labelling tool. The dataset will contain around 13 object categories (such as bed, table, and sink). In addition, we will augment the dataset by adding distortions (such as Kanungo noise) to the images.
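
To give participants a rough idea of what a Kanungo-style distortion does to a binary floorplan image, here is a minimal sketch in Python. It is not the code used to build the competition dataset. The function names, parameter defaults, and the image representation (a list of rows with 1 for ink and 0 for paper) are assumptions made for illustration. The sketch covers only the distance-dependent pixel flipping of the Kanungo degradation model and leaves out the morphological closing step that normally follows it.

```python
import math
import random


def nearest_distance(image, row, col, target):
    """Euclidean distance from (row, col) to the nearest pixel equal to target.

    Brute force for clarity; returns math.inf if no such pixel exists.
    """
    best = math.inf
    for r, line in enumerate(image):
        for c, value in enumerate(line):
            if value == target:
                best = min(best, math.hypot(r - row, c - col))
    return best


def kanungo_noise(image, alpha0=1.0, alpha=2.0, beta0=1.0, beta=2.0,
                  eta=0.0, seed=None):
    """Flip pixels of a binary image (1 = ink, 0 = paper) at random.

    A pixel at distance d from the nearest pixel of the opposite colour
    is flipped with probability scale * exp(-decay * d**2) + eta, where
    (scale, decay) is (alpha0, alpha) for ink and (beta0, beta) for paper.
    Pixels close to an edge are therefore the most likely to flip.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    noisy = []
    for r, line in enumerate(image):
        new_line = []
        for c, value in enumerate(line):
            d = nearest_distance(image, r, c, 1 - value)
            scale, decay = (alpha0, alpha) if value == 1 else (beta0, beta)
            # With no opposite-colour pixel at all, only eta applies.
            edge_term = 0.0 if math.isinf(d) else scale * math.exp(-decay * d * d)
            p_flip = min(1.0, edge_term + eta)
            new_line.append(1 - value if rng.random() < p_flip else value)
        noisy.append(new_line)
    return noisy


if __name__ == "__main__":
    square = [
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 0, 0, 0, 0, 0],
    ]
    for noisy_row in kanungo_noise(square, seed=0):
        print("".join("#" if v else "." for v in noisy_row))
```

The brute-force distance search keeps the sketch free of dependencies but is only practical for tiny images; for real floorplan scans, a proper distance transform from an image-processing library would be the sensible replacement.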

Dataset will be made publicly available
After the results are announced at ICDAR2019, the ground-truthed dataset of the ICDAR2019-ORF competition will be made publicly available to the research community via the competition's website, so that researchers working on the problems stated above can use it to benchmark and compare their methods in the coming years.
The dataset of the ICDAR2019-ORF competition will also be contributed to the TC10 and TC11 dataset collections.