Big Data Management Platform for Computer Vision

From Dataset to Training Set

CVEDIA is a free, cloud-based service that simplifies image dataset preparation and management.

We give you direct access to standardized versions of public image datasets such as Open Images, COCO, ImageNet, and SpaceNet. With our open-source CLI tool, you can export your filtered and augmented selections to your local server in the format of your preferred machine learning framework, including Torch, Caffe, TensorFlow, MXNet, DIGITS, Theano, and Deeplearning4j.

Whether you work with still images, video and associated telemetry, satellite imagery, or biomedical images, our tools reduce training set creation time from days to minutes. Our platform is designed to prepare and manage millions of images.

Each dataset is normalized along with its metadata for your convenience. Image augmentations and metadata manipulations are performed online, in real time.

Our Metadata Query Language (MQL) editor filters your images so that only those you want are exported to your local disk, in the format of the machine learning library of your choice. Our annotation tools support virtually every computer vision application.

Training large models fast, and pushing the boundaries of what is possible with computer vision, begins with high-quality datasets. CVEDIA lets you create training sets faster than any other dataset management platform.


We support many image sources including video, telemetry, biomedical, geospatial, multispectral, and SIFT.

Real-Time Image Augmentations

Choose from dozens of standard image transformations and apply them to a dataset in real time, in any order and any number of times.
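
To illustrate what applying transformations "in any order and quantity" means, here is a minimal sketch in plain Python. It is not CVEDIA's code or API: the function names (`flip_horizontal`, `rotate_90`, `compose`) and the tiny nested-list image are assumptions made purely for the example.

```python
from functools import reduce
from typing import Callable, List, Sequence

# Hypothetical image type: a grayscale image as rows of pixel values.
Image = List[List[int]]
Transform = Callable[[Image], Image]


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_90(img: Image) -> Image:
    """Rotate clockwise: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def compose(transforms: Sequence[Transform]) -> Transform:
    """Chain transforms so they run left to right; repeats are allowed."""
    def pipeline(img: Image) -> Image:
        return reduce(lambda acc, t: t(acc), transforms, img)
    return pipeline


image = [[1, 2],
         [3, 4]]

# One flip followed by two rotations, applied in exactly this order.
augment = compose([flip_horizontal, rotate_90, rotate_90])
result = augment(image)  # [[3, 4], [1, 2]]
```

Because the pipeline is just an ordered list, reordering or repeating a step changes the output without any change to the transforms themselves.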

Metadata Transformations

Track your error rate by writing your own code to redefine your output values or substrings.

Advanced Filters

Slice and filter datasets by any imaginable variable with our MQL editor. Search directly on image EXIF data or statistical values.

Direct Exports

The CVEDIA-CLI tool is an open-source project hosted on GitHub that exports your selections directly to your local server.


Annotation tools for applications including autonomous vehicles, medical, agriculture, object detection, facial recognition, etc.


Online Training Sets

Stream augmented images directly to your neural network framework with our CVEDIA API.

Adaptive Training Sets

Refine and filter your training set in real time, with the training loss reported back automatically. Skip redundant images for faster convergence.
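
One way to picture loss-driven filtering is sketched below. This is an assumption about the general idea, not a description of how CVEDIA implements it: the function name `next_epoch_ids`, the `last_loss` mapping, and the `min_loss` threshold are all hypothetical.

```python
from typing import Dict, Iterable, List


def next_epoch_ids(
    all_ids: Iterable[str],
    last_loss: Dict[str, float],
    min_loss: float,
) -> List[str]:
    """Pick the images to show the network next.

    An image is kept if it has never been seen (no loss reported yet)
    or if its most recent loss is at least `min_loss`. Images the model
    already handles well are treated as redundant and skipped.
    """
    return [
        image_id
        for image_id in all_ids
        if last_loss.get(image_id, float("inf")) >= min_loss
    ]


# Hypothetical losses reported back after one pass over the data.
reported = {"img_a": 0.01, "img_b": 0.90}
selected = next_epoch_ids(["img_a", "img_b", "img_c"], reported, min_loss=0.1)
```

Here `img_a` is dropped because its loss is already low, while `img_b` (still hard) and `img_c` (not yet seen) stay in the training set.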

Cloud Computing Integration

Integrate your work directly with Amazon’s AMI or Google’s Cloud ML service, or use our internal cloud services for ultimate security.

Speed and Scalability

We scale on demand. We reduce training times from weeks to a few hours. Our platform optimizes massive image datasets.


We support C++ and Python. Your selections are compatible with any of the machine learning libraries on the market.

Standardized Datasets

Our system ingests both public and private datasets in a standardized way, preserving all of the data for you, so you don’t have to.

Integrity Checking

Confirms that the data on disk has been downloaded completely. This acts as a safeguard during API changes or other system interruptions.
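
A common way to check downloaded files is to compare each one against an expected checksum. The sketch below shows that general technique using Python's standard `hashlib`. It is not the CVEDIA-CLI implementation: the manifest format (file name mapped to a SHA-256 hex digest) and the function names are assumptions for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return the names of files that are missing or whose hash differs.

    `manifest` is a hypothetical mapping of relative file name to the
    SHA-256 hex digest that the file is expected to have.
    """
    bad = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(name)
    return sorted(bad)
```

Any name returned by `find_bad_files` would be a candidate for re-downloading.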

Resume Downloading

Exports that have been interrupted can be resumed from the last downloaded entry, while still retaining a valid subset on disk for testing.
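
The resume behaviour described above can be sketched as follows, under stated assumptions. This is not the CVEDIA-CLI code: `export_entries`, the `completed` set, and the `fetch` callback are hypothetical names, and a real tool would persist the set of completed entries to disk between runs.

```python
from typing import Callable, Iterable, List, Set


def export_entries(
    entry_ids: Iterable[str],
    completed: Set[str],
    fetch: Callable[[str], None],
) -> List[str]:
    """Download every entry that is not already marked as completed.

    An entry is added to `completed` only after `fetch` returns, so if
    `fetch` raises midway, everything recorded so far is a valid subset
    and the next call picks up at the first entry that is still missing.
    """
    newly_downloaded = []
    for entry_id in entry_ids:
        if entry_id in completed:
            continue  # already on disk from an earlier run
        fetch(entry_id)  # may raise if the connection drops
        completed.add(entry_id)
        newly_downloaded.append(entry_id)
    return newly_downloaded
```

Calling `export_entries` again with the same `completed` set after a failure skips the finished entries and continues with the rest.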


Customize. Transform. Cut. Slice. Adapt. Rescale. Fine-tune. Your computer vision datasets. Easily manage your most important asset with CVEDIA.

A platform for computer vision applications that addresses the gruelling process of collecting, annotating, cleaning, and managing your datasets from beginning to end. Regardless of your data source or application, our tools let you build training sets faster than any other platform.