Collecting and constructing training sets takes a lot of time, and the domain-specific knowledge of where and how to gather the relevant information adds to an already cumbersome process.
We lower the barrier to entry by normalizing inputs from hundreds of public and private datasets and providing fast, flexible output for dozens of machine learning frameworks.
Our productivity tools substantially reduce the time and effort spent chasing and cleaning datasets, while reducing the need for expensive hardware.


CVEDIA provides cloud-based public or private access to any image dataset. A browsable interface lets you visually inspect and explore the data. Create and edit datasets with state-of-the-art tools that let you work directly with satellite imagery, web-based photo repositories or your own material. Metadata tools efficiently add labels, crops or paths to any image. Our web-based drawing tools also support exotic metadata such as triplet selection and activation maps.

Select images from multiple datasets concurrently, apply selective augmentations and filters, and download them directly onto your server. Generate thousands of random image crops from one slide, based on your specifications. Dataset statistics support random sampling of datasets and their subsets to identify characteristics that might otherwise skew training results.
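
As a rough illustration of specification-driven random cropping, here is a minimal NumPy sketch. It is not CVEDIA's implementation: the function name `random_crops` and its parameters are hypothetical, and it assumes the slide is already loaded as a height x width x channels array.

```python
import numpy as np

def random_crops(image, crop_h, crop_w, count, seed=None):
    """Return `count` random crops of size (crop_h, crop_w) from an H x W x C image.

    Hypothetical helper for illustration only; not part of any CVEDIA tool.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop size exceeds image size")
    # Draw top-left corners uniformly so every crop fits inside the image.
    ys = rng.integers(0, h - crop_h + 1, size=count)
    xs = rng.integers(0, w - crop_w + 1, size=count)
    return np.stack([image[y:y + crop_h, x:x + crop_w] for y, x in zip(ys, xs)])

# One blank stand-in "slide", cropped 1000 times at 224 x 224.
slide = np.zeros((1024, 2048, 3), dtype=np.uint8)
crops = random_crops(slide, 224, 224, count=1000, seed=0)
```

The result is a single 4D array with one crop per entry along the first axis, which is a convenient shape for feeding a training pipeline.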

Run select existing neural networks on public datasets to automatically perform machine annotation into a new branch or dataset. Our platform's advanced clustering solutions allow us to run detections or classifications on hundreds of millions of images in a matter of hours.


Real-Time Image Augmentations

Flexible augmentations let you turn one source image into anywhere from two to hundreds of different outputs. Augmentations can be specified in any order you choose and applied to one image or to a group of images simultaneously.

Supported augmentations include resizing, random cropping, polygon cropping, hue and lightness modifications, rotating, flipping, drawing polygons and circles, infrared, snapshot, Gaussian blur, clipping, inverting, and more.

Advanced features include video compression simulation, with many configurable options for codec and bitrate, to introduce encoding artifacts into your exports, as well as image compression artifacts through JPEG and PNG quality settings. Python (with NumPy) support allows custom logic written by the user, with easy functions that work directly on 4D NumPy arrays.
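
To give a feel for what user-written logic on a 4D NumPy array might look like, here is a small sketch. The function names and the (batch, height, width, channels) layout are assumptions made for illustration, not CVEDIA's documented interface.

```python
import numpy as np

def flip_horizontal(batch):
    # Reverse the width axis of an (N, H, W, C) batch.
    return batch[:, :, ::-1, :]

def invert(batch):
    # Invert 8-bit pixel values (0 becomes 255, 255 becomes 0).
    return 255 - batch

def apply_in_order(batch, steps):
    # Apply each augmentation in the order given, as the text describes.
    for step in steps:
        batch = step(batch)
    return batch

# Tiny batch: 2 images, 2 x 3 pixels, 1 channel, values 0..11.
batch = np.arange(12, dtype=np.uint8).reshape(2, 2, 3, 1)
out = apply_in_order(batch, [flip_horizontal, invert])
```

Because each step takes and returns a whole batch, the same list of steps works unchanged for one image or for a group of images.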

Metadata Augmentations

Flexible tools normalize all of your metadata, allowing you to filter and fully customize your data. Tailor your datasets to suit your individual needs, and collaborate with colleagues to expand or clone metadata.
Metadata augmentation ensures coordinates (bounding boxes, paths, landmarks) remain valid after image augmentations by applying the corresponding transformations to them.
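
The idea of keeping coordinates valid can be sketched for a single case, a horizontal flip with bounding boxes. This is an illustrative example under stated assumptions, not CVEDIA's code: the helper name is hypothetical, and boxes are assumed to be (x_min, y_min, x_max, y_max) in pixels with exclusive maximum edges.

```python
import numpy as np

def flip_horizontal_with_boxes(image, boxes):
    """Flip an H x W x C image left to right and mirror its boxes to match.

    Hypothetical helper; boxes are rows of (x_min, y_min, x_max, y_max).
    """
    width = image.shape[1]
    flipped = image[:, ::-1]
    boxes = np.asarray(boxes, dtype=float)
    new_boxes = boxes.copy()
    # Mirroring swaps the roles of the left and right edges.
    new_boxes[:, 0] = width - boxes[:, 2]
    new_boxes[:, 2] = width - boxes[:, 0]
    return flipped, new_boxes

# A 100 x 200 image with one white rectangle and its bounding box.
image = np.zeros((100, 200, 3), dtype=np.uint8)
image[20:40, 10:50] = 255
flipped, new_boxes = flip_horizontal_with_boxes(image, [[10, 20, 50, 40]])
```

After the flip the box still encloses the rectangle, because the image and its metadata went through the same geometric change.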

Advanced Filters

Apply virtually any kind of filter to your images with CVEDIA's Metadata Query Language. Applied filters are taken into account when running our export tool, so you have direct control over the final dataset. Saved filters can be managed as collections, allowing you to fine-tune them and apply augmentations.

Direct Exports

The CVEDIA-CLI tool is an open source project hosted on GitHub. It exports your selected filters and augmentations directly to your local server in the format of your preferred machine learning framework.

Currently we support the following formats:


We are continually adding support for new output formats as well as custom templated outputs.


Annotations combined with data sources are the foundation of every deep learning application. Annotations carry information about the identity, spatial extent, and viewpoint of the objects in a depicted scene, but they are only robust when paired with the corresponding source image.

We offer different types of image annotation services, including textual semantic, pixel-dense, semantic and instance-level segmentations, segmentation masks, telemetry data, and numbered/labeled landmarks, combined with various image sources to create high-quality training datasets for deep learning applications.

You can tag, classify, segment, share, and store your annotations on our system.
