When it comes to data collection, we are no ordinary technology company. If we are presented with a task that requires new, exotic, or otherwise unavailable data, we will work to obtain it at any cost.
Case Study: Home Data Enrichment Pipeline
For example, we once obtained and enriched over 10,000 high-resolution pictures of residential homes for a machine vision database. To collect the data, we put boots on the ground. The data was not available anywhere in the format we needed, so we sent our data scientists out on the street, knocking door to door and paying homeowners to photograph the exteriors of their homes.
We needed the data manually enriched, but there were no software stacks available to do what we needed. We built a digital assembly line of data enrichment personnel across the globe and furnished them with custom markup software to deliver the data we needed.
Esteemed consultancies and academic researchers said it couldn’t be done with existing technology, and offered no solutions. They may have been right, but that doesn’t matter, because we used efficiently directed manpower to create our database. When technological solutions are unattainable due to resource availability, we build creative solutions to obtain those resources. Readers can learn more here: Case Study: Data Enrichment Pipeline.
The end result was a 500 GB+ machine-learning-ready database of residential home exteriors. It has helped pave the way for our research in machine vision.
Why Should I Care?
Machine learning research is mature and highly democratized. Oftentimes, the only thing standing between an idea and a revolutionary artificial intelligence project is data availability. Any firm will jump on the opportunity to work with that data to build a novel machine learning solution. We take it a step further by getting our hands dirty to build that database.