Images and videos have long been key to critical workflows carried out by government and law enforcement organizations. Detecting, recognizing, matching, correlating, and understanding what is in a picture or a video sequence are all tasks where automated software solutions can save time and improve quality.
By necessity, analysts and investigators need to be creative in their work, often generating and following leads from different angles and points of view. While “point solutions” such as facial recognition are available, more is needed to truly facilitate investigative work. What is required is software that integrates not only facial recognition but also object recognition, text recognition, automatic labeling, visual search, and other related computer vision and AI technologies, all working together to provide a multifaceted view of the data.
Search and Discovery
Investigators, analysts, and knowledge workers often have to make sense of and discover patterns in data that comes from many different sources. Bits of information from structured and unstructured data can then form the basis for actionable conclusions, strengthened by supporting evidence. When the data sources are images and videos, whether archived or collected in real time, the user is often not sure of exactly what they are searching for or trying to discover. They are looking for patterns that lead to information that is pertinent to the task at hand, but it is hard to know in advance what information will be needed to identify those patterns.
In many cases, software automation tools on the market today focus on “matching” things, such as faces. Some extend the capability to cover “search,” such as matching a face but only under user-defined selective conditions. Fewer still support “discovery” functions, which use other clues to help figure out which face the user should be interested in searching for and eventually matching. All of these capabilities need to be packaged in a productive interface that lets users effectively communicate with the software to express their search intent. Further, the software must work at scale.
Data Types and Technologies
The ability to detect and recognize faces is a helpful feature in many applications. Capabilities of facial recognition technology have grown rapidly over the past few years, ranging from the ability to identify people in passports or access control situations to identification at scale or in more general settings, including, more recently, the ability to identify people wearing COVID-style face masks. Requirements continue to evolve, but at least some vendors are pushing the state of the art in this regard.
Recognizing objects within images or video has improved in recent years through the application of deep learning techniques. This technology is most successful when applied to generic classification tasks such as detecting the presence of a “cat,” a “person,” or a “car.” Unfortunately, the technique requires significant quantities of pre-labeled data, which can be expensive and time-consuming to acquire, and in some use cases may not be possible to obtain. For these and other technical reasons, commercial suppliers of object detection services on the web tend to limit their offerings to a few hundred common, recognizable object classes.
Many use cases require higher levels of specificity, such as a search for a specific type of car, not just any car. This specificity requirement is important in the context of discovery. Further, the scope of things or objects that a user may be interested in is far greater than a few hundred or a few thousand, and often it is not possible to know beforehand what will be of interest. Visual search technologies are required to bridge the gap and support these use cases. This capability lets users search for things that are seen but for which a name is not available. For example, given an image of a person whose face is completely obscured, visual search can still find useful information in the image, such as matches to the clothes the person is wearing. Very few vendors provide solutions in this space.
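To make the idea of searching for something that has no name more concrete, here is a minimal sketch of one common way visual search is built: each image is represented by a numeric feature vector, and a query image is compared against an index of stored vectors by similarity. This is an illustration under stated assumptions, not a description of any particular product. The vectors, the image identifiers, and the function names (cosine_similarity, visual_search) are all hypothetical, and the step that would turn an image into a vector (typically a trained vision model) is left out.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no visual information
    return dot / (norm_a * norm_b)

def visual_search(query_vector, index, top_k=3):
    """Return the top_k (image_id, score) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), image_id)
        for image_id, vector in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(image_id, score) for score, image_id in scored[:top_k]]

# Hypothetical index: in practice each vector would come from a vision model
# applied to a video frame or to a cropped region such as a jacket.
index = {
    "frame_001": [0.9, 0.1, 0.0],
    "frame_002": [0.1, 0.9, 0.2],
    "frame_003": [0.8, 0.2, 0.1],
}

# Hypothetical query vector, e.g. derived from a crop of the clothing of interest.
query = [1.0, 0.0, 0.0]
results = visual_search(query, index, top_k=2)
for image_id, score in results:
    print(image_id, round(score, 3))
```

Because the comparison works on vectors rather than labels, the user never has to name what they are looking for. A linear scan like this one is only suitable for small collections; working at scale would normally call for an approximate nearest-neighbor index.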
The ability to recognize a string of text that appears in the image is also crucial to the search and discovery process. Text can often add specificity to the formulation of a search query, such as searching for a car with a certain set of characters in the license plate. Text recognition and partial-string text search are often used together with other search module types.
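The license plate example can be sketched as a partial-string search over text that has already been extracted from images, optionally combined with an object label. This is a hedged illustration only. The record layout (image_id, text, labels), the "?" wildcard convention, and the function names are assumptions made for the example, and the text recognition step itself is not shown.

```python
import re

def pattern_to_regex(pattern):
    """Turn a partial pattern into a regex; '?' stands for one unknown character.

    The pattern is expected to contain only letters, digits, and '?'.
    """
    parts = ["." if ch == "?" else re.escape(ch) for ch in pattern.upper()]
    return re.compile("".join(parts))

def search_text(pattern, records, required_label=None):
    """Return ids of records whose recognized text contains the partial pattern.

    If required_label is given, only records carrying that object label are
    considered, which combines text search with object recognition results.
    """
    regex = pattern_to_regex(pattern)
    hits = []
    for record in records:
        if required_label and required_label not in record["labels"]:
            continue
        # Normalize recognized text: uppercase, keep only letters and digits.
        normalized = re.sub(r"[^A-Z0-9]", "", record["text"].upper())
        if regex.search(normalized):
            hits.append(record["image_id"])
    return hits

# Hypothetical records, as they might come out of text and object recognition.
records = [
    {"image_id": "img_01", "text": "ABC 7XB2", "labels": {"car"}},
    {"image_id": "img_02", "text": "7KB Main St", "labels": {"sign"}},
    {"image_id": "img_03", "text": "XYZ 123", "labels": {"car"}},
]

# A witness recalls a plate fragment: 7, one unknown character, then B.
car_hits = search_text("7?B", records, required_label="car")
all_hits = search_text("7?B", records)
print(car_hits)
print(all_hits)
```

Requiring the "car" label narrows the results to plates rather than any text containing the fragment, which is the kind of added specificity that comes from combining search modules.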
An Integrated Package Approach
Integrating the above technologies in a productive user interface provides significant benefits to the work done by investigators and analysts. The goal is to let them discover more, faster, and more thoroughly. Integrating the capabilities means that a user can operate on visual data from multiple points of view within the same environment. Looking down the road, an integrated approach will enable new functionality, including predictive analytics. These tools will let users identify patterns that would otherwise be exceedingly difficult for humans to spot, and thereby enable proactive responses.
Visit our website to learn more about an integrated content discovery software package.