IBM's Watson AI used to develop multi-face tracking algorithm


System can track multiple people across scenes, despite changing camera angles


BIG BLUE IBM has used its Watson artificial intelligence (AI) tech to develop a new algorithm for multi-face tracking.


The system uses AI to track multiple individuals across scenes, despite changing camera angles, lighting, and appearances.


Collaborating with Professor Ying Hung of the Department of Statistics and Biostatistics at Rutgers University, IBM Watson researcher Chung-Ching Lin led a team of scientists to develop the technology, which uses a method for telling apart different individuals in a video sequence.

The system is also able to recognise if people leave and then re-enter the video, even if they look very different.


Lin explained that, to build the system, the team first made 'tracklets' for the people present in the source material.


"The tracklets are based on co-occurrence of multiple body parts (face, head and shoulders, upper body, and whole body), so that people can be tracked even when they are not fully in view of the camera - for example, their faces are turned away or occluded by other objects."


Lin added: "We formulate the multi-person tracking problem as a graph structure with two types of edges."


The first of these is 'spatial edges', which denote the connections of different body parts of a candidate within a frame and are used to generate the hypothesised state of a candidate.


The second is 'temporal edges', which refer to the connections of the same body parts over adjacent frames and are used to estimate the state of each individual person in different frames.
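
As a rough illustration of the two edge types described above, the following Python sketch builds a toy graph from body-part detections. It is not IBM's implementation: the `Detection` and `TrackingGraph` classes, the `build_graph` function and the `person` field are all hypothetical names made up for this example, and the sketch assumes each detection already carries a candidate ID, whereas in the real system that association is what the tracker has to infer.

```python
from dataclasses import dataclass, field

# The four body parts named in the article.
BODY_PARTS = ("face", "head_shoulders", "upper_body", "whole_body")


@dataclass(frozen=True)
class Detection:
    frame: int   # frame index in the video
    part: str    # one of BODY_PARTS
    person: int  # hypothetical candidate ID, assumed known for illustration


@dataclass
class TrackingGraph:
    spatial_edges: list = field(default_factory=list)
    temporal_edges: list = field(default_factory=list)


def build_graph(detections):
    """Link detections of one candidate with spatial and temporal edges."""
    graph = TrackingGraph()
    dets = sorted(
        detections,
        key=lambda d: (d.frame, d.person, BODY_PARTS.index(d.part)),
    )
    for i, a in enumerate(dets):
        for b in dets[i + 1:]:
            if a.person != b.person:
                continue
            if a.frame == b.frame and a.part != b.part:
                # Spatial edge: different body parts, same frame.
                graph.spatial_edges.append((a, b))
            elif b.frame - a.frame == 1 and a.part == b.part:
                # Temporal edge: same body part, adjacent frames.
                graph.temporal_edges.append((a, b))
    return graph


example = [
    Detection(0, "face", 0),
    Detection(0, "upper_body", 0),
    Detection(1, "face", 0),
    Detection(0, "face", 1),
    Detection(1, "face", 1),
]
graph = build_graph(example)
print(len(graph.spatial_edges), len(graph.temporal_edges))
```

In this toy input, candidate 0 has a face and an upper body in the same frame, which yields one spatial edge, and both candidates have a face in two adjacent frames, which yields two temporal edges.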


"We generate face tracklets using face-bounding boxes from each individual person's tracklets and extract facial feature for clustering," he added.

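The article does not say which clustering technique the team used, so the Python sketch below swaps in a deliberately simple one: face tracklets whose feature vectors lie within a cosine-distance threshold are merged using union-find. The function names, the `threshold` value and the two-dimensional feature vectors are assumptions for illustration only. The point it shows is that the number of clusters can fall out of the data rather than being fixed in advance.

```python
import math


def cosine_distance(u, v):
    """1 minus cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def cluster_tracklets(features, threshold=0.3):
    """Merge tracklets whose features are closer than `threshold`.

    Returns one cluster label per tracklet. The number of clusters is
    not an input; it is whatever remains after merging.
    """
    parent = list(range(len(features)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine_distance(features[i], features[j]) < threshold:
                parent[find(j)] = find(i)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(features))]


# Hypothetical per-tracklet feature vectors (real ones would be much longer).
tracklet_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
tracklet_labels = cluster_tracklets(tracklet_features)
print(tracklet_labels, len(set(tracklet_labels)))
```

A threshold-based merge like this is sensitive to the chosen cut-off and is shown only to make the idea concrete; it should not be read as the method in IBM's paper.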

To see how well the technology could perform, Lin and his team compared it against state-of-the-art methods in analysing challenging datasets of unconstrained videos.


In one experiment, they used music videos, which feature high image quality but significant, rapid changes in the scene, camera setting, camera movement, makeup, and accessories, such as eyeglasses.


"Our algorithm outperformed other methods with respect to both clustering accuracy and tracking," Lin added. "Clustering purity was substantially better with our algorithm compared with the other methods [and] automatically determined the number of people, or clusters, to be tracked without the need for manual video analysis."

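For readers unfamiliar with the purity metric Lin mentions, the Python sketch below computes the standard textbook version: each cluster is credited with its most common true identity, and the credited items are divided by the total. The paper may use a variant (for example a weighted form), so this is an assumption about the general idea rather than the exact measure reported; the function name and sample labels are made up for illustration.

```python
from collections import Counter


def clustering_purity(cluster_labels, true_labels):
    """Fraction of items that match the majority identity of their cluster."""
    if len(cluster_labels) != len(true_labels):
        raise ValueError("label lists must have the same length")
    if not cluster_labels:
        raise ValueError("label lists must not be empty")

    # Count the true identities that fall into each cluster.
    members = {}
    for cluster, truth in zip(cluster_labels, true_labels):
        members.setdefault(cluster, Counter())[truth] += 1

    majority_total = sum(max(counts.values()) for counts in members.values())
    return majority_total / len(cluster_labels)


# Five face tracklets, two predicted clusters, two real people "a" and "b".
purity = clustering_purity([0, 0, 0, 1, 1], ["a", "a", "b", "b", "b"])
print(purity)
```

Here cluster 0 holds two tracklets of person "a" and one of "b", and cluster 1 holds two of "b", so four of the five tracklets agree with their cluster's majority and the purity is 0.8. A score of 1.0 would mean that no cluster mixes different people.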

The algorithm and its performance are described in more detail in IBM's CVPR research paper, A Prior-Less Method for Multi-Face Tracking in Unconstrained Videos.

