Columbia AI Summit poster contribution
What it is:
Poster presentation at the Columbia AI Summit introducing Eye-Track-ML, our machine learning pipeline for automating the frame-by-frame coding of large eye-tracking video datasets.
What we did:
- Developed an automated pipeline to analyze large volumes of eye-tracking video data from infant studies.
- Employed computer vision models (YOLOv11 and SAM2.1) for object detection, segmentation, and event classification.
- Achieved 100% accuracy on event labeling and approximately 94% on object labeling.
- Reduced manual coding labor by over 90% while maintaining high data consistency.
Eye-Track-ML: A Machine Learning Pipeline for Automated Frame-by-Frame Coding of Eye-Tracking Videos
Our project, Eye-Track-ML, is a pipeline that automates eye-tracking video analysis using two computer vision models, YOLOv11 and SAM2.1. We developed it to address the challenge of manually coding more than six hours of video (roughly 600,000 frames) from our infant event representation study. The pipeline combines YOLOv11 for image classification and object detection with SAM2.1 for object segmentation. It achieves 100% accuracy on event labeling and approximately 94% on object labeling.
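As an illustration of how the two models can be chained on each frame, here is a minimal sketch using the Ultralytics Python API: YOLOv11 detects objects, and SAM 2.1 is prompted with the resulting boxes to produce masks. The weight files, paths, and function names are assumptions for illustration only, not our actual fine-tuned checkpoints or production code.

```python
# Minimal per-frame detect-then-segment sketch, assuming the Ultralytics API.
# Weight paths below are illustrative placeholders, not the project's checkpoints.
from ultralytics import YOLO, SAM

detector = YOLO("yolo11n_custom.pt")   # hypothetical fine-tuned YOLOv11 weights
segmenter = SAM("sam2.1_b.pt")         # SAM 2.1 base checkpoint

def code_frame(frame_path: str) -> dict:
    """Detect objects in one video frame, then segment each detection."""
    det = detector(frame_path)[0]                         # single-image inference
    boxes = det.boxes.xyxy.tolist()                       # [x1, y1, x2, y2] per object
    labels = [det.names[int(c)] for c in det.boxes.cls]   # class names
    confidences = det.boxes.conf.tolist()                 # detection confidences

    masks = None
    if boxes:
        # Prompt SAM 2.1 with the detected boxes to get per-object masks.
        masks = segmenter(frame_path, bboxes=boxes)[0].masks

    return {"labels": labels, "boxes": boxes, "conf": confidences, "masks": masks}

if __name__ == "__main__":
    record = code_frame("frames/frame_000001.png")  # hypothetical frame path
    print(record["labels"], record["conf"])
```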
We found that human verification remains necessary for detecting subtle patterns and edge cases. However, our system establishes a strong baseline of consistency, requiring human verifiers to correct only about 6% of data points, a dramatic reduction in manual labor.
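One plausible way to surface the small fraction of frames that still need correction is to flag low-confidence or empty detections for manual review. The threshold and record format in the sketch below are illustrative assumptions, not the criteria used in the study.

```python
# Hypothetical routing of uncertain frames to human verification.
# The 0.80 threshold and the record format are assumptions, not study values.
def needs_review(record: dict, min_conf: float = 0.80) -> bool:
    """Flag a coded frame for manual checking when any detection is
    uncertain, or when nothing was detected at all (a likely edge case)."""
    if not record["conf"]:
        return True
    return min(record["conf"]) < min_conf

if __name__ == "__main__":
    # Toy records standing in for real per-frame pipeline output.
    coded_frames = [
        {"frame": 1, "conf": [0.97, 0.91]},
        {"frame": 2, "conf": [0.62, 0.95]},  # uncertain detection
        {"frame": 3, "conf": []},            # nothing detected
    ]
    review_queue = [r["frame"] for r in coded_frames if needs_review(r)]
    print(review_queue)  # -> [2, 3]
```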
Resources:
- YOLO documentation
- Segment Anything Model (SAM) repository
- How to train YOLOv11 on custom data
- Fine-tuning SAM 2.1