I originally posted the IPL broadcast annotation dataset on Kaggle for the computer vision community. Kaggle is great for notebooks and competitions, but if you just want to pull a dataset into a Python script and start experimenting, the HuggingFace datasets library is the more direct route, so I published it there too.
What’s in the dataset
1,005 images from IPL broadcast footage: actual match frames, not press photos. Each is an 800×600 px JPG with an 8×8 grid annotation (64 cells, each labeled 0 for empty or 1–10 for an IPL team ID), a player count (0–20), and a train/test flag.
Team IDs: CSK (1), DC (2), GT (3), KKR (4), LSG (5), MI (6), PBKS (7), RR (8), RCB (9), SRH (10). The annotation methodology, and why I chose a grid over bounding boxes, are covered in my earlier post on the Kaggle release.
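If you want readable team names in your own code, the ID table above can be written as a small lookup. This is only a sketch: the names TEAM_NAMES and team_label are illustrative and not part of the dataset.

```python
# Team IDs as listed above; 0 marks an empty grid cell.
TEAM_NAMES = {
    0: "empty",
    1: "CSK", 2: "DC", 3: "GT", 4: "KKR", 5: "LSG",
    6: "MI", 7: "PBKS", 8: "RR", 9: "RCB", 10: "SRH",
}


def team_label(cell_value: int) -> str:
    """Map a grid cell value (0-10) to a team abbreviation or 'empty'."""
    if cell_value not in TEAM_NAMES:
        raise ValueError(f"unexpected cell value: {cell_value}")
    return TEAM_NAMES[cell_value]


print(team_label(6))  # MI
print(team_label(0))  # empty
```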
Loading it
```python
from datasets import load_dataset

ds = load_dataset("goyaljai/IPL-Player-Detection-IITB-PML")
train = ds["train"]  # 793 examples
test = ds["test"]    # 212 examples

# Each example: image (PIL), c01-c64 (int), count (int)
example = train[0]
print(example["count"])  # player count in this frame
print(example["c01"])    # top-left grid cell: team ID or 0
```
The image field is a PIL Image, so you can pass it directly to your transform pipeline without intermediate file I/O. Grid labels are plain integers. The Arrow format gives fast random access across all 1,005 examples.
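Since the grid arrives as 64 separate fields, it usually helps to reassemble it into an 8×8 structure first. The sketch below assumes the cells are numbered row by row (c01 to c08 across the top row, c09 starting the second row). Only c01 being the top-left cell is stated above, so check the ordering against a few images before relying on it. The helper name example_to_grid and the synthetic row are illustrative.

```python
GRID_SIZE = 8


def example_to_grid(example):
    """Build an 8x8 list of lists from fields c01..c64.

    Assumes row-major numbering (c01..c08 is the top row); that ordering
    is an assumption, not something the dataset description confirms.
    """
    flat = [int(example[f"c{i:02d}"]) for i in range(1, GRID_SIZE * GRID_SIZE + 1)]
    return [flat[r * GRID_SIZE:(r + 1) * GRID_SIZE] for r in range(GRID_SIZE)]


# Synthetic stand-in for a dataset row, so the snippet runs offline.
fake_example = {f"c{i:02d}": 0 for i in range(1, 65)}
fake_example["c01"] = 6   # MI in the top-left cell
fake_example["c10"] = 1   # CSK in row 1, column 1 under the row-major assumption
fake_example["count"] = 2

grid = example_to_grid(fake_example)
for row in grid:
    print(row)
```

With a real row you would pass train[0] in place of fake_example.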
Why broadcast footage matters for cricket ML
Most public cricket datasets are curated from press photography: clean lighting, staged poses, a single player in frame. Models trained on them don’t generalize to broadcast conditions such as partial occlusions, motion blur, multiple players in complex formations, and score tickers in the corners.
This dataset is broadcast footage. The spatial grid annotations open tasks that curated datasets don’t support: formation analysis, spatial clustering by team, shot-type classification from player positions, team distribution modeling across different field configurations.
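As a rough illustration of the spatial analysis the grid allows, the sketch below computes, for each team present in a frame, how many cells it occupies and the centroid of those cells. It works on an 8×8 list of lists; the function name team_centroids and the synthetic grid are made up for the example, and this is one simple summary, not the method used to build the dataset.

```python
from collections import defaultdict


def team_centroids(grid):
    """Return {team_id: (mean_row, mean_col, n_cells)} over non-empty cells."""
    cells = defaultdict(list)
    for r, row in enumerate(grid):
        for c, team in enumerate(row):
            if team != 0:  # 0 means the cell is empty
                cells[team].append((r, c))

    result = {}
    for team, coords in cells.items():
        n = len(coords)
        mean_row = sum(r for r, _ in coords) / n
        mean_col = sum(c for _, c in coords) / n
        result[team] = (mean_row, mean_col, n)
    return result


# Synthetic 8x8 grid: two MI (6) cells in row 2, one CSK (1) cell in row 5.
grid = [[0] * 8 for _ in range(8)]
grid[2][1] = 6
grid[2][3] = 6
grid[5][6] = 1

print(team_centroids(grid))
```

Centroids like these could feed a formation or team-distribution model, though a single mean position hides how spread out a team is.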
Suggested use cases
- Multi-label team classification from a broadcast frame
- Player count regression under real broadcast conditions
- Spatial team distribution modeling across shot types
- Few-shot learning benchmark for sports imagery
- Data augmentation base for IPL computer vision tasks
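For the first two use cases, the training targets can be derived straight from the row fields. The sketch below builds a 10-element multi-hot vector (which teams appear anywhere in the frame) and a float count target. The helper name build_targets and the synthetic row are assumptions for illustration; a real pipeline would also convert the image.

```python
NUM_TEAMS = 10  # team IDs run from 1 to 10; 0 is an empty cell


def build_targets(example):
    """Return (multi_hot_teams, count_target) for one dataset row."""
    cells = [int(example[f"c{i:02d}"]) for i in range(1, 65)]
    multi_hot = [0] * NUM_TEAMS
    for team in cells:
        if team != 0:
            multi_hot[team - 1] = 1  # team ID k maps to index k - 1
    return multi_hot, float(example["count"])


# Synthetic stand-in for a dataset row, so the snippet runs offline.
fake_example = {f"c{i:02d}": 0 for i in range(1, 65)}
fake_example["c01"] = 6   # MI
fake_example["c10"] = 1   # CSK
fake_example["count"] = 3

teams, count = build_targets(fake_example)
print(teams)  # [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
print(count)  # 3.0
```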
If you build something with it (a notebook, a model, a paper), I’d genuinely like to know. The dataset is at huggingface.co/datasets/goyaljai/IPL-Player-Detection-IITB-PML.
Frequently Asked Questions
Run from datasets import load_dataset; ds = load_dataset("goyaljai/IPL-Player-Detection-IITB-PML"). The train split has 793 examples and the test split has 212. Each example includes an image field (PIL Image), the c01–c64 grid labels, and a player count.
A cricket player detection dataset from IPL broadcast footage built at IIT Bombay for a Probabilistic Machine Learning course. 1,005 images with 8×8 team grid annotations and player counts across 10 IPL teams.
Broadcast footage has real conditions: motion blur, partial occlusions, broadcast overlays, variable lighting, crowds. Models trained on press photos don’t generalize to this. This dataset captures real match complexity.
Yes. The dataset is openly available on HuggingFace at huggingface.co/datasets/goyaljai/IPL-Player-Detection-IITB-PML and on Kaggle. No competition entry or account required.
