Research Intern - Multimodal Foundation Model Vision
Listed on 2026-03-11
Software Development
Data Scientist, AI Engineer
Sony Corporation of America, located in New York, NY, is the U.S. headquarters of Sony Group Corporation, based in Tokyo, Japan. Sony's principal U.S. businesses include Sony Electronics Inc., Sony Interactive Entertainment LLC, Sony Music Entertainment, Sony Music Publishing and Sony Pictures Entertainment Inc. With some 900 million Sony devices in hands and homes worldwide today, a vast array of Sony movies, television shows and music, and the PlayStation Network, Sony creates and delivers more entertainment experiences to more people than anyone else on earth.
Sony AI is seeking research interns. Our team conducts fundamental and applied research, with a focus on building next-generation foundation models for vision in a responsible manner. As a research intern, you will develop efficient and effective methodologies and prototype solutions. You will work with a productive team of world‑class scientists and engineers to tackle the most challenging problems in foundation models and generative AI, including low‑cost yet powerful vision foundation models, vision‑language models, unified models, automatic model compression, optimization and deployment on cloud and edge.
Your ideas will be published in papers and improve the experience of billions of customers.
- Conduct fundamental and innovative development in low-cost yet powerful vision‑language models (VLM), unified models, automatic model compression, optimization and deployment on cloud and edge.
- Design or implement state‑of‑the‑art techniques for model compression, inference speedup, deployment on hardware, and tool automation.
- Build proofs of concept (PoC) for various vision+text and generation‑related tasks (VQA, captioning, understanding, etc.) and hardware.
- Contribute to library and tool development to support business, or publish influential research in top‑tier conferences and journals.
- Currently has, or is in the process of obtaining, a master's or PhD degree in computer science or a related field.
- Highly self‑motivated and capable of proposing and implementing innovative ideas.
- Solid presentation and communication skills to internal and external audiences.
- Publications or expertise in compact foundation model development and deployment. Influential open‑source projects or publications at top conferences, e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ACL, etc.
- Front‑end development experience is a plus.
- Solid coding skills in Python, PyTorch, etc.
Location flexible (Tokyo, Europe, US).
The target hourly rate for this internship is $50.00 per hour. The individual will be paid hourly and eligible for overtime.
All qualified applicants will receive consideration for employment without regard to any basis protected by applicable federal, state, or local law, ordinance, or regulation.
Disability Accommodation for Applicants to Sony Corporation of America
Sony Corporation of America provides reasonable accommodation for qualified individuals with disabilities and disabled veterans in job application procedures. For reasonable accommodation requests, please contact us by email at or by mail to:
Sony Corporation of America, Human Resources Department, 25 Madison Avenue, New York, NY 10010. Please indicate the position you are applying for.
Right to Work (English/Spanish)
E-Verify Participation (English/Spanish)