AI Resident - Learning From Videos (LFV)
Posted 2026-05-06
Remote, USA
Full-time
Immediate Start
Toyota Research Institute (TRI) is focused on improving the quality of human life through advanced AI and robotics. The AI Resident position is a year-long research role for early-career researchers and engineers to contribute to the development of foundation models in embodied AI, particularly in multi-modal learning and spatio-temporal reasoning.
Responsibilities
- Develop, integrate, and deploy algorithms for multi-modal and 4D reasoning targeting physical applications
- Handle the ingestion of large-scale datasets for training, including streaming, online, and continual learning
- Contribute innovative solutions at the intersection of machine learning, computer vision, and robotics to improve real-world task performance
- Work closely with robotics and machine learning researchers and engineers to understand theoretical and practical needs
- Follow best practices for producing maintainable code, both for internal use and for open-sourcing to the scientific community
- Contribute to research publications and technical reports
Skills
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, Robotics, or a related technical field
- Exceptional candidates with equivalent research experience (e.g., strong publication record, open-source contributions, or industry research experience) are encouraged to apply
- Strong background in computer vision and its applications to robotics and embodied systems
- Demonstrated research experience through publications, technical projects, or open-source contributions
- Strong communication skills and a collaborative mindset, with the ability to learn quickly and contribute to team research efforts
- Passion for assisting older adults and those in need, and amplifying their abilities, through dexterous manipulation, human-robot collaboration, and physical assistance innovation
- Spatio-temporal (4D) computer vision, including multi-view geometry, 3D/4D reconstruction, video generation, self-supervised learning, occlusion reasoning, etc.
- Large-scale training of multi-modal deep learning methods, in terms of both dataset size and model complexity, including context length extension, efficient attention, distributed computing, etc.
- Application of machine learning and computer vision to embodied applications
Benefits
- Medical, dental, and vision insurance
- Paid time off benefits (including holiday pay and sick time)