Data Scientist

Posted 2026-05-06
Remote, USA | Full-time | Immediate Start




This is a remote position.

Role Overview

We are looking for a highly skilled Data Scientist to design and deploy end-to-end production pipelines for multimodal data synthesis. You will build sophisticated VAE architectures, advanced video processing modules, and scalable synthetic data generation systems, delivering the work in phases under an Agile process.

Key Responsibilities



  • Build standalone, testable modules for video metadata extraction, critical frame selection, and automated scene analysis.


  • Lead the design of VAE-based systems for complex data imputation and synthesis.


  • Execute a three-phase delivery model: 1) Extraction & Architecture, 2) Synthesis & Imputation, and 3) Cross-Component Optimization.


  • Implement rigorous integration testing and quality metrics to ensure the fidelity of synthetic outputs.







Requirements





  • Deep expertise in VAE architecture design, training, and latent space manipulation for high-dimensional data synthesis and imputation.


  • Proven experience with 3D CNNs, scene change detection (inter-frame histograms), and motion analysis (optical flow/peak detection).


  • Ability to generate and fuse synthetic data across tabular, text, audio, and video formats using statistical modeling and Gaussian copulas.


  • Advanced Python skills with a focus on modular design, production-level pipelines, and ffmpeg integration for audio/video handling.


  • Familiarity with specialized techniques such as MDH/CDS/DSM exclusion and multimodal merger integration.








