About the role
Who you are
- Education: Bachelor’s or Master’s degree in Data Science, Computer Science, Statistics, Bioinformatics, Health Informatics, or a related quantitative field
- Professional Experience: 4+ years of experience developing and deploying machine learning solutions in production environments, preferably within healthcare or clinical data ecosystems
- Microsoft Fabric Experience: Hands-on experience building and deploying data and ML workflows within the Microsoft Fabric ecosystem (OneLake, Notebooks, Spark, Data Factory)
- Machine Learning Proficiency: Strong grasp of classical machine learning algorithms (e.g., XGBoost, Random Forests, Logistic Regression) and modern deep learning techniques, specifically for tabular and time-series data
- Programming Skills: Advanced proficiency in Python and SQL. Experience with data manipulation and ML libraries (Pandas, PySpark, Scikit-Learn, PyTorch, or TensorFlow)
- Communication: Excellent ability to explain complex technical and statistical concepts to non-technical clinical and business stakeholders
- Cloud Certifications: Relevant Microsoft Azure or Fabric certifications
What the job involves
- We are seeking a highly skilled and mission-driven Data Scientist to join our customer team and lead the development of our Patient Risk Scoring machine learning initiatives
- In this role, you will build predictive models that identify high-risk patients for interventions, ultimately improving patient outcomes and optimizing care delivery
- You will leverage Microsoft Fabric to manage end-to-end machine learning workflows, drawing insights from Electronic Health Record (EHR) data sources
- Model Development & Deployment: Design, train, evaluate, and deploy machine learning models to predict patient risk scores (e.g., medication refusal, non-compliance, decompensation)
- EHR Data Engineering & Processing: Extract, clean, and transform healthcare data from EHR systems (e.g., Credible) to build robust feature sets for predictive modeling
- End-to-End Analytics with Microsoft Fabric: Utilize Microsoft Fabric’s unified analytics platform (including Data Engineering, Data Science, and Real-Time Analytics workloads) to orchestrate data pipelines, manage Lakehouse architectures, and scale ML training/inference
- Clinical Collaboration: Partner closely with clinical stakeholders, medical officers, and care teams to define risk cohorts, ensure the clinical validity of model features, and translate model outputs into actionable clinical workflows
- MLOps & Monitoring: Establish continuous integration, deployment, and monitoring of ML models to track data drift, model degradation, and fairness/bias over time
- Compliance & Privacy: Ensure all data handling and modeling practices strictly adhere to healthcare regulations (e.g., HIPAA, HITRUST) and maintain the highest standards of data security and patient privacy
- Full-time: 8 hours a day
- Schedule: Mon – Fri, 9-5 (US EST), with at least 4 hours of overlap with the team
- Team Structure: Work independently on assigned tasks with support from experienced team members
- Communication: Primarily asynchronous via email and MS Teams