Data Scientist. Mathematics Master's (MMath) graduate from the University of Oxford.

Experience

Remunerated

(August 2024 - Present) Data Scientist - Ocado Technology

Working on the Automated Storage & Retrieval System team using data from robotics sensors.

(June 2022 - August 2024) Data Scientist - BIOStress

First full-time data science hire at an early-stage startup, responsible for all models and the data pipeline.

  • Developed a novel stress detection algorithm that demonstrated slight accuracy improvements over the state-of-the-art while significantly enhancing robustness and generalisability, leading to a patent submission
  • Migrated a manual data processing system requiring several hours per user to a fully automated, end-to-end pipeline, reducing processing time to less than a minute, and deployed it in compliance with ISO 27001 principles
  • Managed an intern for three months, providing guidance and setting projects to develop his skills

(July 2021 - September 2021) Research Intern - Oxehealth

10-week internship at a vision-based medical device company.

  • Provided insight into a new product area for the company by analysing literature, creating a module to handle polysomnography data, and training time series classifier models
  • Improved algorithm evaluation by building PySpark tools to audit large amounts of data, providing a scalable way of identifying misclassifications

Volunteering

(January 2024 - Present) Lead Statistician - Pulmonary Vascular Research Institute

Volunteering in association with the Royal Statistical Society as the primary statistician analysing data from a large-scale patient survey.

  • Conducting quantitative analysis across three manuscripts with medical researchers from the University of Cambridge
  • Developing a data-sharing framework to enable research reusability while ensuring patient privacy, reducing administrative overhead for PVRI

(August 2024 - September 2024) AI Safety Engineer - Arcadia Impact, AISI

  • Implemented a multimodal, zero- and multi-shot benchmark (MathVista) in AISI's Inspect framework and evaluated the performance of existing and novel models on it.

Education

(2018 - 2022) University of Oxford, Integrated Master's in Mathematics (MMath)

Areas of focus:

  • Statistics
  • Machine learning and deep learning
  • Numerical methods
  • Computational biology

Projects:

  • Master's thesis: Statistical behaviour of protein folding and Markov modelling
  • Master's project: Impact of adversarial training on natural accuracy in CNNs
  • Bachelor's thesis: Novel algorithm for solving overdetermined systems of matrices

(2016 - 2018) King's College London Mathematics School

A-Levels: Mathematics (A*), Further Mathematics (A*), Physics (A)

AS-Levels: Computer Science - Python (A), Further Additional Mathematics (A)

Expertise

Python: 5+ years of experience, following standards and practices such as PEP 8, type hinting, and testing. Selected libraries:

  • Data analysis and processing: Polars, Pandas, NumPy, SciPy, PySpark
  • Machine Learning: Scikit-Learn, TensorFlow, OpenAI, LangChain, PyTorch, Keras
  • Visualisations: Plotly, Matplotlib

Cloud: Set up end-to-end data pipelines from scratch and managed a migration from AWS to Azure.

  • Azure: Batch, Blob Storage, App Service, Database (PostgreSQL)
  • AWS: Batch, S3, Lambda, EC2
  • GCP: BigQuery, Looker

General software engineering: Version control and CI/CD (Git) with semantic versioning and conventional commits, Linux (Debian-based) on desktop and server, and containerisation (Docker).

General data science: SQL, multimodal data, a particular aptitude for quantitative analysis of time series and sensor data, and building robust, explainable machine learning models.

Publications

  • J. Newman, S. Munagala, M. Fay, G. Fischer, M. Granato, L. Howard, M. Kurzyna, L. Macdonald, G. Meszaros, E. Otter, M. Stone, K. Bunclark, M. Toshner, M. Tschida, PVRI IDDI Patient Engagement & Empowerment Workstream, PH GPS Consortium, J. Pepke-Zaba. 2024. Pulmonary Hypertension Global Patient Survey: a preliminary overview. [Poster]. European Respiratory Society Congress 2024, 7 September - 11 September. Vienna, Austria.
    • Winner of European Respiratory Society & European Lung Foundation Travel Grant for Best Abstract in Patient-Centered Research
  • J. Newman, S. Munagala, M. Granato, M. Kurzyna, L. MacDonald, G. Meszaros, E. Otter, M. Stone, M. Toshner, M. Tschida, J. Pepke-Zaba. 2024. Pulmonary Hypertension Global Patient Survey: a preliminary overview. [Poster]. 7th World Symposium On Pulmonary Hypertension, 29 June - 1 July. Barcelona, Spain.
  • S. Munagala (inventor), T. Routledge (inventor), BIOStress Lab Ltd. (applicant). Measurement of Physical Stress Response. [Pending Patent]. Patent Application Number GB2402167.7, Patents Journal Number 7037, UK Intellectual Property Office. Lodged: 16 February 2024.