Research Projects

Here is a summary of the three main research projects I’m working on right now. Each one combines physics goals, ML methods, and the practical software work needed to keep pipelines reliable and reproducible.

Pileup Mitigation with Flow Matching

Expand AI Research Group - University of Puerto Rico at Mayagüez

This project combines two ideas: machine-learning-based pileup mitigation and flow-based fast simulation. Pileup refers to the additional, unwanted proton-proton collisions that occur in the same bunch crossing as the hard-scattering event of interest. To address this, we study pileup-cleaning methods such as PUMML (Pileup Mitigation with Machine Learning), which uses convolutional neural networks to recover the leading-vertex neutral energy inside jets.

At the same time, we explore FlowSim, an end-to-end fast-simulation approach based on Flow Matching and Continuous Normalizing Flows. Instead of following the full conventional simulation chain (event generation, GEANT4-based detector simulation, digitization, reconstruction, and final analysis ntuples), the goal is to learn a direct mapping from generator-level information to realistic analysis-level observables.
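As a rough illustration of the Flow Matching objective behind this kind of model, the sketch below builds one training batch under an assumed straight-line path between a Gaussian noise sample and a target event, then evaluates the regression loss on the path velocity. The function names, the toy observables, and the linear path are assumptions made for illustration; they are not taken from the FlowSim code, which may use a different path and conditioning scheme.

```python
import numpy as np

def cfm_training_pair(x1, rng):
    """Sample (x_t, t, target velocity) for an assumed straight-line path."""
    x0 = rng.standard_normal(x1.shape)       # noise sample from the base distribution
    t = rng.uniform(size=(x1.shape[0], 1))   # one time per event, in [0, 1)
    x_t = (1.0 - t) * x0 + t * x1            # point on the line from x0 to x1
    target = x1 - x0                         # constant velocity along that line
    return x_t, t, target

def cfm_loss(predicted_velocity, target):
    """Mean over events of the squared velocity error."""
    return float(np.mean(np.sum((predicted_velocity - target) ** 2, axis=1)))

rng = np.random.default_rng(0)
# Toy stand-in for analysis-level observables (3 features per event).
reco = rng.normal(loc=50.0, scale=10.0, size=(256, 3))
x_t, t, target = cfm_training_pair(reco, rng)

# A real model would be a neural network v(x_t, t, gen) conditioned on
# generator-level inputs; here we only compare two reference predictions.
zero_model_loss = cfm_loss(np.zeros_like(target), target)
oracle_loss = cfm_loss(target, target)
```

A trained velocity network should land between these two reference values, and sampling then amounts to integrating the learned velocity field from noise at t = 0 to an event at t = 1.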

Our aim is to make this end-to-end simulation more realistic by including pileup in the generated events, and then to study how well different mitigation strategies recover clean physics observables.

Pileup Mitigation · Flow Matching / CFM · Fast Simulation
Goal
Build a pileup-aware flow-based simulation pipeline
Generate realistic analysis-level observables while studying how pileup affects jet structure and event-level physics distributions.
What I focus on
Compare pileup mitigation strategies
Validate observables with no pileup, with pileup, with PUPPI, with PUMML, and with FlowSim-based mitigation to quantify performance and failure modes.
Who is involved?
Guided by Dr. Arghya Chattopadhyay. Team members include Mateo E. Lisondo di Tada, Hugo Torres, and Iliomar Rodriguez Ramos.

Modeling CMS Run 3 Trigger Efficiency for Emerging Jets

Emerging Jets Analysis Research Group - CMS Collaboration

This project focuses on modeling the trigger efficiency for the CMS Run 3 Emerging Jets analysis. Emerging jets are a long-lived-particle (LLP) signature motivated by dark-sector models, where dark hadrons decay back to Standard Model particles at displaced locations inside the detector, producing jets with unusual track and vertex structure. This signature was originally proposed in the Emerging Jets paper.

In earlier Emerging Jets strategies, the trigger efficiency could often be approximated using global event variables such as HT, the scalar sum of jet transverse momenta. For Run 3, the analysis uses a broader LLP trigger program instead of relying only on standard HT-based triggers. The CMS Run 3 detector and trigger upgrades are described in the CMS Run 3 detector paper, while the dedicated Run 3 long-lived-particle trigger strategy is described in the CMS LLP trigger program paper.

Because each trigger responds to different reconstructed features, the combined trigger efficiency becomes a high-dimensional function of correlated observables. The goal of this work is to use neural networks to model that efficiency in a way that is accurate, validated, and useful for the final physics analysis.
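To make the idea of a combined efficiency concrete, here is a minimal sketch that takes the logical OR of two toy trigger decisions and measures its pass fraction in bins of HT. This kind of binned curve is the reference a per-event neural-network prediction can be checked against. The two toy triggers, their response functions, and all variable names are hypothetical and do not correspond to actual CMS trigger paths.

```python
import numpy as np

def binned_efficiency(x, passed, edges):
    """Pass fraction and simple binomial uncertainty of `passed` in bins of `x`."""
    idx = np.digitize(x, edges) - 1                  # bin index per event
    n_bins = len(edges) - 1
    in_range = (idx >= 0) & (idx < n_bins)
    total = np.bincount(idx[in_range], minlength=n_bins).astype(float)
    n_pass = np.bincount(idx[in_range],
                         weights=passed[in_range].astype(float),
                         minlength=n_bins)
    eff = np.divide(n_pass, total, out=np.full(n_bins, np.nan), where=total > 0)
    err = np.sqrt(eff * (1.0 - eff) / np.where(total > 0, total, np.nan))
    return eff, err

rng = np.random.default_rng(1)
n = 20000
ht = rng.uniform(200.0, 1200.0, size=n)              # toy HT in GeV
disp = rng.uniform(0.0, 1.0, size=n)                 # toy displaced-object score

p_ht = 1.0 / (1.0 + np.exp(-(ht - 700.0) / 50.0))    # toy HT-trigger turn-on
p_llp = 0.6 * disp                                   # toy LLP-trigger response
pass_ht = rng.uniform(size=n) < p_ht
pass_llp = rng.uniform(size=n) < p_llp
pass_any = pass_ht | pass_llp                        # combined decision: logical OR

edges = np.linspace(200.0, 1200.0, 11)
eff_any, err_any = binned_efficiency(ht, pass_any, edges)
eff_ht, _ = binned_efficiency(ht, pass_ht, edges)
```

In this toy the combined efficiency at low HT is driven entirely by the second variable, which a one-dimensional HT turn-on cannot describe. A classifier trained on `pass_any` with a binary cross-entropy loss would instead learn the per-event probability as a function of both inputs.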

CMS Experiment · Emerging Jets · LLP Triggers · Run 3 · Trigger Efficiency
Goal
Build a physics-ready Run 3 trigger-efficiency model
Learn the combined efficiency of multiple LLP-sensitive triggers as a function of reconstructed event-level and jet-level observables.
Why neural networks?
Run 3 efficiency is high-dimensional
Unlike a simple one-dimensional HT turn-on curve, the Run 3 LLP trigger strategy depends on several correlated displaced-object, jet, calorimeter, and muon-system features.
What I focus on
Training, validation, and uncertainty-aware evaluation
I work on robust model training, overtraining checks, efficiency validation plots, comparison across signal regions, and strategies for estimating modeling uncertainties.

NMF-Based Anomaly Detection for CMS Tracking Monitoring Elements

ML4DQM Group

This project applies Non-negative Matrix Factorization (NMF) to CMS Tracking Data Quality Monitoring (DQM) monitoring elements to identify anomalous behavior in detector and reconstruction outputs. The focus is on tracking occupancy histograms, especially two-dimensional monitoring elements such as track distributions in the η–φ plane.

The method uses known-good reference runs to learn a compact set of non-negative components that describe typical tracking behavior. Evaluation runs are then reconstructed using this learned basis, and anomalies are identified through reconstruction differences, residual patterns, and localized structures that are not well described by the reference model.
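The sketch below illustrates this train-then-reconstruct idea on toy data. It learns a two-component non-negative basis from synthetic reference maps using the standard multiplicative-update rule, then reconstructs a good map and a map with a localized excess while keeping the basis fixed. The toy patterns, the 8x8 binning, the helper names, and the choice of a plain NumPy solver are all assumptions for illustration and not the actual ML4DQM implementation.

```python
import numpy as np

def fit_nmf(V, n_components, n_iter=500, seed=0, eps=1e-9):
    """Factorize non-negative V (samples x bins) as W @ H via multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(0.1, 1.0, size=(V.shape[0], n_components))
    H = rng.uniform(0.1, 1.0, size=(n_components, V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def encode(V, H, n_iter=500, seed=1, eps=1e-9):
    """Find non-negative weights for new maps with the basis H held fixed."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(0.1, 1.0, size=(V.shape[0], H.shape[0]))
    HHt, VHt = H @ H.T, V @ H.T
    for _ in range(n_iter):
        W *= VHt / (W @ HHt + eps)
    return W

def residual_map(V, H):
    """Difference between one flattened map and its NMF reconstruction, as 8x8."""
    return (V - encode(V, H) @ H).reshape(8, 8)

# Two toy occupancy patterns on an 8x8 (eta, phi) grid.
i, j = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
pattern_a = np.exp(-0.5 * ((i - 3.5) / 2.0) ** 2)
pattern_b = 1.0 + 0.5 * np.cos(2.0 * np.pi * j / 8.0)
basis = np.stack([pattern_a.ravel(), pattern_b.ravel()])

rng = np.random.default_rng(42)
reference = rng.uniform(0.5, 1.5, size=(100, 2)) @ basis   # toy "good" maps
_, H = fit_nmf(reference, n_components=2)

good = rng.uniform(0.5, 1.5, size=(1, 2)) @ basis
anomalous = good.reshape(8, 8).copy()
anomalous[3:5, 0:2] += 2.0                                 # localized excess
anomalous = anomalous.reshape(1, 64)

res_good = residual_map(good, H)
res_anom = residual_map(anomalous, H)
mse_good = float(np.mean(res_good ** 2))
mse_anom = float(np.mean(res_anom ** 2))
```

Because the basis only spans patterns seen in the reference maps, the excess cannot be absorbed by the weights and shows up in the residual map at the affected bins.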

My work includes preparing DQMIO inputs, applying tracking DCS-on (Detector Control System) filters, normalizing histograms per lumisection, training the NMF model on reference runs, evaluating known bad runs, and testing the response of the method by injecting artificial holes or localized defects into occupancy maps.
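The preprocessing steps in that list can be sketched as follows, assuming the occupancy maps have already been read into a NumPy array of shape (lumisections, eta bins, phi bins) and that a per-lumisection DCS flag is available as a boolean list. The array layout, helper names, and toy Poisson inputs are hypothetical; reading real DQMIO files is not shown. The source does not say whether holes are injected before or after normalization, so this sketch injects first as one possible choice.

```python
import numpy as np

def select_dcs_on(hists, dcs_on):
    """Keep only lumisections whose (hypothetical) DCS flag is True."""
    return hists[np.asarray(dcs_on, dtype=bool)]

def normalize_per_lumisection(hists, eps=1e-12):
    """Scale each 2D map to unit integral so lumisections are comparable."""
    totals = hists.sum(axis=(1, 2), keepdims=True)
    return hists / np.maximum(totals, eps)

def inject_hole(hist, eta_slice, phi_slice):
    """Return a copy of one occupancy map with a rectangular region zeroed."""
    holed = hist.copy()
    holed[eta_slice, phi_slice] = 0.0
    return holed

rng = np.random.default_rng(3)
# 5 toy lumisections with 8x8 (eta, phi) bins and Poisson-distributed counts.
raw = rng.poisson(lam=100.0, size=(5, 8, 8)).astype(float)
dcs_on = [True, True, False, True, True]

good = select_dcs_on(raw, dcs_on)
norm = normalize_per_lumisection(good)

holed = inject_hole(good[0], slice(2, 4), slice(5, 7))
holed_norm = normalize_per_lumisection(holed[None])[0]
```

Normalizing after the injection redistributes the missing occupancy across the remaining bins, so the defect appears as a deficit in the hole and a small uniform excess elsewhere.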

CMS DQM · Tracking · NMF · Anomaly Detection
Goal
Detect anomalous tracking behavior with NMF
Learn typical tracking occupancy patterns from good runs and use reconstruction residuals to flag unusual detector or reconstruction behavior.
What I focus on
Tracking occupancy histograms per lumisection
I process DQMIO monitoring elements, apply DCS-on selections, normalize each lumisection, and evaluate how well NMF reconstructs good and problematic runs.
Validation strategy
Known bad runs and injected defects
I test the method on bad runs and controlled artificial anomalies, such as injected holes, to study whether the residual maps highlight physically meaningful problem regions.