The Rutgers Artificial Intelligence and Data Science (RAD) Collaboratory

Undergraduate Diversity Research Experiences

Summer Undergraduate Research Fellowship

The Rutgers Artificial Intelligence and Data Science (RAD) Collaboratory, in collaboration with the Office of the Vice Provost for Research (OVPR), invites Rutgers-New Brunswick undergraduate students to participate in the RAD Collaboratory Summer Undergraduate Research Fellowship program. The goal of this program is to provide Rutgers-New Brunswick rising junior and senior undergraduate students with opportunities to participate in hands-on, in-person research projects in artificial intelligence and/or data science during summer 2025. The program will run for ten weeks, from May 27, 2025 to August 1, 2025, and will include relevant programming, networking, and social events.

This program will be conducted in coordination with the Rutgers Aresty Research Center’s Summer Science Program with oversight by the RAD Collaboratory and the OVPR.

Each RAD Collaboratory Summer Undergraduate Research Fellow (Rutgers-New Brunswick rising juniors/rising seniors) will receive a $6,000 stipend. The OVPR will pay for on-campus student housing (as required). Rutgers Housing is available for selected Fellows for the duration of the program, with a move-in date of May 27 and a move-out date of August 1. This housing will be at no cost to the selected Fellows.

Requirements:

  • In-person hands-on research project in artificial intelligence and/or data science during summer 2025 (May 27, 2025 to August 1, 2025). Research projects are to be conducted hands-on and in-person in the Faculty Mentor’s research laboratory. There is no virtual option.
  • At the conclusion of the program, each RAD Collaboratory Summer Undergraduate Research Fellow is required to attend and present a poster about their research at the Aresty Research Center’s Summer Science Program Poster Reception (July 31, 2025).
  • Each Fellow is required to attend the 1st Annual RAD Collaboratory Research Symposium (Date TBD in September or October 2025), where they will showcase their work through lightning talks and/or posters and network across the RAD Collaboratory.
  • Students must be rising juniors or rising seniors at Rutgers-New Brunswick. To be eligible, students must be full-time and in good academic standing at Rutgers-New Brunswick.
  • Previous experience in research is a plus but not necessary.
  • If selected for this program, you cannot enroll in any summer courses or hold employment during the program period.
  • Selected students are required to attend all program events unless specified as optional.

Selection process:

  • Faculty mentors will evaluate applicants and select top candidates for interviews based on their academic records and the compatibility of their interests with faculty projects. Not all applicants will receive an interview.
  • Please refrain from contacting faculty mentors with questions or requests to discuss their research projects during the application period. Faculty mentors will conduct interviews in March, and all finalists will be notified by Friday, April 11, 2025.

Before applying, please make sure you have the following information ready:

  • Essay. PDF only. This typed 3-4 page essay should describe your qualifications for, and interest in, up to three (3) RAD Collaboratory Summer Undergraduate Research Fellowship Program AI/DS research projects (see below). The essay should also include your intended post-graduate plans and career goals.
  • Resume. PDF only. Your resume must include information about employment history, research experience, and any organizations and/or activities you have been involved in.
  • Rutgers unofficial transcript. When you submit your application, your unofficial transcript will be sent to the Rutgers Aresty Research Center’s office.

Applications to be considered as a RAD Collaboratory Summer Undergraduate Research Fellow must be submitted to the Rutgers Aresty Research Center’s Summer Science Program application portal (application is forthcoming) and are due at 11:59 pm EST on Wednesday, March 26, 2025. Applications will be reviewed by the RAD Collaboratory Executive Steering Committee, the OVPR, and Faculty Mentors. All applicants will be notified by Friday, April 11, 2025.

Please contact the Office of the Vice Provost for Research, Assistant Director for Research Development, Amy Mandelbaum (amy.mandelbaum@rutgers.edu), with any questions.

Artificial Intelligence/Data Science Research Projects

Physical AI
  • Faculty Mentor: Kostas Bekris, Computer Science, School of Arts and Sciences
  • Project Summary: Physical AI enables autonomous machines like robots and self-driving cars to perceive, understand, and perform complex actions in the real, physical world. The participating students will gain experience in working with physics-based simulation for physics-aware intelligent machines, such as robots. They will be assisted in developing and evaluating algorithmic and machine learning approaches, such as those based on imitation and reinforcement learning, for generating autonomous motion. Any simulated systems will correspond to physical machines that are also available in Prof. Bekris’ lab, so solutions can also be tested on a real platform.
  • Project Website: https://pracsys.cs.rutgers.edu/
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: GPA>3.6; Example majors but not a requirement: Computer Science, Electrical or Computer Engineering, Mechanical Engineering, Cognitive Science; Programming skills
  • Number of Fellowships Available: Two (2)
  • Location: Computer Science Robotics Lab, 1 Spring Street, 3rd Floor, College Avenue campus

Building RCSB.org software tools to enable breakthroughs in research and education

  • Faculty Mentor: Stephen Burley, RCSB Protein Data Bank, School of Arts and Sciences
  • Project Summary: RCSB PDB is a global online resource that provides access to atomic-level information about proteins, nucleic acids, and complex macromolecular assemblies available in the PDB archive through development of tools and resources for research and education in molecular biology, structural biology, computational biology, and beyond. PDB data are crucial to users around the world; our website supports many millions of users each year. PDB data are also redistributed by ~500 external data resources, and stored for reuse inside the firewalls of all major biopharmaceutical companies and many biotechnology companies.

    Undergraduate researchers would be involved with Python-based data science software development projects that may include: creating and expanding upon Python wrappers for RCSB.org APIs (a project described in a recently published Journal of Molecular Biology paper https://doi.org/10.1016/j.jmb.2025.168970), developing tools for handling complex structural data produced by integrative experimental and computational methods (https://pdb-ihm.org/), or other data-intensive software development projects, such as tools supporting BinaryCIF.

  • Project Website: https://www.rcsb.org/
  • Fellow Responsibilities: Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: The RCSB Protein Data Bank (PDB) is looking for two rising juniors/seniors in Computer Science and/or Electrical and Computer Engineering to build data science bioinformatics tools for use by many millions of biological and biomedical researchers and their trainees worldwide. Python experience is required.
  • Number of Fellowships Available: Two (2)
  • Location: Proteomics Building, Busch campus

Never let a good crisis go to waste: AI-designed Enzymes for Sargassum deconstruction

  • Faculty Mentor: Sagar Khare, Chemistry and Chemical Biology, School of Arts and Sciences
  • Project Summary: Humans face myriad pressing anthropogenic challenges for which Nature has not had sufficient evolutionary opportunity (i.e., time) to develop effective solutions and where AI-guided methods can potentially help. For example, in the last ~15 years there have been devastating blooms of the macroalga Sargassum in the Atlantic Ocean depositing millions of tons of biomass on the shores of many states and island nations, leading to untold ecological and economic damage. While we have developed highly effective enzyme-based biotechnologies for the degradation of land-based biopolymers like cellulose, there are few known naturally occurring enzymes for the degradation of marine materials like the polysaccharide polymers in Sargassum. New enzymes that can effectively deconstruct these polymers would be critical components of the Sargassum biorefineries that a team of us at Rutgers is building in collaboration with several institutions. These biorefineries would use Sargassum as an alternative feedstock, thereby converting an environmental crisis into an opportunity for sustainable development.

    Recent developments in protein structure prediction and design (recognized by the Nobel Prize in Chemistry 2024) are enabling the design of novel protein structures tailored towards new functions. However, both mechanistic understanding and tight integration with experimental feedback are essential to make the design of tailored enzymes routine and robust. High throughput experimental screens can be used to generate training data and validate design models. Effectively dealing with uncertainties and heterogeneous sources of data using multi-modal and agentic deep learning architectures remains an outstanding problem for future developments.

    This project will use a combination of biomolecular AI methods in conjunction with physics-based models to robustify candidate enzymes, discovered via bioprospecting efforts, for degrading the recalcitrant polymers that constitute marine macroalgae like Sargassum.

    We are developing AI-in-the-loop self-driving labs for automating the process of enzyme design. Robotic liquid handling (set up in CCB by Khare) will be used to generate data for ~100s of enzyme variants per day. Molecular simulations will bring new insights into how protein structure and dynamics enable catalytic functions of enzymes. These data will be used to develop active learning approaches for optimization using features from molecular designs and simulations. To accelerate enzyme design within our self-driving lab, we will utilize Reinforcement Learning (RL) algorithms to guide the iterative design-build-test-learn cycle, optimizing enzyme variants based on feedback from high-throughput experimental screens and physically meaningful simulations.

  • Project Website: https://sagardkharelab.org/ and https://www.schmidtsciences.org/viff/
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Experience with programming in Python, PyTorch (or equivalent), and familiarity with the principles and architectures of deep learning models is required. Knowledge about or experience using biomolecular deep learning models (e.g., AlphaFold, protein language models, protein design models) and some understanding of protein structures/functions would be preferred but not required (we can help with learning those). No specific major is required or preferred. Resilience, excitement, and the eagerness/ability to learn are the most important skills we are looking for.
  • Number of Fellowships Available: One (1)
  • Location: Proteomics Building, Busch campus

Deep Learning-Driven Integration of Unpaired Single-Cell Transcriptomic and Epigenomic Data for Discovering Novel Therapeutic Targets in T Cells

  • Faculty Mentor: Jiekun Yang, Genetics, School of Arts and Sciences
  • Project Summary: Gene regulatory networks (GRNs) are crucial in understanding how genes are expressed in a cell: in response to intracellular and extracellular signals, gene transcription is dynamically regulated to coordinate cellular activities. GRNs are computational models of the regulation of gene expression, taking the form of a network or a graph, defined in mathematical terms. The basic interpretation of these models aims to capture the relationship between transcription factors (TFs) and their target genes: in a graphical model, the nodes are the genes or TFs, and the edges represent the relationship between them.

    Historically, GRNs took center stage as experimental techniques and computational algorithms for GRN construction and inference proliferated. Given the amount of transcriptomics data available, GRNs can be inferred to better understand the biological problem at hand, though the data may not capture any underlying regulatory mechanisms directly. By considering epigenetic aspects of gene regulation, such as chromatin conformation (e.g., Hi-C, HiChIP) and TF motif accessibility (e.g., chromVAR), we can generate GRNs that have the potential to better represent gene regulation in vivo. Bulk profiling, which mixes measurements across cell types, also causes problems because GRNs specific to one cell cannot be distinguished. Single-cell technologies, along with the introduction of multi-modal profiling technologies, address this by allowing the inference of GRNs across different cell types and states.

    Single-cell RNA sequencing (scRNA-seq) measures gene expression in single cells, specifically through the detection and quantitative analysis of messenger RNAs, and is useful for studying cellular response. It allows for the assessment of transcriptional differences between individual cells in rare cell populations that would otherwise go undetected.

    Single-nucleus ATAC sequencing (snATAC-seq), where ATAC stands for Assay for Transposase-Accessible Chromatin, measures chromatin accessibility across the genome using the Tn5 transposase to reveal regulatory genomic regions in a nucleus. While each modality can independently identify cell types and states, matching RNA and ATAC profiles remains challenging. RNA expression changes often lag behind chromatin accessibility alterations, and identical ATAC profiles can correspond to different RNA profiles due to variations in TF binding. Therefore, the joint profiling of gene expression and chromatin accessibility is necessary for resolving bona fide GRNs and revealing new insights into the cells. However, joint-profiling data is significantly scarcer than single-modality data due to high costs and limited capture efficiency. This scarcity necessitates accurate computational methods to integrate scRNA-seq and snATAC-seq for effective GRN analysis.

    Integration methods aim to align cells profiled by separate technologies and project them into a shared low-dimensional space, enabling the analysis of open chromatin regions (primarily distal elements like enhancers) alongside their corresponding RNA expression levels, a challenge known as “linking”. Various integration methodologies have been developed, which can be categorized into five main approaches: (1) Matrix Factorization & Factor Analysis (e.g., LIGER); (2) Manifold Alignment & Optimal Transport (e.g., SCOT, UnionCom); (3) Deep Learning & Variational Autoencoders (VAEs) (e.g., BABEL, scJoint, GLUE); (4) Gene Activity Scoring & Feature Matching, where tools like Cicero predict gene activity scores from snATAC-seq peaks, and Seurat converts peaks to gene activity scores; and (5) Reference Mapping & Label Transfer, where tools like BindSC, FigR, and Seurat use feature correlation in various forms. Despite advancements, challenges remain in this rapidly evolving field.

    A key issue is the mismatch between feature spaces in different modalities, leading to potential information loss. As data volume grows, computational methods must be both scalable and accurate to capture the non-linear relationship between chromatin accessibility and gene expression. Additionally, they must be robust to high dropout rates and technical noise inherent to scRNA-seq and snATAC-seq. To address these challenges, deep learning algorithms are increasingly being applied. These algorithms use neural networks loosely modeled on the brain, with multiple processing layers that learn representations of data at different levels of abstraction. Deep learning has shown improved performance in the analysis of bulk multi-omics data, owing to its flexible architecture and its ability to capture latent features from the combined high-dimensional omics feature space. We prioritize scJoint and GLUE for their superior performance in benchmarking studies, particularly in handling unpaired data and addressing key integration challenges.

    In this proposal, we will evaluate scJoint and GLUE using scRNA-seq and snATAC-seq data from individual T cells in our metastatic melanoma cohort. T cells are pivotal in immunotherapy response, with “stem-like” and “exhausted” states critically influencing outcomes. We hypothesize that epigenetic regulators like TOX (linked to exhaustion) and TCF7 (associated with stemness) will emerge as key GRN hubs. Successful integration must preserve cell-subtype specificity (e.g., distinguishing effector vs. memory regulatory logic) and align with orthogonal datasets (e.g., CITE-seq). By rigorously evaluating scJoint and GLUE on our metastatic melanoma T cell dataset, this work will advance our understanding of how epigenetic regulation governs T cell functional states in tumor immunity. Successful integration will enable the identification of key TFs and enhancer-promoter interactions driving exhaustion and stem-like phenotypes, offering mechanistic insights into immunotherapy resistance.

    Furthermore, by incorporating prior knowledge (e.g., Cicero-predicted peak-gene links, chromatin conformation data or TF motif accessibility) and exploring time-series alignment or dynamical models (e.g., RNA velocity extensions) to resolve the lag between RNA and ATAC, we aim to refine these models to resolve distal regulatory connections and reduce spurious associations. We will prioritize identified targets for in vitro T cell assays and in vivo mouse models with collaborators. The resulting GRNs will not only shed light on T cell plasticity but also nominate combinatorial therapeutic targets to reinvigorate dysfunctional T cells. This approach establishes a framework for leveraging unpaired multi-omics data to decode regulatory biology in complex tissues, with broad applicability beyond cancer immunology.

  • Project Website: https://www.yangcompbio.org/research
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Minimum cumulative GPA of 3.0/4.0 (or equivalent), with preference for students demonstrating strong performance in relevant coursework (e.g., computer science, computational biology, immunology, cancer biology). Open to undergraduates in Computer Science, Data Science, or Bioinformatics. Students with interdisciplinary backgrounds (e.g., computational biology, systems biology) are strongly encouraged to apply. Proficiency in Python or R (required); familiarity with single-cell analysis tools (e.g., Seurat, Scanpy, scvi-tools) is a plus. Basic understanding of statistical concepts (e.g., hypothesis testing, dimensionality reduction) and machine learning principles. Exposure to genomics (e.g., RNA-seq, ATAC-seq) or coursework in molecular biology. Prior experience with omics data is advantageous but not mandatory. Exceptional candidates with gaps in specific areas but demonstrated enthusiasm for computational biology or cancer immunology will be considered, provided they commit to structured training (e.g., guided tutorials).

    Ability to troubleshoot technical challenges and interpret complex biological data. Strong written and verbal communication skills for collaborative work and presenting results. Self-motivated learners willing to engage with computational literature and rapidly acquire new skills (e.g., deep learning frameworks like PyTorch).

  • Number of Fellowships Available: One (1)
  • Location: Life Sciences Building, Busch campus
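
The project summary above describes a GRN as a graph whose nodes are TFs or genes and whose edges link a TF to its targets. As a minimal sketch of that data structure only (not the lab's method, and not how scJoint or GLUE represent networks), the Python below stores a toy GRN as an adjacency map and ranks TFs by their number of targets, which is one simple way to look for "hub" regulators. All node names (`TF_A`, `GENE_1`, and so on) and edges are hypothetical placeholders, not real biology.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_grn(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each regulator (TF) to the set of targets it points to."""
    grn: Dict[str, Set[str]] = defaultdict(set)
    for regulator, target in edges:
        grn[regulator].add(target)
    return dict(grn)


def rank_hubs(grn: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Rank regulators by out-degree (number of targets), highest first.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    degrees = ((regulator, len(targets)) for regulator, targets in grn.items())
    return sorted(degrees, key=lambda item: (-item[1], item[0]))


# Hypothetical edges: (regulator, target). A TF can also regulate another TF.
toy_edges = [
    ("TF_A", "GENE_1"),
    ("TF_A", "GENE_2"),
    ("TF_A", "GENE_3"),
    ("TF_B", "GENE_2"),
    ("TF_B", "TF_A"),
]

toy_grn = build_grn(toy_edges)
hubs = rank_hubs(toy_grn)
print(hubs)
```

In a real analysis the edges would come from an inference method and would usually carry weights or confidence scores; out-degree is only one of several possible hub definitions.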

Mapping Local and Systemic Cell-Cell Communication in Mouse Embryogenesis: An Integrative Framework for Spatial Transcriptomics

  • Faculty Mentor: Jiekun Yang, Genetics, School of Arts and Sciences
  • Project Summary: All cells contain essential information in the form of messenger RNA (mRNA), which determines both individual cell function and overall tissue organization. The genetic information encoded in mRNA, known as the transcriptome, can be analyzed using single-cell and spatial transcriptomics.

    Single-cell transcriptomics provides high-resolution insights into gene expression by analyzing the transcriptome one cell at a time. However, since this method requires cell isolation, it results in a loss of spatial information, including the cell’s original location, neighboring cells, and the interactions contributing to tissue function. These spatial relationships are crucial for understanding tissue organization and cellular behavior, necessitating alternative techniques to preserve them.

    Spatial transcriptomics overcomes this limitation by mapping gene expression within intact tissues, offering insights into biological and pathological processes at both the cellular and tissue levels. Cell-cell interactions (CCIs) regulate key processes such as growth, differentiation, and tissue development. Traditional methods for studying CCIs were limited in scale and lacked spatial resolution. However, spatial transcriptomics enables a more comprehensive mapping of these interactions by leveraging ligand-receptor interaction (LRI) inference tools.

    Computational approaches for LRI inference involve identifying cell types and statistically assessing potential ligand-receptor (LR) pairs within spatially resolved tissue sections. These methods typically compare observed interactions against curated databases of known ligand-receptor pairs to determine their functional significance. Since CCIs primarily occur between proximal cells, additional filtering steps refine the dataset to identify biologically meaningful interactions.

    Several strategies exist for inferring LRIs from spatial transcriptomics data. Correlation-based methods, such as Spearman correlation, assess co-expression of ligand and receptor genes across spatial locations (e.g., ScHOT and SpatialCorr). Alternatively, optimal transport-based algorithms incorporate spatial distance as a cost factor, modeling numerous LR pairs simultaneously, identifying signaling gradients and summarizing directional communication (e.g., COMMOT). Other notable tools include: SpatialDM, testing LR co-expression using a bivariate Moran’s I statistic to measure spatial dependency; Niche-LR, identifying LR signaling that underlies genes differentially expressed in spatially defined niches; and Copulacci, a count-based model accounting for dependencies between the expression of ligands and receptors from nearby spatial locations even when the transcript counts are low.

    Each of these methods has distinct advantages and limitations. Correlation-based approaches, like Spearman correlation, are computationally efficient and straightforward, making them useful for rapid assessments of expression patterns. However, they may capture false associations and fail to account for spatial constraints. Optimal transport-based approaches, such as COMMOT, effectively model proximity-dependent interactions and capture directional signaling (e.g., source-to-target) but require precise spatial coordinates and high-quality data and are computationally intensive. Database-driven methods, like SpatialDM, provide results by scanning tissues for known ligand-receptor pairs, but their accuracy depends on the completeness of existing interaction databases, potentially leading to missed interactions.

    To systematically evaluate the performance of COMMOT, SpatialDM, Niche-LR, and Copulacci in capturing CCIs during mouse embryogenesis, we will apply these tools to our in-house spatial transcriptomics dataset, which spans critical developmental stages (e.g., gastrulation, organogenesis). Each method will be assessed on its ability to recover known developmental signaling pathways (e.g., Wnt, FGF, Hedgehog) and spatially resolved LR pairs validated in prior literature. For example, COMMOT’s optimal transport framework will be tested for modeling morphogen gradients (e.g., BMP4 in ectoderm patterning), while SpatialDM’s Moran’s I statistic will evaluate spatial autocorrelation of LR pairs like Notch-Delta in somite formation. Performance metrics will include spatial coherence (e.g., overlap with embryonic anatomical landmarks), consistency across biological replicates, and computational efficiency. Challenges such as false positives from auto-/juxtacrine signaling will be mitigated by filtering interactions based on spatial distance thresholds (e.g., excluding pairs within the same cell/spot).

    Building on these benchmarks, we will adapt the four tools to model distal interactions, including endocrine and metabolic signaling, which are critical in embryogenesis but poorly captured by existing spatial methods. For endocrine signaling (e.g., hormonal crosstalk between placenta and embryo), we will incorporate ligand diffusion models into COMMOT’s optimal transport framework and extend Niche-LR’s niche definitions to include distal cell types (e.g., hormone-producing cells in extraembryonic tissues). To address metabolic signaling (e.g., nutrient exchange between yolk sac and embryo), we will integrate metabolite spatial imaging data with Copulacci’s copula models to infer dependencies between metabolic enzyme expression and nutrient transporter genes. SpatialDM’s bivariate Moran’s I will be modified to account for long-range correlations (e.g., insulin-like growth factor signaling across tissues). Validation will leverage knockout mouse models with disrupted endocrine/metabolic pathways (e.g., Igf2 mutants) to test predicted interaction losses. By augmenting these tools with spatial multi-omics data and dynamic modeling, our framework will advance the study of both local and systemic interactions in development.

    This dual approach will advance spatial transcriptomics beyond niche-centric analysis, providing a unified framework to dissect both local and organism-wide communication driving development.

  • Project Website: https://www.yangcompbio.org/research
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Minimum cumulative GPA of 3.0/4.0 (or equivalent), with preference for students demonstrating strong performance in relevant coursework (e.g., computer science, computational biology). Open to undergraduates in Computer Science, Data Science, or Bioinformatics. Students with interdisciplinary backgrounds (e.g., computational biology, systems biology) are strongly encouraged to apply. Proficiency in Python or R (required); familiarity with single-cell analysis tools (e.g., Seurat, Scanpy, Squidpy) is a plus. Basic understanding of statistical concepts (e.g., hypothesis testing, dimensionality reduction). Exposure to genomics (e.g., RNA-seq) or coursework in molecular biology. Prior experience with omics data is advantageous but not mandatory. Exceptional candidates with gaps in specific areas but demonstrated enthusiasm for computational biology will be considered, provided they commit to structured training (e.g., guided tutorials).

    Ability to troubleshoot technical challenges and interpret complex biological data. Strong written and verbal communication skills for collaborative work and presenting results. Self-motivated learners willing to engage with computational biology literature and rapidly acquire new skills.

  • Number of Fellowships Available: One (1)
  • Location: Life Sciences Building, Busch campus
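
The project summary above mentions correlation-based LRI inference, in which Spearman correlation assesses the co-expression of ligand and receptor genes across spatial locations. The sketch below illustrates just that idea in plain Python: it ranks each gene's expression across spots (averaging tied ranks) and correlates the ranks for each candidate ligand-receptor pair. It is an assumption-laden toy, not the algorithm used by ScHOT, SpatialCorr, or any other tool named above; the gene names (`LIG1`, `REC1`, `REC2`) and expression values are made up for illustration, and, as the summary notes for correlation-based approaches, it ignores spatial distance entirely.

```python
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all positions holding the same value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def score_lr_pairs(
    expression: Dict[str, Sequence[float]],
    lr_pairs: Sequence[Tuple[str, str]],
) -> Dict[Tuple[str, str], float]:
    """Score each (ligand, receptor) pair by co-expression across spots."""
    return {
        (ligand, receptor): spearman(expression[ligand], expression[receptor])
        for ligand, receptor in lr_pairs
    }


# Hypothetical expression values, one entry per spatial spot.
toy_expression = {
    "LIG1": [0, 1, 2, 3, 4, 5],
    "REC1": [1, 2, 4, 8, 16, 32],  # rises with LIG1
    "REC2": [5, 4, 3, 2, 1, 0],    # falls as LIG1 rises
}
scores = score_lr_pairs(toy_expression, [("LIG1", "REC1"), ("LIG1", "REC2")])
print(scores)
```

A high score here only says the two genes vary together across spots; a spatially aware method would additionally require the ligand and receptor to be expressed in nearby locations.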

Understanding social media consumption on Telegram and YouTube

  • Faculty Mentor: Kiran Garimella, Library and Information Science, School of Communication & Information
  • Project Summary: We have collected Telegram and YouTube data from hundreds of users during the 2024 US elections. The aim of this project is to analyze the data and understand the types of information Americans consume during the elections. The project will require handling large amounts of social media data and applying advanced natural language processing (NLP) and machine learning (ML) skills to understand and categorize the content.
  • Project Website: https://comminfo.rutgers.edu/garimella-kiran
  • Fellow Responsibilities: Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Understanding social media consumption on Telegram and YouTube
  • Number of Fellowships Available: One (1)
  • Location: School of Communication & Information, College Avenue campus

Evaluating the Robustness of Large Language Models Across Languages and Modalities

  • Faculty Mentor: Vivek Singh, Library and Information Science, School of Communication & Information
  • Project Summary: The robustness of Large Language Models (LLMs) is a critical area of study for advancing the reliability and versatility of AI applications. This project aims to comprehensively investigate the robustness of LLMs across different languages and modalities. By analyzing performance variations and identifying potential vulnerabilities, we seek to establish a framework for enhancing the robustness of future AI systems. Special attention will be given to lower-resourced languages, where the need for robust AI solutions is particularly significant. The insights gained from this study will provide valuable guidance for developing more resilient and adaptable AI technologies, ensuring their effectiveness in diverse linguistic and multimodal contexts.
  • Project Website: https://sites.comminfo.rutgers.edu/vsingh/
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Python programming. GPA > 3.2. Interest in Large Language Models; experience is optional. Interest in cybersecurity; experience is optional.
  • Number of Fellowships Available: One (1) or two (2)
  • Location: School of Communication & Information, 4 Huntington Street, College Avenue campus

Trustworthy and Scalable AI Model Training Through Robust Decentralized Learning

  • Faculty Mentor: Waheed Bajwa, Electrical and Computer Engineering, School of Engineering
  • Project Summary: As artificial intelligence (AI) applications continue to grow in scale, centralizing data for model training becomes increasingly impractical. This is especially true in domains where data is generated at the edge, such as autonomous vehicles, smart cities, healthcare monitoring, and large-scale sensor networks. In these applications, transmitting vast amounts of data to a central server for training is both inefficient and impractical due to bandwidth constraints, latency requirements, and privacy concerns. As a result, distributed and decentralized AI frameworks have emerged as essential alternatives, enabling AI models to be trained directly on edge devices while aggregating insights across a broader network.

    Despite their advantages, decentralized AI frameworks remain highly vulnerable to adversarial attacks. Unlike centralized learning, where data integrity can be more easily enforced, decentralized systems must aggregate model updates from multiple, often untrusted, sources. Research in robust decentralized learning has demonstrated that even a single malicious participant can significantly disrupt the learning process, skewing model updates and leading to arbitrary or harmful outcomes. These vulnerabilities pose serious challenges, particularly in safety-critical applications like autonomous driving, medical diagnostics, and financial forecasting. Ensuring the integrity of decentralized AI systems requires the implementation of rigorous screening mechanisms rooted in robust statistics, which help detect and mitigate adversarial manipulations without compromising efficiency or scalability.

    The INSPIRE Lab has been at the forefront of adversarially resilient AI, with support from the Army Research Office (ARO) and the National Science Foundation (NSF). This project builds on that foundation by exploring robust decentralized AI techniques that enhance both scalability and security in AI training. The research will focus on developing methodologies that ensure AI models trained in decentralized environments remain reliable, even in the presence of adversarial participants.

    Throughout the project, the participating student will first explore various distributed and decentralized AI algorithms, gaining a strong theoretical foundation in the mathematical and statistical principles that underpin these methods. This includes an introduction to key concepts in federated learning, consensus algorithms, Byzantine resilience, and robust aggregation techniques. Next, the student will implement these algorithms in Python using libraries such as TensorFlow, PyTorch, and Keras, gaining practical experience with training AI models in distributed environments.

    Once they develop a foundational understanding, the student will investigate different types of adversarial attacks that threaten decentralized AI systems, including model poisoning, data poisoning, and backdoor attacks. These insights will guide the application of robust statistical techniques, such as robust mean estimation, trimmed averaging, and median-based aggregation, to develop defense mechanisms that enhance system resilience. The student will also analyze trade-offs between different robustness techniques, learning how to balance computational efficiency, model accuracy, and security in real-world deployments.

    To bridge theory with practice, the student will implement their robust decentralized AI models in real-world testbeds and kits at Rutgers WINLAB, including ORBIT, COSMOS, and the Qualcomm Innovators Development Kit. These platforms will allow for experimentation in edge computing environments, where AI models are trained and deployed on distributed nodes with limited communication and processing capabilities. By simulating real-world challenges, such as unreliable network links, heterogeneous hardware, and adversarial interference, this phase will provide critical insights into the practical implementation of robust decentralized AI.

    The outcomes of this project will generate preliminary experimental data for real-world applications, directly contributing to the INSPIRE Lab’s next research proposal on adversarially resilient decentralized learning. The project will not only advance theoretical understanding but also produce practical methodologies that can be applied in domains such as autonomous systems, cybersecurity, and intelligent edge computing. Additionally, this hands-on experience will provide the student with valuable technical skills, exposure to state-of-the-art research, and professional growth opportunities in the field of AI security and distributed learning.

  • Project Website: http://www.inspirelab.us/
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: The INSPIRE Lab is seeking an undergraduate researcher who meets the following qualifications: Majoring in Electrical and Computer Engineering, Computer Science, or Mathematics; Completed coursework in probability theory, linear algebra, and machine learning; Proficient in Python; experience with PyTorch and/or TensorFlow is a plus; Maintains a minimum GPA of 3.5.
  • Number of Fellowships Available: One (1)
  • Location: INSPIRE Lab, CoRE Building Room 729, Busch campus
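
For prospective applicants unfamiliar with the robust aggregation techniques named in the project summary above, the following is a minimal illustrative sketch, not the INSPIRE Lab's method. It compares a plain average of model updates with a coordinate-wise median and a trimmed average. The function names, the toy two-dimensional "updates," and the single malicious node are all assumptions made for illustration.

```python
from statistics import median


def plain_mean(updates):
    """Coordinate-wise average of equal-length update vectors (not robust)."""
    return [sum(coords) / len(coords) for coords in zip(*updates)]


def coordinate_median(updates):
    """Coordinate-wise median of equal-length update vectors."""
    return [median(coords) for coords in zip(*updates)]


def trimmed_mean(updates, trim):
    """Coordinate-wise mean after dropping the `trim` smallest and
    `trim` largest values in each coordinate."""
    if trim < 0 or 2 * trim >= len(updates):
        raise ValueError("trim must leave at least one value per coordinate")
    result = []
    for coords in zip(*updates):
        kept = sorted(coords)[trim:len(coords) - trim]
        result.append(sum(kept) / len(kept))
    return result


# Hypothetical toy data: four honest nodes report similar updates,
# and one malicious node reports an extreme update.
honest = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.0]]
malicious = [[1000.0, -1000.0]]
updates = honest + malicious

print("plain mean:       ", plain_mean(updates))
print("coordinate median:", coordinate_median(updates))
print("trimmed mean:     ", trimmed_mean(updates, trim=1))
```

In this toy setting the plain average is pulled far from the honest values by the single outlier, while the median and trimmed average stay close to them. Real decentralized training involves high-dimensional updates and network constraints that this sketch does not model.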

Single-Cell Transcriptomic Cell Type Annotation Using AI for Spinal Cord Injury Study

  • Faculty Mentor: Li Cai, Biomedical Engineering, School of Engineering
  • Project Summary: This summer project focuses on applying artificial intelligence (AI) to annotate cell types from single-cell transcriptomic data in the context of spinal cord injury (SCI). The Cai lab has demonstrated that GSX1 gene therapy effectively restores locomotor and sensory functions in rodent models of SCI. The project is directly related to Biomedical Informatics and Health AI, aiming to enhance our understanding of cellular responses and mechanisms underlying SCI. In addition, we have generated extensive single-cell transcriptomic datasets for the proposed study and have developed a computational pipeline for functional gene discovery. The project seeks to identify and classify distinct cell populations that responded to GSX1 gene therapy by leveraging AI techniques, providing insights that could inform therapeutic strategies and improve patient outcomes.
  • Project Website: https://sites.rutgers.edu/cailab/research/
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: GPA >= 3.5. Background in Biomedical Sciences: Understanding basic concepts in biology, particularly cell biology and neuroscience. Programming Skills: Proficiency in programming languages, e.g., R and Python, with experience in data analysis and machine learning libraries. Data Analysis: Familiarity with bioinformatics tools and techniques for analyzing single-cell RNA sequencing and single-nucleus RNA-sequencing (sc/nRNA-seq) data. AI and Machine Learning: Knowledge of AI algorithms and their application in biological data analysis. Communication Skills: Effectively communicating scientific findings through written reports and presentations.
  • Number of Fellowships Available: One (1)
  • Location: Biomedical Engineering Building, 599 Taylor Road, Busch campus

AI for Good: Predictive Models for Environmentally Responsible Blue Economy Planning 

  • Faculty Mentor: Ahmed Aziz Ezzat, Industrial and Systems Engineering, School of Engineering
  • Project Summary: Blue economy applications, including offshore energy, marine energy, fishing, and shipping, constitute an important sector of economic development off U.S. coastal waters, especially along the U.S. East Coast. Yet, the same region is home to a rich marine environment where several marine mammal species co-exist. Reconciling our economic development goals (including offshore energy, shipping, etc.) with ocean and biodiversity conservation goals can largely benefit from high-resolution models that can leverage the large and multi-modal datasets available about marine mammals and the ocean environment. To do so, we have formed a multidisciplinary team at Rutgers, which constitutes a mix of ML engineers and ocean scientists. We have already developed a suite of machine learning models that can learn the complex relationships between ocean and environmental covariates and high-resolution marine mammal data [1]. This project will explore the merit of expanding those models to include richer and multi-resolution data types as inputs, including prey distribution data collected using autonomous gliders, spatial-temporal oceanographic model outputs, and satellite image data about key oceanographic/environmental information, among others.
    [1] J. Xi et al., Machine learning for modeling North Atlantic right whale presence to support offshore wind energy development in the U.S. Mid-Atlantic, Nature Scientific Reports, 2024, Link: https://www.nature.com/articles/s41598-024-80084-z
  • Project Website: https://sites.rutgers.edu/azizezzat/research/
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Evidence of programming (Python or R), demonstrated through, for example, coursework, independent projects, or GitHub repositories. Pursuing an undergraduate degree in Engineering, Computer Science, or a related field. GPA > 3.5.
  • Number of Fellowships Available: One (1)
  • Location: Richard Weeks Hall of Engineering, Busch campus

The role of astrocytes and myelin in regulating the functionality of injured neurons

  • Faculty Mentor: Assimina Pelegri, Mechanical and Aerospace Engineering, School of Engineering
  • Project Summary: This proposal is a novel investigation of the role of astrocytes and myelin in regulating the functionality of injured neurons that combines experimental data from Atomic Force Microscopy (AFM) measurements with in-silico computational (finite element and machine learning) studies. Injury to axons in white matter (WM) remains a primary cause of many of the functional deficits that follow traumatic brain injury (TBI). Understanding how axons are injured mechanically will pave the way for designing measures to prevent their injury.

    WM is, however, complex: it is composed of oriented, wavy, fibrous axons that are embedded within a glial matrix. To date, much of the research concerned with modeling TBI considers the brain to be a homogeneous bulk material, even though it is injury to individual axons that collectively leads to loss of function. Furthermore, very little is known about the role of astrocytes in protecting and repairing the axons after a TBI event.

    Based on published AFM measurements of neurons and astrocytes, we will develop a finite element (FE) model of an axon surrounded by myelin and attached to a neuron and astrocyte. The axon will be modeled using Bézier curves, which allow generating fiber paths with nonlinear angle variation. The material properties for the axon, astrocyte, and neuron will be calibrated using force-displacement data from published AFM measurements. The material model comprises a hyperelastic network accounting for the instantaneous cell response and viscoelastic components capturing the strain rate effects at both short and long time scales. Using an optimization algorithm, the parameters of the material model will be tuned to match the measured force.

    Within the validated model, we will introduce variations in the diameter of the axon, modify its path by altering the Bézier curves, introduce and vary the amount of myelin, and vary the astrocyte volume, among others. We will run FE simulations for each variation in the model and extract the output force. The measured output force should reflect the changes in the above parameters. The simulations will serve as training data for a deep learning algorithm that ultimately replaces the FE models. The objective is to explore the use of Recurrent Neural Networks (RNN), a type of artificial neural network, to process time series data of the measured force mapped to the FE geometry. The RNN will serve as a surrogate model in predicting the mechanical response of the axons, neuron, and astrocyte.

  • Project Website: https://pelegri.rutgers.edu/home
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Required: Coursework in Linear Algebra, Multivariate Calculus, and Probability Theory. Able to write code (Python preferred).

    Nice to have: Previous experience or coursework on developing Neural Networks: Keras, TensorFlow, PyTorch. Experience in using Abaqus FEA.

  • Number of Fellowships Available: One (1)
  • Location: Engineering Building B-225, Busch campus
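
As background on the Bézier-curve axon paths mentioned in the project summary above, the following is a minimal sketch of evaluating points along a Bézier curve with de Casteljau's algorithm. It is illustrative only and not the lab's modeling code. The control points are hypothetical values chosen to produce a wavy two-dimensional path, and the function names are assumptions.

```python
def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] by repeated
    linear interpolation between consecutive control points."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        pts = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


def sample_path(control_points, n):
    """Return n evenly spaced (in t) points along the curve; n must be >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [de_casteljau(control_points, i / (n - 1)) for i in range(n)]


# Hypothetical control points for a wavy fiber path (arbitrary units).
control = [(0.0, 0.0), (1.0, 2.0), (3.0, -2.0), (4.0, 0.0)]

for x, y in sample_path(control, 5):
    print(f"({x:.3f}, {y:.3f})")
```

Moving the interior control points changes the waviness of the path while the endpoints stay fixed, which is one way a fiber geometry could be varied between simulations.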

Atomistic tool for identification and characterization

  • Faculty Mentor: Ryan Sills, Materials Science and Engineering, School of Engineering
  • Project Summary: The purpose of this NSF-funded project is to develop a software tool to accelerate analysis of molecular dynamics simulation data. This software tool will be data-driven in nature, using training data obtained from targeted molecular dynamics simulations. The main challenge of the project is to develop a software workflow and machine learning approach that is flexible and extensible, so that additional analysis capabilities can be continuously added. The summer intern on this project would explore the performance of a variety of data-driven classification techniques, such as random forests, support vector machines, and neural networks, on chosen molecular dynamics analysis tasks.
  • Project Website: https://mmod.rutgers.edu/
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Any engineering major, physics, or chemistry. Must be comfortable with programming in MATLAB or Python.
  • Number of Fellowships Available: One (1)
  • Location: Richard Weeks Hall of Engineering, Busch campus

Transforming modeling with physics-informed neural networks 

  • Faculty Mentor: Ryan Sills, Materials Science and Engineering, School of Engineering
  • Project Summary: PI Sills has recently invented a technique for training neural networks using a simulation technique called the finite element (FE) method. This technique makes it possible to train neural networks that can then replace traditional simulations, thereby providing solutions in a fraction of the time. This will revolutionize engineering design by making simulations faster and cheaper than ever before. An important aspect of the method is the evaluation of the so-called inverse isoparametric map. The goal of this project is to develop efficient neural network surrogates to evaluate this mapping, thereby accelerating the overall method.
  • Project Website: https://mmod.rutgers.edu/
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Must be comfortable with coding in MATLAB and/or Python
  • Number of Fellowships Available: One (1)
  • Location: Engineering A-wing, Busch campus

AI-Driven Pipeline Corrosion Management for Digital Twin Integration 

  • Faculty Mentor: Hao Wang, Civil and Environmental Engineering, School of Engineering
  • Project Summary: Pipeline systems are critical components of infrastructure, ensuring the reliable transport of oil, oil products, natural gas, and hydrogen fuel. This project aims to develop an AI-driven method to advance pipeline integrity management for preventing and mitigating corrosion. It will develop deep learning algorithms with generative AI and transfer learning for analyzing remote sensing and computational modeling outputs in order to predict pipe corrosion potential and mitigate leakage or burst failure. The project outcome can help pipeline operators and energy sectors evaluate corrosion risk under variable environmental conditions more efficiently and reliably and facilitate the future development of digital twins for pipeline infrastructure.
  • Project Website: https://primis.phmsa.dot.gov/matrix/PrjHome.rdm?prj=1018&s=8A63DEFD2FF4492D99EEE1B552BD85B2&c=1
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: GPA above 3.5. Major in Data Science, Electrical and Computer Engineering, Computer Science, Civil and Environmental Engineering, Industrial Engineering, or Mechanical Engineering.
  • Number of Fellowships Available: Up to two (2)
  • Location: Richard Weeks Hall of Engineering, Busch campus

Remote sensing of air pollution powered by AI  

  • Faculty Mentor: Xiaomeng Jin, Environmental Sciences, School of Environmental and Biological Sciences
  • Project Summary: Air pollution is identified as a leading risk factor for global disease burden. A major limitation to advancing our understanding of the causes and impacts of air pollution is the lack of observations with the spatial and temporal resolution needed to observe variability in emission, chemistry, and population exposure. Satellite remote sensing measures the radiation reflected or emitted from the earth, which can inform the distributions of air pollutants by tracking their spectral signatures. There is growing interest in using AI techniques to translate satellite measurements to air pollution exposure. The project will address three main challenges that limit the applications of AI techniques for air pollution research: (1) sparsely and unequally distributed ground-based measurements for training and testing, (2) limited capability of AI models to predict extreme values, and (3) disagreements among different AI models. Students are encouraged to choose one of these three directions and dive deeper to explore how state-of-science AI techniques can be applied to allow better prediction of air pollution. Students may be offered hourly research assistant positions after they successfully complete the summer fellowship.
  • Project Website: https://xjin49.github.io
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: Major: Computer Science, Environmental Sciences/Engineering, or Meteorology. GPA: >3.5. Programming: Python. Work on site, full time (40 hours/week) during the fellowship. Willingness to learn environmental sciences related domain knowledge. Experience with big data and high-performance computing systems is a plus.
  • Number of Fellowships Available: One (1)
  • Location: Environmental and Natural Resource Sciences building, Cook campus

Machine Learning Approaches to Climate Data Bias Correction and Downscaling

  • Faculty Mentors: Lili Xia, Environmental Sciences, School of Environmental and Biological Sciences and Zhao Zhang, Electrical and Computer Engineering, School of Engineering
  • Project Summary: It is essential to evaluate the impacts of future climate change and solar climate intervention on human societies and ecosystems. However, General Circulation Models (GCMs), such as the Community Earth System Model version 2 (CESM2), simulate future climate changes at a coarse spatial resolution, which limits our ability to assess regional climate impacts—particularly in areas like agriculture, water resources, and air quality.
    Traditional bias correction and downscaling methods include statistical techniques and dynamic downscaling using Regional Climate Models (RCMs). While RCMs can provide finer spatial details, they are computationally expensive. On the other hand, conventional statistical methods often lack the accuracy needed to capture extreme weather events, which are critical for understanding future climate risks.
    This project proposes applying recently published machine learning algorithms (Wang and Di, 2024; Sash and Ravela, 2024) to bias-correct and downscale CESM2 outputs. As a first step, we will focus on Africa, one of the most vulnerable regions to climate change impacts. The study will use the ERA5 dataset from the European Centre for Medium-Range Weather Forecasts (ECMWF), covering the period 1979–2014 at a spatial resolution of 0.25°, as the observational reference.
    The CESM2 simulations include historical runs (1979–2014) at 1° resolution and future projections under the SSP2-4.5 scenario (2015–2069). We will train the machine learning models using the observational ERA5 data and CESM2 historical simulations, then apply the trained algorithms to bias-correct and downscale CESM2 future climate projections. The outcome will be a high-resolution, bias-corrected dataset better suited for assessing regional impacts of future climate change in Africa.
    This project represents an important first step in applying machine learning methods to improve climate model outputs for regional impact studies. Building on this work, we aim to extend the application of these methods to produce high-resolution, high-quality climate data for future climate change scenarios as well as solar climate intervention simulations. The resulting datasets will be made freely available to the global research community, including impact modeling groups in the Global South and other developing countries. We hope this effort will contribute to establishing an open-access platform to support global assessments of climate change and solar climate intervention impacts, promoting more inclusive and equitable research and decision-making.
  • Project Website: https://people.envsci.rutgers.edu/lilixia/ and https://www.ece.rutgers.edu/zhao-zhang
  • Fellow Responsibilities: Read key background literature related to the project before the start date. Work under the supervision of the faculty mentor or graduate student/postdoctoral mentor. Meet with the faculty mentor on a weekly basis to discuss research progress. Participate in weekly group meetings with research group members. Complete final report summarizing their research activities.
  • Fellow Requirements/Pre-requisites: GPA: 3.5; Major: Computer Science, Electrical and Computer Engineering, Data Science, and other related majors; Specific skills: Linux system; Python; data analysis skills
  • Number of Fellowships Available: One (1)
  • Location: Environmental and Natural Resource Sciences building, Cook campus and Computing Research & Education (CoRE) building, Busch campus