Career Profile

  • Data Scientist with 12+ years of programming experience and 4+ years of experience in data analytics and visualization. Most of my research projects involve analyzing terabytes of biomedical text for downstream natural language processing (NLP) tasks such as corpora comparison and text extraction.

Education

  • Ph.D. in Genomics and Computational Biology

    June 2022
    University of Pennsylvania
  • Postbaccalaureate Program (Penn Prep)

    June 2016
    University of Pennsylvania
  • B.S. in Computer Science and Minor in Bioinformatics

    May 2015
    University of Maryland Baltimore County

Research Experience

  • Data Scientist

    June 2022 - Present
    Digital Science
    • Used textual analysis to help government funders and clients understand research trends and emerging topics.
  • Graduate Researcher

    August 2016 - May 2022
    University of Pennsylvania
    • Designed and implemented parallel processing pipelines that achieved a 3x speedup in analyzing terabytes of biomedical text.
    • Used weak supervision for a 1.5x speedup in training deep learning models (recurrent neural networks and transformers) to extract biomedical relationships from text.
    • Applied a k-nearest-neighbor model to provide scientists with a web service that lists journals linguistically similar to a preprint of interest.
    • Applied time series analysis techniques to discover over 20,000 timepoints at which words changed their meaning.
  • Postbaccalaureate Researcher

    June 2015 - June 2016
    University of Pennsylvania
    • Used hypothesis testing (hypergeometric test) to discover over 1,000 protein domains that are easily targetable by small molecules and drugs.
    • Constructed a bioinformatics pipeline that efficiently discovers novel motifs in the Golden Orb-weaver spider genome.
  • Undergrad Researcher

    September 2013 - May 2015
    University of Maryland Baltimore County (UMBC)
    • Characterized population-level transcriptional regulation by assisting with the creation of a bioinformatic pipeline that quantifies transcription factor enrichment in metagenomic data.
  • Summer Research Intern

    June 2014
    University of Pennsylvania (SUIP)
    • Created a Perl pipeline that used Mendelian Randomization and Approximate Bayesian Computation to test whether elevated triglyceride levels cause heart disease.
  • Summer Research Intern

    June 2013
    Harvard University and Massachusetts Institute of Technology Bioinformatics and Integrative Genomics (BIG)
    • Explored a more efficient way to track DNA samples. Researched the use of single nucleotide polymorphism (SNP) information as a DNA barcode to keep track of patient samples that have undergone different gene sequencing workflows.
  • Summer Research Intern

    June 2012
    University of Pittsburgh and Carnegie Mellon University (TecBio)
    • Assessed machine learning algorithms to determine the best strategies for identifying lung cancer in patients as early as possible.

Teaching Experience

Advised Rotation student for Research Lab

Sept 2020 - Dec 2020
University of Pennsylvania (Genomics and Computational Biology Program)
  • Guided a rotation student in carrying out a research project that analyzed biomedical abstracts to model disease-gene trajectories over time.

Student Advising for 1st and 2nd Year Ph.D. Students

August 2018 - August 2020
University of Pennsylvania (Genomics and Computational Biology Program)
  • Advised first- and second-year Ph.D. students on which classes to take in the fall and spring semesters.
  • Advised first-year Ph.D. students on the mechanics of lab rotations.

Python BootCamp/Teaching Assistant

August 2019
University of Pennsylvania (Genomics and Computational Biology Program)
  • Assisted in teaching Ph.D. students how to program in Python.

Advanced Computational Biology/Tutor

April 2019 - May 2019
University of Pennsylvania (Genomics and Computational Biology Program)
  • Assisted a Ph.D. student in learning advanced computational biology topics.
  • Topics ranged from machine learning algorithms to various statistical algorithms.

Python BootCamp/Teaching Assistant

September 2017
University of Pennsylvania (Genomics and Computational Biology Program)
  • Assisted in teaching Ph.D. students how to program in Python.

Data Structures/Tutor

Sept 2013 - May 2014
University of Maryland Baltimore County (UMBC)
  • Tutored and assisted students in studying and programming data structures ranging from binary search trees to hash tables.

Math Tutor

Sept 2012 - May 2013
University of Maryland Baltimore County (UMBC)
  • Worked in walk-in tutoring sessions for university services.
  • Tutored students in math classes from Algebra I to Calculus II.

Calculus I and II/Learning Assistant

Sept 2012 - May 2013
University of Maryland Baltimore County (UMBC)
  • Helped the professor assist students with homework during office hours.

Presentations/Workshops

Elsevier's Labs Online Lecture services

October 2021
  • 30-Minute Talk

Institute for Biomedical Informatics (IBI) Annual Retreat

December 2020
  • Poster Presentation

Cold Spring Harbor Biological Data Science Symposium

November 2020
  • Lightning Talk and Poster Presentation

National Human Genome Research Institute National Trainee Meeting

March 2020
  • Poster Presentation

Seminar at Elsevier

December 2019
  • Invited Speaker

International Society for Computational Biology (ISCB) Rocky

December 2019
  • Poster Presentation

Institute for Biomedical Informatics (IBI) Annual Retreat

June 2019
  • Poster Presentation

International Society for Computational Biology (ISCB) Rocky

December 2018
  • Poster Presentation

Computational Systems for Integrative Genomics (CSIG)

July 2017
  • Lightning Talk

UMBC’s Undergraduate Research and Creative Achievement Day

Spring 2014
  • Poster Presentation

Attended Harvard’s Biomedical Science Careers Student Conference

April 2014

Participated in MIT’s Quantitative Biology Workshop

Jan 2014

Annual Biomedical Research Conference for Minority Students

November 2013

Honors/Awards

Appointed trainee on T32 Computational Genetics

June 2019 - August 2021

National Human Genome Research Institute (NHGRI)

Meyerhoff Scholar (M23)

August 2011 - June 2015

University of Maryland Baltimore County

MARC U*STAR Scholar

Sept 2014 - June 2015

University of Maryland Baltimore County

National Security Agency (NSA) Scholar

Sept 2012 - June 2014

University of Maryland Baltimore County

Thomson Reuters Award HackMIT

October 2014

ABRCMS Poster Presentation Award

November 2013

Skills & Proficiency

  • GitHub
  • Python
  • R
  • SQL
  • Data Visualization
  • Machine Learning
  • Deep Learning, Transformers
  • Natural Language Processing
  • Bayesian Modeling
  • Data Structures
  • Algorithms
  • Parallel Processing
  • XML Parsing
  • Web Development, HTML/CSS