David Nicholson
Career Profile
Data Scientist with 12+ years of programming experience and 4+ years of experience in data analytics and visualization. Most of my research projects have involved analyzing terabytes of biomedical text for downstream Natural Language Processing (NLP) tasks such as corpora comparison and text extraction. Recently, I have shifted my focus to projects that perform bibliometric analyses for government funding clients.
Education
Publications
Research Experience
-
- Used textual analysis to help government funders and clients understand research trends and emerging topics
- Worked with machine learning models to perform topical analysis on research grants, publications, etc.
- Used Dash to build dashboards that narrate results for government funding clients
-
- Designed and implemented parallel processing pipelines that achieved a 3x speed-up when analyzing terabytes of biomedical text.
- Used weak supervision for a 1.5x speed-up when training deep learning models (recurrent neural networks and transformers) to extract biomedical relationships from biomedical text.
- Applied a k-nearest-neighbor model to provide scientists with a web service that identifies journals linguistically similar to a preprint of interest.
- Applied time series analysis to discover over 20,000 timepoints at which words changed their semantic meaning.
-
- Used hypothesis testing (hypergeometric test) to discover over 1,000 protein domains that are easily targetable by small molecules and drugs.
- Constructed a bioinformatics pipeline that efficiently discovers novel motifs in the Golden Orb-weaver spider genome.
-
- Characterized population-level transcriptional regulation by assisting with the creation of a bioinformatic pipeline that quantifies transcription factor enrichment in metagenomic data.
-
- Created a Perl pipeline that used Mendelian Randomization and Approximate Bayesian Computation to test whether elevated triglyceride levels cause heart disease.
-
- Explored a more efficient way to track DNA samples. Researched the use of single nucleotide polymorphism (SNP) information as a DNA barcode to keep track of patient samples that have undergone different gene sequencing workflows.
-
- Assessed machine learning algorithms to determine the best strategies for identifying lung cancer in patients as early as possible.
Teaching Experience
- Guided a rotation student in carrying out a research project that analyzed biomedical abstracts to model disease-gene trajectories through time.
- Advised first- and second-year Ph.D. students on which classes to take for the fall and spring semesters
- Advised first-year Ph.D. students about the mechanics of lab rotations
- Assisted in teaching Ph.D. students how to program in Python.
- Assisted a Ph.D. student in learning advanced computational biology topics.
- Topics ranged from machine learning algorithms to various statistical algorithms
- Assisted in teaching Ph.D. students how to program in Python.
- Tutored and assisted students in studying and programming data structures ranging from binary search trees to hash tables.
- Worked in walk-in tutoring sessions for university services.
- Tutored students in math classes from Algebra I to Calculus II
- Helped the professor assist students with homework during office hours
Presentations/Workshops
- 30 Minute Talk
- Poster Presentation
- Lightning Talk and Poster Presentation
- Poster Presentation
- Invited Speaker
- Poster Presentation
- Poster Presentation
- Poster Presentation
- Lightning Talk
- Poster Presentation
Honors/Awards
National Human Genome Research Institute (NHGRI)
University of Maryland Baltimore County
University of Maryland Baltimore County
University of Maryland Baltimore County
Skills & Proficiency
- GitHub
- Python
- R
- SQL
- Data Visualization
- Machine Learning
- Deep Learning, Transformers
- Natural Language Processing
- Bayesian Modeling
- Data Structures
- Algorithms
- Parallel Processing
- XML parsing
- Web Development, HTML/CSS