Alex Liebscher

I am a soon-to-be graduate of the University of California, San Diego, where I study Cognitive Science and Mathematics. I strive to contribute to the development of human-centered, data-driven technologies, and I'm most inspired at the intersection of mathematical formalism and human behavior, especially surrounding language and music. I'm keenly aware of the importance of ethical machine learning and data science, and hold myself to high standards in that regard. I also enjoy running, reading, backpacking, and gardening.


The best way to reach me is at where


TextRecruit

San Jose, CA | June 2018 - September 2018

As a data science intern at TextRecruit during the summer of 2018, I surfaced usage and performance insights from the company's large collections of customer and user data, focusing on the 40+ million messages transmitted through the platform. I built a sentiment analysis model that improved F1 scores by more than 1.7x over out-of-the-box solutions. To model the flow of text-message conversations with the company's artificial intelligence chatbot, and to estimate retention rates, confusion rates, and drop-off points, I engineered two classification models that tag inbound user messages and outbound chatbot messages as one of 30 or 34 classes, respectively. Applying these models to whole conversations generalized their structure and revealed weaknesses in the platform's NLP capabilities. I also delivered KPIs on chatbot usage and on customer activity and growth, which helped company leadership validate business decisions.

Tools: Python (Keras, TensorFlow, NLTK), Java, GitHub & git, Jira & Atlassian

Peak Landscape, Inc.

Truckee, CA | July 2017 - August 2017

Maintained the yards and gardens of large residential and commercial properties. Developed a keen eye for detail, an indefatigable work ethic, and the character to put myself in uncomfortable situations and drive toward the company’s idea of success.


University of California, San Diego

La Jolla, CA | September 2016 - June 2020

Courses of Interest

Cognitive Science: Supervised Machine Learning, Unsupervised Machine Learning, Computational Models of Cognition, Language, Cognitive Consequences of Technology, Neuroanatomy and Physiology, Sensation and Perception, Research Methods and Statistical Analysis

Computer Science and Math: Probability, Mathematical Statistics I & II, Vector & Multivariable Calculus, Data Structures & Object-Oriented Design, Intro to CompSci: Java I & II, Linear Algebra, Linear Optimization, Differential Equations

Electives: Race, Gender, and Artificial Intelligence; Public Rhetoric & Practical Communication; Hip-Hop


Effects of Battle and Journey Metaphors on Charitable Donations for Cancer Patients

Fall 2018 - Present

Patients with cancer often describe their experience metaphorically as a battle (“my fight against cancer”) or as a journey (“my path through cancer treatment”). Experimental work has demonstrated that these metaphors can influence people's reasoning and emotional inferences about experiences with cancer (Hendricks, Demjen, Semino, & Boroditsky, 2018; Hauser & Schwarz, 2019). However, it is currently unknown how the use of these metaphorical frames translates into behavioral changes, such as the likelihood and magnitude of charitable giving.

Using hand-labeled data from more than 5,000 cancer-related GoFundMe campaigns in a regression framework, we asked whether a campaign’s use of metaphor predicts several measures of donation behavior beyond what control variables (e.g., shares on Facebook) predict. We found that both metaphor families (battle and journey) have a positive effect on campaign success and donation behavior.
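In spirit, the regression question can be sketched as an ordinary least squares fit of donation outcomes on metaphor indicators plus controls. The synthetic data and variable names below are purely illustrative; the real analysis used hand-labeled campaign data and mixed-effects models in R (lme4).

```python
import numpy as np

# Illustrative sketch: does metaphor use predict donations beyond controls?
# All variables here are synthetic stand-ins, not the study's actual data.
rng = np.random.default_rng(0)
n = 500
battle = rng.integers(0, 2, n).astype(float)   # campaign uses battle metaphors
journey = rng.integers(0, 2, n).astype(float)  # campaign uses journey metaphors
fb_shares = rng.poisson(50, n).astype(float)   # control variable: Facebook shares
donations = 100 + 20 * battle + 15 * journey + 2 * fb_shares + rng.normal(0, 10, n)

# OLS: donations ~ intercept + battle + journey + shares
X = np.column_stack([np.ones(n), battle, journey, fb_shares])
beta, *_ = np.linalg.lstsq(X, donations, rcond=None)
# beta[1] and beta[2] estimate the metaphor effects after controlling for shares
```

A positive, reliable coefficient on a metaphor indicator, after controls, is the pattern the study reports.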

To establish whether these relationships are causal, we designed an online experiment simulating the experience of donating to a crowdfunding campaign. We manipulate the metaphorical framing and the recipient's gender in the campaign, and will measure real donations from participants to determine whether metaphor affects charitable giving.

Tools: Python (pandas), R (lme4, ggplot, pwr), GitHub & git

Music Collection

Read the article for details

February 2020 – Present

As an avid music fan, I love discussing music and bonding with people over a mutual interest in a band or genre. However, it's difficult to keep track of all the music I've listened to and enjoyed (and not enjoyed). For this reason, I designed and developed a locally stored and run browser application for recording, logging, and rating my music collection. Through trial and error, I taught myself the React-Redux ecosystem for the front-end, constructed a REST API in Flask on the back-end, and store all my music data in a local MongoDB instance. I've incorporated a system for comparing and ranking albums, so I can definitively answer questions like, "What're your top three albums?" A built-in recommendation heuristic also suggests which album I should listen to next. Lastly, I've made it open source and maintain it on GitHub.
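A minimal sketch of the kind of Flask endpoint behind the app's REST API, assuming an in-memory list in place of MongoDB (the route name and fields are illustrative, not the project's actual API):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
albums = []  # stand-in for the MongoDB collection

@app.route("/albums", methods=["GET", "POST"])
def albums_route():
    # POST logs a new album record; GET returns the whole collection
    if request.method == "POST":
        albums.append(request.get_json())
        return jsonify(albums[-1]), 201
    return jsonify(albums)
```

In the real app, the handler body would read from and write to the local MongoDB instance via pymongo instead of a Python list.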

Tools: Python (Flask, spotipy, pymongo), React-Redux, MongoDB, GitHub & git

Nonliteral Semantic Edge Probing: Structure in Contextual Word Embeddings

Browse the code or read the paper

Fall 2019

The introduction of contextual word embedding (CWE) models has led to improvements on a wide variety of tasks. Yet the black-box nature of deep learning language models may be inhibiting further progress. Tenney et al. (2019) introduced a novel edge probing framework to explore the syntactic and semantic information encoded within contextual embeddings, assessing the degree to which these types of information are encoded through a series of traditional linguistic tasks. Here, I expand this framework to study how nonliteral meaning, which is often highly abstract, conceptual, and cultural, may also be encoded within these embeddings. I find that contextual embeddings do encode some level of nonliteral meaning, as shown by probing on metaphor and metonymy detection tasks.
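The probing idea can be sketched as fitting a simple classifier on frozen embeddings and reading off how much task signal they linearly encode. The random "embeddings" below are stand-ins for real CWE vectors, and the metaphor labels are synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy sketch of edge probing: a lightweight classifier trained on frozen
# embeddings. If the probe beats chance, the embeddings encode task signal.
rng = np.random.default_rng(0)
dim = 32
X = rng.normal(size=(200, dim))  # stand-in for frozen contextual embeddings
# synthetic literal (0) vs. metaphorical (1) labels, with signal in dimension 0
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

probe = LogisticRegression(max_iter=1000).fit(X, y)
acc = probe.score(X, y)  # well above chance -> signal is linearly decodable
```

In the actual project, `X` came from a pretrained CWE model (via PyTorch) and `y` from annotated metaphor and metonymy datasets.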

Tools: Python (PyTorch)

Lyft Driver Analysis

Read my article

June 2019 - September 2019

During the summer of 2019, I started driving for Lyft. Read my article to learn more about how I use data to optimize my experience and earnings.

Tools: R, GitHub & git

A Revised Empirical Comparison of Supervised Learning Algorithms

Read the paper or browse the code

Fall 2018

For any classification problem, choosing a proper classifier and its parameters is critical for success. This paper is my attempt at pushing for methodical supervised machine learning. I evaluate the performance of seven classification algorithms across four data sets. For thoroughness, each classifier was tested over three independent trials, each subject to three partitions of the data. For each partition, cross validation was performed, and for each CV fold an optimal set of hyper-parameters was found using Bayesian search. Compared to traditional grid search, this method improves performance, takes advantage of the underlying parameter space with specific priors, and reduces redundant and insignificant searches. Overall, Random Forests, Gradient Boosted Trees, and RBF-SVM achieved the highest performance. K-Nearest Neighbors may also be a viable solution but should be applied with care and precision.
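The evaluation protocol can be sketched as repeated trials, fresh train/test partitions, and a per-partition hyper-parameter search with CV. `RandomizedSearchCV` stands in here for scikit-optimize's `BayesSearchCV` so the sketch runs with scikit-learn alone, and the data set is synthetic:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=300, random_state=0)
scores = []
for trial in range(3):  # three independent trials, each with a fresh partition
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=trial
    )
    # hyper-parameter search with CV on the training partition
    # (the paper used Bayesian search over priors instead of random sampling)
    search = RandomizedSearchCV(
        RandomForestClassifier(random_state=0),
        {"n_estimators": [10, 50, 100], "max_depth": [2, 5, None]},
        n_iter=5, cv=3, random_state=trial,
    )
    search.fit(X_tr, y_tr)
    scores.append(search.score(X_te, y_te))  # held-out performance per trial
mean_score = float(np.mean(scores))
```

Averaging held-out scores across trials and partitions is what makes the comparison between classifiers fair.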

Tools: Python (pandas, numpy, scikit-learn, xgboost, scikit-optimize, multiprocessing), GitHub & git

Simile Generation with Gaussian Mixture Models

Watch my presentation or browse the code

Fall 2018

For this course project, I was interested in building a model with the ability to "fill in the blanks" when presented with a statement such as "Her hair is as red as ____". To do so, I created my own dataset and annotated certain components of each simile. These similes were embedded in a vector space with Word2Vec. I then went through a model selection process in which I tried to minimize the model and data complexity while maximizing the prediction output entropy: I reduced the word embedding space (for better generalization), tuned the number of components of the Gaussian Mixture Model (representing the number of semantic topic groups), reduced the risk of the model fitting noise, and maximized the entropy of the bin counts of predicted GMM components (to prevent most similes from being clustered under one component). Given partial data (an incomplete simile), the model searches over all possible combinations of latent components and vocabulary for the maximum log-probability solution, which becomes the "blank" to fill the simile. Some results are intriguing, creative, and plausible, whereas others are (fun-sounding) gibberish.
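The completion step can be sketched as scoring candidate words under a fitted GMM and keeping the highest-likelihood fill. The vocabulary and embeddings below are random stand-ins for the project's annotated similes and Word2Vec vectors:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy sketch of GMM-based simile completion. Real embeddings came from
# Word2Vec trained on the annotated simile dataset; these are random.
rng = np.random.default_rng(0)
vocab = ["fire", "roses", "bricks", "wine", "rubies", "sunsets"]
embeddings = {w: rng.normal(size=8) for w in vocab}

# components loosely correspond to semantic topic groups
gmm = GaussianMixture(n_components=2, covariance_type="diag", random_state=0)
gmm.fit(np.stack(list(embeddings.values())))

# score every candidate completion; the argmax log-likelihood fills the blank
best = max(vocab, key=lambda w: gmm.score_samples(embeddings[w][None])[0])
```

The full model additionally conditions on the observed parts of the simile, searching jointly over latent components and vocabulary rather than scoring words in isolation.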

Tools: Python (Word2Vec, scikit-learn, seaborn/matplotlib), GitHub & git

Music Listening Behavior Analysis

Jupyter Notebook

December 2017 - December 2018

I love talking about music: learning about new artists and songs, discussing albums, and debating lyrics. I might argue that my taste in music defines my friend circle to some degree. When I meet new people, oftentimes one of the first questions I'll ask is, "So what kind of music do you listen to?" or "Who are your top three artists right now?" Music can tell you a lot about a person, and I take this to heart. Hence, this project is an attempt to understand me a little better.

This project explores my music listening behavior, including how much music I listen to, listening timeframes, the diversity of my music, and, out of all the music I listen to (which averages about 5 hours per day!), what music I actually enjoy.