Team Lime: Spotify Podcast Recommender

written by

Elizabeth Campolongo

Thursday, February 16, 2023

Congratulations to Team Lime on winning The Erdős Institute’s Fall 2022 Data Science Boot Camp with their project: Spotify Podcast Recommender!

Composed of Music Theory Ph.D. Candidate Aditya Chander (Yale University), Economics Postdoc Ritika Khurana (University of Delaware), Sociology Ph.D. Student Yuchen Luo (New York University), and recent Linguistics Ph.D. and Market Researcher Taylor Mahler (The Ohio State University), Team Lime used Spotify’s podcast dataset to build a podcast recommendation system. Their model takes one of two inputs, either the name of a podcast episode or a description of a podcast or podcast episode of interest, and suggests similar podcast episodes for the user. The relevance and relatability of the suggestions were confirmed by measuring the similarity between podcasts within user-tagged categories. With more time, the team would also like to add a user feedback option so that the model can be continuously retrained for improved recommendations. Team Lime further suggests that the applications of this system are not limited to maintaining user engagement: advertisers could also use it to increase revenue by targeting connected podcasts to advertise diverse products, avoiding repetitive advertising to the same listeners. Of the two models they tried, the pre-trained transformer model performed better: with it, 88.3% of the ordered category pairs had lower between-category than within-category similarity scores, compared to 75.1% with the other model. Thus, they selected the pre-trained transformer model for their recommendation app.
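To make the evaluation idea concrete, here is a minimal Python sketch. It is not the team's code: the toy vectors stand in for whatever embeddings their models produced, the function names (`cosine`, `fraction_pairs_separated`, `recommend`) are hypothetical, and the reading of "ordered category pairs" (for each ordered pair of distinct categories (A, B), compare the mean similarity between A and B with the mean similarity within A) is an assumption.

```python
import math
from itertools import combinations, permutations, product


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_within(vectors):
    """Mean similarity over all unordered pairs inside one category.

    Assumes the category holds at least two vectors.
    """
    sims = [cosine(u, v) for u, v in combinations(vectors, 2)]
    return sum(sims) / len(sims)


def mean_between(vectors_a, vectors_b):
    """Mean similarity over all cross-category pairs."""
    sims = [cosine(u, v) for u, v in product(vectors_a, vectors_b)]
    return sum(sims) / len(sims)


def fraction_pairs_separated(embeddings_by_category):
    """Share of ordered category pairs (A, B) whose between-category
    similarity is lower than the within-category similarity of A."""
    pairs = list(permutations(embeddings_by_category, 2))
    separated = sum(
        mean_between(embeddings_by_category[a], embeddings_by_category[b])
        < mean_within(embeddings_by_category[a])
        for a, b in pairs
    )
    return separated / len(pairs)


def recommend(query_vec, episode_vecs, k=2):
    """Return the names of the k episodes most similar to the query."""
    ranked = sorted(
        episode_vecs,
        key=lambda name: cosine(query_vec, episode_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional "embeddings", made up purely for illustration.
toy = {
    "comedy": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "science": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.2]],
    "sports": [[0.1, 0.0, 1.0], [0.2, 0.1, 0.9]],
}
score = fraction_pairs_separated(toy)

episodes = {"a": [1.0, 0.1, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.9, 0.3, 0.0]}
top_two = recommend([1.0, 0.0, 0.0], episodes, k=2)
```

Under this reading, a higher score means the embedding keeps same-category podcasts closer together than podcasts from different categories, which is the sense in which one model could be preferred over another.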

When discussing how the team settled on this dataset and specific project, Ritika explained that she wanted to try something different; Aditya has a music background and also wanted to expand his horizons. Since he had worked with Spotify’s API in the past and had some familiarity with Natural Language Processing (NLP), this project was a natural extension of both their interests. Taylor and Yuchen joined their group later, both drawn to the NLP aspect of the project. Taylor’s Ph.D. is in linguistics, but she had not previously worked with NLP, and Yuchen’s experience was more theoretical for her sociology studies—she was excited to apply her NLP knowledge to something practical that a company would like.

At the end of the project, they were excited to have a finished product. For Yuchen, the most rewarding part was “when we see the finished project and we realize, wait, it actually works, that I think the recommended episodes make a lot of intuitive sense.” For Ritika, “learning new skills, and definitely–at the end–when we realized that we won the project,” was great, “but for me, the biggest or most rewarding part was that this was my first Python project.” Taylor found that “to sit down and think about what this would mean for an actual business and actual users, because I have very limited experience outside of academia, [to] realize that it actually has business value, I think was rewarding.” Aditya agreed that it was exciting to have a product at the end: “From that perspective, knowing that, what Yuchen said, we had a product at the end of it, it wasn’t just a series of insights that maybe would have led to something else, we had a concrete deliverable app.”

The team noted that with more time and computational resources, they envision adding more features to the model and improving their app. For instance, they would like to continually retrain the model on user feedback about the generated recommendations, and to include episode descriptions both in the displayed results and in the modeling process. Following the completion of their project, though Aditya still mostly listens to music, he now listens to more podcasts and has used their app for recommendations. Taylor plans to try it to help her husband find a new podcast now that his favorite one has ended; she mostly listens to interviews or podcasts on topics she’s interested in learning about. Ritika likes to listen to Hidden Brain and Trained, whose topics vary widely depending on the speaker, from science to philosophy. Yuchen enjoys podcasts about anime and book summaries since she doesn’t have much time to read outside of work.

Team Lime attributes much of their success to organization and clear delegation of tasks. They highly recommend holding weekly meetings to keep each other accountable and making clear to-do lists after each meeting so that everyone knows their task(s). Furthermore, though it is good to consider small details, it is important not to lose sight of the big picture or the end goals and deliverables of the project. Two other factors in their success that they highlight were pair programming and great advice from their project mentor, Gleb Zhelezov.

Congratulations again to Team Lime as well as all of the other teams who completed a Fall 2022 Data Science Boot Camp project!

TEAM

How many eras of Taylor Swift's music are there really?

This project is an idea that I don't intend to work on but would love to see someone tackle. I also want to preface this by stating that I enjoy Taylor Swift's music and this project is in no way intended to diminish her work.

Taylor Swift is currently on her Eras Tour, a tour composed of 3.25-hour concerts that feature sets organized around each of her first ten studio albums, which she refers to as "eras." However, her first ten albums were released between 2006 and 2022, meaning her "eras" lasted 1.6 years each on average -- hardly long enough to be called an era! Moreover, some of her albums sound pretty darn similar to me, and I wonder how musically different these eras truly are.

This project might seem a bit silly, but analyzing Taylor Swift's music could involve some pretty serious machine learning. My thought was to consider two aspects of each album: the music itself and her lyrics. Using a Python package like librosa (https://librosa.org/doc/latest/index.html) to extract audio features, one could feasibly use clustering analysis or something similar to measure how different her albums are musically. Then one could use something like sentiment analysis on her lyrics to determine how much her lyrical themes shift between albums. One could even consider these together by looking at a weighted average of the two measures.
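As a rough illustration of the weighted-average idea, the sketch below combines two pairwise distance measures between albums. Everything in it is an assumption made for illustration: the album labels and numbers are made up, the audio vectors stand in for features one might extract with librosa (for example, averaged MFCCs), the lyric values stand in for per-album sentiment scores, and the function names and the choice of Euclidean distance are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(features):
    """Map each unordered album pair to the distance between its vectors."""
    names = sorted(features)
    return {
        (a, b): euclidean(features[a], features[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


def normalize(dists):
    """Scale distances to [0, 1] so the two measures are comparable."""
    top = max(dists.values())
    if top == 0:
        return dict(dists)
    return {pair: d / top for pair, d in dists.items()}


def combined_distance(audio, lyrics, audio_weight=0.5):
    """Weighted average of normalized audio and lyric distances."""
    audio_d = normalize(pairwise_distances(audio))
    lyric_d = normalize(pairwise_distances(lyrics))
    return {
        pair: audio_weight * audio_d[pair] + (1 - audio_weight) * lyric_d[pair]
        for pair in audio_d
    }


# Placeholder data: made-up values, not measurements of real albums.
audio = {"album_a": [0.0, 0.0], "album_b": [3.0, 4.0], "album_c": [0.0, 1.0]}
lyrics = {"album_a": [0.2], "album_b": [0.4], "album_c": [0.8]}

combined = combined_distance(audio, lyrics, audio_weight=0.5)
most_different = max(combined, key=combined.get)
```

Pairs with a small combined distance would be candidates for belonging to the same "era"; the resulting distance table could also be fed to a clustering method. The weight is a free choice, and different weights may give different groupings.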

Beyond simply analyzing Taylor Swift's music, similar techniques could be used on chart-topping pop music leading up to and immediately following the release of each of her albums to assess if her albums truly define eras or if they fit into more general pop music trends. This could potentially motivate economic questions about charting a pop star's path to enduring success.

THE ERDŐS INSTITUTE

Helping PhDs get and create jobs they love at every stage of their career.
