Machine Learning ‘Street Talk’ Podcast Features Reality Check from Sameer Singh and Yasaman Razeghi

April 12, 2022

A popular machine learning podcast, ML Street Talk, recently featured Computer Science Professor Sameer Singh and Ph.D. student Yasaman Razeghi discussing their paper, “Impact of Pretraining Term Frequencies on Few-Shot Reasoning.” Co-authored with Robert Logan, also a Ph.D. student in UCI’s Donald Bren School of Information and Computer Sciences (ICS), and Matt Gardner, a principal researcher at Microsoft, the paper suggests that large language models perform well on reasoning tasks not necessarily because the models can reason, but perhaps because they have memorized patterns from their pretraining data.

Pictured: Sameer Singh, Yasaman Razeghi and Robert Logan.

“Thank you for writing this paper,” says co-host Keith Duggar during the podcast (available on YouTube and Anchor). “You bring up this absolutely crucial point that the pre-training data has to be considered when you’re talking about the performance of the model [and] this paper directly strikes at and proves it’s not doing reasoning.”

Co-host Tim Scarfe agrees, “I think your work is a bit of a reality check.”

Duggar and Scarfe aren’t the only people in the machine learning world to praise the paper. “Incredibly important result,” tweeted Gary Marcus, founder and CEO of Robust.AI. “Quite interesting analysis that shows large language model few-shot performance for arithmetic is correlated with the training set term frequency of the numbers in the arithmetic expression,” tweeted Jeff Dean, senior fellow and senior VP of Google AI.
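As a rough illustration of the correlation Dean describes, one could count how often each operand appears in a pretraining corpus and check whether few-shot accuracy tracks those counts. The sketch below is a minimal, hypothetical example (a toy corpus and made-up results, not the paper’s data or code):

```python
from collections import Counter

from scipy.stats import spearmanr

# Toy stand-ins for a pretraining corpus and per-instance few-shot results
# (hypothetical illustration only, not the paper's data or code).
corpus_tokens = "24 times 3 is 72 and 24 plus 5 is 29 and 3 plus 3 is 6".split()

# Each record: the operand appearing in an arithmetic prompt and whether the
# model answered that instance correctly in a few-shot setting.
results = [
    {"operand": "24", "correct": True},
    {"operand": "3", "correct": True},
    {"operand": "72", "correct": True},
    {"operand": "917", "correct": False},
    {"operand": "486", "correct": False},
]

# How often each operand occurs in the pretraining corpus.
term_freq = Counter(corpus_tokens)

freqs = [term_freq[r["operand"]] for r in results]  # 0 for unseen terms
accuracy = [1.0 if r["correct"] else 0.0 for r in results]

# A strong positive rank correlation would suggest that performance tracks
# pretraining term frequency rather than reasoning ability alone.
rho, p_value = spearmanr(freqs, accuracy)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```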

“We are among the first to show the effect of the large pre-training corpus on the model’s performance,” says Razeghi. “Our analysis shows that their performance is pretty sensitive to statistics from the pre-training data, raising serious concerns about the robustness of their reasoning capabilities.”

Razeghi, who will be interning with the Blueshift team at Google Research this summer, goes on to say that she hopes “more people take the effect of the pre-training corpus into account while they are evaluating the language models.”

— Shani Murray
