Marzieh Fadaee

As a scientist, I’m broadly interested in all aspects of natural language understanding, and particularly in multilingual learning, data-conscious learning, robust and scalable models, compositionality, and interpretability.

Previously, I was the NLP/ML research lead at Zeta Alpha Vector, working on smarter ways to discover and organize knowledge. I did my PhD at the Language Technology Lab (originally part of the ILPS group), University of Amsterdam, where I developed models to understand and utilize interesting phenomena in the data. During my PhD I was advised by Christof Monz and Arianna Bisazza. I received my B.Sc. in Computer Engineering from Sharif University and my M.Sc. in Artificial Intelligence from the University of Tehran.

news

Feb 22, 2024	Our paper Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs is now available on arXiv.
Feb 13, 2024	We release Aya, a massively multilingual dataset and language model. For more details, check out our Data and Model papers.
Nov 5, 2023	Our paper Elo Uncovered: Robustness and Best Practices in Language Model Evaluation has been accepted to the GEM Workshop at EMNLP 2023!
Nov 1, 2023	Thrilled to be nominated for Top 5 AI Researcher in the company of some outstanding women
Oct 22, 2023	Our paper Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation is now available on arXiv!
Oct 20, 2023	During September and October, I had the opportunity to give several talks. Feel free to explore them here.
Sep 11, 2023	Our paper When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale has been accepted to the Attributing Model Behavior at Scale workshop at NeurIPS!
Jan 20, 2023	I’m excited to announce that I have joined Sara Hooker’s team at Cohere For AI as a Senior Research Scientist
Jan 12, 2023	InPars-v2 is SoTA on the BEIR Leaderboard in zero-shot Information Retrieval
Jan 4, 2023	Our paper InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval is now available on arXiv.
Dec 12, 2022	Our paper In Defense of Cross-Encoders for Zero-Shot Retrieval is now available on arXiv.
Feb 22, 2022	Selected as an ICLR Highlighted Reviewer.
Feb 10, 2022	Our paper InPars: Data Augmentation for Information Retrieval using Large Language Models was accepted at SIGIR.
Jan 1, 2022	Our paper mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset is now available on arXiv.
Sep 10, 2021	Invited speaker at the “Transformers at Work: 2nd edition” workshop.