Marzieh Fadaee

Senior Research Scientist @ Cohere For AI


As a scientist, I’m broadly interested in all aspects of natural language understanding, and particularly in multilingual learning, data-conscious learning, robust and scalable models, compositionality, and interpretability.

Previously, I was the NLP/ML research lead at Zeta Alpha Vector, working on smarter ways to discover and organize knowledge. I did my PhD at the Language Technology Lab (originally part of the ILPS group), University of Amsterdam, where I developed models to understand and utilize interesting phenomena in the data. During my PhD I was advised by Christof Monz and Arianna Bisazza. I received my B.Sc. in Computer Engineering from Sharif University and my M.Sc. in Artificial Intelligence from the University of Tehran.

news

Feb 22, 2024 Our paper Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs is now available on arXiv.
Feb 13, 2024 We release Aya :herb: a massively multilingual dataset and language model. For more details, check out our Data and Model papers.
Nov 5, 2023 Our paper Elo Uncovered: Robustness and Best Practices in Language Model Evaluation has been accepted to the GEM Workshop at EMNLP 2023!
Nov 1, 2023 Thrilled to be nominated for Top 5 AI Researcher in the company of some outstanding women :star2:
Oct 22, 2023 Our paper Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation is now available on arXiv!
Oct 20, 2023 During September and October, I had the opportunity to give several talks. Feel free to explore them right here :tv:
Sep 11, 2023 Our paper When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale has been accepted to the Attributing Model Behavior at Scale workshop at NeurIPS!
Jan 20, 2023 I’m excited to announce that I have joined Sara Hooker’s team at Cohere For AI as a Senior Research Scientist :purple_heart:
Jan 12, 2023 InPars-v2 is SoTA on the BEIR Leaderboard in zero-shot Information Retrieval :stars:
Jan 4, 2023 Our paper InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval is now available on arXiv.
Dec 12, 2022 Our paper In Defense of Cross-Encoders for Zero-Shot Retrieval is now available on arXiv.
Feb 22, 2022 Selected as an ICLR Highlighted Reviewer.
Feb 10, 2022 Our paper InPars: Data Augmentation for Information Retrieval using Large Language Models was accepted at SIGIR.
Jan 1, 2022 Our paper mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset is now available on arXiv.
Sep 10, 2021 Invited speaker at the “Transformers at Work: 2nd edition” workshop.