Publications

An up-to-date list is available on my Google Scholar.

2025


  1. ArXiv
    Aya Vision: Advancing the Frontier of Multilingual Multimodality
    Saurabh Dash, Yiyang Nan, John Dang, Arash Ahmadian, Shivalika Singh, Madeline Smith, Bharat Venkitesh, Vlad Shmyhlo, Viraat Aryabumi, Walter Beller-Morales, Jeremy Pekmez, Jason Ozuzu, Pierre Richemond, Acyr Locatelli, Nick Frosst, Phil Blunsom, Aidan Gomez, Ivan Zhang, Marzieh Fadaee, Manoj Govindassamy, Sudip Roy, Matthias Gallé, Beyza Ermis, Ahmet Üstün, and Sara Hooker
    2025
  2. ArXiv
    The Leaderboard Illusion
    Shivalika Singh, Yiyang Nan, Alex Wang, Daniel D’Souza, Sayash Kapoor, Ahmet Üstün, Sanmi Koyejo, Yuntian Deng, Shayne Longpre, Noah A. Smith, Beyza Ermis, Marzieh Fadaee, and Sara Hooker
    2025
  3. ArXiv
    A Post-trainer’s Guide to Multilingual Training Data: Uncovering Cross-lingual Transfer Dynamics
    Luísa Shimabucoro, Ahmet Üstün, Marzieh Fadaee, and Sebastian Ruder
    2025
  4. ArXiv
    Déjà Vu: Multilingual LLM Evaluation through the Lens of Machine Translation Evaluation
    Julia Kreutzer, Eleftheria Briakou, Sweta Agrawal, Marzieh Fadaee, and Tom Kocmi
    2025
  5. ArXiv
    Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation
    Israfel Salazar, Manuel Fernández Burda, Shayekh Bin Islam, Arshia Soltani Moakhar, Shivalika Singh, Fabian Farestam, Angelika Romanou, Danylo Boiko, Dipika Khullar, Mike Zhang, Dominik Krzemiński, Jekaterina Novikova, Luísa Shimabucoro, Joseph Marvin Imperial, Rishabh Maheshwary, Sharad Duwal, Alfonso Amayuelas, Swati Rajwal, Jebish Purbey, Ahmed Ruby, Nicholas Popovič, Marek Suppa, Azmine Toushik Wasi, Ram Mohan Rao Kadiyala, Olga Tsymboi, Maksim Kostritsya, Bardia Soltani Moakhar, Gabriel Costa Merlin, Otávio Ferracioli Coletti, Maral Jabbari Shiviari, MohammadAmin fard, Silvia Fernandez, María Grandury, Dmitry Abulkhanov, Drishti Sharma, Andre Guarnier De Mitri, Leticia Bossatto Marchezi, Setayesh Heydari, Johan Obando-Ceron, Nazar Kohut, Beyza Ermis, Desmond Elliott, Enzo Ferrante, Sara Hooker, and Marzieh Fadaee
    2025
  6. ArXiv
    Command A: An Enterprise-Ready Large Language Model
    Team Cohere,  Aakanksha, Arash Ahmadian, Marwan Ahmed, Jay Alammar, Milad Alizadeh, Yazeed Alnumay, Sophia Althammer, Arkady Arkhangorodsky, Viraat Aryabumi, Dennis Aumiller, Raphaël Avalos, Zahara Aviv, Sammie Bae, Saurabh Baji, Alexandre Barbet, Max Bartolo, Björn Bebensee, Neeral Beladia, Walter Beller-Morales, Alexandre Bérard, Andrew Berneshawi, Anna Bialas, Phil Blunsom, Matt Bobkin, Adi Bongale, Sam Braun, Maxime Brunet, Samuel Cahyawijaya, David Cairuz, Jon Ander Campos, Cassie Cao, Kris Cao, Roman Castagné, Julián Cendrero, Leila Chan Currie, Yash Chandak, Diane Chang, Giannis Chatziveroglou, Hongyu Chen, Claire Cheng, Alexis Chevalier, Justin T. Chiu, Eugene Cho, Eugene Choi, Eujeong Choi, Tim Chung, Volkan Cirik, Ana Cismaru, Pierre Clavier, Henry Conklin, Lucas Crawhall-Stein, Devon Crouse, Andres Felipe Cruz-Salinas, Ben Cyrus, Daniel D’souza, Hugo Dalla-Torre, John Dang, William Darling, Omar Darwiche Domingues, Saurabh Dash, Antoine Debugne, Théo Dehaze, Shaan Desai, Joan Devassy, Rishit Dholakia, Kyle Duffy, Ali Edalati, Ace Eldeib, Abdullah Elkady, Sarah Elsharkawy, Irem Ergün, Beyza Ermis, Marzieh Fadaee, Boyu Fan, Lucas Fayoux, Yannis Flet-Berliac, Nick Frosst, Matthias Gallé, Wojciech Galuba, Utsav Garg, Matthieu Geist, Mohammad Gheshlaghi Azar, Ellen Gilsenan-McMahon, Seraphina Goldfarb-Tarrant, Tomas Goldsack, Aidan Gomez, Victor Machado Gonzaga, Nithya Govindarajan, Manoj Govindassamy, Nathan Grinsztajn, Nikolas Gritsch, Patrick Gu, Shangmin Guo, Kilian Haefeli, Rod Hajjar, Tim Hawes, Jingyi He, Sebastian Hofstätter, Sungjin Hong, Sara Hooker, Tom Hosking, Stephanie Howe, Eric Hu, Renjie Huang, Hemant Jain, Ritika Jain, Nick Jakobi, Madeline Jenkins, JJ Jordan, Dhruti Joshi, Jason Jung, Trushant Kalyanpur, Siddhartha Rao Kamalakara, Julia Kedrzycki, Gokce Keskin, Edward Kim, Joon Kim, Wei-Yin Ko, Tom Kocmi, Michael Kozakov, Wojciech Kryściński, Arnav Kumar Jain, Komal Kumar Teru, Sander Land, Michael Lasby, Olivia Lasche, Justin Lee, 
Patrick Lewis, Jeffrey Li, Jonathan Li, Hangyu Lin, Acyr Locatelli, Kevin Luong, Raymond Ma, Lukáš Mach, Marina Machado, Joanne Magbitang, Brenda Malacara Lopez, Aryan Mann, Kelly Marchisio, Olivia Markham, Alexandre Matton, Alex McKinney, Dominic McLoughlin, Jozef Mokry, Adrien Morisot, Autumn Moulder, Harry Moynehan, Maximilian Mozes, Vivek Muppalla, Lidiya Murakhovska, Hemangani Nagarajan, Alekhya Nandula, Hisham Nasir, Shauna Nehra, Josh Netto-Rosen, Daniel Ohashi, James Owers-Bardsley, Jason Ozuzu, Dennis Padilla, Gloria Park, Sam Passaglia, Jeremy Pekmez, Laura Penstone, Aleksandra Piktus, Case Ploeg, Andrew Poulton, Youran Qi, Shubha Raghvendra, Miguel Ramos, Ekagra Ranjan, Pierre Richemond, Cécile Robert-Michon, Aurélien Rodriguez, Sudip Roy, Sebastian Ruder, Laura Ruis, Louise Rust, Anubhav Sachan, Alejandro Salamanca, Kailash Karthik Saravanakumar, Isha Satyakam, Alice Schoenauer Sebag, Priyanka Sen, Sholeh Sepehri, Preethi Seshadri, Ye Shen, Tom Sherborne, Sylvie Shang Shi, Sanal Shivaprasad, Vladyslav Shmyhlo, Anirudh Shrinivason, Inna Shteinbuk, Amir Shukayev, Mathieu Simard, Ella Snyder, Ava Spataru, Victoria Spooner, Trisha Starostina, Florian Strub, Yixuan Su, Jimin Sun, Dwarak Talupuru, Eugene Tarassov, Elena Tommasone, Jennifer Tracey, Billy Trend, Evren Tumer, Ahmet Üstün, Bharat Venkitesh, David Venuto, Pat Verga, Maxime Voisin, Alex Wang, Donglu Wang, Shijian Wang, Edmond Wen, Naomi White, Jesse Willman, Marysia Winkels, Chen Xia, Jessica Xie, Minjie Xu, Bowen Yang, Tan Yi-Chern, Ivan Zhang, Zhenyu Zhao, and Zhoujie Zhao
    2025
  7. From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions
    Nathanaël Carraz Rakotonirina, Mohammed Hamdy, Jon Ander Campos, Lucas Weber, Alberto Testoni, Marzieh Fadaee, Sandro Pezzelle, and Marco Del Tredici
    2025
  8. ArXiv
    Towards Best Practices for Open Datasets for LLM Training
    Stefan Baack, Stella Biderman, Kasia Odrozek, Aviya Skowron, Ayah Bdeir, Jillian Bommarito, Jennifer Ding, Maximilian Gahntz, Paul Keller, Pierre-Carl Langlais, Greg Lindahl, Sebastian Majstorovic, Nik Marda, Guilherme Penedo, Maarten Van Segbroeck, Jennifer Wang, Leandro Werra, Mitchell Baker, Julie Belião, Kasia Chmielinski, Marzieh Fadaee, Lisa Gutermuth, Hynek Kydlíček, Greg Leppert, EM Lewis-Jong, Solana Larsen, Shayne Longpre, Angela Oduor Lungati, Cullen Miller, Victor Miller, Max Ryabinin, Kathleen Siminyu, Andrew Strait, Mark Surman, Anna Tumadóttir, Maurice Weber, Rebecca Weiss, Lee White, and Thomas Wolf
    2025

2024


  1. ArXiv
    Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier
    John Dang, Shivalika Singh, Daniel D’souza, Arash Ahmadian, Alejandro Salamanca, Madeline Smith, Aidan Peppin, Sungjin Hong, Manoj Govindassamy, Terrence Zhao, Sandra Kublik, Meor Amer, Viraat Aryabumi, Jon Ander Campos, Yi-Chern Tan, Tom Kocmi, Florian Strub, Nathan Grinsztajn, Yannis Flet-Berliac, Acyr Locatelli, Hangyu Lin, Dwarak Talupuru, Bharat Venkitesh, David Cairuz, Bowen Yang, Tim Chung, Wei-Yin Ko, Sylvie Shang Shi, Amir Shukayev, Sammie Bae, Aleksandra Piktus, Roman Castagné, Felipe Cruz-Salinas, Eddie Kim, Lucas Crawhall-Stein, Adrien Morisot, Sudip Roy, Phil Blunsom, Ivan Zhang, Aidan Gomez, Nick Frosst, Marzieh Fadaee, Beyza Ermis, Ahmet Üstün, and Sara Hooker
    2024
  2. Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation
    Shivalika Singh, Angelika Romanou, Clémentine Fourrier, David I. Adelani, Jian Gang Ngui, Daniel Vila-Suero, Peerat Limkonchotiwat, Kelly Marchisio, Wei Qi Leong, Yosephine Susanto, Raymond Ng, Shayne Longpre, Wei-Yin Ko, Madeline Smith, Antoine Bosselut, Alice Oh, Andre F. T. Martins, Leshem Choshen, Daphne Ippolito, Enzo Ferrante, Marzieh Fadaee, Beyza Ermis, and Sara Hooker
    2024
  3. ICLR
    INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
    Angelika Romanou, Negar Foroutan, Anna Sotnikova, Zeming Chen, Sree Harsha Nelaturu, Shivalika Singh, Rishabh Maheshwary, Micol Altomare, Mohamed A. Haggag, Snegha A, Alfonso Amayuelas, Azril Hafizi Amirudin, Viraat Aryabumi, Danylo Boiko, Michael Chang, Jenny Chim, Gal Cohen, Aditya Kumar Dalmia, Abraham Diress, Sharad Duwal, Daniil Dzenhaliou, Daniel Fernando Erazo Florez, Fabian Farestam, Joseph Marvin Imperial, Shayekh Bin Islam, Perttu Isotalo, Maral Jabbarishiviari, Börje F. Karlsson, Eldar Khalilov, Christopher Klamm, Fajri Koto, Dominik Krzemiński, Gabriel Adriano Melo, Syrielle Montariol, Yiyang Nan, Joel Niklaus, Jekaterina Novikova, Johan Samir Obando Ceron, Debjit Paul, Esther Ploeger, Jebish Purbey, Swati Rajwal, Selvan Sunitha Ravi, Sara Rydell, Roshan Santhosh, Drishti Sharma, Marjana Prifti Skenduli, Arshia Soltani Moakhar, Bardia Soltani Moakhar, Ran Tamir, Ayush Kumar Tarun, Azmine Toushik Wasi, Thenuka Ovin Weerasinghe, Serhan Yilmaz, Mike Zhang, Imanol Schlag, Marzieh Fadaee, Sara Hooker, and Antoine Bosselut
    2024
  4. M-RewardBench: Evaluating Reward Models in Multilingual Settings
    Srishti Gureja, Lester James V. Miranda, Shayekh Bin Islam, Rishabh Maheshwary, Drishti Sharma, Gusti Winata, Nathan Lambert, Sebastian Ruder, Sara Hooker, and Marzieh Fadaee
    2024
  5. NeurIPS Workshop
    Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning
    Aakanksha, Arash Ahmadian, Seraphina Goldfarb-Tarrant, Beyza Ermis, Marzieh Fadaee, and Sara Hooker
    2024
  6. ArXiv
    Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement
    Simon Yu, Liangyu Chen, Sara Ahmadian, and Marzieh Fadaee
    2024
  7. ICLR
    To Code, or Not To Code? Exploring Impact of Code in Pre-training
    Viraat Aryabumi, Yixuan Su, Raymond Ma, Adrien Morisot, Ivan Zhang, Acyr Locatelli, Marzieh Fadaee, Ahmet Üstün, and Sara Hooker
    2024
  8. LLM See, LLM Do: Guiding Data Generation to Target Non-Differentiable Objectives
    Luísa Shimabucoro, Sebastian Ruder, Julia Kreutzer, Marzieh Fadaee, and Sara Hooker
    2024
  9. The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm
    Aakanksha, Arash Ahmadian, Beyza Ermis, Seraphina Goldfarb-Tarrant, Julia Kreutzer, Marzieh Fadaee, and Sara Hooker
    2024
  10. ArXiv
    Aya 23: Open Weight Releases to Further Multilingual Progress
    Viraat Aryabumi, John Dang, Dwarak Talupuru, Saurabh Dash, David Cairuz, Hangyu Lin, Bharat Venkitesh, Madeline Smith, Jon Ander Campos, Yi Chern Tan, Kelly Marchisio, Max Bartolo, Sebastian Ruder, Acyr Locatelli, Julia Kreutzer, Nick Frosst, Aidan Gomez, Phil Blunsom, Marzieh Fadaee, Ahmet Üstün, and Sara Hooker
    2024
  11. Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
    Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Ahmet Üstün, and Sara Hooker
    2024
  12. Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
    Ahmet Üstün, Viraat Aryabumi, Zheng-Xin Yong, Wei-Yin Ko, Daniel D’souza, Gbemileke Onilude, Neel Bhandari, Shivalika Singh, Hui-Lee Ooi, Amr Kayid, Freddie Vargus, Phil Blunsom, Shayne Longpre, Niklas Muennighoff, Marzieh Fadaee, Julia Kreutzer, and Sara Hooker
    2024
  13. Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
    Shivalika Singh, Freddie Vargus, Daniel D’souza, Börje F. Karlsson, Abinaya Mahendiran, Wei-Yin Ko, Herumb Shandilya, Jay Patel, Deividas Mataciunas, Laura O’Mahony, Mike Zhang, Ramith Hettiarachchi, Joseph Wilson, Marina Machado, Luisa Souza Moura, Dominik Krzemiński, Hakimeh Fadaei, Irem Ergün, Ifeoma Okoh, Aisha Alaagib, Oshan Mudannayake, Zaid Alyafeai, Vu Minh Chien, Sebastian Ruder, Surya Guthikonda, Emad A. Alghamdi, Sebastian Gehrmann, Niklas Muennighoff, Max Bartolo, Julia Kreutzer, Ahmet Üstün, Marzieh Fadaee, and Sara Hooker
    2024

2023


  1. NeurIPS
    Elo Uncovered: Robustness and Best Practices in Language Model Evaluation
    Meriem Boubdir, Edward Kim, Beyza Ermis, Sara Hooker, and Marzieh Fadaee
    2023
  2. ArXiv
    Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation
    Meriem Boubdir, Edward Kim, Beyza Ermis, Marzieh Fadaee, and Sara Hooker
    2023
  3. ArXiv
    When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
    2023
  4. ArXiv
    InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval
    Vitor Jeronymo, Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Jakub Zavrel, and Rodrigo Nogueira
    2023

2022


  1. ArXiv
    In Defense of Cross-Encoders for Zero-Shot Retrieval
    Guilherme Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, and Rodrigo Nogueira
    2022
  2. InPars: Data Augmentation for Information Retrieval using Large Language Models
    Luiz Henrique Bonifacio, Hugo Abonizio, Marzieh Fadaee, and Rodrigo Nogueira
    In SIGIR, Feb 2022
  3. ArXiv
    No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval
    Guilherme Moraes Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, and Rodrigo Nogueira
    In arXiv, Feb 2022

2021


  1. ArXiv
    mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset
    Luiz Bonifacio, Vitor Jeronymo, Hugo Queiroz Abonizio, Israel Campiotti, Marzieh Fadaee, Roberto Lotufo, and Rodrigo Nogueira
    In arXiv, Feb 2021

2020


  1. Thesis
    Understanding and Enhancing the Use of Context for Machine Translation
    Marzieh Fadaee
    Oct 2020
  2. A New Neural Search and Insights Platform for Navigating and Organizing AI Research
    Marzieh Fadaee, Olga Gureenkova, Fernando Rejon Barrera, Carsten Schnober, Wouter Weerkamp, and Jakub Zavrel
    In Proceedings of the First Workshop on Scholarly Document Processing, Nov 2020
  3. The Unreasonable Volatility of Neural Machine Translation Models
    Marzieh Fadaee and Christof Monz
    In Proceedings of the Fourth Workshop on Neural Generation and Translation, Jul 2020

2018


  1. Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation
    Marzieh Fadaee and Christof Monz
    In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), Jul 2018
  2. Examining the Tip of the Iceberg: A Data Set for Idiom Translation
    Marzieh Fadaee, Arianna Bisazza, and Christof Monz
    In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), May 2018

2017


  1. Data Augmentation for Low-Resource Neural Machine Translation
    Marzieh Fadaee, Arianna Bisazza, and Christof Monz
    In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), Jul 2017
  2. Learning Topic-Sensitive Word Representations
    Marzieh Fadaee, Arianna Bisazza, and Christof Monz
    In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), Jul 2017

2013


  1. CICLing
    Automatic WordNet Construction Using Markov Chain Monte Carlo
    Marzieh Fadaee, Hamidreza Ghader, Heshaam Faili, and Azadeh Shakery
    Polibits, Jul 2013