All times are in the Central time zone. Links to the papers and SlidesLive videos are available at the link above.

9:30 a.m. - 9:45 a.m. Opening Remarks Jessica Huynh
9:45 a.m. - 10:15 a.m. Invited Keynote by Jason Weston (Keynote) Jason Weston
10:15 a.m. - 10:25 a.m. Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets (Oral) Philippe Laban, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
10:25 a.m. - 10:35 a.m. Are GAN Biased? Evaluating GAN-Generated Facial Images via Crowdsourcing (Oral) Hangzhi Guo, Lizhen Zhu, Ting-Hao Huang
10:35 a.m. - 10:45 a.m. Towards Credible Human Evaluation of Open-Domain Dialog Systems Using Interactive Setup (Oral) Sijia Liu, Patrick Lange, Behnam Hedayatnia, Alexandros Papangelis, Di Jin, Andrew Wirth, Yang Liu, Dilek Hakkani-Tur
10:45 a.m. - 11:30 a.m. Panel on Technical Challenges Associated with Reliable Human Evaluations of Generative Models (Discussion Panel) Long Ouyang, Tongshuang Wu, Zachary Lipton
11:30 a.m. - 1:00 p.m. Lunch Break (Break)
1:00 p.m. - 1:50 p.m. Discussion on Policy Challenges Associated with Generative Models (Discussion Panel) Irene Solaiman, Russell Wald, Yonadav Shavit
1:50 p.m. - 2:00 p.m. Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark Vitali Petsiuk, Alexander E. Siemenn, Saisamrit Surbehera, Zad Chin, Keith Tyser, Gregory Hunter, Arvind Raghavan, Yann Hicke, Bryan Plummer, Ori Kerret, Tonio Buonassisi, Kate Saenko, Armando Solar-Lezama, Iddo Drori
2:00 p.m. - 2:10 p.m. Can There be Art Without an Artist? (Oral) Avijit Ghosh, Genoveva Fossas
2:10 p.m. - 2:20 p.m. Best Prompts for Text-to-Image Models and How to Find Them (Oral) Nikita Pavlichenko, Fedor Zhdanov, Dmitry Ustalov
2:20 p.m. - 2:30 p.m. Evaluation of Synthetic Datasets for Conversational Recommender Systems (Oral) Harsh Lara, Manoj Tiwari
2:30 p.m. - 2:45 p.m. Coffee Break (Break)
2:45 p.m. - 3:35 p.m. Panel and QnA with Science Funders Interested in Reliable Human Evaluation of Generative Models (Panel) Brittany Smith, Eric Sears, Yonadav Shavit
3:35 p.m. - 3:45 p.m. Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models (Oral) Vikas Raunak, Matt Post, Arul Menezes
3:45 p.m. - 3:55 p.m. Sensemaking Interfaces for Human Evaluation of Language Model Outputs (Oral) Katy Gero, Jonathan Kummerfeld, Elena Glassman
3:55 p.m. - 4:05 p.m. The Reasonable Effectiveness of Diverse Evaluation Data Lora Aroyo, Mark Diaz, Christopher M. Homan, Vinodkumar Prabhakaran, Alex Taylor, Ding Wang
4:05 p.m. - 4:15 p.m. Closing Remarks Jennifer Hsia