PhD: A Generic and Model-Agnostic Evaluation Framework for Decision-Making Tasks (Task-Oriented) F/M
Orange
about the role
Orange has implemented various bot solutions and records large amounts of human-machine and human-to-human conversations for customer care. Often, these conversations are not rigorously evaluated.
Large language models (LLMs) are a breakthrough in many natural language processing (NLP) tasks, including the development of agents capable of solving complex tasks [6]. Moreover, developing chatbots has become democratized. It is likely that there will be a proliferation of solutions in the near future. The boundaries between various NLP tasks and the domains (e.g. tourism, restaurant, retail, technical support, etc.) are blurring.
Evaluating these solutions is becoming a real need: the scope of evaluation must be broadened and made transferable.
Previous work studied the correlation between objective and subjective metrics (indicators) for evaluating conversations [3] and text generation [6]. Other work predicted the quality of the conversation [1]. Model-agnostic scores are proposed in [4] to compare the behavior of two dialogue systems. More recently, the DSTC (Dialogue System Technology Challenge) focused this year on the evaluation of dialogues [5].
Inspired by game theory [7], interpretability [2] and self-driving cars [8], we will infer the strategy followed by the dialogue system [10]. We will work on both public (WebShop, ALFWorld, …) and private Orange (Technical Assistance, Commercial Bots, etc.) datasets.
References:
[1] Rojas-Barahona, Lina M. (2020). Is the User Enjoying the Conversation? A Case Study on the Impact on the Reward Function. In Proceedings of the NeurIPS workshop HLDS2020.
[2] Michele Cafagna, Lina M. Rojas-Barahona, Kees van Deemter, Albert Gatt. Interpreting Vision and Language generative models with semantic visual priors. In Frontiers in AI, special issue: Explainable AI in Natural Language.
[3] Marilyn A. Walker, Diane J. Litman, Candace A. Kamm, and Alicia Abella. 1997a. PARADISE: A framework for evaluating spoken dialogue agents. ACL and EACL, pages 271-280, Madrid, Spain.
[4] Ultes, Stefan, and Wolfgang Maier. “Similarity scoring for dialogue behaviour comparison.” SIGDIAL. 2020.
[5] Mehri, Shikib, et al. “Interactive evaluation of dialog track at DSTC9.” arXiv preprint arXiv:2207.14403 (2022).
[6] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2022. ReAct: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629.
[7] Yu, Xiaopeng, et al. “Model-based opponent modeling.” Advances in Neural Information Processing Systems 35 (2022): 28208-28221.
[8] Teng, Siyu, et al. “Motion planning for autonomous driving: The state of the art and future perspectives.” IEEE Transactions on Intelligent Vehicles (2023).
about you
– You have experience in artificial intelligence and machine learning, particularly deep learning.
– You have a good level of mathematics (numerical optimization, statistics, probability, etc.).
– You are proficient in software development
– You are proficient in reading, writing and speaking English
– You are curious, attracted by new technologies, and ready to keep up with their evolutions
– You enjoy working in a team, within multidisciplinary projects, and contributing to a common goal, while being autonomous in your activities
– You have good analytical and synthesis skills
– Proficiency in one of the following deep learning tools would be a plus: Torch, PyTorch, TensorFlow, MXNet
– You like to communicate the results of your work through written reports and oral presentations, preferably in English
Lannion, Côtes-d’Armor
Fri, 05 Jul 2024 22:56:34 GMT