|𝔻⟩irac's Student: Fine-tuning GPT-3 for a LAMMPS or VASP AI chatbot

Thursday, March 2, 2023

Fine-tuning GPT-3 for a LAMMPS or VASP AI chatbot

AI, Computational Science, Language Models, Machine Learning, Thinking Out loud

The GPT API supports fine-tuning a GPT model for your specific application. I'm interested in using this to create a tool that lets a user query a software user manual to generate macros or scripts that perform operations. The idea is to put together a series of prompts and completions extracted from user forums such as Stack Overflow/Stack Exchange, Discourse, etc., as well as from domain users who are willing to contribute. For example, I'm curious about creating a GPT chatbot that can provide users with LAMMPS or VASP scripts based on text prompts describing the problem. At the moment, ChatGPT tries to do this but fails to get enough of the specific commands and parameters correct.

What I'm thinking is that you would have a dataset of prompts and completions like:


  {"prompt": "What is the command for computing thermal conductivity in LAMMPS",
   "completion": "In order to calculate the thermal conductivity using the Green-Kubo formulas, the heat flux needs to be calculated.\nThe command to do so is:\ncompute ID group-ID heat/flux ke-ID pe-ID stress-ID"}
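As far as I understand, the GPT-3 fine-tuning endpoint expects these pairs as a JSON Lines file, with one JSON object per line. Here is a minimal sketch of how a handful of collected examples could be serialized that way. The `examples` list and the `to_jsonl` helper are my own illustrative names, not part of any API.

```python
import json

# Illustrative records only; a real dataset would hold many such pairs.
examples = [
    {
        "prompt": "What is the command for computing thermal conductivity in LAMMPS",
        "completion": (
            "In order to calculate the thermal conductivity using the "
            "Green-Kubo formulas, the heat flux needs to be calculated.\n"
            "The command to do so is:\n"
            "compute ID group-ID heat/flux ke-ID pe-ID stress-ID"
        ),
    },
]


def to_jsonl(records):
    """Serialize prompt/completion records as one JSON object per line."""
    lines = []
    for rec in records:
        # Skip records that are missing either half of the pair.
        if not rec.get("prompt") or not rec.get("completion"):
            continue
        lines.append(
            json.dumps({"prompt": rec["prompt"], "completion": rec["completion"]})
        )
    return "".join(line + "\n" for line in lines)


jsonl_text = to_jsonl(examples)
print(jsonl_text, end="")
```

Because `json.dumps` escapes the newlines inside the completion, each record stays on a single line, which is what the JSON Lines layout needs.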

My hope is that if you fine-tune the GPT model with these examples, the user can then ask an AI chatbot something broader, like:

Please create a LAMMPS input script to calculate the thermal conductivity of graphite at 300K.

Would this approach work for fine-tuning a GPT model? I don't really know, but I'm planning on giving it a go. I also need to be mindful that the number of tokens in the fine-tuning dataset doesn't turn it into a costly disaster.
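To get a feel for the cost side before committing, a back-of-the-envelope estimate might look like the sketch below. Everything numeric here is an assumption on my part: the roughly four characters per token rule of thumb for English text, the placeholder price per 1,000 tokens, and the number of training epochs. A proper tokenizer and the current pricing page should replace these for a real estimate.

```python
import json

CHARS_PER_TOKEN = 4.0       # assumption: rough rule of thumb for English text
PRICE_PER_1K_TOKENS = 0.03  # assumption: placeholder USD price, check current pricing


def estimate_tokens(text):
    """Very rough token count based on character length."""
    return max(1, round(len(text) / CHARS_PER_TOKEN))


def estimate_cost(jsonl_text, n_epochs=4, price_per_1k=PRICE_PER_1K_TOKENS):
    """Return (tokens per epoch, estimated training cost in USD).

    n_epochs=4 is an assumed value; training is billed on tokens seen,
    so the dataset size is multiplied by the number of epochs.
    """
    total = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        rec = json.loads(line)
        total += estimate_tokens(rec["prompt"]) + estimate_tokens(rec["completion"])
    billed_tokens = total * n_epochs
    return total, billed_tokens * price_per_1k / 1000.0


# Illustrative single-record dataset.
sample = json.dumps({
    "prompt": "What is the command for computing thermal conductivity in LAMMPS",
    "completion": "compute ID group-ID heat/flux ke-ID pe-ID stress-ID",
}) + "\n"

tokens, cost = estimate_cost(sample)
print(f"~{tokens} tokens per epoch, ~${cost:.4f} for training")
```

The point of the sketch is only that cost scales linearly with both dataset size and epochs, so a few thousand short Q&A pairs should be cheap, while dumping whole manual pages into completions would add up quickly.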

I'm wondering if there is a way to grab the questions and answers in JSON format from the LAMMPS Discourse community and similar sources to create the fine-tuning dataset. If not, collecting domain knowledge from individuals would be very time-consuming. I guess I could create some kind of community input form where users provide this. I would do the same for VASP and hopefully most of the other mainstream atomistic packages. I have a name for a LAMMPS AI chatbot but need to ask the person first if it's okay to name the chatbot after them.
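On the JSON question: as far as I know, Discourse forums serve a JSON version of a topic when `.json` is appended to the topic URL. Assuming that is the case, turning one topic into a prompt/completion pair could look roughly like the sketch below. The field names used here (`title`, `post_stream`, `posts`, `cooked`, `accepted_answer`) reflect my understanding of that format and should be checked against a real response. The `sample_topic` dictionary is made up for illustration and no network request is made.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text from post HTML, adding line breaks at block boundaries."""

    BLOCK_TAGS = {"p", "pre", "div", "li", "blockquote"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Strip tags, keep line breaks, and drop empty lines."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)


def topic_to_pair(topic):
    """Build a prompt/completion pair from a topic dict, or None if unanswered."""
    posts = topic["post_stream"]["posts"]
    if len(posts) < 2:
        return None
    question = html_to_text(posts[0]["cooked"])
    # Prefer a reply flagged as accepted (assumed field); else take the first reply.
    answer = next((p for p in posts[1:] if p.get("accepted_answer")), posts[1])
    return {
        "prompt": f"{topic['title']}\n{question}",
        "completion": html_to_text(answer["cooked"]),
    }


# Made-up topic in the assumed layout.
sample_topic = {
    "title": "Heat flux compute",
    "post_stream": {"posts": [
        {"cooked": "<p>How do I compute the heat flux?</p>"},
        {"cooked": "<p>Thanks, following.</p>"},
        {"cooked": "<p>Use:</p><pre><code>compute ID group-ID heat/flux "
                   "ke-ID pe-ID stress-ID\n</code></pre>",
         "accepted_answer": True},
    ]},
}

pair = topic_to_pair(sample_topic)
print(pair)
```

Falling back to the first reply is a crude choice, since it is often a follow-up question rather than an answer, so some manual review or filtering of the resulting pairs would still be needed.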
