Tutorial: Responsible use of large language models
Large language models (LLMs), like GPT, are powerful tools for generating text, including code for data analysis. This is at once empowering (they can save you time) and risky (they can make mistakes that are hard to detect).
Our general view is that LLMs are tools you will encounter and use after you leave this classroom, so it’s important to learn how to use them responsibly and effectively. As described in the syllabus, you are generally permitted to use LLMs in this course. Ultimately, however, you are responsible for verifying, understanding, and interpreting your results, and you can’t do that if you don’t understand the code you are running. (This isn’t unique to code generated by LLMs – it applies equally to code you copy from the internet!)
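One practical habit for taking that responsibility seriously is to cross-check LLM-generated code against a trusted reference before relying on it. As a minimal sketch (the function and data below are hypothetical, not from any particular LLM), suppose an LLM suggests a function for the sample standard deviation; you can verify it against Python's built-in `statistics` module:

```python
import math
import statistics

def sample_sd(xs):
    # Hypothetical LLM-suggested implementation: read it, understand it,
    # and test it before trusting it.
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))

# Cross-check on sample data against the standard library's stdev,
# which also uses the n - 1 (sample) denominator.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
assert math.isclose(sample_sd(data), statistics.stdev(data))
```

A quick check like this won't catch every error, but it forces you to state what the code is supposed to compute, which is exactly the understanding this course expects.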
The use of LLMs for coding is a new and rapidly evolving area. Rather than prescribe a lesson plan, this page collects some resources for self-learning.
Links and resources
- GitHub Copilot is an extension for VS Code that can provide suggestions for code completion and editing. It is free for students and educators.
- Blog: “Bob Carpenter thinks GPT-4 is awesome”: this post highlights how GPT-4 is able to write a program in Stan, a statistical programming language, and also the mistakes that it makes. Finding and correcting these mistakes requires knowing the Stan language and having a deep understanding of the statistical model, but someone with this expertise could potentially use GPT-4 to accelerate their coding workflow. The comments are also interesting and insightful.
- AI Snake Oil is a blog that seeks to dispel hype, remove misconceptions, and clarify the limits of AI. The authors are in the Princeton University Department of Computer Science.
- ChatGPT has both free and paid tiers and can help with writing and interpreting code, though it requires the same care described above.