30  Reference Materials

These materials / links were last checked on October 2, 2025. Apologies for any links that no longer work. If you have a moment, please email me at CRunyon@nbme.org to let me know something’s wrong. (Thanks in advance!)

30.1 Background Information on GPTs

The 3Blue1Brown YouTube Channel provides several good videos on some of the technical aspects of large language models.

Anthropic’s paper On the Biology of a Large Language Model is particularly interesting.

This post on lesswrong provides a nice high-level summary for understanding LLMs.


30.2 Reference Guides (including Prompt Engineering)

promptingguide.ai is a great resource for learning more about prompt engineering techniques.

Anthropic’s Prompt Engineering Guide - Includes some directions on uploading files via the API.

OpenAI Cookbook and OpenAI’s Prompt Engineering Guide

Google Gemini’s Prompt Engineering Guide - Includes directions on uploading a file via the API as part of a prompt.


30.3 Newsletters

The first three newsletters below are sent out Monday through Friday. They’re broad in nature (industry trends, policy information, etc.) but also include useful information related to assessment (model updates, new model features, etc.).

30.3.1 The Neuron

  • Has a searchable archive of previous posts that is really useful.

30.3.2 Superhuman

  • Also has a searchable archive.

30.3.3 TLDR AI

30.3.4 The Signal Substack

  • Published only on Sundays; highlights the 3-5 most important AI news stories of the week.

30.3.5 Gary Marcus Substack

  • Professor Emeritus of Psychology and Neural Science at NYU
  • A bit of an AI pessimist, but it’s helpful to offset the deluge of AI excitement

30.3.6 Jack Clark Substack

  • Co-founder of Anthropic
  • He was on the Rick Rubin podcast (Tetragrammaton) and it’s a fun listen

30.3.7 Michael Jabbour Substack

  • AI Innovation Officer at Microsoft

30.3.8 One Useful Thing

  • by Ethan Mollick, author of Co-Intelligence (listed under Books below)

30.4 Podcasts

30.4.1 The AI Daily Brief

  • “A daily news analysis show on all things artificial intelligence. NLW looks at AI from multiple angles, from the explosion of creativity brought on by new tools like Midjourney and ChatGPT to the potential disruptions to work and industries as we know them to the great philosophical, ethical and practical questions of advanced general intelligence, alignment and x-risk.”
  • Short (usually < 30 min.) daily podcasts on various topics. Some are more applicable than others.
  • Spotify, Apple
  • Also a YouTube Channel

30.4.2 Hard Fork

  • Part of the New York Times collection of podcasts: “‘Hard Fork’ is a show about the future that’s already here. Each week, journalists Kevin Roose and Casey Newton explore and make sense of the latest in the rapidly changing world of tech.”
    • Requires an account (maybe free is sufficient? I’m a NYT subscriber anyway.)
    • This episode - “AI School is in Session: Two Takes on the Future of Education” was cool.
  • New York Times, Spotify, Apple

30.5 Online Training / YouTube Channels

30.5.1 DeepLearning.AI

30.5.2 DataCamp

30.5.3 Anthropic YouTube Channel

30.5.4 OpenAI YouTube Channel

30.5.5 Cursor YouTube Channel

  • Cursor is a fantastic AI coding assistant

30.6 Books

AI Engineering by Chip Huyen is a slightly more advanced read for those interested in building AI products.

Prompt Engineering for Generative AI: Future-Proof Inputs for Reliable AI Outputs by James Phoenix and Mike Taylor is also a good resource for learning more about prompt engineering. Focused more on earlier (non-reasoning) models, but some important parts carry through.

Brave New Words by Sal Khan (of Khan Academy fame) is an interesting perspective on how AI will change education.

Co-Intelligence: Living and Working with AI by Ethan Mollick. A more accessible discussion of AI; it might not add much for those who have been interested in or working in the field for a while.


30.7 LLM-specific R Packages

A number of packages have been developed to facilitate interacting with LLMs via R. Many of these packages are useful (we’ll cover some of them in the workshop), whereas others include developer design decisions that don’t work particularly well for my usual workflows. I’ve also found that some packages aren’t regularly updated or maintained. The syntax for interacting with API models can change as new models are released (e.g., GPT-5), which can render some package functionality obsolete.

Below is a non-exhaustive list of packages I’ve found for interacting with LLMs. It is not meant to be a curated list; it’s only to provide you with information about the packages you’ll be using in the workshop (and others), in case you find them helpful for your workflow. All package summaries were initially generated with AI; some have been edited, some have not.

30.7.1 ellmer

ellmer Overview CRAN Documentation

ellmer is an R package that provides a unified interface for interacting with large language models from over 17 providers including OpenAI, Anthropic, Google Gemini, and AWS Bedrock. It supports advanced features like streaming outputs, tool/function calling, structured data extraction, and multimodal inputs. Chat objects are stateful and maintain conversation context, enabling both interactive console-based conversations and programmatic use in R scripts and applications.
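As a minimal sketch of ellmer’s stateful chat interface (this assumes an `OPENAI_API_KEY` environment variable is set; the model name and prompts are illustrative):

```r
library(ellmer)

# Create a stateful chat object; the provider and model are illustrative
chat <- chat_openai(
  model = "gpt-4o-mini",
  system_prompt = "You are a concise assistant."
)

# Each call appends to the same conversation, so context carries forward
chat$chat("Name one R package for data wrangling.")
chat$chat("What was my previous question?")  # the chat object remembers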

30.7.2 tidyprompt

tidyprompt Overview CRAN Documentation

tidyprompt is an R package that provides a compositional framework (“prompt wraps”) for building prompts enriched with logic, validation, and extraction functions when interacting with LLMs. It supports structured output, retry/feedback loops, reasoning strategies (e.g. ReAct or chain-of-thought), and even autonomous R code or function calling as part of an LLM dialogue. The package is provider-agnostic, meaning its features can layer on top of any chat completion API (e.g. via ellmer) to produce more robust, predictable interactions.
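A sketch of the “prompt wrap” idiom, based on the package’s documented style (assumes an OpenAI API key is configured; the provider and wrap shown are illustrative):

```r
library(tidyprompt)

# A prompt wrap adds extraction and validation logic to a base prompt;
# if the model's reply doesn't parse as an integer, the package sends
# feedback and retries automatically
"What is 2 + 2?" |>
  answer_as_integer() |>
  send_prompt(llm_provider_openai())
```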

30.7.3 tidyllm

tidyllm Overview CRAN Documentation

tidyllm provides a tidy, pipeline-friendly interface for interacting with multiple LLM APIs (e.g. Claude, OpenAI, Gemini, Mistral) and local models via Ollama. It supports multimodal inputs (text, images, PDFs), maintains conversational history, handles batching and rate limits, and allows structured schema-based extraction of responses. The design emphasizes composability and integration into typical R data workflows.
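A minimal sketch of tidyllm’s pipeline style (assumes an OpenAI API key is configured; the exact provider arguments are illustrative):

```r
library(tidyllm)

# Build a message, send it to a provider, and extract the reply text;
# the message object keeps conversational history for follow-ups
llm_message("Summarise the mtcars dataset in one sentence.") |>
  chat(openai()) |>
  get_reply()
```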

30.7.4 chattr

chattr Overview CRAN Documentation

chattr is an R package that enables interactive communication with large language models directly within RStudio using a Shiny gadget or from the console. It enriches prompts with contextual information (e.g. loaded data frames) and integrates with various back-ends (e.g. OpenAI, Copilot, local LlamaGPT) via the ellmer interface. The package is geared toward exploratory workflows and rapid prototyping of LLM-assisted analysis.
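A sketch of typical chattr usage inside RStudio (the back-end label passed to `chattr_use()` is illustrative and depends on your configured providers):

```r
library(chattr)

# Pick a back-end, then launch the Shiny gadget inside RStudio
chattr_use("gpt4")   # back-end label is illustrative
chattr_app()

# Or send a one-off prompt from the console; chattr can enrich it
# with context such as loaded data frames
chattr("Suggest a ggplot2 call for plotting mpg vs wt in mtcars")
```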

30.7.5 LLMAgentR

LLMAgentR Overview CRAN Documentation

LLMAgentR is an R package for constructing language model “agents” using a modular, graph-based execution framework inspired by LangChain/LangGraph architectures. It offers a suite of agent types (e.g. code generation, data wrangling, SQL agents, document summarization) that iteratively reason, generate R code, execute, debug, and explain results. The package aims to support reproducible AI workflows for analysis, research, and automation by integrating LLM reasoning and domain logic.

30.7.6 PacketLLM

PacketLLM Overview CRAN Documentation

PacketLLM offers an interactive RStudio gadget interface for chatting with OpenAI LLMs (e.g. GPT-5 and variants) directly within the R environment. It supports multiple simultaneous conversation tabs, file upload (e.g. .R, PDF, DOCX) as contextual input, and per-conversation system message configuration. API calls are handled asynchronously (via promises + future) to avoid blocking the R console during model interactions.