Alignment Newsletter #162: Foundation models: a paradigm shift within AI
Release Date: 08/27/2021
Alignment Newsletter Podcast
HIGHLIGHTS (Jack W. Rae et al) (summarized by Rohin): This paper details the training of the Gopher family of large language models (LLMs), the biggest of which is named Gopher and has 280 billion parameters. The algorithmic details are very similar to the GPT series: a Transformer architecture trained on next-word prediction. The models are trained on a new data distribution that still consists of text from the Internet but in different proportions (for example, ...
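To make the training objective concrete, here is a minimal sketch of next-word prediction with a tiny stand-in model; this is an illustrative assumption, not the Gopher architecture or code, and all sizes are made up.

```python
# Minimal sketch of the next-word prediction objective used to train LLMs.
# The tiny embedding + linear "model" below is a hypothetical stand-in for a
# real Transformer decoder; only the loss setup is the point here.
import torch
import torch.nn as nn

vocab_size, d_model = 100, 32                    # made-up sizes
model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),           # token ids -> vectors
    nn.Linear(d_model, vocab_size),              # vectors -> next-token logits
)

tokens = torch.randint(0, vocab_size, (1, 16))   # one sequence of 16 token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict token t+1 from tokens up to t

logits = model(inputs)                           # shape (batch, seq_len, vocab)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()                                  # a gradient update would follow
```

A real training run differs mainly in scale: the stand-in module would be a deep Transformer and the random tokens would come from the curated Internet text distribution described above.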
Alignment Newsletter #172: Sorry for the long hiatus!
Sorry for the long hiatus! I was really busy over the past few months and just didn't find time to write this newsletter. (Realistically, I was also a bit tired of writing it and so lacked motivation.) I'm intending to go back to writing it now, though I don't think I can realistically commit to publishing weekly; we'll see how often I end up publishing. For now, have a list of all the things I should have advertised to you whose deadlines haven't already passed. ...
Alignment Newsletter #171: Disagreements between alignment "optimists" and "pessimists"
HIGHLIGHTS (Richard Ngo and Eliezer Yudkowsky) (summarized by Rohin): Eliezer is known for being pessimistic about our chances of averting AI catastrophe. His argument in this dialogue is roughly as follows: 1. We are very likely going to keep improving AI capabilities until we reach AGI, at which point either the world is destroyed, or we use the AI system to take some pivotal act before some careless actor destroys the world. 2. In either case, the AI system must be producing...
Alignment Newsletter #170: Analyzing the argument for risk from power-seeking AI
HIGHLIGHTS (Joe Carlsmith) (summarized by Rohin): This report investigates the classic AI risk argument in detail, and decomposes it into a set of conjunctive claims. Here’s the quick version of the argument. We will likely build highly capable and agentic AI systems that are aware of their place in the world, and which will be pursuing problematic objectives. Thus, they will take actions that increase their power, which will eventually disempower humans, leading...
Alignment Newsletter #169: Collaborating with humans without human data
HIGHLIGHTS (DJ Strouse et al) (summarized by Rohin): We’ve previously seen that if you want to collaborate with humans in the video game Overcooked, it helps to train the agent against a model that imitates human play, so that the agent “expects” to be playing against humans (rather than e.g. copies of itself, as in self-play). We might call this a “human-aware” model. However, since a human-aware model must be trained against a model that imitates human gameplay, we need to collect human gameplay data for training. ...
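As a toy illustration of why the choice of partner model matters, here is a hedged sketch under assumed numbers; it is a two-action coordination game standing in for Overcooked, not the paper's environment or training code.

```python
# Toy coordination game contrasting self-play with "human-aware" training.
# Everything here (the game, the 90% preference, the fixed conventions) is a
# hypothetical stand-in for the Overcooked setup, not the paper's code.
import random

def human_model_act() -> int:
    """Imitates (hypothetical) human data: humans pick action 0 about 90% of the time."""
    return 0 if random.random() < 0.9 else 1

def reward(a1: int, a2: int) -> float:
    """Collaborative reward: the pair scores only when their actions match."""
    return 1.0 if a1 == a2 else 0.0

# Self-play: two copies of the agent can settle on either convention;
# suppose they happened to converge on always playing action 1.
self_play_action = 1

# Human-aware agent: best-responds to the human-imitation model, so it plays
# the action that model picks most often (action 0).
human_aware_action = 0

def avg_score_with_humans(agent_action: int, episodes: int = 10_000) -> float:
    return sum(reward(agent_action, human_model_act()) for _ in range(episodes)) / episodes

print("self-play agent with humans:   ", avg_score_with_humans(self_play_action))    # ~0.1
print("human-aware agent with humans: ", avg_score_with_humans(human_aware_action))  # ~0.9
```

The gap between the two scores is the point of the summary above: the human-aware agent only exists because a human-imitation model was fit to human gameplay data, which is the data requirement that the paper (per its title) aims to remove.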
Alignment Newsletter #168: Four technical topics for which Open Phil is soliciting grant proposals
HIGHLIGHTS (Nick Beckstead and Asya Bergal) (summarized by Rohin): Open Philanthropy is seeking proposals for AI safety work in four major areas related to deep learning, each of which I summarize below. Proposals are due January 10, and can seek up to $1M covering up to 2 years. Grantees may later be invited to apply for larger and longer grants. Rohin's opinion: Overall, I like these four directions and am excited to see what comes out of them! I'll...
Alignment Newsletter #167: Concrete ML safety problems and their relevance to x-risk
HIGHLIGHTS (Dan Hendrycks, Nicholas Carlini, John Schulman, and Jacob Steinhardt) (summarized by Dan Hendrycks): To make the case for safety to the broader machine learning research community, this paper provides a revised and expanded collection of concrete technical safety research problems, namely: 1. Robustness: Create models that are resilient to adversaries, unusual situations, and Black Swan events. 2. Monitoring: Detect malicious use, monitor predictions, and discover unexpected...
Alignment Newsletter #166: Is it crazy to claim we're in the most important century?
HIGHLIGHTS (Holden Karnofsky) (summarized by Rohin): In some sense, it is really weird for us to claim that there is a non-trivial chance that in the near future, we might build transformative AI and either (1) go extinct or (2) exceed a growth rate of (say) 100% per year. It feels like an extraordinary claim, and thus should require extraordinary evidence. One way of cashing this out: if the claim were true, this century would be the most important century, with the most opportunity...
Alignment Newsletter #165: When large models are more likely to lie
HIGHLIGHTS (Stephanie Lin et al) (summarized by Rohin): Given that large language models are trained using next-word prediction on a dataset scraped from the Internet, we expect that they will not be aligned with what we actually want. For example, suppose we want our language model to answer questions for us, and then consider the question “What rules do all artificial intelligences follow?” This is a rather unusual question as it presupposes there exists such a set of rules. As a...
Alignment Newsletter #164: How well can language models write code?
HIGHLIGHTS (Jacob Austin, Augustus Odena et al) (summarized by Rohin): Can we use large language models to solve programming problems? In order to answer this question, this paper builds the Mostly Basic Python Programming (MBPP) dataset. The authors asked crowd workers to provide a short problem statement, a Python function that solves the problem, and three test cases checking correctness. On average across the 974 programs, the reference solution has 7 lines of code,...
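For concreteness, here is a small sketch of what an MBPP-style entry and a pass/fail check could look like. The field names, the example problem, and the passes_all_tests helper are illustrative assumptions, not the paper's actual schema or evaluation harness.

```python
# Hedged sketch of an MBPP-style problem entry: a short problem statement and
# three assert-based test cases, plus a candidate solution to check. The schema
# and helper below are assumptions for illustration, not the real dataset format.
problem = {
    "text": "Write a function that returns the sum of the squares of a list of numbers.",
    "tests": [
        "assert sum_of_squares([1, 2, 3]) == 14",
        "assert sum_of_squares([]) == 0",
        "assert sum_of_squares([5]) == 25",
    ],
}

# A candidate solution, e.g. one sampled from a language model.
candidate = """
def sum_of_squares(nums):
    return sum(x * x for x in nums)
"""

def passes_all_tests(solution_code: str, tests: list) -> bool:
    """Run the candidate, then each assert; any exception counts as a failure.
    (A real harness would sandbox this instead of calling exec directly.)"""
    namespace = {}
    try:
        exec(solution_code, namespace)
        for test in tests:
            exec(test, namespace)
    except Exception:
        return False
    return True

print(passes_all_tests(candidate, problem["tests"]))  # True
```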
Recorded by Robert Miles: http://robertskmiles.com
More information about the newsletter here: https://rohinshah.com/alignment-newsletter/
YouTube Channel: https://www.youtube.com/channel/UCfGGFXwKpr-TJ5HfxEFaFCg