Libsyn Directory

"«Boundaries», Part 1: a key missing concept from utility theory" by Andrew Critch

Release Date: 07/28/2022

LessWrong Curated Podcast

Things I believe about making surveys, : If you write a question that seems clear, there’s an unbelievably high chance that any given reader will misunderstand it. (Possibly this applies to things that aren’t survey questions also, but that’s a problem for another time.) A better way to find out if your questions are clear is to repeatedly take a single individual person, and sit down with them, and ask them to take your survey while narrating the process: reading the questions aloud, telling you what they think the question is asking, explaining their thought process in answering...

"Toni Kurz and the Insanity of Climbing Mountains" by Gene Smith

LessWrong Curated Podcast

Content warning: death I've been on a YouTube binge lately. My current favorite genre is disaster stories about mountain climbing. The death statistics for some of these mountains, especially ones in the Himalayas are truly insane. To give an example, let me tell you about a mountain most people have never heard of: Nanga Parbat. It's a 8,126 meter "wall of ice and rock", sporting the tallest mountain face and the fastest change in elevation in the entire world: the Rupal Face. I've posted a picture above, but these really don't do justice to just how gigantic this wall is. This single face...

"Deliberate Grieving" by Raemon

LessWrong Curated Podcast

This post is hopefully useful on its own, but begins a series ultimately about grieving over a world that might (or, might not) be . It starts with some pieces from a previous post, but goes into more detail. At the beginning of the pandemic, I didn’t have much experience with . By the end of the pandemic, I had gotten quite a lot of practice grieving for things. I now think of grieving as a key life skill, with ramifications for epistemics, action, and coordination. I had read , which gave me footholds to get started with. But I still had to develop some skills from...

"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope

LessWrong Curated Podcast

Crossposted from the . May contain more technical jargon than usual. TL;DR: To even consciously consider an alignment research direction, to locate it as a promising lead. As best I can tell, many directions seem interesting but do not have strong evidence of being “entangled” with the alignment problem such that I expect them to yield significant insights. For example, “we can solve an easier version of the alignment problem by first figuring out how to build an AI which maximizes the number of real-world diamonds” has intuitive appeal and plausibility, but this claim...

"Changing the world through slack & hobbies" by Steven Byrnes

LessWrong Curated Podcast

Introduction In EA orthodoxy, if you're really serious about EA, the three alternatives that people most often seem to talk about are (1) “direct work” in a job that furthers a very important cause; (2) ; (3) earning that will help you do those things in the future, e.g. by getting a PhD or teaching yourself ML. By contrast, there’s not much talk of: (4) being in a job / situation where you have extra time and energy and freedom to explore things that seem interesting and important. But that last one is really important!

"«Boundaries», Part 1: a key missing concept from utility theory" by Andrew Critch

LessWrong Curated Podcast

Crossposted from the . May contain more technical jargon than usual. This is Part 1 of my on LessWrong. Summary: «Boundaries» are a missing concept from the axioms of game theory and bargaining theory, which might help pin-down certain features of multi-agent rationality (this post), and have broader implications for effective altruism discourse and x-risk (future posts). 1. Boundaries (of living systems) Epistemic status: me describing what I mean. With the exception of some relatively recent and isolated pockets of research on embedded agency (e.g., ), most attempts at formal...

"ITT-passing and civility are good; "charity" is bad; steelmanning is niche" by Rob Bensinger

LessWrong Curated Podcast

I often object to claims like "charity/steelmanning is an argumentative virtue". This post collects a few things I and others have said on this topic over the last few years. My current view is: ("the art of addressing the best form of the other person’s argument, even if it’s not the one they presented") is a useful niche skill, but I don't think it should be a standard thing you bring out in most arguments, even if it's an argument with someone you strongly disagree with. Instead, arguments should mostly be organized around things like: Object-level learning and truth-seeking, with...

"What should you change in response to an "emergency"? And AI risk" by Anna Salamon

LessWrong Curated Podcast

Related to: Epistemic status: A possibly annoying mixture of straightforward reasoning and hard-to-justify personal opinions. It is often stated (with some justification, IMO) that AI risk is an “emergency.” Various people have explained to me that they put various parts of their normal life’s functioning on hold on account of AI being an “emergency.” In the interest of people doing this sanely and not confusedly, I’d like to take a step back and seek principles around what kinds of changes a person might want to make in an “emergency” of different sorts. ...

"On how various plans miss the hard bits of the alignment challenge" by Nate Soares

LessWrong Curated Podcast

Crossposted from the . May contain more technical jargon than usual. (As usual, this post was written by Nate Soares with some help and editing from Rob Bensinger.) In my, I described a “hard bit” of the challenge of aligning AGI—the sharp left turn that comes when your system slides into the “AGI” capabilities well, the fact that alignment doesn’t generalize similarly well at this turn, and the fact that this turn seems likely to break a bunch of your existing alignment properties. Here, I want to briefly discuss a variety of current research proposals in the field, to explain...

"Humans are very reliable agents" by Alyssa Vance

LessWrong Curated Podcast

Over the last few years, deep-learning-based AI has progressed in fields like natural language processing and image generation. However, self-driving cars seem stuck in perpetual beta mode, and aggressive predictions there have repeatedly been . Google's self-driving project started four years AlexNet kicked off the deep learning revolution, and it still isn't deployed at , thirteen years later. Why are these fields getting such ? Right now, I think the biggest answer is that judge models by average-case performance, while self-driving cars (and many other applications) require matching...

More Episodes

https://www.lesswrong.com/posts/8oMF8Lv5jiGaQSFvo/boundaries-part-1-a-key-missing-concept-from-utility-theory

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is Part 1 of my «Boundaries» Sequence on LessWrong.

Summary: «Boundaries» are a missing concept from the axioms of game theory and bargaining theory, which might help pin-down certain features of multi-agent rationality (this post), and have broader implications for effective altruism discourse and x-risk (future posts).

1. Boundaries (of living systems)

Epistemic status: me describing what I mean.

With the exception of some relatively recent and isolated pockets of research on embedded agency (e.g., Orseau & Ring, 2012; Garrabrant & Demsky, 2018), most attempts at formal descriptions of living rational agents — especially utility-theoretic descriptions — are missing the idea that living systems require and maintain boundaries.

When I say boundary, I don't just mean an arbitrary constraint or social norm. I mean something that could also be called a membrane in a generalized sense, i.e., a layer of stuff-of-some-kind that physically or cognitively separates a living system from its environment, that 'carves reality at the joints' in a way that isn't an entirely subjective judgement of the living system itself. Here are some examples that I hope will convey my meaning:

1. Boundaries (of living systems)

TOPICS