Reinforcement gradients seem to be neurologically fundamental
and they seem to me to be very underrated
I’m trying to write shorter blog posts. I have a bunch of thoughts that I’d be more satisfied to have in blog post form than in tweet or twitter thread form, but that I think are ultimately pretty compact. Or they would be if I knew how to write them that way! which I’m still figuring out. Maybe I’ll successfully make this blog post short. If not, at least I am not burying the lede with my title. (ETA: not as short as I hoped. Oh well.)
When I say reinforcement gradients, I mean the thing that in the academic behaviorist literature tends to be described as “conjugate reinforcement schedules”, or “synchronous reinforcement schedules”. Here, I’m going to call it “conjugate reinforcement” because that’s the term I’ve mostly heard behaviorists use[1].
Conjugate reinforcement is the thing where the effect I get varies as I do more or less of a thing. Like if I’m banging on a pot with a stick, the sound will be louder as I bang more. If I bang just a little, the sound will be less. Or like, if I push a box around, the box moves pretty exactly as my hand pushes it—not in a discrete burst half a second later.
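For readers who think in code, here is a toy Python sketch of that contrast. It is only an illustration under made-up assumptions: the function names, the `gain`, `threshold`, `burst`, and `delay_steps` parameters, and the sample effort values are all hypothetical and are not taken from the behaviorist literature. The first function gives feedback that tracks effort at every moment (the conjugate case); the second gives a fixed burst a step after effort crosses a threshold (the discrete, delayed case).

```python
from typing import List


def conjugate_feedback(effort: List[float], gain: float = 1.0) -> List[float]:
    """Feedback tracks effort at every time step (hypothetical toy model)."""
    return [gain * e for e in effort]


def discrete_feedback(
    effort: List[float],
    threshold: float = 0.5,  # assumed cutoff for what counts as "a response"
    burst: float = 1.0,      # assumed fixed size of the reinforcer
    delay_steps: int = 1,    # assumed lag between response and reinforcer
) -> List[float]:
    """A fixed-size burst arrives some steps after effort crosses a threshold."""
    out = [0.0] * len(effort)
    for t, e in enumerate(effort):
        if e >= threshold and t + delay_steps < len(effort):
            out[t + delay_steps] = burst
    return out


if __name__ == "__main__":
    effort = [0.0, 0.2, 0.8, 0.4, 0.0]  # made-up banging intensity over time
    print("conjugate:", conjugate_feedback(effort))
    print("discrete: ", discrete_feedback(effort))
```

In the conjugate output, every change in effort shows up in the feedback at the same step. In the discrete output, most of the variation in effort is invisible, and the one burst lands after the effort that earned it.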
My take on conjugate reinforcement schedules occupies a pretty special epistemic status in my mind (and heart):
I’m pretty sure I’m right about it.
I think it’s sort of an obvious point, and in line with published academic research on the topic.
~Nobody seems to care?
But insofar as it’s true, it seems pretty relevant to a lot of people’s interests?
Color me confused. I notice I’m confused.
I learned about conjugate reinforcement from the CAAWT people. They have a multipart procedure called “constructional affection”, and the first phase of it is about giving dogs affection on a conjugate schedule.[2] I think their procedure is very cool, and if you talked to me a few years ago, you may have heard me go on about it at length. If you’re interested, I recommend watching some of their videos. Also, please drop me a line if their work intrigues you, since I love discussing it.
My mostly failed literature review
When I was writing this, I tried to use o3 to do a literature review of the studies comparing conjugate and episodic reinforcement schedules. That was basically a mess! I’ve now spent a few hours trying to sort through it, and I give up. It cited a bunch of real papers, but mostly with wrong links until I specifically asked for the right links for each paper, and its summaries were mostly wrong. I think one issue is that many of the relevant papers are old, and their full text isn’t easily available. I think another big issue is that, while there are plenty of papers about conjugate reinforcement schedules, there are almost none that compare conjugate schedules to episodic schedules. Also, o3 told me it couldn’t find a single study comparing these schedules in animals! Why not???
The one I found that seems to be the real thing has six 16-20 week old babies as its subjects. It got them to activate a light bar wind chime thing, and found:
compared to the continuous CRF schedule, the conjugate-reinforcement schedule generated higher peak responding.
Here are some other quotes I saved because they seemed sort of relevant, even though they didn’t address the conjugate vs. episodic thing directly.
Two from Conjugate reinforcement of infant exploratory behavior:
the rapidity with which even the youngest infants acquire conjugately reinforced responses has eliminated the necessity for lengthy shaping or training periods.
All infants receiving the conjugate procedure showed performance sensitive to the reinforcement contingencies for the 27-minute test period.
Based on an examination of studies that were published during the past 10 years, I identified only one article authored by a behavior-analytic researcher that referenced one or more studies from the conjugate reinforcement literature.
My own life experience says the conjugate thing works better
Papers aside, my own life experience has been that conjugate schedules seem to be doing something way more consistently than episodic schedules do.
I’ve tried a bunch of behaviorist style training projects with my dogs, my kids, and myself, and the diff has not been subtle.
I may have updated too much, but I think my top point estimate guess about what’s going on is that not only are conjugate schedules more neurologically fundamental, episodic schedules can only work when they are built on top of conjugate schedule duct tape, and that most of the time when the teacher thinks that the learner is handling the episodic abstraction on their end, instead the teacher is telegraphing that reinforcement is coming in a way that is more conjugate than they are aware of.
One of my more casual hobbies is reading parenting Facebook groups and getting mad when people get (godawful, by my standards) “advice” about how to get their kids to stop doing stuff that the parents don’t like or think is bad. I also like to talk to parents about this stuff in person, or in more personal online spaces. I like to think this gives me a halfway decent sense of what sorts of things people tend to recommend.
I was going to say that I had literally never, if I exclude myself, heard anyone talk about how much more possible[3] it is for kids, especially little kids, to learn things from gradients. But I think that’s wrong, and a good friend of mine did bring this up once in a discussion. Maybe no one brings it up because it’s not actually a helpful way to think about it for almost anyone?
In many ways, I think the thing I’m trying to talk about is remedial, in the sense that tons of people, if not most, get it intuitively, even if they couldn’t put it in words. But, for me, a bunch of the point of blogging is to write up things that I wish I had been able to tell my past self. And I’m quite sure that the conjugate vs. episodic framing would have been extremely useful to past me. It’s extremely useful to present me.
Here’s a real example from my life. I’m on my 5th (and final) kid. She’s 18 months old, and she doesn’t mess with our dog water bowl. She hasn’t messed with it in months. Just as I was writing this I put some water down for my dog and walked away, even though she’s in the same room. I still mostly pick it up when it’s full and the dogs aren’t actively using it, because I don’t want to tempt fate, but I’m still kind of shocked that she doesn’t mess with it. My first three messed with the dog water bowl for years, and, age-adjusted, my 3 year old messed with it more than she does, but not as much as the older ones had. And the thing that made the difference was that I learned a bunch about how reinforcement worked, including and especially about conjugate schedules.
As my daughter would go to mess with the dog water, I would block her. And as she withdrew from her attempts to mess with the dog water, I would unblock her. Not “after”, or “when”, but “as”. And over (a little!) time, we did a shaping process where I could eventually telegraph all this without being right down on the floor next to her.
I did some other stuff too, like trying to give her other opportunities to play with water, explaining that this water was for the dog and that it was bad for the floor to get wet, letting her play with the empty dog water bowl (since the bowl itself seemed to be about half the appeal anyway), letting her see that I didn’t like it when she did pour out the dog water, telling her not to, and taking the dog water away when I couldn’t get her to stop messing with it. But I did all that type of stuff with my older kids too, and it took way longer for them to stop messing with the dog water.
Don’t get me wrong—I haven’t usually gotten as-seen-on-TV style results when I try to teach my little kids stuff this way. Maybe one day :-). But even when my experiments ~fail, when I’m manipulating a contingency that they actually care about, I typically get a strong felt sense that “something is happening”. I think I get that sense because the people I’m trying to communicate with are getting that sense.
Which makes sense to me because I think ~everyone, including babies, is seeking actual causal models of their environment, and things varying together in realtime is one of the most reliable signals out there that there’s a causal connection. Susan Friedman says “control is a primary reinforcer for behavior,” which seems very true to me. Conjugate schedules are good evidence that I’m actually controlling something.
If I’m right about conjugate schedules, then why do people focus so much on episodic schedules? Could be something about how language is pretty discrete... Meditators talk about how much confusion the discreteness of language creates. Maybe separately from language there’s something about how modernism likes things to be discrete? I think there’s also some view from nowhere/central planner’s eye view/trust the contrived plan over your own lying eyes type of distributed psy op that likes episodic reinforcement more because it’s often easier to make contrived reinforcers episodic, and natural environmental ones will almost never be.
But that’s more me trying to connect this up to my other weird interests. I think my mild conspiracy theorizing is pretty separable from my main claim, which even has a bunch of academic legitimacy, afaict.
When I’m trying to do any sort of nonverbal communication with someone, one of the first things I check if I’m getting stuck is whether I’m trying to do an episodic thing instead of a conjugate thing. Since I started doing that I’ve gotten way better results. YMMV, as with all things, but I’d be quite surprised if the opposite advice were good for anyone. (And trying to imagine the opposite advice being true makes me pretty twitchy.)
[1] Some other terms people use include, afaict:
schedule of covariation
moment-to-moment contingency
graded feedback schedule
continuous-feedback schedule
A confusing thing about trying to search this literature is that researchers used to use “continuous” to mean FR1, an episodic schedule where every response is reinforced. More recently, researchers have used “continuous” to mean “conjugate”.
[2] Sometimes they describe the conjugate schedule they use in a pretty discrete way, but I think that’s misleading, both about what’s effective and about how Sean and Maasa actually think about it. IIRC on their podcasts they are pretty clear that the person should be reacting in time with the dog as much as possible.
[3] I tried writing “much easier” instead of “much more possible”, but that didn’t seem true enough so I changed it.

