Source: calendar: public domain; http://morguefile.com/archive/display/9472
Hello, class. So in today's lesson, we're going to expand on the idea of operant conditioning by looking at what schedules of reinforcement are. OK. Now remember, operant conditioning explains how learning occurs due to the consequences of behaviors, things like reinforcement and punishment, which make behaviors more or less likely to occur over time. But remember that learning occurs over time. It's not generally something that's the result of a single instance of a reward or punishment. So getting a piece of candy once doesn't guarantee that you'll learn the behavior and do it again later.
So when and how a behavior is rewarded can determine whether it's learned, as well as the strength of that learning. So a schedule of reinforcement means a plan that determines when a behavior will be reinforced and when it won't be reinforced. And these plans can affect exactly how strong that learning will be. OK. So we're going to talk about different ones today.
Now there are several different kinds of schedules of reinforcement under operant conditioning. They fall under two basic categories. OK. So the first one is continuous reinforcement. And this means giving a reinforcement every time a behavior is performed. So every time you see the behavior you're supposed to be seeing in a child, you'd give them a piece of candy. OK. Now this is useful particularly when a behavior is first being learned.
However, it's not necessarily realistic because, let's face it, you don't have an unlimited supply of candy. And also outside or extrinsic rewards aren't necessarily continually rewarding in the same way. Eventually a child might get tired of it. They might get sick of candy or they might literally get sick. And so that kind of reinforcement also isn't necessarily the most helpful.
So the second category we're going to be talking about is partial reinforcement, where the behavior is reinforced only some of the time. This tends to produce more resistance to extinction, which is to say the behavior is less likely to go away, since a subject is less likely to get tired of or satiated by the reinforcer. Now there are four basic types of partial reinforcement that are used. And we're going to look over each one. But as you can see, the names follow a very regular pattern. So they should be easy for you to remember after we're done with a little bit of the explanation.
So first we have our fixed ratio reinforcement, or FR, which is when a reinforcer is given after a specific, predetermined number of correct behaviors. For example, every third or fourth time you see the behavior you want, you give somebody a reward. OK. And this generally leads to some very quick, consistent responses. It's something that's easy for the learner to catch onto. For example, a rat that's placed within a box with a button will press the button many times one after the other. It'll continually press it because it realizes that, after a certain number of presses, it'll always get that reward. OK.
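To make the fixed ratio idea concrete, here is a minimal sketch in Python. The function name `fixed_ratio_schedule` and its parameters are made up for illustration and don't come from the lesson. It just counts responses and marks every third one as rewarded.

```python
def fixed_ratio_schedule(num_responses, ratio):
    """Mark which responses earn a reinforcer under a fixed ratio (FR) schedule.

    Hypothetical helper for illustration: every `ratio`-th response is rewarded.
    """
    rewarded = []
    count = 0  # responses since the last reinforcer
    for _ in range(num_responses):
        count += 1
        if count == ratio:
            rewarded.append(True)   # reinforcer delivered
            count = 0               # start counting again
        else:
            rewarded.append(False)
    return rewarded


# Nine button presses on an FR-3 schedule: presses 3, 6, and 9 pay off.
pattern = fixed_ratio_schedule(9, 3)
print(pattern)
# [False, False, True, False, False, True, False, False, True]
```

Because the pattern is perfectly regular, the learner can catch on to it quickly, which fits the quick, consistent responding described above.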
Now on the other side of that, we have a variable ratio schedule of reinforcement, or VR, which is when a reinforcement is given after a variable, unpredictable number of correct behaviors. Usually the number falls within some kind of range, like somewhere between every two and five times, or it's set around an average, like every four times on average. But the fact is we don't necessarily know each time when the subject will receive the reward. And this generally leads to some very high, consistent levels of responses. Think of this kind of like gambling or like a slot machine, where you're actually very likely to continue playing the slot machine because you don't know when it's going to pay off.
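Here is a rough sketch of a variable ratio schedule, assuming the required number of responses is drawn at random from a range such as two to five. The function name, the uniform random draw, and the `seed` parameter are all assumptions made for this example, not something the lesson specifies.

```python
import random


def variable_ratio_schedule(num_responses, low, high, seed=0):
    """Mark which responses earn a reinforcer under a variable ratio (VR) schedule.

    Assumption for illustration: after each reinforcer, the number of
    responses required for the next one is drawn uniformly from low..high.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    rewarded = []
    required = rng.randint(low, high)  # responses needed for the next reward
    count = 0
    for _ in range(num_responses):
        count += 1
        if count >= required:
            rewarded.append(True)
            count = 0
            required = rng.randint(low, high)  # new, unpredictable requirement
        else:
            rewarded.append(False)
    return rewarded


# Twenty responses, with a reward after every two to five of them.
print(variable_ratio_schedule(20, 2, 5))
```

The subject can't tell which response will pay off, only that one eventually will, which is the slot machine situation.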
Now, as opposed to the ratios, we also have the intervals. A fixed interval, or FI, is when a reinforcer is given after a specific amount of time has passed since the last reinforcement. So, for example, once 30 seconds have passed since the last reward, the subject will be able to get a reward again. OK. And this leads to some very slow responses from the subject, with generally more responses towards the end of the interval, because they begin to predict exactly when they can get it. So they only perform the behavior when they know they're going to receive a reinforcement.
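As a sketch of the fixed interval idea, the code below takes the times (in seconds) at which a subject responds and marks which responses are rewarded. It assumes the usual lab convention that the first response made after the interval has elapsed earns the reinforcer, and that timing starts at zero. The function name and the sample times are invented for the example.

```python
def fixed_interval_schedule(response_times, interval):
    """Mark which responses earn a reinforcer under a fixed interval (FI) schedule.

    Assumptions for illustration: the clock starts at 0, and the first
    response made at least `interval` seconds after the last reinforcer
    is the one that gets rewarded.
    """
    rewarded = []
    last_reward = 0.0
    for t in response_times:
        if t - last_reward >= interval:
            rewarded.append(True)
            last_reward = t  # the interval restarts from this reward
        else:
            rewarded.append(False)
    return rewarded


# Responses at these times (seconds) on a 30-second fixed interval.
times = [5, 20, 31, 40, 59, 62, 95]
print(fixed_interval_schedule(times, 30))
# [False, False, True, False, False, True, True]
```

Notice that the early responses at 5, 20, 40, and 59 seconds earn nothing, which is why subjects tend to hold off and respond near the end of the interval.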
Now this is one that's kind of rare to use. It's one that we definitely learn about, but there isn't really a great example of a fixed interval. One example might be getting paid a salary at the end of every week. OK. But the problem with that is that you don't work harder at the end of the week before you get paid. Right? You generally tend to work harder at the beginning of the week. So it's not a great example to use. But remember that that is one of the possible schedules we can use.
And then the final one we're going to talk about is a variable interval, or VI schedule of reinforcement, which is when a reinforcer is given after an unpredictable, variable amount of time has passed since the last reinforcement. So it might be on average every 30 seconds, but the subject doesn't realize that. And the experimenter generally doesn't realize it either. And this generally leads to some very slow but also very steady responses that don't tend to go away very easily, again, because the subject doesn't realize when they're going to be getting it. So they have to perform the behavior at regular intervals because they need to be ready to receive the reinforcement, the thing that they want to get. OK.
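A variable interval schedule can be sketched the same way as a fixed interval, except that the waiting time changes after every reward. This is only an illustration: the function name, the uniform draw between `low` and `high` seconds, and the `seed` parameter are assumptions, and real experiments may pick the waiting times differently.

```python
import random


def variable_interval_schedule(response_times, low, high, seed=0):
    """Mark which responses earn a reinforcer under a variable interval (VI) schedule.

    Assumptions for illustration: the clock starts at 0, each waiting time
    is drawn uniformly between `low` and `high` seconds, and the first
    response after the wait has elapsed is rewarded.
    """
    rng = random.Random(seed)       # seeded so runs are repeatable
    rewarded = []
    last_reward = 0.0
    wait = rng.uniform(low, high)   # unknown to the subject
    for t in response_times:
        if t - last_reward >= wait:
            rewarded.append(True)
            last_reward = t
            wait = rng.uniform(low, high)  # next wait is different again
        else:
            rewarded.append(False)
    return rewarded


# A subject responding steadily once per second for two minutes,
# with waits between 10 and 50 seconds (about 30 on average).
steady = list(range(1, 121))
outcome = variable_interval_schedule(steady, 10, 50)
print([t for t, r in zip(steady, outcome) if r])  # times at which rewards arrived
```

Since the subject can't predict the wait, responding slowly but steadily is the only way to be ready when the reinforcer becomes available.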
Now as a recap again, the important things to remember are fixed and variable, which is to say it's a set amount of something or it's unpredictable, and it changes. And remember ratio refers to the number of behaviors, whereas interval refers to the amount of time. And that should help you remember exactly what we're talking about with schedules of reinforcement.
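One way to picture the recap is as a two-by-two lookup: one question is whether the requirement is fixed or variable, and the other is whether it counts responses or time. The small Python table below is just a memory aid. The function name `schedule_name` and the labels it accepts are made up for this sketch.

```python
# The four partial reinforcement schedules as a two-by-two lookup.
# Keys: (how predictable the requirement is, what the requirement counts).
SCHEDULES = {
    ("fixed", "responses"): "fixed ratio (FR)",
    ("variable", "responses"): "variable ratio (VR)",
    ("fixed", "time"): "fixed interval (FI)",
    ("variable", "time"): "variable interval (VI)",
}


def schedule_name(predictability, basis):
    """Return the schedule name for a predictability/basis pair (hypothetical helper)."""
    return SCHEDULES[(predictability, basis)]


print(schedule_name("variable", "responses"))
# variable ratio (VR)
```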
Fixed interval: Reinforcement is delivered based on the same amount of time elapsed.
Fixed ratio: Reinforcement is delivered based on the same number of responses.
Schedule of reinforcement: A pattern or timetable that determines when a behavior will be reinforced and when it will not.
Variable interval: Reinforcement is delivered based on a varying amount of time elapsed.
Variable ratio: Reinforcement is delivered based on a changing number of responses.