Operant Conditioning: Reinforcement and Punishment

Author: Sophia

what's covered

This tutorial expands upon Skinner's original ideas about operant conditioning, focusing on reinforcement and punishment. You will also reflect on using your self and social awareness skill when it comes to reinforcement. You will learn about:

1. Positive and Negative Reinforcement
2. Punishment
- 2a. Issues with Punishment
3. Reinforcers
- 3a. Primary Reinforcers
- 3b. Secondary Reinforcers
4. Schedules of Reinforcement
- 4a. Continuous Reinforcement
- 4b. Partial Reinforcement
5. Types of Partial Reinforcement
- 5a. Fixed Ratio
- 5b. Variable Ratio
- 5c. Fixed Interval
- 5d. Variable Interval

1. Positive and Negative Reinforcement

You may recall that operant conditioning is when learning occurs through the association of consequences to behaviors. Operant conditioning is concerned with what happens after and as a result of a person's actions, which makes it more or less likely for them to repeat that action later.

Reinforcement is anything that follows a behavior and makes it more likely for a response to be repeated.

Reinforcement is grouped into two different categories: positive reinforcement and negative reinforcement. It is important not to think of “positive” and “negative” as good and bad. Rather, these terms refer to the effects or the consequences that change behaviors; thus, one isn't better than the other.

Positive reinforcement is anything that is presented or given to the subject that makes it more likely for the response to be repeated.

EXAMPLE

Examples of positive reinforcement would include such things as candy, stickers, a high five, or even verbal praise, given to a child after they do a good job on a math test, for instance. In turn, the child is more likely to perform well in math later, because they received that positive reinforcement.

In contrast, negative reinforcement is anything that is taken away from a subject that makes it more likely for a response to be repeated. Remember, reinforcement is all about making it more likely for something to occur; therefore, negative reinforcement is generally taking away something that is unpleasant to compel someone to want to perform an action more.

EXAMPLE

Suppose there is an annoying sound in your room, and it can be turned off by pushing a button. The negative reinforcement is the removal of the annoying sound, which makes you more likely to push the button the next time the sound occurs. Similarly, you may perform a chore so that your roommate stops nagging you. In either case, taking away something unpleasant makes you more likely to do what you're supposed to be doing.

terms to know

Positive Reinforcement

Any consequence that is given to the subject that increases the likelihood of repeating the behavior.

Negative Reinforcement

Any consequence that is taken from the subject that increases the likelihood of repeating the behavior.

2. Punishment

Punishment follows a response and makes it less likely for a behavior to be repeated.

In other words, punishment is an attempt to suppress a behavior. Much research, particularly in the fields of childhood development and education, has examined how effective punishment is.

Like reinforcement, punishment can be positive or negative:

Positive punishment is something that you give to a person that makes it less likely for the behavior to occur, such as a loud noise or an electric shock.
Negative punishment is something you take away that makes it less likely for the behavior to occur, like taking away a toy after a child does something bad. This might make it less likely for the child to repeat that bad behavior.

Punishment and negative reinforcement both involve something aversive: an event that is painful or uncomfortable, that somebody doesn't like.

The difference is in what happens to the behavior. In negative reinforcement, the unpleasant thing is taken away after the behavior, which makes the behavior more likely to be repeated. In punishment, the consequence makes the behavior less likely to be repeated. Both concepts deal with unpleasant things, but the results are different.

term to know

Punishment

A consequence that decreases the likelihood of repeat behavior.
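
The four combinations described so far (positive or negative, reinforcement or punishment) can be laid out as a small lookup. The following is only an illustrative sketch; the function name and the string labels are hypothetical and are not part of the tutorial.

```python
def classify_consequence(stimulus_change: str, behavior_effect: str) -> str:
    """Name the operant-conditioning category of a consequence.

    stimulus_change: "added" or "removed" (hypothetical labels)
    behavior_effect: "increases" or "decreases" (hypothetical labels)
    """
    # "Positive" means something is given; "negative" means something is taken away.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement makes the behavior more likely; punishment makes it less likely.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_effect]
    return f"{sign} {kind}"


# Praise after a good math test: something is given, the behavior increases.
print(classify_consequence("added", "increases"))
# The annoying sound stops when the button is pushed: something is taken away,
# the behavior increases.
print(classify_consequence("removed", "increases"))
# A toy is taken away after bad behavior: something is taken away,
# the behavior decreases.
print(classify_consequence("removed", "decreases"))
```

Note that "positive" and "negative" only track whether something is given or taken away, while the reinforcement/punishment label depends entirely on what happens to the behavior.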

2a. Issues with Punishment

Punishment can be more powerful than reinforcement, especially in the short term. It is possible to change somebody's behavior rather quickly as a result of punishment, but it can have some negative or unwanted results.

For instance, when you're using punishment, you don't teach a person to be positive, or to learn prosocial kinds of behaviors. Generally, you're simply teaching them either escape learning or avoidance learning, which are ways to end an aversive stimulus quickly.

Escape learning is learning to respond in a way that ends an aversive stimulus quickly, by trying to get away from the aversive stimulus. Avoidance learning, similarly, is learning to respond in a way that postpones or prevents an aversive stimulus from happening.

Instead of learning to deal with problems, like bullying, for example, a child instead learns to escape or avoid. This, in turn, rewards them with some relief so that they feel better about the situation. This makes them more likely to escape or avoid again. Therefore, punishment isn't necessarily teaching somebody in the best possible way.

There are a few other notable issues or problems with punishment:

Punishment tends to increase aggression in people and makes them more likely to react violently towards others. Because punishment deals with aversive consequences, or things a person doesn't like, people tend to react much more strongly. Thus, punishment needs to be used carefully. In particular, it should be proportionate to the behavior it is responding to, not disproportionately harsh.
Punishment needs to follow the behavior immediately so that there's a short amount of time in between the two events.
Punishment needs to be followed or combined with reinforcement so that there is a positive aspect that goes along with the child's learning and not constant punishment.

terms to know

Aversive Stimulus

Unpleasant event in the environment.

Escape Learning

Learning whose goal is to get away from the stimulus as soon as possible.

Avoidance Learning

Learning whose goal is to avoid the stimulus.

3. Reinforcers

Reinforcers can be used in more abstract and complex ways to explain how all human behavior occurs. This tutorial will explain how some of those higher-level reinforcers help to create a wide range of human behaviors.

There are different levels of reinforcers, which become gradually more abstract and more complex in the way that they are applied. This tutorial will focus mainly on positive reinforcers as examples, but keep in mind that these concepts also apply to negative reinforcers.

3a. Primary Reinforcers

Primary reinforcers are basic types of reinforcers that are rewarding and desirable in and of themselves.

These are things that people don't have to learn to like, but rather things that people typically innately like.

Primary reinforcers are generally very basic things related to biological needs, such as:

Food
Water
Sex

EXAMPLE

If somebody offers you a glass of water on a hot day, you don’t have to wonder if you need it. You’re programmed biologically to need hydration.

term to know

Primary Reinforcer

A reward that fulfills a biological need/desire.

3b. Secondary Reinforcers

Secondary reinforcers are reinforcers that people have to learn to value. These are things that are rewarding and desirable generally because they're related to a primary reinforcer. They're not necessarily something that, as a child, you would automatically know that you should like and want.

There are different categories of secondary reinforcers:

Tokens: Things that are not valuable in and of themselves, but can be used by a person to enable them to get primary reinforcers.

EXAMPLE
Money is a secondary reinforcer because it has little value on its own; it's just paper. However, you know that you can use it to buy food or other things that you want.

Social reinforcers: This is reinforcement from other people, like praise or attention or affection.

EXAMPLE
Children tend to associate attention with biological primary reinforcers like food or physical contact. However, it's not necessarily something that they innately know they should like or want.

Feedback: Feedback is any information that is given to a person about the results of their behavior. This isn't necessarily rewarding in and of itself, but it can function as a secondary reinforcer.

EXAMPLE
If you’re playing a video game, elements like the background music or the flashing colors let you know exactly how you’re performing in the game. Thus, this feedback improves the likelihood of you modifying your behavior to do what you're supposed to be doing.

It is important to note that secondary reinforcers are not less powerful than primary reinforcers; the desire for things like money or social attention can oftentimes be more powerful.

Self and Social Awareness: Skill Reflect

Think about the reinforcers that motivate you at work. Maybe it’s praise from a colleague or a smile and nod from your boss.

terms to know

Secondary Reinforcer

Reward that the subject has learned has value to them.

Feedback

Information offered to the subject regarding the results of a behavior.

4. Schedules of Reinforcement

When and how a behavior is rewarded can determine whether it is learned as well as the strength of that learning. A schedule of reinforcement is a plan that determines when a behavior will or will not be reinforced.

There are several different kinds of schedules of reinforcement under operant conditioning. They fall under two basic categories:

Continuous reinforcement.
Partial reinforcement.

term to know

Schedule of Reinforcement

A pattern or timetable that determines when a behavior will be reinforced and when it will not.

4a. Continuous Reinforcement

The first category is continuous reinforcement. This means giving reinforcement every time a behavior is performed.

EXAMPLE

Every single time a child performs a behavior you're hoping for, you give them a piece of candy.

There are some pros and cons of this type of schedule:

It is particularly useful when a behavior is first being learned.
However, it is not necessarily realistic in the long term because there likely is not an unlimited supply of the reward or reinforcement.
Extrinsic rewards aren't necessarily continually rewarding in the same way. Eventually, for example, a child might get tired of the candy or literally sick because of it.

4b. Partial Reinforcement

The second category is partial reinforcement, in which a behavior is reinforced only some of the times it is performed. Partial reinforcement tends to produce more resistance to extinction. This means that the learning is less likely to go away, in part because the subject is less likely to get tired of or satiated by the reinforcer.

5. Types of Partial Reinforcement

There are four basic types of partial reinforcement that are used:

Fixed ratio
Variable ratio
Fixed interval
Variable interval

5a. Fixed Ratio

The first kind of partial reinforcement is fixed ratio reinforcement or FR. In FR, reinforcement is given after a specific, predetermined number of correct behaviors.

EXAMPLE

Every third or fourth time a child performs the behavior you want, you give them a reward.

This generally leads to some very quick, consistent responses; it is easy for the learner to catch on.

EXAMPLE

A rat that is placed within a box with a button that dispenses a treat will press the button many times, one after the other. It will continually press the button because it realizes that after a certain number of times, it will get the reward.

term to know

Fixed Ratio (FR)

Reinforcement is delivered based on the same number of responses.
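
As a rough illustration of the fixed ratio rule, the sketch below marks which responses in a sequence would be reinforced. The function name and the parameters `num_responses` and `ratio` are hypothetical, and the sketch assumes every response counts as a correct behavior.

```python
from typing import List


def fixed_ratio_schedule(num_responses: int, ratio: int) -> List[bool]:
    """For each response in order, report whether it is reinforced.

    Under a fixed ratio schedule, every `ratio`-th response earns the reward.
    """
    return [n % ratio == 0 for n in range(1, num_responses + 1)]


# Reward every third response, as in the example with the child.
print(fixed_ratio_schedule(num_responses=9, ratio=3))
# [False, False, True, False, False, True, False, False, True]
```

Setting `ratio=1` in this sketch rewards every response, which corresponds to continuous reinforcement.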

5b. Variable Ratio

In contrast, the second type of partial reinforcement is the variable ratio (VR) schedule of reinforcement. In VR, reinforcement is given after a variable, unpredictable number of correct behaviors.

Usually, the number of responses required varies within a range, such as anywhere from two to five, around some average. Each time the subject performs the behavior, they won't know whether or not they will receive the reward. This generally leads to very high, consistent levels of responses.

hint

Think of a variable ratio schedule of reinforcement like gambling with a slot machine, where you're very likely to continue playing the slot machine because you don't know when it's going to pay off.

term to know

Variable Ratio (VR)

Reinforcement is delivered based on a changing number of responses.
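
A variable ratio schedule can be sketched by drawing a new, random response requirement after each reward. This is a minimal sketch under stated assumptions: the function name, the `low` and `high` bounds, and the use of a uniform random draw are all hypothetical choices made for illustration.

```python
import random
from typing import List, Optional


def variable_ratio_schedule(num_responses: int, low: int, high: int,
                            seed: Optional[int] = None) -> List[bool]:
    """For each response in order, report whether it is reinforced.

    After every reward, the number of responses needed for the next reward
    is redrawn at random between `low` and `high` (inclusive).
    """
    rng = random.Random(seed)
    needed = rng.randint(low, high)  # responses required for the next reward
    count = 0                        # responses made since the last reward
    outcomes = []
    for _ in range(num_responses):
        count += 1
        if count >= needed:
            outcomes.append(True)
            count = 0
            needed = rng.randint(low, high)
        else:
            outcomes.append(False)
    return outcomes


# Rewards arrive after anywhere from two to five responses.
print(variable_ratio_schedule(num_responses=15, low=2, high=5, seed=1))
```

From the subject's point of view, any given response might be the one that pays off, which is the same uncertainty the slot machine hint describes.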

5c. Fixed Interval

In addition to ratios, there are also intervals. A fixed interval (FI) is when a reinforcer is given after a specific amount of time has passed since the last reinforcement.

EXAMPLE

A subject receives a reward for the first desired behavior performed after 30 seconds have passed since the last reward.

This leads to some very slow responses from the subject. Generally, there will be more response towards the end of the interval, because the subject begins to predict exactly when they can get the reward. Thus, they only perform the behavior when they know they're going to receive reinforcement.

EXAMPLE

One example of FI might be getting paid a salary at the end of every week. The fit is imperfect, however: the FI pattern predicts more effort just before payday and less just after it, but a salary usually doesn't depend on a response made at the end of the interval, so you don't necessarily work harder at the end of the week.

term to know

Fixed Interval (FI)

Reinforcement is delivered based on the same amount of time elapsed.
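
The fixed interval rule can be sketched in terms of response times: a response is rewarded only if the full interval has passed since the last reward. The function name, the `response_times` list, and the choice to start the clock at time 0 are hypothetical assumptions made for this illustration.

```python
from typing import List


def fixed_interval_schedule(response_times: List[float],
                            interval: float) -> List[bool]:
    """For each response time (in seconds), report whether it is reinforced.

    A response is rewarded only when at least `interval` seconds have
    passed since the last reward.
    """
    last_reward = 0.0  # assumption: the clock starts at time 0
    outcomes = []
    for t in response_times:
        if t - last_reward >= interval:
            outcomes.append(True)
            last_reward = t
        else:
            outcomes.append(False)
    return outcomes


times = [5, 20, 31, 40, 59, 62, 95]
print(fixed_interval_schedule(times, interval=30))
# [False, False, True, False, False, True, True]
```

Responses made early in the interval (at 5, 20, 40, and 59 seconds here) earn nothing, which is consistent with subjects concentrating their responses toward the end of the interval.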

5d. Variable Interval

A variable interval (VI) schedule of reinforcement is when a reinforcer is given after an unpredictable, variable amount of time has passed since the last reinforcement. It might be every 30 seconds on average, but the length of any single interval varies, so the subject cannot predict it.

This generally leads to some very slow, but also very steady responses that don't tend to go away very easily, because the subject doesn't realize when they're going to get the reward. Therefore, they have to produce the behavior at regular intervals because they need to be ready to receive the reinforcement that they desire.

term to know

Variable Interval (VI)

Reinforcement is delivered based on a varying amount of time elapsed.
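
A variable interval schedule can be sketched the same way as a fixed interval one, except that the waiting time is redrawn after each reward. The function name, the `mean_interval` parameter, the uniform draw between half and one and a half times the mean, and the clock starting at time 0 are all hypothetical assumptions, not details from the tutorial.

```python
import random
from typing import List, Optional


def variable_interval_schedule(response_times: List[float],
                               mean_interval: float,
                               seed: Optional[int] = None) -> List[bool]:
    """For each response time (in seconds), report whether it is reinforced.

    After every reward, the required waiting time is redrawn at random
    around `mean_interval`, so the subject cannot predict it.
    """
    rng = random.Random(seed)

    def draw_wait() -> float:
        # Assumption: waits are spread evenly from 0.5x to 1.5x the mean.
        return rng.uniform(0.5 * mean_interval, 1.5 * mean_interval)

    last_reward = 0.0  # assumption: the clock starts at time 0
    wait = draw_wait()
    outcomes = []
    for t in response_times:
        if t - last_reward >= wait:
            outcomes.append(True)
            last_reward = t
            wait = draw_wait()
        else:
            outcomes.append(False)
    return outcomes


# A subject that responds steadily once every 10 seconds for 5 minutes.
steady_responses = list(range(10, 301, 10))
print(variable_interval_schedule(steady_responses, mean_interval=30, seed=2))
```

Because the wait cannot be predicted in this sketch, responding steadily is the only way to collect each reward soon after it becomes available.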

summary

Operant conditioning deals with reinforcement and punishment, both of which can be positive or negative. Positive and negative do not mean good and bad, but rather refer to adding or taking away stimuli. Positive and negative reinforcement both make a behavior more likely to be repeated, while punishment is about reducing or suppressing behaviors.

There are potential issues with punishment, however, in that it can teach escape learning and avoidance learning, instead of teaching a person to be positive or to learn pro-social kinds of behaviors.

Reinforcers are part of operant conditioning. They refer to specific experiences that follow a behavior, making that behavior more likely to be repeated. There are positive reinforcers and negative reinforcers, each with specific aspects. These reinforcers can be specific or abstract but they are always reinforcing behavior and making it more likely for that behavior to reoccur. A primary reinforcer is a reward that fulfills a biological need or desire, whereas a secondary reinforcer is a reward that one must learn has value.

There are two categories of schedules of reinforcement, which are plans that determine when a behavior will or will not be reinforced: continuous and partial. There are four basic types of partial reinforcement: fixed ratio (FR), variable ratio (VR), fixed interval (FI), and variable interval (VI). Your self and social awareness skill can be used to recognize which reinforcement is meaningful to you and which is not.

In fixed ratio (FR), a reinforcement is given after a specific, predetermined number of correct behaviors. In variable ratio (VR) schedules of reinforcement, a reinforcement is given after a variable, unpredictable number of correct behaviors. A fixed interval (FI) refers to when a reinforcer is given after a specific amount of time has passed since the last reinforcement, while a variable interval (VI) schedule of reinforcement is when a reinforcer is given after an unpredictable, variable amount of time has passed since the last reinforcement. Each schedule of reinforcement typically has varying levels of success.

Good luck!

Source: THIS TUTORIAL WAS AUTHORED BY SOPHIA LEARNING. PLEASE SEE OUR TERMS OF USE.

Terms to Know

Aversive Stimulus: Unpleasant event in the environment.
Avoidance Learning: Learning whose goal is to avoid the stimulus.
Escape Learning: Learning whose goal is to get away from the stimulus as soon as possible.
Feedback: Information offered to the subject regarding the results of a behavior.
Fixed Interval (FI): Reinforcement is delivered based on the same amount of time elapsed.
Fixed Ratio (FR): Reinforcement is delivered based on the same number of responses.
Negative Reinforcement: Any consequence that is taken from the subject that increases the likelihood of repeating the behavior.
Positive Reinforcement: Any consequence that is given to the subject that increases the likelihood of repeating the behavior.
Primary Reinforcer: A reward that fulfills a biological need/desire.
Punishment: A consequence that decreases the likelihood of repeat behavior.
Schedule of Reinforcement: A pattern or timetable that determines when a behavior will be reinforced and when it will not.
Secondary Reinforcer: Reward that the subject has learned has value to them.
Variable Interval (VI): Reinforcement is delivered based on a varying amount of time elapsed.
Variable Ratio (VR): Reinforcement is delivered based on a changing number of responses.
