Negative Reinforcement And The Curse Of Sisyphus by Tyler Muto, tylermuto.com
Sisyphus was the King of Ephyra, and he had a reputation for defying the Gods and being a bit of a trickster. One of his best-known exploits came at the end of his life when Hades, the God of the Underworld, came to claim him, bringing along a pair of handcuffs. Sisyphus, in all his cunning and mischief, managed to persuade Hades to demonstrate the handcuffs on himself. Sisyphus further took advantage of this turn of events by locking the handcuffed Hades in his closet.
Eventually Sisyphus’ shenanigans caught up with him and he was brought to the underworld to receive his eternal punishment. For all his transgressions, he was condemned to an eternity of rolling a massive boulder up a hill. What made this especially torturous was not that the hill was infinitely tall; in fact by exerting all his strength Sisyphus was able to reach the top. However, the moment he reached the peak and was ready to rest and rejoice in his accomplishments, the darn boulder rolled right back down to the bottom. Sisyphus, tired and frustrated, had to start the process all over again. And on it went for eternity….
Now, for lack of a clever segue, I’m going to abruptly shift gears. But don’t let the tale of King Sisyphus slip too far from your mind.
Negative reinforcement is one of the most widely used and versatile aspects of how animals learn. Technically speaking, negative reinforcement refers to the removal of a stimulus (generally an unpleasant one) in order to encourage or strengthen a behavior. In dog training, negative reinforcement refers to the dog learning to turn off (or escape) an unpleasant sensation, and later learning to avoid the unpleasant sensation altogether by responding to a specific cue.
Used properly, negative reinforcement can strengthen and solidify your dog’s response to known commands, and make that response far more reliable and resistant to extinction. The key, however, is to learn to use negative reinforcement properly. An incorrect understanding of negative reinforcement can make training stressful for the dog. At best, using negative reinforcement incorrectly can simply slow down your training progress and limit the overall reliability of the results.
While there are many mistakes that are commonly made when it comes to the use of negative reinforcement (which I will refer to as R-), I would like to use the story of King Sisyphus to illustrate one of the most common ones: during the initial conditioning, or instructional, phases of training, when the dog is learning how its actions can control the stimulus (or pressure), the dog has no sooner completed the task asked of it than it is released and/or given another command, and has to escape the pressure all over again.
To illustrate by way of example, let’s take the early stage of remote collar conditioning where the dog learns to go to his bed in response to the stimulation*. The trainer presses the button on the transmitter on a low setting (only a mild tickle or annoyance to the dog), and then guides the dog to his bed. As the dog goes to his bed, the trainer releases the button and the dog is praised and rewarded. Then, after only a brief moment, the dog is released and the exercise is started again (the trainer presses the button, guides the dog, etc.).
What we must remember is that it is the cessation of the collar pressure that is reinforcing to the dog. In order to really take advantage of this reinforcement, the dog needs a moment to enjoy his accomplishment and the sense of relief and relaxation that comes with it. In other words, when the dog successfully removes the stimulation, give them a minute to savor it.
When we drill our dogs with a rapid succession of commands during R- training, we are essentially giving our dogs the same fate as Sisyphus. However, training should be a fun and enjoyable experience for the dog. The “curse of Sisyphus” erodes the value of the reinforcement, thus eroding the dog’s desire to work with us, causing them undue frustration, and slowing down our progress.
Don’t give the dog the curse of Sisyphus.
Moreover, the more motivating the stimulus or pressure is, the more important it is to give the dog this extra bit of time.
After all, if Sisyphus had been given a chance to sit down and catch his breath between boulder rolls, perhaps an extended break at lunch for a panini and a glass of wine, and two solid days off on the weekend, maybe his fate wouldn’t have been so torturous (heck, it’s just a solid day’s work!).
In addition to potentially causing undue stress during training, we may also be missing out on one of the potential benefits of negative reinforcement training.
For those with just a casual interest in training, you can probably stop here. For dog nerds like myself, you may want to read on; I’m going to get all sciency for a moment.
As stated earlier, negative reinforcement training ultimately has two components. First, the dog must learn to turn off, or “escape”, the pressure when they feel it. Second, they learn to avoid it altogether by responding to a predictive cue (i.e. our command). One of the unique and desirable qualities of this latter avoidance learning is that once the dog learns how to avoid the pressure, they continue to do so for many repetitions without needing to be exposed to the pressure again. In fact, done properly, this type of learning is one of the most resistant to extinction.
Early researchers postulated that what was maintaining the dog’s response in the absence of actual pressure was a classically conditioned fear response when the cue is given. This seems to make sense. The dog hears a command, and responds out of fear of the consequence for not responding. The problem was that the evidence simply did not support this theory. Dogs wear their emotions on their sleeves, and they are terrible liars. What researchers observed was that when dogs were properly conditioned through negative reinforcement and avoidance learning, not only did they respond reliably, but they did so with very happy and relaxed dispositions.
More research and a new theory were needed to explain this phenomenon. Along came the safety signal hypothesis. Several researchers (see M.R. Denny, R.G. Weisman and J.S. Litner, and D.F. Tortora) recognized that after the removal of pressure, the dogs experienced a sense of relief and relaxation. Further, as the dogs learned to successfully avoid pressure, any potential unpleasant emotions faded quickly, but the sense of relief and relaxation remained. Thus it is the pleasant emotions of relief and relaxation that act as reinforcement, and account for the dog’s disposition and the continued maintenance of the desired behavior.
In fact, M.R. Denny noted that the experience of relief occurs 3-5 seconds after the cessation of pressure, and lasts for 10-15 seconds, whereas relaxation requires approximately 2-5 minutes to produce full benefits**. He also noted that the effects appear to double when the dog experiences both relief and relaxation as opposed to just relief by itself.
In other words, if you give at least 2-15 seconds between reps, the dog experiences some reinforcement, but if you give a full 2-5 minutes, the experience of reinforcement can effectively double.
What this means is that by giving ample time between repetitions during escape/avoidance training, not only are you avoiding giving your dog the curse of eternal damnation (a bit of an exaggeration I know), but you are doubling the pleasurable aspects of the training.
We can take advantage of this extra time. Research has shown that we can condition other signals to be associated with this sense of relaxation. Thus praising and interacting with the dog during this time can increase the value of your praise and help establish your interaction as a source of safety and comfort. The latter is immensely valuable for professional trainers who are regularly working with dogs with whom they are relatively unfamiliar.
Lastly, remember that this principle doesn’t only apply to leashes and collars. For instance, in the rehabilitation of dogs with social anxieties we are often working on how to relieve social pressures in appropriate ways. Taking a bit of extra time between exposures can help to amplify your results. The same applies to exposure to other forms of fear, phobia and anxiety as well.
Training with any kind of pressure is a responsibility, not a right. If you are going to do it, every effort should be made to do it well. Avoiding the curse of Sisyphus is just one of the many ways you can ensure you get the most out of your training.
*I recommend training dogs initially with the use of positive reinforcement techniques, and utilizing the electronic or remote collar only to solidify and reinforce the previously established training.
** Denny specifies that relief involves a strong autonomic factor, whereas relaxation involves striated muscles and various motoric components.
Denny M.R. (1976). Post aversive relief and relaxation and their implications for behavior therapy. J Behav Ther Exp Psychiatry, 7: 315-321.
Denny M.R. (1983). Safety catch in behavior therapy: Comments on “safety” training: The elimination of avoidance motivated aggression in dogs. J Exp Psychol Gen, 112: 215-217.
Lindsay S.R. (2000). The handbook of applied dog behavior and training. Vol 1, 295-296.
Tortora D.F. (1983). Safety training: The elimination of avoidance motivated aggression in dogs. J Exp Psychol Gen, 112: 176-214.
Weisman R.G. and Litner J.S. (1969). Positive conditioned reinforcement of Sidman avoidance in rats. J Comp Physiol Psychol, 68: 597-603.