An Interview With Skinner
by Dan Platt
Dan:  We are lucky today to have with us Dr. B.F. Skinner, renowned psychologist and behaviorist. He was born in Susquehanna, Pennsylvania, in 1904, and holds a doctorate in psychology from Harvard University, where he also taught for over thirty years.  He is considered one of the leading behaviorists of the twentieth century, a proponent of operant conditioning, and the inventor of the Skinner box, a device used to facilitate experimental observation. Some of his major works include The Behavior of Organisms, Verbal Behavior, Walden Two, and Beyond Freedom and Dignity.   Dr. Skinner, how would you summarize operant conditioning theory?
Skinner:  Operant conditioning is based upon the idea that learning is a function of change in overt behavior. Changes in behavior are the result of an individual's response to events, stimuli if you will, that occur in the environment. A response produces a consequence such as defining a word, hitting a ball, or solving a math problem. When a particular Stimulus-Response pattern is reinforced, or in simpler terms rewarded, the individual is conditioned to respond.
    Reinforcement is the key element in my theory.  A reinforcer, whether positive or negative, is anything that strengthens the desired response. Positive reinforcers are stimuli that increase the frequency of a response when applied, whereas negative reinforcers are stimuli that increase the frequency of a response when withdrawn.  Negative reinforcers should not be confused with aversive stimuli, or punishment, which reduce responding but do not necessarily change behavior. 
Dan:  What are some of the basic assumptions of your theory?
Skinner:  The most basic assumption is that learning can be defined as a relatively permanent change in behavior brought about as a result of experience or practice.  Experience and practice environments impact overt behavior, the psychomotor domain, in achieving the objectives established in those environments.  Along with this assumption, one must assume, for the most part, that the mind is a black box we cannot see into. The only way to know what is going on in the mind is to look at overt behavior. Thus, it is through operant conditioning that voluntary responses are learned. 
Dan: Could you elaborate on operant conditioning response and stimulus?
Skinner:  Where classical conditioning illustrates stimulus-to-response learning, operant conditioning is more response-to-stimulus learning, since it is the consequence that follows the response that influences whether the response is likely or unlikely to occur again.  While the antecedent stimulus in operant conditioning does not elicit or cause the response, it can influence it. When the antecedent does influence the likelihood of a response occurring, it is technically called a discriminative stimulus.  It is the appropriate use of positive and negative stimuli following a voluntary response that changes the probability of whether the response will occur again.  In order to change that probability, stimuli can be added to (a positive stimulus) or taken away from (a negative stimulus) the environment.
Dan:  Could you elaborate on the stimuli used to continue or discontinue a response?
Skinner:  The two basic premises of operant conditioning are that positive and negative reinforcement strengthen behavior, and that punishment, response cost, and extinction weaken behavior.  The term reinforcement always indicates a process that strengthens a behavior, which is true of both positive and negative reinforcement.  The difference between positive and negative reinforcement lies not in the word reinforcement but in the words positive and negative. 
    The word positive has two cues associated with it. First, a positive or pleasant stimulus is used in the process, and second, the reinforcer is added.  In positive reinforcement, then, a positive reinforcer is added after a response and increases the frequency of the response.  This is unlike negative reinforcement where first a negative or aversive stimulus is used in the process, and second, the reinforcer is subtracted. In negative reinforcement, after the response the negative reinforcer is removed, increasing the frequency of the response.  
    By understanding positive reinforcement, one can also understand response cost.  For if positive reinforcement strengthens a response by adding a positive stimulus, then response cost has to weaken a behavior by subtracting a positive stimulus. In response cost, after the response, the positive reinforcer is removed which weakens the frequency of the response. 
    By understanding negative reinforcement, one can also understand punishment.  For if negative reinforcement strengthens a behavior by subtracting a negative stimulus, then punishment has to weaken a behavior by adding a negative stimulus.  In punishment, after a response, a negative or aversive stimulus is added which weakens the frequency of the response.
    Extinction is simply refusing to reinforce a behavior.  It is neither a negative nor a positive reinforcer.  The concept behind extinction is that by no longer reinforcing a previously reinforced response, the frequency of the response weakens. 
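The four consequence procedures Skinner distinguishes here form a two-by-two table (pleasant vs. aversive stimulus, added vs. removed), with extinction standing apart as the absence of any consequence. A minimal sketch of that taxonomy follows; the function and label names are my own illustrative choices, not Skinner's terminology.

```python
# Illustrative sketch (names are hypothetical, not Skinner's): map the kind
# of stimulus and the operation performed on it to the named procedure and
# its expected effect on the frequency of the response.

def classify_consequence(stimulus, operation):
    """Return (procedure name, effect on response frequency).

    stimulus:  "positive" (pleasant) or "negative" (aversive)
    operation: "added" or "removed" after the response occurs
    """
    table = {
        ("positive", "added"):   ("positive reinforcement", "strengthens"),
        ("positive", "removed"): ("response cost",          "weakens"),
        ("negative", "added"):   ("punishment",             "weakens"),
        ("negative", "removed"): ("negative reinforcement", "strengthens"),
    }
    return table[(stimulus, operation)]

def extinction():
    # No stimulus is added or removed; a previously reinforced response is
    # simply no longer reinforced, so its frequency weakens over time.
    return ("extinction", "weakens")
```

Note that both reinforcement cells strengthen behavior, which is Skinner's point that "reinforcement" always strengthens; only the positive/negative half of the name tells you whether a stimulus was added or taken away.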
Dan:  Can you expand on continuous and intermittent reinforcement? 
Skinner: Yes.  Stimuli are presented in the environment on two basic schedules: continuous and intermittent. Continuous reinforcement simply means that the behavior is followed by a consequence each time it occurs. Intermittent schedules are based either on the passage of time, called interval schedules, or on the number of correct responses emitted, called ratio schedules. The consequence can be delivered after the same amount of time or the same number of correct responses each time, a fixed consequence, or after an amount of time or number of correct responses that varies around a particular number, a variable consequence. This yields four classes of intermittent schedules:  1) fixed interval, reinforcing the first correct response after a set amount of time (always the same) has passed; 2) variable interval, reinforcing the first correct response after a set amount of time has passed, with a new time period (shorter or longer) set after each reinforcement so that the periods average a specific number over the sum of trials; 3) fixed ratio, giving a reinforcer after a specified number of correct responses, used primarily for learning a new behavior; and 4) variable ratio, giving a reinforcer after a set number of correct responses, with the number required changing after each reinforcement, used primarily for maintaining behavior.  The number of responses per time period increases as the schedule of reinforcement is changed from fixed interval to variable interval and from fixed ratio to variable ratio. Variable interval and, especially, variable ratio schedules produce steadier and more persistent rates of response because learners cannot predict when reinforcement will come, although they know they will eventually succeed.
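The four intermittent schedules can be sketched as simple decision rules. The following is an illustrative simulation only, assuming a discrete stream of responses and an externally supplied clock; the function names and parameters are my own, not part of the interview.

```python
import random

# Hypothetical sketch of the four intermittent schedules. Each factory
# returns a schedule function that is called once per response and reports
# whether that response earns reinforcement.

def fixed_ratio(n):
    """Reinforce every n-th correct response."""
    count = 0
    def schedule(correct):
        nonlocal count
        if correct:
            count += 1
            if count == n:
                count = 0
                return True
        return False
    return schedule

def variable_ratio(mean, rng=random):
    """Reinforce after a number of correct responses varying around `mean`."""
    target = rng.randint(1, 2 * mean - 1)
    count = 0
    def schedule(correct):
        nonlocal count, target
        if correct:
            count += 1
            if count >= target:
                count = 0
                target = rng.randint(1, 2 * mean - 1)  # new requirement each time
                return True
        return False
    return schedule

def fixed_interval(period):
    """Reinforce the first correct response after `period` time units have
    elapsed since the last reinforcement."""
    last = 0.0
    def schedule(correct, now):
        nonlocal last
        if correct and now - last >= period:
            last = now
            return True
        return False
    return schedule

def variable_interval(mean_period, rng=random):
    """Like fixed_interval, but the required wait varies around `mean_period`."""
    period = rng.uniform(0.5 * mean_period, 1.5 * mean_period)
    last = 0.0
    def schedule(correct, now):
        nonlocal last, period
        if correct and now - last >= period:
            last = now
            period = rng.uniform(0.5 * mean_period, 1.5 * mean_period)
            return True
        return False
    return schedule
```

On a fixed-ratio-3 schedule, for instance, every third correct response is reinforced, while under the variable schedules the requirement shifts after each reinforcement, which is why responding under them is harder to predict and, as Skinner notes, more persistent.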
Dan:  What are some of the implications you see as far as instruction is concerned?
Skinner:  Our knowledge of operant conditioning has greatly influenced educational practices. Children at all ages exhibit behavior. Teachers and parents are, by definition, behavior modifiers. If a child is behaviorally the same at the end of the academic year, the teacher has not done his or her job; children are supposed to learn, producing relatively permanent changes in behavior or behavior potential as a result of the experiences they have in the school and classroom setting.
    Students understand that careful studying is reinforced by good grades.  Yet many students need to overcome their dislike of studying to attain the grades they desire.  Negative reinforcement can be used by allowing escape from painful or undesirable situations, such as excusing students from writing a final because of good term work. Poor behavior can be modified through extinction, on the premise that behavior that is not reinforced is not likely to be repeated.  Misbehavior is usually attention seeking; thus, by ignoring the misbehavior, the teacher should extinguish it. With regard to punishment, responses that bring painful or undesirable consequences will be suppressed, but may reappear if reinforcement contingencies change.  Thus, penalizing late students by withdrawing privileges should stop their lateness, but only positively reinforcing the desired behavior will ensure lasting change.
    For example, to teach a child to act in a manner in which he has seldom or never behaved before, reward successive steps toward the final behavior.  To develop a new behavior the child has not previously exhibited, arrange for an immediate reward after each correct performance.  To increase a child's performance in a particular way, arrange for him to escape or avoid a mildly aversive situation by improving his behavior or by behaving appropriately.  To encourage a child to continue performing an established behavior with few or no rewards, gradually require a longer time period or more correct responses before a correct behavior is rewarded.  To improve or increase a child's performance of a certain activity, provide the child with an intermittent reward.  To stop a child from acting in a particular way, arrange conditions so that he receives no reward following the undesired act.  If extinction does not work, deliver aversive stimuli immediately after the action occurs; but since punishment results in increased hostility and aggression, it should be used only infrequently and in conjunction with reinforcement.  To help a child overcome his fear of a particular situation, gradually increase his exposure to the feared situation while he is otherwise comfortable, relaxed, secure, or rewarded.
Dan:  How do you feel operant conditioning has and will influence technology?
Skinner:  As you well know, I am an advocate of teaching machines and programmed learning. Some of the technology influenced by operant conditioning in the past includes the multiple-choice machine; chemo sheets, in which the learner checked answers with a chemical-dipped swab; phase checks, constructed in the 1940s and 1950s, which taught and tested skills such as the disassembly and assembly of equipment; and a programming style for the U.S. Air Force in the 1950s that trained troubleshooters to find malfunctions in electronic equipment.  Today operant conditioning is widely applied in clinical settings, behavior modification, teaching, classroom management, instructional development, and programmed instruction.  It is imperative that the instructional designer understand that tasks requiring a low degree of processing seem to be facilitated by strategies most frequently associated with operant conditioning.  It is also imperative that the designer's supervisor understand the importance of reinforcing desired behavior.
Dan: What do you feel is the major criticism of operant conditioning with regard to technology?
Skinner:  The major criticism is the inability to ensure that the appropriate stimulus follows a response.  For example, in a situation where the stimulus for the correct response does not occur, a response may not be given, and behavior modification, or even increased knowledge, cannot take place.  In addition, a behavior conditioned to respond to a certain stimulus will stop when an anomaly occurs, because the anomaly is interpreted as punishment.  In instructional design, for example, the designer cannot be assured that the program being developed will not cause an aversive reaction in the student using it.  When desiring to increase the abilities or knowledge of a user, one must understand the importance of pairing the appropriate stimulus with the desired response.  This cannot be assured when the one controlling the environment is not present.
Dan:  What do you see as the future of technology in relation to operant conditioning?
Skinner:  Operant conditioning has been and will remain instrumental to the development and use of technology.  The systems approach developed out of the 1950s' and 1960s' focus on language laboratories, teaching machines, programmed instruction, multimedia presentations, and the use of the computer in instruction, all of which grew from operant conditioning models. Most systems approaches today resemble computer flow charts, with steps the designer moves through during the development of instruction, again a premise of operant conditioning. Rooted in the military and business worlds, the systems approach involves setting goals and objectives, analyzing resources, devising a plan of action, and continuously evaluating and modifying the program, all examples of operant conditioning. In short, technology is designed and used by analyzing the situation and setting a goal. Individual tasks are broken down and learning objectives are developed. Evaluation consists of determining whether the criteria for the objectives have been met and reinforcing as needed.  Operant conditioning will continue to influence the development and use of technology.  As I mentioned earlier, reinforcing behavior to achieve a desired result has been the basis of employment and education since their inceptions.
Summary of Discussion:
    Operant conditioning is based upon the idea that learning is a function of change in overt behavior. Changes in behavior are the result of an individual's response to events, stimuli if you will, that occur in the environment. When a particular Stimulus-Response pattern is reinforced, or in simpler terms rewarded, the individual is conditioned to respond. Thus, it is through operant conditioning that voluntary responses are learned. 
    The two basic premises of operant conditioning are that positive and negative reinforcement strengthen behavior, and that punishment, response cost, and extinction weaken behavior.  In positive reinforcement, a positive reinforcer is added after a response, which increases the frequency of the response.  In negative reinforcement, after the response, the negative reinforcer is removed, increasing the frequency of the response.  In response cost, after the response, the positive reinforcer is removed, which weakens the frequency of the response.  In punishment, after a response, a negative or aversive stimulus is added, which weakens the frequency of the response.  Extinction is simply refusing to reinforce a behavior. 
    Continuous reinforcement simply means that the behavior is followed by a consequence each time it occurs. Intermittent schedules are based either on the passage of time or on the number of correct responses emitted. The consequence can be delivered after the same amount of time or the same number of correct responses each time, or after an amount of time or number of correct responses that varies around a particular number. 
    The major criticism of operant conditioning is the inability to ensure that the appropriate stimulus follows a response.  Thus, when desiring to increase the abilities or knowledge of a user, one must understand the importance of pairing the appropriate stimulus with the desired response.  This cannot be assured when the one controlling the environment is not present. 