Knowledge is perpetually expanding, aided in part by the constant research, development, and new technologies that are an integral part of modern society. This expanding pool of knowledge generates ongoing, iterative, instructional processes throughout both public and private sectors of society. Instructional Design (ID) models offer formal, procedural methodology that supports the dissemination of knowledge to a wide range of audiences, in both public and private settings. This project considers the efficacy of the two most prevalent types of ID models, behavioral and cognitive, and their respective importance to learning outcomes.
The focus of this project is the efficacy of divergent instructional design (ID) models on learning outcomes; specifically, behavioral models versus cognitive models.
Having participated in at least a dozen corporate training initiatives in my former career, I was familiar with their spotty results. (Grubb & Ellis, 1986 to 1995) During a course in Instructional Design at Illinois Institute of Technology in 1999, I began to recognize some of the theoretical differences between the training I had taken and cognitive ID models.
I became curious as to how different ID models affect the communication competence necessary for effective learning outcomes. (Dore, 1986; Hymes 1972) The taxonomy used in most of the objective and assessment models I am familiar with, Keller's ARCS, Kirkpatrick's Levels, Mager's CRI, Merrill's CDT and ID Shell, is based primarily on Bloom's Taxonomy of Educational Objectives. (Bloom 1956) Yet Gagné is fixed on a very limited taxonomy of capabilities. Does the learning objective taxonomy ultimately lead to a measurable difference in learning outcomes?
In order to explore this topic, my initial concept was to conduct usability studies on two models: one behavioral, one cognitive. However, well into the analysis phase, I realized that such a study was beyond the capacity of a lone investigator. Therefore, I adjusted the project from an exclusively usability study to an evaluation of learning outcomes that focused on usability issues. (See Scope and Methodology)
I chose Robert Mager's Criterion Referenced Instruction (Mager 1988) for my behavioral model based upon preliminary data collection. (See Preliminary Data Collection) I chose Robert Gagné's capability hierarchy as my cognitive model. [Gagné had a collaborator, Leslie Briggs, in revised editions of Principles of Instructional Design; consequently, the model is often referred to as the Gagné/Briggs model.] (Gagné 1979)
Numerous researchers in the ID field, including Mager, incorporated many of the ideas found in Gagné's research when they developed ID models, particularly his theory of learning hierarchies. However, Gagné's and Mager's models differ in how they elicit and evaluate learning outcomes. (Gagné 1965, 1972, 1979) (Mager 1988)
Gagné posits that to effect a desired learning outcome, one must measure capability. The Gagné model accomplishes this by means of a limited taxonomy consisting of verbs for the capability that pertains to specific types of learning; each verb references a specific domain of competency according to a natural hierarchy of knowledge. (Gagné 1979, 1984, 1990)
The following chart illustrates Gagné's verb for capability paradigm:
Mager's Criterion Referenced Instruction posits that effective learning outcomes can be measured by correct responses to appropriately designed criteria (performance objectives) alone. (Mager 1975, 1984, 1988)
The following examples summarize the two models:
Gagné's theory assumes that behavior alone does not reflect knowledge (capability); it is only a measurement tool. In the above Gagné example, the intention is to teach a concept (squares), not how to write an X (action/behavior). The capability depends on a hierarchy of knowledge (discriminate, identify) applied to an object (squares) and measured by an action (mark an X).
Mager's theory assumes that criteria (performance behavior/action) reflect knowledge. In the above Mager example, the intention is to elicit a performance criterion: correctly marking X at least 8 out of 10 times. The degree of accurate performance presumes knowledge in the given domain.
It can be difficult to discern the differences in learning outcomes between these two models when they are applied to basic intellectual skills like discrimination and identification. After all, if a student correctly marks the square 8 out of 10 times, is it likely that he or she is guessing? But the distinction becomes clearer when the models are applied to higher cognitive skills, such as problem solving. A good example of this distinction is contained in the circumstances surrounding the nuclear reactor accident at Three Mile Island in 1979.
According to Scott Johnson, "a series of mechanical, electrical, and human failures led to what has been described as the worst nuclear power plant accident in the history of the United States." (Introduction) The supervisor and two plant operators in charge at the time of the accident "were former Navy reactor operators ... licensed, experienced men, and all had scored well above average on the tests that culminated their training." (Narrative Part I: The Initiating Event) Yet despite their history of accurate performance in operating and maintaining the facility, they were not able to initiate a cognitive strategy that could predict, identify, and solve the problems resulting from equipment malfunctions and failures. (Johnson 1998-1999)
During the accident, operational indicators failed and the supervisor and operators were unable to identify the underlying problem, predict its consequences, or generate a solution. At the apex of the crisis, they did not comprehend what the reactor core problem was telling them: if "primary pressure was low and containment pressure was high," (Narrative Part 3: Fuel Damage) there had to be heat build-up inside the reactor, and (under the circumstances) this could only be coming from an open steam valve. It wasn't until a day-shift engineer arrived that the connection was made and the steam valve was closed. But by then containment had been breached. (Johnson 1998-1999)
While the Three Mile Island accident is an atypical situation, I believe it pointedly illustrates this crux issue of performance versus capability in behavioral and cognitive ID models. However, given the seeming appropriateness of the Mager "squares" example, the question of learning outcomes expands to become, "Are there situations where the criteria model better reflects knowledge than the capabilities model and vice versa?"
Prior to designing the evaluation, preliminary data was gathered from training departments of major corporations. The purpose was threefold: 1) to ascertain what ID models were prevalent in real world training situations; 2) to uncover ID model issues and concerns; and 3) to enhance the relevancy of the evaluation. Twelve major corporations from six industries were chosen. Ten of the corporations have global operations, two operate locally. All rely upon comprehensive training for at least one classification of employees. The results were as follows:
(Each respondent was assigned a contact number for tabulating corresponding data.)
Based on preliminary data collection, problems with ID models fall into the following categories:
The time required to design instructional models was the primary issue of concern among respondents. Comments included:
Five respondents cited problems with learning assessment. Comments included:
Three respondents cited difficulty with designing instructional models. Comments included:
Two respondents cited insufficiency in their ID models' capacity to meet instructional goals. Comments included:
Two respondents cited cost. Both outsource some training department work to consultants in order to supplement in-house resources. The comment from both was, "Costly."
For the purposes of this evaluation, it is important to note some paradoxes in responses tabulated in the preliminary data collection.
Ideally, to collect comprehensive data, one would conduct an evaluation that incorporated a full learning outcome comparison and addressed all user issues. Minimum requirements for such an evaluation would be:
An evaluation involving 72 tests and a minimum of 12 people (more if target audience members are not applicable to all 12 modules) is beyond the capacity of one investigator.
Furthermore, an evaluation executed by a single investigator who designed both modules and conducted testing alone would only provide data about the investigator's ability in ID model design efficacy and learning outcomes.
For these reasons, a less extensive evaluation was designed to compare learning outcomes between the Mager (behaviorism) and Gagné (capabilities) ID models. Conditions of the evaluation include:
The evaluation also measures issues uncovered by preliminary data collection
The issue of Time is also addressed, but only in terms of meeting or exceeding teaching and/or curriculum constraints.
Additionally considered are issues concerning:
Results from the evaluation will also look for any correlation that might affect the issue of Cost.
The nature of the evaluation demanded participants with instructional experience and some knowledge of instructional methodology. Therefore, it was determined that target participants would have degrees in education and teaching certification. Additionally, since 80% of global issues can be uncovered with five participants (Dumas 1996, 1999), six participants were enlisted for the evaluation. They included:
Furthermore, as there is a significant population of training professionals without academic credentials working as instructors in a variety of corporate and commercial settings, I wanted to include this group as well. Unfortunately, I was unable to enlist an adequate number of training professionals to draw conclusions about this group as a distinct population. However, I believe the two professionals included in the study represent useful anecdotal and comparative data. The two training professionals included:
In order to compare Mager's behavioral model to Gagné's capabilities model, it was necessary to see if participants:
To meet these goals, the evaluation was designed with the following elements:
In addition to including a variety of learning environments, purposes, and target audiences, questions one through ten in scenarios one and two reference increasing levels of intellectual knowledge as well.
A directed field study, monitored by the investigator, was performed at the following locations:
All research used the same protocols:
Each testing process had five segments:
The step-by-step protocol of the study can be found at Test Protocol - Forms and Scripts.
A trial evaluation was performed at U.I.C.'s Daley Library with one participant who had no background in education or professional training. The purpose was to check for ambiguous wording and/or other obvious usability problems with the evaluation questions, tasks, and questionnaire items. As a result:
For each evaluation question 1 through 10, the following was recorded:
For task items 11 and 12, the following was recorded:
For each questionnaire, the following was recorded:
The exit interview was an open discussion about the study's purpose, conducted after the evaluation and questionnaire were completed. Where relevant, participant comments were noted during this event.
Data was tabulated by question, scenario, combined scenario, individual participant, participant group, gender, age, experience, and collectively. Comments were not given a numerical value, but are included in the Findings section's issue statements where appropriate.
For each evaluation question 1 through 10
For evaluation task items 11 and 12
For combined evaluation items 1 through 12
For questionnaire items 1 through 6
For questionnaire item 7
All data collected and discussed in the Findings section are displayed in the following tables.
ID Model Evaluation - Comparison by Scenario
ID Model Evaluation - Comparison by Gender
ID Model Evaluation - Comparison by Participant Age
Evaluation completion times are not equivalent to the preliminary data issue of Time required to design. However, they do indicate the time required for participants to: 1) perceive the differences between the models; and 2) select the best learning outcome.
Attitudinal Response Results - Questionnaire Items 1 - 7
* Participant 3 did not respond to Scenario 3, and therefore, could not answer questionnaire items 3 and 4. This had the net effect of lowering item, average, Total I, and Total III cumulative scores. Questionnaire scores without Participant 3 are:
Attitudinal Response Results Minus Participant No. 3
Rationale for Attitudinal Responses
* Participant 3 did not respond to Scenario 3, items 11 and 12. Therefore, user was unable to answer corresponding Support questions 3 and 4 on the questionnaire. This had the net effect of lowering item, average, Total I, and Total III cumulative scores. Furthermore, although Participant 3 scored 100% in supporting evaluation question selections, he failed to support any questionnaire response items. Although this participant is a consistent Outlier, the study design may be flawed. This and other testing problems are addressed in the Test Design Problems section.
Rationale for Attitudinal Response Minus Participant No. 3
ID Model Choice - Experience Comparison
Note: Novice and heuristic (Expert) categories correspond to participant age categories (see Table 4), where Novices are under 35 yrs. and Experts are over 35 yrs. As Participant 3, a novice, did not complete Scenario three, Scenario three was excluded from Table 11.
Issue Categories - Adjusted for Question 7
Participant No. 3 (academic credentials) is a consistent outlier. I have calculated averages with and without him, where pertinent. Additionally, there is an outlier category for addressing anomalies. (Dumas 1999, pp. 313-314)
Findings have been organized, according to usability testing conventions (Dumas 1999, Rubin 1994), as follows:
Please remember: data on training professionals is presented as anecdotal information only and should not be used to draw conclusions about this population.
The following discussion section deals with the six most prevalent issues in detail.
Issue One: Participants exhibited a preference for cognitive models in classroom learning situations.
Scope: Global - has broad implications for learning outcomes
Severity: Level 1 - has direct effect on learning outcomes
Frequency: 4 of 6 Academics / 1 of 2 Professionals
Explanations: In the classroom learning environments of Scenario One, participants referenced higher learning outcomes 70% of the time.
Academics | One User | 5 of 5 times |
Academics | Three Users | 4 of 5 times |
Academics | Two Users | 2 of 5 times |
Professionals | One User | 4 of 5 times |
Professionals | One User | 2 of 5 times |
Categories: For this issue, each category has the potential to be selected 30 times (5Q x 6P)
Proficiency: Of 26 models chosen for proficiency, 21 referenced higher learning outcomes.
Comprehensiveness: Of 13 models chosen for comprehensiveness, 13 referenced higher learning outcomes.
Appeal: Of 2 models chosen for appeal, 2 referenced higher learning outcomes.
Time: Of 2 models chosen for exceeding teaching/curriculum constraints, 2 referenced higher learning outcomes.
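The 70% figure for this issue can be reproduced from the tally above. The following Python sketch is illustrative only: the function name `outcome_rate` and the tuple layout are assumptions, not part of the study materials. It also assumes that the first three tally rows describe the six academics and the last two rows the two professionals, which is inferred from the stated group sizes.

```python
def outcome_rate(tally):
    """Return (hits, trials, percent) for rows of (users, hits_per_user, questions)."""
    hits = sum(users * chosen for users, chosen, _ in tally)
    trials = sum(users * questions for users, _, questions in tally)
    return hits, trials, 100 * hits / trials

# Rows transcribed from the Issue One tally: (number of users, times chosen, questions)
academics = [(1, 5, 5), (3, 4, 5), (2, 2, 5)]
professionals = [(1, 4, 5), (1, 2, 5)]  # assumed to be the two unlabeled rows

print(outcome_rate(academics))      # (21, 30, 70.0)
print(outcome_rate(professionals))  # (6, 10, 60.0)
```

Under these assumptions, the 21 higher-outcome selections by academics out of 30 possible (5Q x 6P) give the reported 70%, and the same count of 21 appears in the Proficiency line.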
Issue Two: Participants exhibited a preference for behavioral models in adult learning situations.
Scope: Global - has broad implications for learning outcomes
Severity: Level 1 - has direct effect on learning outcomes
Frequency: 3 of 6 Academics / 1 of 2 Professionals
Explanations: In the Adult learning environments of Scenario Two, participants referenced lower learning outcomes 63% of the time. [Results from Question 7 are treated separately]
Academics | One User | 4 of 4 times |
Academics | Two Users | 3 of 4 times |
Academics | Two Users | 2 of 4 times |
Academics | One User | 1 of 4 times |
Professionals | One User | 3 of 4 times |
Professionals | One User | 1 of 4 times |
Categories: For this issue, each category has the potential to be selected 24 times (4Q x 6P) [Results from Question 7 are treated separately]
Proficiency: Of 24 models chosen for proficiency, 15 referenced lower learning outcomes.
Comprehensiveness: Of 17 models chosen for comprehensiveness, 11 referenced lower learning outcomes.
Gender: The three women in the study chose lower learning outcomes 10 out of a possible 12 times. (83%) The three men in the study chose lower learning outcomes 5 out of a possible 12 times. (42%)
Difficulty: Of 1 model judged difficult, 1 referenced higher learning outcome as not skill appropriate.
Outlier: The outlier chose only one lower learning outcome in this scenario. This participant felt the time required for the higher skill learning model would be prohibitive in this one instance.
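As a consistency check on the figures for this issue, the gender split should add up to the same total as the academic tally. The sketch below is a hypothetical check written for this report, not the author's tabulation method; the variable names are assumptions, and the tally is read as covering the six academics only.

```python
QUESTIONS = 4  # Scenario Two questions, with Question 7 treated separately

# Lower-outcome choices per user -> number of academics with that count
academic_tally = {4: 1, 3: 2, 2: 2, 1: 1}

total_lower = sum(choices * users for choices, users in academic_tally.items())
total_possible = QUESTIONS * sum(academic_tally.values())

# Gender figures as reported: (lower-outcome choices, possible choices)
by_gender = {"women": (10, 12), "men": (5, 12)}
gender_lower = sum(lower for lower, _ in by_gender.values())
gender_possible = sum(possible for _, possible in by_gender.values())

# Both breakdowns should describe the same 24 selections
assert (gender_lower, gender_possible) == (total_lower, total_possible)

print(f"overall: {total_lower}/{total_possible} = {100 * total_lower / total_possible:.1f}%")
for group, (lower, possible) in by_gender.items():
    print(f"{group}: {lower}/{possible} = {100 * lower / possible:.0f}%")
```

Both breakdowns give 15 lower-outcome selections out of 24, which is 62.5% and corresponds to the 63% reported in the Explanations line.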
Issue Three: Participants exhibited reasoned support for ID model selection in all ten evaluation conditions.
Scope: Global - has broad implications for learning outcomes
Severity: Level 1 - has direct effect on learning outcomes
Frequency: 6 of 6 Academics / 2 of 2 Professionals
Explanations: Although participants did not always select the higher learning outcome model, all model selections were reasonably supported.
Academics | Six Users | 9 of 9 times |
Professionals | Two Users | 9 of 9 times |
Categories: For this issue, each category has the potential to be selected 54 times (9Q x 6P) [Question 7 is irrelevant to this issue as its learning outcome choices are equal.]
Proficiency: Of 50 models chosen for proficiency, 30 referenced higher learning outcomes.
Comprehensiveness: Of 30 models chosen for comprehensiveness, 18 referenced higher learning outcomes.
Difficulty: Of 5 models judged difficult, 4 referenced higher learning outcome as not age/skill appropriate, and 1 referenced lower learning outcome model as not age/skill appropriate.
Gender, Experience, Age: All participants in the study had identical results for this issue. (100%)
Comments: All comments referenced demonstrated knowledge, regardless of learning outcome levels.
Issue Four: Participants' attitudinal responses suggest behavioral and cognitive assessment models are not readily distinguishable.
Scope: Global - has broad implications for learning outcomes.
Severity: Level 1 - has direct effect on learning outcomes.
Frequency: 3 of 6 Academics described assessment models as different / 3 of 6 Academics described assessment models as similar (average 50%) / 1 of 2 Professionals described assessment models as similar (average 50%)
Explanations: Where assessment models were distinctly behavioral or cognitive, half the academic participants failed to recognize the difference 100% of the time, and all the professional participants failed to recognize the difference 50% of the time.
Academics | Three Users | noted difference | 2 of 2 times |
Academics | Three Users | did not note difference | 2 of 2 times |
Professionals | Two Users | noted difference | 1 of 2 times |
Professionals | Two Users | did not note difference | 1 of 2 times |
Categories: For this issue, each category has the potential to be selected 12 times (2Q x 6P)
Experience: Of 3 participants who did recognize a difference, 1 was Expert and 2 were Novice. Of 3 participants who did not recognize a difference, 2 were Expert and 1 was Novice.
Gender: Of 3 women in the study, 2 did recognize a difference and 1 did not. Of 3 men in the study, 1 did recognize a difference and 2 did not.
Age: Age findings are identical to Experience findings.
Issue Five: Participants' attitudinal responses suggest uncertainty regarding highest learning outcomes.
Scope: Global - has broad implications for learning outcomes.
Severity: Level 1 - has direct effect on learning outcomes.
Frequency: 4 of 6 Academics described both behavioral and cognitive models as best assessment of higher learning outcomes. 2 of 6 Academics described cognitive model as best assessment of higher learning outcome / 1 of 2 Professionals described both behavioral and cognitive models as best assessment of higher learning outcomes.
Explanations: Where assessment models were distinctly behavioral or cognitive, participants failed to discern either as best assessment of higher learning outcome 67% of the time.
Academics | Four Users | chose both models | 2 of 2 times |
Academics | Two Users | chose one model | 2 of 2 times |
Professionals | One User | chose both models | 2 of 2 times |
Professionals | One User | chose one model | 2 of 2 times |
Categories: For this issue, each category has the potential to be selected 12 times (2Q x 6P). Where assessment models were distinctly behavioral or cognitive, participants failed to discern either as best assessment of higher learning outcome 67% of the time.
Experience: Of 4 participants who described both models as best assessment of higher learning outcomes, 2 were Expert, 2 were Novice. Of 2 participants who described cognitive model as best assessment of higher learning outcomes, 1 was Expert, 1 was Novice.
Gender: Of 3 women in the study, all three described both models as best assessment of higher learning outcomes. Of 3 men in the study, 1 described both models as best assessment of higher learning outcomes, 2 described cognitive model as best assessment of higher learning outcomes.
Age: Age findings are identical to Experience findings.
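The Experience and Gender breakdowns for this issue describe the same six academics, so they should yield the same totals. The following sketch cross-checks them; it is an illustration under that assumption, and the helper name `totals` and the key labels are hypothetical.

```python
from collections import Counter

# Counts as reported for the six academics, keyed by (choice, subgroup)
by_experience = Counter({
    ("both", "Expert"): 2, ("both", "Novice"): 2,
    ("cognitive", "Expert"): 1, ("cognitive", "Novice"): 1,
})
by_gender = Counter({
    ("both", "women"): 3, ("both", "men"): 1, ("cognitive", "men"): 2,
})

def totals(breakdown):
    """Collapse a (choice, subgroup) breakdown to counts per choice."""
    out = Counter()
    for (choice, _subgroup), n in breakdown.items():
        out[choice] += n
    return out

# Both breakdowns should agree: 4 chose both models, 2 chose the cognitive model
assert totals(by_experience) == totals(by_gender) == Counter(both=4, cognitive=2)

share_both = 100 * totals(by_gender)["both"] / 6
print(f"{share_both:.1f}%")  # 66.7%
```

Four of six academics is 66.7%, which rounds to the 67% given in the Categories line.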
Issue Six: Participants exhibited a preference for behavioral models in adult learning environments when learning outcomes were equal.
Scope: Local - has limited implications for learning outcomes.
Severity: Level 2 - has indirect effect on learning outcomes.
Frequency: 5 of 6 Academics / 2 of 2 Professionals.
Explanations: Where learning outcomes were equal in the adult environment of Question Seven, participants referenced the behavioral model 87% of the time.
Academics | Five Users | chose behavioral model | 1 of 1 time |
Academics | One User | chose cognitive model | 1 of 1 time |
Professionals | Two Users | chose behavioral model | 1 of 1 time |
Categories: For this issue, each category has the potential to be selected 6 times (1Q x 6P).
Proficiency: Of 6 models selected for proficiency, 5 referenced the behavioral model.
Comprehensiveness: Of 4 models selected for comprehensiveness, 3 referenced the behavioral model.
Difficulty: Of 2 models judged difficult, 2 referenced capability model as difficult for instructor to manage or assess.
Gender: The three women in the study chose the behavioral model 3 out of a possible 3 times. (100%) The three men in the study chose the behavioral model 2 out of a possible 3 times. (67%)
Outlier: The outlier chose the only cognitive model selected for this question. This participant felt the cognitive model "demonstrated knowledge better."
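The 87% in the Explanations line for this issue appears to pool academics and professionals, whereas the Categories line counts the six academics only. The sketch below shows both readings; it is a hedged illustration, the function name `behavioral_share` is an assumption, and the pooling itself is an inference from the Frequency line rather than something the report states.

```python
# Question Seven selections as reported in the Frequency line
academics = {"behavioral": 5, "cognitive": 1}
professionals = {"behavioral": 2, "cognitive": 0}

def behavioral_share(*groups):
    """Percent of pooled selections that referenced the behavioral model."""
    chosen = sum(group["behavioral"] for group in groups)
    total = sum(sum(group.values()) for group in groups)
    return 100 * chosen / total

print(f"academics only: {behavioral_share(academics):.1f}%")
print(f"academics and professionals: {behavioral_share(academics, professionals):.1f}%")
```

Pooled, 7 of 8 participants is 87.5%; the academics alone give 5 of 6, about 83%.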
In the Evaluation Scenarios One and Two, participants demonstrated a choice between the models and supported their choices logically. However, it is not clear, especially in Scenario Two, if participants identified the potential for higher learning outcomes between the models. This is demonstrated by the fact that the best results for higher learning outcomes were only 70%.
Furthermore, errors in test design influenced some participant selections insofar as: 1) behavioral models were chosen because cognitive models were considered age/skill inappropriate (instruction to draw); and 2) cognitive models were chosen because behavioral models were considered age/skill inappropriate (instruction to role play).
In evaluation tasks 11 and 12 (Scenario Three), where participants designed their own ID models, the results suggest that participants who did not choose higher learning outcomes (2 of 5; 1 declined) did so for specific, logical reasons other than highest learning outcomes.
Questionnaire results further suggest participants may not have identified the potential for higher learning outcomes between the models. The purpose of the questionnaire was to quantify evaluation choices. It did, but not as anticipated. Once again, logical reasons other than highest learning outcomes were specified, and assessment differences between the models were not described as readily distinguishable.
However, participants did exhibit a clear distinction between classroom learning and adult learning choices, which suggests that: 1) participants readily identify different types of learning, if not different learning outcomes or ID models; and 2) participants choose ID models based upon different types of learning.
When designing this study, care was taken to guide participants without influencing their choices. Results indicate that this guidance may not have been adequate. The study's direction to choose the "best measure of knowledge" between the models may not have guided participants to focus on highest learning outcomes. Therefore, while the study records participants discriminating and choosing between ID models based upon different types of learning, it does not indicate whether or not participants view the criterion or the capability model as better reflecting knowledge.
Additionally, although not part of conclusions, training professionals sometimes described the term "behavior" as ambiguous. They did not differentiate between behavior, which refers to the action itself, and performance, which refers to the result of the action.
Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (1956). Taxonomy of educational objectives. New York: McKay.
Dick, W. (1993). Enhanced ISD: A response to changing environments for learning and performance. Educational Technology, 33(2), 12-15.
Dick, W., & Carey, L. (1990). The systematic design of instruction. (3rd ed.). Glenview, IL: Scott, Foresman.
Dore, J. (1986). The development of conversational competence. In R. Schiefelbusch (Ed.), Language competence: assessment and intervention. San Diego: College Hill Press.
Dumas, J. S. (1991). On usability testing. Common Ground, Newsletter of Usability Professionals' Association 1(2), November 1991.
Dumas, J. S., & Redish, J. (1999). A practical guide to usability testing (2nd ed.). Portland, OR: Intellect.
Dumas, J. S. (1998). Usability testing methods. Common Ground, Newsletter of Usability Professionals' Association 8(3) (4), July and October 1998.
Dumas, J. S. (1996). How many participants in a usability test are enough. Common Ground, Newsletter of Usability Professionals' Association 6(4), October 1996.
Feinberg, S. (1989). Components of technical writing. Chicago: Holt Rinehart & Winston.
Gagné, R. M. (1962). Military training and principles of learning. American Psychologist, 17, 263-276.
Gagné, R. M. (1965). The conditions of learning. New York: Holt.
Gagné, R. M. (1972). The conditions of learning (2nd ed.). New York: Holt Rinehart & Winston.
Gagné, R. M. (1974). Essentials of learning for instruction. New York: Dryden Press.
Gagné, R. M. (1980). Learnable aspects of problem solving. Educational Psychologist, 15, 84-92.
Gagné, R. M. (1984). Learning outcomes and their effects: useful categories of human performance. American Psychologist, 39, 377-385.
Gagné, R. M. (1985). The conditions of learning (4th ed.). New York: Holt Rinehart & Winston.
Gagné, R. M., & Merrill, M. D. (1990). Integrative goals for instructional design. Educational Technology Research & Development, 38(1), 23-30.
Gagné, R. M., & Dick, W. (1983). Instructional psychology. Annual Review of Psychology, 34, 261-295.
Gagné, R. M., & Driscoll, M. P. (1988). Essentials of Learning for instruction (2nd ed.). Upper Saddle River, NJ: Prentice-Hall.
Gagné, R. M., & Glaser, R. (1987). Foundations in learning research. In R. M. Gagné (Ed.), Instructional technology foundations (pp. 49-83). Hillsdale, NJ: Lawrence Erlbaum.
Gagné, R. M. (1985). Cognitive psychology and school learning. Boston: Little, Brown & Company.
Gagné, R. M., & Briggs, L. J. (1979). Principles of instructional design (2nd ed.). Orlando, FL: Harcourt Brace Jovanovich.
Gagné, R. M., Briggs, L. J., & Wager, W. W. (1988). Principles of instructional design (3rd ed.). Orlando, FL: Harcourt Brace Jovanovich.
Gagné, R. M., Briggs, L. J., & Wager, W. W. (1992). Principles of instructional design (4th ed.). Orlando, FL: Harcourt Brace Jovanovich.
Hymes, D. (1972). Introduction. In C. Cazden, V. John, & D. Hymes (Ed), Functions of language in the classrooms. New York: Teachers College, Columbia University.
Johnson, S. (1998-1999). Inside Three Mile Island: minute by minute. wowpage.com. Revised 4-11-00. http://www.wowpage.com/tmi.
Kirkpatrick, D. L. (1983). Four steps to measuring training effectiveness. Personnel Administrator, November, 19-25.
Kirkpatrick, D. L. (1998). Evaluating training programs: the four levels. San Francisco: Berrett-Koehler Publishers.
Kirkpatrick, D. L. (Ed.). (1998). Another look at evaluating training programs. Alexandria, VA: American Society for Training & Development.
Klein, J.W. (1999). In class handout, Unit 3: Learning Objectives & Assessment Instructional Design 535, Illinois Institute of Technology.
Mager, R. F. (1962). Preparing instructional objectives. Palo Alto, CA: Fearon.
Mager, R. F. (1975). Preparing instructional objectives (2nd ed.). Belmont, CA: Fearon-Pittman
Mager, R. F. (1988). Making instruction work. Belmont, CA: Lake Publishing Co.
Mager, R. F. (1997). Making instruction work: or skillbloomers: a step-by-step guide to designing and developing instruction that works (2nd ed.). Atlanta, GA: Center for Effective Performance.
Mager, R. F. (1997). Measuring instructional results (3rd ed.) Atlanta, GA: Center for Effective Performance.
Merrill, M. D., Li, Z., & Jones, M. K. (1990a). Limitations of first generation instructional design. Educational Technology, 30(1), 7-11.
Merrill, M. D., Li, Z., & Jones, M. K. (1990b). Second generation instructional design. Educational Technology, 30 (2), 7-14.
Merrill, M. D. (1972). Taxonomies, classifications, and theory. In R. N. Singer (Ed.), The psychomotor domain: movement and behavior (pp. 385-414). Philadelphia, PA: Lea & Febiger.
Merrill, M. D. (1994). Instructional design theory. Englewood Cliffs, NJ: Educational Technology Publications.
Nielsen, J. (2000). Designing web usability. Indianapolis: New Riders Publishing.
Nielsen, J., Norman, D. A. (2000). Usability on the web isn't a luxury. informationweek.com. (p.66) 2-14-2000. http://www.informationweek.com. /773/web.htm, /773/web.htm, /773/web3.htm.
Norman, D. A. (1995). The psychology of everyday things. In R. Baecker, J. Grudin, W. Buxton, & S. Greenberg (Ed.), Readings in human-computer interaction: toward the year 2000. (2nd ed.) San Francisco: Morgan Kaufmann.
Norman, D. A. (1990) The design of everyday things. New York: Doubleday.
Pipe, P., & Mager, R. F. (1984). Analyzing performance problems, or you really oughta wanna (2nd ed.). Belmont, CA: Lake Publishing Co.
Rubin, J. (1994). Handbook of usability testing. New York: John Wiley & Sons.
Smith, P. L., & Ragan, T. J. (1999). Instructional design (2nd ed.). Upper Saddle River, NJ: Merrill.
Virzi, R. (1990). Streamlining the design process: running few subjects. Proceedings of the Human Factors Society 34th Annual Meeting, 291-294.
Virzi, R. (1992). Refining the test phase of usability evaluation: how many subjects is enough. Human Factors, 34 (4), 457-468.
TEST PROTOCOL - FORMS AND SCRIPTS
The following pages contain a complete copy of all documents used in the study.
Participants were asked to read the Evaluation text aloud and use TAP as they worked.
Participants were asked to read the Debriefing Questionnaire text aloud and use TAP as they worked.
Participant Profile and Screening Form
Participants were asked to complete the following form:
Illinois Institute of Technology Consent Form
The IIT Consent Form was first read aloud to participants. Then they were asked to read the form themselves, and if in agreement, to sign and date it.