In a Climate of Change: Measuring Educational Outcomes
Assessing Educational Outcomes
Trudy W. Banta
Ted Marchese, vice president of the American Association for Higher Education, has defined assessment as "a rich conversation about student learning informed by data." This is a simple, easy-to-remember definition in an arena where every writer attaches a somewhat different meaning to the concept and most attempts at definition are much longer and more difficult to remember! In my own work I have used one of those longer definitions. I think of assessment as "a process of providing credible evidence of the outcomes of higher education undertaken for the purpose of improving programs and services within an institution" (Banta & Associates, 1993).
Two aspects of my definition are worth emphasizing. If assessment is about providing credible evidence, it means that many forms of assessment can be acceptable--as long as the findings they yield have credibility among those who use them. While the measurement methods devised initially may be far from perfect in terms of their reliability and validity, they may provide useful preliminary evaluative information if they have face validity in the eyes of their creators. My definition also makes clear that I believe assessment is a program evaluation tool, one that should be undertaken with seriousness of purpose so that it is not merely an exercise, but instead produces results to which people pay attention. In short, assessment should lead to improvement.
The purposes for undertaking outcome assessment are principally: (1) to demonstrate the accountability of higher education to its external stakeholders and (2) to improve programs and services. I obviously believe strongly in making the second purpose paramount, but I understand and appreciate the need for the first. Ideally, our assessment initiatives will be designed to serve both purposes simultaneously.
When confronted with the suggestion that they begin to look at learning outcomes, faculty often respond, "We assess students every day and give them grades. What do you want us to do differently?" I believe the easiest way to explain the difference is to think in terms of two kinds of assessment: one designed to provide information to individual students and one designed to look collectively at the performance of groups for purposes of conducting program evaluation. The very same measures may be used for both, but the measurement data are used for different purposes. For example, we assess students' basic skills in order to advise them about which courses to take, we provide periodic feedback to students about their strengths and weaknesses through graded papers and exams, and we may confer end-of-program certification of competence through licensing exams. Such individual assessment also enables us to give students grades and helps students learn to become self-assessors as we motivate them to identify their own strengths and weaknesses. Group assessment activities include all of the individual measures just listed plus surveys, interviews and focus groups, monitoring of the success of graduates, and reviewing student records. We administer these tests and measures to individuals, then look at their responses collectively to determine where groups of students may need help. Group assessment activities can thus be used to improve curriculum and instruction. In addition, they can be used in self-studies for institutional and/or state peer review and for regional and/or national accreditation.
Collaboration: Key to Success in Assessment
The assessment of groups for purposes of looking at student achievement of learning outcomes agreed upon collectively by faculty is a relatively recent phenomenon in higher education. Alverno College in Milwaukee and Northeast Missouri State University (now Truman State University) initiated this work in 1970 (Ewell, 1984). In 1979, Tennessee became the first state to adopt performance funding for its public institutions (Banta, 1986). By the mid-1980s the legislatures of Virginia, New Jersey, and Colorado had developed their own accountability initiatives that encouraged public institutions to assess learning outcomes. And by 1995 the American Council on Education's annual Campus Trends survey showed that 94% of the colleges and universities in the United States were engaged at some level in assessing outcomes (El Khawas, 1995).
However, two more recent surveys (Steele, 1996; Ewell, 1996) have indicated that the results of most assessment programs to date have been somewhat disappointing. Steele has cited lack of faculty involvement as one of the principal reasons for this, and Ewell has identified resistance among faculty and administrators as a chief obstacle to assessment. Why is it so difficult for this concept to take root and make a difference?
Faculty have so many new things to learn and so many sources of stress that assessment seems merely one more externally-imposed burden. We must learn to use technology in instruction, recruit students, provide service to the community, and build alliances across disciplinary lines and into the community in order to obtain the resources needed to update our programs and facilities. Trustees and legislators want faculty to spend more time teaching. We are told that we need to shift our paradigm from teaching to learning -- focusing on what students are learning rather than on what faculty are teaching (Barr & Tagg, 1995). We must address individual differences among students in our teaching. Our stakeholders want to see value-added for the investment they are making in higher education.
I have come to believe that the need to build alliances with colleagues to develop a collective understanding of program outcomes for students may be one of the most important sticking points for faculty involvement in assessment (Banta, Lambert & Black, in press). There are many barriers to collaboration in the academy. Graduate schools prepare specialists with a deep understanding of a narrow area, and departments hire these specialists. Then promotion and tenure favor individual achievements--interdisciplinary work seems much harder to evaluate. Not surprisingly, most of our early scholarship is conducted alone. It is no wonder that there is resistance when faculty are told that in order to carry out outcomes assessment they must first get together and decide on common learning goals for the curriculum for which they are responsible. Most have been thinking of the curriculum as a collection of individual courses that students would integrate intellectually for themselves.
Fortunately, the research literature suggests that there are some potentially effective ways to promote collaboration. Leaders who are committed to facilitating collaboration can be most helpful of all. Such leaders can provide opportunities for faculty to work together on important matters over sustained periods of time. Faculty need time to meet informally in relaxed environments, especially in group training/development meetings. We can give small institutional grants for collaborative activity. And perhaps most importantly, we can develop group-centered values in our faculty reward structures.
Fortunately, there is growing interest within higher education in what works in a variety of areas, including student retention, general education, the use of technology in instruction, and curriculum in the major. This natural curiosity increasingly is attracting faculty to use assessment methods to evaluate the effectiveness of new approaches, especially as these are compared to older programs and processes.
So How Do We Begin?
The steps in outcomes assessment may be summarized succinctly as follows:
1. Set learning goals/objectives.
2. Identify learning experiences for each.
3. Select/develop measures.
4. Gather data.
5. Analyze/interpret findings.
6. Make appropriate changes based on findings.
In approaching the steps in this list we should involve all of our stakeholders: students, faculty, graduates, administrators, employers, and others in the community. We should ask: What questions are important to these groups? What evidence would they find credible, acceptable? How might they participate in collecting evidence? In what ongoing processes might they find it valuable to use the evidence -- planning, curriculum improvement, program review? And, ultimately, do they trust the process?
A very important beginning point for the assessment of student outcomes is to state learning objectives for students. There is still no better guide for this process than Benjamin Bloom's 1956 work, Taxonomy of Educational Objectives. Bloom postulates a hierarchy of six cognitive domain categories, beginning with knowledge and comprehension at the simplest levels and proceeding through four increasingly complex levels: application, analysis, synthesis, and evaluation. Educational objectives, if stated using action verbs, virtually dictate how they should be assessed. Action verbs may be connected with each of Bloom's cognitive domain categories. The following chart provides some examples:
Cognitive Domain Category    Sample Verbs for Outcomes
Knowledge                    Identifies, defines, describes
Comprehension                Explains, summarizes, classifies
Application                  Demonstrates, computes, solves
Analysis                     Differentiates, diagrams, estimates
Synthesis                    Creates, formulates, revises
Evaluation                   Criticizes, compares, concludes
A matrix that lists such goals or objectives down one side and courses or measures across the top has proven very effective in helping faculty get started in assessment: objectives stated with action verbs point directly to the assessment measures to be used in the courses designed to help students develop them.
As an example of how one faculty has begun the process of setting learning outcomes, colleagues in the Department of English at Indiana University-Purdue University Indianapolis (IUPUI) recently scheduled a half-day retreat to initiate conversation about student learning outcomes. They developed a working draft and circulated it by email. Subsequently, every professor had an opportunity to offer additions, deletions, and comments on two additional drafts that were circulated via email. From that exercise came the following partial listing of goals for students majoring in English. Program graduates should be able to:
1. Demonstrate how language influences intellectual and emotional responses.
2. Apply knowledge of rhetorical context by writing effectively and appropriately in context.
3. Assess critically spoken, written, and visual representations.
4. Apply research strategies appropriate to the area of study.
5. Synthesize diverse issues and responses in collaborative discussions of text.
One of the most complex challenges we face is to design assessment that serves the purposes of improvement and accountability simultaneously. At Alverno College in Milwaukee and at King's College in Pennsylvania faculty articulate comprehensive learning outcomes for students in general education and the major (Alverno College Faculty, 1985; Farmer, 1988). Then a faculty committee monitors assessment activities to ensure quality of outcomes across sections, courses, and curricula. Assessment information is returned to each discipline for purposes of providing direction for improvement. Selected student performances illustrating the achievement of faculty-determined learning outcomes can be collected in succinct packages to demonstrate accountability to external stakeholders.
Methods of assessment may be quite varied, and fortunately, most of them are well known to all of us. I want to emphasize that assessment is not confined to the use of paper-and-pencil tests, though these are certainly quite useful in many situations. Other methods, which any faculty member may design, include individual or group projects, portfolios, observation of simulated practice, analysis of case studies, attitude or belief inventories, interviews, focus groups, and surveys.
Very often people think that outcomes assessment should be based principally on the use of standardized instruments. While it is true that standardized instruments usually offer higher levels of reliability and validity than locally-developed methods, the content of the standardized instruments may not sufficiently match that taught in the local setting. If a standardized instrument is to be used, it should be chosen carefully to reflect as accurately as possible the curriculum or program being assessed. Then faculty might engage in an exercise of setting expected scores on subscales of the instrument. When the actual scores are known, faculty can discuss their expectations, along with the reasons why they may not have been met. Such conversation may well lead to the decision to try some other assessment methods.
My best advice is to start with measures that you have, such as assignments in courses, course exams, actual work performance in an internship, and records of progress through the curriculum. As an illustration of the last item in this series, the student transcript can tell us much about the curriculum and the way each student experiences it, and we already have that record at our fingertips.
One of the new methods that I find very exciting is the use of primary trait scoring to give meaning to student grades. This technique, originally developed at the Educational Testing Service for scoring student essays, has been adapted for use in outcomes assessment by Barbara Walvoord and associates at the University of Cincinnati and the University of Notre Dame (Walvoord & Anderson, 1998). Primary trait scoring involves identifying the traits necessary for success in an assignment, then composing a scale or rubric that gives clear definition to each point. Finally, one grades student work using the rubric. For example, if the desired trait is self-expression of a feeling evoked by an event, a scoring rubric might consist of the following five points:
1 = no real expression presented
2 = feelings expressed but inadequately described
3 = expression generally competent
4 = feelings expressed in great detail
5 = detail expressed plus an intelligent response to the feelings evoked
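Because the same rubric scores can serve program evaluation as well as individual grading, it may help to see how they might be tallied for a group. The Python sketch below aggregates hypothetical scores on the trait above; the sample scores and the "competent or better" threshold are assumptions for the example, not real data or Walvoord's procedure.

```python
# Illustrative sketch: applying a primary trait rubric and aggregating
# scores across students for program-level review. The scores below
# are invented for the example.
from statistics import mean

RUBRIC = {
    1: "no real expression presented",
    2: "feelings expressed but inadequately described",
    3: "expression generally competent",
    4: "feelings expressed in great detail",
    5: "detail expressed plus an intelligent response to the feelings evoked",
}

def summarize(scores):
    """Return the group mean and the share of students at 3 or above."""
    return mean(scores), sum(s >= 3 for s in scores) / len(scores)

scores = [2, 3, 4, 3, 5, 2, 4]   # one score per student on this trait
avg, proficient = summarize(scores)
print(f"mean = {avg:.2f}; {proficient:.0%} at 'competent' or better")
# prints: mean = 3.29; 71% at 'competent' or better
```

The individual scores still function as grades; it is the collective summary that feeds program improvement.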
A very popular assessment method is the portfolio. A portfolio may include course assignments, research papers, materials from group projects, artistic productions, self-reflective essays, correspondence, and taped presentations (Magruder & Young, 1996). The portfolio is most valuable when the student offers a reflective essay explaining why each artifact was included and how it illustrates a particular kind of learning. Electronic portfolios are also beginning to find their way into the tool boxes of some evaluators.
Critical thinking is a skill or ability that most academics hope to foster as they instruct students. Peter and Noreen Facione of Santa Clara University have developed the Holistic Critical Thinking Scoring Rubric (Facione & Facione, 1996). They indicate that this rubric can be used to assess any written material for the level of critical thinking demonstrated.
Group interaction, or teamwork, is another area that faculty increasingly would like to assess. Faculty in the Department of Pharmacy Practice (1993) at Purdue University have developed a rating guide that can be used to evaluate the interaction of students in a group. Some of the characteristics included in this checklist follow:
1. Student actively contributed to discussion.
2. Student listened effectively to others.
3. Student was willing to alter own opinion when confronted with convincing evidence.
4. Student challenged others effectively.
5. Student effectively explained concepts/insights.
6. Student summarized/proposed a solution.
Faculty may score observed performance using a 5-point scale where 5 = consistently excellent, 3 = generally satisfactory, and 1 = inconsistent and/or inappropriate.
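Averaged across a class, ratings like these can point to the criteria on which a group of students needs help. The sketch below (a Python illustration with invented ratings, not the Purdue form itself) computes the mean per criterion and flags any that fall below the "generally satisfactory" midpoint.

```python
# Illustrative sketch: averaging 5-point ratings (5 = consistently
# excellent, 3 = generally satisfactory, 1 = inconsistent and/or
# inappropriate) per criterion to locate group weaknesses.
# All ratings below are invented for the example.
from statistics import mean

CRITERIA = [
    "contributed to discussion",
    "listened effectively",
    "altered opinion given evidence",
    "challenged others effectively",
    "explained concepts/insights",
    "summarized/proposed a solution",
]

# ratings[criterion] -> one score per observed student
ratings = {
    "contributed to discussion":      [5, 4, 3, 5],
    "listened effectively":           [3, 3, 2, 4],
    "altered opinion given evidence": [2, 3, 2, 3],
    "challenged others effectively":  [4, 3, 3, 4],
    "explained concepts/insights":    [4, 4, 3, 5],
    "summarized/proposed a solution": [3, 2, 3, 3],
}

for criterion in CRITERIA:
    avg = mean(ratings[criterion])
    flag = "  <- below satisfactory" if avg < 3 else ""
    print(f"{criterion:32s} {avg:.2f}{flag}")
```

A report like this turns individual observations into the group-level evidence that curriculum committees can act on.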
Senior projects are used quite often in engineering and technology, as well as in a number of other fields. At West Virginia University, engineering faculty have selected senior projects as a principal means of assessing student learning outcomes (Shaeiwitz, 1996). Students conduct their projects individually or in groups, then present them in written form and defend them orally before two faculty members. After all projects have been presented, there is a debriefing session wherein the two best projects are presented for students and faculty to observe. Faculty then summarize the strengths and weaknesses of all projects and circulate that summary to all faculty in the college. Faculty are asked to review their own courses based upon the relevant summary data they receive.
In the 1980s, when I was attempting to help faculty address the state's performance funding standards, we asked every department at the University of Tennessee, Knoxville to place itself on a five-year schedule for reporting on student outcomes. The faculty in the Department of Religious Studies could not agree on a common body of content on which to examine their seniors. Thus they placed themselves in the fifth year of the schedule! Finally, they decided that they could agree that every student should be able to demonstrate the skills of a scholar in the discipline, and designed a pro-seminar for seniors.
In the Religious Studies seminar, students are asked to identify a topic for a comprehensive paper, develop a bibliography for the paper, write an outline and a first draft, then provide a written paper and present it orally for critique by faculty and peers. At every step of the way a team of faculty looks at all the students' work both individually and collectively and gathers data for program improvement. As a result of considering their assessment findings, Religious Studies faculty are assigning more writing experiences and more library research in earlier courses so that students will be more fully prepared for the pro-seminar.
In the Department of Speech Communication at the University of Tennessee, Knoxville the faculty have developed a capstone course to assess eight core learning objectives (Julian, 1996). Their assessment activities include a student symposium, a set of abstracts, an annotated bibliography, a final paper, an oral critique, and mid-term and final exams. A matrix listing the eight learning objectives in the first left column and the symposium and other measures as column headings across the top enables faculty and students to see which method(s) assess student achievement of each objective.
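An objectives-by-measures matrix like the Speech Communication example is also easy to represent and check programmatically. The sketch below uses placeholder objective and measure names (the article does not enumerate the eight objectives, so these are hypothetical) to verify that every objective is assessed by at least one measure.

```python
# Illustrative sketch: a coverage matrix mapping each assessment
# measure to the learning objectives it addresses. Objective and
# measure names are placeholders, not the actual course materials.
measures = {
    "symposium":              {"obj1", "obj3", "obj5"},
    "annotated bibliography": {"obj2", "obj4"},
    "final paper":            {"obj1", "obj2", "obj6"},
    "oral critique":          {"obj3", "obj7"},
    "final exam":             {"obj4", "obj5", "obj8"},
}
objectives = {f"obj{i}" for i in range(1, 9)}

covered = set().union(*measures.values())
uncovered = objectives - covered
print("uncovered objectives:", sorted(uncovered))
# prints: uncovered objectives: []
```

Any objective that appears in the "uncovered" list signals a gap between what the faculty say they value and what their measures actually assess.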
At North Dakota State University graduate faculty have become involved in assessment (Murphy, 1996). They have set four generic student outcomes for every program area, including effective communications, both oral and written; the ability to solve problems in one's own field; the capacity to carry out research projects; and demonstration of knowledge in the chosen discipline. Faculty-developed rating forms allow instructors to apply multiple-level criteria to course projects, papers, research proposals, dissertations, theses, and comprehensive papers; exam responses in courses; comprehensive written exams; oral presentations in seminars; and final oral exams.
Surveys for faculty, students, administrators, former students, graduates, employers, and community members can add enormously helpful information in an assessment program. Individual interviews and focus groups can probe the depth of survey responses as well as demonstrate development over time. These more personal methods of assessment have the additional benefit of providing interaction between interviewer and interviewee.
Self-reports are also extremely valuable in the assessment process. At Indiana University-Bloomington students were surveyed to find out how they were spending their time. Faculty were interested to note that the modal time for studying is between midnight and 2:00 a.m.! At Miami University in Ohio, students were asked to wear beepers and to report what they were doing when the beeper sounded. Miami faculty were intrigued to learn how much time students were spending walking from one place to another on campus. This prompted the initiation of some education about how one could practice a foreign language or mentally rehearse an upcoming class activity while walking. Self-reports can also provide details about social interactions. Diaries, journals, and portfolios are excellent self-report media.
If you want to know something about students--ask them! C. Robert Pace, professor emeritus at UCLA, has demonstrated in a lifetime of work that students can reliably tell us about the extent of their growth in areas we consider important, such as reading comprehension, applying math skills, communicating in writing or orally, applying scientific principles, working independently, and working with others in groups (Pace, 1990). The College Student Experiences Questionnaire (CSEQ) uses a scale that asks students about their growth in such areas as well as the extent of their involvement in such campus activities as using the library, talking with faculty, attending cultural events, taking advantage of recreational facilities, attending meetings of clubs and organizations, and taking part in residence hall or fraternity/sorority activities. The CSEQ provides an indirect measure of student learning since research has shown that time on task and engagement enhance learning. Using the CSEQ we can gauge student involvement in learning activities.
All of the foregoing assessment methods can be applied in the cognitive realm. Many of us are also interested in students' non-cognitive development. For instance, we want students to develop autonomy, sensitivity to the needs of others, a sense of purpose, reliability, capacity for self-directed learning, leadership skills, respect for cultural diversity, and interest in life-long learning. The Mental Measurements Yearbook (Mitchell, 1995) published by the Buros Institute at the University of Nebraska-Lincoln is an excellent source for non-cognitive measures. Some of these include Rest's Defining Issues Test, which provides a measure of moral reasoning; McBer's Picture Story Exercise, a measure of motivation; Allport's Study of Values; and Alexander Astin's Freshman Survey, which includes measures of humanitarian values and civic involvement.
Ensuring Success in Assessment
I'll close with some recommendations for achieving and sustaining successful outcomes assessment initiatives over time. These are drawn from the literature and my own experience in visiting campuses in 36 states (Banta & Associates, 1993; Banta, Lund, Black & Oblander, 1996).
Every study of success in assessment has emphasized the importance of committed leadership. A supportive president and/or chief academic officer must provide opportunities for faculty to learn about assessment, to collaborate in setting curricular outcomes, and to be recognized and rewarded for spending time on assessment. Central administrators can also signal their commitment by appointing someone to coordinate assessment campus-wide. While a committee may be able to carry out this coordinating role, it is usually best to give an influential faculty member some released time to provide part-time or full-time leadership for assessment.
Assessment is strongest when multiple stakeholders are involved from the outset. Faculty should consult students, faculty in related disciplines, graduates of their program, employers of their graduates, and even interested individuals in the community as they develop learning outcomes and assessment strategies, collect and interpret data, and plan responsive improvements. In addition, assessment is strengthened by having every sector of the campus involved. Students spend about 85 percent of their time outside class. Thus student affairs and even administrative staff can support learning and assessment if they understand faculty goals and learning and assessment strategies.
Finally, the results of assessment need to be discussed and considered by all who have an interest in a program. Administrators can help to ensure that assessment is followed by appropriate improvement actions by asking for assessment data in such important on-going institutional processes as planning, budgeting, program review, fund-raising, marketing, student recruitment, and promotion and tenure procedures. Persistent follow-up--provided most reliably by an assessment coordinator--is critical if well-laid plans for assessment are to be implemented and sustained over time.
Alverno College Faculty. (1985). Assessment at Alverno College (Rev. ed.). Milwaukee, WI: Alverno Productions.
Banta, T. W., & Associates. (1993). Making a difference: Outcomes of a decade of assessment in higher education. San Francisco: Jossey-Bass.
Banta, T. W. (Ed.). (1986). Performance funding in higher education: A critical analysis of Tennessee's experience. Boulder, CO: National Center for Higher Education Management Systems.
Banta, T. W., Lambert, J. L., & Black, K. E. (in press). Collaboration counts: The importance of cooperative work in assessing outcomes in higher education. In M. E. Dal Pai Franco & M. C. Morosini (Eds.) International assessment: Collaboration and networking (English working title). Porto Alegre, Brazil: Grupo de Estudo sobre Universidade-GEU.
Banta, T. W., Lund, J. P., Black, K. E., & Oblander, F. W. (1996). Assessment in practice: Putting principles to work on college campuses. San Francisco: Jossey-Bass.
Barr, R. B., & Tagg, J. (1995). From teaching to learning - A new paradigm for undergraduate education. Change, 27(6), 12-25.
Bloom, B. S. (Ed.). (1956). Taxonomy of educational objectives: The classification of educational goals. Handbook I: Cognitive domain. New York: Longmans, Green.
Department of Pharmacy Practice. (1993). Group interaction assessment form. West Lafayette, IN: Department of Pharmacy Practice, Purdue University.
El Khawas, E. (1995). Campus trends. Higher Education Panel Report Number 85. Washington D.C.: American Council on Education.
Ewell, P. T. (1984). The self-regarding institution: Information for excellence. Boulder, CO: National Center for Higher Education Management Systems.
Ewell, P. T. (1996). The current pattern of state-level assessment: Results of a national inventory. Assessment Update, 8(3), 1-2, 12-13, 15.
Facione, P. A., & Facione, N. C. (1996). Critical thinking ability: A measurement tool. Assessment Update, 6(6), 12-13.
Farmer, D. W. (1988). Enhancing student learning: Emphasizing essential student competencies in academic programs. Wilkes-Barre, PA: King's College.
Julian, F. D. (1996). The capstone course as an outcomes test for majors. In T. W. Banta, J. P. Lund, K. E. Black, & F. W. Oblander (Eds.), Assessment in practice: Putting principles to work on college campuses (pp. 79-82). San Francisco: Jossey-Bass.
Magruder, W. J., & Young, C. C. (1996). Portfolios: Assessment of liberal arts goals. In T. W. Banta, J. P. Lund, K. E. Black, & F. W. Oblander (Eds.), Assessment in practice: Putting principles to work on college campuses (pp. 171-174). San Francisco: Jossey-Bass.
Mitchell, J. V. (Ed.) (1995). The mental measurements yearbook of the School of Education, Rutgers University. Lincoln: Buros Institute of Mental Measurements, University of Nebraska-Lincoln.
Murphy, P. D. (1996). Assessing student learning in graduate programs. Assessment Update, 6(6), 1-2.
Pace, C. R. (1990). The undergraduates: A report of their activities and progress in college in the 1980s. Los Angeles: University of California, Los Angeles, Center for the Study of Evaluation.
Shaeiwitz, J. A. (1996). Capstone experiences: Are we doing assessment without realizing it? Assessment Update, 8(4), 4, 6.
Steele, J. M. (1996). Postsecondary assessment needs: Implications for state policy. Assessment Update, 8(2), 1-2, 12-13, 15.
Walvoord, B. E., & Anderson, V. J. (1998). Effective grading: A tool for learning and assessment. San Francisco: Jossey-Bass.