College & Research Libraries News
A new approach to reference statistics
Assistant Reference Librarian, Louisiana State University Libraries
Many reference librarians consider collecting reference desk statistics a waste of time. All too often they are right, in two ways. Time-consuming statistical procedures performed every hour of every day by reference professionals “waste” time by taking it away from patron service; and statistics systems designed without a sound scientific basis are a waste of anyone’s time.
Before July 1986, the method of collecting reference desk statistics at Louisiana State University Libraries suffered from both of these common failings. Librarians and paraprofessionals assigned to the reference desk were expected to record every patron question with a tally mark on a statistics sheet which classified questions by hour of day and by type (information, reference, research, or card catalog). This system was a distraction at best, and positively hindered reference service during hours of peak usage. At such times, the desk staff found it practically impossible to record every single question; some left many questions unrecorded, while others set down large numbers of marks at random, simply to reflect how “busy” the shift had been.
To make matters worse, there was considerable variety in staff interpretations of the basic question categories (information, reference, research, card catalog). Also, the library administration wanted statistics kept on the types of patrons (faculty, students, etc.) served at the reference desk, which would have made statistics taking hopelessly cumbersome under the system in use at that time.
Toward the end of fiscal year 1985/86, sentiments were strongly in favor of a new approach to reference desk statistics. The ideal system would collect statistics by both question type and patron type in a more scientific and statistically sound manner while freeing the desk staff to concentrate on the information needs of patrons.
The new approach
Reference Services Division head Jane P. Kleiner became interested in sampling reference desk statistics when she served as ACRL liaison to the Public Library Development Task Force. After Douglas L. Zweizig (author of Output Measures for Public Libraries) spoke to the LSU Libraries staff about output measures and the successes other college and university libraries have had with statistical sampling, it was decided that the LSU Libraries Reference Services Division should try a similar approach. Rather than attempt to record every question asked during approximately 4,000 hours of service throughout the year, a small number of selected hours would be designated as statistics sessions, and statistics would be recorded only during those hours. The exact number and distribution of the hour-long statistics sessions would be determined as a statistical sample from a total “population” of 4,000+ hours.
The Reference Department turned to William G. Warren and Kung-Ping P. Shao of the LSU Department of Experimental Statistics for advice. They studied the reference desk statistics for a typical month (April) of the previous year, and concluded that the figures approximated a Poisson distribution—a statistical model used to predict the arrival of travelers at bus and train stations, among other things.
The LSU Libraries administration had decided that a 90% confidence level and an error range of ±10% for the total number of questions asked during the year would be sufficiently precise. Warren and Shao found in the April 1986 statistics an average (mean) rate of 32 questions per hour and a standard variance (using the Poisson model) of ±14 questions. Applying these figures to the equation below (a standard equation for determining the size of a simple random sample) yielded a sample size of 52 hours out of a total of 4,103 hours of service. Concerned that greater variations might occur in other months of the year, Warren and Shao suggested increasing the sample to 60 hours. They also emphasized the importance of distributing these 60 hours individually and randomly throughout the year in order to maintain the validity of the sample as a simple random sample.
n = (Zσ / (D·x̄))²
(where σ equals the standard variance, x̄ equals the mean rate of questions, D equals the acceptable range of error, and Z is the standard normal deviate determined by the confidence level)
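Expressed as a short calculation (a sketch in Python, using the figures reported above and assuming the standard 1.645 normal deviate for a 90% confidence level), the arithmetic runs as follows:

```python
# Sketch of the sample-size calculation, using the April 1986 figures.
mean_rate = 32      # x-bar: mean questions per hour
sigma = 14          # sigma: standard variance reported by Warren and Shao
error = 0.10        # D: acceptable range of error (plus or minus 10%)
z = 1.645           # Z: standard normal deviate for a 90% confidence level

n = (z * sigma / (error * mean_rate)) ** 2
print(round(n))     # about 52 hours, before the increase to 60
```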
The statistics sessions were distributed using a table of random numbers. Each hour of service in the fiscal year was numbered from 1 to 4,103, and 60 random numbers in that range were recorded. A list of the days, dates, and hours corresponding to those random numbers was compiled and rearranged into a calendar of statistics sessions for fiscal year 1986/87. Between two and nine sessions fell in each month, with an average of five per month.
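A comparable drawing could be sketched as follows, with a pseudorandom generator standing in for the printed table of random numbers; the schedule lookup is only indicated, since the 1986/87 desk calendar is not reproduced here.

```python
import random

TOTAL_HOURS = 4103   # numbered hours of reference service in the fiscal year
SAMPLE_SIZE = 60     # statistics sessions recommended by Warren and Shao

# Draw 60 distinct hour numbers at random, the equivalent of reading
# 60 entries from a table of random numbers.
session_numbers = sorted(random.sample(range(1, TOTAL_HOURS + 1), SAMPLE_SIZE))
print(session_numbers)

# Each number would then be matched against the desk schedule to find its
# day, date, and hour, producing the calendar of statistics sessions.
```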
New statistics sheets were designed, with columns for patron type (faculty, graduate student, undergraduate student, and “other”) and rows for question type (information, reference, research, and online catalog training—the LSU Libraries had recently converted to a NOTIS automated system with online catalog). Each sheet was to be used for one statistics session only. A single mark would indicate both patron type and question type, so the number of undergraduate students asking reference questions (for example) would be recorded for each session.
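In effect, each sheet is a small two-way tally of patron type against question type. A minimal sketch of that structure, with invented tally marks purely for illustration:

```python
from collections import Counter

PATRON_TYPES = ("faculty", "graduate student", "undergraduate student", "other")
QUESTION_TYPES = ("information", "reference", "research", "online catalog training")

# One Counter per statistics session stands in for one paper sheet;
# each key pairs a patron type with a question type, so a single entry
# records both at once.
sheet = Counter()
sheet[("undergraduate student", "reference")] += 1   # one tally mark
sheet[("faculty", "research")] += 1

# A column total on the printed sheet corresponds to summing over one patron type.
undergrad_total = sum(count for (patron, question), count in sheet.items()
                      if patron == "undergraduate student")
print(undergrad_total)
```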
Because it would be necessary to ask each patron whether he or she was a faculty member, graduate student, or undergraduate student, there was concern that the process would consume a great deal of desk staff time during statistics sessions. To avoid delaying service to patrons and possibly biasing the sample, it was decided to assign two of the department’s eight graduate assistants to take statistics during each session. The assistants would be able to concentrate fully on the statistics, while the desk staff could provide uninterrupted reference service. Also, the eight assistants could be trained to record statistics more uniformly than the varied group of librarians, paraprofessionals, and assistants assigned to the desk at various times of the day and week. To further ensure uniformity in statistics recording, a manual was written containing detailed definitions and examples of the four types of questions.
Results
At the end of the fiscal year, the figures from all the statistics sessions were totaled and multiplied by 68.383 (4,103 hours of service divided by 60 hours of statistics equals 68.383) to produce the totals used in the annual report. Questions were divided by source (in person or telephone), type (information, reference, research, online catalog training), and patron type (faculty, graduate student, undergraduate student, “other”), so there were 32 individual question categories (2 × 4 × 4 = 32). For example, in-person faculty information questions would be one individual question category. Individual categories were combined into columns (all in-person faculty questions, for example), rows (all in-person information questions, for example), and larger groupings, including a grand total of all questions asked. In all, the statistics yielded 53 separate data elements (totals of different combinations of patron and question type) for cross-comparison and comparison with the previous year’s figures in the annual report.
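A sketch of the extrapolation step, with invented session tallies standing in for the real totals (only the scaling factor and the category scheme come from the report):

```python
# Year-end extrapolation from sampled tallies to annual estimates.
SCALING_FACTOR = 4103 / 60   # about 68.383

# Hypothetical sampled totals for two of the 32 individual categories,
# keyed by (source, question type, patron type).
sample_totals = {
    ("in person", "information", "faculty"): 12,
    ("telephone", "reference", "undergraduate student"): 85,
}

annual_estimates = {category: round(count * SCALING_FACTOR)
                    for category, count in sample_totals.items()}

# Column, row, and larger groupings are then built by summing the
# appropriate subsets of the 32 individual categories.
grand_total = sum(annual_estimates.values())
print(annual_estimates, grand_total)
```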
Shao analyzed these data elements, using the original statistics sheets, to determine an error range for each element at a confidence level of 90%. For the grand total of all questions asked in the year, the error range was ±11.23%, very close to the desired range of ±10%. The more specific data elements had wider error ranges, because they comprised smaller parts of the sample population and in many cases had greater variance from one statistics session to the next. Some of the individual question categories had error ranges of ±50% and more; those figures were considered only indicative of a general range into which the actual number of questions might have fallen.
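The error range for any single data element can be approximated from the spread of its session-to-session counts. A rough sketch, with placeholder counts (the actual analysis drew on all 60 sessions) and the same 1.645 deviate assumed for 90% confidence:

```python
import statistics

# Placeholder per-session counts for a single data element.
session_counts = [28, 35, 19, 41, 30, 26]
z = 1.645   # standard normal deviate assumed for a 90% confidence level

mean = statistics.mean(session_counts)
std_error = statistics.stdev(session_counts) / len(session_counts) ** 0.5
relative_error = z * std_error / mean   # error range as a share of the estimate
print(f"plus or minus {relative_error:.1%}")
```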
Even with these levels of error, the new approach is considered far more reliable and statistically valid than the old approach, in which the inaccuracies were subjective (resulting from unpredictable human error) rather than objective and random, and were therefore not subject to measurement or control. The error levels in the present system could be reduced by increasing the sample size; however, the library administration has decided that the present level of accuracy is sufficient for the time being.
The reference staff have greatly appreciated their release from the tyranny of recording statistics every hour of every day, and have reported improved interactions with patrons as a result of greater freedom to concentrate on service. Graduate assistants were scheduled for only one or two statistics sessions in the average month, so the extra duty did not impose any great hardship. In fact, statistics duty frequently helped them make up hours lost during breaks and university holidays. The assistants became quite adept at statistics-taking during the year, and encountered surprisingly little difficulty asking each patron his or her status.
Small modifications have been made in the statistics system for fiscal year 1987/88. The most significant change is that the “other” category of patrons has been divided into four sub-categories (faculty and staff of other universities, other information professionals, elementary and secondary school students, and “other”). In general, however, the new sampling approach to reference desk statistics has been highly successful at LSU Libraries, and is considered worthy of emulation by other reference departments that would like to increase the accuracy of their desk statistics while decreasing the effort and resources devoted to collecting them. ■