
Digital Natives or Digital Strangers?
Teaching the Eighteenth Century Online, from Ctrl-F to Digital Editions

Allison Muri


SEARCHES, DATABASES, FIELDS, AND CATEGORIES

If “the Googlization of everything” (Vaidhyanathan n.p.) is upon us, my goal is to teach students to actively seek to understand how the texts they read are assembled and to exert some control in creating the pages they most want to read rather than to passively accept the database’s attempt to interpret keyword or natural language searching. Google’s excellence in this regard tends to foster certain misunderstandings when inexperienced users search scholarly databases with different interfaces and different means of determining authority and relevance. Google bases keyword search results on the frequency of those keywords on a given page, as well as on the length of time the page has existed and the number and quality of links to the page. It also ranks web pages based on the anchor text used to link to them, so the keywords do not even have to appear on the page that comes up in the search results. More than two hundred signals help the search engine to find what readers are looking for, based on contextual information: the algorithm is designed to distinguish between place names and the names of people, for example. Google’s algorithm also uses synonyms for keywords to expand relevant results and to provide alternatives for misspelled words (Levy; Google; googleblog).

The power of popular language usage and page popularity is key to Google’s pursuit of the “perfect search engine,” defined by co-founder Larry Page as one that “understands exactly what you mean and gives you back exactly what you want” (Google). Google does not, as professors habitually warn their students, determine whether the information on the page is researched, peer reviewed, or entirely the invention of a delusional mind; however, it does, almost magically, “understand” what information is being requested. If I type “meaning of innocence and experience” into Google’s search form, William Blake and his works dominate the first page of results, but these include SparkNotes, a MySpace page of musical recordings inspired by Blake, and 123helpme.com; the same term in EBSCO’s MLA International Bibliography returns scholarly articles, but the first page lists works on John Milton, William Blake, Benito Cereno, the Naudet brothers, Octavio Paz, Charles Dickens, Elizabeth Bowen, Andrew Marvell, Jean-Paul Sartre, and Albert Camus.

I need hardly note that students already know how to conduct research using Google, and some are very effective at searching and evaluating the results. As De Rosa et al. found in 2005, “89 percent of college student information searches begin with a search engine,” while library websites “were selected by just 2 percent of students as the source used to begin an information search” (De Rosa et al. 1-7). Sixty-eight percent reported Google was the search engine they used most recently, while Yahoo! was used by 15 percent and MSN Search by 5 percent (1-8). However, corroborating what I see in second- and third-year students’ essays, their evaluations of relevance and authority of the results are often naive or desultory. According to this report, “More than half of college students (53 percent) believe information from search engines is at the same level of trustworthiness as library information” (3-4).

In my experience, students do not know that Google’s PageRank algorithm incorporates “votes” (that is, the number of incoming links from other trusted sites) to help determine how useful any results will be for a given search term. Usually this system works very effectively and Google is, of course, a valuable and beneficial tool, but the fact that the library’s scholarly databases are largely foreign territory for undergraduates is a concern. Even when students have been introduced to these resources, the relatively undifferentiated display of results in some databases can lead inexperienced users to assume that the MA thesis or newspaper article they have found is equivalent to a peer-reviewed publication. Even experienced students can have difficulty deciphering whether they are searching only titles and abstracts, or also full-text articles, and thus how best to frame their search terms. A significant number of students assume that search engines can interpret questions and phrases as natural language (one student noted in frustration that the MLA Bibliography database found no results for a search on “Blake” AND "significance of illumination" OR "meaning of illumination"), and they frequently find the library databases difficult and intimidating to use. While most students know how to search for a specific phrase by placing the words in quotation marks, other limiters such as Boolean operators, wildcards, and proximity searches are frequently unfamiliar and poorly understood.


Accordingly, in an admittedly artificial exercise, I ask my students to create complex search terms in library databases using these operators as creatively as they can to generate sets of results that might provide answers to a given research question (see Online Research Project). To introduce them to Boolean searching, I show my students how to narrow a large set of results from a broad keyword search down to a more manageable set. The library’s catalogue interface works well, or for students in eighteenth-century studies, the Eighteenth Century Collections Online (ECCO) database interface proves especially effective since we can conduct both author-title field searching of records and full-text searching of works. For example, searching for “Alexander Pope” as a keyword in ECCO results in 1065 hits. The difference between keyword searching and field searching is immediately obvious (in ECCO, the term “Alexander Pope” is searched as “Alexander n4 Pope,” that is, Alexander within four words of Pope in either order, and thus the results include such works as The Ancient and Present State of Glocestershire by Sir Robert Atkyns the Younger, the full text of which contains the words “Pope Alexander”). Searching for “Alexander Pope” in the author field narrows results to 853 (Fig. 2).

Fig. 2
Results of a search for the words Alexander Pope in the author field in ECCO. Used with permission from Eighteenth Century Collections Online (ECCO) from Gale, a part of Cengage Learning.


The first eight of these are Letters of Abelard and Heloise by Peter Abelard. Students will readily enough determine why this is so: the title of each also includes variations on “To which is added” or “Together with” “The Poem of Eloisa to Abelard By Mr. Pope.” I ask: What if you wanted to exclude the poem from the resulting set? While students may not recognize the word “Boolean,” they usually know that dropdown menus containing these operators can filter the results further, for example, by specifying “NOT Abelard” in the title field (Fig. 3).

Fig. 3
Boolean search for Alexander Pope NOT Abelard [in title] on ECCO’s Advanced Search page. Used with permission from Eighteenth Century Collections Online (ECCO) from Gale, a part of Cengage Learning.


The result set for the second search is 830 texts. Scrolling down, we can see that several pages of results consist of translations of The Iliad and Odyssey of Homer and The Iliad of Homer: if we were more interested in Pope’s original works than in his translations, how would we further exclude these titles? Students will suggest excluding Homer from the title field as well, but when I ask them whether we use AND, OR, or NOT in the next dropdown, they often suggest OR, a reasonable assumption since in natural language we might say “Alexander Pope but not Abelard or Homer” (Fig. 4).

Fig. 4
Boolean search for Alexander Pope NOT Abelard [in title] OR Homer [in title] in ECCO’s Advanced Search page. Used with permission from Eighteenth Century Collections Online (ECCO) from Gale, a part of Cengage Learning.


However, when we try this term (Fig. 4) the result set is 1007, higher than the 853 results of the original search for Pope as author. I ask the students to determine the reason for this result: when using these menus, the search engine applies the operators in the order in which they appear. The search in Fig. 4 is equivalent to:

(all works in the database where the author field contains Alexander within 4 words of Pope and the title field does not contain Abelard)

OR

(all works in the database where the title field contains Homer)

Accordingly, the results include works such as Proofs of the Enquiry into Homer’s Life and Writings, Translated into English by Thomas Blackwell. The alternative, “NOT Homer,” results in a much smaller set of 645 results (Fig. 5).

Fig. 5
Boolean search for Alexander Pope NOT Abelard [in title] NOT Homer [in title] in ECCO’s Advanced Search page. Used with permission from Eighteenth Century Collections Online (ECCO) from Gale, a part of Cengage Learning.


This latter search is equivalent to nesting the terms (Abelard OR Homer), as shown in Fig. 6.

Fig. 6
Boolean search for Alexander Pope NOT (Abelard OR Homer) [in title] in ECCO’s Advanced Search page. Used with permission from Eighteenth Century Collections Online (ECCO) from Gale, a part of Cengage Learning.
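The difference between the searches in Figs. 4, 5, and 6 can be sketched with ordinary set operations. What follows is only a minimal illustration, not ECCO’s implementation: the five records are a toy catalogue invented for the purpose, the function names `field_has` and `run_left_to_right` are hypothetical, and the sketch assumes, as described above, that the rows of the search form are applied strictly in the order they appear.

```python
# Toy catalogue: invented records standing in for database entries.
RECORDS = {
    1: {"author": "Alexander Pope", "title": "An Essay on Man"},
    2: {"author": "Alexander Pope", "title": "The Iliad of Homer"},
    3: {"author": "Peter Abelard; Alexander Pope",
        "title": "Letters of Abelard and Heloise"},
    4: {"author": "Thomas Blackwell",
        "title": "Proofs of the Enquiry into Homer's Life and Writings"},
    5: {"author": "Jonathan Swift", "title": "A Tale of a Tub"},
}


def field_has(field, word):
    """Return the ids of records whose given field contains the word."""
    return {rid for rid, rec in RECORDS.items()
            if word.lower() in rec[field].lower()}


def run_left_to_right(first, rows):
    """Apply (operator, set) rows in order, as a dropdown form might."""
    result = set(first)
    for op, term in rows:
        if op == "AND":
            result &= term
        elif op == "OR":
            result |= term
        elif op == "NOT":
            result -= term
        else:
            raise ValueError(f"unknown operator: {op}")
    return result


pope = field_has("author", "Pope")
abelard = field_has("title", "Abelard")
homer = field_has("title", "Homer")

# Fig. 4: (Pope NOT Abelard) OR Homer -- the OR re-admits every Homer title.
fig4 = run_left_to_right(pope, [("NOT", abelard), ("OR", homer)])
# Fig. 5: (Pope NOT Abelard) NOT Homer.
fig5 = run_left_to_right(pope, [("NOT", abelard), ("NOT", homer)])
# Fig. 6: Pope NOT (Abelard OR Homer), written with explicit nesting.
fig6 = pope - (abelard | homer)

print(sorted(fig4), sorted(fig5), sorted(fig6))
```

In this toy catalogue the Fig. 4 query returns Blackwell’s book on Homer even though Pope is not its author, while the Fig. 5 and Fig. 6 queries return the same single record, which is the equivalence the students are asked to discover.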


At this point, judicious keyword searching can narrow the results to a specific set in which we might read what Pope wrote about genius and poetry in his original works: we exclude the variants “translated/translation” from the titles and find instances of “poet/poets/poetry/poetic” within five words of “genius” in the texts (Fig. 7).

Fig. 7
Boolean search for (Alexander Pope NOT (Abelard OR Homer) [in title] NOT translat* [in title]) AND (poet* n5 genius [in full text]) in ECCO’s Advanced Search page. Used with permission from Eighteenth Century Collections Online (ECCO) from Gale, a part of Cengage Learning.
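For readers curious about what the wildcard and proximity operators are doing, the following is a rough sketch of how terms such as “translat*” and “poet* n5 genius” might be matched against a text. It is an approximation under stated assumptions, not a description of ECCO’s search engine: the helper names (`tokenize`, `wildcard_to_regex`, `near`) are hypothetical, and the sketch assumes that “within n words” means the two word positions differ by at most n, in either order.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def wildcard_to_regex(term):
    """Compile a truncation term such as 'poet*' into a regex for one word."""
    pattern = re.escape(term.lower()).replace(r"\*", "[a-z]*")
    return re.compile(pattern)


def matches_any_word(text, term):
    """True if any word in the text matches the (possibly truncated) term."""
    term_re = wildcard_to_regex(term)
    return any(term_re.fullmatch(tok) for tok in tokenize(text))


def near(text, left, right, n):
    """True if a word matching `left` lies within n words of one matching
    `right`, in either order (assumed meaning of the 'n' operator)."""
    tokens = tokenize(text)
    left_re, right_re = wildcard_to_regex(left), wildcard_to_regex(right)
    left_pos = [i for i, t in enumerate(tokens) if left_re.fullmatch(t)]
    right_pos = [i for i, t in enumerate(tokens) if right_re.fullmatch(t)]
    return any(abs(i - j) <= n for i in left_pos for j in right_pos)


# 'Alexander n4 Pope' also matches the reversed order 'Pope Alexander'.
reversed_order = near("Pope Alexander the Sixth", "Alexander", "Pope", 4)

# 'translat*' catches both 'Translated' and 'translation' in a title.
is_translation = matches_any_word("The Iliad of Homer. Translated by Mr. Pope",
                                  "translat*")

# 'poet* n5 genius': close together matches, far apart does not.
close = near("the true genius of a poet", "poet*", "genius", 5)
far = near("poetry was then little read and less esteemed by men of genius",
           "poet*", "genius", 5)
```

The sketch also makes the limits of the search term visible: a word that has been misread or spelled unexpectedly simply fails to match, and nothing in the result set signals the omission.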


Discussing the implications of dirty Optical Character Recognition (OCR) and the inherent limitations of the search term is important, as even mature students may assume the tool’s infallibility and comprehensiveness. Such an exercise might seem to aid the supposed evolution toward “shallow” reading, but it should in no way encourage a student not to read and further study entire texts. Rather, the lesson demonstrates how digital information is categorized and filtered and how much (or little) control readers have over the contents of the digital pages they retrieve. It should, as well, allow them to compare, to recognize, to articulate, and eventually even to redress the limitations of poorly designed or deliberately constrained interfaces.


Reading online provides an instrument of access to a rich heritage of the written word. It is hardly the “nightmare scenario . . . of efficient and prosperous information managers living in the shallows of what it means to be human” (Birkerts 194). We cannot afford, as Birkerts famously advised, to “refuse it” (229). Nevertheless, there continues to be a fear of reading and writing online, a fear that I worry translates into diffidence, sometimes outright hostility, toward teaching these technical skills and online texts in the humanities. Nicholas Carr, for example, creates a progressive narrative of the “deepening page,” a cultural evolution from the “intellectual ethic” of orality as mere communal record, to an enriched consciousness wrought by “the book’s ethic of deep, attentive reading”: “The ideas that writers could express and readers could interpret became more complex and subtle, as arguments wound their way linearly across many pages of text. As language expanded,” he writes, “consciousness deepened” (75). However, Carr warns, the “world of the screen . . . is a very different place from the world of the page. A new intellectual ethic is taking hold” (77): “Our indulgence in the pleasures of informality and immediacy has led to a narrowing of expressiveness and a loss of eloquence” (108). In terms of the search engine, Carr writes of how

our attachment to any one text becomes more tenuous, more provisional. Searches also lead to the fragmentation of online works. A search engine often draws our attention to a particular snippet of text, a few words or sentences that have strong relevance to whatever we’re searching for at the moment, while providing little incentive for taking in the work as a whole. (91)

Carr concludes that the “strip-mining of ‘relevant content’ replaces the slow excavation of meaning” (166). My readers, I am sure, will recall Jonathan Swift’s demonizing of index learning and compendiums assembled “without the fatigue of reading and thinking,” the gathering and alphabetizing of quotations at the expense of consulting the authors, the certainty that the modern writer’s head will thus “be empty provided his commonplace book be full” (70–2). Now, as then, processes of gathering relevant quotations, of indexing, collation, and retrieval, are vital to the deeper synthesis that scholarship entails, regardless of the technology.


 


Contact Us - 2009 Digital Defoe | ISSN 1948-1802 (online)