Monday, February 22, 2016

Some Rubricy Details for the Text Mining/Data Visualization Assignment

The core of this assignment is that you engage with text mining or data visualization tools, such as those at Voyant Tools that we used in the Frank Norris project.

First, you are to strip down the text files for your assigned range of issues of IF: Worlds of Science Fiction. These files should be uploaded to our class Google Drive (DH) so all students can access the entire corpus. Then, based on your reading (at least one issue of the magazine's run), you will design a text mining or data visualization project using this corpus. This project can take many forms, but it should be justified by your reading. I suggest building word lists based on themes you discover through reading or based on the time period you are exploring.
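If it helps to see the word-list idea concretely, here is a minimal Python sketch of one way to count theme words in a text. It is only an illustration under assumptions of my own: the sample sentence, the theme words, and the function names `tokenize` and `theme_counts` are all made up for this example, and it treats "stripping down" as nothing more than lowercasing and keeping the words. You are not required to use Python; Voyant Tools can do similar counts for you.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def theme_counts(text, theme_words):
    """Return how many times each word in theme_words appears in text."""
    counts = Counter(tokenize(text))
    # Counter gives 0 for words that never appear, so missing themes show up too.
    return {word: counts[word] for word in theme_words}


# Hypothetical sample text and word list, invented for illustration only.
sample = "The rocket left Mars. The ROCKET returned; robots waited on Mars!"
themes = ["rocket", "mars", "robot"]

result = theme_counts(sample, themes)
print(result)
```

Notice that "robot" is counted as 0 here even though "robots" appears, because this sketch matches exact words only. That kind of gap is worth writing about in your blog post. To run it on a real issue, you could read a file into `sample` first, for example with `open("your_issue.txt", encoding="utf-8").read()`, where the filename is a placeholder for whatever you named your stripped file.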

How you use the tools is up to you. I do want to see a project that attempts to explore several key ideas (don't just generate a single word cloud and be done), but these ideas could cut across several smaller explorations or encompass a single, larger one. Remember that projects are not definitive. You are also not meant to be masters of the technology or the material--there will be questions raised, and results that do not fit your expectations. Be prepared to fail, but keep in mind that failures are part of the humanities, especially the digital humanities. Fail well.

You are to blog about this project and should include the following:

    -Description of your process and the thinking that led to its development.

    -Description of your results and your interpretation. How much more information might you need? How could this process or project be improved moving forward? What did you learn about the literature and the process?

    -Description of challenges you faced, and how they might change what you would try in a future project. What might students who come after you need to know to build on your work?

    -What is the value of such examinations? What can we learn about literature? Are there limitations that simply cannot be overcome?

I would love to see some examples included in your blog--sample data or visualizations.