Monthly Archives: October 2012

Call for proposals: Small Data in a Big Data World

Panel at International Congress of Qualitative Inquiry (ICQI) 2013  
to be held May 15-18, 2013 on the campus of the University of Illinois, Urbana-Champaign

Recently the academic research world has been flooded with discussion of the uses and implications of "Big Data." For those of us whose research focuses on digital environments, this discussion includes conferences, grants, special publications, and job announcements that focus on Big Data and the computational turn in social science and humanities research.

Big Data is not necessarily defined by the size of the data set, for humanities scholars have long been interested in huge textual and image-based corpora. Instead, Big Data refers to the increasing complexity of relationships between data objects in a given set, often requiring large-scale computational and algorithmic resources for analysis. Small Data research, on the other hand, often begins with a theoretical (e.g., critical race theory) or methodological (e.g., case study or ethnography) approach, which is then applied to digital data drawn from less-popular websites, YouTube videos, or even individual blog posts and comments.

Unfortunately, the tools used to analyze Big Data seem to be shifting modes of thought about new media and digital research away from the theoretical and towards the scientistic. For example, in a recent article Bruns and Burgess (2012) argue that humanist, interpretive studies of social media are idiosyncratic, non-repeatable, and non-verifiable. Although Bruns and Burgess concede that there is space for traditional qualitative methods, they suggest that these methods need to be integrated and innovated upon in a Big Data context.

Given the increasing amount of attention (e.g., external funding, public policy, or student interest) that "Big Data" is accruing, where does this leave Small Data research and researchers? This panel seeks to explore the position of Small Data in relation to the discussion and/or use of Big Data. As the definition of Big Data is still in flux, we are using Bruns and Burgess (2012) to ground our individual presentations. We are seeking presentations that will explore a variety of views on this turn toward Big Data and its impact on the researched, the researcher, and academia.

References:

Bruns, A., & Burgess, J. (2012). Notes towards the Scientific Study of Public Communication on Twitter. Conference on Science and the Internet. Düsseldorf. Retrieved Oct. 8, 2012 from http://snurb.info/node/1678.

Individual presenters should submit a 150-word abstract to each of the organizers by Nov. 15, 2012.

Organizers:

Andre Brock
Assistant Professor
School of Library and Information Science
University of Iowa
andre.brock@gmail.com

Lois Ann Scheidt
Doctoral Candidate
School of Library and Information Science
Indiana University
lscheidt@indiana.edu

Info PhD student shows at VisWeek

Kyungho Lee, graduate student in Informatics and Fulbright fellow, was chosen to show his work at the VisWeek 2012 art show. It is featured here: http://blog.visual.ly/visweek-2012-art-show/

Talk: Developing an Information Extraction and Visualization Toolkit

HITC/AIIS Joint Seminar

Wendy Chapman
Associate Professor
Division of Biomedical Informatics, University of California, San Diego

Friday, October 26, 2012, 4:00 PM
Siebel Center for Computer Science, Room 3405

Live stream link: http://media.cs.illinois.edu/live/HITClive.asx

There are many barriers to developing NLP algorithms for clinical text and to applying NLP to clinical tasks. At UCSD, we are addressing some of the barriers through development of shared resources to assist developers in annotating text and evaluating NLP annotations. We are also developing shared resources to assist potential users of NLP in developing knowledge bases for particular clinical problems, in customizing NLP applications, and in visualizing the output of NLP annotations for clinical research and decision support. I will describe the Information Extraction and Visualization Toolkit (IE-Viz) we are developing to aid non-NLP experts in application of NLP to clinical tasks.

Bio: After studying linguistics, Dr. Chapman received her PhD from the University of Utah in Medical Informatics with a research focus in natural language processing (NLP). She spent ten years at the University of Pittsburgh and moved to the University of California, San Diego in 2010.  Dr. Chapman's work has mainly addressed extraction of information from clinical reports, including identifying evidence of acute bacterial pneumonia from chest radiography reports and evidence of conditions relevant to detecting disease outbreaks from emergency department reports. She has developed an information extraction system called Topaz that maps text to concepts from a user's knowledge base and uses the ConText algorithm to assign attribute values for negation, experiencer, and historicity. Dr. Chapman led the American Medical Informatics Association NLP Working Group from 2008 until 2012 and is collaborating on efforts to develop shared conventions for NLP. She is working on several collaborative grants creating visualization tools for NLP output and developing infrastructure for NLP development and application.

www.healthit.illinois.edu

Talk: Crowd-Powered Systems

HCI Seminar Talk

Title:  Crowd-Powered Systems
Who:  Michael Bernstein, Assistant Professor, Stanford University
When:  Tuesday, October 16, 2012,  11am
Where:  3403 Siebel Center

Abstract:

Crowd-powered systems combine computation with human intelligence, drawn from large groups of people connecting and coordinating online. These hybrid systems enable applications and experiences that neither crowds nor computation could support alone.

Unfortunately, crowd work is error-prone and slow, making it difficult to incorporate crowds as first-order building blocks in software systems. I introduce computational techniques that decompose complex tasks into simpler, verifiable steps to improve quality, and optimize work to return results in seconds. These techniques advance crowdsourcing into a platform that is reliable and responsive to the point where crowds can be used in interactive systems.

In this talk, I will present two crowd-powered systems to illustrate these ideas. The first, Soylent, is a word processor that uses paid micro-contributions to aid writing tasks such as text shortening and proofreading. Using Soylent is like having access to an entire editorial staff as you write. The second system, Adrenaline, is a camera that uses crowds to help amateur photographers capture exactly the right moment for a photo. It finds the best smile and catches subjects in mid-air jumps, all in real time. These systems point to a future where social and crowd intelligence are central elements of interaction, software, and computation.

Bio:

Michael Bernstein is an Assistant Professor of Computer Science at Stanford University. His research in human-computer interaction focuses on the design of crowdsourcing and social computing systems. This work has received Best Paper awards and nominations at premier venues in human-computer interaction and social computing (ACM UIST, ACM CHI, ACM CSCW, AAAI ICWSM), and it has appeared in venues such as the New York Times, Slate, CNN, and The Atlantic. Michael has been awarded the NSF graduate research fellowship, the Microsoft Research PhD fellowship, and the George M. Sprowls Award for best doctoral thesis in Computer Science at MIT. He holds Ph.D. and master's degrees in Computer Science from MIT, and a B.S. in Symbolic Systems from Stanford University.

Atlas.ti 7 data analysis program available on campus

ATLAS is excited to announce that we now have Atlas.ti 7 available on all of the computers in our lab located at Lincoln Hall 2043. Atlas.ti is a qualitative data analysis program designed to help researchers manage and analyze non-numerical data such as text, video, audio and graphics in a meaningful and systematic way. Students and faculty engaged in qualitative or mixed-methods research are invited to use the lab or request a consultation at:

http://www.atlas.illinois.edu/services/stats/consulting/

Assistantships Available for Spring Semester

Center for People and Infrastructures, Coordinated Science Laboratory

http://www.csl.illinois.edu/infra-center

The new Center for People and Infrastructures (located in the Coordinated Science Laboratory at Illinois) seeks curious and talented graduate students for work on a new project exploring use patterns, tracking capabilities, and pricing structures of consumer-level broadband internet access. The goal of the project is twofold: firstly, to better understand how users employ high-speed uploading and downloading possibilities, and secondly, to predict the level of detail with which providers might track such use for purposes of determining pricing schemes.

Candidates should come with curiosity about the subject matter and a willingness to participate in a cross-disciplinary discussion that includes researchers in engineering, social science, communication, and design.

Useful skills include any of the following:

- a background in network analysis or engineering

- data analysis and visualization prototyping

- interaction design and implementation for browser-based applications (HTML/JavaScript)

- social science survey design and implementation

Opportunities for hourly labor and research assistantships are available. To apply, send a CV and a statement of interest/experience to Kevin Hamilton, Co-Director of the Center for People and Infrastructures (kham@illinois.edu). (Transcripts may be requested at a later date.)