Why people are the real power behind big data | Pearson Labs

10 Jun

Kristen DiCerbo is a research scientist at Pearson, and co-author of Impacts of the Digital Ocean on Education, where she makes the case for the potential of data to transform teaching by using it to decipher how students learn. 

Responding to a post on The Guardian by professor Diana Laurillard of the University of London, Kristen argues that realising the full power of data is as much about bringing the right people together as it is about the digital tools we employ.

***

There is a lot of buzz around the promise of “big data” to transform education. Last week on The Guardian site, Diana Laurillard suggested lecturers take control of data collection and analysis in higher education. She argues that teachers are the ones who know what they need to inform course design, and that they can combine data from their classes with data from other lecturers using the same design to generate larger data sets and better answer education questions.

The comments section of the article is very impressive in its consideration of the issues here. In it, Laurillard clarifies that she is talking about, “a different kind of ‘big data’.” The article is meant to suggest we redefine what big data means away from a data science definition (an amount of data that is so large our traditional systems can’t handle it) toward a lecturer-centric definition that means outcomes from a number of lecturers using the same learning design.

I would argue that Laurillard has accurately defined the problem, but misses on the solution.

The Problem

The conversations around big data in education are largely driven by computer and data scientists, people coming out of technology companies. The lack of educators in the discussion is apparent at innovation summits and meetings of start-ups where the promise of big data is touted.

This leads to a lot of problems with interpretation of results and recommendations from findings. There is some allusion in the comments to the problem of confusing correlation in the data with causation. One commenter suggests it is not such a big deal, but I have seen this in action. Analysis of a variety of learning management systems has consistently shown correlations between logging in to class on the first day and successful course completion. Based on this, I have seen programs underway to make sure students are logging in the first day. Will this really increase successful course completion? While the correlation is an empirical fact, we must remember that just because two things are related does not mean that one causes the other. Rather, it is likely that learners who are more motivated and/or organized are more likely to log in on the first day and do better in the course. Logging in on the first day is an indicator, but not a cause. Interventions might be better aimed at improving organizational or time management skills, for example. Would having lecturers in control of the process make it more likely that we would dig more deeply into questions of why we see given patterns and what to do to address them? Highly likely.
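To make the first-day login example concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not real learning management system data: a hidden `motivation` score drives both the chance of logging in on day one and the chance of completing the course, and logging in has no effect of its own. The function names and probabilities are hypothetical.

```python
import random

def simulate(n_students=10_000, force_login=False, seed=0):
    """Simulate students whose hidden motivation drives both behaviours.

    Assumption: completion depends only on motivation, never on logging in.
    """
    rng = random.Random(seed)
    students = []
    for _ in range(n_students):
        motivation = rng.random()      # hidden trait between 0 and 1
        login_draw = rng.random()      # drawn even when login is forced,
        complete_draw = rng.random()   # so both runs describe the same students
        logged_in = True if force_login else login_draw < motivation
        completed = complete_draw < 0.2 + 0.6 * motivation
        students.append((logged_in, completed))
    return students

def completion_rate(students, logged_in=None):
    """Completion rate for all students, or only those with the given login status."""
    selected = [done for (login, done) in students
                if logged_in is None or login == logged_in]
    return sum(selected) / len(selected) if selected else float("nan")

observed = simulate()
forced = simulate(force_login=True)  # an intervention that makes everyone log in

print("logged in on day one:", completion_rate(observed, logged_in=True))
print("did not log in:      ", completion_rate(observed, logged_in=False))
print("overall, observed:   ", completion_rate(observed))
print("overall, forced:     ", completion_rate(forced))
```

In this toy world, students who log in on day one complete at a visibly higher rate, yet forcing everyone to log in leaves overall completion exactly where it was, because the intervention never touches motivation. Real data will not be this tidy, but the pattern is the one to watch for.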

Laurillard suggests the problems are not just with interpretation, but go back to asking the right questions, the questions that lecturers need to answer to really improve learning in their classes.

Laurillard’s Solution

Laurillard suggests that new technology platforms allow many lecturers to use the same learning platforms with the same learning design and collect the same data. Yes! This ability to gather data from learners engaged in highly similar activity is absolutely the way forward in thinking about how the digital revolution can transform our understanding of learning and teaching. However, I think Laurillard underestimates the amount of data needed for this, and thus the expertise in data management and analysis needed to benefit from this data.

Lecturers and students are highly variable in terms of what they do in their classes. Even following the same learning design, lecturers will emphasize different things, modify assignments, and respond to questions differently. Students will do very different things in the same learning environment. What “big data” allows is the analysis, not just of outcome data, but of process data. We don’t just want to know if the students got the right answers or what their score on the final test was, but also what digital tools, hints, and learning aids did they use to get there? How long did it take? Did they take multiple attempts?
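As a rough sketch of what process data can look like next to outcome data, the snippet below rolls a small event log up into per-student summaries of attempts, hints, and time spent. The `Event` fields, the student names, and the numbers are all hypothetical, chosen only to illustrate the idea; real platforms log far richer records.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Event:
    student: str
    kind: str          # "attempt" or "hint" (hypothetical event types)
    seconds: float     # time spent on this step
    correct: bool = False

def summarize_process(events):
    """Roll raw events up into attempts, hints, time, and outcome per student."""
    summary = defaultdict(
        lambda: {"attempts": 0, "hints": 0, "seconds": 0.0, "solved": False}
    )
    for event in events:
        row = summary[event.student]
        row["seconds"] += event.seconds
        if event.kind == "hint":
            row["hints"] += 1
        elif event.kind == "attempt":
            row["attempts"] += 1
            row["solved"] = row["solved"] or event.correct
    return dict(summary)

# Two students who both end with the right answer on the same problem.
events = [
    Event("ana", "attempt", 40.0, correct=True),
    Event("ben", "attempt", 25.0),
    Event("ben", "hint", 10.0),
    Event("ben", "attempt", 30.0),
    Event("ben", "hint", 5.0),
    Event("ben", "attempt", 20.0, correct=True),
]

for student, row in summarize_process(events).items():
    print(student, row)
```

Both students would look identical in an outcome-only report, since each solved the problem. The process summary shows that one of them needed several attempts and hints to get there, which is the kind of difference a lecturer might want to act on.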

Taking into account all the variability in lecturers and students requires information from tens of thousands of instructors and millions of data points from student interactions. This in turn requires special technology systems to collect the information and special tools and expertise for analysis. Could some analysis be done without this? Of course, but it doesn’t give us the full benefit of the data, the kind of benefit that really changes things in the digital world.

A Different Proposal

To really get at “why” we see a particular outcome across classes and lecturers, we need to be using both the lecturer input AND all this process data. Lecturers can develop hypotheses about why we see a particular outcome and the data analyst can see if there is evidence for this in the process data. Lecturers without expertise in large-scale data analysis are not likely to be able to use data in this way. They will continue to stay at the level of “what” outcome is observed, with hypotheses about why they see those results but without the ability to find the patterns in the process data that could confirm those hypotheses.

In the comments section of the article, there is an attempt to break down the tasks required to make use of data. The list proposed by commenter CrispinW is:

* creator of what are likely to be many different tools;
* creator of the learning design;
* teacher who supports/implements the learning design;
* aggregators of learning outcome data;
* interpretors [sic] of learning outcome data

In response Laurillard says that surely lecturers are the creators of the learning design, the teacher who implements, and the interpreters of the outcome data. This is absolutely true on an individual class level. However, she suggests they can also be the aggregators across lecturers on a large scale.

I would propose that if we are looking at process and outcome data, we are looking at the need to aggregate data on a scale that exceeds the software and data capabilities of most lecturers. This is not a criticism of lecturers; I work with post-docs coming out of graduate programs in statistics who don’t know how to manipulate data at this scale. It has become a specialized skill; we are not all experts at everything. Along the same lines, the analysis of this information also requires specialized analysis and tools that are not commonly in the toolkit of most lecturers.

That said, I hear Laurillard saying that the big data movement is not answering the questions lecturers want answered. What I would propose is not that lecturers take on learning how to use unstructured databases and master educational data mining techniques. Rather, I think we need the lecturers driving the analysis that data experts are doing. They should have far more input and, yes, control, over the questions being investigated. They do not need to perform the aggregation and analysis of data themselves in order to have this. Rather, companies and groups with expertise in big data need to bring lecturers to the table and give them a voice. Let them suggest questions that the analysts pursue. Create a feedback loop where the analysts report to the lecturers and develop new questions.

In reality, “big data” is a lot of little data. It is a record of many learners’ individual experiences interacting in digital environments. Data analysis can be seen as learning from these records of experience, with the main goal of feeding back to the lecturer and learner what they should do next to move towards their goals. Too much of what happens in big data in education currently includes neither the lecturer nor the learner in discussions. This does not mean they need to do the work of learning how to collect, store, aggregate, and analyze the mammoth data sets. Rather, it means they need to be brought in as drivers of the process. We need to be working together to make the best use of the information the digital revolution now makes available.

***

Join Kristen DiCerbo as she discusses this topic in a free webinar on 18 June. Register here.

To read more about the use of process data to inform decisions about teaching and learning, see The Impacts of the Digital Ocean on Education.
