Now this doesn’t mean Watson would score 16 points a game if he played the whole game. We’d have to look at the data on what happens when he plays eight minutes or more in a game. Maybe ask, “How many points does the opposing team score per minute that Robinson (or Watson) is on the court?” Here’s another good question: “How often does Michigan win when Robinson plays at least 20 minutes?”
Alternatively, we could train our algorithm to search for which players are on court when Michigan has the best ratio of points scored to points opponent scored. Even though we use machine learning, it all starts with experienced humans asking really good questions.
The higher ed connection
Basketball is a good analogy for higher education in some specific ways. For example, the game is dynamic and the team is made up of individuals who at any point in time could be at their highest or lowest mental and physical performance in the game, just like all students at Michigan are in the “game” of getting through graduation. Following the basketball analogy, points could be analogous to grades, skills, knowledge attained, or job offers. For the sake of this exercise we will equate points with grades.
You could ask, “What can we know about the students with the most As?” and the data would give you a persona or profile. This data tells you something but isn’t very actionable. You could go deeper and ask, “What can we know about students with the most As who are first-time college students vs. students with the most As who have prior college experience?” This will give slightly more actionable data if there are strong differences.
Or, switching it around, ask, “What resources (academic or otherwise) at Michigan do these two different groups of A students interact with?” It may turn out that students with prior college engage with the career center frequently from day one, or those with no prior college engage with social networks more frequently from the beginning. This could suggest a different freshmen or sophomore pathway to success.
How to ask the best questions to get more accurate data
Another intriguing point in the Wolverines’ example is the value of performance by Robinson vs Watson. Top down, in points per game and time playing, Robinson’s stats are higher. However, using a bottom-up approach, in points per minute Watson beats Robinson.
In much of higher education, including reliance on academic grades and standardized tests, the questions and data tracked miss important high-performing stats that reveal the whole student body’s contributions and performance in ways that can lead to more wins. As stated earlier, using machine learning makes it possible to ask better questions that require a tedious amount of data crunching, yet removes the possibility of human error and fatigue in getting to the better data.