How Can Machine Learning Improve Educational Technologies?

by Blake Mason, John Vito Binzak and Fangyun (Olivia) Zhao 

As computer-based technologies continue to establish a significant presence in modern education, it becomes increasingly important to understand how to improve the ways that these technologies present educational content and adapt to learners. Achieving these goals in the design of cognitive tutors, instructional websites, and educational games leads to difficult questions. For example, what are the best activities and examples that will help learners understand the educational content? In what order should these examples be given? How can these activities focus learners’ attention in productive ways? These are difficult questions for instructional designers to answer on their own. Here at UW-Madison, educational researchers are teaming up with computer science experts in machine learning to test theories about how design decisions can optimize the learning technologies. There are many ways that machine learning techniques can be used to improve the design and effectiveness of educational technologies.

Here we outline two interesting examples:

Visual Representations of the Water Molecule

One example of LUCID students applying machine learning techniques to educational problems is our ongoing project studying how students perceive chemical properties from molecular diagrams. To succeed in chemistry courses, students need to develop perceptual fluency with visual representations of molecules to understand how they convey important properties.

A challenge for designing instructional interventions to support this learning is that acquiring perceptual fluency is a form of implicit learning that occurs in ways students are not consciously aware of. This form of knowledge is therefore difficult to articulate, and we cannot rely on traditional methods to pinpoint what students do and do not understand. To get around this issue, we can design experiments to see how advanced and novice students see visual representations of molecules differently. First, we take a long list of molecules and record all of the visual features of each of the molecules.

ChemTutor Interface

Then, we have students judge the similarity of molecules presented three at a time: “is molecule a more similar to b or to c?” Finally, using a specific form of machine learning called metric learning, we see which visual features predict students’ similarity judgments, and thus detect which features students attend to when viewing visual representations of molecules. By comparing the results of chemistry experts and novice students, we hope to build a better understanding of how perceptual fluency changes with experience. In the ongoing ChemTutor project at UW, we hope to use this knowledge in the development of new cognitive tutors capable of providing adaptive feedback that helps students identify and focus on key visual features of molecular diagrams.
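As a rough illustration of the idea above, the sketch below learns one non-negative weight per visual feature from triplet judgments, so that a weighted distance agrees with "a is more similar to b than to c". It is a simplified stand-in for metric learning, not the method used in the ChemTutor project: the diagonal (per-feature) metric, the hinge-style update, the function name learn_feature_weights, and the simulated judgments are all assumptions made for illustration.

```python
import numpy as np


def learn_feature_weights(features, triplets, epochs=50, lr=0.05, margin=1.0):
    """Learn non-negative per-feature weights from triplet judgments.

    features: (n_items, n_features) array of recorded visual features.
    triplets: list of (a, b, c), meaning "a is more similar to b than to c".
    """
    w = np.ones(features.shape[1])
    for _ in range(epochs):
        for a, b, c in triplets:
            d_ab = (features[a] - features[b]) ** 2
            d_ac = (features[a] - features[c]) ** 2
            # Hinge-style rule: update only if the weighted distance a-b is
            # not yet smaller than a-c by at least the margin.
            if w @ d_ab + margin > w @ d_ac:
                w -= lr * (d_ab - d_ac)
                w = np.maximum(w, 0.0)  # keep the weights non-negative
    return w


# Simulated data (hypothetical): judgments depend only on feature 0.
rng = np.random.default_rng(0)
features = rng.normal(size=(30, 3))
true_weights = np.array([1.0, 0.0, 0.0])

triplets = []
for _ in range(300):
    a, b, c = rng.choice(30, size=3, replace=False)
    d_ab = true_weights @ (features[a] - features[b]) ** 2
    d_ac = true_weights @ (features[a] - features[c]) ** 2
    triplets.append((a, b, c) if d_ab < d_ac else (a, c, b))

weights = learn_feature_weights(features, triplets)
print(weights)  # the largest weight marks the feature driving the judgments
```

In this toy setting a large learned weight marks a feature that the simulated "student" attends to. Comparing weights learned from expert and novice judgments would be the analogue of the comparison described above.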


Another example of creating interactive models for teaching and learning combines the strengths of eye-tracking technologies with machine learning algorithms. Current educational software focuses heavily on features that attempt to attract students’ interest and raise motivation. We are interested in developing a tool that learns from and adapts to students’ habits, such as where they tend to look on the screen and how they become distracted. Using eye-tracking technology, we can interpret gaze fixation to understand where students focus their attention, and then customize instructional materials accordingly. In addition to making better educational technologies, this work is also important for researchers studying human attention. Specifically, researchers are interested in understanding how changes in gaze fixation relate to shifts in attention, and in using these data to develop models that predict gaze behaviors. Through multiple phases of development, this project demonstrates how improving education in powerful ways can involve everything from research on low-level cognitive processes to software development and user testing.


Posted in LUCID, Machine Learning

Interactive Science Communication

 by Scott Sievert and Purav “Jay” Patel

Scientific research is still communicated with static text and images despite innovations in learning theories and technology. We are envisioning a future in which interactive simulations and visualizations are used to enhance how complex methods, procedures, and results are taught to other scientists and non-scientists.


Scientists in various fields ask questions about different things, but they share the same basic mode of communication. Nearly all research is communicated to other scientists, journalists, and the public in the form of static text and images encapsulated in journal articles and conference papers. These papers can be found in print and online for a high price. Recently, the “open science” movement has focused on broadening the number of people who can access these papers by eliminating the access fees for these (often) pricey papers. But there’s a problem. Even if all scientific papers are free and convenient for everyone to physically access, they will still not be cognitively accessible. In other words, the number of people who can understand the meaning of most scientific papers (even superficially) will be low.


We believe that research should be conveyed directly to the public. At the moment, the traditional text and images approach makes it difficult to do so. Because of this, science journalism acts as an intermediary, translating the jargon for the public to appreciate (see phys.org, newspapers, and educational videos).

But this is not always a good thing. Science journalism is often misleading (see John Oliver’s takedown) because its focus is on hooking the audience and driving viewership and readership. Major change is needed because scientific literacy affects how people understand and influence the environment, their bodies, and voting. Scientists need better ways to communicate their ideas and learn about others’ work. By one estimate, the total volume of scientific research (measured in the number of published papers) doubles every nine years.

For young and seasoned scientists in any field, this is a major bottleneck toward producing great science. How can the best discoveries be made if scientists are unable to keep up with the latest findings? But how can technical research be communicated effectively without “dumbing it down?” Below, we explore some promising ideas.

Suggestion 1: Interactive Simulations of Experimental Tasks

One problem with traditional research papers is their use of dense text and cluttered visuals to convey the materials used in a study. For example, I led a study during my master’s to understand how typical undergraduates comprehend irrational numbers. This study used four tasks, each including subsections. When I wrote up the scientific paper and sent it to the journal Cognitive Science, there was a constraint – use only text and images. Of course, I had the option of including hyperlinks to the website hosting my experimental simulations. And I had the option of uploading the simulations to another website (Open Science Framework). I did both of these things, but felt that most readers wouldn’t go through the trouble of visiting those websites. When I got the reviews back from the editor of the journal, I found clues in their comments that no one actually used these websites. The best way to communicate the experimental tasks and procedures effectively is to directly embed them into scholarly articles. Some programmers, user interface designers, and scientists in computer science have played with this idea. I was particularly inspired by these three examples:


What do these examples all have in common? In each case, the authors could have used blocks of clearly written text to communicate complex visuospatial relationships. Instead, they chose to bring the concepts to life and allow users to manipulate the information in ways that are more natural. It is as if an expert is beside you, drawing and explaining otherwise obscure ideas.

So far, prototypes of interactive scientific papers have focused on math and computing (formal sciences). At the start of my first semester of an Educational Psychology PhD, I wondered how these ideas could apply to psychological papers like the one I wrote for my master’s thesis. With the help of three Computer Science students in a Human-Computer Interaction class, I set about embedding experimental simulations in a webpage containing my scientific article. The paper was titled How the Abstract Becomes Concrete: Irrational Numbers are Understood Relative to Natural Numbers and Perfect Squares. I started by reducing the length of the paper to 10% of the original size (start small!). I helped develop each simulation corresponding to different sections of the tasks.

This is what the original section of my scientific paper looked like. Though it makes sense to most people in the field of numerical cognition, it is hard for people outside of the field to say with confidence that the description is clear.









Now let’s have a look at the interactive simulation that enhances the text:

Traditional scientific papers give users a third-person glance into experiments, whereas we aim to provide users a first-person view into what the authors did. From a technological perspective, this change is fairly simple. Existing experimental code can be embedded into digital documents, saving authors valuable time. Moreover, any extra effort required can pay off significantly down the road. The enhanced paper is now easier for scientists, friends, family members, potential employers, collaborators, students, autodidacts, and many others to understand.

Currently, most experimental tasks live inside hard-to-access supplementary materials documents that are usually not accessed. The work that goes into creating these documents often doesn’t bear fruit. By directly embedding experimental simulations we can “show and tell” what happened and rescue authors from wasting time on supplementary materials.

It is not difficult to imagine how to extend this to other kinds of tasks with different kinds of input. In the interactive article mentioned earlier, the complexity of the tasks varied from simple keyboard responses (Z or M key) to mouse clicks and text input. It is also not difficult to consider embedding simulations of experimental tasks in fields like neuroscience and education. Health science fields like physical therapy and surgery may benefit from videos and simulations to practice new therapeutic procedures.

Suggestion 2: Interactive Data Visualizations

Interactive visualizations are useful because they allow the user to see the results of some experiment or function as parameters under their control are changed. This is super useful for data exploration and explanation and it’s picking up a lot of steam. These tools can allow users to interact with the results of scientific experiments or analyze how a particular method performs. Examples of tools that can create these interactive figures are ipywidgets and Holoviews.

These tools allow simple creation of interactive figures in Python, a high-level language. They make data exploration easy and can be embedded in a variety of contexts. In the future, one could imagine generating visualizations that show group-level trends with the option to quickly change the visual so that individual differences across participants are transparent. For instance, consider an interactive visualization showing a density plot of participants’ scores on a cognitive task. By clicking buttons or dragging a slider, the user could abstract out more general information in the form of a violin plot, then a boxplot. Abstracting out even further, the user could change the original plot to a bar plot or table. Given that journals and conferences adhere to strict (and problematic) word and page limits, creative interactive data visualizations capable of communicating vast amounts of information in small spaces would improve the quality of scientific articles. Authors would not need to question which one of a dozen analyses and results to focus on. By using rich multimedia, more information can be presented without overwhelming users.

One scientific journal that uses interactive graphics heavily is Distill. The team at Distill works with authors to create a set of interactive graphics for their article that highlight certain points. This has lowered the barrier to reading and understanding their papers. Interactive media can convey subtle points that are not immediately obvious.


This example can be extended easily. Communication in science is critical, but surprisingly difficult. Scientists need to communicate their methods, results and experimental design so other scientists can reproduce their results. Allowing quicker and more accurate communication between scientists can yield more robust experimental results and fuel collaboration. These ideas are espoused by arXiv, a digital, open-access scientific publication platform developed by physicists. More recently, websites like the Open Science Framework (OSF) and Authorea have offered tools that enable open-access dissemination of scholarly research. The OSF archives preprints, experimental materials, and data across different fields for anyone to access. Authorea is a new digital authoring tool for scientists to collaborate online and embed interactive figures into their manuscripts. This tool also helps format manuscripts for the needs specified by journals and conferences, saving tedious editing. It is possible that one of these platforms (or another like it) will play a large role in developing the scientific documents of the future.

Extend Your Learning

In addition to the sources listed above, consider checking out the following examples of multimedia-enhanced scholarly communication:

1. description of benefits:
2. “Building blocks of interactive computing”, the introduction of JupyterLab:
3. Jupyter widgets, which has live (!) examples of ipywidgets and other libraries:
4. Explorable Explanation:
5. Interactive PhD Thesis:
6. Journal of Visualized Experiments:

Posted in LUCID, Machine Learning, Resources, Uncategorized

Singular Value Decomposition (SVD)

By Lowell Thompson and Ashley Hou

This is a collaborative tutorial aimed at simplifying a common machine learning method known as singular value decomposition. Learn how these techniques impact computational neuroscience research as well!

Singular value decomposition is a method to factorize an arbitrary m \times n matrix, A, into two matrices with orthonormal columns, U and V, and a diagonal matrix \Sigma, so that A can be written as U\Sigma V^T. The diagonal entries of \Sigma, called singular values, are non-negative and arranged in decreasing order. The columns of U and V are the left and right singular vectors, respectively. Therefore, we can express U \Sigma V^T as a weighted sum of outer products of the corresponding left and right singular vectors, A = \sum_i \sigma_i u_i v_i^T.
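To make the factorization concrete, here is a minimal sketch using NumPy's np.linalg.svd on a small random matrix (the matrix and its size are arbitrary choices for illustration). It rebuilds A from the weighted sum of outer products described above.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 6))  # arbitrary example matrix

# Reduced SVD: U is 4x4, s holds the singular values, Vt is 4x6 (rows are v_i^T).
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rebuild A as the sum over i of sigma_i * u_i * v_i^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s)))

print("singular values:", s)
print("reconstruction matches A:", np.allclose(A, A_rebuilt))
```

NumPy returns the singular values already sorted from largest to smallest, which is what makes truncating the sum after the first few terms meaningful.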

In neuroscience applications, we have a matrix R of the firing rate of a given neuron, where the first dimension represents different frontoparallel motion directions and the second represents disparity (a measure that helps determine the depth of an object). Relationships between preferences for direction and depth could serve as a potential mechanism underlying 3D motion perception. SVD can be used to determine whether a neuron’s joint tuning function for these properties is separable or inseparable. Separability entails a constant relationship between the two properties: a particular direction preference will be maintained across all disparity levels, and vice versa. If this were the case, then all of the vectors in the firing matrix could be described in terms of a single linearly independent vector, or function. This is also known as a rank 1 matrix.

Using SVD, we can approximate R by \sigma_1 u_1 v_1^T, which is obtained by truncating the sum after the first singular value. This is a rank-1 approximation of R. If R is fully separable in direction and disparity, only the first singular value will be non-zero, indicating the matrix is of rank 1, and \sigma_1 u_1 v_1^T will reproduce R exactly. In general, the closer R is to being separable, the more dominant the first singular value \sigma_1 will be over the other singular values, and the closer the approximation \sigma_1 u_1 v_1^T will be to the original matrix R.
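The sketch below illustrates this with two made-up firing-rate matrices. The tuning curves, the peak rate of 20, and the helper names are hypothetical choices for illustration, not recorded data. The first matrix is an outer product of a direction tuning curve and a disparity tuning curve, so it is separable. In the second, the preferred disparity shifts with motion direction, so it is not.

```python
import numpy as np


def rank1_approximation(R):
    """Return sigma_1 * u_1 * v_1^T, the best rank-1 approximation of R."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return s[0] * np.outer(U[:, 0], Vt[0])


def relative_error(R):
    """Frobenius-norm error of the rank-1 approximation, relative to R."""
    return np.linalg.norm(R - rank1_approximation(R)) / np.linalg.norm(R)


directions = np.deg2rad(np.arange(0, 360, 45))  # 8 motion directions
disparities = np.linspace(-1.5, 1.5, 9)         # 9 disparity levels

# Separable neuron: the same disparity curve, scaled by direction tuning.
direction_tuning = 1.0 + np.cos(directions)
disparity_tuning = np.exp(-((disparities + 0.3) ** 2) / 0.5)
R_separable = 20.0 * np.outer(direction_tuning, disparity_tuning)

# Inseparable neuron: the preferred disparity depends on direction.
preferred = 0.6 * np.cos(directions)
R_inseparable = 20.0 * np.exp(
    -((disparities[None, :] - preferred[:, None]) ** 2) / 0.5
)

err_separable = relative_error(R_separable)
err_inseparable = relative_error(R_inseparable)
print(err_separable, err_inseparable)
```

For the separable matrix the error is zero up to floating-point rounding, while for the inseparable one a single outer product cannot capture the direction-dependent shift in disparity preference.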

Below is an interactive example that can help you visualize this concept. On the left is the representation of a neuron’s joint tuning function, given by a matrix whose rows and columns are defined by the “Direction” and “Disparity” properties. The values within each cell of the matrix are an example neuron’s firing rate for the given combination of these properties. On the right, we have altered the representation of the matrix by plotting the firing rate of the neuron across different disparity levels for each direction of lateral motion. The peak firing rate is deemed the neuron’s disparity preference at that particular frontoparallel motion direction. Notice that regardless of motion direction, this neuron maintains a similar, slightly negative disparity preference (preferring objects near the observer). These cell types are predominantly found in the middle temporal (MT) cortex of rhesus macaques, an area of the brain that seems to be specialized for both 2D and 3D motion processing (Smolyanskaya, Ruff, & Born, 2013; Sanada & DeAngelis, 2014).

Using the slider at the bottom of the graph, you can manipulate the example neuron’s separability. As you move the slider to the far right side, representing the largest degree of inseparability for this example, the disparity tuning curves develop a peculiar pattern. That is, for directions of motion that are nearly opposite to one another (~180 degrees apart), the disparity preference of the neuron is flipped. These types of neurons are predominantly found in an area that lies just above MT in the cortical hierarchy, the medial superior temporal area (MST). Cells of this type have been deemed “direction-dependent disparity selectivity” (DDD) neurons, and are potentially useful in differentiating self-motion from object-motion, although this hypothesis has not been critically evaluated (Roy et al., 1992b; Roy & Wurtz, 1990; Yang et al., 2011).


Another plot is displayed below that illustrates how the singular values of a matrix will change depending on the cell’s separability. Notice that as the cell becomes less separable, the magnitude of the first singular value decreases, and the contribution of the other singular values begins to increase. The inset plot illustrates this using a common metric for evaluating separability, known as the degree of inseparability. This is simply the ratio of the first singular value to the sum of all the singular values.
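The ratio described above is easy to compute directly. The sketch below uses a toy matrix (not firing rates, since entries may be negative) built from one dominant outer product plus a second, orthogonal one scaled by alpha. The construction and the names first_singular_value_ratio and toy_matrix are assumptions for illustration. With this construction the singular values are exactly 10 and 10 * alpha, so the ratio equals 1 / (1 + alpha).

```python
import numpy as np


def first_singular_value_ratio(R):
    """First singular value divided by the sum of all singular values."""
    s = np.linalg.svd(R, compute_uv=False)
    return s[0] / s.sum()


# Two pairs of orthonormal vectors, used to build the toy matrix.
u1 = np.array([1.0, 1.0, 1.0, 1.0]) / 2.0
u2 = np.array([1.0, -1.0, 1.0, -1.0]) / 2.0
v1 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
v2 = np.array([1.0, 0.0, -1.0]) / np.sqrt(2.0)


def toy_matrix(alpha):
    """Rank-1 matrix plus an orthogonal component of relative size alpha."""
    return 10.0 * np.outer(u1, v1) + alpha * 10.0 * np.outer(u2, v2)


ratios = {alpha: first_singular_value_ratio(toy_matrix(alpha))
          for alpha in (0.0, 0.5, 1.0)}
for alpha, ratio in ratios.items():
    print(f"alpha={alpha:.1f}  ratio={ratio:.3f}")
```

Under this definition the ratio is 1 for a fully separable matrix and falls toward 1/k (for k singular values) as the remaining singular values grow.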

Lastly, we’ve provided another interactive graph where the left portion is the same example neuron from the previous graph. On the right is the prediction generated from \sigma_1 u_1 v_1^T. As you move the slider to the right, increasing the degree of inseparability, you’ll notice the prediction becomes increasingly dissimilar to the actual firing matrix.

Posted in LUCID, Machine Learning, Resources

What is a Computational Cognitive Model?

By Rui Meng and Ayon Sen

A computational cognitive model explores the essence of cognition and various cognitive functionalities by developing a detailed, process-based understanding, specified in the form of computational models.

Sylvain Baillet discusses various aspects of cognitive computation models

A computational model is a mathematical model that uses computation to study complex systems. Typically one sets up a simulation with the desired parameters and lets the computer run. One then looks at the output to interpret the behavior of the model.

Computational cognitive models are computational models used in the field of cognitive science. Models in cognitive science can be generally categorized into computational, mathematical or verbal-conceptual models. At present, computational modeling appears to be the most promising approach in many respects and offers more flexibility and expressive power than other approaches.

Computational models are mostly process-based theories, i.e., they try to answer how human performance comes about and by what psychological mechanism.

A general structure of a Neural Network

Among the different computational cognitive models, neural networks are by far the most commonly used connectionist model today. The prevailing connectionist approach today was originally known as parallel distributed processing (PDP). It is an artificial neural network approach that stresses the parallel nature of neural processing and the distributed nature of neural representation. It is now common to fully equate PDP and connectionism.

Recurrent Layer Neural Network

A general structure for the Recurrent Neural Network

Neural networks were inspired by the structure of human brains. One particular variant of the neural network is called the recurrent neural network. It embodies the philosophy that learning requires remembering. Recurrent neural networks are also becoming a popular computational model for cognitive sciences.


Try it yourself: Play with a neural network to see how it works. A Neural Network Playground

Real World Example: Optimizing Teaching

Cognitive computational models are used for tasks that are hard or impossible to do with lab experiments, e.g., when too many people would be needed for the experiment to be feasible. For example, let us assume a teacher wants to teach 100 problems to children in a class, but due to time constraints she can only teach 30 of them. The teacher would like to select the 30 problems from which the children learn the most, i.e., after which they perform well on all 100 problems. Note that there are 100C30 ≈ 2.94e25 possible question sets. To evaluate the question sets, the teacher would need to teach each question set to one group of children and evaluate their performance. If there are 30 children in each class, the total number of children required would be 30 × 100C30 ≈ 8.81e26, which is far too large. This makes identifying an optimal question set by experiment infeasible.
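The counts above can be checked with a few lines of Python using the standard library's math.comb. The variable names are chosen here for illustration.

```python
import math

n_problems = 100   # problems the teacher would like to cover
n_taught = 30      # problems there is time to teach
class_size = 30    # children needed to evaluate one question set

# Number of distinct 30-problem question sets drawn from 100 problems.
n_question_sets = math.comb(n_problems, n_taught)

# One class of children per question set.
children_needed = class_size * n_question_sets

print(f"question sets:   {n_question_sets:.3e}")
print(f"children needed: {children_needed:.3e}")
```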

This task can be simplified if a cognitive computational model of children for that particular task can be devised. Then the teacher only needs to test the question sets on the cognitive model to figure out which one is the best. This saves a lot of time and is feasible.

Additional Resources:

Posted in LUCID Library, Resources

Martina Rau on Learning with Visuals

LUCID faculty member Martina Rau, Educational Psychology professor, director of the Learning, Representations, & Technology Lab, and Computer Sciences affiliate, is interested in educational technologies that support more effective learning with visuals.

While we generally think of visuals as helpful tools, Rau highlights the fact that visuals can be confusing if students do not know how to interpret visuals, construct visuals, or make connections among multiple visuals.

She created a video blog, Learning with Visuals, that aims to translate the research conducted on campus and share findings in a more accessible and useful way for teaching and learning with visuals. She aims to help students looking for effective study strategies, parents wanting to help their children learn, teachers who notice that students often have difficulties understanding visuals, and of course prospective researchers in educational psychology.

 Involving Students from Multiple Disciplines in Research


Here she discusses the importance of students from multiple disciplines learning from each other and gaining an understanding of what it takes to conduct research in a different field.


Collaborating with Visuals


Here she discusses how educational technologies can support students to more effectively collaborate with visuals.




Translating Research into Everyday Language


Here she discusses the importance of finding new ways to learn and teach with visuals as well as how she plans to make her research accessible to non-scientists, such as teachers, parents, and students.

Posted in LUCID Library, Resources

Poolmate: Pool-Based Machine Teaching

Poolmate provides a command-line interface to algorithms for searching teaching sets among a candidate pool. Poolmate is designed to work with any learner which can be communicated with through a file-based API.

Developed by Ara Vartanian, Scott Alfeld, Ayon Sen and Jerry Zhu. Poolmate details can be found on aravart or GitHub.

If machine learning is to discover knowledge, then machine teaching is to pass it on.

-Jerry Zhu

Machine teaching is an inverse problem to machine learning. Given a learning algorithm and a target model, machine teaching finds an optimal (e.g. the smallest) training set. For example, consider a “student” who runs the Support Vector Machine learning algorithm. Imagine a teacher who wants to teach the student a specific target hyperplane in some feature space (never mind how the teacher got this hyperplane in the first place). The teacher constructs a training set D=(x1,y1) … (xn, yn), where xi is a feature vector and yi a class label, to train the student. What is the smallest training set that will make the student learn the target hyperplane? It is not hard to see that n=2 is sufficient with the two training items straddling the target hyperplane. Machine teaching mathematically formalizes this idea and generalizes it to many kinds of learning algorithms and teaching targets. Solving the machine teaching problem in general can be intricate and is an open mathematical question, though for a large family of learners the resulting bilevel optimization problem can be approximated.
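The two-item teaching set can be worked out in closed form. The sketch below assumes the student is a hard-margin linear SVM, for which the solution on two oppositely labeled points is the perpendicular bisector of the segment joining them. This is an illustration of the idea only, not Poolmate's interface, and the function names and target values are hypothetical.

```python
import numpy as np


def teaching_set_for_hyperplane(w, b):
    """Return two points (labels +1 and -1) straddling the plane w.x + b = 0."""
    w = np.asarray(w, dtype=float)
    on_plane = -b * w / (w @ w)  # a point satisfying w.x + b = 0
    offset = w / (w @ w)         # step that changes w.x + b by exactly 1
    return on_plane + offset, on_plane - offset


def two_point_hard_margin_svm(x_pos, x_neg):
    """Closed-form hard-margin SVM solution for one positive, one negative point."""
    diff = x_pos - x_neg
    w = 2.0 * diff / (diff @ diff)
    b = -(w @ (x_pos + x_neg)) / 2.0
    return w, b


# Hypothetical target hyperplane chosen by the teacher.
target_w, target_b = np.array([2.0, -1.0]), 0.5

x_pos, x_neg = teaching_set_for_hyperplane(target_w, target_b)
learned_w, learned_b = two_point_hard_margin_svm(x_pos, x_neg)

print("teaching set:", x_pos, x_neg)
print("learned hyperplane:", learned_w, learned_b)
```

Because the teacher places the two items at functional margin exactly 1 on either side of the target, the student recovers the same w and b here. With a different placement along the normal direction, the student would recover the same hyperplane up to scaling.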

Machine teaching can have impacts in education, where the “student” is really a human student, and the teacher certainly has a target model (i.e. the educational goal). If we are willing to assume a cognitive learning model of the student, we can use machine teaching to reverse-engineer the optimal training data, which will be the optimal, personalized lesson for that student. We have shown feasibility in a preliminary cognitive study to teach categorization. Another application is in computer security, where the “teacher” is an attacker and the learner is any intelligent system that adapts to inputs. More details can be found in this research overview: Machine Teaching

Machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model. In addition to generating fascinating mathematical questions for computer scientists to ponder, machine teaching holds the promise of enhancing education and personnel training.

-Jerry Zhu in Machine Teaching: An Inverse Problem to Machine Learning and an Approach Toward Optimal Education

Jerry Zhu is a LUCID faculty member and CS professor; Ayon Sen is a LUCID trainee and CS graduate student.

Poolmate is based upon work supported by the National Science Foundation under Grant No. IIS-0953219. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Posted in Resources

LUCIDtalks: Machine Learning with Ayon Sen

As part of our LUCID program, the graduate students meet weekly and discuss topics that are interesting to both the computational and behavioral graduate students.

In this video Ayon provides an overview of machine learning in 20 minutes. Ayon explains linear and non-linear learners, overfitting and the bias-variance trade-off, and provides resources for those interested in learning more about machine learning.

Posted in LUCID, LUCID Library, Machine Learning

Apply to LUCID!

We are currently accepting applications for Fall 2018!

Please read our LUCID overview and diversity statement

Details for how to apply to the LUCID program at UW–Madison:

1. Apply to UW-Madison at Graduate School
2. Apply to a Ph.D. program in one of the four core departments in LUCID:


Educational Psychology

Electrical and Computer Engineering (ECE)

Computer Sciences (CS)

3. Apply to LUCID:

Email your statement of interest to ceiverson(at)wisc(dot)edu

In your statement of interest please answer the following:

  • Please tell us about your background and how your experience contributes to the LUCID diversity mission.
  • Describe your interest and background in research that connects computation, cognition, and learning.
  • Finally, what do you hope to gain from being a LUCID trainee?
Posted in LUCID

Watch Our LUCID Video

Our new LUCID video demonstrates our collaborative learning approach to tackling real-world projects.

This video highlights the LUCID program and focusses on four pedagogical approaches: interdisciplinary groups, practical problems, prof-and-peer mentoring, and science communication.

Posted in LUCID


HAMLET  (Human, Animal, and Machine Learning: Experiment and Theory)

HAMLET is an interdisciplinary proseminar series that started in 2008. The goal is to provide behavioral and computational graduate students with a common grounding in the learning sciences. Guest speakers give a talk each week, followed by discussions. It has been an outlet for fresh research results from various projects. Participants are typically from Computer Sciences, ECE, Psychology, Educational Psychology as well as other parts of the campus. Multiple federal research grants and publications at machine learning and cognitive psychology venues have resulted from the interactions of HAMLET participants.

Meetings: Fridays 3:45 p.m. – 5 p.m. Berkowitz room (338 Psychology Building)

Subscribe/Unsubscribe: HAMLET mailing list

This semester we are trying a new format for HAMLET. Speakers pursuing research at the intersection of computation and human behavior will give a short (10-15 minute) introduction/outline/description of a current research project or problem, with the aim of sparking collaborative discussion in the group about the project.

How would people from your field approach it?
Are there existing tools for solving the problem?
What is already known about the domain?
What kind of data are involved?
How does the research connect to real-world issues?

Brainstorm sessions will be interleaved with standard talks on more mature work throughout the semester.

Spring 2019 Schedule:

2/8 John Curtin, Psychology

Brainstorm: Digital phenotyping with (somewhat) big data for harm reduction in alcohol and other drug use disorders

John has been applying machine learning methods to Facebook and geo-tracking data to understand and predict patterns of relapse in people recovering from addiction. He will briefly characterize issues that motivate the research, the kinds of data his group is working with, some of the work they have already done and some of the challenges they are currently facing. The discussion will focus on generating ideas for moving the project forward.

2/15: Data blitz: Current research in cognition, perception, and cognitive neuroscience (special event at the WID)

2/22: Paula Niedenthal, Psychology

Brainstorming connections between emotion, cognition and data science

Paula will initiate a session concerning ways that scientists in machine learning/data science might collaborate with scientists working to understand human affective cognition. Read about some of her recent work here: Science Daily

3/1: Dimitris Papailiopoulos (ECE, WID)  LOCATION CHANGE: Union South (look at the TTIU for the room)

Dimitris Papailiopoulos will spark a discussion assessing whether principles of human cognition can guide us toward more robust machine learning.

3/8 Alyssa Adams, VEDA Data Solutions

Emergence, evolution and video games

Alyssa Adams is a data scientist working at Madison startup VEDA Data Solutions. Her work focuses on understanding what makes living systems different from non-living systems, and in particular the mechanisms that underlie emergence in social and biological systems. Toward these questions she applies computational approaches to data generated by people interacting in video games–so if you like emergence, math, human behavior and video games, you should really come. Alyssa is also a great person to talk to if you are interested in how data science is being applied in the wild, or what it is like to move from an academic to an industrial career path.

3/15 Rob Spencer, Psychology

3/22  Spring Break

3/29 Karen Schloss, Psychology & WID

4/5 Jon Willits, Psychology at University of Illinois

4/12 Ari Rosenberg, Neuroscience

4/19 Greg Zelinsky, Psychology at Stony Brook University

4/26 Glenn Fung Moo, American Family Insurance

5/3 Sarah Sant’Ana, Psychology



Fall 2018 Schedule:

Sep 14, Tim Rogers, Psychology

Investigating the covariance structure of widely-held false beliefs

ABSTRACT: It has become increasingly clear that our society is suffering a crisis of false belief. Across a variety of domains–including science, health, economics, politics, history, and current events–there often exists vociferous disagreement about what ought to be matters of fact. I’d like to know whether cognitive science and machine learning together can help us to understand why this is happening. In prior work we have studied the formation of false beliefs in small well-controlled lab studies with no real-world stakes. In this talk I will describe some preliminary work looking at real false beliefs occurring in the wild. In collaboration with a summer PREP student and colleagues in Psychology and in the School of Journalism, we generated a long list of incorrect but putatively widely-held beliefs spanning many different knowledge domains. Using Amazon Mechanical Turk, we asked people about the degree to which they endorsed or rejected each belief. We also collected a variety of data about each respondent, including sociodemographic data, political leanings, media trust, IQ, and so on. The aim was to measure how susceptibility to false beliefs patterns across individuals, and to understand what properties of individuals predict susceptibility to which kinds of false beliefs. Our early results provide some provocative evidence contradicting some straightforward hypotheses about false belief formation, but there is a lot more work to be done. We would like to connect with others interested in these problems from a machine-learning-and-social-media perspective, and in particular would like to consider ways of using social network analytics to assess the degree to which media consumption may be causing the patterns we observe.

Sep 21, LUCID faculty meeting – no HAMLET

Sep 28, no HAMLET

Oct 5, Jay Patel, Educational Psychology and Ayon Sen, Computer Science

A Novel Application of Machine Teaching to Perceptual Fluency Training, a collaboration between Ed Psych and CS

We will hear from Jay Patel and Ayon Sen, who will present new work applying neural networks and machine teaching to undergraduate chemistry learning: something for everyone!


In STEM domains, students are expected to acquire domain knowledge from visual representations that they may not yet be able to interpret. Such learning requires perceptual fluency: the ability to intuitively and rapidly see which concepts visuals show and to translate among multiple visuals. Instructional problems that engage students in nonverbal, implicit learning processes enhance perceptual fluency. Such processes are highly influenced by sequence effects. Thus far, we lack a principled approach for identifying a sequence of perceptual-fluency problems that promotes robust learning. Here, we describe a novel educational data mining approach that uses machine learning to generate an optimal sequence of visuals for perceptual-fluency problems. In a human experiment, we show that a machine-generated sequence outperforms both a random sequence and a sequence generated by a human domain expert. Interestingly, the machine-generated sequence resulted in significantly lower accuracy during training, but higher posttest accuracy. This suggests that the machine-generated sequence induced desirable difficulties. To our knowledge, our study is the first to show that an educational data mining approach can induce desirable difficulties for perceptual learning.

Oct 12, Tim Rogers, Psychology

Data science and brain imaging at UW Madison

Over the summer Tim had conversations with several groups about their work applying machine learning or other data science tools to brain imaging data. These activities are part of a broader trend across the country, with collaborations between neuroscientists, cognitive scientists, and computer scientists driving very rapid innovations in techniques for finding signal in neural data. Many of us are interested in connecting disparate efforts across campus to develop a general knowledge base, community of support, and new collaborations in this field. There is also interest in understanding how work conducted here relates to other high-profile work emerging from other labs.

Tim will present a general overview of the variety of new methods that he is aware of and some of the astonishing results they have uncovered. The overview will be targeted at both cognitive scientists/neuroscientists and members of the machine learning community. He will also highlight some of the opportunities for new work and collaboration that he is aware of, and hopes to initiate a discussion about what we can do to promote these efforts at UW.

Join the new email list to facilitate discussion, collaboration, and information-sharing on these issues.  (BIG = “Brain-imaging Interest Group”; LUCID is the broader community interested in learning, understanding, cognition, intelligence and data-science)

Oct 19, LUCID faculty meeting – no HAMLET

Oct 26, no HAMLET

Nov 2, Jerry Zhu, Computer Science

Discussion: Can Machine Teaching Influence How Humans Do Regression?

Jerry is interested in discussing with cognitive scientists possible relationships between a hot topic in machine learning–data poisoning, and defenses against data poisoning–and approaches to human learning and teaching. Can central concepts from this area of study in machine learning help or hinder human learning?

How do people make numerical predictions (e.g. weight of a person) from observed features (height, fitness, gender, age, etc.)? We may assume that they learn this from some sort of regression in their head, applied to training data they receive. Can machine teaching help them learn better predictions by being careful about the training examples we give them? Or can it be used to mess them up intentionally? This is equivalent to ‘training data poisoning attacks’ in adversarial machine learning. I will explain how this is done with control theory under highly simplified, likely incorrect assumptions on human learning. This is where we need the cognitive science audience to participate in the discussion, and together we will explore the research opportunities on this topic.
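As a rough illustration of the idea (not the control-theoretic method from the talk), the sketch below assumes the learner simply fits an ordinary least-squares line to whatever examples it is given. The height/weight numbers and the two added examples are made up for illustration; they show how a handful of chosen training points can shift what the learner ends up predicting.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training set: height (cm) -> weight (kg), exactly linear.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 66, 74, 82]

clean_slope, clean_intercept = fit_line(heights, weights)

# A teacher (or attacker) appends two deliberately chosen examples.
# These values are assumptions picked to flatten the learned relationship.
poisoned_heights = heights + [150, 190]
poisoned_weights = weights + [70, 62]

poisoned_slope, poisoned_intercept = fit_line(poisoned_heights, poisoned_weights)

print(f"clean fit:    weight = {clean_slope:.3f} * height + {clean_intercept:.3f}")
print(f"poisoned fit: weight = {poisoned_slope:.3f} * height + {poisoned_intercept:.3f}")
```

Whether human learners respond to such examples the way a least-squares fit does is exactly the open question raised for discussion.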

Nov 9, Data Blitz

75 minutes, 9 talks, a sampling of current research at UW connecting data science and human behavior

This HAMLET will feature a cross-disciplinary data-blitz. We will hear a series of 5-minute talks on topics at the intersection of machine learning, data science, and human behavior. This is a great opportunity to hear about some of the projects happening on campus now that you may want to get involved with.

Can neural networks help you learn chemistry? How do Trumpian false beliefs differ from other partisan false beliefs? These questions and many more will be addressed in this data blitz!

Nov 16, Gary Lupyan

How translatable are languages? Quantifying semantic alignment of natural languages.

Do all languages convey semantic knowledge in the same way? If language simply mirrors the structure of the world, the answer should be a qualified “yes”. If, however, languages impose structure as much as reflecting it, then even ostensibly the “same” word in different languages may mean quite different things. We provide a first pass at a large-scale quantification of cross-linguistic semantic alignment of approximately 1000 meanings in 70+ languages. We find that the translation equivalents in some domains (e.g., Time, Quantity, and Kinship) exhibit high alignment across languages while the structure of other domains (e.g., Politics, Food, Emotions, and Animals) exhibits substantial cross-linguistic variability. Our measure of semantic alignment correlates with phylogenetic relationships between languages and with cultural distances between societies speaking the languages, suggesting a rich co-adaptation of language and culture even in domains of experience that appear most constrained by the natural world.

Nov 23, Thanksgiving – no HAMLET

Nov 30, Emily Ward, Psychology

Dec 7, Priya Kalra, Educational Psychology





Spring 2018 Schedule:

Feb 2, Blake Mason

Title: Low-Dimensional Metric Learning with Application to Perceptual Feature Selection

Abstract: I will discuss recent work investigating the theoretical foundations of metric learning, focused on four key topics: 1) how to learn general low-dimensional (low-rank) metrics as well as sparse metrics; 2) upper and lower (minimax) bounds on the generalization error; 3) how to quantify the sample complexity of metric learning in terms of the dimension of the feature space and the dimension/rank of the underlying metric; 4) the accuracy of the learned metric relative to the underlying true generative metric. As an application of these ideas, I will discuss work with collaborators in Educational Psychology that applies metric learning for perceptual feature detection in non-verbally mediated cognitive processes.

Feb 9, Adrienne Wood

Form follows function: Emotion expressions as adaptive social tools

Emotion expressions convey people’s feelings and behavioral intentions, and influence, in turn, the feelings and behaviors of perceivers. I take a social functional approach to the study of emotion expression, examining how the physical forms of emotion expression are flexible and can be adapted to accomplish specific social tasks. In this talk, I discuss two lines of research, the first of which applies a social functional lens to smiles and laughter. I present work suggesting that smiles and laughter vary in their physical form in order to achieve three distinct tasks of social living: rewarding others, signaling openness to affiliation, and negotiating social hierarchies. My approach, which generalizes to other categories of expressive behavior, accounts for the form and context of the occurrence of the expressions, as well as the nature of their influence on social partners. My second line of research examines how cultural and historical pressures influence emotional expressiveness. Cultures arising from the intersection of many other cultures, such as in the U.S., initially lacked a clear social structure, shared norms, and a common language. Recent work from my collaborators and myself suggests such cultures increase their reliance on emotion expressions, establishing a cultural norm of expressive clarity. I conclude by presenting plans to quantify individual differences in the tendency to synchronize with and accommodate to the emotion expressive style of a social partner, and relate those differences to people’s social network positions. Given the important social functions served by emotion expression, I suggest that the ability to use it flexibly is associated with long-term social integration.

Feb 16 Nils Ringe, Professor, Political Science

Speaking in Tongues: The Politics of Language and the Language of Politics in the European Union

Politics in the European Union primarily takes place between political actors who do not share a common language, yet this key feature of EU politics has not received much attention from political scientists. This project investigates if and how multilingualism affects political processes and outcomes. Its empirical backbone is a series of in-depth interviews with almost 100 EU policy-makers, administrators, translators, and interpreters, but it also involves at least two potential components where political science, linguistic, and computational approaches overlap. The first involves the analysis of oral legislative negotiations in the European Parliament, where non-native English speakers interact using English as their shared language, and in native-English speaking parliamentary settings (in Ireland, Scotland, and/or the UK), to determine if “EU English” differs syntactically and semantically from “regular” English. The expectation is that speech in the EP is simpler, more neutral, and more utilitarian. The second component involves the identification of languages spoken in EP committee meetings using computational methods, to determine the language choices members of the EP make.

Feb 23 Andreas Obersteiner

Does 1/4 look larger than 1/3? The natural number bias in comparing symbolic and nonsymbolic fractions

When people compare the numerical values of fractions, they are often more accurate and faster when the larger fraction has the larger natural number components (e.g., 2/5 > 1/5) than when it has the smaller components (e.g., 1/3 > 1/4). However, recent studies produced conflicting evidence of this “natural number bias” when the comparison problems were more complex (e.g., 25/36 vs. 19/24). Moreover, it is unclear whether the bias also occurs when fractions are presented visually as shaded parts of rectangles rather than as numerical symbols. I will first present data from a reaction time study in which university students compared symbolic fractions. The results suggest that the occurrence and strength of the bias depend on the specific type of comparison problems and on people’s ability to activate overall fraction magnitudes. I will then present preliminary data from an eye tracking study in which university students compared rectangular fraction visualizations. Participants’ eye movements suggest that the mere presence of countable parts encouraged them to use unnecessary counting strategies, although the number of countable parts did not bias their decisions. The results have implications for mathematics education, which I will discuss in the talk.

Mar 2 Nicole Beckage, University of Kansas

Title: Multiplex network optimization to capture attention to features

Abstract: How does attention to features and current context affect people’s search in mental and physical spaces? I will examine how procedures for optimally searching through “multiplex” networks (networks with multiple layers or types of relationships) capture human search and retrieval patterns. Prior work on semantic memory, people’s memory for facts and concepts, has primarily focused on modeling similarity judgments of pairs of words as distances between points in a high-dimensional space (e.g., LSA by Landauer et al., 1998; Word2Vec by Mikolov et al., 2013). While these decisions seem to accurately account for human similarity in some contexts, it’s very difficult to interpret high dimensional spaces, making it hard to use such representations for scientific research. Further, it is difficult to adapt these spaces to a specific context or task. Instead, I define a series of individual feature networks to construct a multiplex network, where each network in the multiplex captures a “sense” or type of similarity between items. I then optimize the “influence” of each of these feature networks within the multiplex framework, using real world search behavior on a variety of tasks. These tasks include semantic memory search in a cognitive task and information search in Wikipedia. The resulting weighting of the multiplex can capture aspects of human attention and contextual information in these diverse tasks. I explore how this method can provide interpretability to multi-relational data in psychology and other domains by developing an optimization framework that considers not only the presence or absence of relationships but also the nature of the relationships. While I focus on applications of semantic memory, I discuss mathematical proofs and simulation experiments that apply more generally to optimization problems in the multiplex network literature.

Mar 9 Psychology visit day Data Blitz! (WID Orchard Room)

The UW-Madison Psychology department will be hosting its second annual data blitz for prospective graduate students in the Orchard Room of the WID. The data blitz will feature speakers from across the department presenting their research in an easily digestible format. Each talk will be 5 minutes long with an additional 2 minutes for questions at the end of each talk. All are welcome to attend. Below you will find a list of the speakers and the titles of their talks:

    • Rista Plate “Unsupervised learning shifts emotion category boundaries”
    • Ron Pomper “Familiar object salience affects novel word learning”
    • Elise Hopman “Measuring meaning alignment between different languages”
    • Anna Bartel “How diagrams influence students’ mental models of mathematical story problems”
    • Aaron Cochrane “Chronic and phasic interactions between video game playing and addiction”
    • Pierce Edmiston “Correlations between programming languages and beliefs about programming”
    • Chris Racey “Neural processing underlying color preference judgments”
    • Sofiya Hupalo “Corticotropin-releasing factor (CRF) modulation of frontostriatal circuit function”

Mar 16 Varun Jog, Assistant Professor, ECE

Title: Mathematical models for social learning

Abstract: Individuals in a society learn about the world and form opinions not only through their own experiences, but also through interactions with other members in the society. This is an incredibly complex process, and although it is difficult to describe it completely using simple mathematical models, valuable insights may be obtained through such a study. Such social learning models have turned out to be a rich source of problems for probabilists, statisticians, information theorists, and economists. In this talk, we survey different social learning models, describe the necessary mathematical tools to analyze such models, and give examples of results that one may prove through such an approach.

Mar 23 Josh Cisler, Assistant Professor, Department of Psychiatry

Title: Real-time fMRI neurofeedback using whole-brain classifiers with an adaptive implicit emotion regulation task: analytic considerations

Most fMRI neuroimaging studies manipulate a psychological or cognitive variable (e.g., happy versus neutral faces) and observe the manipulation’s impact on brain function (e.g., amygdala activity is greater for happy faces). As such, the causal inference that can be drawn from these studies concerns the effect of cognition on brain function, and not the effect of brain function on cognition. Real-time fMRI refers to processing of fMRI data simultaneous with data acquisition, enabling feedback of current brain states to be presented back to the participant in (near) real-time, thus enabling the participant to use the feedback signals to modify brain states. We are conducting an experiment using real-time fMRI neurofeedback where the feedback signal consists of classifier output (hyperplane distances) from an SVM trained on all grey matter voxels in the brain. The feedback signal is embedded within a commonly used implicit emotion regulation task, such that the task becomes easier or harder depending on the participant’s brain state. This type of ‘closed loop’ design allows for testing whether manipulations of brain state (via feedback) have a measurable impact on cognitive function (task performance). The purpose of this presentation will be to present the experimental design and resulting data properties for the purpose of obtaining feedback and recommendations for understanding and analyzing the complex dynamical systems relations between the feedback signal, brain state, and task performance.
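To make the “hyperplane distance” feedback signal concrete, here is a minimal sketch assuming a linear classifier with known weights and bias. The two-voxel weights, the sample, and the mapping from distance to task difficulty (`task_difficulty`, with its `base` and `gain` parameters) are hypothetical and only illustrate the closed-loop idea; they are not the speaker’s pipeline.

```python
import math


def hyperplane_distance(weights, bias, voxels):
    """Signed distance of one brain-state sample from a linear decision boundary."""
    decision_value = sum(w * v for w, v in zip(weights, voxels)) + bias
    norm = math.sqrt(sum(w * w for w in weights))
    return decision_value / norm


def task_difficulty(distance, base=0.5, gain=0.1):
    """Hypothetical closed-loop rule: map the feedback signal to a difficulty in [0, 1]."""
    return min(1.0, max(0.0, base + gain * distance))


# Toy classifier over two "voxels" (a real one would use all grey matter voxels).
weights = [3.0, 4.0]
bias = -5.0
sample = [3.0, 4.0]

distance = hyperplane_distance(weights, bias, sample)
print("feedback signal:", distance)
print("next-trial difficulty:", task_difficulty(distance))
```

The direction of the mapping (whether a larger distance makes the task harder or easier) is a design choice the abstract does not specify.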

Mar 30 (no meeting, spring break)

Apr 6 Student Research Presentations

Mini talk 1: Ayon Sen, Computer Sciences

For Teaching Perceptual Fluency, Machines Beat Human Experts

In STEM domains, students are expected to acquire domain knowledge from visual representations. Such learning requires perceptual fluency: the ability to intuitively and rapidly see what concepts visuals show and to translate among multiple visuals. Instructional problems that enhance perceptual fluency are highly influenced by sequence effects. Thus far, we lack a principled approach for identifying a sequence of perceptual-fluency problems that promotes robust learning. Here, we describe a novel educational data mining approach that uses machine learning to generate an optimal sequence of visuals for perceptual-fluency problems. In a human experiment related to chemistry, we show that a machine-generated sequence outperforms both a random sequence and a sequence generated by a human domain expert. To our knowledge, our study is the first to show that an educational data mining approach can yield desirable difficulties for perceptual learning.

Mini talk 2: Evan Hernandez, Ara Vartanian, Computer Sciences

Block-based programming environments are popular in computer science education, but the click-and-drag style of these environments renders them inaccessible to students with motor impairments. Vocal user interfaces (VUIs) offer a popular alternative to traditional keyboard and mouse interfaces. We design a VUI for Google Blockly in the traditional Turtle/LOGO setting and discuss the relevant design choices. We then investigate augmentations to educational programming environments. In particular, we describe a method of program synthesis for completing the partial or incorrect programs of students, and ask how educational software may leverage program synthesis to enhance student learning.

Apr 13 (no meeting)

Apr 20

Rob Nowak: All of Machine Learning

Apr 27 (STARTING AT 4PM instead of 3:45pm). Tyler Krucas, The Wisconsin Gaming Alliance.

An Industry Perspective on Data in Game Design and Development

The Wisconsin game development industry offers a surprisingly comprehensive cross section of the types of individuals and teams that develop video games. This includes everything from studios that collaborate on AAA titles such as Call of Duty and Bioshock Infinite, to studios that work largely on mobile or free-to-play games, to studios that primarily work on educational games or games for impact. In all cases, data collection and analysis is an important tool in every step of the game development process. However, the scale of the data collected and its use can vary dramatically from developer to developer. In my talk, I will provide an overview of the game development ecosystem in Wisconsin, as well as examples of the different types of data collection and use practices found in the regional industry. Critically, I’ll frame this discussion in the context of possible links with the HAMLET group – in terms of possible sources of data to address fundamental questions surrounding human learning or behavior as well as possible collaborations.

Fall 2017 Schedule:

Sept 15, Virtual and Physical: Two Frames of Mind, Bilge Mutlu (CS)

In creating interactive technologies, virtual and physical embodiments are often seen as two sides of the same coin. They utilize similar core technologies for perception, planning, and interaction and engage people in similar ways. Thus, designers consider these embodiments to be broadly interchangeable and choice of embodiment to primarily depend on the practical demands of an application. In this talk, I will make the case that virtual and physical embodiments elicit fundamentally different frames of mind in the users of the technology and follow different metaphors for interaction. These differences elicit different expectations, different forms of engagement, and eventually different interaction outcomes. I will discuss the design implications of these differences, arguing for different domains of interaction serving as appropriate context for virtual and physical embodiments.

October 13, Learning semantic representations for text: analysis of recent word embedding methods, Yingyu Liang (CS)

Recent advances in natural language processing build upon the approach of embedding words as low dimensional vectors. The fundamental observation that empirically justifies this approach is that these vectors can capture semantic relations. A probabilistic model for generating text is proposed to mathematically explain this observation and existing popular embedding algorithms. It also reveals surprising connections to classical notions such as Pointwise Mutual Information in computational linguistics, and allows the design of novel, simple, and practical algorithms for applications such as embedding sentences as vectors.
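For readers unfamiliar with Pointwise Mutual Information, the sketch below computes the standard quantity PMI(w, c) = log(p(w, c) / (p(w) p(c))) from a list of (word, context) co-occurrence pairs. The tiny corpus is made up, and the function name `pmi_table` is only illustrative; the talk’s generative model and embedding algorithms are not reproduced here.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """PMI for every observed (word, context) pair, using empirical frequencies."""
    pair_counts = Counter(pairs)
    word_counts = Counter(word for word, _ in pairs)
    context_counts = Counter(context for _, context in pairs)
    total = len(pairs)
    table = {}
    for (word, context), count in pair_counts.items():
        p_joint = count / total
        p_word = word_counts[word] / total
        p_context = context_counts[context] / total
        table[(word, context)] = math.log(p_joint / (p_word * p_context))
    return table


# Hypothetical co-occurrence observations.
pairs = [
    ("ice", "cold"), ("ice", "cold"), ("ice", "hot"),
    ("fire", "hot"), ("fire", "hot"),
]

pmi = pmi_table(pairs)
for (word, context), value in sorted(pmi.items()):
    print(f"PMI({word}, {context}) = {value:.3f}")
```

Pairs that co-occur more often than chance get positive PMI and pairs that co-occur less often than chance get negative PMI.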

October 20, Vlogging about research, Martina Rau (Ed Psych)

October 27, LUCID faculty meeting, No large group meeting

November 3, A Discussion of Open Science Practices, Martha W. Alibali (Psych)

This HAMLET session will be a discussion of open-science practices, led by Martha Alibali. We will start with a brief discussion of the “replication crisis” and “questionable research practices”. We will then discuss solutions, including better research practices, data sharing and preregistration. Please read at least some of the provided papers, and come prepared to ask questions and share your experiences.

Replication crisis paper

Questionable Research Practices (QRPs) and solutions paper

Data sharing paper

Preregistration paper

November 10, Systematic misperceptions of 3D motion explained by Bayesian inference, Bas Rokers (Psych)

Abstract: Over the years, a number of surprising, but seemingly unrelated errors in 3D motion perception have been reported. Given the relevance of accurate motion perception to our everyday life, it is important to understand the cause of these perceptual errors. We considered that these perceptual errors might arise as a natural consequence of estimating motion direction given sensory noise and the geometry of 3D viewing. We characterized the retinal motion signals produced by objects moving along arbitrary trajectories through three dimensions and developed a Bayesian model of perceptual inference. The model predicted a number of known errors, including a lateral bias in the perception of motion trajectories, and a dependency of this bias on stimulus contrast and viewing distance. The model also predicted a number of previously unknown errors, including a dependency of perceptual bias on eccentricity, and a surprising tendency to misreport approaching motion as receding and vice versa. We then used standard 3D displays as well as virtual reality (VR) headsets to test these predictions in naturalistic settings, and established that people make the predicted errors. In sum, we developed a quantitative model of 3D motion perception and provided a parsimonious account for a range of systematic perceptual errors in naturalistic environments.

November 17, Total variation regression under highly correlated designs, Becca Willett (ECE)

Abstract: I will describe a general method for solving high-dimensional linear inverse problems with highly correlated variables. This problem arises regularly in applications like neural decoding from fMRI data, where we often have two orders of magnitude more brain voxels than independent scans. Our approach leverages a graph structure that represents connections among voxels in the brain. This graph can be estimated from side sources, such as diffusion-weighted MRI, or from fMRI data itself. We will explore the underlying models, computational methods, and initial empirical results. This is joint work with Yuan Li and Garvesh Raskutti.

November 24 Thanksgiving Holiday

December 1, Micro-(Shape-And-Motion)-Scopes, Mohit Gupta (CS)

Imagine a drone looking for a safe landing site in a dense forest, or a social robot trying to determine the emotional state of a person by measuring her micro-saccade movements and skin-tremors due to pulse beats, or a surgical robot performing micro-surgery inside the body. In these applications, it is critical to resolve fine geometric details, such as tree twigs; to recover micro-motion due to biometric signals; and to track the precise motion of a robotic arm. Such precision is more than an order-of-magnitude beyond the capabilities of traditional vision techniques. I will talk about our recent work on designing extreme (micro) resolution 3D shape and motion sensors using unconventional but low-cost optics and computational techniques. These methods can measure highly subtle motions (< 10 microns) and highly detailed 3D geometry (< 100 microns). These sensors can potentially detect a person’s pulse or micro-saccade movements, and resolve fine geometric details such as facial features, from a long distance.

December 8, Influence maximization in stochastic and adversarial settings, Po-Ling Loh (ECE)

We consider the problem of influence maximization in fixed networks, for both stochastic and adversarial contagion models. Such models may be used to model infection spreads in epidemiology, as well as the diffusion of information in viral marketing. In the stochastic setting, nodes are infected in waves according to linear threshold or independent cascade models. We establish upper and lower bounds for the influence of a subset of nodes in the network, where the influence is defined as the expected number of infected nodes at the conclusion of the epidemic. We quantify the gap between our upper and lower bounds in the case of the linear threshold model and illustrate the gains of our upper bounds for independent cascade models in relation to existing results. In the adversarial setting, an adversary is allowed to specify the edges through which contagion may spread, and the player chooses sets of nodes to infect in successive rounds. Our main result is to establish upper and lower bounds on the regret for possibly stochastic strategies of the adversary and player. This is joint work with Justin Khim (UPenn) and Varun Jog (UW-Madison).
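As a rough illustration of the stochastic setting, the sketch below estimates influence (the expected number of infected nodes) by Monte Carlo simulation of an independent cascade, where each newly infected node gets one chance to infect each of its neighbors with probability `p`. The four-node graph, the uniform edge probability, and the function names are assumptions for illustration; the talk’s contribution is analytical bounds on this quantity, not simulation.

```python
import random


def simulate_cascade(graph, seeds, p, rng):
    """Run one independent cascade and return the number of infected nodes."""
    infected = set(seeds)
    frontier = list(infected)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # Each newly infected node gets a single attempt per outgoing edge.
                if neighbor not in infected and rng.random() < p:
                    infected.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(infected)


def estimate_influence(graph, seeds, p, trials=2000, seed=0):
    """Average cascade size over many simulated epidemics."""
    rng = random.Random(seed)
    total = sum(simulate_cascade(graph, seeds, p, rng) for _ in range(trials))
    return total / trials


# Hypothetical directed network given as adjacency lists.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

print("estimated influence of {a} at p=0.5:", estimate_influence(graph, {"a"}, 0.5))
```

Influence maximization then asks which seed set of a given size makes this expected count largest.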

For other events, check out our calendar: Seminar and Events. This content and updates can be found at: HAMLET

HAMLET Archives

Fall 2015 archive
Fall 2012 archive
Fall 2011 archive
Spring 2011 archive
Fall 2010 archive
Fall 2009 archive
Spring 2009 archive
Fall 2008 archive

