Last week in the PREP data science workshop, Tim Rogers introduced data science using R and created this tutorial for graphing a semantic network. The tutorial is available as both a Jupyter notebook and a plain HTML file.
Learn about working with graphs in R by building graphs that visualize a semantic network of animals. The tutorial is intended for people with only a little knowledge of R and/or graphs.
For those of you who have Jupyter Notebook and would like the full interactive experience, Tim’s GitHub repository, Tutorial for Graphing a Semantic Network, includes an .ipynb file that lets you run the demo interactively. Just clone or download the repository, open the notebook, and you are set. The notebook must be in the same directory as the listX.txt files.
Each listX.txt file contains a list of animals produced by a single volunteer in an animal-fluency task. Each volunteer was given two minutes to type out as many animals as she could think of, in whatever order they occurred to her. The tutorial works through the process of reading these data into R and using the igraph and visNetwork packages to construct a network that captures links in memory amongst the various animals produced. You can add your own data, or those of others, by just adding new listX.txt files.
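As a rough idea of what reading the lists and turning them into a network can look like, here is a minimal base R sketch. It is not the tutorial's own code. It assumes one animal per line in each file and assumes that two animals are linked when one is typed directly after the other; the two tiny lists it writes to a temporary folder are made up purely so the example runs on its own.

```r
# Illustrative only: write two tiny made-up lists to a temporary folder
# so the sketch runs without the real listX.txt files.
data_dir <- tempfile("lists")
dir.create(data_dir)
writeLines(c("dog", "cat", "cow", "horse"), file.path(data_dir, "list1.txt"))
writeLines(c("cat", "dog", "lion", "tiger"), file.path(data_dir, "list2.txt"))

# Find every file named list<number>.txt.
files <- list.files(data_dir, pattern = "^list[0-9]+\\.txt$", full.names = TRUE)

# Read each file as a character vector (assumes one animal per line).
lists <- lapply(files, function(f) tolower(trimws(readLines(f))))

# Assumption: two animals are linked when one directly follows the other.
pairs <- do.call(rbind, lapply(lists, function(x) {
  data.frame(from = head(x, -1), to = tail(x, -1), stringsAsFactors = FALSE)
}))

# Treat links as undirected: order each pair alphabetically, then count repeats.
undirected <- data.frame(
  from   = pmin(pairs$from, pairs$to),
  to     = pmax(pairs$from, pairs$to),
  weight = 1,
  stringsAsFactors = FALSE
)
edges <- aggregate(weight ~ from + to, data = undirected, FUN = sum)

# One row per distinct animal.
nodes <- data.frame(id = sort(unique(c(edges$from, edges$to))),
                    stringsAsFactors = FALSE)
```

Data frames shaped like `nodes` and `edges` are the kind of input that igraph's `graph_from_data_frame()` and visNetwork's `visNetwork()` accept, although the tutorial itself may build its graph differently.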
For those of you who don’t have Jupyter Notebook but want to run through the tutorial in R or RStudio, just open the .html file and copy and paste the blocks of R code from your browser into your console. In this case, make sure to set the R working directory to the location where the listX.txt files reside.
Tutorial for Graphing a Semantic Network (html file)
The Semantic Network tutorial will walk through the process of building and visualizing graphs (network data structures) in R, introducing some basic features of R along the way. Graphs are useful structures for describing many different kinds of data. As an example domain, we will use graphs to visualize mental structures–specifically, the concepts and relations amongst various different animals that exist in your mind!
In this visualization you can zoom in and out, click and drag the whole graph or individual nodes, and highlight all the items connected to an individual node by hovering over it or clicking it with the mouse. visNetwork also has some cool algorithms for automatically sorting the layout. When you hover over a node, its “title” attribute pops up in a yellow box. I find it much easier to see what is going on when I can manipulate the graph this way.
In this case the semantic structure is now clearer. The purple nodes mainly contain farm animals; the blue nodes are mainly water animals; the red nodes are mainly zoo animals, etc. So how do the preceding commands work to generate this image?
The special operator %>%, which visNetwork makes available, is a kind of “pipeline.” It basically says: take the output of the preceding command and pass it on as the first input argument to the next command. So the above code first runs visNetwork with the n and l dataframes we made as arguments and starts to generate a visualization. Before anything is rendered, the visualization object gets passed to the next command, visOptions. visOptions is a function that sets some options for the visualization, in this case enabling the “highlight nearest” function when you hover with a mouse. The output of that function is in turn passed forward to the visPhysics function, which controls parameters of the simulated physics in the visualization.
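To make the pipe concrete, the sketch below builds the same widget twice, once with %>% and once with an ordinary nested call. The `n` and `l` data frames here are small hypothetical stand-ins for the ones built in the tutorial, not the real data.

```r
library(visNetwork)

# Hypothetical stand-ins for the tutorial's node (n) and link (l) data frames.
n <- data.frame(id = 1:3, label = c("dog", "cat", "cow"))
l <- data.frame(from = c(1, 2), to = c(2, 3))

# Piped form: the widget from visNetwork() becomes the first argument
# of visOptions().
piped <- visNetwork(n, l) %>%
  visOptions(highlightNearest = TRUE)

# Nested form: the same two calls written without the pipe.
nested <- visOptions(visNetwork(n, l), highlightNearest = TRUE)

# Printing either object at the console (or in a notebook cell) renders it.
```

The piped form reads top to bottom in the order the steps happen, which is why it is convenient once several options are chained together.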
That’s right: the visualization is actually running a physics simulation in your browser. The layout of the nodes and edges is determined as follows. visNetwork treats each node as kind of like a little electrical particle, all with the same charge. The closer two nodes are together, the more they repel each other. The edges are treated as little springs: the further apart two connected nodes get, the more the spring tries to pull them back together. visNetwork starts everything off in a random position, then simulates the resulting physics. Nodes that are all connected to one another have many “springs” joining them, so they get pulled toward one another despite the repulsion. Nodes that are connected by few springs get pushed apart. The whole graph self-organizes into a visualization where nodes in the same community tend to be laid out in similar locations.
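The idea can be sketched in a few lines of base R. This is a deliberately simplified toy, not the solver visNetwork actually runs: the graph, the force formulas, and the constants are all illustrative assumptions.

```r
set.seed(1)

# Hypothetical toy graph: two triangles (1-2-3 and 4-5-6) joined by one edge.
edges <- rbind(c(1, 2), c(2, 3), c(1, 3),
               c(4, 5), c(5, 6), c(4, 6),
               c(3, 4))
n <- 6

# Start every node at a random position.
pos <- matrix(runif(n * 2, -1, 1), ncol = 2)

repulsion <- 0.05  # strength of node-to-node repulsion (illustrative value)
spring    <- 0.1   # stiffness of the edge "springs" (illustrative value)
step      <- 0.1   # how far nodes move per iteration

for (iter in 1:500) {
  force <- matrix(0, n, 2)

  # Every pair of nodes repels, more strongly the closer they are.
  for (i in 1:n) for (j in 1:n) if (i != j) {
    d <- pos[i, ] - pos[j, ]
    r <- max(sqrt(sum(d^2)), 0.1)   # floor the distance to avoid huge forces
    force[i, ] <- force[i, ] + repulsion * d / r^3
  }

  # Every edge pulls its two endpoints together, harder the further apart.
  for (k in seq_len(nrow(edges))) {
    i <- edges[k, 1]
    j <- edges[k, 2]
    d <- pos[j, ] - pos[i, ]
    force[i, ] <- force[i, ] + spring * d
    force[j, ] <- force[j, ] - spring * d
  }

  pos <- pos + step * force
}

# plot(pos); segments(pos[edges[, 1], 1], pos[edges[, 1], 2],
#                     pos[edges[, 2], 1], pos[edges[, 2], 2])
```

Uncommenting the last lines draws the result; the two triangles should settle into two separate clumps joined by the single bridging edge.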
The arguments in visPhysics basically say: use an algorithm called “force Atlas 2” to simulate the physics, and set the gravitational constant (amount of repulsion) to -2. The more negative that number gets, the stronger the repulsion, and the further apart the nodes will be.
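A call of that shape might look like the sketch below. The solver name `"forceAtlas2Based"` and the `gravitationalConstant` entry are how visNetwork's `visPhysics()` exposes this setting; the tiny `n` and `l` data frames are hypothetical stand-ins, and the value -50 is included only to contrast with the -2 discussed above.

```r
library(visNetwork)

# Hypothetical stand-ins for the tutorial's node (n) and link (l) data frames.
n <- data.frame(id = 1:4, label = c("dog", "cat", "cow", "horse"))
l <- data.frame(from = c(1, 2, 3), to = c(2, 3, 4))

# Mild repulsion: nodes end up fairly close together.
tight <- visNetwork(n, l) %>%
  visPhysics(solver = "forceAtlas2Based",
             forceAtlas2Based = list(gravitationalConstant = -2))

# Stronger repulsion (more negative): nodes spread further apart.
loose <- visNetwork(n, l) %>%
  visPhysics(solver = "forceAtlas2Based",
             forceAtlas2Based = list(gravitationalConstant = -50))
```

Rendering `tight` and `loose` side by side is a quick way to see what the constant does before tuning it on the full animal network.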
visNetwork has many alternative ways of laying out graphs, including all of the layouts available from igraph. Check out the visNetwork guide to read more.
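As one small example of an igraph layout, the sketch below uses visNetwork's `visIgraphLayout()` to place nodes on a circle. It assumes the igraph package is installed, and the `n` and `l` data frames are again hypothetical stand-ins rather than the tutorial's data.

```r
library(visNetwork)  # visIgraphLayout() also needs the igraph package installed

# Hypothetical stand-ins for the tutorial's node (n) and link (l) data frames.
n <- data.frame(id = 1:4, label = c("dog", "cat", "cow", "horse"))
l <- data.frame(from = c(1, 2, 3), to = c(2, 3, 4))

# Compute node positions with igraph's circle layout instead of
# letting the browser's physics simulation place them.
circle <- visNetwork(n, l) %>%
  visIgraphLayout(layout = "layout_in_circle")
```

Swapping `"layout_in_circle"` for the name of another igraph layout function changes the arrangement without touching the rest of the pipeline.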