July 31, 2013

What visual analytics can (not) tell us: centrality of nodes in a network graph

Visual analytics promises insights into massive amounts of data that would otherwise be intractable. The challenge is to find visual representations of the data that are (easily) accessible.

One particularly useful and frequently found example of a data structure is a network graph. A commonly used visual representation of such a graph looks like the following:

Imagine that the nodes represent inventors, and two nodes are connected by an edge if the corresponding inventors collaborate. Who are the best-connected inventors? In terms of network analysis and graph theory: which nodes have the highest centrality, i.e., lie at the center of a (local) hub of nodes? Obviously, for the left of the two networks shown above, it is the node lying in the middle that is the most central. Or is it? Actually, both graphs are exactly the same: four nodes, each connected to every other node. It is just the visual representation that is different. By the way, this particular graph is called the tetrahedral graph.
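To make this concrete, here is a minimal sketch in plain Python. The node names and the two sets of coordinates are made up for illustration; the centrality measure used is degree centrality (the share of other nodes a node is directly connected to), which is just one of several possible choices. The point is that the layouts never enter the computation.

```python
from itertools import combinations

# The tetrahedral graph: four nodes, each connected to every other node.
nodes = ["a", "b", "c", "d"]
edges = list(combinations(nodes, 2))

# Two hypothetical drawings of the same graph (coordinates are illustrative).
# The first places "d" in the middle of a triangle, the second uses a square.
layout_triangle = {"a": (0, 0), "b": (2, 0), "c": (1, 2), "d": (1, 0.7)}
layout_square = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}


def degree_centrality(nodes, edges):
    """Fraction of the other nodes that each node is directly connected to."""
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {n: degree[n] / (len(nodes) - 1) for n in nodes}


# Only nodes and edges are used; neither layout affects the result.
centrality = degree_centrality(nodes, edges)
print(centrality)
```

Every node gets the same score, whichever drawing you look at: the node "in the middle" is no more central than the others.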

But fear not: we can measure centrality independently of the particular graph drawing. In fact, there are quite a number of different concepts of centrality in a network (Google’s PageRank being one of them). Therefore, if you really want to know who the key players in a network are, color-code their centrality, like so:

This graph is based on actual data of collaborating inventors and high-tech institutions. The size of each node represents the institution’s or inventor’s patenting activity. See the small dark blue node? This inventor might not have a large portfolio, but she is well-connected!
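The color-coding idea can be sketched in a few lines of plain Python. This is not the method behind the figure above: the small graph, the choice of closeness centrality, and the light-to-dark-blue color ramp are all assumptions made for illustration. Closeness centrality here is the number of other nodes divided by the sum of shortest-path distances to them, so the sketch assumes a connected graph.

```python
from collections import deque

# Hypothetical collaboration network (edges are illustrative, not real data).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e")]

# Build an undirected adjacency list.
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)


def closeness_centrality(adjacency):
    """(n - 1) / sum of shortest-path distances, assuming a connected graph."""
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest path lengths from source.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        scores[source] = (len(adjacency) - 1) / sum(distance.values())
    return scores


def to_blue(value, lo, hi):
    """Map a score to a hex color: light blue (low) to dark blue (high)."""
    t = 0.0 if hi == lo else (value - lo) / (hi - lo)
    light, dark = (200, 220, 255), (0, 0, 139)
    rgb = [round(l + (d - l) * t) for l, d in zip(light, dark)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


closeness = closeness_centrality(adjacency)
lo, hi = min(closeness.values()), max(closeness.values())
colors = {node: to_blue(score, lo, hi) for node, score in closeness.items()}
print(colors)
```

The resulting `colors` dictionary could be handed to whatever drawing tool you use as the node fill color, while node size stays free to encode something else, such as patenting activity.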
