Leveraging Bubble Matrix Charts for Large Twitter Datasets

This paper describes a technique for creating a useful visualization from a set of 800,000 Twitter follower records.  Centrifuge can easily create a link analysis visualization of this data, but such a visualization is typically difficult to interpret because of the intrinsic limits of displaying this much information.  In such cases, Centrifuge Analytics provides alternate ways to glean important insights from the data.


When a Twitter user “follows” another user, this reflects some degree of connection between those two users because it means that the follower wishes to track what the followed-user is saying.  (It doesn’t mean that the follower is sympathetic — it could be the follower disagrees with the other user, but wishes to stay apprised of what he’s saying.)

The collection of this data can be represented as a directed graph where graph nodes are users and graph edges represent the follow relationship.
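
As a minimal sketch of this representation, the follower records can be held as (follower, followed) pairs and indexed as adjacency sets. The record layout, the function name build_follow_graph, and the sample user names are illustrative assumptions; they are not taken from the paper's dataset or from Centrifuge itself.

```python
from collections import defaultdict


def build_follow_graph(records):
    """Index (follower, followed) pairs as a directed graph.

    Returns two dicts of sets:
      following[u] -> users that u follows (outgoing edges)
      followers[u] -> users that follow u (incoming edges)
    """
    following = defaultdict(set)
    followers = defaultdict(set)
    for follower, followed in records:
        following[follower].add(followed)
        followers[followed].add(follower)
    return following, followers


# Hypothetical sample records, for illustration only.
records = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("alice", "dave"),
    ("bob", "alice"),
]
following, followers = build_follow_graph(records)
```

Keeping both directions makes it cheap to ask either "whom does this user follow?" or "who follows this user?" without rescanning the records.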

There are scenarios in which people are interested in analyzing and better understanding who follows whom on Twitter.  This could include determining where the “clusters” of followers lie, the overlaps between clusters, etc.

Centrifuge does an excellent job of rapidly computing and displaying a link analysis diagram for large sets of data.  But link analysis diagrams with this much data are hard to interpret in any tool.  (Imagine making sense of a photograph of 100,000 people.)

This paper describes an alternate visualization technique in which we pre-process the data to make it amenable to display in what is often called a bubble matrix.  The bubble matrix is similar to what Excel calls a Bubble Chart; in the bubble matrix, however, the labels are the same on both axes (though not necessarily in the same sequence).  Where two labels intersect, a circle (bubble) is drawn, and the bubble’s diameter is proportional to some value associated with that pair of labels.
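
One way such pre-processing could look is sketched below. This page does not say which value the paper assigns to each pair of labels, so the sketch assumes the value is the number of followers two followed users have in common. The function names, the max_diameter parameter, and the sample data are likewise hypothetical.

```python
from itertools import combinations_with_replacement


def bubble_matrix(followers, labels):
    """Return {(row_label, col_label): shared follower count}.

    ASSUMPTION: the cell value is the size of the overlap between the
    two users' follower sets; the paper may use a different measure.
    """
    cells = {}
    for a, b in combinations_with_replacement(labels, 2):
        shared = len(followers.get(a, set()) & followers.get(b, set()))
        cells[(a, b)] = shared
        cells[(b, a)] = shared  # the matrix is symmetric
    return cells


def bubble_diameters(cells, max_diameter=40.0):
    """Scale each cell value linearly so diameter is proportional to value."""
    largest = max(cells.values(), default=0)
    if largest == 0:
        return {key: 0.0 for key in cells}
    return {key: max_diameter * value / largest for key, value in cells.items()}


# Hypothetical follower sets, keyed by the followed user.
followers = {
    "bob": {"alice", "carol", "erin"},
    "dave": {"alice", "erin"},
    "frank": {"carol"},
}
labels = ["bob", "dave", "frank"]
cells = bubble_matrix(followers, labels)
diameters = bubble_diameters(cells)
```

Reducing 800,000 edge records to one number per label pair is what makes the result small enough to read at a glance; the diagonal cells simply hold each user's own follower count.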

Bubble Matrix Chart

Please download the full PDF version of this paper.


More Papers: Please see the request form below.

Big Data is Here to Stay
Data Visualization for Fraud Analysis
Next Generation Information Visualization
Preventing Retail Fraud


Big Data is Here to Stay:  Big Data is everywhere. IDC projects that by 2[…] by most people outside of the Intelligence Community (IC) is the fact that “intelligence” is the collective result of qualified bits (noise) of information, which together provide the inferences and potential conclusions when these bits are uniquely connected and correlated.

This paper explores interactive analytics as it applies to the intelligence community and provides case studies in the areas of Threat Finance against counter-insurgency operations and cyber security against network intrusion attacks to illustrate the power of the Centrifuge approach.

Data Visualization for Fraud Analysis:  Today more than ever, fraud investigators are faced with unprecedented challenges as they attempt to accurately identify fraud and money laundering activity. Investigators are asked to operate in shrinking windows of time, while the volume and velocity of data pouring in grow exponentially. In most investigative processes, the single most important component is human judgment. So the question is “Where is the analyst-centric innovation?” One approach that has proven highly effective in this environment is called Interactive Analytics.

This paper explores this subject in depth. It isolates the three most important factors needed to overcome the challenges faced with current approaches. At a time when the reputation of financial institutions is at stake and regulatory compliance standards are on the increase, effective new approaches could not be more relevant.

Next Generation Information Visualization:  The big data analysis problem today is real and significant. Users are drowning in data and need new and innovative approaches to support effective decision making. Existing products in the business intelligence (BI) and analytics arena are simply not meeting the mark. However, information visualization has proven to be highly effective at navigating through and exploring massive amounts of data.

This paper discusses the characteristics of the next generation information visualization products and how they must allow analysts to rapidly assimilate, comprehend and act on all of the information at their disposal, even when they don’t know what questions to ask in advance. Specifically, next generation products will support highly interactive visualizations, collaborative analysis and integrated views.

Preventing Retail Fraud:  Fraud continues to be one of the most pervasive threats to the success of any retailer. But while the problem is well-known throughout the industry, fraud detection has been an elusive problem to address. The ever-expanding array of fraud strategies and vulnerabilities is leading retailers to an unavoidable conclusion: modern challenges require modern solutions. The most promising application of this concept appears to be coming from the field of big data analytics.

Data analytics is also well-suited to the dynamic nature of modern retail. As market trends and fraudulent tactics evolve over time, businesses can be confident that their monitoring tools will respond in step, alerting analysts to new potential vulnerabilities.

White Paper(s) Request Form