Difference Heatmaps

Comparing two data sets is often much more interesting than analyzing just one. A single data set tells you how things are, but comparing data sets reveals the core differences between two data-generating processes or the evolution of one process over time. Such information forms a solid base for strategic decisions and monitoring. For example, you might want to see how your market visibility has evolved compared to the previous month, or to understand the strengths and weaknesses of two competing solutions. For such an analysis, you collect data that typically comprises a large number of entries, each having one or more measured attributes.

For data sets with only one measured attribute, simple statistical visualizations such as histograms work just fine. However, applying such one-dimensional methods to multi-attribute data can lead to wrong conclusions or to missing something essential. To avoid those risks, prefer visualization methods that summarize all the data in a single view.

Scatter plots are perhaps the simplest and most widely used visualization type for 2- or 3-dimensional data. They work for small and easy data sets, but a scatter plot of a large or 'noisy' data set is usually nothing but a mess. For example, the xyz-plot below tries to visualize two data sets (blue and red dots), but figuring out differences or similarities from that picture is practically impossible.

HeatMiner® offers a solution in the form of three-dimensional difference heatmaps that visualize the core differences between the given data sets. The picture below shows a difference heatmap generated from the same two data sets visualized earlier as a scatter plot. The first immediate advantage is that the visualization shows only those parts of the xyz-space that have enough points or at least some differences. It thus filters out the outliers and uninteresting parts to focus our attention on the essential. Secondly, HeatMiner uses colors to illustrate the difference in point density: areas where the first data set has more points are painted blue, while red indicates that the second data set is denser. Areas with identical point densities are painted gray.

Difference heatmap generated with HeatMiner®, the visual data mining technology by Agience Oy Ltd
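
To make the idea concrete, here is a minimal sketch of a density-difference grid in plain Python. It is not HeatMiner's actual algorithm, which is not described here. The function names, the cubic grid, the `cell_size` and `min_points` parameters, and the normalization by data set size are all assumptions made for illustration.

```python
from collections import Counter


def bin_points(points, cell_size):
    """Count points per cubic grid cell; keys are integer (i, j, k) indices."""
    return Counter(tuple(int(c // cell_size) for c in p) for p in points)


def difference_heatmap(first, second, cell_size=1.0, min_points=2):
    """Map each sufficiently populated cell to its density difference.

    Negative value: the first data set is denser (would be painted blue).
    Positive value: the second data set is denser (would be painted red).
    Zero: identical densities (would be painted gray).
    """
    a = bin_points(first, cell_size)
    b = bin_points(second, cell_size)
    result = {}
    for cell in a.keys() | b.keys():
        if a[cell] + b[cell] < min_points:
            continue  # assumed rule: sparse cells count as outliers and are hidden
        # Divide by data set size so sets of different sizes stay comparable.
        result[cell] = b[cell] / len(second) - a[cell] / len(first)
    return result


def color_of(diff, tol=1e-9):
    """Translate a density difference into the color scheme described above."""
    if diff < -tol:
        return "blue"
    if diff > tol:
        return "red"
    return "gray"


# Two tiny made-up data sets of (x, y, z) points.
first = [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6), (2.1, 2.2, 2.3), (5.5, 5.5, 5.5)]
second = [(0.2, 0.1, 0.4), (2.5, 2.5, 2.5), (2.6, 2.7, 2.8), (9.0, 9.0, 9.0)]

heatmap = difference_heatmap(first, second)
colors = {cell: color_of(diff) for cell, diff in heatmap.items()}
```

In this toy example the lone points near (5.5, 5.5, 5.5) and (9, 9, 9) fall into cells with too few points and disappear from the result, which mirrors the outlier filtering described above.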

Although the picture above gives a good overview, it does not yet show the most significant differences, which are probably inside the visualized 3D shape. Therefore, HeatMiner can create cross sections to reveal the inner parts of the shape. The cross section below shows that the densities indeed differ the most in the middle (yellow and cyan areas).

Difference heatmap cross section generated with HeatMiner®, the visual data mining technology by Agience Oy Ltd
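
A cross section can be thought of as fixing one coordinate of the 3D grid and looking at the remaining two. The sketch below assumes the density differences are stored in a dictionary keyed by integer (i, j, k) cell indices. That representation and the `cross_section` helper are hypothetical and do not come from HeatMiner itself.

```python
def cross_section(heatmap, axis, index):
    """Return the 2D slice of a 3D cell grid at `index` along `axis` (0, 1 or 2)."""
    return {
        cell[:axis] + cell[axis + 1:]: value
        for cell, value in heatmap.items()
        if cell[axis] == index
    }


# Made-up density differences: small on the outer layers, large in the middle.
heatmap = {
    (0, 0, 0): 0.01, (1, 0, 0): -0.02,
    (0, 0, 1): 0.30, (1, 0, 1): -0.25,
    (0, 0, 2): 0.02,
}

# Cut through the middle layer along the z axis.
middle = cross_section(heatmap, axis=2, index=1)
```

Here `middle` contains only the two cells of the layer k = 1, now keyed by their (i, j) indices, and these are the cells with the largest differences in this toy grid.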

HeatMiner can also create a view that shows both the overall trend and the most significant details. This is achieved by visualizing the big shape as a see-through heatmap and the core differences as smaller 3D shapes. In the picture below, the two shapes inside the wireframe enclose the areas where the density difference exceeds a given threshold.

Difference between two data sets visualized with HeatMiner®, the visual data mining technology by Agience Oy Ltd
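
One possible way to find such core shapes is to keep only the grid cells whose difference exceeds the threshold and then group neighboring cells into regions. The sketch below does this with a breadth-first search. It assumes the threshold applies to the absolute difference and that cells sharing a face belong to the same shape. Both choices, as well as the `core_regions` helper, are illustrative assumptions rather than HeatMiner's documented behavior.

```python
from collections import deque


def core_regions(heatmap, threshold):
    """Group cells with |difference| > threshold into face-connected regions."""
    core = {cell for cell, diff in heatmap.items() if abs(diff) > threshold}
    regions = []
    while core:
        start = core.pop()
        region, queue = {start}, deque([start])
        while queue:
            i, j, k = queue.popleft()
            neighbors = (
                (i + 1, j, k), (i - 1, j, k),
                (i, j + 1, k), (i, j - 1, k),
                (i, j, k + 1), (i, j, k - 1),
            )
            for n in neighbors:
                if n in core:
                    core.remove(n)  # mark as visited
                    region.add(n)
                    queue.append(n)
        regions.append(region)
    return regions


# Made-up density differences keyed by (i, j, k) cell index.
heatmap = {
    (0, 0, 0): 0.30, (1, 0, 0): 0.25, (2, 0, 0): 0.01,
    (3, 0, 0): -0.40, (3, 1, 0): -0.35, (5, 5, 5): 0.02,
}

regions = core_regions(heatmap, threshold=0.2)
```

With these numbers the search finds two separate regions, one where the second data set dominates and one where the first does, while the low-difference cells are left to the surrounding see-through shape.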

Try HeatMiner!

Some of these great HeatMiner® heatmap visualizations are now available as-a-Service at the Cloud'N'Sci.fi algorithm marketplace. Go to the HeatMiner homepage and create heatmaps from your own data using free evaluation campaigns! New heatmap types and demos are introduced frequently on the new HeatMiner wiki pages, so keep an eye on them for updates.

Want to see more?

Check the latest HeatMiner solutions and demos!

