Graph visualization on the web with Gephi and Seadragon

The project takes another big step forward and brings dynamic graph exploration to the web, in one click from Gephi, with the Seadragon Web Export plugin.

Mathieu Bastian and Julian Bilcke worked on a Seadragon export plugin. It lets you directly export large graph pictures and put them on the web. Seadragon is pure JavaScript and works in all modern browsers. Because it uses image tiles (like Google Maps), there is no graph size limit.
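To see why tiling removes the size limit, here is a rough Python sketch (not the plugin’s code). It assumes square 256-pixel tiles without overlap, a value chosen only for illustration, and `tiles_for_viewport` is a hypothetical helper. The point is that the number of tiles the browser needs at one zoom level depends on the viewport, not on the size of the exported picture.

```python
import math


def tiles_for_viewport(view_w, view_h, tile=256):
    """Upper bound on the tiles a viewport can intersect at one zoom level.

    tile=256 is an assumed tile size, not a value taken from the plugin.
    """
    # A viewport that is not aligned with the tile grid can straddle
    # one extra column and one extra row of tiles.
    cols = math.ceil(view_w / tile) + 1
    rows = math.ceil(view_h / tile) + 1
    return cols * rows


if __name__ == "__main__":
    print(tiles_for_viewport(1024, 768))
```

Under these assumptions a 1024×768 viewport needs at most 20 tiles per zoom level, however large the exported canvas is.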

Go to your Gephi installation and then to the Plugin Center (Tools > Plugin) to install the plugin. You can also download the plugin archive manually or get the source code.

/seadragon-samples/diseasome/seadragon.html

Sample with Diseasome Network dataset directly exported from Gephi

Communication about (large) graphs is currently limited because it’s not easy to put them on the web. Graph visualization has much the same aims as other types of visualization and needs powerful web support. We have been thinking for a long time about the best way to do this, and found that there is no perfect solution. We need efficiency, interactivity and portability at the same time. The simplicity of building and hacking the system is also important, as we want developers to be able to improve it easily.

By comparing technologies we found that Seadragon is the best short-term solution, with minimum effort and maximum results. It still has a serious limitation, however: interactivity. No search and no click on nodes are possible for the moment. But as it is JS, I don’t see any hurdle to adding these features in the future. Help is needed.

The table below shows our conclusions on the technologies we are considering. We are eager to discuss them on the forum. As performance is the most important requirement, WebGL is a serious candidate, but development would require time and resources. We plan to start a WebGL visualization engine prototype next summer, for Google Summer of Code 2011, but we would like to discuss specifications with anyone interested and build this together.

Portability Efficiency Effort Interactivity
Flash
Java2D/Processing
Canvas (Processing.js/RaphaelJS)
WebGL
Seadragon
Figure: Comparing technologies able to display networks on the web.

How to use the plugin?

Install the plugin from Gephi: open “Tools > Plugin” and find Seadragon Web Export. After restarting Gephi, the plugin is available in the export menu. Load a sample network and try the plugin. Go to the Preview tab to configure rendering settings such as colors, labels and edges.

Export directly from Gephi Export menu

The settings ask for a valid directory to export the files to and for the size of the canvas. The bigger the canvas, the more you can zoom in, but the longer it takes to generate and to load.
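This trade-off can be estimated with a small Python sketch. It assumes a Deep Zoom style pyramid in which the image is halved level by level down to a single pixel and cut into 256-pixel tiles without overlap. The plugin’s real tile size and overlap may differ, and `pyramid_stats` is a hypothetical helper, not part of the plugin.

```python
import math


def pyramid_stats(width, height, tile=256):
    """Return (levels, total_tiles) for an image pyramid.

    Assumptions: each level halves the previous one (rounded up) until
    the image is 1x1 pixel, and tiles are square with no overlap.
    """
    levels = 0
    total_tiles = 0
    w, h = width, height
    while True:
        # Count the tiles needed to cover this level.
        total_tiles += math.ceil(w / tile) * math.ceil(h / tile)
        levels += 1
        if w == 1 and h == 1:
            break
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    return levels, total_tiles


if __name__ == "__main__":
    for side in (1024, 2048, 4096):
        print(side, pyramid_stats(side, side))
```

Under these assumptions a 1024×1024 canvas yields 29 tiles and a 4096×4096 canvas 349, so the number of files to generate grows roughly with the canvas area.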

Export settings, configure the size of the image

Note that the result on the local hard drive can’t be viewed with Chrome, due to a bug. Run Chrome with the “--allow-file-access-from-files” option to make it work.
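Another way around the local-file restriction, offered here only as a suggestion, is to serve the export directory over HTTP on your own machine so that the browser no longer loads it from a file path. A minimal Python sketch using only the standard library; the function name, the directory name `seadragon-export` and the port are placeholders.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_export_server(directory, port=8000):
    """Build a local HTTP server that serves the files in `directory`.

    The server only listens on 127.0.0.1, so it is not reachable from
    other machines. Pass port=0 to let the OS pick a free port.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Example use (placeholder directory name; blocks until interrupted):
#   server = make_export_server("seadragon-export", 8000)
#   server.serve_forever()
# Then open http://127.0.0.1:8000/seadragon.html in the browser.
```

This requires Python 3.7 or later, where `SimpleHTTPRequestHandler` accepts a `directory` argument.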

Kudos to Microsoft Live Labs for this great library, released under the Ms-PL open source license. Thank you to Franck Cuny for the CPAN Explorer project that inspired this plugin. Other interesting projects are GEXF Explorer, a Flash-based dynamic widget, and gexf4js, which loads GEXF files into Protovis.

Gephi initiator interview: how “Semiotics matter”

Today I have the honor of interviewing a special member of the Gephi team: Mathieu Jacomy.

Mathieu is an engineer, a founder of the WebAtlas NGO and a teacher at Sciences Po Paris, and he leads R&D in the TIC Migrations program at the Fondation Maison des Sciences de l’Homme and the Telecom ParisTech school.
He is the main developer of the “Navicrawler” software. He also created the first Gephi prototype.

 

Sebastien Heymann: Hi Mathieu Jacomy, you are the creator of Graphiltre, the first Gephi prototype, which you developed in 2006. What was the purpose of making yet another graph software?
Mathieu Jacomy: Hi! I’m glad to answer your questions, and I hope our readers will be pleased to know more about Gephi.

At that time I was analyzing a lot of graphs and I wasn’t satisfied with the existing free tools. That’s why I started to build my own.

I had no money for professional tools, and I needed to understand precisely what the software was doing: free, open source software fits these constraints perfectly.
I was using the amazing software Guess by Eytan Adar, which he built for his own needs. I was doing much the same thing as him, and I couldn’t have started exploring graphs without this tool.
But I wasn’t satisfied because the software didn’t allow much manipulation. I couldn’t look at the substructures as easily as I wanted, and it was difficult to make nice cartographies.
I was dreaming of a “graph-dedicated Photoshop”, a visualization-oriented software rather than a script-oriented tool.

A good way to see what I mean is to look at the spatialization process. In well-known software such as Pajek or Guess, you have algorithms called “layout”, “force-vectors” or “energy model”. These algorithms give the graph its shape, and this is probably the most critical part of building a clear visualization, because the substructures or “patterns” that one may see in the image strongly depend on the algorithm and the settings chosen. But at the same time, most users also want to quickly look at the global shape of the graph, and may not be aware that it’s important to search for the best algorithm depending on the time you have, the quality you want, the size of the graph, its degree distribution, the substructures that you expect to recognize… I was careful with these algorithms, but even though I understood their principles and specificities, I couldn’t figure out how they were transforming the graph, and I couldn’t evaluate their differences.

Why? Because in these programs you can’t:
– Manipulate the graph while the algorithm is running
– Modify the settings while the algorithm is running
– And sometimes, you can’t even see the graph while the algorithm is running
How can you possibly understand what’s happening there? Of course I started to work on a software that allowed this. But the same kinds of problems appear again in other parts of the process, like filtering, image exporting… Pajek is clearly built from a mathematical perspective. Guess is more user-friendly, but not enough. I didn’t want a tool for mathematics experts, but a tool for people who actually have to explore and understand graphs. A professional tool for a job that didn’t exist at that time.

This was the starting point of “Graphiltre”: building a graph exploration system in which you can understand what you are doing by looking at what happens on the screen, and do anything (including filtering) without typing a single line of script.


Diseasome at Online Information'09

Diseasome.eu, the human disease network map, was presented at Online Information’09, the international symposium of information systems. Our talk shows how interactive exploration of complex networks can lead to innovative interfaces for document discovery.

CPAN-Explorer, an interactive exploration of the Perl ecosystem

We are proud to announce the first Gephi-based system for exploring a complex network, CPAN-Explorer. This visualization project aims to analyze the relationships between the developers and the packages of the Perl language, organized as the CPAN community (Comprehensive Perl Archive Network). Produced by RTGI Labs and our team, it was initially discussed in a talk at FPW’09.

You can download the original graph source files from each subproject page.
Available formats are: GEXF (Gephi graph format), GDF (Guess graph format), SVG, and PDF.
For some of the subprojects, an embedded JavaScript visualization is also available. For the community graph, a special Flash webpage is available for online exploration.

Website: http://www.cpan-explorer.org/

map of the Perl community on the Web

We generated two maps (authors and modules) using the CPANTS data. For the websites, we crawled from a seed generated from the CPAN pages of those authors. Each of these graphs is generated using a force-based algorithm.

All the maps are available as PDF files, under a Creative Commons licence. The slides are in French, but we will explain the three maps here.

Flash interface

CPAN’s modules

The first map is about the modules available on the CPAN. We selected the modules that are listed as dependencies by at least 10 other modules, together with the modules that use them. This graph is composed of 7193 nodes (or modules) and 17510 edges. Some clusters are interesting:

  • LWP and URI are really the center of the CPAN
  • a lot of web modules (XML::*, TemplateToolkit, HTML::Parser, …)
  • TK is isolated from the CPAN
  • Moose, DBIx::Class and Catalyst form a group.

This data is from March; we will try to produce a newer version of this map this summer. That one will be really interesting, as Catalyst has switched to Moose.
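The selection rule for this map (modules required by at least 10 others, plus the modules that use them) can be sketched in Python as follows. This is an illustration under assumptions, not the code used for the map: the input is taken to be a list of (module, required module) pairs, and `select_modules` is a hypothetical helper.

```python
from collections import defaultdict


def select_modules(dependencies, min_dependents=10):
    """Keep widely used modules and the modules that depend on them.

    dependencies: iterable of (module, required_module) pairs
    (an assumed input format). Returns (nodes, edges).
    """
    # For each module, collect the distinct modules that require it.
    dependents = defaultdict(set)
    for module, required in dependencies:
        if module != required:
            dependents[required].add(module)

    # Modules listed as a dependency by at least `min_dependents` others.
    core = {m for m, users in dependents.items() if len(users) >= min_dependents}

    nodes = set(core)
    edges = set()
    for required in core:
        for module in dependents[required]:
            nodes.add(module)
            edges.add((module, required))
    return nodes, edges
```

The resulting node and edge sets could then be written to any of the graph formats listed above and laid out in Gephi.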

The CPAN’s authors

This map is about the authors on the CPAN. There are about 700 authors and their connections. Each time an author uses a module by another author, a link is created.
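As a rough Python sketch of that linking rule: given a mapping from module to author and a list of (module, required module) pairs, both assumed input formats, author links can be derived like this. The helper `author_links` and the idea of counting repeated uses as a link weight are assumptions made for illustration.

```python
from collections import Counter


def author_links(module_author, dependencies):
    """Count links between authors from module dependencies.

    module_author: dict mapping a module name to its author.
    dependencies: iterable of (module, required_module) pairs.
    Returns a Counter keyed by (using_author, used_author).
    """
    links = Counter()
    for module, required in dependencies:
        user = module_author.get(module)
        owner = module_author.get(required)
        # Skip unknown modules and authors using their own modules.
        if user is None or owner is None or user == owner:
            continue
        links[(user, owner)] += 1
    return links
```

For example, if a hypothetical author `alice` has two modules that both require a module by `bob`, the result contains a single link from `alice` to `bob` with a count of 2.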

  • Modern Perl, made up of Moose, Catalyst and DBIx::Class. Important authors are Steven, Sartak, perigin, jrockway, mstrout, nothingmuch, marcus ramberg
  • Slaven Rezić and other TK developers are on the border

Web map

We crawled the web from the seed generated using the CPAN authors’ pages.

  • again, the “modern group”, at the top of the map, with the Moose/Catalyst/DBIx::Class developers
  • some companies, such as shadowcat and iinteractive in the middle of the “modern Perl” group, Booking in the middle of the YAPC websites (they are a major sponsor of these events), 6apart, …
  • perl.org is the reference for the Perl community (the site is oriented on their sides)
  • cpan.org is the reference for the open source community
  • github is in the center of the Perl community. It’s widely adopted by Perl developers. It offers all the “social media” features that are missing on the CPAN

We hope you like these visualisations. Have fun analyzing them 🙂

Thanks to Franck for the original post.
