Gephi initiator interview: how “Semiotics matter”

Today I have the honor of interviewing a special member of the Gephi team: Mathieu Jacomy.

Mathieu is an engineer, a founder of the WebAtlas NGO, a teacher at Sciences Po Paris, and leads R&D in the TIC Migrations program at the Fondation Maison des Sciences de l’Homme and the Telecom ParisTech school.
He is the main developer of the “Navicrawler” software. He also created the first Gephi prototype.

 

Sebastien Heymann: Hi Mathieu Jacomy, you are the creator of Graphiltre, the first Gephi prototype, which you developed in 2006. What was the purpose of making yet another graph software?
Mathieu Jacomy: Hi! I’m glad to answer your questions, and I hope our readers will be pleased to know more about Gephi.

At that time I was analyzing a lot of graphs and I wasn’t satisfied with the existing free tools. That’s why I started to build my own tools.

I had no money for professional tools, and I needed to understand precisely what the software was doing: free, open-source software fit these constraints perfectly.
I was using the amazing software Guess, created by Eytan Adar, who built it for his own needs. I was doing much the same thing as him, and I couldn’t have started exploring graphs without this tool.
But I wasn’t satisfied because the software didn’t allow much manipulation. I couldn’t look at the substructures as easily as I wanted, and it was difficult to make nice cartographies.
I was dreaming of a “graph-dedicated Photoshop“, a visualization-oriented software rather than a script-oriented tool.

A good way to figure out what I mean is to look at the spatialization process. In well-known software such as Pajek or Guess, you have algorithms called “layout”, “force-vectors” or “energy model”. These algorithms give the graph its shape, and this is probably the most critical part of the process of building a clear visualization, because the substructures or “patterns” that one may see in the image strongly depend on the algorithm and the settings chosen. But at the same time, most users also want to quickly look at the global shape of the graph, and may not be aware that it’s important to search for the best algorithm depending on the time you have, the quality you want, the size of the graph, its degree distribution, the substructure that you expect to recognize… I was careful with these algorithms, but even though I understood their principles and specificities, I couldn’t figure out how they were transforming the graph, and I couldn’t evaluate their differences.

Why? Because in these programs you can’t:
– Manipulate the graph while the algorithm is running
– Modify the settings while the algorithm is running
– And sometimes, you can’t even see the graph while the algorithm is running
How can you possibly understand what’s happening there? Of course I started to work on software that allowed this. But the same kind of problem appears again in other parts of the process, like filtering, image exporting… Pajek is clearly built from a mathematical perspective. Guess is more user-friendly, but not enough. I didn’t want a tool for mathematics experts, but a tool for people who actually have to explore and understand graphs. A professional tool for a job that didn’t exist at the time.
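
Editor's illustration: the three complaints above can be made concrete with a small Python sketch of a layout that advances one short step at a time, so that a caller can move a node or change a setting between two steps. This is not the code of Graphiltre, Gephi, Guess or Pajek. The class name `SpringLayout`, its settings (`repulsion`, `attraction`, `max_move`) and the toy force formulas are all assumptions chosen for brevity.

```python
import math
import random


class SpringLayout:
    """Toy force-directed layout advanced one small step at a time (illustrative only)."""

    def __init__(self, nodes, edges, seed=0):
        rng = random.Random(seed)
        self.edges = list(edges)
        self.pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
        self.pinned = set()      # nodes currently held by the user
        # Hypothetical settings; they can be changed between two steps.
        self.repulsion = 0.05
        self.attraction = 0.1
        self.max_move = 0.1      # cap on the displacement of a node per step

    def step(self):
        """Run one iteration: every pair of nodes repels, linked nodes attract."""
        force = {n: [0.0, 0.0] for n in self.pos}
        names = list(self.pos)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                dx = self.pos[a][0] - self.pos[b][0]
                dy = self.pos[a][1] - self.pos[b][1]
                d2 = max(dx * dx + dy * dy, 1e-9)
                fx = self.repulsion * dx / d2
                fy = self.repulsion * dy / d2
                force[a][0] += fx
                force[a][1] += fy
                force[b][0] -= fx
                force[b][1] -= fy
        for a, b in self.edges:
            fx = self.attraction * (self.pos[b][0] - self.pos[a][0])
            fy = self.attraction * (self.pos[b][1] - self.pos[a][1])
            force[a][0] += fx
            force[a][1] += fy
            force[b][0] -= fx
            force[b][1] -= fy
        for n, (fx, fy) in force.items():
            if n in self.pinned:
                continue         # a held node stays where the user put it
            length = math.hypot(fx, fy)
            if length == 0.0:
                continue
            scale = min(1.0, self.max_move / length)
            self.pos[n][0] += fx * scale
            self.pos[n][1] += fy * scale

    def drag(self, node, x, y):
        """Move a node by hand and hold it there while the layout keeps running."""
        self.pos[node] = [x, y]
        self.pinned.add(node)

    def release(self, node):
        self.pinned.discard(node)


layout = SpringLayout(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
for _ in range(50):
    layout.step()                # a display could redraw after every step
layout.drag("a", 5.0, 5.0)       # manipulate the graph while the layout runs
layout.attraction = 0.2          # modify a setting while the layout runs
for _ in range(300):
    layout.step()
print(layout.pos)
```

Because the algorithm is exposed as `step()` rather than as a single blocking call, drawing, dragging and tuning can all be interleaved with it, which is the property the list above asks for.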

This was the starting point of “Graphiltre“. Building a graph exploration system so that you can understand what you are doing by looking at what happens on the screen, and do anything (including filtering) without typing a single script line.

Graphiltre prototype, 2006
Sebastien Heymann: So how was Graphiltre able to satisfy both your technical and usability requirements?
Mathieu Jacomy: Graphiltre allowed me to “twist” graphs as I wanted. One of the most valuable “feelings” you can experience with it is moving nodes while the spatialization is running, to see how many nodes follow, how the local and global structures react, at what speed and in what way. I was really pleased, even if it was a lot of work. But of course the tool was just a prototype and most of the necessary features were missing. That’s why, and because I used the same graph format as Guess (the .gdf), I switched from one tool to the other depending on my needs.

Then if you ask me: “Why didn’t you include your work inside Guess?”, of course you’ll be right. I thought hard about it, because creating a new tool means developing many basic features all over again, for almost nothing.
To be honest, I just couldn’t do that. I wasn’t good enough to understand the source code of Guess – actually I tried! But behind this lies another important issue: the inner structure of Guess (including a live script editing feature, several graphic engines, the JUNG graph core library, SQL bindings…) was too shy in my opinion. This software didn’t make a strong choice. Some very different options stay unchosen in Guess. I’ll give you an example. It is based on the “Piccolo” graphic library, which is good even if not optimized for graphs; but if you look closely you’ll see that you can actually switch from this module to another module such as TouchGraph. Even though most users keep using Piccolo, the default option, Guess is locked by the need to tune its features to each competing library. And this kind of problem lies everywhere in Guess because it’s composite software, a puzzle built from various sources.
I’m criticizing Guess but I must say that I have a lot of respect for this software, for Eytan Adar and his team. They opened my mind, and in many ways Guess had the taste of the future; it was a more decisive improvement to the world of graph-workers than Pajek, because for the first time you could use your body to interact with graphs, not only your brain. You could actually “handle things”. The problem with Guess is just that it was conceived as a “research tool” and not as “general public software”, probably for obvious reasons (time, research priorities…). As an engineer, I started Graphiltre from another perspective.

Fortunately, with another design, you see things differently and you can sometimes make some improvements. There is a performance issue in Guess. I had big graphs to study, and Guess was way too slow for me. I started to measure the time it took to spatialize a graph and I understood that it was difficult for Piccolo to display a large graph. And by large, I mean more than 100 nodes / 1000 edges – which is actually quite small. Of course a complex graph is a lot of information, but it’s only simple shapes: one-color lines, circles, squares and sometimes letters. You know that in a computer a CPU can do many different things, but isn’t so efficient at simple and repetitive tasks. I thought about the millions of polygons displayed in video games: the idea was to benefit from the power of GPUs. And I needed to rethink the display engine compared to the multiple solutions of Guess. I wanted to first improve the display performance, and then try to use GPU power to improve spatialization algorithms.

I was thrilled to use some video game technologies, and it turned out to be a good idea. Graphiltre has used an OpenGL binding since the beginning, and since the beginning it has been way faster than Guess for display. But I didn’t achieve the “GPGPU” perspective (general-purpose computing on graphics processing units), even if today it’s easier to do – and I just think that “physics engines” like “Havok” already do what we need…

SVG Line and arrow links
Sebastien Heymann: You handed the lead development over to Mathieu Bastian in September 2007. At that time, Graphiltre was renamed “Gephi 0.5”, and Mathieu then started the first complete code rewrite, achieving Gephi 0.6 one year later. Now that you know how the project evolved, would you hand the project over to the same person again? And first of all, why did you stop contributing to the core development?
Mathieu Jacomy: I’m not a software developer. The most that I can be is a software designer. I started this tool not for itself, but to do my job quicker, deeper, better. When it appeared that Graphiltre was a good idea, I felt that I couldn’t keep improving it, because I wasn’t good enough at it, but also because I didn’t want it to take all my time – that would have been counter-productive! As I searched for someone to keep the tool going, Mathieu was the right man in the right place – and with the right first name! I told him all about my vision and he took care of it. When it appeared that he had contributed more to the project than I had, he officially became the leader and had the right to give a real name to the software: Gephi. He clearly exceeded my expectations. I finally have the tool I dreamed of, and my energy can be used for other purposes: the project is now in good hands.

Mathieu did really impressive work on the software. And a consequence is that now, it’s too difficult for me to contribute to the core of the code. But there is a difference for me: Mathieu made Gephi a contributive tool, based on a modern open-source philosophy. That’s why I keep improving some aspects such as spatialization, user interface, and filters. I still share my vision with the Gephi team and I’m always here if someone needs an opinion or advice. I like discussing specifications and development perspectives a lot. But I think that my main role is to make Gephi stay in touch with the right concepts about graphs.

Gephi 0.6 Beta2, March 2009
Sebastien Heymann: You also helped to create the first version of the GEXF specification, the network file format used by Gephi. GEXF is now supported by a consortium of industrial and research leaders in France. Why not use the good old GraphML? What are the differences?
Mathieu Jacomy: I have to say that it’s in part a question of vanity. You’re young, the world is yours, you don’t know well what the others do and you think that what you do is better. You want to expand your territory, and the best way to do it when you lead a software project is to create a new format. The users will be forced to use it, and it will be a mark of your “technological power”. It’s a common mistake when people innovate, but there’s something good in it: you have a deeper and faster understanding of something if you do it again, rather than just reading the fucking manual. But it’s not a good reason to keep using .gexf.

Even if a consortium supports this format, that isn’t a good reason to keep using it. First, they don’t necessarily need it, nor actually use it. They can support it for other reasons. Actually, for the same reason I told you before: technological influence.

This question is a good one for innovators, and it’s important to give a clear answer because it touches on what a technology is, and on what a technology looks like. That’s why we need to separate two questions:
1) Why isn’t the .gexf based on GraphML?
2) Why is there a new format associated with Gephi?

The answer to 1) is: Vanity, laziness, and the possibility of taking another direction later. There is no strong technical reason.
The answer to 2) is: Because we don’t claim to be compatible with anyone. Gephi already has some specificities that make it not fully compatible with other software. And this will be more and more the case. Because it has a strong identity, and I’m speaking about features, it’s useful to make it clear that it’s not using “generic” graph files. That’s why there is a name and a format linked to Gephi: because you have to know that if you open a “Gephi graph” in other software, you’ll probably lose some information.

These two separate answers nevertheless have something in common: freedom. The .gexf file format is something the community can handle; it’s easy to implement a specific feature, because only a few people have to agree on the format. It represents the freedom of the community, the opportunity to design its own tools for its own needs. Its worth is as a safeguard, as freedom of movement, not in itself. In my opinion, if we don’t implement strong specificities in the future, leaving this format may come into question. But its existence is useful for the moment, and might be decisive some day.

GEXF.net website, November 2009
Sebastien Heymann: You’re currently contributing to Gephi 0.7 specifications, in particular on features and UI design. What are the “mantras” behind this?
Mathieu Jacomy: I think that every innovation stands on a vision – even if a strong vision doesn’t guarantee a successful innovation. And my vision was that graphs should be brought to people who want to handle, twist and stretch them. And of course, represent them, as cartographies. I met mathematicians who had worked on graphs for 20 years and never had the idea of visualizing them as a picture. This sounded crazy, because graphs are concrete to me. Graphs are actual. And since they are, they should be handled easily. Graphs exist for the body, and it’s not a metaphor: people who twist and stretch graphs actually feel physical interactions; these allow them to understand the structures. Remember that the first idea for spatializing a graph was to model it as a physical system of springs.

So my goal is to keep Gephi simple and handy for users. It isn’t easy, for two reasons. First, graphs are complex – the purpose of Gephi is to allow users to understand them and share this knowledge, with a cartography for example. Second, making graph handling easy requires high performance, which in turn requires high structural complexity in the software. But we achieved some improvements in this direction, and I think that the 0.7 version is a milestone for that. Gephi is much more complicated than my original Graphiltre prototype, but it stays simple enough for users, and is even simpler in many respects. Many features have appeared and the user interface is richer, but the workflow is more fluid.

But the secret, if there is one, the ‘mantra’ behind Gephi’s development philosophy, isn’t the vision itself. The key to success in this kind of innovative process is to keep the good concepts at the center of the tool. I don’t have a “technology for technology’s sake” philosophy. Tools are made for users, and everything has to be about them. For example there is something I forbid: implementing an algorithm with no way for users to access it. It may sound strange, but in research people sometimes do this, because they think that if it’s in the code and if they can use it (with a command line for example) then it’s OK. But it’s not: sharing with others also means respecting them, and respecting their right not to be comfortable with “computer scientist style” interfaces. It’s an example of the wrong things to do. But what are the right things to do? There is no definitive answer to this question, because you innovate by making something new, and it means that you have to forget what previously was right or wrong. Nevertheless there is a vision behind an innovation: you have the intuition of what’s good or not. That’s why I say that the concepts are the key. Dedicate your tool to your concepts. This is, I think, my best advice.

You understood that I want user-centric tools. But the user isn’t actually at the center of the software. Mainly because when you develop software, the user isn’t there. So what’s the link with the user? The user interface? No. It’s the concepts that make your software “work”. The user interface only stems from them. You have to think “the user will use my concepts to do his work, and I have to guide him so that he understands and benefits from them”. It’s your responsibility as a software designer to assume that there is a hidden power in any tool.
Think of a screwdriver: how do you design the handle of a screwdriver? Long or short? Heavy or light? Thick or thin? Square or round section? Your concept is that the user has to push the back of the screwdriver to drive the screw well, with the arm in the axis of the tool. You know that if the user holds the screwdriver in his hand like a spoon, he won’t do good work. Your power is to force the user to push the back of the tool. Your responsibility is to assume this power and to make it serve the user. Your guideline for the interface will be to prevent the user from using your screwdriver like a spoon. That’s why you’ll choose a short handle, so that the user doesn’t want to grasp it. You’ll design a big flat back, so that it’s comfortable for the user to push it. You’ll make the tool easy for the right use and difficult for wrong uses. If you do that, you don’t expect users to read the fucking manual. They’ll learn from using the tool.

Yes, users will learn from using the tools. And this is my point. The value of an innovation is the value of the concept behind the tool, plus the value of learning it by practice. This is the mantra of Gephi design. I was successful in my job because I knew that the graphical aspect of a graph is very important. I made high-quality spatializations, and very nice pictures. You can read Jacques Bertin to learn more about that. I’ll just say that my concept was: semiotics matter. As soon as a graph is spatialized, it’s read. As soon as it’s read, the signs in it, the system of signs it is, have to be carefully tuned. This was my secret, and the idea behind developing and sharing a tool was to help people achieve high-quality work. I wanted them to understand that semiotics matter. But rather than writing a book, I wrote software. And now people make great cartographies with Gephi because they benefit from the concepts I put inside Gephi. And I take responsibility for forcing them to use my concepts, even if they don’t realize it. And now they are their concepts as well as mine. Franck Ghitalla calls this principle “to embody Human Sciences concepts in tools”, and he is right. This is the key. And I’ll give you a clear example.

When you spatialize a graph, a mathematical algorithm will make nodes converge to a locally optimal position. This means that the nodes are mathematically well placed. But as a system of signs, the graph may not be satisfying. For example, suppose two nodes are very close to one another. If you show the names of the nodes, they hide each other or overlap so that you can’t read them. What’s the meaning of a mathematically perfect position if you can’t read the cartography? My idea was to implement an algorithm that shifted nodes just enough to make the names readable. The loss is mathematically insignificant, and the image improves a lot. You know, my concept, “Semiotics matter”… But to achieve it you have to make the size of your text (graph as a system of signs) accessible to the mathematical algorithms (graph as a theoretical object). This is mathematically weird, but we don’t care. Concepts at the center: we designed Gephi so that it’s possible, that’s all. This feature is a real innovation (it involves new design principles for graph software) and most users love it.
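
Editor's illustration: the idea of shifting nodes just enough to make their names readable can be sketched in a few lines of Python. This is a guess at the general principle, not Gephi's actual implementation. The function names, the fixed character width `char_w`, the label `height` and the `step` size are all hypothetical, and label sizes are assumed to be expressed in the same units as node coordinates.

```python
import math


def label_box(x, y, text, char_w=0.6, height=1.0):
    """Axis-aligned box of a label centered on its node (assumed sizes, in layout units)."""
    half_w = char_w * len(text) / 2.0
    return (x - half_w, y - height / 2.0, x + half_w, y + height / 2.0)


def boxes_overlap(a, b):
    """True when two (left, bottom, right, top) boxes share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def adjust_labels(pos, labels, step=0.05, max_iter=1000):
    """Push apart, in small steps, every pair of nodes whose labels overlap."""
    pos = {n: [p[0], p[1]] for n, p in pos.items()}   # work on a copy
    names = list(pos)
    for _ in range(max_iter):
        moved = False
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                box_a = label_box(pos[a][0], pos[a][1], labels[a])
                box_b = label_box(pos[b][0], pos[b][1], labels[b])
                if not boxes_overlap(box_a, box_b):
                    continue
                dx = pos[b][0] - pos[a][0]
                dy = pos[b][1] - pos[a][1]
                d = math.hypot(dx, dy)
                if d == 0.0:
                    dx, dy, d = 0.0, 1.0, 1.0   # same spot: pick an arbitrary direction
                pos[a][0] -= step * dx / d
                pos[a][1] -= step * dy / d
                pos[b][0] += step * dx / d
                pos[b][1] += step * dy / d
                moved = True
        if not moved:
            break                               # every label is readable
    return pos


before = {"Gephi": (0.0, 0.0), "Guess": (0.1, 0.2)}
labels = {"Gephi": "Gephi", "Guess": "Guess"}
after = adjust_labels(before, labels)
print(after)
```

The point of the sketch is the one made above: the text size (`label_box`) is fed into the positioning loop, so the layout treats the graph as a system of signs and not only as a mathematical object.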

Web cartography, 2008
Sebastien Heymann: Do you intend to take part in the Gephi project in another way?
Mathieu Jacomy: I honestly don’t know! It depends on many things that I can’t tell here 🙂
Readability improvements on printed maps, 2009
Sebastien Heymann: Last questions: How do you use Gephi, and what do you achieve with it? Do you plan to extend it and create plugins?
Mathieu Jacomy: I’ll build new plug-ins and share them, of course, but I’ll keep them secret until they’re ready. Don’t tell anyone!
I use Gephi to analyze graphs, and to make ever nicer cartographies. I also teach students how to achieve web cartographies with Gephi at Sciences Po in Paris. For more information about what I do, take a tour of WebAtlas.fr (in French only) or type my name into Google ;). Or…
…wait for it…
…just use Gephi!

 


  • New Plugin-oriented architecture
  • New User Interface
  • New Cartography Creator module
  • New Network Statistics
  • and more…

 

Discover Gephi 0.7 now!

2 Comments

  1. Really good interview and in depth exploration of what makes gephi innovative and so important in our rising networked society.

    We, at linkfluence, are of course waiting for the new gephi and are very proud to be part of the story since the beginning now and observing the gephi community growing.

    Best !
