Posts by Mathieu Jacomy

Creator of Gephi (but not lead developer!). Engineer and researcher at Sciences Po médialab. I specialize in digital methods for the social sciences, sometimes called digital sociology or digital humanities.

Meet Eduardo, our new lead developer

Mathieu Bastian has been our lead developer for more than ten years. He is now the proud father of an adorable little girl, congratulations! 🍾🎈🎉 On this occasion he decided to step down from his leading role in Gephi development and hand over the reins to Eduardo Ramos Ibáñez.

Mathieu has been the true architect of Gephi’s source code. Not only is he its most prolific author, he is also the engineer who reflected on its structure, drew its blueprints, and built its foundations. He transformed my clunky 2007 prototype into actual software over half a dozen complete refactorings, never backing away from a challenge. We owe him everything that makes it work: the ability to be installed, to be maintained, to have plugins… There would be no Gephi without him. Today is a good occasion to say: thank you, Mathieu, for your years of service to the project! Fortunately he will remain an active member of the community – we would be lost without his invaluable knowledge of the most intricate depths of Gephi’s source code.

So who is Eduardo? Let him present himself:

I am a Spanish software developer, currently living in Madrid, and I have been helping to maintain Gephi for several years. I love creating interesting software and trying to push its limits, especially in data visualization!

I am kind of a progressive music fan, and a cat lover 🙂

You can follow or contact me on twitter @eduramiba

Eduardo is the person who knows Gephi’s source code best after Mathieu, and it is only natural that he is next in line to lead development efforts. You already know his work, since he developed the data laboratory almost entirely, but as is often the case, an important part of his contribution is less visible: maintaining the source code, fixing bugs… This is how he became an expert in Gephi development over the years. He is now developing a new OpenGL engine for Gephi 1.0. Welcome Eduardo, and thank you for stepping up to this new role!

If you want to know more about the situation and future of Gephi, we wrote about it in this blog post.

Is Gephi obsolete? Situation and perspectives.

Note: because it’s a bit long, you can also read it on Medium.

After years of collaboration, Eduardo, Mathieu and I are sitting together in the same place for the first time. The Gephi community mainly exists online, and its members have few occasions to see each other in person. But we have to talk. Mathieu Bastian is Gephi’s lead developer and currently lives in Berlin. Eduardo Ramos Ibáñez is the second most prolific contributor after Mathieu and the only other person who knows Gephi’s core in depth. He lives in Madrid. As for me, Mathieu Jacomy, who started the project, I live in Paris. We just arrived in Berlin to have an in-depth talk about Gephi: the state of the project, its relevance, its future. Our goal is to question the Gephi project and reevaluate our commitment to it. We need a picture of the different options. We start with the elephant in the room: is the project still worth it? Here is our answer.

What is wrong with the Gephi project

We aim to identify the project’s strengths and weaknesses. It is not only about evaluating whether its benefits outweigh its issues, but also about finding the right course of action. Let us start with the problems.

As in many niche open source projects, our most limited resource is technical leadership. What does that mean? It is a consequence of Gephi’s code being fairly complicated. Fortunately this is not an issue for all contributors: for instance, it is pretty easy to implement a statistics plugin. Many parts of Gephi could be improved by plugin developers, but not all parts. Sometimes we need to modify the architecture itself, or a deep and specific part like the GraphStore engine. When that comes up, only a few community members are competent, namely Mathieu and Eduardo. Coding these parts would not require crazy skills, but it does require a fair amount of Gephi-specific knowledge. Unfortunately that knowledge is imprisoned in the brains of two people (well, it’s still better than one!). This is what we call the bottleneck of technical leadership. We may choose to fix core issues ourselves or to disseminate the knowledge to other developers, but both scenarios require the precious time of Eduardo and/or Mathieu.

Technology is changing, we must adapt, and it wears out technical leadership. It is obvious to developers but not to users: we cannot just produce a version of Gephi that works well and leave it at that. It would stop working because technology changes. New versions of Java and new operating systems would break features that work well in today’s environment. Sadly, when incompatibilities arise, it is generally up to the core developers to deal with them. We were in such a situation before version 0.9: the new GraphStore engine was not ready yet but Java compatibility broke, and during that time Mac users were not able to use Gephi without a convoluted workaround. We are not sure we can keep up efficiently with these changes, because of our limited technical leadership.

Technology evolves in an unfavorable direction. User experience is at the center of the Gephi project. Unfortunately it seems that the Java language tends to drift away from user interface design and development. Admittedly, it has never been a strength of Java. This technology does not support modern UI design – I feel like Java assumes that the UI will be developed by an engineer rather than a designer. It may become even worse. With the obsolescence of OpenGL on Mac and the removal of JavaFX from the runtime environment, we could live in a world where multiplatform software has a Java brain and a web face. Gephi is based on the JOGL library, whose development is increasingly uncertain, which forces us to consider alternatives like WebGL. We understand that it makes sense to delegate modern UI design to a well-established environment (HTML5 and friends). However WebGL is far from matching OpenGL’s stability and performance. We think that from the user’s standpoint, Gephi is a lot about forging one’s network exploration and analysis skills on small and easy cases, and scaling them up to larger, more complicated cases. Thanks to its OpenGL engine, it works almost as well for networks of hundreds of thousands of nodes as for networks of tens of nodes. If the ability to visualize huge datasets is key to Gephi, then web technologies are not a viable alternative. We have no definite solution to this issue and we might be facing a technological dead end in the not so distant future.

Gephi is not only about tech. As a project it must also face the changes in the lives of its key contributors. Mathieu just had his first child, and more generally our careers follow their own paths that do not always align with the needs of the project. On the one hand we become more efficient at what we are doing, but on the other hand we have less and less time to dedicate to the project. In fact, we just have less spare time. We do not want Gephi to die but we are at risk of becoming tired of the burden it represents. We did not lose our desire for this unexpected journey, but reality often knocks on the door and it would be dishonest to omit this aspect of the situation.

Finally, user needs are also changing. Users can access many other systems for network analysis and visualization. A market of web-based solutions emerged and each system found a niche to settle in. A landscape of network tools. Gephi is not necessary anymore, if it ever was. Complex networks were once the most fashionable trick of social science pioneers in a big data world, but now they have UMAP and deep neural networks. Complex networks entered a “business as usual” era. They ceased to draw the attention of the most creative minds. Complex networks had their moment, and it passed. We do not think that it is bad or sad; it might actually be an opportunity. Nevertheless the context has changed and it is possible that Gephi is no longer what people need. So what do they need?

What is right with Gephi

We believe that Gephi actually still meets some needs, sometimes in its own unique way. Note that these ideas are not the outcome of a systematic study, but stem from our empirical contacts with users, during workshops, online, or in our everyday lives. Eduardo, Mathieu and I largely converged in our feelings.

First of all, Gephi still has a public, and it lies mostly in the sphere of education and research. The Facebook community is active and often features the visualization of digital data from a social science perspective, such as Twitter networks. Since it is the main place to ask for help, it also attracts a certain amount of exotic tinkering and experimentation. The Gephi community is about more than just using the software: it is also a space where people share what they have done, discuss various topics, and get feedback. It has something of a subculture. We believe that Gephi has some appeal to curious minds, and that it helps a certain public get engaged with network analysis. Following who mentions Gephi on Twitter also made us realize that “Gephi” is sometimes used as a label to refer to a visual exploration. This seems to be particularly the case in social network analysis (SNA), the community where Gephi spread the most. Since they emerged, the digital humanities have also made wide use of Gephi. From what we observe, Gephi tends to be used more in social science and by beginners, but it is nevertheless used in natural sciences and by advanced users like data scientists. We can measure its success in the research sector by its 3780 citations (counted in Google Scholar). This public probably finds something in Gephi that it does not find elsewhere, even if just that it is free. This fairly large number of users is still a good reason to keep maintaining and developing it.

Gephi also has some specificities that could be lost with it in the unfortunate event that its development comes to an end. It has its niche and many users value it for what makes it special. We believe that this specificity comes in three parts. (1) It is free software that you can install easily on multiple platforms. This makes it one of the few inexpensive options for teaching, workshops etc. (2) It approaches network analysis from a graphical and interactive perspective that is more intuitive than the math equations of graph theory. It can be understood by non-experts such as students, data journalists, or social science researchers reaching beyond their core competencies. (3) It allows you to scale up your network analysis and exploration skills to much bigger networks. Its learning curve bridges small qualitative networks with large quantitative datasets. The effects of complexity and the way you explore data will be very different but the basic tools at your disposal will stay the same (layout, statistics, filtering…). Gephi is an all-around tool that allows beginners to understand the gist of network exploration. It is at its best in a pedagogical setting where people will leverage practice to improve their data analysis skills.

I want to mention that some of the things that make Gephi appealing are not, in our view, essential. We are well aware that Gephi makes it possible to produce impressive images and that the sight of a spatialization layout unfolding a network has something fascinating about it. These certainly are an important factor in its success. They also play a role in user engagement with data, which is key to progressing in data science. However these attractive features only make sense insofar as they lead users to improve their network analysis skills. Though Gephi may be used to produce “data porn”, we believe it does not end there. Toying is just the first step towards the ability to get insights out of networks. Other devices might produce evocative visualizations, but Gephi is one of the few that actively leverage play to arouse interest in science (in the field of network analysis).

Where the Gephi project currently stands

Gephi is not the only software for network analysis and, more importantly, it does not want to be. Depending on one’s style and skills, other options might be preferable. NetworkX might be more flexible if you know Python. To draw diagrams you should head for GraphViz. If you are a biologist, Cytoscape is the tool your community is using… and have you tried NodeXL? Different devices do different things and Gephi does not want to be all of that. In the past we have been tempted to build a generic tool for any kind of network, even the dreaded dynamic hierarchical mixed weighted graph. We now want to focus on what Gephi does best and articulate it with other tools that have specific benefits.

We think that Gephi’s niche is visual, interactive exploration of common types of networks with a set of features that are not too specific, and that scale to large numbers of nodes and edges. We have observed that most users tend to explore networks spanning multiple orders of magnitude: from 10 to 10K nodes, or from 100 to 10M nodes… We believe that this is a key feature. Conversely we do not believe that producing a static map is its main mission. Other tools are in a better position for that task, and we prioritize exploration features over graphic outputs. Instant visual feedback is central to Gephi’s identity. What it is in the best position to do is to make things visible when users apply an algorithm to their network. Fostering this kind of awareness helps users reflect on their method, make sense of their activity, and streamline their workflow.

The Gephi Toolkit has lost most of its relevance. Graph processing libraries like NetworkX have matured and feature most if not all of the operations you can do in Gephi. The Toolkit is basically a separate branch of the project that requires a certain amount of maintenance. It drains forces from the main project. Considering that Gephi’s source code is open and that it is possible to tinker and experiment without the Toolkit, we believe that it would make sense to discontinue it – though we have not officially pulled the trigger so far.

Refocusing Gephi is not only about removing parts, but also about filling holes. For instance, though we will deprecate hierarchical graphs because they are not so common, we are considering support for parallel edges, which are well represented in datasets. In the same spirit, because spatialization layouts are so central to user experience, we are considering adding algorithms that evaluate the quality of a layout and other features supporting visual network analysis. For instance we believe that edge visualization should be improved in the exploration panel. Last but not least, refocusing Gephi is also about reordering the general user interface to put emphasis on what is important and simplify what is not. Reflections about Gephi’s future user interface have already been presented in a previous blog post.

Finally it is worth talking about the project. We like that Gephi is opinionated, multiplatform, free, and open source. We do not want to change any of that. We will not go as far as writing a manifesto but we state here that Gephi is not a company, we do not want it to be a company, and it will not become one. This does not mean that there can be no economic activity involving Gephi, but that when it happens it is not hosted by the project. So what is the Gephi project? An informal network of contributors that involves multiple individuals to various degrees, with no clear boundaries, and where anyone can bring their own thing to the project. However being free and open does not mean that we have no structure: the GPL 3 licence protects the project, code and content have authors, and different persons have different roles. Gephi is not only software and plugins but also a website, a blog, a Facebook community… A good part of people’s energy goes into producing content. There is a Gephi project around the Gephi software, and it might become increasingly important.

As a conclusion to this section, let us summarize what Gephi is and will remain:

  • Free
  • Open source
  • Extensible by the community
  • Multiplatform
  • Installable as a normal software
  • With local based files (no cloud hosting, works offline)
  • User centric
  • Focused on exploration
  • Beginner friendly (as much as possible)
  • Opinionated – it will not always do what other tools do.

Gephi’s future: version 1.0 and beyond

An important part of our discussion revolves around future features. It is not only about what Gephi should focus on, but also about what we can do in today’s and tomorrow’s context. As explained above, we have limited technical leadership and we are constrained by the evolution of Java and OpenGL. This leads us to ask which features are feasible in the current state of Gephi and which features would require a paradigm change. We are not only imagining future Gephi but also future future Gephi (what our project could be if we challenged a number of underlying assumptions). We have two different horizons: Gephi 1.0, a focused version of today’s software, and Gephi 2, a possible future on a different ground.

For Gephi 2 we are anticipating that Java will not fully support our needs, and we are considering porting a part (and possibly all) of the software to a different platform. The current technological context incentivizes us to use a Java brain behind a web-based face, but WebGL is still a bottleneck for big networks. We have no good solution but one might emerge in time. We are also acknowledging the blooming of the network analysis ecosystem and we believe that a single piece of software might not be the best solution to address a constellation of user needs. For instance if Gephi focuses more on exploration, it leaves room for a different tool for network publication. This tool might be a part of our project and not be the software itself. It might not sound dramatic but for us it is a decisive psychological step to think of the project as multiple tools and not just the Java software. It brings clarity to our intentions and opens new possibilities to address difficult problems.

Future features: fragments of road map

Gephi 1.0 can feature a number of changes that make sense as a natural extension of today’s Gephi, while the more dramatic changes are postponed to Gephi 2. We have no clear picture of what Gephi 2 might be, but its existence helps us select the right features for the near future. Here is a list of improvements we would like to implement before moving to a different paradigm.

  • UNDO feature, limited to the “GEXF scope”: network data, metadata, positions, sizes, colors…
  • Default save to GEXF. More stable than “.gephi” though it does not save the state of the user interface.
  • Activity log, possibly coordinated to undo, possibly stored in the GEXF. A plugin is already exploring that direction.
  • Parallel edges. The GraphStore supports it but not the rest of Gephi.
  • New OpenGL engine. Eduardo already prototyped an alpha version.
  • Curved edges in visual exploration. These are important because they help identify edge orientation.
  • Quick search in nodes and metadata. It turns out it should be pretty easy to implement.
  • New icons. Many resources are now available to do better and the technical part is trivial.
  • Cleaner data laboratory
  • Embed Java: no more hassle with installing the right Java version.
  • Install from MacStore. Easier for Mac users.
  • Fix filter composition.
  • Better statistics reports in HTML5.
  • Revamp appearance, label color & size, sliders… For instance incentivize rankings as opposed to default unitary mode.
  • Label anchor (start, middle, end)… and possibly some jitter.
  • Better label adjust (one that works better). Possibly with label jitter.

In conclusion

Gephi is not obsolete, and we have good hope of making its strengths more apparent by refocusing our development efforts towards version 1.0. As an additional outcome of our discussion, we now welcome Eduardo as our new lead developer, but more on that in a separate blog post. Thank you for your support and cheers from Berlin!

Eduardo Ramos Ibáñez, Mathieu Bastian, and Mathieu Jacomy

Gephi 0.9.2 : a new CSV importer

A new version of Gephi has been released! Thanks to Eduardo’s relentless issue fixing, Gephi’s overall stability has been improved. Eduardo is the author of the Data Laboratory, and on this occasion he revamped its CSV importer for a more flexible and straightforward user experience.

The new CSV/spreadsheet importer

Did you know that Gephi can export and import just the table of nodes or the table of edges? This feature is useful in many situations, for instance to produce charts in Excel or to clean data in Open Refine. Below we will showcase the new features and more generally explain how to import a spreadsheet as a list of nodes.

To import a spreadsheet you have to reach the Data Laboratory and click on Import Spreadsheet. In the example below a network is already loaded: we will decide later whether the imported nodes will be merged into the existing ones or not.

Tuto01

Gephi is now able to recognize the type of file you upload, and the support of Excel files has been added. Choosing the right separator is crucial since improperly separated columns would compromise the data. In the example below Gephi recognized that the separator is the Comma (as in a properly formatted CSV file).

Tuto02
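
To give an idea of what separator detection involves, here is a minimal sketch in Python. It is not Gephi's importer (which is written in Java and whose detection logic is not described here); it only uses the standard library's `csv.Sniffer`, and the function name `guess_separator` and the sample data are made up for illustration.

```python
import csv


def guess_separator(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the column separator of a delimited-text sample.

    Hypothetical helper: it illustrates the idea of separator
    detection, not the method Gephi actually uses.
    """
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# Made-up node tables, one per separator.
comma_sample = "Id,Label,Category\n1,Alice,A\n2,Bob,B\n3,Carol,A\n"
semicolon_sample = "Id;Label;Category\n1;Alice;A\n2;Bob;B\n3;Carol;A\n"

print(repr(guess_separator(comma_sample)))
print(repr(guess_separator(semicolon_sample)))
```

A guess like this can fail on short or irregular files, which is why being able to check and override the separator in the import dialog matters.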

The encoding of the file is a common issue, notably with languages using accents and special characters. Gephi can guess the encoding and you can manually edit it if necessary. In the example below Gephi correctly guessed the UTF-8 encoding.

Tuto04

Selecting a different encoding would produce errors. Fortunately the Preview table allows you to see them and fix the encoding. In the screenshot below, see how the wrong encoding produces exotic characters in the data.

Tuto03
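
The "exotic characters" are easy to reproduce outside Gephi. The sketch below, in Python and with made-up sample text, encodes a string as UTF-8 and then decodes the same bytes with the wrong encoding (Latin-1). It says nothing about how Gephi guesses encodings; it only shows why a wrong choice garbles accented characters.

```python
# Made-up sample containing accented characters.
text = "Ibáñez,médialab"
raw = text.encode("utf-8")       # bytes as they would sit in the file

right = raw.decode("utf-8")      # correct encoding: text comes back intact
wrong = raw.decode("latin-1")    # wrong encoding: each accent becomes two characters

print(right)
print(wrong)

# The damage is reversible as long as the garbled text has not been edited:
repaired = wrong.encode("latin-1").decode("utf-8")
print(repaired == text)
```

Note that Latin-1 maps every possible byte to a character, so the wrong decoding raises no error. Only a look at the data, as in the Preview table, reveals the problem.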

When you validate these settings, Gephi now opens the exact same panel as when you open a new network. I personally love this addition since it brings more consistency to the user experience. It allows Gephi to provide useful information such as the number of nodes detected or the issues found during the import process.

Tuto05bis

Do not miss an important feature here: in this panel you decide either to create a new workspace with the imported data or to merge the new nodes with the old ones. This very useful feature was already present when opening a new network, but many users are still unaware that it exists. Make sure to select the Append option if you intend to merge the nodes. In that case, when an imported node has the same Id as an already present one, the new node data will override the old one.

Tuto06
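
The merge behavior of the Append option can be sketched as follows. This is a plain Python illustration with made-up data, not Gephi code. It assumes that the override happens attribute by attribute, so that columns absent from the imported table keep their old values; the text above does not specify that level of detail, so check it against your own data.

```python
def append_nodes(existing, imported):
    """Merge imported nodes into existing ones, keyed by node Id.

    Assumption (not confirmed by the post): values are overridden
    attribute by attribute, and attributes missing from the import
    are left untouched.
    """
    # Copy so that the original table is not modified.
    merged = {node_id: dict(attrs) for node_id, attrs in existing.items()}
    for node_id, attrs in imported.items():
        merged.setdefault(node_id, {}).update(attrs)
    return merged


# Made-up tables: node "2" exists on both sides, node "3" is new.
existing = {
    "1": {"Label": "Alice", "Category": "A"},
    "2": {"Label": "Bob", "Category": "B"},
}
imported = {
    "2": {"Label": "Robert"},
    "3": {"Label": "Carol"},
}

merged = append_nodes(existing, imported)
print(merged)
```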

More info

Take a look at the full list of improvements here:
https://github.com/gephi/gephi/releases/tag/v0.9.2

How do I get this release?

  • If you have a recent Gephi, the update will be automatically proposed to you
  • If you have an older version (0.8 or before) you have to download and install manually
  • This update can be downloaded from http://gephi.org

Improving the Gephi User Experience

This is an effort to rethink the design of Gephi authored by Donato Ricci, co-founder of Density Design and senior designer at the Sciences Po médialab in Paris, and me, creator of Gephi and an engineer at the same lab (note that I am not Mathieu Bastian, our lead developer and actual powerhouse of the project for the past 10 years).

While this text presents possible improvements and practical solutions, it does not address practical considerations of available labor. Also, be aware that this is not a formal roadmap for future releases but rather a way to open the current state of our reflections for brainstorming. So feel free to share your ideas and comments with us.

There are five main categories that structure the improvements that we currently envision:

  • Design strategy: Ensure that a coherent design philosophy is applied across the entirety of the project
  • User interface: Identify and correct user-facing errors
  • Network-focus: Re-focus design and architecture around the network’s position of primary importance
  • Filling in the Gaps: Providing expected functionality
  • Miscellany: Other minor issues

In addition, we have drafted a UI mockup illustrating some of our propositions.

Design strategy

Gephi was built by engineers without a comprehensive design strategy. This situation is fairly common: engineers approach design in an ad hoc fashion, learning by trial and error and through casual user observation but without a formal user-testing protocol. Should the tool succeed, it is mostly because the utility triumphs over the pain of use. Gephi is an embodiment of this phenomenon in its current state. Some computer scientists may find it simple, partly because using terrible interfaces is a part of their job, but for many users Gephi is confusing. Geeks of a masochistic tendency may love the tool as a result of digital Stockholm Syndrome, but the bulk of users that could benefit from Gephi find it to be confusing and opaque. In our defense, developing desktop applications is heavily constrained and the Java technology was not helping us to overcome this difficulty. What could a designer do to alleviate this situation? Apply a strategy.

A designer does more than treating the symptoms of poor usability; he or she approaches user experience from its fundamentals. Improving Gephi requires rethinking some of its longest standing features from a new standpoint. A design strategy is the solid foundation upon which we build both a satisfying user experience and underlying software architecture.

Our design strategy fits in five basic points: obtaining substantial and organized user feedback, giving Gephi a clear workflow, implementing a facet-oriented interface layout, reordering panels from the user standpoint, and removing unnecessary features. Each point is explained in further detail along with practical guidelines for implementing potential solutions in the future.

User feedback

We cannot build a sustainable user interface without a quantitative measure of user activity. These data are necessary to support and validate design choices. One approach to obtain this information is to log and collect feedback about interface usage.

An optional logger could be implemented in Gephi to allow users to opt-in to the collection of logs in order to improve the software. Data harvesting can be done as a campaign: for example, we may ask some users to activate it for one month to evaluate the usage of a new interface paradigm that we are testing.
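
As a thought experiment, such an opt-in logger could look like the sketch below. It is written in Python for brevity (Gephi itself is Java), and every name in it (`UsageLogger`, `record`, `export`, the `campaign_end` parameter) is hypothetical: nothing of the sort exists in Gephi today.

```python
import json
import time


class UsageLogger:
    """Hypothetical opt-in usage logger; not part of Gephi."""

    def __init__(self, enabled=False, campaign_end=None):
        # Logging stays off unless the user explicitly opts in.
        self.enabled = enabled
        # Optional end of the campaign, as a Unix timestamp (assumed format).
        self.campaign_end = campaign_end
        self.events = []

    def record(self, action, now=None):
        """Store one interface action; return True if it was logged."""
        now = time.time() if now is None else now
        if not self.enabled:
            return False
        if self.campaign_end is not None and now > self.campaign_end:
            return False
        self.events.append({"time": now, "action": action})
        return True

    def export(self):
        """Serialize the collected events, e.g. to send at the end of a campaign."""
        return json.dumps(self.events)


logger = UsageLogger(enabled=True, campaign_end=time.time() + 30 * 24 * 3600)
logger.record("layout.run")
print(logger.export())
```

The important design choice is the default: with `enabled=False`, nothing is ever stored, and a campaign simply stops recording once its end date has passed.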

A clear workflow

Users need a clear and visible path to start with Gephi, in particular when opening a new file. We need to remove information to allow users to focus on what is important.

Gephi involves not only the software itself, but also the installer, website and documentation. Our ultimate goal is to make the entry process as simple as possible by coordinating these different elements. We begin by focusing here on just the software itself. We propose to consider that there are only two proper ways to enter Gephi:

  • Opening a file (constructing a network from a pre-generated file)
  • Connecting to a data source (embedded scraper or API connection)

We also need to clarify the roles of the “open” and “import” functions. We have to clarify that:

  • If the user has a file and needs to see it in Gephi, then “Open” is the right answer
  • If the user has an external data source he or she wishes to connect to, there is a distinct menu option for this function

Use case: we have observed that some users try to get into Gephi with a table of nodes and a table of links, and do not succeed in finding the right path. The problem is that nothing makes explicit that they have to create an empty document, go to the data table, and then import the tables. Since the pattern is “I have files and I want to see them in Gephi”, the answer should be under the “Open” menu item.

Facet-oriented interface layout

Rethinking overall design has the virtue of allowing for the reorganization of the interface from a user-centric perspective. The current interface relies on the panels system provided by the Netbeans Platform, which provides some beneficial properties for design. We were inspired by Ben Shneiderman’s motto of “overview first, zoom and filter, then details-on-demand” and it has been quite successful. However the different views are not articulated in a coherent way and the features sometimes struggle to find the right place in terms of visibility.

We propose three simple guidelines for a better organization of the panels:

  • The global hierarchy of containers should reflect the generality of the features
  • Some panels are not facet-dependent: they should not change with the facet
  • The network should occupy a single place whatever its facet, since it is always the same object

Panels guidelines

These guidelines have two consequences. First: facet-dependent panels should be contained inside facet-independent panels; which is to say that there is a single container for all facet-dependent features. Second: the facet selector (currently the three tabs on top) has to be inside this container.

We illustrate this with a comparison between a representation of the current layout and a new simplified structure.

Web

Reordering panels from the user standpoint

A part of our design strategy is to reduce visual clutter by grouping panels that are not used simultaneously. Though it is not intuitive, we prioritize separating panels that work well together over grouping similar features. For instance the following panels should be placed in different groups so that they can be used in combination:

  • Filters + Layout
  • Filters + Partition or Ranking
  • Timeline + almost anything else

With the current panels there are at least three obvious groups: one with filters, another with layout, and the third with the timeline. Generic contextual information is a fourth possible group, but could be placed in a non-intrusive location like the footer. Some panels, like statistics, could theoretically be at home in any group or even as a separate window that could be invoked from a menu.

Collapsing panels concept

Panels and groups of panels create two levels, so the “window” menu should have two levels too.

Removing unnecessary features

Reducing complexity can also be accomplished by removing features. We have detected at least one clear candidate for removal, but we may find more unnecessary complications to remove.

The “preview” panel of Gephi has been increasingly simplified over iterative updates. The goal of this feature is to provide a quick way to export cartographies. Users with competence in design tend to rely on third-party tools that provide finer-grained control over the visualization, like Illustrator and Scriptographer. Thus, the focus in Gephi is to provide a quick way to export images that can be manipulated in other tools.

We propose to further streamline preview functionality by removing some advanced label features: they are infrequently used, complicated, and at times internally inconsistent with other preview settings. Furthermore, it is not necessary to facilitate changes to features like label and node size and color when such adjustments can be made much more easily using other tools.

User Interface

Donato Ricci has identified various flaws in the Gephi UI. Fixing them is a priority for the future.

User-centric features: reordering workflow

Users think in terms of the results they want to obtain. They have an action in mind and they look for how to do it. By displaying features according to their result, we can both improve user orientation and reduce the tool’s learning curve. A few examples follow.

We propose to aggregate Partition and Ranking under the more accessible term “Appearance”, and to reverse the order of what is asked of the user. The current interface is organized in the following way: if the user has metadata that can rank the nodes, then the user can visualize it using different attributes like color and size. The new interface inverts this approach: if the user wants to color nodes, then he or she chooses which metadata to use. The panel may progress like a wizard to reduce cognitive load by drawing attention only to information that is necessary for a given step.

Unified panel appearance

Collapsing advanced layout settings

The current design of Gephi does not respect the general principle of drawing attention to information that is commonly used while obscuring information that is infrequently used.

Tools of the Overview: many problems to fix

The small tool buttons on the left side of the overview panel have a number of problems including:

  • Confusing icons that do not easily communicate the use of tools
  • Indistinct icons that do not sufficiently distinguish between different tools
  • Missing tools that are commonly expected

These issues are compounded by evidence that the tools themselves do not provide sufficient utility for common use.

We propose to alleviate some of these shortcomings by putting most of these tools in a collapsed panel and dedicating a normal panel to the settings of each tool. We also propose to implement a default tool cursor that draws on common mouse usage paradigms to provide intuitive functionality to users:

  • A click-drag starting on the background (or an edge) of the view makes a rectangle selection
  • A click-drag starting on a node moves this node
  • A click on a node selects the node, the shift key is used for multiple selection
  • A click on the background deselects
  • A set of meta-keys changes the click function, for instance the spacebar to switch to the view-panning tool (i.e. the hand in Photoshop)
  • The secondary-click works the same
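
The behavior listed above can be summarized as a small dispatch table. The following Python sketch is only an illustration: the `PointerEvent` type, its fields and the action names are hypothetical and do not come from Gephi's code base. A click on an edge is not covered by the list, so the sketch treats it as doing nothing.

```python
from dataclasses import dataclass


@dataclass
class PointerEvent:
    """Hypothetical description of a mouse gesture on the overview."""
    kind: str            # "click" or "drag"
    target: str          # "node", "edge" or "background"
    shift: bool = False  # shift held: multiple selection
    space: bool = False  # spacebar held: temporary view-panning tool


def default_tool_action(event: PointerEvent) -> str:
    """Map a gesture to the action of the proposed default tool."""
    if event.space:
        # Meta-key: the spacebar switches to the panning "hand".
        return "pan_view"
    if event.kind == "drag":
        if event.target == "node":
            return "move_node"
        # Drags starting on the background or on an edge draw a rectangle.
        return "rectangle_select"
    if event.kind == "click":
        if event.target == "node":
            return "add_to_selection" if event.shift else "select_node"
        if event.target == "background":
            return "deselect_all"
    # Anything not covered by the proposal (e.g. a click on an edge).
    return "no_action"
```

For instance, `default_tool_action(PointerEvent("drag", "edge"))` yields a rectangle selection, while the same drag starting on a node moves that node.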

Fixing highlight colors

Highlighting works by tweaking colors so that some nodes get more contrast than others. The contrast should depend on the background color, but this is not how it is currently implemented. At the very least, the following should be done:

  • White background: highlighted nodes darker and other nodes lighter
  • Black background: highlighted nodes lighter and other nodes darker
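
As an illustration of this rule, here is a minimal Python sketch. The function names, the 0.5 brightness threshold and the blending amount are assumptions made for the example, not Gephi settings; the brightness estimate is a rough weighted sum of the RGB channels.

```python
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def brightness(rgb):
    """Rough perceived brightness in [0, 1] (weighted sum of RGB channels)."""
    r, g, b = rgb
    return (0.2126 * r + 0.7152 * g + 0.0722 * b) / 255


def blend(rgb, target, amount):
    """Move a color toward a target color by the given amount (0 to 1)."""
    return tuple(round(c + (t - c) * amount) for c, t in zip(rgb, target))


def highlight_color(node_rgb, background_rgb, highlighted, amount=0.5):
    """Adjust a node color so highlighted nodes contrast with the background."""
    light_background = brightness(background_rgb) > 0.5
    # Light background: highlighted nodes get darker, the others lighter.
    # Dark background: highlighted nodes get lighter, the others darker.
    toward_black = (highlighted == light_background)
    return blend(node_rgb, BLACK if toward_black else WHITE, amount)
```

With a white background, `highlight_color(color, WHITE, True)` returns a darker version of `color`, and the same call with a black background returns a lighter one.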

Network-focus

As a network analysis tool, the network itself plays an obvious central role in Gephi. We have explored different ways to incorporate the network into the software’s presentation and have developed some suggestions for modifications that would increase interface coherency.

A different layout for the panels: network as background

In this approach, the network is contained by a “background sheet” and floating panels support functions. Such a philosophy has been successfully implemented in other systems like Photoshop or Google Maps. Using the root panel for the network, like grouping facet-dependent panels, fosters the feeling that we always deal with the same object, the network. This operates on the metaphor that we are always manipulating a primary canvas that consists of the network to be analyzed.

Network as background

Statistics as an invoked panel or window

Statistics tend to be used on demand, and thus do not need to be displayed permanently. Rather, a discreet menu or button could invoke the statistics panel when needed. Removing this visual information leaves more room to focus on what is important, i.e. the network.

Workspaces: more visible, on top

The workspaces need more attention. We propose to show them as tabs on the top of Gephi. It is more natural to have the workspaces above the facet selector in the hierarchy of panels. This is consistent with the prevalence of the “tab” paradigm in the browser space.

Workspaces-on-top

Filling in the Gaps: Providing Expected Functionality

In addition to the different aspects listed above, users need some well-known common features such as an “undo” function, even if they are complex to implement.

History and undo: feasible if limited to network structure

A visible trace of previous steps, like a proverbial breadcrumb trail, provides users with a sense of orientation and confidence when exploring and manipulating data in a speculative fashion. It also accelerates the learning process by reducing cognitive load, since users do not have to remember a series of unfamiliar steps. This works in tandem with an “undo” feature, which facilitates experimentation without fear of permanently corrupting data.

History and Undo are complex to implement and burden the development of plugins and modules, as these functions tend to be deeply embedded in a piece of software’s architecture. This partly explains why they are not currently available in Gephi. However, a prudent approach in Gephi would be to limit the recording and reversal of changes to the structure of the network: nodes and edges, their attributes (including color, size and coordinates), but not the state of panels such as filters, statistics…

An initial approach would be to cover only a minimal set of modifications of the network structure. The history would then contain information about the type of the modification, but not its exact content nor the way it was done (manually, filter, data table…). For instance:

  • Modifying attribute X for node N / n nodes / all nodes
  • Modifying color / size / position for node N / n nodes / all nodes
  • Adding / removing: node N / n nodes / all nodes
  • Adding / removing: node attribute X / n attributes / all attributes
  • …and the same for links

The history would not include operations such as exporting files, taking screenshots, modification of views, changes to settings, or other changes that do not directly affect the structure of the network.
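
To make the idea more concrete, here is a minimal Python sketch of such a history. Everything in it is hypothetical (the class names, the `set_node_color` helper, the use of a plain dictionary for node colors); it only shows that each entry can carry a short description of the type of modification plus a way to reverse it, without recording how the change was triggered.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HistoryEntry:
    description: str           # type of modification, shown to the user
    undo: Callable[[], None]   # restores the previous structural state


@dataclass
class GraphHistory:
    entries: List[HistoryEntry] = field(default_factory=list)

    def record(self, description, undo):
        self.entries.append(HistoryEntry(description, undo))

    def trail(self):
        """The visible breadcrumb trail of previous steps."""
        return [entry.description for entry in self.entries]

    def undo_last(self):
        """Reverse the most recent change; return its description or None."""
        if not self.entries:
            return None
        entry = self.entries.pop()
        entry.undo()
        return entry.description


def set_node_color(colors, history, node, new_color):
    """Change a node color and record how to reverse the change."""
    had_color = node in colors
    old_color = colors.get(node)

    def restore():
        if had_color:
            colors[node] = old_color
        else:
            colors.pop(node, None)

    colors[node] = new_color
    history.record(f"Modifying color for node {node}", restore)
```

A plugin that changes the network would only need to call `record` with a description and a reversal function, which keeps the burden on module developers small.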

Protecting irreversible operations

Some operations are irreversible: removing nodes, edges and attributes (and possibly more). Because these operations are definitive and may cause the loss of a certain quantity of work, they should be protected. A classical solution is to ask for confirmation before any definitive operation. This is a simple guideline, but the result is quite user-hostile. We propose a better solution, as implemented in Photoshop: once an irreversible operation has been performed, trying to save the network opens the “Save as…” window instead, which proposes a different name (with a suffix number or “Copy of X”).
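
A minimal Python sketch of this save logic, under the assumption that the document simply remembers whether an irreversible operation happened since the last save. The `NetworkDocument` class and its method names are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class NetworkDocument:
    filename: str
    has_irreversible_change: bool = False

    def mark_irreversible(self):
        """Call after removing nodes, edges or attributes."""
        self.has_irreversible_change = True

    def save_request(self):
        """Return (dialog, proposed filename) for a plain "Save" command."""
        if self.has_irreversible_change:
            # Protect the original file: propose saving under another name.
            return ("save_as", f"Copy of {self.filename}")
        return ("save", self.filename)

    def confirm_save(self, filename):
        """The user accepted a name: the file on disk is now up to date."""
        self.filename = filename
        self.has_irreversible_change = False
```

The user is never interrupted at the moment of the deletion; the protection only appears when the original file is about to be overwritten.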

Miscellany

A few additional points deserve to be listed, in no particular order.

Manual versioning

A basic versioning feature would be appreciated: simply the option to save a copy while adding or incrementing a number suffix:

  • “My Network.gexf” is saved as “My Network 01.gexf”
  • “My Network 11.gexf” is saved as “My Network 12.gexf”

A common shortcut for this is Ctrl+Alt+S.
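
The naming rule can be sketched in a few lines of Python. The function name is hypothetical, and the choice to keep the width of the existing number (two digits by default) is an assumption based on the two examples above.

```python
import re


def next_version_name(filename):
    """Add a " 01" suffix, or increment an existing number suffix."""
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        # No extension at all: the whole name is the stem.
        stem, extension = filename, ""
    match = re.fullmatch(r"(.*) (\d+)", stem)
    if match:
        base, number = match.groups()
        # Keep the zero-padding width of the existing suffix.
        stem = f"{base} {int(number) + 1:0{len(number)}d}"
    else:
        stem = f"{stem} 01"
    return f"{stem}{dot}{extension}"
```

Calling `next_version_name("My Network.gexf")` gives `"My Network 01.gexf"`, and `next_version_name("My Network 11.gexf")` gives `"My Network 12.gexf"`.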

Generalizing zoom options (more internal consistency)

We can currently set how the zoom impacts text labels. The same feature would be useful for edges, for instance to keep 1 pixel lines whatever the zoom, as well as for nodes, for instance to keep small points whatever the zoom.
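
One simple way to think about this is a per-element zoom mode. The Python sketch below is hypothetical: the mode names "scaled" and "fixed" and the function itself are assumptions for the example, not existing Gephi options.

```python
def on_screen_size(base_size, zoom, mode="scaled"):
    """Size in pixels of a label, edge or node at a given zoom level."""
    if mode == "scaled":
        # The element grows and shrinks with the view.
        return base_size * zoom
    if mode == "fixed":
        # The element keeps the same size whatever the zoom.
        return base_size
    raise ValueError(f"unknown zoom mode: {mode!r}")


# Example settings: 1-pixel edges whatever the zoom, nodes that scale.
zoom_modes = {"labels": "fixed", "edges": "fixed", "nodes": "scaled"}
```

With these settings, an edge of thickness 1 stays 1 pixel wide at zoom 4, while a node of size 2 is drawn with size 8.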

Generalized zoom options

Size nodes according to area

Our eyes perceive areas. Applying a ranking to the diameter of nodes is less intuitive than applying it to the area of nodes. We propose to offer an option to customize ranking by either diameter or area, but set the default to area.
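
The difference between the two options can be shown with a short Python sketch. The function and its parameters are hypothetical; it assumes round nodes, whose area is proportional to the square of the diameter, and a linear interpolation between a minimum and a maximum size.

```python
import math


def node_diameter(value, value_min, value_max, d_min, d_max, by_area=True):
    """Diameter of a node for a ranked value, interpolating diameter or area."""
    if value_max > value_min:
        t = (value - value_min) / (value_max - value_min)
    else:
        t = 0.0
    if by_area:
        # Area is proportional to diameter squared: interpolate the squares,
        # then convert back to a diameter.
        return math.sqrt(d_min ** 2 + t * (d_max ** 2 - d_min ** 2))
    return d_min + t * (d_max - d_min)
```

For a value halfway through the range and diameters between 10 and 30, ranking by diameter gives 20, whereas ranking by area gives about 22.4: the node's area, not its diameter, sits halfway between the smallest and the largest node.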

Removing unnecessary settings about labels

Node labels and edge labels should help the user identify nodes. However, using the color or size of labels to visualize attributes is confusing. Gephi presently contains settings to manipulate labels in this way; these settings should be removed and replaced with a simpler interface.

UI Mockup sample (work in progress)

We present here a possible approach to integrate some of the different suggestions made in this document. Consider the following image as a way to help imagine the future of the Gephi user experience.

MockupA-01

As we stated earlier, the purpose of this document is to open up the floor to brainstorming ideas about improving the Gephi UX. Please share your ideas in the comments!

PS: Thanks to Niranjan Sivakumar for his excellent proof-reading 🙂

Gephi is asleep and up to awaken

gephi round graph

Gephi has been almost inactive for quite a long time: we did not release, we did not fix issues, we did not post on the blog. This lack of recent updates creates a growing number of difficulties, including installing Gephi on a recent Mac computer. A lot of users ask if the project is still alive. We understand why you wonder, and we decided to write this post to explain where we’re at and give the community a preview of what’s next in Gephi’s lifecycle.

In short, Gephi is still alive yet asleep, but its reawakening is in sight. However, a series of issues prevents us from doing better right now.

The ambitious yet incomplete 0.9 release

The next planned release is Gephi 0.9 and promises to be a major release with a complete rewrite of Gephi’s core module. Performance, and especially memory usage for large graphs, has been a lingering issue since the first version of the software. As explained in this article written by Mathieu Bastian – Gephi’s lead developer – the solution resides in a more efficient graph structure implementation that we named “GraphStore”. This technology brings many new features and significantly reduces memory usage, but it is a large development effort and requires all modules to be adapted. Indeed, the module which stores and manages the graph is used by pretty much every other module (e.g. layout, filters, preview, etc.).

This work on the core graph module was initiated as part of a larger vision focused on a 1.0 release, which aims to address a much larger set of problems, missing measures and bugs. As you may know, the current version has a varied set of problems. Some issues prevent normal use of the software, like the difficulty of installing it on recent versions of Mac OS X (Yosemite and Mavericks). Others are incomplete or missing features, such as various user interface design issues or the improper management of categories’ colors. Finally, some internal problems are hidden in the code but nevertheless real. For example, the technology used to code the user interface (Swing) has been replaced by a more modern technology named JavaFX. For the most part, these problems require a deep rework of the code. The good news is that the most difficult part of this 1.0 vision is probably the rewriting of Gephi’s core graph module, which is what the 0.9 version already focuses on.

The current 0.9 developments have reached around 80% completion and many modules, but not all, have already been adapted to GraphStore. A stable version can’t be released until this reaches 100% and all the modules are converted to the new core implementation. Other important issues such as installation issues on recent Macs have already been addressed in this development version. Finally, a series of bugs will be fixed along with minor features and improvements. Finishing the last modules and releasing the 0.9 version is our current priority.

Limited resources

Like many other open source projects, Gephi’s development is for the most part unpaid and remains an activity on the side for all contributors. The notable exception is the Google Summer of Code, which sponsored students multiple years in a row to work on the project. Therefore, the project’s progress depends on the contributors’ professional and personal situations. Although individuals are ready and willing, time is limited and there was just not enough of it lately to make significant progress. Mathieu Bastian is Gephi’s architect and has been behind the software’s key iterations since 2007. This time again he holds the keys to its future and has been involved in the GraphStore project. This complex project requires all of his knowledge of Gephi’s code and is hardly a task someone else could do at this point. Therefore, a part of our development depends on his free time, and we accept it. This situation is temporary though. Indeed, Mathieu will eventually obtain more time to conclude the work on the 0.9 release and Gephi’s development will be less dependent on him in the future.

In addition, we are working on stabilizing some resources in the long run, but our strategy requires a readjustment. Gephi needs time and energy from good Java developers, clear-minded designers, and seasoned software architects. We have to entice skilled people, support their involvement and get the best from their contributions. We aim to improve the management of our limited workforce to make the development more attractive and dynamic. This evolution is organized by our team but benefits from external support. For instance, the Sciences Po médialab, the institution I belong to, provides resources for organizing the project, rethinking the user interface and some coding. These changes may not be immediately visible but we’re committed for the medium and long term.

What is next

Releasing the Gephi 0.9 version is the immediate next step. This version will include compatibility fixes and the whole new core based on GraphStore. Then, an important project to rework the overall user experience will be kicked off. It requires a technology switch (from Swing to JavaFX) and the overhaul of a majority of the modules, but aims to make Gephi simpler and more intuitive to use. We already have a good diagnosis of the user experience issues in Gephi but need to explore different designs. In an upcoming blog post, I will explain our thought process on this topic with the help of Professore Donato Ricci, senior interaction designer. Eventually, the 1.0 version will be worked on and released.

Gephi is almost 10 years old. It is usable but still plagued with many well-known issues. Though sometimes frustrating, it allows users to do incredible things. We think Gephi is still relevant to research, journalism, civil society and more. We are going to give it the renewal it deserves.

Mathieu Jacomy

ForceAtlas2, the new version of our home-brew Layout

The new version of the built-in layout ForceAtlas is now released. It is scaled for small to medium-size graphs, and is adapted to qualitative interpretation of graphs. The equations are the same as ForceAtlas 1, but there are more options and innovative optimizations that make it a very fast layout algorithm.

It is good enough to deal with very small graphs (10 nodes) and fast enough to spatialize 10,000-node graphs in a few minutes, with the same quality. If you have time, it can deal with even bigger graphs.

Update Gephi (Help > Check for Updates) to get this new layout.

Force Atlas 2:

  • Is a continuous algorithm, that allows you to manipulate the graph while it is rendering (a classic force-vector, like Fruchterman Rheingold, and unlike OpenOrd)
  • Has a linear-linear model (attraction proportional to the distance between nodes, repulsion inversely proportional to it). The shape of the graph is between Fruchterman & Rheingold’s layout and Noack’s LinLog.
  • Features a unique adaptive convergence speed that allows most graphs to converge more efficiently
  • Proposes summarized settings, focused on what impacts the shape of the graph (scaling, gravity…). The default speed should be suitable in most cases.
  • Now features a Barnes Hut optimization (performance drops less with big graphs)

 

 

Force Atlas 2 features these settings:

  • Scaling: How much repulsion you want. More makes a more sparse graph.
  • Gravity: Attracts nodes to the center. Prevents islands from drifting away.
  • Dissuade Hubs: Distributes attraction along outbound edges. Hubs attract less and thus are pushed to the borders.
  • LinLog mode: Switch ForceAtlas’ model from lin-lin to lin-log (tribute to Andreas Noack). Makes clusters more tight.
  • Prevent Overlap: Use only when spatialized. Should not be used with “Approximate Repulsion”
  • Tolerance (speed): How much swinging you allow. Above 1 discouraged. Lower gives less speed and more precision.
  • Approximate Repulsion: Barnes Hut optimization, which reduces complexity from n² to n·ln(n) and allows larger graphs.
  • Approximation: Theta of the Barnes Hut optimization.
  • Edge Weight Influence: How much influence you give to edge weights. 0 is “no influence” and 1 is “normal”.
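
To give an idea of how some of these settings interact, here is a simplified Python sketch of the forces computed during one iteration. It is not Gephi's implementation: it assumes a force model in which repulsion is proportional to (degree + 1) of both nodes and divided by their distance, attraction equals the distance (or its logarithm in LinLog mode), and gravity pulls each node toward the origin. The function name and its parameters are made up for the example, and the adaptive speed, Prevent Overlap and the Barnes Hut approximation are left out (repulsion is computed naively for every pair of nodes).

```python
import math


def forceatlas2_forces(positions, edges, scaling=2.0, gravity=1.0,
                       linlog=False, dissuade_hubs=False,
                       edge_weight_influence=1.0):
    """Net force on each node; positions maps node -> (x, y),
    edges is a list of (source, target, weight)."""
    nodes = list(positions)
    degree = {n: 0 for n in nodes}
    for source, target, _weight in edges:
        degree[source] += 1
        degree[target] += 1
    forces = {n: [0.0, 0.0] for n in nodes}

    # Scaling: repulsion between every pair of nodes (n² comparisons).
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx = positions[a][0] - positions[b][0]
            dy = positions[a][1] - positions[b][1]
            dist = math.hypot(dx, dy)
            if dist == 0:
                continue
            f = scaling * (degree[a] + 1) * (degree[b] + 1) / dist
            forces[a][0] += f * dx / dist
            forces[a][1] += f * dy / dist
            forces[b][0] -= f * dx / dist
            forces[b][1] -= f * dy / dist

    # Attraction along edges, modulated by LinLog, weights and hubs.
    for source, target, weight in edges:
        dx = positions[target][0] - positions[source][0]
        dy = positions[target][1] - positions[source][1]
        dist = math.hypot(dx, dy)
        if dist == 0:
            continue
        f = math.log(1 + dist) if linlog else dist
        f *= weight ** edge_weight_influence
        if dissuade_hubs:
            # Spread the attraction over the edges of the source node.
            f /= degree[source] + 1
        forces[source][0] += f * dx / dist
        forces[source][1] += f * dy / dist
        forces[target][0] -= f * dx / dist
        forces[target][1] -= f * dy / dist

    # Gravity: pull every node toward the center (the origin here).
    for n in nodes:
        x, y = positions[n]
        dist = math.hypot(x, y)
        if dist == 0:
            continue
        f = gravity * (degree[n] + 1)
        forces[n][0] -= f * x / dist
        forces[n][1] -= f * y / dist
    return forces
```

In this sketch, raising `scaling` strengthens repulsion and spreads the graph out, while `linlog=True` weakens attraction over long distances relative to repulsion, which is consistent with the descriptions of Scaling and LinLog mode above.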

 

 

Force Atlas 2 was created by Mathieu Jacomy at the Sciences Po Médialab (Paris), founding member of the Gephi Consortium.