Authors: Mathieu Jacomy, Alexis Jacomy, Paul Girard, and Benoît Simard
It is a spin-off
For a long time, one could think of Gephi as both a piece of software and a project, where the purpose of the project was to develop and maintain the tool. But from the start, our project was about more than Gephi’s code: websites, tutorials, forums, plugins, events, social media, fundraising… We are now taking this further with a decision that is unprecedented for us: the Gephi project will host two different tools.
Gephi Lite will be a web version of Gephi, with the same basic features, but reduced to a minimalistic package. It will remain similar to Gephi and be compatible with it, but will have fewer options. It will not be able to open big networks, but it will be simpler and more ergonomic.
It will not dilute our efforts to maintain and improve Gephi. If you think of the energy we can afford to dedicate to Gephi as a cake, you may assume that we now have to split the cake into two shares, one for Gephi and one for Gephi Lite. In fact, we will have a bigger cake, because the development is handled by a different team: Alexis Jacomy, Paul Girard and Benoît Simard from OuestWare.
Bootstrapping Gephi Lite during the Gephi Week
The Gephi community has known OuestWare for a long time thanks to their work on network visualization on the web, as it has known Guillaume Plique from the Sciences Po médialab. Let us call them the JS community in this context. Over the years, these developers have contributed to a long series of libraries, prototypes, and tools adjacent to Gephi. Here is a quick list of their collective contributions to network visualization:
- Libraries: Graphology, to handle graphs; and SigmaJS, to render them on the web.
- Prototypes: ManyLines, to share networks as explorable slides; MiniVan, to share networks online as browsable documents; and a top-secret project to make up networks quickly.
- Tools: Retina, an online network visualizer with filtering and search; and Gephisto, a one-click network visualizer for teaching.
We have also given a joint talk about Gephi and JS at FOSDEM. We mention all of this to highlight that there was fertile ground for something new. The skills were there, but everyone had pushed their own projects and explored different directions. Could we coordinate our efforts in the future?
Long story short: we met at the Gephi Week, it happened, and the outcome is Gephi Lite. Here is an account of the project after the Gephi Week and a follow-up sprint at OuestWare.
What makes Gephi Gephi?
We explored that question because we had to answer this: what should Gephi Lite be like? Indeed, what makes Gephi Lite different from the tools above is that it tries to stick to the Gephi recipe. But what is that recipe?
We distilled Gephi to this feature set:
- Load data
- Render layouts
- Compute metrics
- Apply filters
- Select data
- Set the semiotics (appearance panel)
- Save data
- Export as images
- Export on the web
- Manual intervention (create and edit data, e.g. attributes)
- Plugins (Note: we will leave this aside for Gephi Lite for now)
To which we add:
- Gephi Lite must interoperate with Gephi
- Reuse the Gephi look and feel when possible (consistency)
We used this list as a starting point to decide Gephi Lite’s scope.
About the name: what does “Lite” mean?
Lite means that Gephi Lite will always be at a higher level than Gephi desktop. Further from the metal, as we say. More blackboxed, with more layers of software. More usable, but at the core, less efficient. This is why it will have lower scaling capacities in terms of size of graph.
Lite also means that we aim at less complex usages than Gephi. This principle should be taken as a general guideline and not a strict rule. Indeed, Gephi Lite also differs in that it is on the web. This is a drawback at times, but it also brings opportunities to do things differently and add new features. So Lite does not mean that Gephi Lite is Gephi with missing pieces. It has its own feature set.
We considered the name “Gephi Web” but we decided that making it clear that it would not be as scalable as Gephi would help manage people’s expectations. The discussion is not entirely closed, though.
Here is where Gephi Lite will differ from Gephi. The semiotic work on the visualization (node color and size, edge thickness…) will always be tied to the data. That is why we call it “data-driven”.
In Gephi, you can do whatever you want. You can manually paint a bunch of nodes in red. You can paint some nodes with a gradient of colors representing their degree, and other nodes with a color representing a cluster they belong to. You can play with this feature. You can do art. You can do things so weird you could not even explain them. It’s flexible and powerful, but it comes with complications. Notably, you cannot always have a caption, because Gephi cannot keep track of what the colors or sizes mean.
Gephi Lite has a limited set of features, and this sometimes creates opportunities. We decided to allow no manual coloring of the nodes and edges, and to use a rule-based mapping of colors and sizes. For example, you can only apply colors according to an attribute or a simple combination of attributes. Think of the rule as something like “Democrat blogs in blue and Republicans in red”, or something a bit more complicated but not too much. In Gephi Lite, the appearance of nodes and edges will be fully determined by such rules. As a benefit, it can keep track of what the colors mean, apply them dynamically, and build a caption. For most users, it will be simpler.
For the record, to make this system work, we settled on this set of features:
- No manual coloring.
- Add qualitative/quantitative tags on node/edge attributes to help users make meaningful semiotic choices.
- Node/edge appearance is dynamic: we watch attribute changes.
- Appearance is determined by rules that are always applied to the whole network, not only to its filtered version. In other words, the modalities taken into account to set colors include those of hidden nodes. The result does not depend on how the network is currently filtered.
- Missing values are systematically handled.
- Gephi Lite will be able to draw a caption.
- Gephi Lite will use the new GEXF v1.3 spec and contribute to its specification by prototyping it.
- Original GEXF viz attributes will be stored as special data attributes prefixed with “gexf_viz” when no caption is present (GEXF <= 1.2), so that they can be reused at export and serve as defaults in the default appearance state.
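To make the rule-based approach more concrete, here is a minimal TypeScript sketch of a color rule in the spirit of “Democrat blogs in blue and Republicans in red”. It is an illustration under assumptions, not Gephi Lite's actual code: the types `NodeData`, `ColorRule` and `CaptionEntry` and the three functions are hypothetical names introduced for this example. The color mapping is built from all nodes, hidden or not, missing values get a dedicated color, and the caption is derived from the same mapping.

```typescript
// Hypothetical data model for illustration only.
type NodeData = {
  id: string;
  attributes: { [key: string]: string | undefined };
  hidden: boolean; // true when the node is currently filtered out
};

type ColorRule = {
  attribute: string;    // the attribute that drives the color
  palette: string[];    // colors given to values in order of first appearance
  missingColor: string; // color for nodes that have no value
};

type CaptionEntry = { label: string; color: string };

// Build the value -> color mapping from ALL nodes, hidden or not,
// so that colors stay stable when the filtering changes.
function buildColorMap(nodes: NodeData[], rule: ColorRule): Map<string, string> {
  const colors = new Map<string, string>();
  nodes.forEach((node) => {
    const value = node.attributes[rule.attribute];
    if (value !== undefined && !colors.has(value)) {
      // Values beyond the end of the palette fall back to the missing color.
      colors.set(value, rule.palette[colors.size] ?? rule.missingColor);
    }
  });
  return colors;
}

// The color of a node is fully determined by the rule.
function colorOf(node: NodeData, rule: ColorRule, colors: Map<string, string>): string {
  const value = node.attributes[rule.attribute];
  if (value === undefined) return rule.missingColor;
  return colors.get(value) ?? rule.missingColor;
}

// Because the rule knows what each color means, a caption can be generated.
function buildCaption(rule: ColorRule, colors: Map<string, string>): CaptionEntry[] {
  const caption: CaptionEntry[] = [];
  colors.forEach((color, value) => {
    caption.push({ label: `${rule.attribute} = ${value}`, color });
  });
  return caption;
}

const blogs: NodeData[] = [
  { id: "a", attributes: { party: "Democrat" }, hidden: false },
  { id: "b", attributes: { party: "Republican" }, hidden: true },
  { id: "c", attributes: {}, hidden: false },
];
const partyRule: ColorRule = { attribute: "party", palette: ["blue", "red"], missingColor: "grey" };
const partyColors = buildColorMap(blogs, partyRule);
const partyCaption = buildCaption(partyRule, partyColors);
```

In this sketch, node "b" is hidden but still contributes its value to the mapping, so “Republican” keeps the color red whether or not the filter hides it.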
Like in Gephi, a dedicated space in the user interface will allow setting the semiotics of the map. In Gephi, it is the appearance panel. A similar space will exist in Gephi Lite (whether that’s a panel or something else).
As we have just seen, the semiotics are rule-based and dynamic. In addition to this, we decided the following:
- Appearance gathers all visual variables to draw nodes, edges and their labels (size, color).
- Node and edge label sizing will be dealt with in the appearance block.
- Considered feature: applying different rules to different parts of the graph (see “buckets” later).
- Appropriately handle missing values, anomalous values (ex: a string among numbers), unexpected values (ex: negative weight), and errors.
- Always ask the user how to render undefined values. Undefined values could be the cases above or valid values that have not been set for different reasons. Those can typically be dealt with using a default color and size.
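The handling of undefined values can be sketched as follows for a quantitative size rule. This is a minimal example under assumptions: `RawValue`, `SizeRule`, `toValidNumber` and `computeSizes` are hypothetical names, and treating every negative number as unexpected is a simplification chosen for the example (it suits a weight, not every attribute).

```typescript
// Hypothetical types for illustration only.
type RawValue = string | number | null | undefined;

type SizeRule = {
  minSize: number;     // size given to the smallest valid value
  maxSize: number;     // size given to the largest valid value
  defaultSize: number; // size chosen by the user for undefined values
};

// Return a usable number, or undefined for missing, anomalous or unexpected values.
function toValidNumber(raw: RawValue): number | undefined {
  if (typeof raw !== "number") return undefined; // missing value, or a string among numbers
  if (!isFinite(raw)) return undefined;          // NaN or Infinity
  if (raw < 0) return undefined;                 // unexpected value, e.g. a negative weight
  return raw;
}

// Map each value to a size by linear interpolation between minSize and maxSize.
// Undefined values receive the default size instead of breaking the scale.
function computeSizes(values: RawValue[], rule: SizeRule): number[] {
  const valid: number[] = [];
  values.forEach((raw) => {
    const n = toValidNumber(raw);
    if (n !== undefined) valid.push(n);
  });
  const lo = Math.min(...valid);
  const hi = Math.max(...valid);
  return values.map((raw) => {
    const n = toValidNumber(raw);
    if (n === undefined) return rule.defaultSize;
    if (hi === lo) return (rule.minSize + rule.maxSize) / 2; // all valid values are equal
    return rule.minSize + ((n - lo) / (hi - lo)) * (rule.maxSize - rule.minSize);
  });
}

const weights: RawValue[] = [1, 5, "n/a", -3, undefined, 3];
const weightSizes = computeSizes(weights, { minSize: 2, maxSize: 10, defaultSize: 1 });
// weightSizes is [2, 10, 1, 1, 1, 6]
```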
Something we have been discussing but have not solved yet: where is the caption? We could generate a caption on-demand, but since the appearance is fully dynamic, we could as well have a caption accessible at all times. Is the caption part of the appearance panel? If not, is it redundant? We will be iterating on this question.
The filters UI in Gephi is both too complex for the scope of Gephi Lite and inconsistent with web UX design. We have an opportunity to do better, albeit simpler. We chose a new abstraction that is less flexible but much simpler to manipulate:
- Filters are a stack (and not a tree like in Gephi)
- Each filter is applied on the graph resulting from the previous filter (they cascade)
- The filters can be on nodes or edges
- A filter can be based on an attribute, a custom script (written or pasted by the user), or the topology (ex: the main component filter).
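The stack abstraction can be sketched in a few lines of TypeScript. This is an illustration under assumptions, not the actual Gephi Lite design: `SimpleGraph`, `GraphFilter`, `filterNodes`, `filterEdges` and `applyStack` are hypothetical names. Each filter is a function from graph to graph, and the stack is applied in order so that filters cascade.

```typescript
// Hypothetical graph model for illustration only.
type GNode = { id: string; attributes: { [key: string]: string | number } };
type GEdge = { source: string; target: string; weight: number };
type SimpleGraph = { nodes: GNode[]; edges: GEdge[] };

// A filter takes a graph and returns a (usually smaller) graph.
type GraphFilter = (graph: SimpleGraph) => SimpleGraph;

// Node filter: keep the nodes that pass the test,
// and drop the edges that lost one of their endpoints.
function filterNodes(keep: (node: GNode) => boolean): GraphFilter {
  return (graph) => {
    const nodes = graph.nodes.filter(keep);
    const ids = new Set(nodes.map((n) => n.id));
    const edges = graph.edges.filter((e) => ids.has(e.source) && ids.has(e.target));
    return { nodes, edges };
  };
}

// Edge filter: keep the edges that pass the test, leave nodes untouched.
function filterEdges(keep: (edge: GEdge) => boolean): GraphFilter {
  return (graph) => ({ nodes: graph.nodes, edges: graph.edges.filter(keep) });
}

// Apply the stack in order: each filter receives the output of the previous one.
function applyStack(graph: SimpleGraph, stack: GraphFilter[]): SimpleGraph {
  return stack.reduce((current, filter) => filter(current), graph);
}

const fullGraph: SimpleGraph = {
  nodes: [
    { id: "a", attributes: { type: "blog" } },
    { id: "b", attributes: { type: "blog" } },
    { id: "c", attributes: { type: "news" } },
    { id: "d", attributes: { type: "blog" } },
  ],
  edges: [
    { source: "a", target: "b", weight: 3 },
    { source: "b", target: "c", weight: 4 },
    { source: "a", target: "d", weight: 1 },
  ],
};

const filterStack: GraphFilter[] = [
  filterNodes((n) => n.attributes["type"] === "blog"), // attribute filter on nodes
  filterEdges((e) => e.weight >= 2),                   // attribute filter on edges
];
const filteredGraph = applyStack(fullGraph, filterStack);
// filteredGraph keeps nodes a, b, d and the single edge a-b
```

A custom script would fit the same shape, since the `keep` test can be any function. A topological filter such as the main component would be another `GraphFilter` that inspects the whole graph instead of one element at a time.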
We have to experiment with the design of filtering, but let us acknowledge that filtering has to be simple for the user. Our priority is to keep our user experience straightforward.
Gephi Lite will feature statistics (computable metrics), although with fewer choices than Gephi. Those statistics are the ones included in Graphology, and if we add new ones they will also be included in Graphology. We want to make it possible to choose the name of the attribute where the output is written (with a warning if it already exists).
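As a minimal sketch of the output-attribute idea, the following TypeScript computes a degree and writes it to an attribute whose name the user chooses. It deliberately does not call Graphology, to stay self-contained. The names `MetricNode`, `MetricEdge`, `MetricResult` and `computeDegree` are hypothetical, and returning the warning as a string after overwriting is an assumption about how it could be surfaced.

```typescript
// Hypothetical types for illustration only.
type MetricNode = { id: string; attributes: { [key: string]: number | string } };
type MetricEdge = { source: string; target: string };
type MetricResult = { warning: string | null };

// Compute the degree of every node and store it under `outputAttribute`.
// A warning is returned when that attribute already existed on some node.
function computeDegree(
  nodes: MetricNode[],
  edges: MetricEdge[],
  outputAttribute: string
): MetricResult {
  const alreadyExists = nodes.some((n) => outputAttribute in n.attributes);

  // Count edge endpoints per node (a self-loop counts twice).
  const degree = new Map<string, number>();
  nodes.forEach((n) => degree.set(n.id, 0));
  edges.forEach((e) => {
    degree.set(e.source, (degree.get(e.source) ?? 0) + 1);
    degree.set(e.target, (degree.get(e.target) ?? 0) + 1);
  });

  // Write the result into the chosen attribute.
  nodes.forEach((n) => {
    n.attributes[outputAttribute] = degree.get(n.id) ?? 0;
  });

  return {
    warning: alreadyExists
      ? `Attribute "${outputAttribute}" already existed and was overwritten`
      : null,
  };
}

const metricNodes: MetricNode[] = [
  { id: "a", attributes: {} },
  { id: "b", attributes: {} },
  { id: "c", attributes: {} },
];
const metricEdges: MetricEdge[] = [
  { source: "a", target: "b" },
  { source: "a", target: "c" },
];

const firstRun = computeDegree(metricNodes, metricEdges, "degree");  // no warning
const secondRun = computeDegree(metricNodes, metricEdges, "degree"); // warning: attribute exists
```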
Similarly to statistics, we will use the Graphology layout algorithms and possibly extend them.
This is not only related to the layout, but let us share a complicated question that gives a practical idea of the design challenges we face. In Gephi, the node size is relative to the layout. They use the same coordinate system. By contrast, in other contexts like with Sigma, the node size may vary independently. The scaling of the layout is, after all, arbitrary. But because Gephi Lite is a companion to Gephi, we want to enforce consistency between them. Therefore the node size should behave similarly to Gephi. It turns out that this behavior conflicts with the current architecture of Sigma, due to how the renderer layer behaves. We see no other solution than to introduce a breaking change in Sigma (v3 is therefore in preparation).
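The difference between the two behaviors can be illustrated with a small TypeScript sketch. It is a simplified model, not a description of how Gephi or Sigma actually render: the function names and the `pixelsPerGraphUnit` parameter are hypothetical, and the "independent" case is reduced to a constant size in pixels.

```typescript
// Gephi-like behavior: node size lives in the same coordinate system as the layout,
// so the on-screen radius grows and shrinks with the zoom.
function radiusInGraphSpace(nodeSize: number, pixelsPerGraphUnit: number): number {
  return nodeSize * pixelsPerGraphUnit;
}

// Independent behavior (simplified): node size is expressed in pixels,
// so the on-screen radius is the same at every zoom level.
function radiusInScreenSpace(nodeSize: number, _pixelsPerGraphUnit: number): number {
  return nodeSize;
}

// Do two nodes overlap on screen, given their distance in layout units?
function overlapOnScreen(
  layoutDistance: number,
  sizeA: number,
  sizeB: number,
  pixelsPerGraphUnit: number,
  radius: (nodeSize: number, pixelsPerGraphUnit: number) => number
): boolean {
  const distanceInPixels = layoutDistance * pixelsPerGraphUnit;
  return distanceInPixels < radius(sizeA, pixelsPerGraphUnit) + radius(sizeB, pixelsPerGraphUnit);
}

// Two nodes of size 10, placed 25 layout units apart, viewed zoomed out (0.5 px per unit).
const overlapGephiLike = overlapOnScreen(25, 10, 10, 0.5, radiusInGraphSpace);    // false
const overlapIndependent = overlapOnScreen(25, 10, 10, 0.5, radiusInScreenSpace); // true
```

In the first model, whether nodes overlap is a property of the layout alone, which is why a layout tuned in Gephi looks the same at any zoom. In the second, overlaps appear or disappear as the user zooms.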
The simplification of the filters and appearance systems gets in the way of some popular scenarios that we want to make possible in Gephi Lite. We are therefore adding a new concept to support these advanced uses, in a way that would be transparent to beginners. We call this feature buckets.
A bucket is a set of nodes and edges. A subgraph, technically. But you may think of it as a partition of your network, or a layer, or a selection. It’s just a way to handle a subset of your network in a few places where we think it is necessary.
Users who do not understand this concept can ignore it completely. It is not required to know what a bucket is in most situations. However, you may need it if you feel limited by the way filters and appearance settings work. For example, suppose you have a two-mode network and you want to set the size of the nodes according to their degree, but with a different scale for each type of node, because one type consists of a few nodes with many links (they would be too big) and the other of many nodes with few links (they would be too small). If you meet this kind of problem, then using buckets is the way to go.
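The two-mode scenario above can be sketched as follows. This is a minimal TypeScript illustration under assumptions: `BNode`, `Bucket`, `SizeRange` and `sizesForBucket` are hypothetical names, and a bucket is reduced to a named list of node ids (edges are left out for brevity). Each bucket gets its own degree-to-size scale.

```typescript
// Hypothetical types for illustration only.
type BNode = { id: string; kind: string; degree: number };
type Bucket = { name: string; nodeIds: string[] };
type SizeRange = { min: number; max: number };

// Scale degree to size inside one bucket, using that bucket's own
// minimum and maximum degree rather than those of the whole network.
function sizesForBucket(nodes: BNode[], bucket: Bucket, range: SizeRange): Map<string, number> {
  const members = nodes.filter((n) => bucket.nodeIds.indexOf(n.id) !== -1);
  const degrees = members.map((n) => n.degree);
  const lo = Math.min(...degrees);
  const hi = Math.max(...degrees);
  const sizes = new Map<string, number>();
  members.forEach((n) => {
    const t = hi === lo ? 0.5 : (n.degree - lo) / (hi - lo);
    sizes.set(n.id, range.min + t * (range.max - range.min));
  });
  return sizes;
}

// A two-mode network: a few authors with many links, many papers with few links.
const twoModeNodes: BNode[] = [
  { id: "author1", kind: "author", degree: 40 },
  { id: "author2", kind: "author", degree: 100 },
  { id: "paper1", kind: "paper", degree: 1 },
  { id: "paper2", kind: "paper", degree: 2 },
  { id: "paper3", kind: "paper", degree: 3 },
];

const authorBucket: Bucket = { name: "authors", nodeIds: ["author1", "author2"] };
const paperBucket: Bucket = { name: "papers", nodeIds: ["paper1", "paper2", "paper3"] };

const authorSizes = sizesForBucket(twoModeNodes, authorBucket, { min: 10, max: 20 });
const paperSizes = sizesForBucket(twoModeNodes, paperBucket, { min: 2, max: 6 });
// authors: author1 -> 10, author2 -> 20; papers: paper1 -> 2, paper2 -> 4, paper3 -> 6
```

With a single scale over the whole network, the three papers would all land near the minimum size. With one scale per bucket, differences within each type remain visible.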
We are aware that this feature makes the memory structure of Gephi Lite a bit more complicated to write, but we consider that it is worth the effort.
We identified a few problems. For instance:
- The appearance panel will be heavier than in Gephi
- It is just too heavy to display all panels all the time
- We cannot have tabs because the browser already has tabs
- We do not want to split the app into pages (screens) for similar reasons
The solution we found (currently) entails:
- A compact sidebar on the left with shortcuts to different panels: metrics, layout, appearance and filter.
- When clicking on a shortcut in the compact sidebar, the corresponding panel opens in an additional sidebar next to the compact sidebar (the panel unfolds in a collapsible column).
- A column on the right with graph context (visible nodes and edges) and contextual information and actions (for instance, what is selected etc.)
Our approach is to design by drafting (no mockups) but leaving aside all graphic design choices for the moment. Those will be designed later on in “live wireframes” (or as we call it, “ugly soulless prototypes”).
We have three personas in mind when making our design decisions:
- The cartographer
- The data scientist
- The collaborator: someone who wants to share the exploration of a network
Cloud file management
Because we are on the web, it can be really useful to save the project’s file on the cloud. To achieve that we identified the following needs:
- To sign in
- To list and/or search the files that are compatible with Gephi Lite
- To load a file
- To save a file
The first implementation that we want to provide is GitHub Gist. Gephi has a plugin to publish a graph on the web that generates a GEXF file and saves it on GitHub as a Gist (we will post about that at a later point). GitHub Gist allows CORS (a major constraint of this approach), so a web application can load a gist file like Retina does.
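To give an idea of what listing compatible files could look like, here is a minimal TypeScript sketch. The GitHub REST API exposes a gist at `https://api.github.com/gists/{gist_id}`, and its JSON response contains a `files` object keyed by file name, where each entry carries fields such as `filename`, `raw_url`, `truncated` and `content`. Everything else is an assumption made for the example: the function names are hypothetical, compatibility is judged by the `.gexf` extension only, and the response object below is mock data with a placeholder URL rather than a real gist.

```typescript
// Subset of the GitHub "get a gist" response used by this sketch.
type GistFile = { filename: string; raw_url: string; truncated: boolean; content?: string };
type GistResponse = { id: string; files: { [filename: string]: GistFile } };

// List the files of a gist that look like GEXF graphs (file extension check only).
function listGexfFiles(gist: GistResponse): GistFile[] {
  const result: GistFile[] = [];
  Object.keys(gist.files).forEach((key) => {
    const file = gist.files[key];
    if (file !== undefined && /\.gexf$/i.test(file.filename)) result.push(file);
  });
  return result;
}

// The API truncates large files: in that case the full content
// has to be downloaded from `raw_url` instead of read from `content`.
function needsRawDownload(file: GistFile): boolean {
  return file.truncated || file.content === undefined;
}

// Mock response for illustration (placeholder id and URLs).
const mockGist: GistResponse = {
  id: "0000",
  files: {
    "network.gexf": {
      filename: "network.gexf",
      raw_url: "https://example.org/raw/network.gexf",
      truncated: false,
      content: "<gexf></gexf>",
    },
    "README.md": {
      filename: "README.md",
      raw_url: "https://example.org/raw/README.md",
      truncated: false,
      content: "notes",
    },
    "big.GEXF": {
      filename: "big.GEXF",
      raw_url: "https://example.org/raw/big.GEXF",
      truncated: true,
    },
  },
};

const gexfFiles = listGexfFiles(mockGist);
```

In a browser, the response object would come from a request to the gist endpoint, and a file flagged by `needsRawDownload` would be fetched from its `raw_url`.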
How we see the lifecycle of a file in Gephi Lite:
- Use Gephi
- Export on the web
- Use Gephi Lite
- Import the GEXF into Gephi
The last part (importing a remote GEXF file in Gephi) doesn’t exist yet, but it’s easy to develop as a plugin. Using GitHub Gist also gives us the opportunity to see revisions of a file (history management, rollback…). This system is compatible with other providers that we could add in the future like Nextcloud, Google Drive, Dropbox, etc.
Our design intentions, as stated in this post, can be seen as a long-term road map for the project. Our short-term goal is the MVP, the “minimum viable product”. The MVP is the smallest version of Gephi Lite that can be useful, the point before which it makes no sense to release the tool. Therefore, developing the features of the MVP is the priority. The rest is “nice to have” because it requires the MVP to work. But deciding what belongs to the MVP is not just a matter of technical constraints; it is also a subjective call about what “necessary” and “useful” mean in the context of network analysis.
No final choice has been made for the moment, and we need a better view of the implementation complexity of the various components we have to build. But some things are already clear to us.
We do want the following features in the MVP:
- Load and save a GEXF file, local or Gist
- Visualize (zoom, pan the view, search)
- Appearance (node color and size)
- Filter (at least one)
- Statistics (at least one)
- Layout (Force Atlas 2)
These can wait until later:
- Custom scripts
- Data editing
The next work iteration on Gephi Lite should happen in early 2023. There will be no prototype release before then. We will report on our progress at that point. See you there!
– Mathieu, Alexis, Paul and Benoît