From 29 August to 2 September 2022, about 20 people met in Paris and online to make the Gephi codebase more sustainable, discuss the project, experiment with potential features, improve the design, and get closer to the 1.0 version. It was a follow-up to the 2021 code sustainability retreat, and its theme was community detection. In this post we present what we have done.
The event was sponsored by the SoBigData++ project, hosted by the Sciences Po médialab, and live-streamed by Nicolas Bouchaib from First Link. Tommaso Venturini, Axel Meunier and Simon Bourdieu-Apartis carried the burden of organization. We warmly thank all of them for having made this event possible!
We covered a broad spectrum of topics, listed below. Keep in mind that not every project could be finalized during the week. About half of the contributions will need some time before they reach users. The forthcoming 0.10.0 version will include the rest, and is expected by the end of 2022.
These features and experiments will be covered in upcoming blog posts:
- Gephi Lite, an upcoming web version of Gephi. Alexis Jacomy has been paving the road map and leading the discussion about its features.
- Revamping the icons in Gephi. Côme Brocas reworked the icons system and Mathieu Bastian reworked the implementation. This will also contribute to the upcoming dark mode!
- New web export based on OuestWare’s Retina, with a plugin developed by Clément Levallois and Alexis Jacomy.
- New Neo4J plugin, developed by an expert of that technology, Benoît Simard.
- Rethinking how we visualize community detection in Gephi, notably the ambiguity about which groups each node can be associated with. Tommaso Elli, Andrea Benedetti, Mathieu Jacomy and Guillaume Plique reflected on visualizing the process of the algorithm. Benjamin Ooghe-Tabanou, Étienne Côme and Guillaume reflected on metrics and visualizations to assess the ambiguity itself.
- Video presenting the codebase, by Mathieu Bastian. To be released soon!
These features are described below in this post:
- Allowing the export of node coordinates as columns in the data, a frequently requested feature added by Sukankana Chakraborty.
- Making the edge types editable in the data laboratory, which matters for multigraphs, by Matthieu Totet.
- Exporting the same node borders as in the overview, by Roberto Luna-Garcia.
- Exporting with a transparent background, by Roberto Luna-Garcia.
- Adding arrows to curved edges when you export an image. Mathieu Jacomy tackled this seemingly simple issue, but it was more complicated than it looked.
- Revamping the online documentation, notably for developers, by Mathieu Bastian and Matthieu Totet.
A few general points before diving into the details.
Gephi is expanding to the web. We will develop this in an upcoming post, but in short, we are committed to stabilizing a web version of Gephi, with a reduced scope but a more modern UX, called Gephi Lite. The team at OuestWare (Alexis Jacomy, Benoît Simard and Paul Girard) has taken the lead on this branch of our project. It will be based on Graphology and SigmaJS, and benefit from the invaluable help of Guillaume Plique.
Gephi is popular, and many people are willing to help the project. This second edition attracted more participants than the last. More varied people, too: designers, researchers, data analysts, content creators, OSINT practitioners, and developers. Those categories are not mutually exclusive.
It is hard to recruit Java developers. One of the reasons seems to be that Java Desktop and Swing are not sexy, but more importantly, we are not that well connected to the Java dev world. We find our contributors either through plugins, or in the overlap of data science and dev: people who use Gephi and happen to also know how to dev. We will keep communicating about our need to stabilize a community of developers, and we believe that a lively non-dev community around Gephi (users, content creators, designers…) contributes indirectly to a more lively dev community.
We still need to stabilize the codebase. We are not yet ready to finalize a version 1.0, if only because we still need to rework the visualization engine to get rid of unmaintained dependencies. This will require a separate effort later this year.
The Gephi Week was very beneficial to the project. Although we struggle to be attractive to Java developers, the Gephi Week was an occasion for everyone to improve their knowledge about the codebase. Some contributors like myself were rusty, and it was for us an opportunity to exercise our coding muscles again, under the excellent coaching of Mathieu Bastian, who also recorded a guided tour of the codebase. Newcomers could also learn the basics, and the codebase received more scrutiny. Little by little, we build the ability to help and support each other, and improve our autonomy. And beyond the central concern about the sustainability of the codebase, the project immensely improved in many unexpected directions, such as rethinking the design of popular features like community detection, revamping big parts of the visual identity (icons), and building a web sibling to the Java version, Gephi Lite. Even beyond these developments, the coding retreat spawned satellite events like a meeting with the local OSINT community (many Gephi users!), live-streaming with YouTubers, and discussing with renowned researchers. Around the coding retreat, something like a mini-festival is growing by itself.
We remain committed to keeping this event yearly, and we expect it to grow again next year.
As our wrap-up was live-streamed, we had the opportunity to share that moment with you. The stream has been cleaned up and sliced. We published it on our YouTube channel as a playlist (see below). The playlist, about 100 minutes long, is in the order it was recorded. For a more thematic approach, each video will be featured separately as we explain what we have done during the Gephi Week, starting in the next section.
More about what we have done
Allowing the export of node coordinates
A frequently requested feature, added by Sukankana “Schuh” Chakraborty. The (x,y) coordinates of nodes are native to Gephi. They are unlike other attributes insofar as they are used to draw the layout. As an unfortunate consequence, they used to be omitted when exporting data. This is a problem, notably if you want to draw the nodes in another environment like Tableau. Schuh addressed this issue and added the option.
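To make the idea concrete, here is a minimal sketch of what "coordinates as columns" means for an export. It is not Gephi's code: the `nodes` records, the `export_nodes_csv` function and its `include_coordinates` flag are all hypothetical names chosen for illustration.

```python
import csv
import io

# Hypothetical node records: the position is stored apart from the regular
# attribute columns, which is why a naive export would skip it.
nodes = [
    {"id": "a", "attributes": {"label": "Alice"}, "position": (12.5, -3.0)},
    {"id": "b", "attributes": {"label": "Bob"}, "position": (-7.25, 40.0)},
]


def export_nodes_csv(nodes, include_coordinates=True):
    """Write nodes to CSV text, optionally adding x and y as plain columns."""
    columns = ["id", "label"] + (["x", "y"] if include_coordinates else [])
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for node in nodes:
        row = {"id": node["id"], **node["attributes"]}
        if include_coordinates:
            # Copy the position into ordinary columns so other tools can read it.
            row["x"], row["y"] = node["position"]
        writer.writerow(row)
    return buffer.getvalue()


csv_text = export_nodes_csv(nodes)
print(csv_text)
```

With the flag on, a tool that knows nothing about Gephi can place the nodes from the `x` and `y` columns alone.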
Making the type of edges editable
In multigraphs, each edge has a given “type”, also sometimes called “kind”. The type differentiates the relations being represented, for example mother, sister, niece… As with the coordinates in Schuh’s issue above, the edge type was a special attribute, and we could not change it in the data laboratory. As Mathieu Bastian explains below, Matthieu Totet addressed the issue (he could not be present during the wrap-up).
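As a rough illustration of why the type is more than an ordinary column, here is a small sketch of a multigraph where parallel edges are told apart by their type. The `Edge` class and `set_edge_type` function are hypothetical, and the rule that two edges between the same nodes cannot share a type is an assumption of this sketch, not a statement about Gephi's internals.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    source: str
    target: str
    kind: str  # the edge "type"; called `kind` because `type` is a Python builtin


# Two parallel edges between the same pair of nodes, distinguished by type.
edges = [
    Edge("ann", "beth", "mother"),
    Edge("ann", "beth", "neighbor"),
]


def set_edge_type(edges, edge, new_kind):
    """Change an edge's type unless an identical parallel edge already exists."""
    clash = any(
        other is not edge
        and (other.source, other.target, other.kind)
        == (edge.source, edge.target, new_kind)
        for other in edges
    )
    if clash:
        raise ValueError(f"an edge of type {new_kind!r} already links these nodes")
    edge.kind = new_kind


set_edge_type(edges, edges[1], "sister")
```

Because the type is part of what identifies an edge in this sketch, editing it needs a check that editing a plain attribute would not.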
Exporting the same node borders as in the overview
You may have noticed that the nodes have a different look in the Overview and in the Preview. The Preview (the image exporter) can generally do more than the Overview, but one feature was missing: having node borders colored with a darker version of the node color. Roberto Luna-Garcia added this option to the settings.
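One simple way to derive "a darker version of the node color" is to scale each RGB channel by a constant factor. The sketch below shows that idea only; the `darker` and `to_hex` helpers and the factor of 0.7 are illustrative assumptions, not the values or code used in Gephi.

```python
def darker(rgb, factor=0.7):
    """Return a darker shade by scaling each 0-255 channel by `factor`."""
    return tuple(int(channel * factor) for channel in rgb)


def to_hex(rgb):
    """Format an (r, g, b) tuple as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


node_color = (255, 128, 0)          # an orange fill
border_color = darker(node_color)   # same hue, lower brightness
print(to_hex(node_color), to_hex(border_color))
```

Deriving the border from the fill keeps the two visually related without asking the user to pick a second color.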
Exporting a PDF with transparent background
Roberto also addressed the need to export a network map with a transparent background:
Adding arrows to curved edges
When Mathieu Jacomy picked this seemingly simple issue, he thought it would take a few hours. Alas, deep down the rabbit hole, a much more fearsome beast awaited. Bézier curves had to be replaced with circle arcs, which came with their own share of implementation weirdness, as the three renderers each speak a different language: SVG, PDF, and Java2D.
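Part of the difficulty is geometric: on a straight edge the arrow simply points along the edge, but on a curved edge it has to follow the direction of the curve where it reaches the target. On a circle arc that direction is perpendicular to the radius at the endpoint. The sketch below works this out in plain Python. It is not the Gephi implementation: the function names and the `center_offset` parameter (the signed distance of the circle center from the chord midpoint) are hypothetical, and the arrow tip is placed at the target point itself rather than at the node border.

```python
import math


def arc_end_tangent(source, target, center_offset):
    """Unit direction of travel at `target` along the minor arc from `source`.

    The circle center sits at `center_offset` from the chord midpoint,
    measured along the chord's left-hand normal.
    """
    (sx, sy), (tx, ty) = source, target
    dx, dy = tx - sx, ty - sy
    chord = math.hypot(dx, dy)
    if chord == 0 or center_offset == 0:
        raise ValueError("need distinct endpoints and a non-zero center offset")
    # Center = chord midpoint shifted along the unit normal (-dy, dx) / chord.
    cx = (sx + tx) / 2 + center_offset * (-dy / chord)
    cy = (sy + ty) / 2 + center_offset * (dx / chord)
    # The tangent is perpendicular to the radius that ends at the target.
    rx, ry = tx - cx, ty - cy
    radius = math.hypot(rx, ry)
    ux, uy = -ry / radius, rx / radius
    # Of the two perpendiculars, keep the one heading from source to target.
    if ux * dx + uy * dy < 0:
        ux, uy = -ux, -uy
    return (ux, uy)


def arrowhead(tip, direction, length=1.0, half_width=0.5):
    """Triangle (tip, left corner, right corner) pointing along `direction`."""
    (x, y), (ux, uy) = tip, direction
    bx, by = x - length * ux, y - length * uy  # center of the triangle's base
    return [
        (x, y),
        (bx - half_width * uy, by + half_width * ux),
        (bx + half_width * uy, by - half_width * ux),
    ]


tangent = arc_end_tangent((0.0, 0.0), (2.0, 0.0), center_offset=1.0)
triangle = arrowhead((2.0, 0.0), tangent)
```

The resulting three points then still have to be drawn in each output format, which is where the format-specific work comes in.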
Revamping the online documentation
How do you write accessible documentation for developers? Matthieu Totet and Mathieu Bastian drew inspiration from the OpenRefine community, and reworked the system around Gephi, with a good share of automation.
This event is supported by the European Union – Horizon 2020 Program under the scheme “INFRAIA-01-2018-2019 – Integrating Activities for Advanced Communities”, Grant Agreement n.871042, “SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics” (http://www.sobigdata.eu).