Community Detection in Networks

Community detection identifies groups of densely interconnected nodes in a network; members of a group often share similar properties. In supply chain management, it can segment a large supply chain network into smaller clusters of similar suppliers or customers, which helps with resource optimization and risk management. In chemical process management, it can reveal groups of reactants that take part in the same reaction pathway or have similar chemical properties, which supports process design and optimization.

Using Graphistry and igraph (CPU):


import graphistry
import pandas as pd

# Load the sample chemical process network
edges = pd.read_csv('process_network.csv')
g = graphistry.edges(edges, 'src', 'dst').materialize_nodes()

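# Run the Louvain algorithm for community detection
# (igraph exposes Louvain as 'community_multilevel'; out_col stores the result in a 'louvain' column)
g = g.compute_igraph('community_multilevel', out_col='louvain')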

# Map the detected communities to a categorical color palette
color_palette = {
    0: "red",
    1: "blue",
    2: "green",
    3: "yellow",
    # ... Extend for other community indices
}
g = g.encode_point_color('louvain', as_categorical=True, categorical_mapping=color_palette, default_mapping="#CCC")

# Visualize (requires a prior graphistry.register(...) call with your server credentials)
g.plot()
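
The color_palette above lists only four communities, and Louvain typically returns more on real networks. A minimal sketch of one way to build the mapping automatically is shown here; the helper name build_palette and the BASE_COLORS list are hypothetical and not part of Graphistry.

```python
from itertools import cycle

# Hypothetical base colors; extend or replace with any CSS color names or hex codes
BASE_COLORS = ["red", "blue", "green", "yellow", "orange", "purple"]

def build_palette(community_ids, colors=BASE_COLORS):
    """Map each distinct community id to a color, reusing colors when ids outnumber them."""
    unique_ids = sorted(set(community_ids))
    # zip stops when unique_ids is exhausted, so cycle() never loops forever
    return dict(zip(unique_ids, cycle(colors)))

# Example with seven communities (ids 0..6) and six base colors
palette = build_palette([0, 1, 2, 3, 4, 5, 6, 1, 0])
print(palette)
```

With the graph above, the ids would presumably come from the 'louvain' column of g._nodes, and the resulting dictionary could be passed as categorical_mapping. Colors repeat once the base list runs out, so a longer list is advisable for networks with many communities.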

Using Graphistry and cuGraph (GPU):


# The initial steps for loading and setting up the graph remain the same

# Run the Louvain algorithm for community detection on the GPU with cuGraph
g = g.compute_cugraph('louvain')

# Visualize the graph with the nodes colored by community
g = g.bind(point_color='louvain')
g.plot()

This code loads a sample chemical process network from an edge list file and turns it into a Graphistry graph object. It then runs the Louvain algorithm for community detection and visualizes the graph with nodes colored by community. The first example uses compute_igraph() to derive communities through igraph's CPU implementation, and the second uses compute_cugraph() to do the same on the GPU. In the CPU example, encode_point_color() maps each community index in the louvain column to a color from the palette; in the GPU example, bind(point_color='louvain') tells Graphistry to color nodes directly by the louvain column.
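
Louvain works by searching for a partition that maximizes modularity, a score that compares the share of edges inside communities with the share expected if edges were placed at random. As a rough illustration of what the algorithm optimizes, the sketch below computes modularity for a given partition of a small undirected, unweighted graph. The function name modularity and the toy edge list are illustrative assumptions, not part of Graphistry, igraph, or cuGraph, and the code assumes a non-empty edge list without self-loops.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)                 # total number of edges (assumed > 0)
    intra = defaultdict(int)       # edges with both endpoints in community c
    degree = defaultdict(int)      # sum of node degrees in community c
    for src, dst in edges:
        c_src, c_dst = community_of[src], community_of[dst]
        degree[c_src] += 1
        degree[c_dst] += 1
        if c_src == c_dst:
            intra[c_src] += 1
    # Q = sum over communities of (intra-edge share) - (degree share)^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Toy network: two triangles joined by a single edge
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in "abcdef"}

q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
print(q_two, q_one)
```

Splitting the toy network into its two triangles scores higher than lumping every node into one community, which is the kind of improvement Louvain looks for as it merges and moves nodes.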