Greedy

Greedy-family methods iteratively construct the seed set by maximizing the marginal spread gain in each round. They directly optimize:

\[\Delta(v \mid S) = f(S \cup \{v\}) - f(S)\]

where \(f(\cdot)\) is the expected influence spread under the selected diffusion model.

Under classical assumptions where the influence objective is monotone and submodular, the standard greedy algorithm provides a \((1 - 1/e)\) approximation guarantee. In engineering practice, the objective value is estimated by Monte-Carlo simulation. FS_GPlib provides both the vanilla version and a caching-enhanced variant.

Available Algorithms

GreedyIM

class fs_gplib.InfluenceMaximization.greedy.GreedyIM(model, seed_size: int, influenced_type: List[int] | None = None, MC: int = 1000, iterations_times: int = 100, verbose: bool = True)[source]

Bases: BaseInfluenceMaximizer

Greedy Influence Maximization algorithm.

Selects seeds iteratively, each time picking the node that maximizes the marginal influence spread. Under classical assumptions where the influence objective is monotone and submodular, this provides a (1 - 1/e) approximation guarantee for the optimal seed set. In practice, spread is estimated by Monte-Carlo simulation.

Algorithm:

S = {} for i in range(k):

for each node v not in S:

compute marginal_gain = spread(S ∪ {v}) - spread(S)

S = S ∪ {argmax_v marginal_gain}

Parameters:
  • model -- A diffusion model from fs_gplib.Epidemics.

  • seed_size -- Number of seeds to select.

  • influenced_type -- Set of node state values that count as influenced.

  • MC -- Monte-Carlo simulations per spread estimate.

  • iterations_times -- Diffusion steps per simulation.

  • verbose -- Whether to show progress during selection.

Example:

from fs_gplib.InfluenceMaximization import GreedyIM
from fs_gplib.Epidemics import SIRModel

model = SIRModel(
    data=graph_data,
    seeds=[],
    infection_beta=0.05,
    recovery_lambda=0.01
)

im = GreedyIM(
    model=model,
    seed_size=10,
    influenced_type=[1, 2],
    MC=500,
    iterations_times=100
)
seeds = im.fit()
fit() List[int][source]

Run the greedy algorithm to select seed nodes.

Returns:

List of selected seed node indices.

Return type:

list[int]

GreedyIMWithCaching

class fs_gplib.InfluenceMaximization.greedy.GreedyIMWithCaching(model, seed_size: int, influenced_type: List[int] | None = None, MC: int = 1000, iterations_times: int = 100, verbose: bool = True)[source]

Bases: BaseInfluenceMaximizer

Greedy IM with spread caching to reduce redundant simulations.

This version caches spread estimates for previously evaluated seed sets, significantly reducing computation when the same seed combinations are evaluated multiple times.

Parameters:
  • model -- A diffusion model from fs_gplib.Epidemics.

  • seed_size -- Number of seeds to select.

  • influenced_type -- Set of node state values that count as influenced.

  • MC -- Monte-Carlo simulations per spread estimate.

  • iterations_times -- Diffusion steps per simulation.

  • verbose -- Whether to show progress.

Example:

im = GreedyIMWithCaching(
    model=model,
    seed_size=10,
    influenced_type=[1, 2],
    MC=500
)
seeds = im.fit()
estimator = im.get_estimator()
print(f"Total spread estimates: {estimator.get_call_count()}")
fit() List[int][source]

Run greedy with caching to select seed nodes.

Returns:

List of selected seed node indices.

Return type:

list[int]

get_estimator() SpreadEstimator[source]

Return the spread estimator for accessing cache statistics.

Returns:

The SpreadEstimator instance.

Return type:

SpreadEstimator

Advanced Interface

GreedyIMWithCaching exposes get_estimator() for inspecting cache statistics (for example total spread-estimation calls and cache size).