Advanced Documentation

In the steps below we show you some specific use cases of our LN traffic simulator.

We assume that you have already

  • installed lnsimulator and downloaded the related LN data. If this is not the case, follow the instructions in the Getting Started section first.
  • understood the data preparation steps needed for payment simulation. We refer you to the Basic Documentation if you missed this step.

Preparation

Quickly execute all steps (e.g. imports, loading channel and merchant data) needed before the advanced experiments.

from lnsimulator.ln_utils import preprocess_json_file
import lnsimulator.simulator.transaction_simulator as ts

data_dir = "../ln_data/" # path to the ln_data folder that contains the downloaded data
directed_edges = preprocess_json_file("%s/sample.json" % data_dir)

import pandas as pd
node_meta = pd.read_csv("%s/1ml_meta_data.csv" % data_dir)
providers = list(node_meta["pub_key"])

# the number of simulated payments and the payment value are fixed in this notebook

amount = 60000
count = 7000

Parameters explained

Here is the list of main parameters. By the word “transaction” we refer to LN payments.

Parameter      | Description                                                     | Default value
amount         | value of each simulated transaction in satoshis                 | Must be set
count          | number of random transactions to sample                         | Must be set
epsilon        | ratio of merchants among transaction endpoints                  | 0.8
drop_disabled  | drop temporarily disabled channels                              | True
drop_low_cap   | drop channels with capacity less than amount                    | True
with_depletion | the available channel capacity is maintained for both endpoints | True

The following examples will help you understand the effect of each parameter. As amount and count are straightforward parameters, we start with how to set the merchant ratio for payment receivers.

Merchant ratio for payment receivers

The number of unique receivers is highest when receivers are sampled uniformly at random (epsilon=0.0), while with epsilon=1.0 payments are sent only to merchants. In most of our experiments we sample merchant receivers with high probability (epsilon=0.8) and select uniformly random receivers with the remaining small probability.

only_merchant_receivers = ts.TransactionSimulator(directed_edges, providers, amount, count, epsilon=1.0)
many_merchant_receivers = ts.TransactionSimulator(directed_edges, providers, amount, count, epsilon=0.8)
uniform_receivers = ts.TransactionSimulator(directed_edges, providers, amount, count, epsilon=0.0)
print(only_merchant_receivers.transactions["target"].nunique())
print(many_merchant_receivers.transactions["target"].nunique())
print(uniform_receivers.transactions["target"].nunique())
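
To make the role of epsilon concrete, here is a minimal stand-alone sketch of the sampling idea described above. It does not use lnsimulator; the helper sample_receivers and the toy node names are hypothetical, and the actual sampling inside TransactionSimulator may differ in its details.

```python
import random

def sample_receivers(nodes, merchants, count, epsilon, seed=0):
    """Hypothetical helper: pick `count` payment receivers.

    With probability `epsilon` a receiver is drawn from `merchants`,
    otherwise it is drawn uniformly at random from all `nodes`.
    """
    rng = random.Random(seed)
    receivers = []
    for _ in range(count):
        pool = merchants if rng.random() < epsilon else nodes
        receivers.append(rng.choice(pool))
    return receivers

nodes = ["node_%d" % i for i in range(1000)]
merchants = nodes[:20]  # toy assumption: 20 of the 1000 nodes are merchants

# The number of unique receivers grows as epsilon decreases.
for eps in (1.0, 0.8, 0.0):
    targets = sample_receivers(nodes, merchants, count=7000, epsilon=eps)
    print(eps, len(set(targets)))
```

With epsilon=1.0 the unique receivers cannot exceed the number of merchants, which is why the first count printed in the lnsimulator example is expected to be the smallest.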

Control channel exclusion with drop_disabled and drop_low_cap

Channels can be temporarily disabled in a given snapshot while being active in others. If you want to keep disabled channels in your experiments, use drop_disabled=False.

default_sim = ts.TransactionSimulator(directed_edges, providers, amount, count, drop_disabled=True, drop_low_cap=True)
with_disabled_sim = ts.TransactionSimulator(directed_edges, providers, amount, count, drop_disabled=False, drop_low_cap=True)
print(default_sim.edges.shape)
print(with_disabled_sim.edges.shape)

A payment can only be forwarded on a given channel if the channel capacity is at least the value of the payment (drop_low_cap=True). In the simulation you can disable this condition (drop_low_cap=False).

with_lowcap_sim = ts.TransactionSimulator(directed_edges, providers, amount, count, drop_disabled=False, drop_low_cap=False)
print(with_lowcap_sim.edges.shape)
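
The effect of the two flags can be illustrated on a toy channel list. This is only a sketch under assumptions: filter_channels and the dictionary keys "capacity" and "disabled" are hypothetical names chosen for the illustration, not the column names used by lnsimulator.

```python
def filter_channels(channels, amount, drop_disabled=True, drop_low_cap=True):
    """Hypothetical filter that mirrors the two exclusion flags."""
    kept = []
    for ch in channels:
        if drop_disabled and ch["disabled"]:
            continue  # channel is temporarily disabled in this snapshot
        if drop_low_cap and ch["capacity"] < amount:
            continue  # channel is too small to carry the payment
        kept.append(ch)
    return kept

channels = [
    {"src": "A", "trg": "B", "capacity": 100000, "disabled": False},
    {"src": "B", "trg": "C", "capacity": 50000, "disabled": False},  # low capacity
    {"src": "C", "trg": "D", "capacity": 200000, "disabled": True},  # disabled
    {"src": "D", "trg": "A", "capacity": 10000, "disabled": True},   # both
]
amount = 60000

default_kept = filter_channels(channels, amount)
with_disabled = filter_channels(channels, amount, drop_disabled=False)
with_lowcap = filter_channels(channels, amount, drop_disabled=False, drop_low_cap=False)
print(len(default_kept), len(with_disabled), len(with_lowcap))
```

Each relaxed flag can only keep the same number of channels or more, which matches the growing edge counts you should see in the shapes printed above.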

Updating node balances with payments

Individual balances of LN nodes are private data, but lnsimulator can keep track of capacity imbalances (with_depletion=True) as payments are executed on the fly. After distributing capacities randomly between the two endpoints of each channel (initialization step), the simulator can monitor whether a node has enough outbound capacity on a given channel to forward the upcoming payment. This feature has several advantages:

  • ability to detect node capacity depletions in case of heavy one-way traffic
  • better understanding of payment failures

If you disable this feature (with_depletion=False), payments can pass a channel in a fixed direction infinitely many times as long as the payment value is at most the channel capacity.

In the next example we observe the payment failure ratio with respect to the with_depletion parameter.

sim_with_dep = ts.TransactionSimulator(directed_edges, providers, amount, count, with_depletion=True)
_, _, _, _ = sim_with_dep.simulate(weight="total_fee")


sim_wout_dep = ts.TransactionSimulator(directed_edges, providers, amount, count, with_depletion=False)
_, _, _, _ = sim_wout_dep.simulate(weight="total_fee")

Transaction success rates are lower if capacity depletion is enabled (with_depletion=True). This indicates that there are channels with heavy one-way traffic.

print("Success rate with depletions:", sim_with_dep.transactions["success"].mean())
print("Success rate without depletions:", sim_wout_dep.transactions["success"].mean())
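
The difference between the two modes can be sketched on a single toy channel. The Channel class below is a hypothetical illustration of the bookkeeping described in this section (random initial split, balance update after each forwarded payment); it is an assumption about the mechanism, not lnsimulator's implementation.

```python
import random

class Channel:
    """Toy channel between nodes `a` and `b` with a randomly split capacity."""

    def __init__(self, a, b, capacity, rng):
        share = rng.randint(0, capacity)  # initialization step: random split
        self.capacity = capacity
        self.balance = {a: share, b: capacity - share}
        self.peer = {a: b, b: a}

    def forward(self, sender, amount, with_depletion=True):
        """Try to move `amount` from `sender` to its peer; return success."""
        if not with_depletion:
            # Only the total capacity is checked; balances never change.
            return amount <= self.capacity
        if self.balance[sender] < amount:
            return False  # sender side is depleted for this payment value
        self.balance[sender] -= amount
        self.balance[self.peer[sender]] += amount
        return True

rng = random.Random(42)
ch_dep = Channel("A", "B", 100000, rng)
ch_nodep = Channel("A", "B", 100000, rng)

# Heavy one-way traffic: five payments of 60000 satoshis from A to B.
with_dep = [ch_dep.forward("A", 60000, with_depletion=True) for _ in range(5)]
without_dep = [ch_nodep.forward("A", 60000, with_depletion=False) for _ in range(5)]
print(sum(with_dep), sum(without_dep))
```

With depletion tracking, at most one of the five payments can succeed on a channel of capacity 100000, while without it all five pass. This is the kind of gap that the two success rates above make visible.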

Advanced simulation features

In the previous experiments, after initializing the simulator, the simulate() function executed cheapest path routing by default without modifying the available channel data. Now let’s see some additional use cases.

sim = ts.TransactionSimulator(directed_edges, providers, amount, count)

Routing algorithm

For now you can choose between two routing algorithms by setting the weight parameter:

  • cheapest path routing (weight="total_fee" - DEFAULT SETTING)
  • shortest path routing (weight=None)

shortest_paths, _, _, _ = sim.simulate(weight=None)
cheapest_paths, _, _, _ = sim.simulate(weight="total_fee")

Filter out payments that could not be routed (they are denoted by length==-1), then observe the average path length of the remaining payments.

print(shortest_paths[shortest_paths["length"]>0]["length"].mean())
print(cheapest_paths[cheapest_paths["length"]>0]["length"].mean())
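
Why cheapest paths tend to be longer than shortest paths can be seen on a toy graph. The sketch below is a plain Dijkstra search and does not call lnsimulator. The function best_path and the fee model (a fixed base fee plus a part proportional to the amount, charged on every hop) are simplifying assumptions for illustration.

```python
import heapq

def best_path(edges, source, target, amount, weight="total_fee"):
    """Dijkstra over directed edges; hop count is used when weight is None."""
    graph = {}
    for src, trg, base_fee, fee_rate in edges:
        # Assumed fee model: base fee plus a part proportional to the amount.
        cost = 1 if weight is None else base_fee + amount * fee_rate
        graph.setdefault(src, []).append((trg, cost))
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (dist + cost, nxt, path + [nxt]))
    return None  # no route found

# (source, target, base fee in satoshis, proportional fee rate)
edges = [
    ("A", "D", 10.0, 0.001),  # direct but expensive channel
    ("A", "B", 1.0, 0.00001),
    ("B", "C", 1.0, 0.00001),
    ("C", "D", 1.0, 0.00001),
]
shortest = best_path(edges, "A", "D", amount=60000, weight=None)
cheapest = best_path(edges, "A", "D", amount=60000, weight="total_fee")
print(len(shortest) - 1, shortest)
print(len(cheapest) - 1, cheapest)
```

In this toy network the shortest route is the single expensive hop, while the cheapest route takes three low-fee hops, so the average length under weight="total_fee" is at least as large as under weight=None.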

Node removal

You can observe the effects of node removals as well by providing a list of LN node public keys. In this case every channel adjacent to the given nodes will be removed during payment simulation.

In this example we exclude the top 5 nodes with the highest routing income.

_, _, all_router_fees, _ = sim.simulate(weight="total_fee")
print("Success rate BEFORE exclusion:", sim.transactions["success"].mean())

top_5_stats = all_router_fees.groupby("node")["fee"].sum().sort_values(ascending=False).head(5)
print(top_5_stats)
top_5_nodes = list(top_5_stats.index)

You can observe how the payment success rate drops after removing 5 important routers.

_, _, _, _ = sim.simulate(weight="total_fee", excluded=top_5_nodes)
print("Success rate AFTER exclusion:", sim.transactions["success"].mean())
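
Conceptually, excluding nodes amounts to dropping every channel that touches them before routing. A minimal sketch on a toy edge list follows; remove_nodes is a hypothetical helper, not part of the lnsimulator API.

```python
def remove_nodes(channels, excluded):
    """Drop every channel that has an excluded node as one of its endpoints."""
    excluded = set(excluded)
    return [(src, trg) for src, trg in channels
            if src not in excluded and trg not in excluded]

channels = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D"), ("B", "D")]
remaining = remove_nodes(channels, excluded=["B"])
print(remaining)
```

Removing the well-connected node B eliminates three of the five toy channels, which illustrates why excluding a few central routers can noticeably reduce the success rate.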

Node capacity reduction

You can observe the traffic changes for a node by reducing its capacity to a given fraction of its original value.

In this example we reduce the capacity of the top 5 nodes with the highest routing income to 10% of its original value. Then we compare their new income with the original.

_, _, reduced_fees, _ = sim.simulate(weight="total_fee", cap_change_nodes=top_5_nodes, capacity_fraction=0.1)
print("Success rate AFTER capacity reduction:", sim.transactions["success"].mean())

new_stats = reduced_fees.groupby("node")["fee"].sum()
old_and_new = top_5_stats.reset_index().merge(new_stats.reset_index(), on="node", how="left", suffixes=("_old","_new"))
print(old_and_new.fillna(0.0))
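
One way to picture the capacity reduction is to scale every channel adjacent to the selected nodes. The sketch below assumes exactly that; reduce_capacity is a hypothetical helper, and lnsimulator may apply the fraction differently internally.

```python
def reduce_capacity(channels, nodes, fraction):
    """Return a copy of `channels` where channels touching `nodes` are scaled."""
    nodes = set(nodes)
    reduced = []
    for src, trg, capacity in channels:
        if src in nodes or trg in nodes:
            capacity = round(capacity * fraction)
        reduced.append((src, trg, capacity))
    return reduced

channels = [("A", "B", 100000), ("B", "C", 500000), ("C", "D", 80000)]
amount = 60000

reduced = reduce_capacity(channels, nodes=["B"], fraction=0.1)
print(reduced)

# Channels that can still carry a payment of `amount` satoshis.
usable = [(s, t) for s, t, cap in reduced if cap >= amount]
print(usable)
```

In this toy case one of the two channels of B falls below the payment value after the reduction, so B can no longer route through it. A drop in routing income for the affected nodes is therefore the expected outcome.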

Longer path (genetic) routing

In our paper we proposed a genetic algorithm to find cheap paths with at least a given length (required_length parameter). By default genetic routing is disabled (required_length=None).

We note that:

  • using a higher value for required_length could increase the running time significantly.
  • payment paths of length 1 (direct channels) won’t be forced to be longer: with zero intermediary nodes there is no privacy issue.
  • if the genetic algorithm cannot find a path with the required length, it returns a path that is shorter than this bound.

In this example we observe the path length distribution for different values of the required_length parameter.

sim_for_routing = ts.TransactionSimulator(directed_edges, providers, amount, count)

min_path_l = [2,3,4]
length_distrib_map = {}

for length_value in min_path_l:
    # run the simulation on the simulator created above for this experiment
    cheapest_paths, _, _, _ = sim_for_routing.simulate(weight="total_fee", required_length=length_value)
    length_distrib_map[length_value] = cheapest_paths["length"].value_counts()
    print(length_value)

Observe the fraction of paths with a given length (rows). The columns represent values of the required_length parameter. The fractions in the (3,3) and (4,4) cells of the heatmap are indeed high due to longer path routing. We note that the row with index -1 represents the fraction of failed payments.

import seaborn as sns

distrib_df = pd.DataFrame(length_distrib_map)
distrib_df = distrib_df / distrib_df.sum()
sns.heatmap(distrib_df.loc[[-1,1,2,3,4]], cmap="coolwarm", annot=True)
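
The normalization step divides each column by its total, so every column of the heatmap sums to one. The same computation on made-up path lengths, using only the standard library, looks as follows. The helper length_fractions and the toy length lists are illustrative assumptions, not simulator output.

```python
from collections import Counter

def length_fractions(lengths):
    """Map each path length to its share among all simulated payments."""
    counts = Counter(lengths)
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}

# Toy path lengths per required_length value; -1 marks a failed payment.
toy_lengths = {
    2: [-1, 1, 2, 2, 2, 3, 3, 4],
    3: [-1, 1, 3, 3, 3, 3, 4, 4],
}
fractions = {req: length_fractions(lengths) for req, lengths in toy_lengths.items()}
for required_length, column in fractions.items():
    print(required_length, column)
```

Each printed dictionary corresponds to one column of the heatmap, with the key -1 holding the share of failed payments.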

Base fee optimization

In the Lightning Network data that we observed, more than 60% of the nodes charged the default base fee. Based on a node’s position in the network, lnsimulator can estimate the base fee increment needed to optimize its routing income: set with_node_removals=True in simulate() and call the calc_optimal_base_fee function afterwards. For now, optimal base fee search is enabled only for cheapest path routing (weight="total_fee"). We also recommend that you apply parallelization by setting a higher value for the max_threads parameter.

sim_fee_opt = ts.TransactionSimulator(directed_edges, providers, amount, count)
shortest_paths, alternative_paths, all_router_fees, _ = sim_fee_opt.simulate(weight="total_fee", with_node_removals=True, max_threads=2)
opt_fee_df, _ = ts.calc_optimal_base_fee(shortest_paths, alternative_paths, all_router_fees)
print(opt_fee_df.head())

The result of calc_optimal_base_fee contains the following information.

Column               | Description
node                 | LN node public key
total_income         | routing income
total_traffic        | number of routed transactions
failed_traffic_ratio | ratio of failed transactions out of total_traffic payments if node is removed from LN
opt_delta            | estimated optimal increase in base fee
income_diff          | estimated increase in daily routing income after applying optimal base fee increment

In the next step we transform the opt_delta column into a categorical feature that represents the increment magnitude.

def to_category(x):
    if x > 100:
        return 3
    elif x > 10:
        return 2
    elif x > 0:
        return 1
    else:
        return 0
    
print("The magnitude distribution of base fee increments:")
print(opt_fee_df["opt_delta"].apply(to_category).value_counts())
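
The same binning can be written more compactly with the standard library. The sketch below is an alternative formulation rather than the code used above: to_category_bisect and THRESHOLDS are hypothetical names, and the opt_delta values are made up for illustration.

```python
from bisect import bisect_left
from collections import Counter

THRESHOLDS = [0, 10, 100]  # category boundaries of the base fee increment

def to_category_bisect(x):
    """Count how many thresholds are strictly below x (0, 1, 2 or 3)."""
    return bisect_left(THRESHOLDS, x)

toy_deltas = [-5, 0, 0.5, 10, 10.1, 100, 101]  # made-up opt_delta values
categories = [to_category_bisect(x) for x in toy_deltas]
print(categories)
print(Counter(categories))
```

Because bisect_left counts only the thresholds strictly smaller than x, boundary values such as 10 and 100 fall into the lower category, just as with the chained x > ... comparisons.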