Worked examples

Four short, end-to-end recipes. All four use the ECSI customer-satisfaction dataset that ships with the test suite (tests/data/satisfaction.csv); the structural model has 6 LVs and 27 indicators, with about 250 respondents.

If you have the repo checked out, the snippets run as is from the repo root, since the data path is relative. Otherwise download satisfaction.csv from the repo and point pd.read_csv at your local copy.

1. Customer satisfaction (the ECSI 6-LV model)

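The canonical PLS-SEM example. Image, Expectations, Quality, and Value all feed into Satisfaction; Satisfaction and Image feed into Loyalty. The antecedents are also chained: Image feeds into Expectations, Expectations into Quality and Value, and Quality into Value.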

import pandas as pd
import openpls.config as c
from openpls import Plspm
from openpls.mode import Mode
from openpls.scheme import Scheme

satisfaction = pd.read_csv("tests/data/satisfaction.csv", index_col=0)

structure = c.Structure()
structure.add_path(["IMAG"], ["EXPE", "SAT", "LOY"])
structure.add_path(["EXPE"], ["QUAL", "VAL", "SAT"])
structure.add_path(["QUAL"], ["VAL", "SAT"])
structure.add_path(["VAL"], ["SAT"])
structure.add_path(["SAT"], ["LOY"])

config = c.Config(structure.path(), scaled=False)
for lv in ["IMAG", "EXPE", "QUAL", "VAL", "SAT", "LOY"]:
    config.add_lv_with_columns_named(lv, Mode.A, satisfaction, lv.lower())

fit = Plspm(satisfaction, config, Scheme.CENTROID)

print("Path coefficients:")
print(fit.path_coefficients().round(3))

print("\nHTMT (discriminant validity, lower is better, threshold 0.85):")
print(fit.htmt().matrix().round(3))

print("\nModel fit:")
print(f"SRMR  = {fit.model_fit().srmr():.4f}")
print(f"d_ULS = {fit.model_fit().d_uls():.4f}")

What to look for:

SAT and QUAL should both have R squared above 0.7 (this is a well-fitting textbook model).
HTMT values should mostly sit below 0.85.
SRMR is typically around 0.05 to 0.07 here, comfortably below the 0.08 conventional threshold.
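
For intuition about what the 0.85 threshold is compared against, here is a minimal sketch of the HTMT ratio for one pair of indicator blocks, using plain pandas on synthetic data. The helper htmt_pair and the column names a1..a3, b1..b3 are hypothetical and not part of openpls; fit.htmt() may differ in details such as whether absolute correlations are used.

```python
import numpy as np
import pandas as pd


def htmt_pair(data: pd.DataFrame, block_a: list[str], block_b: list[str]) -> float:
    """HTMT ratio for two indicator blocks (hypothetical helper, not openpls API)."""
    corr = data[block_a + block_b].corr()

    # Heterotrait-heteromethod: mean correlation between indicators of different LVs.
    hetero = corr.loc[block_a, block_b].to_numpy().mean()

    def mean_within(block: list[str]) -> float:
        # Monotrait-heteromethod: mean off-diagonal correlation inside one block.
        sub = corr.loc[block, block].to_numpy()
        return sub[np.triu_indices(len(block), k=1)].mean()

    # Ratio of the between-block mean to the geometric mean of the within-block means.
    return hetero / np.sqrt(mean_within(block_a) * mean_within(block_b))


# Synthetic data: two weakly related latent factors, three indicators each.
rng = np.random.default_rng(0)
n = 500
f1 = rng.normal(size=n)
f2 = 0.3 * f1 + rng.normal(size=n)
data = pd.DataFrame({
    **{f"a{i}": f1 + 0.5 * rng.normal(size=n) for i in range(1, 4)},
    **{f"b{i}": f2 + 0.5 * rng.normal(size=n) for i in range(1, 4)},
})

ratio = htmt_pair(data, ["a1", "a2", "a3"], ["b1", "b2", "b3"])
print(f"HTMT(a, b) = {ratio:.3f}")
```

Because the two simulated factors are only weakly correlated, the ratio lands well below 0.85. Blocks that measure nearly the same construct would push it toward 1.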

2. IPMA on the satisfaction model

For the SAT target, IPMA tells you which predecessor LVs combine high importance (a large total effect on SAT) with low performance (a low rescaled score). Those LVs are the candidates for management intervention.

ipma = fit.ipma(target="SAT", scale_min=1.0, scale_max=10.0)

print("LV-level importance and performance for SAT:")
print(ipma.latent_variables().round(3))

print("\nIndicator-level breakdown:")
print(ipma.indicators().round(3))

scale_min=1.0 and scale_max=10.0 match the ECSI 10-point Likert scale used in the data. If you omit both, each indicator is rescaled from its observed min and max instead.
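
To see why the choice matters, here is a minimal sketch of the usual 0-100 min-max rescaling, run once with the theoretical scale bounds and once with the observed ones. The helper rescale_0_100 and the five answers are made up for illustration, and it is an assumption that openpls applies exactly this formula internally.

```python
import pandas as pd


def rescale_0_100(values: pd.Series, scale_min=None, scale_max=None) -> pd.Series:
    """Min-max rescale to 0-100 (hypothetical helper, not openpls API)."""
    # Fall back to the observed range when no theoretical bound is given.
    lo = values.min() if scale_min is None else scale_min
    hi = values.max() if scale_max is None else scale_max
    return (values - lo) / (hi - lo) * 100.0


# Made-up answers on a 1-10 scale; nobody used the bottom of the scale.
answers = pd.Series([4, 6, 7, 9, 10], name="sat1")

theoretical = rescale_0_100(answers, scale_min=1.0, scale_max=10.0)
observed = rescale_0_100(answers)

print(f"mean with scale bounds 1-10:  {theoretical.mean():.2f}")
print(f"mean with observed min/max:   {observed.mean():.2f}")
```

With observed bounds the lowest answer is always mapped to 0, so performance looks worse than it does against the full 1-10 scale. Passing the theoretical bounds keeps performance comparable across indicators and samples.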

The LV table shows importance (standardized total effect) and performance (mean of the 0-100-rescaled LV score). Plot performance against importance to get the four-quadrant IPMA chart: high importance plus low performance is the high-leverage quadrant.
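
The quadrant logic can be sketched without a plot. The table below is made up for illustration, and its column names importance and performance are assumptions rather than the guaranteed output of ipma.latent_variables(). Splitting at the mean of each axis is one common convention, not the only one.

```python
import pandas as pd

# Made-up importance/performance values, for illustration only.
table = pd.DataFrame(
    {
        "importance": [0.15, 0.40, 0.55, 0.20],
        "performance": [72.0, 58.0, 66.0, 61.0],
    },
    index=["IMAG", "EXPE", "QUAL", "VAL"],
)

# Split each axis at its mean to define the four quadrants.
imp_cut = table["importance"].mean()
perf_cut = table["performance"].mean()


def quadrant(row: pd.Series) -> str:
    high_imp = row["importance"] >= imp_cut
    high_perf = row["performance"] >= perf_cut
    if high_imp and not high_perf:
        return "improve first"      # high leverage: matters a lot, scores low
    if high_imp and high_perf:
        return "keep up"
    if not high_imp and high_perf:
        return "possible overkill"
    return "low priority"


table["quadrant"] = table.apply(quadrant, axis=1)
print(table)
```

For the actual chart, scatter performance against importance and draw the two cut lines; the labels above correspond to the four regions.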

3. Moderation: IMAG x EXPE -> SAT

Does the effect of Customer Expectations on Satisfaction depend on Image? Two-stage moderation fits a base model, then refits with the standardized product of IMAG and EXPE scores as a single-indicator interaction LV.

from openpls.moderation import Moderation

mod = Moderation(
    satisfaction,
    config,
    predictor="EXPE",
    moderator="IMAG",
    target="SAT",
)

print(f"Interaction LV name: {mod.interaction_name}")
print("\nBase model paths into SAT:")
base_paths = mod.base().path_coefficients().loc["SAT"]
print(base_paths[base_paths != 0].round(3))

print("\nRefit paths into SAT (with interaction):")
refit_paths = mod.refit().path_coefficients().loc["SAT"]
print(refit_paths[refit_paths != 0].round(3))

print("\nInteraction effect (interaction -> SAT):")
print(mod.interaction_effect().round(4))

A non-zero estimate on the interaction path means the EXPE-to-SAT effect varies with the level of IMAG: with a positive sign, EXPE affects SAT more strongly when IMAG is high; with a negative sign, more weakly. The OLS-derived t and p values in interaction_effect() are reported for convenience only; for inference, bootstrap the refit yourself with LongBootstrap.
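
To make the interpretation concrete, here is a minimal numpy sketch of the second stage on synthetic scores: build the standardized product term, regress the outcome on both predictors plus the product, and read off the slope of EXPE at low and high IMAG. The scores and effect sizes are simulated, nothing here calls openpls, and Moderation may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 250


def standardize(x: np.ndarray) -> np.ndarray:
    return (x - x.mean()) / x.std()


# Synthetic LV scores standing in for the stage-one (base model) output.
imag = standardize(rng.normal(size=n))
expe = standardize(0.5 * imag + rng.normal(size=n))

# Stage two: the standardized product is the single interaction indicator.
product = imag * expe
inter = standardize(product)

# Simulated outcome with a known positive interaction of 0.2.
sat = 0.4 * expe + 0.3 * imag + 0.2 * inter + rng.normal(scale=0.5, size=n)

# OLS of the outcome on intercept, both predictors, and the interaction.
X = np.column_stack([np.ones(n), expe, imag, inter])
beta, *_ = np.linalg.lstsq(X, sat, rcond=None)
b0, b_expe, b_imag, b_inter = beta

# Simple slopes: d(sat)/d(expe) = b_expe + b_inter * imag / sd(product).
s = product.std()
slope_low = b_expe + b_inter * (-1.0) / s   # IMAG one SD below its mean
slope_high = b_expe + b_inter * (+1.0) / s  # IMAG one SD above its mean

print(f"interaction estimate: {b_inter:.3f}")
print(f"EXPE slope at low IMAG:  {slope_low:.3f}")
print(f"EXPE slope at high IMAG: {slope_high:.3f}")
```

The division by sd(product) appears because the product term was standardized before entering the regression. With a positive interaction the slope at high IMAG exceeds the slope at low IMAG.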

4. FIMIX-PLS segmentation

Are there hidden subgroups in the data whose structural paths differ from the pooled estimates? FIMIX fits a mixture of K classes by EM, with K set through n_classes, and uses multiple random restarts to reduce the risk of stopping at a poor local optimum.

fmx = fit.fimix(n_classes=3, n_restarts=5, seed=42)

print("Class sizes (mixture proportions):")
print(fmx.class_sizes().round(3))

print("\nHard class assignments (first 10 respondents):")
print(fmx.hard_assignments().head(10))

print("\nFit criteria (lower is better for AIC/BIC; higher is better for EN):")
print(fmx.fit_criteria().round(2))

print("\nClass-specific path coefficients (first 10 rows):")
print(fmx.class_paths().head(10).round(3))

To choose K, run FIMIX for K = 2, 3, 4, … and compare the BIC column from fit_criteria(): the K with the lowest BIC is usually the best-supported choice. The normalized entropy EN should stay close to 1; if it drops below about 0.5, the classes are not well separated and the segmentation is not trustworthy, regardless of which K the information criteria favor.
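
As a rough illustration of both checks, the sketch below computes the normalized entropy from a matrix of posterior class probabilities and then picks K from a small criteria table. The probabilities and the BIC/EN numbers are made up, normalized_entropy is a hypothetical helper, and the exact layout of fit_criteria() is an assumption.

```python
import numpy as np
import pandas as pd


def normalized_entropy(posteriors: np.ndarray) -> float:
    """EN = 1 - sum(-p * ln p) / (N * ln K) for an N x K probability matrix."""
    n, k = posteriors.shape
    p = np.clip(posteriors, 1e-12, 1.0)  # guard against log(0)
    return 1.0 - float(-(p * np.log(p)).sum() / (n * np.log(k)))


# Made-up posteriors: near-certain assignments versus near-coin-flips.
crisp = np.array([[0.98, 0.02], [0.03, 0.97], [0.99, 0.01]])
fuzzy = np.array([[0.55, 0.45], [0.50, 0.50], [0.40, 0.60]])

en_crisp = normalized_entropy(crisp)
en_fuzzy = normalized_entropy(fuzzy)
print(f"EN crisp = {en_crisp:.3f}, EN fuzzy = {en_fuzzy:.3f}")

# Made-up criteria for K = 2, 3, 4, as if collected from separate runs.
criteria = pd.DataFrame(
    {"BIC": [1530.2, 1498.7, 1512.4], "EN": [0.71, 0.64, 0.43]},
    index=pd.Index([2, 3, 4], name="K"),
)

# Drop poorly separated solutions first, then take the lowest BIC.
candidates = criteria[criteria["EN"] >= 0.5]
best_k = candidates["BIC"].idxmin()
print(f"chosen K = {best_k}")
```

Filtering on EN before ranking by BIC encodes the rule above: a solution with badly separated classes is excluded even if its information criteria look good.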

Once you have hard assignments, you can rerun the model on each subgroup to inspect the per-class structural model with full PLS-SEM reporting:

labels = fmx.hard_assignments()
for k in sorted(labels.unique()):
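    # Positional mask, so this works even if labels and satisfaction carry different indexes.
    sub = satisfaction.loc[labels.to_numpy() == k].reset_index(drop=True)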
    sub_fit = Plspm(sub, config, Scheme.CENTROID)
    print(f"\n--- class {k}, n={len(sub)} ---")
    print(sub_fit.path_coefficients().round(3))