Quickstart
This walks through a complete PLS-SEM fit on the ECSI customer-satisfaction model: 6 latent variables, 27 indicators, around 250 observations. The dataset ships with the test suite as tests/data/satisfaction.csv.
If you do not have it locally, you can grab it from the repo: tests/data/satisfaction.csv.
1. Load the data
The CSV has one row per respondent. Indicators are named by construct: imag1, imag2, … for the Image LV; expe1, … for Customer Expectations; etc. The first column is a row index.
import pandas as pd
satisfaction = pd.read_csv("tests/data/satisfaction.csv", index_col=0)
print(satisfaction.shape)
# (250, 28)
2. Define the structural model
The structural (inner) model says which LV affects which. In ECSI:
- IMAG (Image) feeds into Expectations, Satisfaction, and Loyalty.
- EXPE (Expectations) feeds into Quality, Value, and Satisfaction.
- QUAL (Quality) feeds into Value and Satisfaction.
- VAL (Value) feeds into Satisfaction.
- SAT (Satisfaction) feeds into Loyalty.
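The bullets above describe a directed graph, which can be written down as a square 0/1 path matrix. The following is a minimal pandas sketch that does not use openpls. It assumes the convention that rows are targets and columns are sources, which may differ from the layout openpls uses internally; the names paths and path_matrix are illustrative only.

```python
import pandas as pd

# Source LV -> list of target LVs, copied from the bullet list above.
paths = {
    "IMAG": ["EXPE", "SAT", "LOY"],
    "EXPE": ["QUAL", "VAL", "SAT"],
    "QUAL": ["VAL", "SAT"],
    "VAL": ["SAT"],
    "SAT": ["LOY"],
}
lvs = ["IMAG", "EXPE", "QUAL", "VAL", "SAT", "LOY"]

# Assumed convention: path_matrix.loc[target, source] == 1 when source -> target.
path_matrix = pd.DataFrame(0, index=lvs, columns=lvs)
for source, targets in paths.items():
    for target in targets:
        path_matrix.loc[target, source] = 1

print(path_matrix)
# Row sums give the number of predictors per LV; exogenous LVs have none.
print(path_matrix.sum(axis=1))
```

With the LVs ordered as above, every 1 falls below the diagonal, which is a quick check that the model has no cycles.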
Structure builds this from add_path calls, then Config consumes the resulting path matrix.
import openpls.config as c
from openpls.mode import Mode
structure = c.Structure()
structure.add_path(["IMAG"], ["EXPE", "SAT", "LOY"])
structure.add_path(["EXPE"], ["QUAL", "VAL", "SAT"])
structure.add_path(["QUAL"], ["VAL", "SAT"])
structure.add_path(["VAL"], ["SAT"])
structure.add_path(["SAT"], ["LOY"])
3. Attach indicators to LVs
Each LV needs its measurement model. With the ECSI naming convention (lowercase prefix matching the LV name), add_lv_with_columns_named is the shortcut: it picks up every column starting with that prefix. All six LVs here are reflective (Mode A).
config = c.Config(structure.path(), scaled=False)
for lv in ["IMAG", "EXPE", "QUAL", "VAL", "SAT", "LOY"]:
    config.add_lv_with_columns_named(lv, Mode.A, satisfaction, lv.lower())
If your indicators do not share a prefix, use Config.add_lv(lv_name, Mode.A, MV("col1"), MV("col2"), ...) instead. See the API reference for details.
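To see what prefix matching implies for your own column names, here is a small pandas sketch of the idea. It is not how openpls implements the lookup; columns_with_prefix is a hypothetical helper, and the toy frame (including the satisfied_flag column) is made up for illustration.

```python
import pandas as pd

# Toy frame with ECSI-style column names plus one unrelated column.
df = pd.DataFrame(
    {
        "imag1": [5, 6],
        "imag2": [7, 6],
        "sat1": [8, 7],
        "sat2": [9, 7],
        "satisfied_flag": [1, 0],  # hypothetical column that also starts with "sat"
    }
)


def columns_with_prefix(frame: pd.DataFrame, prefix: str) -> list[str]:
    """Return every column whose name starts with `prefix` (illustrative helper)."""
    return [col for col in frame.columns if col.startswith(prefix)]


print(columns_with_prefix(df, "imag"))  # ['imag1', 'imag2']
print(columns_with_prefix(df, "sat"))   # ['sat1', 'sat2', 'satisfied_flag']
```

The second call shows the risk: any column that happens to share the prefix is picked up as an indicator. If your data has such columns, listing the indicators explicitly is the safer route.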
4. Fit the model
from openpls import Plspm
from openpls.scheme import Scheme
result = Plspm(satisfaction, config, Scheme.CENTROID)
Plspm runs the full PLS algorithm and computes the standard metrics eagerly. Q squared, IPMA, PLSpredict, moderation, and FIMIX are computed lazily on demand (next sections).
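The difference between eager and lazy metrics can be illustrated with a generic Python pattern. This sketch says nothing about how openpls is actually implemented; LazyResult and its attributes are hypothetical, and functools.cached_property is just one common way to defer work until first access.

```python
from functools import cached_property


class LazyResult:
    """Hypothetical result object: cheap metrics eager, expensive ones on demand."""

    def __init__(self, values):
        self.values = list(values)
        # Eager: computed once, at construction time.
        self.mean = sum(self.values) / len(self.values)
        self.expensive_calls = 0

    @cached_property
    def spread(self):
        # Lazy: runs on first access only, then the result is cached on the instance.
        self.expensive_calls += 1
        return max(self.values) - min(self.values)


res = LazyResult([1.0, 4.0, 7.0])
print(res.expensive_calls)     # 0
print(res.spread, res.spread)  # 6.0 6.0
print(res.expensive_calls)     # 1
```

The practical consequence for a lazily computed metric is that the first call pays the computation cost and later calls are cheap.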
5. Inspect the inner model
print(result.inner_summary())
            type  r_squared  r_squared_adj  block_communality  mean_redundancy       ave
IMAG   Exogenous   0.000000       0.000000           0.582287         0.000000  0.582287
EXPE  Endogenous   0.335194       0.332514           0.563023         0.188704  0.563023
QUAL  Endogenous   0.719173       0.718041           0.660628         0.475327  0.660628
VAL   Endogenous   0.547778       0.544133           0.652035         0.357241  0.652035
SAT   Endogenous   0.706505       0.701696           0.756834         0.534452  0.756834
LOY   Endogenous   0.461894       0.457543           0.638674         0.295005  0.638674
For path coefficients (one row per endogenous LV, one column per LV that points into it):
print(result.path_coefficients())
For the long-format inner model with t and p values (these are OLS estimates; for bootstrap CIs use Plspm(..., bootstrap=True) or LongBootstrap):
print(result.inner_model())
6. Inspect the outer model
print(result.outer_model())
       weight  loading  communality  redundancy
imag1  0.2426   0.7167       0.5137      0.0000
imag2  0.1827   0.5797       0.3360      0.0000
imag3  0.3034   0.7710       0.5945      0.0000
imag4  0.2587   0.7401       0.5478      0.0000
imag5  0.2596   0.7596       0.5770      0.0000
...
For the cross-loadings matrix (indicator vs every LV):
print(result.crossloadings())
7. Discriminant validity (HTMT)
The Heterotrait-Monotrait ratio is the modern standard for discriminant validity. Pairs with HTMT below 0.85 (or 0.90 for conceptually similar constructs) are considered distinct.
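To make the ratio concrete, the sketch below computes HTMT for a single pair of constructs directly from indicator correlations: the mean correlation between indicators of different constructs, divided by the geometric mean of the average within-construct correlations. It uses raw (not absolute) correlations and synthetic data; htmt_pair is a hypothetical helper, and openpls may differ in details.

```python
import numpy as np
import pandas as pd


def htmt_pair(data: pd.DataFrame, block_a: list[str], block_b: list[str]) -> float:
    """HTMT for two constructs from raw indicator correlations (illustrative helper)."""
    corr = data.corr()
    # Heterotrait-heteromethod: correlations between indicators of different constructs.
    hetero = corr.loc[block_a, block_b].to_numpy().mean()

    def mean_within(block: list[str]) -> float:
        # Monotrait-heteromethod: off-diagonal correlations inside one construct.
        sub = corr.loc[block, block].to_numpy()
        return sub[np.triu_indices(len(block), k=1)].mean()

    return float(hetero / np.sqrt(mean_within(block_a) * mean_within(block_b)))


# Synthetic data: two latent scores correlated at 0.5, three noisy indicators each.
rng = np.random.default_rng(0)
n = 500
lv_a = rng.normal(size=n)
lv_b = 0.5 * lv_a + np.sqrt(1 - 0.5**2) * rng.normal(size=n)
columns = {}
for i in range(1, 4):
    columns[f"a{i}"] = lv_a + 0.6 * rng.normal(size=n)
    columns[f"b{i}"] = lv_b + 0.6 * rng.normal(size=n)
data = pd.DataFrame(columns)

value = htmt_pair(data, ["a1", "a2", "a3"], ["b1", "b2", "b3"])
print(round(value, 3))  # should land near the latent correlation of 0.5
```

Because the simulated constructs correlate at about 0.5, the ratio sits well below the 0.85 cutoff, so this pair would count as distinct.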
htmt = result.htmt()
print(htmt.matrix())  # square matrix
print(htmt.pairs())  # long-format pair list
8. Model fit (SRMR, d_ULS)
fit = result.model_fit()
print(fit.srmr())  # Standardized Root Mean Square Residual
print(fit.d_uls())  # unweighted least-squares discrepancy
SRMR below 0.08 is the conventional threshold for “good” model-data alignment in PLS-SEM.
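For intuition about what the SRMR number measures, here is a minimal NumPy sketch: the root mean squared difference between the observed and model-implied indicator correlation matrices, taken over the lower triangle including the diagonal. The srmr function below is a hypothetical helper; how openpls builds the model-implied matrix is not covered here, and the two small matrices are made up.

```python
import numpy as np


def srmr(observed: np.ndarray, implied: np.ndarray) -> float:
    """Root mean squared residual over the lower triangle, diagonal included."""
    observed = np.asarray(observed, dtype=float)
    implied = np.asarray(implied, dtype=float)
    rows, cols = np.tril_indices(observed.shape[0])
    residuals = observed[rows, cols] - implied[rows, cols]
    return float(np.sqrt(np.mean(residuals**2)))


# Made-up 2x2 correlation matrices: the model understates the correlation by 0.2.
observed = np.array([[1.0, 0.5], [0.5, 1.0]])
implied = np.array([[1.0, 0.3], [0.3, 1.0]])
print(round(srmr(observed, implied), 4))  # 0.1155
```

In this toy case the single residual of 0.2 is averaged over three unique elements, giving about 0.1155, which would fall above the 0.08 threshold.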
9. Predictive relevance (Stone-Geisser Q squared)
print(result.q_squared())
Returns one Q squared per endogenous LV via blindfolding. Q squared above zero means the model has predictive relevance for that LV.
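The statistic itself is a simple ratio, Q squared = 1 - SSE / SSO, where SSE sums the squared errors of the model's predictions for the omitted data points and SSO sums the squared errors of a naive mean prediction. The sketch below shows only that final ratio and skips the blindfolding omission rounds entirely; q_squared here is a hypothetical standalone function, not the openpls method, and the numbers are made up.

```python
import numpy as np


def q_squared(actual, predicted, baseline) -> float:
    """Q squared = 1 - SSE / SSO for a set of omitted data points (illustrative helper)."""
    actual = np.asarray(actual, dtype=float)
    # SSE: squared errors of the model's predictions.
    sse = np.sum((actual - np.asarray(predicted, dtype=float)) ** 2)
    # SSO: squared errors of the naive mean-replacement prediction.
    sso = np.sum((actual - np.asarray(baseline, dtype=float)) ** 2)
    return float(1.0 - sse / sso)


actual = np.array([1.0, 2.0, 3.0, 4.0])
baseline = np.full(4, actual.mean())  # naive prediction: the mean

print(q_squared(actual, actual, baseline))                # 1.0, perfect prediction
print(q_squared(actual, baseline, baseline))              # 0.0, no better than the mean
print(q_squared(actual, [1.5, 2.5, 2.5, 3.5], baseline))  # 0.8
```

This is why zero is the reference point: a positive value means the model predicts the omitted points better than the mean does, and a negative value means it predicts them worse.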
What’s next
- Run a worked example end to end, including IPMA, moderation, or FIMIX.
- Read the API reference for the full surface.
- If you are new to PLS-SEM, Core concepts is the place to start.