PHREEQC from Scratch #9: Ionic Strength and Activity Coefficients — Why Seawater and Freshwater Yield Different Results

Unveil what PHREEQC is doing behind the scenes. We thoroughly explain the Debye-Hückel, Davies, and Pitzer models, the true nature of ionic strength, and why the same pH yields different results in different waters.
PHREEQC
Python
Geochemistry
Thermodynamics
English
Author

DeepFlow

Published

May 5, 2026

Introduction: The “Unsolved Mystery” Since Part 2

From analyzing seawater speciation in Part 2 to calculating Gibbsite solubility in Part 7, a single underlying question has likely lingered in your mind throughout these calculations:

“Why do seawater and freshwater yield different calculation results even when the pH and temperature are identical?”

The answer lies in Activity. PHREEQC does not use concentration (mol/kgw) directly to calculate chemical equilibrium; instead, it uses activity, which is concentration multiplied by a “correction factor.” This correction factor is known as the activity coefficient, \(\gamma\) (gamma).

The Master Equation of this Series
ai = γi · mi
ai: Activity of species i (dimensionless)
γi: Activity coefficient (0 < γ ≤ 1)
mi: Molality (mol/kgw)

In dilute solutions, \(\gamma \approx 1\), meaning activity \(\approx\) concentration. However, in seawater (which has an ionic strength \(I \approx 0.7\) mol/kgw), \(\gamma\) drops below \(0.1\) for certain species. This creates an error that cannot be ignored.

NoteWhat You Will Learn in This Article
  • The definition and calculation of Ionic Strength (\(I\))
  • The mechanics and applications of the four activity models: Debye-Hückel, Extended Debye-Hückel, Davies, and Pitzer
  • How to configure activity coefficient models in PHREEQC (-activity_coefficients)
  • Python code to compare results across the four models for the same solution
  • A decision flow for choosing the right model

Theory Part 1: What is Ionic Strength?

Definition

Ionic strength (\(I\)) is a measure of the “charge-weighted concentration” of all ions in a solution:

\[I = \frac{1}{2} \sum_i m_i z_i^2\]

Where \(m_i\) is the molality of ion \(i\) (mol/kgw), and \(z_i\) is the valence (charge) of ion \(i\).

Water Type Composition I (mol/kg) Applicable Model
Distilled / Rainwater Nearly pure water < 0.001 Debye-Hückel
River / Groundwater Ca²⁺, HCO₃⁻ dominant 0.001–0.05 Extended Debye-Hückel
Brackish / Hot Springs Increased Na⁺, Cl⁻ 0.05–0.5 Davies
Seawater NaCl ≈ 0.5 M ≈ 0.7 Davies / Pitzer
Salt Lakes / Brines Hyper-saline > 1 Pitzer required

Hand-Calculating the Ionic Strength of Seawater

# Major Ion Composition of Seawater (mol/kgw)
Na⁺ = 0.4689, z = +1 → 0.4689 × 1² = 0.4689
Cl⁻ = 0.5453, z = −1 → 0.5453 × 1² = 0.5453
Mg²⁺ = 0.0528, z = +2 → 0.0528 × 4 = 0.2112
SO₄²⁻= 0.0283, z = −2 → 0.0283 × 4 = 0.1132
Ca²⁺ = 0.0103, z = +2 → 0.0103 × 4 = 0.0412
K⁺ = 0.0102, z = +1 → 0.0102 × 1 = 0.0102
──────────────────────────────────────
Sum = 1.3900
I = 1.3900 / 2 = 0.695 mol/kg

Theory Part 2: The Four Activity Coefficient Models

Model 1: Debye-Hückel (DH)

The simplest theoretical equation, considering only the Coulombic attraction between ions:

\[\log \gamma_i = -A z_i^2 \sqrt{I}\]

\(A = 0.509\) (for water at 25°C). This is valid only for extremely dilute solutions where \(I < 0.005\) mol/kg.

Model 2: Extended Debye-Hückel (LLNL form)

An improved version that incorporates the effective ionic radius (\(a_i\)):

\[\log \gamma_i = \frac{-A z_i^2 \sqrt{I}}{1 + B a_i \sqrt{I}}\]

Where \(a_i\) is the effective radius (in Å) and \(B = 0.328\). Valid for \(I < 0.1\) mol/kg. This model is used by the standard PHREEQC databases (phreeqc.dat, llnl.dat).

Model 3: Davies

Adds an empirical correction term for higher ionic strengths:

\[\log \gamma_i = -A z_i^2 \left( \frac{\sqrt{I}}{1 + \sqrt{I}} - 0.3 I \right)\]

Valid up to \(I < 0.5\) mol/kg. It is highly suitable for brackish waters and soil waters. In PHREEQC, you can manually specify it using -activity_model davies.

Model 4: Pitzer

A highly accurate model using empirically derived interaction parameters (\(\beta^{(0)}\), \(\beta^{(1)}\), \(C^{\phi}\)) for specific ion pairs:

\[\ln \gamma_{\pm} = f(I) + \sum_{ij} B_{ij}(I) m_j + \sum_{ijk} C_{ijk} m_j m_k + \cdots\]

Mandatory for brines, salt lakes, and geothermal fluids where \(I > 0.5\) mol/kg. In PHREEQC, you must load the pitzer.dat database to use this model.


Comparing the Four Models with Python

The code below performs calculations using phreeqc.dat and llnl.dat.

However, an error occurs with phreeqc.dat when using Phreeqc Interactive 3.8.6-17100. Therefore, it is recommended to use wateq4f.dat or the newly added PHREEQC_ThermoddemV1.10_15Dec2020.dat instead. (Note that this issue does not occur in Phreeqc Interactive 3.7.3-15968.)

The following code assumes the use of Phreeqc Interactive 3.8.6-17100 and utilizes wateq4f.dat. Please pay close attention to the database file paths on your system.

For example, if llnl.dat is located in a folder named Phreeqc on your Desktop, the path would be specified as follows:

r"C:\Users\Desktop\Phreeqc\llnl.dat"

Please modify the file paths for DB_PHREEQC and DB_LLNL in the code below accordingly.

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from phreeqpy.iphreeqc.phreeqc_dll import IPhreeqc

# ==============================
# 1. Database Paths Configuration
# ==============================
# ***Please modify these to match your local paths***
DB_PHREEQC = r"C:\...\wateq4f.dat" 
DB_LLNL    = r"C:\...\llnl.dat"    

# ==============================
# 2. PHREEQC Input String
# ==============================
# We add 1.1 moles to ensure ionic strength exceeds 1.0
input_string = """
SOLUTION 1 Pure water
    temp 25
    pH 7 charge

REACTION 1
    NaCl 1.1
    1.1 moles in 100 steps

SELECTED_OUTPUT
    -reset false
    -user_punch true

USER_PUNCH
    -headings I Gamma_Na
    -start
    10 PUNCH MU, GAMMA("Na+")
    -end
"""

# ==============================
# 3. PHREEQC Execution Function
# ==============================
def run_phreeqc(db_path, label):
    db_path = os.path.abspath(db_path)
    print(f"Loading DB: {os.path.basename(db_path)} ...", end=" ")

    if not os.path.exists(db_path):
        raise FileNotFoundError(f"\n❌ Database not found: {db_path}")

    ip = IPhreeqc()

    try:
        ip.load_database(db_path)
    except Exception as e:
        raise RuntimeError(f"\n❌ Python Exception during load_database: {e}")

    error_str = ip.get_error_string()
    if error_str:
        raise RuntimeError(f"\n❌ Failed to load {label} DB (DLL Error):\n{error_str}")
    
    print("OK")

    try:
        ip.run_string(input_string)
    except Exception as e:
        error_str = ip.get_error_string()
        raise RuntimeError(f"\n❌ PHREEQC execution error for {label}:\n{e}\n{error_str}")
    
    out = ip.get_selected_output_array()
    
    if len(out) > 1:
        df = pd.DataFrame(out[1:], columns=out[0]).astype(float)
        return df
    return pd.DataFrame()

# ==============================
# 4. Theoretical Calculations
# ==============================
def calc_theoretical(I):
    A = 0.509  
    z = 1.0    

    gamma_dh = 10 ** (-A * (z**2) * np.sqrt(I))
    gamma_davies = 10 ** (-A * (z**2) * (np.sqrt(I) / (1 + np.sqrt(I)) - 0.3 * I))

    return gamma_dh, gamma_davies

# ==============================
# 5. Execution and Data Collection
# ==============================
if __name__ == "__main__":
    try:
        df_phreeqc = run_phreeqc(DB_PHREEQC, "PHREEQC")
        df_llnl    = run_phreeqc(DB_LLNL, "LLNL")

        # Exclude Step 0 (pure water) where gamma output might be 0
        df_phreeqc = df_phreeqc[df_phreeqc['Gamma_Na'] > 0]
        df_llnl    = df_llnl[df_llnl['Gamma_Na'] > 0]

        I_theory = np.linspace(0.001, 1.0, 100)
        gamma_dh, gamma_davies = calc_theoretical(I_theory)

        # ==============================
        # 6. Plotting
        # ==============================
        plt.rcParams['font.family'] = 'sans-serif'
        plt.rcParams['axes.unicode_minus'] = False

        fig, ax = plt.subplots(figsize=(8, 5))
        ax.set_facecolor('#F9FAFB')
        ax.grid(color='#E5E7EB', linestyle='-', linewidth=0.8)
        for spine in ax.spines.values():
            spine.set_color('#9CA3AF')

        ax.axhline(1.0, color='#D97706', linestyle='--', alpha=0.5, label='_nolegend_')
        ax.axvline(0.7, color='#DC2626', linestyle='--', alpha=0.5, label='_nolegend_')
        ax.text(0.71, 0.95, 'Seawater I ≈ 0.7', color='#DC2626', style='italic', fontsize=10)

        # Plot Data
        ax.plot(I_theory, gamma_dh, color='#6B7280', linestyle=':', linewidth=2, label='Debye-Hückel (Theoretical)')
        ax.plot(I_theory, gamma_davies, color='#16A34A', linestyle='--', linewidth=2, label='Davies (Theoretical)')
        ax.plot(df_phreeqc['I'], df_phreeqc['Gamma_Na'], color='#2563EB', linestyle='-', linewidth=2, label='WATEQ Eq (Extended D-H (wateq4f.dat))')
        ax.plot(df_llnl['I'], df_llnl['Gamma_Na'], color='#CA8A04', linestyle='-', linewidth=2, label='B-dot Eq (llnl.dat)')

        ax.set_xlabel('Ionic Strength I (mol/kgw)', fontsize=12, color='#374151')
        ax.set_ylabel(r'Activity Coefficient $\gamma$ (Na$^+$, z=1)', fontsize=12, color='#374151')
        
        ax.set_xlim(0, 1.0)
        ax.set_ylim(0, 1.05)
        ax.tick_params(colors='#6B7280')

        ax.legend(loc='lower right', facecolor='white', framealpha=0.9, edgecolor='#E5E7EB', fontsize=10)
        plt.title('Comparison of Activity Coefficient Models (25°C, Na⁺)', color='#374151', pad=15)

        plt.tight_layout()
        plt.savefig('Activity_Coefficients.png', dpi=300)
        plt.savefig('Activity_Coefficients.svg', bbox_inches='tight')
        plt.show()

        print("\n✅ Saved as 'Activity_Coefficients.png / .svg'")

    except Exception as e:
        print(f"\n🚨 Error occurred during execution:\n{e}")

Calculation results

As ionic strength increases, the divergence between the activity coefficient models becomes significant.


Verifying Activity Coefficients in PHREEQC

You can back-calculate the exact \(\gamma\) values from SELECTED_OUTPUT using -activities and -totals:

# Checking Activity Coefficients
SOLUTION 1
    temp  25
    pH    8.22
    units mol/kgw
    Na    0.4689
    Cl    0.5453
    Mg    0.0528
    Ca    0.0103
    K     0.0102
    Alkalinity 2.3e-3 as HCO3
    S(6)  0.0283

SELECTED_OUTPUT 1
    -file     activity_check.txt
    -totals   Na Ca Mg Cl S(6) C(4)
    -activities Na+ Ca+2 Mg+2 Cl- SO4-2 HCO3-

USER_PUNCH 1
    -headings  Species  Molality  Activity  Gamma
    -start
    10 PUNCH "Na+",  MOL("Na+"),  ACT("Na+"),  ACT("Na+") /MOL("Na+")
    20 PUNCH "Ca+2", MOL("Ca+2"), ACT("Ca+2"), ACT("Ca+2")/MOL("Ca+2")
    30 PUNCH "Mg+2", MOL("Mg+2"), ACT("Mg+2"), ACT("Mg+2")/MOL("Mg+2")
    40 PUNCH "Cl-",  MOL("Cl-"),  ACT("Cl-"),  ACT("Cl-") /MOL("Cl-")
    50 PUNCH "SO4-2",MOL("SO4-2"),ACT("SO4-2"),ACT("SO4-2")/MOL("SO4-2")
END

Interpreting the Results

Species Valence z Molality m Activity a Gamma γ Meaning
Na⁺ ±1 0.4689 0.361 0.77 Activity is 77% of concentration.
Ca²⁺ ±2 0.0103 0.00284 0.28 Activity is only 28% of conc!
Mg²⁺ ±2 0.0528 0.0148 0.28 Divalent ions are suppressed heavily.
SO₄²⁻ ±2 0.0283 0.0071 0.25 Only effectively "acts" as 1/4th.
NoteWhy is \(\gamma\) so low for Divalent (Charge=2) Ions?

Because the Debye-Hückel equation contains \(z^2\), a divalent ion is suppressed by a factor of 4 compared to a monovalent ion. It may seem counterintuitive that sulfate in seawater behaves as if only 1/4th of it is present, but this is exactly why neglecting activity leads to massive errors. The solubility product of Calcite (\(K_{sp}\)) is defined as \(a_{\mathrm{Ca^{2+}}} \cdot a_{\mathrm{CO_3^{2-}}}\). Evaluating it strictly by concentration leads to fundamentally flawed thermodynamic interpretations.


Visualizing Activity vs. Ionic Strength in Python

# ============================================================
#  activity_coeff_plot.py
# ============================================================
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams.update({
    "font.family": "sans-serif",
    "axes.unicode_minus": False,
    "figure.dpi": 150,
})

# ---- Debye-Hückel Parameters (25°C, Water) ----
A = 0.509   
B = 0.328e8 # Careful with units

# ---- Calculating Gamma across Models ----
I = np.linspace(0, 1.0, 500)

def gamma_dh(I, z=1):
    """Simple Debye-Hückel"""
    return 10 ** (-A * z**2 * np.sqrt(I))

def gamma_edh(I, z=1, a_param=4.0):
    """Extended Debye-Hückel"""
    B_prime = 0.328  
    return 10 ** (-A * z**2 * np.sqrt(I) / (1 + B_prime * a_param * np.sqrt(I)))

def gamma_davies(I, z=1):
    """Davies Model"""
    return 10 ** (-A * z**2 * (np.sqrt(I)/(1+np.sqrt(I)) - 0.3*I))

def gamma_pitzer_approx(I, z=1):
    """Pitzer Approximation (fitted to NaCl empirical data)"""
    if z == 1:
        return 10 ** (-0.509 * np.sqrt(I) / (1 + 1.316 * np.sqrt(I)) + 0.09 * I)
    else:
        return 10 ** (-0.509 * z**2 * np.sqrt(I) / (1 + 1.316 * np.sqrt(I)) + 0.09 * I * z**2 * 0.3)

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# ---- Left: Comparing Models for z=1 ----
ax = axes[0]
ax.plot(I, gamma_dh(I, 1),          color="#6B7280", lw=1.8, ls="--", label="Debye-Hückel")
ax.plot(I, gamma_edh(I, 1),         color="#CA8A04", lw=2.0,          label="Extended D-H (a=4Å)")
ax.plot(I, gamma_davies(I, 1),      color="#16A34A", lw=2.0, ls="-.", label="Davies")
ax.plot(I, gamma_pitzer_approx(I,1),color="#2563EB", lw=2.5,          label="Pitzer (Approx)")
ax.axvline(0.7, color="#DC2626", lw=1, ls=":", alpha=0.7)
ax.text(0.72, 0.95, "Seawater\nI≈0.7", color="#DC2626", fontsize=9)
ax.set(xlim=(0,1), ylim=(0,1.05), xlabel="Ionic Strength I (mol/kg)",
       ylabel=r"$\gamma$", title="Models for Monovalent Ions (z = ±1)")
ax.legend(fontsize=9); ax.grid(True, ls="--", lw=0.5, color="#E5E7EB")

# ---- Right: Comparing Valences (Extended DH) ----
ax = axes[1]
for z, color, label in [(1,"#16A34A","z = ±1 (Na⁺, Cl⁻)"),
                         (2,"#EA580C","z = ±2 (Ca²⁺, SO₄²⁻)"),
                         (3,"#DC2626","z = ±3 (Al³⁺, PO₄³⁻)")]:
    ax.plot(I, gamma_edh(I, z), color=color, lw=2.2, label=label)
ax.axvline(0.7, color="#9CA3AF", lw=1, ls=":", alpha=0.7)
ax.set(xlim=(0,1), ylim=(0,1.05), xlabel="Ionic Strength I (mol/kg)",
       ylabel=r"$\gamma$", title="Effect of Valence (Extended D-H)")
ax.legend(fontsize=9); ax.grid(True, ls="--", lw=0.5, color="#E5E7EB")

plt.suptitle("Dependence of Activity Coefficient γ on Ionic Strength (25°C)", fontsize=13, y=1.02)
plt.tight_layout()
plt.savefig("activity_coefficient.svg", bbox_inches="tight")
plt.show()

Activity Coefficients vs. Ionic Strength

Summary: The Importance of Recognizing “Activity”

Impact of Activity Coefficient Errors on Solubility Calculations Freshwater I < 0.1 γ(Ca²⁺) ≈ 0.65–1.0 Error ≤ 5% Ext. D-H is sufficient Brackish / Hot Springs I = 0.1–0.5 γ(Ca²⁺) ≈ 0.30–0.65 Error 5–15% Davies Recommended Seawater / Brine I > 0.5 γ(Ca²⁺) ≈ 0.20–0.30 Error > 15% Pitzer Required
TipComing Up Next — Part 10: “Mastering the Saturation Index (SI)”

With a solid understanding of how to correctly compute activity, we can properly interpret the Saturation Index: \(SI = \log(IAP/K_{sp})\) for various minerals. If \(SI > 0\), precipitation occurs; if \(SI < 0\), dissolution happens. It turns out that every calculation we made from Part 2 to Part 7 was secretly relying on this exact logic!

By completing this article, the persistent “why?” that has been building up since Part 2 should finally be resolved. Knowing that PHREEQC automatically handles these complex corrections under the hood will entirely change how you read your simulation results.


References

Appelo, CAJ, and Dieke Postma. 2005. Geochemistry, Groundwater and Pollution. Second. Balkema, Rotterdam, p. 634.
Parkhurst, David L, and CAJ Appelo. 2013. Description of Input and Examples for PHREEQC Version 3—a Computer Program for Speciation, Batch-Reaction, One-Dimensional Transport, and Inverse Geochemical Calculations. US Geological Survey Techniques; Methods, book 6, chap. A43, 497 p.
Yamamoto, S. 1983. Method of the Groundwater Survey. Kokon Shoin, Tokyo (in Japanese), 490 p.
Yang, Heejun, T Mishima, S Katazakai, and M Kagabu. 2023. “Analytical Approach Using a Chemical Equilibrium Formula and Geochemical Modeling for Alkalinity Measurements of Small Natural Water Samples.” Applied Geochemistry 148: 105535.

DeepFlow | Science beneath the surface