Week 47

regression

calculation

data

plotting

Code

try:
    import fysisk_biokemi
    print("Already installed")
except ImportError:
    %pip install -q "fysisk_biokemi[colab] @ git+https://github.com/au-mbg/fysisk-biokemi.git"

Determination of reaction orders.

Code

import matplotlib.pyplot as plt
import numpy as np

(a) Load the dataset

Load the dataset deter-reacti-orders.xlsx using the widget below

from fysisk_biokemi.widgets import DataUploader
from IPython.display import display 
uploader = DataUploader()
uploader.display()

Run the next cell after uploading the file

df = uploader.get_dataframe()
display(df)

(b) SI units.

Add two new columns with the concentrations given in M.

# Make the new columns in this cell. 


display(df)

(c) Plotting

For each reactant make a plot of

\(t\) vs. \([\text{A}]\)

Note

We will be adding more elements to the plot in the following parts of the exercise, therefore we will wrap the code for plotting in a function such that it can be easily reused.

The cell below defines the function for plotting, add the missing code in places with ....

def plot_dataframe(ax, df):
    # Extract from dataframe
    t = df['Time_(s)']
    A1 = df['[A1]_(M)']
    A2 = df['[A2]_(M)']

    ax.plot(t, A1, 'o', label='[A1]')
    ... # Your code here to plot A2

    for ax in axes:
        ax.set_xlabel('t [s]')
        ax.legend()

Once we’ve written the plotting function we can make the plot as done in the cell below.

fig, ax = plt.subplots(1, 1, figsize=(6, 4))

plot_dataframe(ax, df)

(d) Determining reaction orders

Now we want to determine the reaction orders by making fits to the data under the assumption of different integrated rate laws.

In the cell below finish the code to define functions for the integrated rate laws of zeroth, first and second order.

def zeroth_order(t, k, A0):
    result = ...
    return result

def first_order(t, k, A0):
    result = ...
    return result

def second_order(t, k, A0):
    result = ...
    return result

rate_laws = {0: zeroth_order, 1: first_order, 2:second_order}

We can then use these to make fits, here it is helpful to define a small function to help us with that.

from scipy.optimize import curve_fit

def make_fit(x_data, y_data, x_eval, A0, order):
    # Don't worry about this line for now. 
    # It just picks the correct function for the given order and sets A0.
    func = lambda t, k: rate_laws[order](t, k, A0)

    # Make fit
    popt, pcov = curve_fit(func, x_data, y_data)

    # Evaluate fit
    k = popt[0]
    y_fit = func(x_eval, k)
    return k, y_fit

Next we will define the known constants and extract a few things from the DataFrame. In the cell below put the initial concentrations in M.

A1_0 = ... # Set the initial concentration [A1]_0
A2_0 = ... # Set the initial concentration [A2]_0

# Extract data
t = df['Time_(s)']
A1 = df['[A1]_(M)']
A2 = df['[A2]_(M)']
orders = [0, 1, 2]
t_eval = np.linspace(0, t.max()*1.1)

Now we can make and plot the fits

fig, axes = plt.subplots(1, 3, figsize=(8, 4), sharey=True, layout='constrained')

axes[0].set_ylabel('Concentration (M)')

for ax in axes:
    plot_dataframe(ax, df)

for ax, order in zip(axes, orders):
    k1, y_fit = make_fit(t, A1, t_eval, A1_0, order)
    ax.plot(t_eval, y_fit, color='C2', label=rf'k = {k1:.2e}')

    k2, y_fit = make_fit(t, A2, t_eval, A2_0, order)
    ax.plot(t_eval, y_fit, color='C3', label=rf'k = {k2:.2e}')
    ax.legend()
    ax.set_title(f'Reaction order: {order}')

Based on this plot;

What do you believe is the reaction order for [A1]?
What is the unit of the rate constant for [A1]?
What do you believe is the reaction order for [A2]?
What is the unit of the rate constant for [A2]?

Note

The cell above used a for-loop. These are a powerful way of performing the same operation with different inputs, you will learn more about them later on.

(e) Residuals

By inspecting the plot we can make a qualitative judgement of the reaction orders. To get a more quantitative results we can look at the residuals, recall they are defined as

\[ r_i = y_i - f(x_i | \theta) \]

We can calculate these in much the same way we calculate the fits, except we don’t evaluate the fit at a new set of points but at the observations \(x_i\).

def calculate_residuals(x_data, y_data, A0, order):
    # Don't worry about this line for now. 
    # It just picks the correct function for the given order and sets A0.
    func = lambda t, k: rate_laws[order](t, k, A0)

    # Make fit
    popt, pcov = curve_fit(func, x_data, y_data)


    # Evaluate fit
    k = popt[0]
    y_fit = ... # Calculate the fit at the x_data using `func`
    residuals = ... # Calculate the residuals.
    return residuals

We can then plot the residuals, here using a histogram where the x-axis is the value of the residual and the bars indicate the number of residuals within that range of values.

fig, axes = plt.subplots(2, 3, figsize=(8, 4), sharey=True, layout='constrained', sharex=True)

axes[0, 0].set_ylabel('Count')
axes[1, 0].set_ylabel('Count')

for ax_col, order in zip(axes.T, orders):

    residuals = calculate_residuals(t, A1, A1_0, order)
    ax_col[0].hist(residuals, alpha=0.75, color='C0')

    residuals = calculate_residuals(t, A2, A2_0, order)
    ax_col[1].hist(residuals, alpha=0.75, color='C1')

    ax_col[1].set_xlabel('Residual')

Does this support your conclusions about the reaction order from (d)?

Analysis of a reversible reaction

Code

import matplotlib.pyplot as plt
import numpy as np

Consider the following reaction

\[ A \underset{k_{2}}{\stackrel{k_1}{\rightleftharpoons}} B \]

The magnitudes of the rate constants are \(k_1 = 10 \ \mathrm{s}^{-1}\) and \(k_2 = 1 \ \mathrm{s}^{-1}\).

(a) Reaction order

What is the reaction order in each direction?

(b) Derive equilibrium constant

Show mathematically how the equilibrium constant \(K_{\mathrm{eq}}\) is given by the ratio between the two rate constants.

(c) Calculate \(K_{\mathrm{eq}}\)

k1 = ...
k2 = ...
K_eq = ...
print(f"{K_eq = }")

(d) Equilibrium concentrations

Calculate the concentrations of A and B at equilibrium, \([\mathrm{A}]_{\mathrm{eq}}\) and \([\mathrm{B}]_\mathrm{eq}\), if \([\mathrm{A}]_0 = 10^{-3} \text{M}\)

A0 = ...
A_eq = ...
B_eq = ...
print(f"{A_eq = }")
print(f"{B_eq = }")

(e) Rates at equilibrium

Calculate the forward and reverse rates at equilibrium at this concentration.

r_fwd = A_eq * k1
r_bwd = B_eq * k2
print(f"{r_fwd = :.5f}")
print(f"{r_bwd = :.5f}")

(f) Initial rate of formation

If \([\mathrm{A}]_0 = 10^{-3} \text{M}\) and \([\mathrm{B}]_0 = 0 \text{M}\), calculate the initial rate of formation of B.

r_fwd = A0 * k1
print(f"{r_fwd = :.5f}")

r_fwd = 0.01000

(g) Time-dependence

We now want to calculate and plot the time-dependent concentrations using the above equations.

In the cell below finish the implementation of the function A_time that calculates the concentration [A] as a function of time.

def A_time(t, A0, k_f, k_b):
    return ...

Caution

Be careful to correctly set parantheses when implementing this function!

And then we can use that function to calculate and plot the concentrations as a function of time.

fig, ax = plt.subplots()

t_max = ... # Set the time in seconds
t = np.linspace(0, t_max, 100)

At = ... # Use your function to calculate [A](t)
Bt = ... # Calculate [B](t)

ax.plot(t, At, label='[A](t)')
ax.plot(t, Bt, label='[B](t)')
ax.set_ylabel('Concentrations (M)')
ax.set_xlabel('Time (s)')
ax.legend()

Determination of reaction order and activation energy

Code

import numpy as np  
import matplotlib.pyplot as plt

The irreversible isomerization of compound A to compound B results in a decreasing absorbance. The isomerization was followed in a time course at two different temperatures (T1 = 25 °C and T2 = 40 °C). The absorbance (\(\epsilon\) = 16700 \(\mathrm{cm}^{-1} \mathrm{M}^{-1}\)) was used to calculate the concentration of compound A in a spectrophotometer with a pathlength of 1 cm. The obtained dataset is given in the file deter-reacti-order-activ.csv.

(a) Temperatures

What are the two temperatures in Kelvin? Set them as varilabes in the cell below.

T1 = 25 + 273.15
T2 = 40 + 273.15

(b) Load the dataset

Load the dataset using the widget below

from fysisk_biokemi.widgets import DataUploader
from IPython.display import display 
uploader = DataUploader()
uploader.display()

Run the next cell after uploading the file

df = uploader.get_dataframe()
display(df)

(c) SI Units

Calculate the concentration of A at each timepoint in SI units, by adding new columns to the DataFrame.

extinction_coeff = ...
L = 1 # Path length

df['[A]_(M)_25C'] = ... # For 25 C
... # For 40 C
display(df)

(d) Plot the data

We will be resuing the plot, so we will put the code for it in a function.

def plot_dataframe(ax, df):
    # Extract from dataframe
    t = ...
    A_25 = ...
    A_40 = ...

    # First subfigure: t vs [A]
    ax.plot(..., ..., 'o', label='[A]_25C')
    ... # Add the code to plot the 40C data.
    ax.set_xlabel('t (s)')
    ax.set_ylabel('[A] (M)')
    ax.legend()

You can run this next cell to see your plot and adjust as neccesarry.

fig, ax = plt.subplots()
plot_dataframe(ax, df)

(e) Determine reaction order

In the exercise on “Determination of reaction orders” we saw how we can fit to the expressions for the integrated rate laws to determine the reaction order.

In that exercise we wrote the functions in the next two cells, we will reuse them here

def zeroth_order(t, k, A0):
    return A0 - k*t

def first_order(t, k, A0):
    return A0 * np.exp(-k*t)

def second_order(t, k, A0):
    return A0 / (1 + 2*k*t*A0)

rate_laws = {0: zeroth_order, 1: first_order, 2:second_order}

from scipy.optimize import curve_fit

def make_fit(x_data, y_data, x_eval, A0, order):
    func = lambda t, k: rate_laws[order](t, k, A0)
    popt, pcov = curve_fit(func, x_data, y_data)

    # Evaluate fit at new points
    k = popt[0]
    y_fit_new = func(x_eval, k)
    
    # Evaluate fit at data-points to calculate residuals
    y_fit_data = func(x_data, k)
    residuals = y_data - y_fit_data

    return k, y_fit_new, residuals

Now we need to set a few things before we can do the fitting

t = ... # Extract the time from the dataframe
A_25 = ... # Extract the concentration at 25C
A_40 = ... # Extract the concentration at 40C
A0_25 = ... # Set the initial concentration at 25C
A0_40 = ... # Set the initial concentration at 40C

Tip

You’ve already calculated the initial concentration, it is contained in the DataFrame.

Now we can use our plotting and fitting functions to analyze the data

fig, axes = plt.subplots(2, 3, figsize=(8, 6), sharey='row', layout='constrained')

for ax in axes.T:
    plot_dataframe(ax[1], df)

orders = [0, 1, 2]
t_eval = np.linspace(0, t.max()*1.05)

for ax_col, order in zip(axes.T, orders):
    # Calculate the fits
    k25, y_fit_25, residuals_25 = make_fit(t, A_25, t_eval, A0_25, order)
    k40, y_fit_40, residuals_40 = make_fit(t, A_40, t_eval, A0_40, order)

    # Plot the fits
    ax = ax_col[1]
    ax.plot(t_eval, y_fit_25, color='k')
    ax.plot(t_eval, y_fit_40, color='k')
    ax.text(0.05, 0.15, f'k (25 C) = {k25:.4f}', transform=ax.transAxes)
    ax.text(0.05, 0.05, f'k (40 C) = {k40:.4f}', transform=ax.transAxes)

    ax.legend()

    # Plot the residuals
    ax = ax_col[0]
    ax.hist(residuals_25, facecolor='C0', alpha=0.75, edgecolor='k', label='25 C')
    ax.hist(residuals_40, facecolor='C1', alpha=0.75, edgecolor='k', label='40 C')
    ax.set_xlabel('Residual')    
    ax.set_ylabel('Count')
    ax.legend()

    ax.set_title(f'Reaction order: {order}')

Based on these plots

What do you think the reaction order of the reaction is?
What’s the rate constant?
What the unit of the rate constant?

(f) Half-life

What is the half-life of the reaction at these temperatures and is it constant through-out the reaction?

(g) Activation energy

With the assumption that the Arrhenius constant A and the activation energy are temperature independent in the interval measured, use the Arrhenius equation to calculate the activation energy of the isomerization of the compound A.

Perform the calculation in the cell below.

...

The pH effect on an active site histidine residue

Assume that a compound R can react with the unprotonated form of Histidine, \(\text{His}\), to form a covalent reaction product, \(\text{P}\): \[ \text{His} + \text{R} \rightarrow \text{P} \]

The protonated form of Histidine, \(\text{HisH}^+\), is in equilibrium with \(\text{His}\):

\[ \text{HisH}^+ \rightleftharpoons \text{His} + \text{H}^+ \]

The pKa value for this acid-base equilibrium is 6.0. Further assume that the total concentration of Histidine, is

\[ [\text{HisH}^+] + [\text{His}] = 10^{-3} \ \text{M} \]

(a) Percentage unprotonated

Using the Henderson-Hasselbach equation, calculate the percentage of unprotonated Histidine, \(\text{His}\), at pH 6.0.

pKa = ...
pH = ...

... # Possible intermediate calculation, otherwise delete this line.

percentage_unprotonated = ...
print(f"{percentage_unprotonated = } %")

(b) Concentration

Calculate \([\text{His}]\) at pH 6

total_conc = ...
his_conc = ...
print(f"{his_conc = } M")

(c) Reaction order

The reaction equation for the reaction between \(\text{His}\) and \(\text{R}\) is

\[ v = - \frac{d[\text{His}]}{dt} = k \cdot [\text{His}] \cdot [\text{R}] \]

What’s the reaction order?

(d) High concentrations

If \([\text{R}]\) is much higher than \([\text{His}]\), what can then be concluded regarding the order of the reaction?

(e) New rate constant

Show how a new rate constant, \(k'\), can be defined in these conditions. How does \(k'\) depend on \(\text{R}\)?

(f) SI Units

At pH 6.0 the reaction rate \(v = 1 \ \text{mM}\cdot \text{s}^{-1}\) and constant \([\text{R}] = 0.2 \ \text{M}\)

Convert the reaction rate to SI-units given in \(\text{M}\cdot \text{s}^{-1}\).

...

(g) Calculate the rate constant

Use the concentration of \([\text{His}]\) at pH 6 calculated in question (b) and calculate the rate constant k.

...

Design of an enzyme kinetics experiment (Unfinished)

Code

import matplotlib.pyplot as plt

In the Excel document tø12_week49_data you will find a data set in which an enzyme catalyzed formation of product P, with varying start concentration of substrates, [S], was followed over time. The product absorbs light at a specific wavelength with an extinction coefficient of 0.068 \(\mu\text{M}^{-1}\cdot \text{cm}^{-1}\), and the absorbance was measured in a light path of 1 cm throughout the time course.

You can load the dataset using the cells below;

from fysisk_biokemi.widgets import DataUploader
from IPython.display import display 
uploader = DataUploader()
uploader.display()

Run the next cell after uploading the file

df = uploader.get_dataframe()
display(df)

The headers, like Abs_S1 refer to the substrate concentration so S1 means a substrate concentration of 1 \(\mu\text{M}\).

(a)

Convert the extinction coefficient to units given in \(\text{M}^{-1}\cdot \text{cm}^{-1}\) and assign it to a variable. Also assign the light path length to a variable.

...
...

(b) Calculate concentrations

Using Lambert-Beers law, calculate the concentration of product, [P], in M for each time series.

Loops

Loops are one of the most useful parts of programming, loops allows us to repeat an operation on many different elements. This allows us to not have to repeat code many times, making it simpler to understand and less prone to error - and at the same time more flexible.

A common type of loop in Python is the for loop, which does something for every item it is told to consider. The syntax of a for-loop is like so:

total = 0
for count in [1, 2, 3, 4, 5]:
    print(f"I have {count} apples.")

Which will print

I have 1 apples
I have 2 apples
I have 3 apples
I have 4 apples
I have 5 apples

So the for-loop did the operation of printing a string for each value of count.

When working with a DataFrame like in this exercise, we may for example want to do something for each column, one way of doing that is to loop over the column names, say we want we have a DataFrame df with columns named "col1", "col2", "col3" we can loop over that as

for name in ['col1', 'col2', 'col3']
    df[name] = ... # Some operation

The cell below setups a loop calculate the concentrations for each of these current columns in the dataframe.

substrate_concentrations = [1, 3, 5, 7, 9, 11, 13, 15]
for s in substrate_concentrations:
    abs_col_name = f'Abs_S{s}' # This creates a string where the value of 's' is put instead of "{s}"".
    conc_col_name = f'C_S{s}' # Same type string creation 
    df[conc_col_name] = ... # Your code here

    # We can print in each iteration of the loop to see what's going on.
    print(s, abs_col_name, conc_col_name)

And we can check that the columns we expect have been added to the DataFrame.

display(df)

(c) Plot

Plot [P] as a function of time for all experiments in one single graph.

Here it is again very useful to use a for-loop

fig, ax = plt.subplots(figsize=(7, 4))

x_axis = df['time_(s)']
substrate_concentrations = [1, 3, 5, 7, 9, 11, 13, 15]
for s in substrate_concentrations:
    conc_col_name = f'C_S{s}'
    y_axis = df[conc_col_name]

    ax.plot(x_axis, y_axis, label=conc_col_name)

ax.legend()

(d)

How could you use Excel to determine \(V_0\) for each concentration of \(S\)? Create a table of \(V_0\) vs \([S]\)

(e)

Why is it important to use \(V_0\) rather than \(V\) at a later time point when creating the Michaelis-Menten plot?

(f)

Plot \(V_0\) against substrate concentration and estimate \(k_{cat}\) and \(K_M\) visually (remember units)

Analysis of a data set obeying Michaelis-Menten kinetics (Unfinished)

Code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
pd.set_option('display.max_rows', 6)

The kinetics for an enzyme were investigated using the absorbance of the product to calculate the initial velocities for a range of different initial concentrations of the substrate. The data given in the dataset analys-data-set-obeyin.xlsx was obtained. Use these data to answer the following questions.

from fysisk_biokemi.widgets import DataUploader
from IPython.display import display 
uploader = DataUploader()
uploader.display()

Run the next cell after uploading the file

df = uploader.get_dataframe()
display(df)

(a) SI Units

Convert the concentrations of substrate and the velocities to the SI units \(\text{M}\) and \(\text{M}\cdot\text{s}^{-1}\), respectively.

...
...
display(df)

(b) Plot & estimate

Plot the initial velocities, \(V_0\), as a function of substrate concentrations, \([S]\). Estimate \(K_M\) and \(V_\mathrm{max}\) from this plot.

fig, ax = plt.subplots()

... # Your code here. 

# Some settings to make the plot nicer..
ax.set_xlabel('[S] (M)')
ax.set_ylabel('$V_0$ (M/s)')
ax.grid('on')
ax.set_ylim(0, df['V0_(M/s)'].max()*1.2)

(c) Fit

Now we want to fit using the Michaelis-Menten equation, as per usual when the task is fitting we have to define the function we are fitting with

def michaelis_menten(S, Vmax, Km):
    return ... # Replace ...

Caution

This is one of those places were parentheses are crucial.

And then we can follow our usual procedure to make the fit

x_data = ...
y_data = ...

popt, pcov = ...
Vmax, Km = popt

print(f"{Vmax = :03f}")
print(f"{Km = :03f}")

How do these values compare to your estimate?

(d) Enzyme concentration

The \(k_{cat}\) of the enzyme was determined to be \(20000 \ \text{s}^{-1}\). Calculate, in the cell below, the concentration of the enzyme used in the assay.

kcat = ...
... # Calculate the enzyme_concentration
print(...) # Print the concentration.