Yet another derivation of the Born–Oppenheimer approximation
2018-11-02
There are plenty of existing discussions of the Born–Oppenheimer approximation, but none that I've read so far are entirely satisfying.
They tend to use confusing notation, conflate operators with their representations, gloss over some crucial aspects, and so on.
The following is my attempt at a succinct derivation of the approximation that touches on all the important details.
Specifically, here are some of the questions that arose when I was first learning about this (and that I try to answer below):
- What is the exact nature of the parametric dependence of the fast wavefunctions on the slow coordinate $R$? Do the fast wavefunctions form an orthonormal basis in some way?
- How does the wavefunction expansion differ from a standard basis expansion? Is it a Schmidt decomposition?
- How can the kinetic energy operator on the slow space result in a derivative of the fast wavefunctions?
- Why do all the surfaces seem to have the same energy?
Consider a system with degrees of freedom that we will group into "slow" and "fast".
They don't actually need to be slow and fast, but these are the labels we will use.
The prototypical example is a molecule with slow nuclei and fast electrons.
States of this system live in the tensor product Hilbert space
$$\mathcal{H} = \mathcal{H}_\text{s} \otimes \mathcal{H}_\text{f}.$$
On the slow space, we have the (multivariate) continuous representation $\{ |R\rangle \}$, and on the fast space we have $\{ |r\rangle \}$; together, that's $\{ |R\rangle |r\rangle \}$.
This doesn't have to be the position representation, but it almost certainly will be.
For the time being, we'll keep the Hamiltonian fairly general:
$$\hat{H} = \hat{T}_\text{s} + \hat{T}_\text{f} + \hat{V}.$$
We have kinetic energies for the slow and fast degrees of freedom and a potential energy term that operates on the entire space.
One requirement that we'll impose is that the potential energy operator must be diagonal in the continuous representation we've chosen:
$$\hat{V} \, |R\rangle |r\rangle = V(R, r) \, |R\rangle |r\rangle.$$
This allows us to express the Hamiltonian as
$$\hat{H} = \hat{T}_\text{s} + \hat{T}_\text{f} + \int \mathrm{d}R \, \mathrm{d}r \, V(R, r) \, |R\rangle |r\rangle \langle R| \langle r|.$$
This isn't quite in the position representation, since we haven't given the kinetic energy operators a form yet.
If they looked like
$$\langle R'| \hat{T}_\text{s} |R\rangle = -\frac{\hbar^2}{2M} \sum_i \delta''_i(R - R'), \qquad \langle r'| \hat{T}_\text{f} |r\rangle = -\frac{\hbar^2}{2m} \sum_j \delta''_j(r - r'),$$
we could express the Hamiltonian properly in the continuous representation:
$$\langle R'| \langle r'| \hat{H} |R\rangle |r\rangle = -\frac{\hbar^2}{2M} \sum_i \delta''_i(R - R') \, \delta(r - r') - \frac{\hbar^2}{2m} \sum_j \delta''_j(r - r') \, \delta(R - R') + V(R, r) \, \delta(R - R') \, \delta(r - r').$$
We'll come back to this form later, so you should keep it in mind, but we'll stick to being more general for now.
We define a parametrized potential operator $\hat{V}(R)$, acting on the fast space, with the following eigenvalue equation:
$$\hat{V}(R) \, |r\rangle = V(R, r) \, |r\rangle.$$
Using this operator, we construct another Hamiltonian, which we will call the fast Hamiltonian:
$$\hat{H}_\text{f}(R) = \hat{T}_\text{f} + \hat{V}(R).$$
The fast Hamiltonian is parameterized by $R$ and acts only on $\mathcal{H}_\text{f}$.
Conceptually, this is the Hamiltonian that describes the remaining (fast) system when we freeze out the slow degrees of freedom (by removing $\hat{T}_\text{s}$) and pin them at a specific position (by parameterizing $\hat{V}$).
For every $R$, the operator $\hat{H}_\text{f}(R)$ is a perfectly legitimate Hamiltonian for the fast system.
That means that we could construct the Hamiltonian
$$\hat{T}_\text{s} \otimes \hat{1} + \hat{1} \otimes \hat{H}_\text{f}(R_0)$$
for some fixed $R_0$,
but this is a useless object!
It describes a complicated system in the fast degrees of freedom and a collection of free particles in the slow degrees of freedom; there is no coupling whatsoever between the two.
What we'll do instead is note that the above definitions allow us to write
$$\hat{H} = \hat{T}_\text{s} + \int \mathrm{d}R \, |R\rangle \langle R| \otimes \hat{H}_\text{f}(R).$$
This may look like we've simply thrown a $\int \mathrm{d}R \, |R\rangle \langle R|$ onto the Hamiltonian that we've only just ridiculed, but there's a vital difference: the parameter of $\hat{H}_\text{f}$ depends on the value $R$ in the bra.
This is what gives rise to the coupling between the slow and fast degrees of freedom, and it's at least a little weird to think about.
The infinitely many fast Hamiltonians $\hat{H}_\text{f}(R)$ give rise to infinitely many orthonormal bases for $\mathcal{H}_\text{f}$.
For any choice of $R$, the states $|n(R)\rangle$ satisfy the eigenvalue equation
$$\hat{H}_\text{f}(R) \, |n(R)\rangle = E_n(R) \, |n(R)\rangle,$$
where the kets are also parameterized by $R$.
The wavefunctions for these states are commonly written as
$$\phi_n(r; R) = \langle r|n(R)\rangle.$$
To be perfectly clear, we have defined a basis for each value of $R$.
There is a basis $\{ |n(R_1)\rangle \}$, and another basis $\{ |n(R_2)\rangle \}$, and so forth; there is nothing we can say in general about the overlap
$$\langle m(R_1)|n(R_2)\rangle.$$
Given a wavefunction $\phi(r)$ on the fast space, we can expand it as
$$\phi(r) = \sum_n c_n(R) \, \phi_n(r; R),$$
where the expansion coefficients are given by
$$c_n(R) = \int \mathrm{d}r \, \phi_n^*(r; R) \, \phi(r)$$
and $R$ is arbitrary.
Since we're not mathematicians, we can (and will) take continuity for granted.
It's fairly safe to assume that the potential $V(R, r)$ varies continuously as $R$ is changed; after all, an arbitrarily large change in the potential when the configuration undergoes an infinitesimal shift would be unphysical.
Hence, the Hamiltonian $\hat{H}_\text{f}(R)$ and its eigenfunctions should also be continuous in the parameter $R$, as should the expansion coefficients $c_n(R)$ for any state.
One wrinkle that we do expect is that funny things can happen at degeneracies.
The adiabatic theorem tells us that if we vary $R$ sufficiently slowly (compared to the gap between $E_n(R)$ and adjacent energies $E_{n \pm 1}(R)$), then the ordering of the eigenvalues remains the same and we can treat each $E_n(R)$ as roughly independent.
In that case, we have what looks like multiple hypersurfaces in $R$ space floating above one another like sheets.
However, if the energies become equal, the gap between them vanishes, so these sheets touch and cease to be independent.
In making the Born–Oppenheimer approximation, we'll be implicitly assuming that this won't happen, so we won't dwell on this.
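To make the sheets concrete, here is a minimal numerical sketch in Python (everything about it, including the $2 \times 2$ model Hamiltonian, is an illustrative assumption rather than part of the derivation): the two adiabatic surfaces of a coupled two-level model avoid each other, and touch only when the coupling vanishes.

```python
import numpy as np

# Two adiabatic surfaces from an assumed model fast Hamiltonian
#   H_f(R) = [[R, c], [c, -R]],
# whose eigenvalues are +- sqrt(R^2 + c^2); the gap at R = 0 is 2|c|.
def adiabatic_energies(R, c):
    return np.sort(np.linalg.eigvalsh(np.array([[R, c], [c, -R]])))

lo, hi = adiabatic_energies(0.0, 0.1)
print(hi - lo)  # coupled surfaces avoid each other: gap of 2|c| = 0.2

lo, hi = adiabatic_energies(0.0, 0.0)
print(hi - lo)  # uncoupled surfaces touch: the gap closes to 0
```

Evaluating `adiabatic_energies` over a range of $R$ traces out the two sheets and the avoided crossing between them.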
The representation $\{ |R\rangle \}$ is complete for $\mathcal{H}_\text{s}$, and every basis $\{ |n(R)\rangle \}$ is complete for $\mathcal{H}_\text{f}$.
Thus, we are free to pick a specific $R_0$ and use the states $|R\rangle |n(R_0)\rangle$ as a basis for $\mathcal{H}$, but this would be silly, since $|n(R_0)\rangle$ is not generally an eigenstate of $\hat{H}_\text{f}(R)$ when $R \ne R_0$.
Instead, we want to use the states
$$|R\rangle |n(R)\rangle,$$
where the same $R$ appears in both kets.
To see that $\{ |R\rangle |n(R)\rangle \}$ also forms a basis for $\mathcal{H}$ (technically some sort of half-basis, half-representation mutant), we show that the transformation matrix
$$U_{(R', m), (R, n)} = \big( \langle R'| \langle m(R_0)| \big) \big( |R\rangle |n(R)\rangle \big) = \delta(R - R') \, \langle m(R_0)|n(R)\rangle$$
is unitary for any fixed $R_0$.
The requirements for this are
$$\hat{U} \hat{U}^\dagger = \hat{U}^\dagger \hat{U} = \hat{1};$$
for example, using the completeness of $\{ |m(R_0)\rangle \}$,
$$\sum_m \int \mathrm{d}R' \; U^*_{(R', m), (R, n)} \, U_{(R', m), (R'', n')} = \int \mathrm{d}R' \, \delta(R - R') \, \delta(R'' - R') \, \langle n(R)|n'(R'')\rangle = \delta(R - R'') \, \langle n(R)|n'(R'')\rangle = \delta(R - R'') \, \delta_{n n'}.$$
In the last step we used the sampling property of the Dirac delta function outside an integral, with the understanding that it only exists inside an integral anyway.
A consequence of the states $|R\rangle |n(R)\rangle$ forming a complete basis is that a state $|\Psi\rangle$ has the wavefunction
$$\psi_n(R) = \langle R| \langle n(R)| \, |\Psi\rangle.$$
Alternatively, we could write this as
$$|\Psi\rangle = \sum_n \int \mathrm{d}R \, \psi_n(R) \, |R\rangle |n(R)\rangle.$$
The function $\psi_n(R)$ is a strange animal, simultaneously serving both the roles of a basis expansion coefficient and a wavefunction for the slow space.
Since it only has a single index $n$, this expansion looks suspiciously like a Schmidt decomposition, but is it one?
To express $|\Psi\rangle$ in Schmidt form, we would need to be able to write
$$|\Psi\rangle = \sum_n \lambda_n \, |s_n\rangle |f_n\rangle,$$
where the $|s_n\rangle$ are orthogonal states on $\mathcal{H}_\text{s}$ and the $|f_n\rangle$ are orthogonal states on $\mathcal{H}_\text{f}$.
While our $|n(R)\rangle$ are orthogonal for a fixed $R$, they have an explicit dependence on $R$, which is not allowed.
On the slow side, there is likewise no reason to have
$$\int \mathrm{d}R \, \psi_m^*(R) \, \psi_n(R) \propto \delta_{mn},$$
so these functions aren't even orthogonal.
Conversely, every state is guaranteed to have some Schmidt decomposition, but there is no reason for it to coincide with the expansion above.
Now that we believe that the states $|R\rangle |n(R)\rangle$ form a basis, we can try to find the matrix elements of the Hamiltonian:
$$\langle R'| \langle m(R')| \hat{H} |R\rangle |n(R)\rangle = \langle R'| \langle m(R')| \hat{T}_\text{s} |R\rangle |n(R)\rangle + E_n(R) \, \delta(R - R') \, \delta_{mn}.$$
We have a complicated kinetic energy term, but a very diagonal potential energy term.
To proceed, we'll choose the form that was mentioned earlier for the kinetic energy:
$$\langle R'| \hat{T}_\text{s} |R\rangle = -\frac{\hbar^2}{2M} \sum_i \delta''_i(R - R').$$
Then, it follows that
$$\langle R'| \langle m(R')| \hat{T}_\text{s} |R\rangle |n(R)\rangle = -\frac{\hbar^2}{2M} \sum_i \delta''_i(R - R') \, \langle m(R')|n(R)\rangle,$$
where $\delta''_i$ is the second (distributional) derivative of the delta function in the $i$th direction, which satisfies
$$\int \mathrm{d}R \, \delta''_i(R - R') \, f(R) = \left. \frac{\partial^2 f}{\partial R_i^2} \right|_{R = R'}.$$
Thus, we can apply the Hamiltonian to a generic state $|\Psi\rangle$ as follows:
$$\begin{aligned} \langle R'| \langle m(R')| \hat{H} |\Psi\rangle &= \sum_n \int \mathrm{d}R \left[ -\frac{\hbar^2}{2M} \sum_i \delta''_i(R - R') \, \langle m(R')|n(R)\rangle + E_n(R) \, \delta(R - R') \, \delta_{mn} \right] \psi_n(R) \\ &= -\frac{\hbar^2}{2M} \sum_i \frac{\partial^2 \psi_m}{\partial {R'_i}^2} - \frac{\hbar^2}{2M} \sum_{n, i} \left[ 2 \left\langle m(R') \middle| \frac{\partial n(R')}{\partial R'_i} \right\rangle \frac{\partial \psi_n}{\partial R'_i} + \left\langle m(R') \middle| \frac{\partial^2 n(R')}{\partial {R'_i}^2} \right\rangle \psi_n(R') \right] + E_m(R') \, \psi_m(R'). \end{aligned}$$
We have used the product rule, which states that
$$\frac{\partial^2 (fg)}{\partial R_i^2} = \frac{\partial^2 f}{\partial R_i^2} \, g + 2 \, \frac{\partial f}{\partial R_i} \frac{\partial g}{\partial R_i} + f \, \frac{\partial^2 g}{\partial R_i^2}.$$
More pertinently, we have used the continuously-varying parametric dependence of $|n(R)\rangle$ on $R$ to allow the kinetic energy operator to take its derivative remotely through the derivative of the delta function.
For convenience, we use the gradient vector $\nabla$ with elements
$$\nabla_i = \frac{\partial}{\partial R_i},$$
and we drop the unitful quantities (like $\hbar^2 / 2M$) to make the expressions below look clean.
If this makes you feel dirty, don't hesitate to pencil them in where appropriate.
With this in mind, we can write
$$\langle R| \langle m(R)| \hat{H} |\Psi\rangle = \sum_n \left[ \left( -\nabla^2 + E_n(R) \right) \delta_{mn} - 2 \, \vec{d}_{mn}(R) \cdot \nabla - D_{mn}(R) \right] \psi_n(R).$$
The quantities
$$\vec{d}_{mn}(R) = \left\langle m(R) \middle| \nabla n(R) \right\rangle \qquad \text{and} \qquad D_{mn}(R) = \left\langle m(R) \middle| \nabla^2 n(R) \right\rangle$$
are the non-adiabatic couplings and the terms containing them are the non-adiabatic coupling terms (NACTs).
Because the derivative operator is antihermitian, we find that $\vec{d}_{mn} = -\vec{d}_{nm}^{\,*}$, so $\vec{d}$ is a skew-Hermitian matrix (in $m$ and $n$).
A consequence of this is that all its diagonal terms are purely imaginary, and they vanish outright if we choose the eigenstates to be real: $\vec{d}_{nn} = 0$.
The second derivative is not quite as kind: $D$ fails to be a Hermitian matrix in general, since $D_{mn} - D_{nm}^* = 2 \, \nabla \cdot \vec{d}_{mn}$.
Even so, writing $D_{nn} = \nabla \cdot \vec{d}_{nn} - \left\langle \nabla n(R) \middle| \cdot \middle| \nabla n(R) \right\rangle$ and using $\vec{d}_{nn} = 0$ shows that all its diagonal terms are real (and non-positive).
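The skew-Hermiticity is worth a quick check (using only the definitions above): differentiate the orthonormality condition $\langle m(R)|n(R)\rangle = \delta_{mn}$ with respect to $R$,

$$0 = \nabla \delta_{mn} = \nabla \langle m(R)|n(R)\rangle = \left\langle \nabla m(R) \middle| \cdot \middle| n(R) \right\rangle + \vec{d}_{mn} = \vec{d}_{nm}^{\,*} + \vec{d}_{mn}.$$

Setting $m = n$ shows directly that $\vec{d}_{nn}$ is purely imaginary, and hence zero for real eigenstates.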
Note how the terms in the big square brackets smell like a Hamiltonian for the slow degrees of freedom, parameterized by $m$ and $n$, and expressed in the position representation.
If we define the matrix $\hat{\mathcal{H}}$ with elements $\hat{\mathcal{H}}_{mn}$ that have the position representation
$$\hat{\mathcal{H}}_{mn} = \left( -\nabla^2 + E_n(R) \right) \delta_{mn} - 2 \, \vec{d}_{mn}(R) \cdot \nabla - D_{mn}(R),$$
then the overall Hamiltonian looks like a matrix of Hamiltonians for the slow degrees of freedom, indexed by the surfaces.
On the diagonal, we have simply
$$\hat{\mathcal{H}}_{nn} = -\nabla^2 + E_n(R) - D_{nn}(R),$$
where the last two terms are plain old potentials.
On the off-diagonal, we instead have
$$\hat{\mathcal{H}}_{mn} = -2 \, \vec{d}_{mn}(R) \cdot \nabla - D_{mn}(R),$$
which is a bit strange, because instead of a second derivative, it has first derivatives.
Nevertheless, this has the effect of turning the single Schrödinger equation
$$\hat{H} |\Psi\rangle = E |\Psi\rangle$$
into a collection of coupled differential equations, indexed by $m$:
$$\left[ -\nabla^2 + E_m(R) - D_{mm}(R) - E \right] \psi_m(R) = \sum_{n \ne m} \hat{\Lambda}_{mn} \, \psi_n(R),$$
where we have given a name to the differential operator
$$\hat{\Lambda}_{mn} = 2 \, \vec{d}_{mn}(R) \cdot \nabla + D_{mn}(R)$$
as a shorthand.
In the matrix picture, this looks like
$$\begin{pmatrix} \hat{\mathcal{H}}_{11} & -\hat{\Lambda}_{12} & \cdots \\ -\hat{\Lambda}_{21} & \hat{\mathcal{H}}_{22} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \end{pmatrix} = E \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \end{pmatrix}.$$
If one is able to find the functions $\psi_n(R)$ that simultaneously satisfy these equations, one can then assemble the eigenfunction
$$\Psi(R, r) = \sum_n \psi_n(R) \, \phi_n(r; R)$$
of the full Hamiltonian $\hat{H}$.
Before we continue, a few brief words about the Hamiltonians $\hat{\mathcal{H}}_{mn}$.
It is tempting to say that these are partial matrix elements of $\hat{H}$ in the $\{ |n(R)\rangle \}$ basis, but that direction is full of potential pitfalls.
For starters, which basis do we mean?
After all, there is a different one for each $R$, and no matter which one we pick, it would be a mistake to claim that $\langle m(R_0)| \hat{H} |n(R_0)\rangle$ is the object of interest, since its position representation is not useful for us.
We could also try
$$\langle m(R)| \hat{H} |n(R)\rangle,$$
which is definitely not what we wanted.
No, this sort of thinking just will not do.
Now that we have the Hamiltonian in the adiabatic representation, all that remains is to assume that the non-adiabatic couplings are sufficiently small that neglecting the NACTs entirely is a good approximation.
This leaves us with a Hamiltonian that is diagonal in surfaces:
$$\hat{\mathcal{H}}_{mn} \approx \left[ -\nabla^2 + E_n(R) - D_{nn}(R) \right] \delta_{mn}.$$
In other words, each $\hat{\mathcal{H}}_{nn}$ is the complete Hamiltonian for the slow degrees of freedom on surface $n$.
It is then clear that the resulting collection of differential equations is completely uncoupled:
$$\left[ -\nabla^2 + E_n(R) - D_{nn}(R) \right] \psi_n(R) = E \, \psi_n(R),$$
and they may be solved independently.
In the matrix picture, that's
$$\begin{pmatrix} \hat{\mathcal{H}}_{11} & 0 & \cdots \\ 0 & \hat{\mathcal{H}}_{22} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \end{pmatrix} = E \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \end{pmatrix}.$$
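One uncoupled surface equation is now an ordinary one-dimensional eigenvalue problem, which can be sketched numerically. Everything in the following Python snippet is an assumed toy model (a harmonic surface, in units where $\hbar = M = 1$ and with the $1/2$ in the kinetic energy kept explicit), not part of the derivation:

```python
import numpy as np

# Solve one uncoupled surface equation [-(1/2) d^2/dR^2 + E_n(R)] psi = E psi
# by finite differences. The surface is an assumed toy model: E_n(R) = R^2 / 2.
N, L = 400, 16.0
R = np.linspace(-L / 2, L / 2, N)
h = R[1] - R[0]

# Three-point second-derivative stencil for the slow kinetic energy.
d2 = (np.diag(np.full(N, -2.0)) + np.diag(np.ones(N - 1), 1)
      + np.diag(np.ones(N - 1), -1)) / h**2

H = -0.5 * d2 + np.diag(0.5 * R**2)  # one surface's slow Hamiltonian
E = np.linalg.eigvalsh(H)
print(E[:3])  # approaches the harmonic ladder 0.5, 1.5, 2.5
```

The same recipe applies to any surface $E_n(R)$ obtained from the fast problem; only the diagonal of the potential matrix changes.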
It's somewhat peculiar that all of these seemingly independent Hamiltonians $\hat{\mathcal{H}}_{nn}$ share the same eigenspectrum!
It's easy to show that $\vec{d}_{mn} = 0$ for all $m$ implies that $|\nabla n(R)\rangle = 0$:
$\vec{d}_{mn}$ is the overlap between the familiar state $|m(R)\rangle$ and the weird object $|\nabla n(R)\rangle$.
Because the $|m(R)\rangle$ form a complete basis, having $|\nabla n(R)\rangle$ be orthogonal to all of them implies that $|\nabla n(R)\rangle$ is the zero element.
Hence, the derivative of $|n(R)\rangle$ is zero; if this is true for all $R$, $|n(R)\rangle$ doesn't depend on $R$, so we can write just $|n\rangle$.
Now we can quickly show in two related ways that the above conclusion isn't a figment of our imagination:
In fact, because all the states $\int \mathrm{d}R \, \psi_n(R) \, |R\rangle |n\rangle$ are degenerate, any linear combination
$$\sum_n c_n \int \mathrm{d}R \, \psi_n(R) \, |R\rangle |n\rangle$$
is also an eigenstate of $\hat{H}$, including those that only include a single term.
(I think that the above implies that all the $\hat{H}_\text{f}(R)$ share a common eigenbasis, so $[\hat{H}_\text{f}(R), \hat{H}_\text{f}(R')] = 0$.
Proving this or giving a counterexample is left as an exercise for the reader.)
However, in practice the NACTs don't disappear by themselves.
If that were the case, the word "approximation" wouldn't appear on this page.
Instead, we create a new Hamiltonian which has the same diagonal elements as $\hat{\mathcal{H}}$, but with the couplings artificially set to zero.
In this more realistic scenario, it's not the case that all the surfaces are identical.
Still, because they are not coupled in this approximation, they may be dealt with independently.
When the non-adiabatic couplings are sufficiently strong that they can't be neglected, more sophisticated methods are necessary to treat more than one surface at a time; for example, switching to a diabatic representation.
This is commonly termed going "beyond the Born–Oppenheimer approximation", and is beyond the scope of the present work.