Finite element method

In numerical analysis, the finite element method (FEM) is used for solving partial differential equations (PDE) approximately. Solutions are approximated by either eliminating the differential equation completely (steady state problems), or rendering the PDE into an equivalent ordinary differential equation, which is then solved using standard techniques such as finite differences, etc. The use of the finite element method in engineering for the analysis of physical systems is commonly known as finite element analysis.
Finite element methods have also been developed to approximately solve integral equations such as the heat transport equation.
The method was introduced by Richard Courant to model the effect of torsion on the shape of a cylinder. Courant's contribution was evolutionary, drawing on a large body of earlier results for PDEs developed by Rayleigh, Ritz and Galerkin. Development of the method began in earnest in the middle to late 1950s for airframe and structural analysis, and picked up a lot of steam at Berkeley in the 1960s for use in civil engineering. The method was provided with a rigorous mathematical foundation in 1973 with the publication of Strang and Fix's An Analysis of The Finite Element Method.
Finite element methods are used in a wide variety of engineering disciplines, e.g., electromagnetics.
In solving partial differential equations, the primary challenge is to create an equation which approximates the equation to be studied, but which is numerically stable, meaning that errors in the input data and intermediate calculations do not accumulate and cause the resulting output to be garbage.
Contents 
Preliminaries
For this discussion, we assume a basic understanding of differential calculus in several variables. If f is a function, then the notation f_{x} will denote the partial derivative of f with respect to x.
The best way to introduce the subject is to give a simple example (a "model problem"). We shall use the Laplace equation on the torus [0,1)x[0,1). First some notation. Let
 <math>
\mathrm{T} := \left[ 0,1\right) \times\left[ 0,1\right)=\{\left( x,y\right); 0 \leq x,y < 1\}. <math>
Now let g be a function from T to F, the field of scalars (either the real line R or the complex plane C.) Our problem is to find a function u from the entire plane R^{2} to F so that
 u is twice differentiable over R^{2} in the sense of multivariate calculus.
 u_{xx}+u_{yy}=g in T
 u(x,y)=u(x+1,y)=u(x,y+1), that is, u is periodic in x and in y, and of period 1 in both directions [more generally, the boundary condition should be homogeneous, i.e., if u and v both satisfy it, then so should au + bv, a,b arbitrary numbers in F].
This problem can be rephrased in terms of linear maps and vector spaces.
 u is in V, the vector space of twice differentiable functions of the torus T.
 Lu=g with g is some given vector in V.
Of course, here L is the differential operator given by:
 <math>
\mathrm{L}f=f_{xx}+f_{yy}. <math>
L is known as the Laplace operator. We are now looking for a u in V so that Lu=g.
Weak formulation
Now let φ be any functional of V, that is, φ is a function (mapping) from V to F so that for any t in F and u,v in V:
 <math>
\phi\left(tu+v\right)=t\phi\left(u\right)+\phi\left(v\right) <math>
(linear functional).
We denote by V^{*} the set of all such functionals. Then, certainly, the following statements are equivalent:
 <math>
\mathrm{L} u = g <math>
and
 <math>
\forall \phi \in \mathrm{V}^*: \phi\left(\mathrm{L}u\right)=\phi\left(g\right). <math>
The latter statement is said to be the weak form of our problem. One can think of it as emphasizing that the simple equality
 <math>
\phi\left(\mathrm{L}u\right)=\phi\left(g\right) <math>
for a single φ in V^{*} is insufficient to imply Lu = g, and thus it is weaker. The true meaning of the nomenclature is related to the socalled weak topology for Banach spaces, which is induced by V^{*}, but this is beyond the scope of this article.
The bilinear form of L, test functions
The set V^{*} is larger than it absolutely needs to be. It turns out that it suffices to use functionals of the form
 <math>
\phi_v\left(f\right)=\int_0^1\int_0^1 v\left(x,y\right) f\left(x,y\right) dxdy, <math>
where v satisfies the specified (periodic) boundary conditions.
That is, we are now replacing the original problem with that of finding u in V so that
 <math>
\phi_v\left(\mathrm{L}u\right)=\phi_v\left(g\right) <math>
for all v in V. From now on, we will use ∫_{T} for the double integral ∫_{0}^{1}∫_{0}^{1}. One can see, via integration by parts, that
 <math>
\int_T v u_{xx}\;dxdy=\int_0^1\left[v u_x\right]_0^1\;dy  \int_T\left(v_x u_x\right)\;dxdy<math>
 <math>
\int_T v u_{yy}\;dxdy=\int_0^1\left[v u_y\right]_0^1\;dx  \int_T\left(v_y u_y\right)\;dxdy <math>
and noting that because of the periodic boundary condition the first right hand side fg terms
 <math>
\int_0^1 \left[ v u_x \right]_0^1\;dy \mbox{ and } \int_0^1 \left[ v u_y \right]_0^1\;dx <math>
vanish, that
 <math>
\int_T v\left(u_{xx}+u_{yy}\right)\;dxdy = \int_T \left(v_x u_x + v_y u_y \right)\;dxdy. <math>
Thus we find that the earlier equality is equivalent to:
 <math>
\psi\left(u,v\right) := \phi_v\left(\mathrm{L}u\right) = \int_T\left(u_x v_x + u_y v_y\right) = \int_T g\cdot v = \phi_v \left(g\right). <math>
The function ψ of u and v is in fact bilinear, and it is the bilinear form associated with L. The functions v are called test functions.
We note here that ψ only uses first derivatives, and it would therefore be possible to discuss solutions to the original problem while only assuming first derivatives. In most cases, u will have two more derivatives than f (that is, if f is k times differentiable, then u will probably be k+2 times differentiable.)
A minimal set of test functions
From functional analysis, we know that we do not in fact need to try every possible test function v in V. In fact, if E={e_{1},e_{2},...} is a subset of V so that its linear span is dense in V (in a suitable topology) then we can use only those functions as test functions. That is, we are now trying to find a solution to the problem
<math> \textrm{(*)}\ \ \ \forall e_j \in \mathrm{E} : \psi\left(u,e_j\right)=\int_T g\cdot e_j. <math>
First discretization step
In order to turn this process into an algorithm that can be run on actual hardware, one chooses a finite subset of E. Now we have F=F_{n}={e_{1},e_{2},...,e_{n}} a finite set, F a subset of E, and we wish to solve the problem
<math> \textrm{(**)}\ \ \forall e_j \in \mathrm{F}_n : \psi\left(u_n,e_j\right)=\int_T g\cdot e_j <math>
with the goal that, as n increases to infinity (and F_{n} increases to E), the solutions u_{n} should converge to the solution u of (*)
This problem is underdetermined, so we take another discretization step.
Second discretization step
Let us now look for a solution in the linear span of F. While the actual solution of (*) is almost certainly not in the linear span of F, we are once more hoping for some sort of convergence property. If everything were occurring inside the linear span of F, we would have
 <math>
u_n = \sum_{k=1}^n a_k e_k<math>
 <math>
g_n = \sum_{k=1}^n b_k e_k. <math>
Substituting into (**) and expanding, we obtain:
<math> \textrm{(***) } \sum_{k=1}^n a_k \psi\left(e_k, e_j\right) = \sum_{k=1}^n b_k \int_T e_k e_j,\; j=1,\dots,n. <math>
There is now the question of how to invent a suitable g_{n} and there are many approaches, depending on how the set E was chosen. If E is chosen to be some Fourier basis, then g_{n} can be obtained as the projection of g onto the linear span of F, but other approaches are possible.
The desire of course is again to make certain that the resulting u_{n} will converge to a solution of (*).
Matrix version
The astute reader will have noted that the last problem can be formulated as a matrix problem as follows:
 <math>
\mathrm{P}a=\mathrm{Q}b <math>
where a, b are the column vectors
<math> a = \begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{bmatrix}, b = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix}. <math>
The matrices P and Q are given by (***):
 <math>
\mathrm{P}_{jk} = \psi\left(e_k, e_j\right),<math>
 <math>
\mathrm{Q}_{jk} = \int_T e_k e_j. <math>
P is called the stiffness matrix and Q is called the mass matrix.
Algorithm
The Finite Element algorithm is thus as follows:
 Compute the Stiffness Matrix P.
 Compute the Mass Matrix Q.
 Approximate g and compute b.
 Solve the matrix problem Pa=Qb for the unknown vector a.
 Possibly convert the vector a back into a solution u (e.g. for viewing with a graphical computer program).
Note on the choice of basis
If one chooses a Fourier basis for E, then one gets a socalled spectral method, which from our point of view is closely related to the finite element method. Typically, in the finite element method, the set F is chosen directly, and is not usually construed to be a subset of some E.
Instead, the functions of F can be chosen to be piecewise linear. A tessellation of the domain T is chosen, which decomposes T into, e.g., triangles, and the functions of F are those that are linear on each component of the tessellation of T (see the illustration below). Each triangle is referred to as an "element." In this case the three triangle points will serve as nodes. The test functions will then be trianglewise linear functions which are continuous over the triangles' edges.
Finite_element_triangulation.gif
A piecewise linear function
A tessellation of a domain T into triangles (bottom), and the graph of a piecewise linear function on this domain (top).
An often suitable way to tessellate a domain into triangles is Delaunay triangulation.
To obtain better algorithms, one can attempt to vary the primitives of the tessellation; it may be more natural to use rectangular elements, and in some cases curvilinear elements are called for. Conversely, once the elements are chosen, one still has a choice of how to define the test functions on each element. Test functions are usually, but not always chosen to be piecewise polynomial.
If the test functions are chosen in a cunning way, the matrices P and Q will be structured sparsely, so that the linear problem Pa = Qb can be solved very quickly. To understand the problem here, the calculation of Qb requires apparently O(n^{2}) multiplications and additions.
If one chooses, e.g., the Fourier basis, one notes that Q is the identity matrix, so there's nothing to be done to compute Qb. The coefficients of b can then be found using the fast Fourier transform of g, which runs in O(n logn) time. The matrix P is diagonal with easily calculated entries, so solving for a is trivial. One would then typically do an inverse Fourier transform on a, which requires O(n logn) to obtain u.
If one chooses a finite element basis (using, for instance, the triangular elements discussed above) so that each test function is supported on a small number of elements, then one can see that the matrices P and Q will be sparse  that is, most of the entries are zero. Then the calculation Qb can be done in O(n) time, and solving for a can be done efficiently using an iterative algorithm (such as the conjugate gradient method, or Krylov subspace methods).
We note that if we choose piecewise linear elements, they will not even be once differentiable. The elements are piecewise differentiable, and continuous; it may appear that the above argument shows this is sufficient. In fact, a careful argument requires the use of functional analysis tools and Sobolev spaces.
See also
de:FiniteElementeMethode it:metodo degli elementi finiti ja:有限要素法 pl:Metoda elementów skończonych fr:Méthode des éléments finis