This is an important concept in PDEs and their numerical approximation. Specifically, it shows up in the Bramble-Hilbert lemma and in domain decomposition analysis. Most of this post was written right after reading Toselli and Widlund’s book, so there is a lot of resemblance.

Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ which is ‘nice’ (say, with Lipschitz boundary) and has radius $h$. Suppose $u, v \in H^1(\Omega)$ satisfy \begin{align*} |v|_{H^1(\Omega)} \le C||u||_{H^1(\Omega)}, \end{align*} and we wish to determine how the constant $C$ depends on $h$.

What we do is first consider the scaled domain $\hat \Omega$, which is just $\Omega$ scaled to have radius 1 via the change of variables $x = h\hat x$. If we establish the corresponding inequality on $\hat \Omega$, its constant will not depend on $h$. Let $\hat v(\hat x) := v(h\hat x)$; by the chain rule, $\hat \nabla \hat v(\hat x) = h (\nabla v)(h\hat x)$, where $\hat \nabla$ is the gradient with respect to $\hat x$. Then, \begin{align*} |v|^2_{H^1(\Omega)} &= \int_\Omega |\nabla v(x)|^2 \, dx \\ &= \int_{\hat \Omega} |(\nabla v)(h \hat x)|^2 h^n \, d\hat x \\ &= \int_{\hat \Omega} |\hat\nabla \hat v(\hat x)|^2 h^{-2} h^n \, d\hat x = h^{n-2}|\hat v|_{H^1(\hat \Omega)}^2. \end{align*}
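As a quick sanity check of the seminorm identity, here is a minimal numerical sketch in one dimension ($n = 1$, so the factor is $h^{-1}$). The choices $\Omega = (0, h)$, $v = \sin$, and the helper `midpoint_integral` are illustrative assumptions, not taken from the book.

```python
import math

def midpoint_integral(f, a, b, cells=20000):
    """Approximate the integral of f over [a, b] with the composite midpoint rule."""
    width = (b - a) / cells
    return width * sum(f(a + (i + 0.5) * width) for i in range(cells))

def seminorm_scaling(h):
    """Return (|v|^2 on (0, h), h^(n-2) |v_hat|^2 on (0, 1)) for n = 1 and v = sin."""
    dv = math.cos                         # derivative of the sample function v = sin
    dv_hat = lambda s: h * dv(h * s)      # chain rule for v_hat(s) = v(h s)
    semi = midpoint_integral(lambda x: dv(x) ** 2, 0.0, h)
    semi_hat = midpoint_integral(lambda s: dv_hat(s) ** 2, 0.0, 1.0)
    return semi, h ** (1 - 2) * semi_hat  # n - 2 = -1 in one dimension

if __name__ == "__main__":
    lhs, rhs = seminorm_scaling(0.1)
    print(lhs, rhs)  # the two numbers should agree up to rounding
```

The two returned values should agree up to floating-point rounding for any $h > 0$, which is the $h^{n-2}$ identity specialised to $n = 1$.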

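But the $L^2$ norm involves no derivative, so there is no factor of $h^{-2}$. With $\hat u(\hat x) := u(h\hat x)$, \begin{align*} ||u||_{L^2(\Omega)}^2 &= \int_\Omega |u(x)|^2 \, dx \\ &= \int_{\hat \Omega} |u(h \hat x)|^2 h^n \, d\hat x = h^n ||\hat u||_{L^2(\hat \Omega)}^2. \end{align*} This is why mixing terms with different numbers of derivatives causes scaling issues: with the usual definition $||u||_{H^1(\Omega)}^2 = ||u||_{L^2(\Omega)}^2 + |u|_{H^1(\Omega)}^2$, the full norm becomes $h^n ||\hat u||_{L^2(\hat \Omega)}^2 + h^{n-2}|\hat u|_{H^1(\hat \Omega)}^2$, so its two parts scale with different powers of $h$.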
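To see the mismatch concretely, the following sketch computes both pieces of the $H^1$ norm in one dimension ($n = 1$) on $\Omega = (0, h)$ and on the unit interval. The sample function $u = \sin$ and the helper names are assumptions made for illustration, and the full norm is taken to be the usual unweighted one.

```python
import math

def midpoint_integral(f, a, b, cells=20000):
    """Approximate the integral of f over [a, b] with the composite midpoint rule."""
    width = (b - a) / cells
    return width * sum(f(a + (i + 0.5) * width) for i in range(cells))

def h1_norm_pieces(h):
    """Squared L^2 norm and H^1 seminorm of u = sin on (0, h) and of u_hat on (0, 1)."""
    u, du = math.sin, math.cos
    l2 = midpoint_integral(lambda x: u(x) ** 2, 0.0, h)
    semi = midpoint_integral(lambda x: du(x) ** 2, 0.0, h)
    # u_hat(s) = u(h s), so its derivative is h * u'(h s) by the chain rule.
    l2_hat = midpoint_integral(lambda s: u(h * s) ** 2, 0.0, 1.0)
    semi_hat = midpoint_integral(lambda s: (h * du(h * s)) ** 2, 0.0, 1.0)
    return l2, semi, l2_hat, semi_hat

if __name__ == "__main__":
    h = 0.1
    l2, semi, l2_hat, semi_hat = h1_norm_pieces(h)
    print("L^2 part:     ", l2, "vs h^1 * hat:", h * l2_hat)
    print("seminorm part:", semi, "vs h^-1 * hat:", semi_hat / h)
    print("full norm:    ", l2 + semi, "vs", h * l2_hat + semi_hat / h)
```

The $L^2$ part picks up $h^{n}$ and the seminorm part $h^{n-2}$, so no single power of $h$ converts the full $H^1$ norm on $\Omega$ into the full $H^1$ norm on $\hat\Omega$.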