8  Cartpole

Cartpole (Picture from Tedrake (2024))

Derivation from here.

8.1 Dynamics

  • Parameters

    • \(\mathcal{X}\): state space
    • \(\mathcal{U}\): action space
    • \(l\): length of pendulum [m]
    • \(m_c\): mass of cart [kg]
    • \(m_p\): mass of pendulum [kg]
    • \(g\): gravity constant [m/s^2]
  • State: \(\mathbf{x}= \begin{pmatrix} x, \theta, \dot x, \dot \theta \end{pmatrix}^\top \in \mathcal{X}\subset \mathbb R \times SO(2) \times \mathbb R^2\)

    • position \(x\) [m]
    • counter-clockwise angle of the pendulum (0=straight down) \(\theta\) [rad]
    • velocity \(\dot x\) [m/s]
    • angular velocity \(\dot \theta\) [rad/s]
  • Action: \(\mathbf{u}= (f_x) \in \mathcal{U}\)

    • force \(f_x\) [N]
  • Dynamics:

    In standard manipulator equation form:

    \[\begin{aligned} \mathbf{M}&= \left[\begin{matrix}m_{c} + m_{p} & l m_{p} \cos{\left(\theta \right)}\\l m_{p} \cos{\left(\theta \right)} & l^{2} m_{p}\end{matrix}\right]\\ \mathbf{C}&= \left[\begin{matrix}0 & - \dot \theta l m_{p} \sin{\left(\theta \right)}\\0 & 0\end{matrix}\right]\\ \tau_g &= \left[\begin{matrix}0\\- g l m_{p} \sin{\left(\theta \right)}\end{matrix}\right]\\ \mathbf{B}&= \left[\begin{matrix}1\\0\end{matrix}\right]\\ \end{aligned}\]

    Explicit expressions for \(\ddot x, \ddot \theta\) are:

    \[\begin{aligned} x_{0} &= \sin{\left(\theta \right)}\\ x_{1} &= \frac{1}{m_{c} + m_{p} x_{0}^{2}}\\ x_{2} &= \dot \theta^{2} l m_{p} x_{0} + f_{x}\\ \ddot x &= x_{1} \left(\frac{g m_{p} \sin{\left(2 \theta \right)}}{2} + x_{2}\right)\\ \ddot \theta &= - \frac{x_{1} \left(g x_{0} \left(m_{c} + m_{p}\right) + x_{2} \cos{\left(\theta \right)}\right)}{l}\\ \end{aligned}\]
    x0 = sin(theta);
    x1 = 1.0/(m_c + m_p*pow(x0, 2));
    x2 = pow(dtheta, 2)*l*m_p*x0 + f_x;
    ddx = x1*((1.0/2.0)*g*m_p*sin(2*theta) + x2);
    ddtheta = -x1*(g*x0*(m_c + m_p) + x2*cos(theta))/l;
    x0 = theta.sin();
    x1 = (m_c + m_p*x0.powi(2)).recip();
    x2 = dtheta.powi(2)*l*m_p*x0 + f_x;
    ddx = x1*((1_f64/2.0)*g*m_p*(2*theta).sin() + x2);
    ddtheta = -x1*(g*x0*(m_c + m_p) + x2*theta.cos())/l;
    x0 = math.sin(theta)
    x1 = 1/(m_c + m_p*x0**2)
    x2 = dtheta**2*l*m_p*x0 + f_x
    ddx = x1*((1/2)*g*m_p*math.sin(2*theta) + x2)
    ddtheta = -x1*(g*x0*(m_c + m_p) + x2*math.cos(theta))/l

8.2 Differential Flatness

8.3 Invariance

8.4 Controllers

8.4.1 Geometric Controller

8.4.2 Action Mixing

8.5 Useful Parameters

8.5.1 cartpole_v0

\[\begin{aligned} l &= 1\\ m_{c} &= 1\\ m_{p} &= 1\\ g &= 10\\ \end{aligned}\]
\[\begin{aligned} x_{0} &= \sin{\left(\theta \right)}\\ x_{1} &= \frac{1}{x_{0}^{2} + 1}\\ x_{2} &= \dot \theta^{2} x_{0} + f_{x}\\ \ddot x &= x_{1} \left(x_{2} + 5 \sin{\left(2 \theta \right)}\right)\\ \ddot \theta &= x_{1} \left(- 20 x_{0} - x_{2} \cos{\left(\theta \right)}\right)\\ \end{aligned}\]
x0 = sin(theta);
x1 = 1.0/(pow(x0, 2) + 1);
x2 = pow(dtheta, 2)*x0 + f_x;
ddx = x1*(x2 + 5*sin(2*theta));
ddtheta = x1*(-20*x0 - x2*cos(theta));
x0 = theta.sin();
x1 = (x0.powi(2) + 1).recip();
x2 = dtheta.powi(2)*x0 + f_x;
ddx = x1*(x2 + 5*(2*theta).sin());
ddtheta = x1*(-20*x0 - x2*theta.cos());
x0 = math.sin(theta)
x1 = 1/(x0**2 + 1)
x2 = dtheta**2*x0 + f_x
ddx = x1*(x2 + 5*math.sin(2*theta))
ddtheta = x1*(-20*x0 - x2*math.cos(theta))