8 Cartpole
Derivation from here.
8.1 Dynamics
Parameters
- \(\mathcal{X}\): state space
- \(\mathcal{U}\): action space
- \(l\): length of pendulum [m]
- \(m_c\): mass of cart [kg]
- \(m_p\): mass of pendulum [kg]
- \(g\): gravity constant [m/s^2]
State: \(\mathbf{x}= \begin{pmatrix} x, \theta, \dot x, \dot \theta \end{pmatrix}^\top \in \mathcal{X}\subset \mathbb R \times SO(2) \times \mathbb R^2\)
- position \(x\) [m]
- counter-clockwise angle of the pendulum (0=straight down) \(\theta\) [rad]
- velocity \(\dot x\) [m/s]
- angular velocity \(\dot \theta\) [rad/s]
Action: \(\mathbf{u}= (f_x) \in \mathcal{U}\)
- force \(f_x\) [N]
Dynamics:
In standard manipulator equation form:
\[\begin{aligned} \mathbf{M}&= \left[\begin{matrix}m_{c} + m_{p} & l m_{p} \cos{\left(\theta \right)}\\l m_{p} \cos{\left(\theta \right)} & l^{2} m_{p}\end{matrix}\right]\\ \mathbf{C}&= \left[\begin{matrix}0 & - \dot \theta l m_{p} \sin{\left(\theta \right)}\\0 & 0\end{matrix}\right]\\ \tau_g &= \left[\begin{matrix}0\\- g l m_{p} \sin{\left(\theta \right)}\end{matrix}\right]\\ \mathbf{B}&= \left[\begin{matrix}1\\0\end{matrix}\right]\\ \end{aligned}\]Explicit expressions for \(\ddot x, \ddot \theta\) are:
\[\begin{aligned} x_{0} &= \sin{\left(\theta \right)}\\ x_{1} &= \frac{1}{m_{c} + m_{p} x_{0}^{2}}\\ x_{2} &= \dot \theta^{2} l m_{p} x_{0} + f_{x}\\ \ddot x &= x_{1} \left(\frac{g m_{p} \sin{\left(2 \theta \right)}}{2} + x_{2}\right)\\ \ddot \theta &= - \frac{x_{1} \left(g x_{0} \left(m_{c} + m_{p}\right) + x_{2} \cos{\left(\theta \right)}\right)}{l}\\ \end{aligned}\]x0 = sin(theta); x1 = 1.0/(m_c + m_p*pow(x0, 2)); x2 = pow(dtheta, 2)*l*m_p*x0 + f_x; ddx = x1*((1.0/2.0)*g*m_p*sin(2*theta) + x2); ddtheta = -x1*(g*x0*(m_c + m_p) + x2*cos(theta))/l;x0 = theta.sin(); x1 = (m_c + m_p*x0.powi(2)).recip(); x2 = dtheta.powi(2)*l*m_p*x0 + f_x; ddx = x1*((1_f64/2.0)*g*m_p*(2*theta).sin() + x2); ddtheta = -x1*(g*x0*(m_c + m_p) + x2*theta.cos())/l;x0 = math.sin(theta) x1 = 1/(m_c + m_p*x0**2) x2 = dtheta**2*l*m_p*x0 + f_x ddx = x1*((1/2)*g*m_p*math.sin(2*theta) + x2) ddtheta = -x1*(g*x0*(m_c + m_p) + x2*math.cos(theta))/l
8.2 Differential Flatness
8.3 Invariance
8.4 Controllers
8.4.1 Geometric Controller
8.4.2 Action Mixing
8.5 Useful Parameters
8.5.1 cartpole_v0
\[\begin{aligned} l &= 1\\ m_{c} &= 1\\ m_{p} &= 1\\ g &= 10\\ \end{aligned}\]
\[\begin{aligned}
x_{0} &= \sin{\left(\theta \right)}\\
x_{1} &= \frac{1}{x_{0}^{2} + 1}\\
x_{2} &= \dot \theta^{2} x_{0} + f_{x}\\
\ddot x &= x_{1} \left(x_{2} + 5 \sin{\left(2 \theta \right)}\right)\\
\ddot \theta &= x_{1} \left(- 20 x_{0} - x_{2} \cos{\left(\theta \right)}\right)\\
\end{aligned}\]
x0 = sin(theta);
x1 = 1.0/(pow(x0, 2) + 1);
x2 = pow(dtheta, 2)*x0 + f_x;
ddx = x1*(x2 + 5*sin(2*theta));
ddtheta = x1*(-20*x0 - x2*cos(theta));x0 = theta.sin();
x1 = (x0.powi(2) + 1).recip();
x2 = dtheta.powi(2)*x0 + f_x;
ddx = x1*(x2 + 5*(2*theta).sin());
ddtheta = x1*(-20*x0 - x2*theta.cos());x0 = math.sin(theta)
x1 = 1/(x0**2 + 1)
x2 = dtheta**2*x0 + f_x
ddx = x1*(x2 + 5*math.sin(2*theta))
ddtheta = x1*(-20*x0 - x2*math.cos(theta))