# Updated LevPodEnv - Physical System Clarification

## System Architecture

### Physical Configuration

**Two U-Shaped Magnetic Yokes:**
- **Front Yoke**: Located at X = +0.1259m
  - Has two ends: Left (+Y = +0.0508m) and Right (-Y = -0.0508m)
  - Force is applied at center: X = +0.1259m, Y = 0m

- **Back Yoke**: Located at X = -0.1259m
  - Has two ends: Left (+Y = +0.0508m) and Right (-Y = -0.0508m)
  - Force is applied at center: X = -0.1259m, Y = 0m

**Four Independent Coil Currents:**
1. `curr_front_L`: Current around front yoke's left (+Y) end
2. `curr_front_R`: Current around front yoke's right (-Y) end
3. `curr_back_L`: Current around back yoke's left (+Y) end
4. `curr_back_R`: Current around back yoke's right (-Y) end

**Current Range:** -15A to +15A (from Ansys CSV data)
- Negative current: Strengthens permanent magnet field → stronger attraction
- Positive current: Weakens permanent magnet field → weaker attraction

### Collision Geometry in URDF

**Yoke Ends (4 boxes):** Represent the tips of the U-yokes, where the gap is measured
- Front Left: (+0.1259m, +0.0508m, +0.08585m)
- Front Right: (+0.1259m, -0.0508m, +0.08585m)
- Back Left: (-0.1259m, +0.0508m, +0.08585m)
- Back Right: (-0.1259m, -0.0508m, +0.08585m)

**Sensors (4 cylinders):** Physical gap sensors, placed at different locations from the yoke ends
- Center Right: (0m, +0.0508m, +0.08585m)
- Center Left: (0m, -0.0508m, +0.08585m)
- Front: (+0.2366m, 0m, +0.08585m)
- Back: (-0.2366m, 0m, +0.08585m)

## RL Environment Interface

### Action Space
**Type:** `Box(4)`, Range: [-1, 1]

**Actions:** `[pwm_front_L, pwm_front_R, pwm_back_L, pwm_back_R]`
- PWM duty cycles for the 4 independent coils
- Converted to currents via a resistor-inductor (R-L) circuit model: `dI/dt = (V_pwm - I*R) / L`

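
The coil model above is a first-order ODE, so one simple way to step it is forward Euler. The following is a minimal sketch under stated assumptions: the function name `step_coil_current` and the resistance, inductance, supply voltage and time step are placeholder values chosen for illustration, not parameters taken from the environment.

```python
def step_coil_current(current, pwm, dt, resistance=1.0, inductance=0.01, v_supply=12.0):
    """Advance one coil current by dt with forward Euler on dI/dt = (V_pwm - I*R) / L.

    resistance (ohm), inductance (H) and v_supply (V) are placeholder values.
    """
    pwm = max(-1.0, min(1.0, pwm))      # clip to the action range [-1, 1]
    v_pwm = pwm * v_supply              # average voltage over one PWM period
    di_dt = (v_pwm - current * resistance) / inductance
    return current + di_dt * dt


current = 0.0
for _ in range(1000):                   # 1000 steps of 1 ms = 1 s of simulated time
    current = step_coil_current(current, pwm=0.5, dt=1e-3)
# With these placeholders the time constant L/R is 10 ms, so after 1 s the
# current has settled at the steady-state value V_pwm / R.
```

Forward Euler is only stable here when `dt` is well below the time constant `L/R`; a stiffer coil would call for a smaller step or the exact exponential update.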
### Observation Space
**Type:** `Box(4)`, Range: [-inf, inf]

**Observations:** `[sensor_center_right, sensor_center_left, sensor_front, sensor_back]`
- **Noisy sensor readings** (not direct yoke measurements)
- Noise: Gaussian with σ = 0.1mm (0.0001m)
- Agent must learn system dynamics from sensor data alone
- Velocities are not provided directly; the agent can infer them from the temporal sequence if needed

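
To make the noise model concrete, here is a rough sketch of how a 4-element observation could be built from the true sensor gaps. The function name `noisy_observation`, the zero-mean assumption and the use of `float32` are illustrative choices, not details confirmed by the environment code.

```python
import numpy as np

SENSOR_NOISE_STD = 0.0001  # metres (0.1 mm), the sigma stated above


def noisy_observation(true_gaps, rng):
    """Return the 4 sensor gaps with additive Gaussian noise (assumed zero-mean).

    true_gaps: [center_right, center_left, front, back] in metres.
    """
    true_gaps = np.asarray(true_gaps, dtype=np.float32)
    noise = rng.normal(0.0, SENSOR_NOISE_STD, size=true_gaps.shape)
    return (true_gaps + noise).astype(np.float32)


rng = np.random.default_rng(seed=0)
obs = noisy_observation([0.010, 0.010, 0.010, 0.010], rng)
```

Passing the generator in explicitly keeps episodes reproducible when the environment is seeded.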
### Force Application Physics

For each timestep:

1. **Measure yoke end gap heights** (from 4 yoke collision boxes)
2. **Average left/right ends** for each U-yoke:
   - `avg_gap_front = (gap_front_L + gap_front_R) / 2`
   - `avg_gap_back = (gap_back_L + gap_back_R) / 2`

3. **Calculate roll angle** from yoke end positions:
   ```python
   import numpy as np

   # y_distance: lateral spacing between the left and right yoke ends (m)
   roll_front = np.arctan((gap_front_R - gap_front_L) / y_distance)
   roll_back = np.arctan((gap_back_R - gap_back_L) / y_distance)
   roll = (roll_front + roll_back) / 2
   ```

4. **Predict forces** using maglev_predictor:
   ```python
   # Each yoke is evaluated separately; the argument names indicate roll in degrees and gap in mm
   force_front, torque_front = predictor.predict(
       curr_front_L, curr_front_R, roll_deg, gap_front_mm
   )
   force_back, torque_back = predictor.predict(
       curr_back_L, curr_back_R, roll_deg, gap_back_mm
   )
   ```

5. **Apply forces at Y=0** (center of each U-yoke):
   - Front force at: `[+0.1259, 0, 0.08585]`
   - Back force at: `[-0.1259, 0, 0.08585]`

6. **Apply roll torques** from each yoke independently

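
Steps 1 to 3 and the unit conversion implied by the predictor arguments can be sketched as a single helper. This is an illustration under assumptions: `yoke_inputs` is a hypothetical name, gaps are taken to be in metres and roll in radians, and the lateral spacing is taken as 2 × 0.0508 m from the yoke-end coordinates.

```python
import math

Y_DISTANCE = 2 * 0.0508  # assumed lateral spacing of the yoke ends (m)


def yoke_inputs(gap_front_L, gap_front_R, gap_back_L, gap_back_R):
    """Turn the four yoke-end gaps (m) into predictor inputs (deg, mm, mm)."""
    # Step 2: average the left and right ends of each U-yoke
    avg_gap_front = (gap_front_L + gap_front_R) / 2
    avg_gap_back = (gap_back_L + gap_back_R) / 2
    # Step 3: roll from the left/right gap difference, averaged over both yokes
    roll_front = math.atan((gap_front_R - gap_front_L) / Y_DISTANCE)
    roll_back = math.atan((gap_back_R - gap_back_L) / Y_DISTANCE)
    roll = (roll_front + roll_back) / 2
    # Convert rad -> deg and m -> mm for the predictor call in step 4
    return math.degrees(roll), avg_gap_front * 1000.0, avg_gap_back * 1000.0


# Right ends 1 mm further from the track than the left ends
roll_deg, gap_front_mm, gap_back_mm = yoke_inputs(0.010, 0.011, 0.010, 0.011)
```

Keeping the conversion in one place avoids mixing metres and millimetres between the simulator and the predictor.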
### Key Design Decisions

**Why 4 actions instead of 2?**
- Physical system has 4 independent electromagnets (one per yoke end)
- Allows fine control of roll torque
- Left/right current imbalance on each yoke creates torque

**Why sensor observations instead of yoke measurements?**
- Realistic: sensors are at different positions than yokes
- Adds partial observability challenge
- Agent must learn system dynamics to infer unmeasured states
- Sensor noise simulates real measurement uncertainty

**Why not include velocities in observation?**
- Agent can learn velocities from temporal sequence (frame stacking)
- Reduces observation dimensionality
- Tests if agent can learn dynamic behavior from gap measurements alone

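
As an illustration of why a temporal sequence is enough to recover velocity, a two-point finite difference over stacked observations gives a rough estimate. The helper name `estimate_gap_velocity` and the 10 ms observation interval are hypothetical; the actual control period is not stated here.

```python
import numpy as np


def estimate_gap_velocity(stacked_obs, dt):
    """Estimate gap velocities (m/s) from the two most recent observations.

    stacked_obs: array of shape (n_stack, 4), oldest frame first.
    dt: time between consecutive observations (s).
    """
    stacked_obs = np.asarray(stacked_obs, dtype=np.float64)
    return (stacked_obs[-1] - stacked_obs[-2]) / dt


frames = np.array([
    [0.0100, 0.0100, 0.0100, 0.0100],   # previous observation (m)
    [0.0101, 0.0101, 0.0100, 0.0099],   # latest observation (m)
])
velocity = estimate_gap_velocity(frames, dt=0.01)  # dt is an assumed value
```

Differencing amplifies sensor noise: with independent σ = 0.1mm noise on each frame and the assumed 10 ms interval, the estimate carries about √2·σ/dt ≈ 14 mm/s of noise, which is one reason to stack more than two frames.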
**Current sign convention:**
- No conversion needed: currents are fed directly to the predictor
- Range: -15A to +15A (from Ansys model)
- The coil R-L circuit model naturally produces currents in this range

### Comparison with Original Design

| Feature | Original | Updated |
|---------|----------|---------|
| **Actions** | 2 (left/right coils) | 4 (front_L, front_R, back_L, back_R) |
| **Observations** | 5 (gaps, roll, velocities) | 4 (noisy sensor gaps) |
| **Gap Measurement** | Direct yoke positions | Noisy sensor positions |
| **Force Application** | Front & back yoke centers | Front & back yoke centers ✓ |
| **Current Range** | Assumed negative only | -15A to +15A |
| **Roll Calculation** | From yoke end heights | From yoke end heights ✓ |

## Physics Pipeline (Per Timestep)

1. **Action → Currents**
   ```
   PWM[4] → RL Circuit Model → Currents[4]
   ```

2. **State Measurement**
   ```
   Yoke End Positions[4] → Gap Heights[4] → Average per Yoke[2]
   ```

3. **Roll Calculation**
   ```
   arctan((Gap_Right - Gap_Left) / Y_distance) → Roll Angle
   ```

4. **Force Prediction**
   ```
   (currL, currR, roll, gap) → Maglev Predictor → (force, torque)
   Applied separately for front and back yokes
   ```

5. **Force Application**
   ```
   Forces at Y=0 for each yoke + Roll torques
   ```

6. **Observation Generation**
   ```
   Sensor Positions[4] → Gap Heights[4] → Add Noise → Observation[4]
   ```

## Info Dictionary

Each `env.step()` call returns an `info` dictionary with the following diagnostics:

```python
{
    'curr_front_L': float,    # Front left coil current (A)
    'curr_front_R': float,    # Front right coil current (A)
    'curr_back_L': float,     # Back left coil current (A)
    'curr_back_R': float,     # Back right coil current (A)
    'gap_front_yoke': float,  # Front yoke average gap (m)
    'gap_back_yoke': float,   # Back yoke average gap (m)
    'roll': float,            # Roll angle (rad)
    'force_front': float,     # Front yoke force (N)
    'force_back': float,      # Back yoke force (N)
    'torque_front': float,    # Front yoke torque (mN·m)
    'torque_back': float      # Back yoke torque (mN·m)
}
```

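
Because the dictionary mixes units (metres, newtons, mN·m), a small helper can make logged values easier to read. The sketch below assumes only the keys and units listed above; `summarize_info` and the numbers in `example_info` are made up for illustration and are not output from the environment.

```python
def summarize_info(info):
    """Collapse the per-yoke diagnostics into totals in consistent units."""
    total_force = info["force_front"] + info["force_back"]                 # N
    total_torque = (info["torque_front"] + info["torque_back"]) / 1000.0   # mN·m -> N·m
    mean_gap_mm = 1000.0 * (info["gap_front_yoke"] + info["gap_back_yoke"]) / 2
    return total_force, total_torque, mean_gap_mm


# Hypothetical values, only the keys used by summarize_info are filled in
example_info = {
    "gap_front_yoke": 0.010, "gap_back_yoke": 0.012,
    "force_front": 30.0, "force_back": 28.0,
    "torque_front": 12.0, "torque_back": -2.0,
}
total_force, total_torque, mean_gap_mm = summarize_info(example_info)
```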
## Testing

Run the updated test script:
```bash
cd "/Users/adipu/Documents/lev_control_4pt_small/RL Testing"
/opt/miniconda3/envs/RLenv/bin/python test_env.py
```

Expected behavior:
- 4 sensors report gap heights with small noise variations
- Yoke gaps (in info) match sensor gaps approximately
- All 4 coils build up current over time (R-L circuit dynamics)
- Forces should be ~50-100N upward at 10mm gap with moderate currents
- Pod should begin to levitate if forces overcome gravity (5.8kg × 9.81 m/s² = 56.898 N needed)

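
The levitation threshold in the last bullet can be checked with a few lines of arithmetic. This sketch uses the 5.8 kg mass quoted above; the even split between the two yokes assumes the centre of mass lies midway between them, which is not stated in this document.

```python
POD_MASS = 5.8   # kg, from the expected-behavior list above
G = 9.81         # m/s^2

weight = POD_MASS * G      # total upward force needed to hover (N)
per_yoke = weight / 2      # share per yoke, assuming an evenly split load (N)


def net_vertical_accel(force_front, force_back):
    """Net upward acceleration (m/s^2) for the given yoke forces (N)."""
    return (force_front + force_back - weight) / POD_MASS


accel = net_vertical_accel(30.0, 30.0)  # slightly more than the hover force
```

A positive result means the pod accelerates toward the track; a negative one means it falls away.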
## Next Steps for RL Training

1. **Frame Stacking**: Use 3-5 consecutive observations to give agent velocity information
   ```python
   from stable_baselines3.common.vec_env import VecFrameStack

   # env must be a vectorized env (e.g. wrapped in DummyVecEnv) before stacking
   env = VecFrameStack(env, n_stack=4)
   ```

2. **Algorithm Selection**: PPO or SAC recommended
   - PPO: Good for continuous control, stable training
   - SAC: Better sample efficiency, handles stochastic dynamics

3. **Reward Tuning**: Current reward weights may need adjustment based on training performance

4. **Curriculum Learning**: Start with smaller gap errors, gradually increase difficulty

5. **Domain Randomization**: Vary sensor noise, mass, etc. for robust policy
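
For the domain randomization item, one possible shape is a per-episode sampler around the nominal values quoted in this document. Everything else here is an assumption: the function name `sample_episode_params`, the spread fractions, and the idea that the environment would accept these parameters at reset.

```python
import random

NOMINAL_MASS = 5.8          # kg, pod mass quoted in the Testing section
NOMINAL_NOISE_STD = 0.0001  # m, sensor noise quoted in the Observation Space section


def sample_episode_params(rng, mass_spread=0.10, noise_spread=0.50):
    """Draw per-episode physical parameters around their nominal values.

    mass_spread and noise_spread are illustrative fractions, not tuned values.
    """
    mass = NOMINAL_MASS * rng.uniform(1 - mass_spread, 1 + mass_spread)
    noise_std = NOMINAL_NOISE_STD * rng.uniform(1 - noise_spread, 1 + noise_spread)
    return {"mass": mass, "sensor_noise_std": noise_std}


rng = random.Random(0)      # seeded so a training run can be reproduced
params = sample_episode_params(rng)
```

Sampling once per episode, rather than per step, keeps the dynamics consistent within an episode while still exposing the policy to a range of pods.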