built environment

2025-12-10 15:50:20 -06:00
parent f2ae33db8c
commit c74c086ef7
11 changed files with 1433 additions and 6 deletions

RL Testing/ENV_UPDATE.md Normal file

@@ -0,0 +1,208 @@
# Updated LevPodEnv - Physical System Clarification
## System Architecture
### Physical Configuration
**Two U-Shaped Magnetic Yokes:**
- **Front Yoke**: Located at X = +0.1259m
- Has two ends: Left (+Y = +0.0508m) and Right (-Y = -0.0508m)
- Force is applied at center: X = +0.1259m, Y = 0m
- **Back Yoke**: Located at X = -0.1259m
- Has two ends: Left (+Y = +0.0508m) and Right (-Y = -0.0508m)
- Force is applied at center: X = -0.1259m, Y = 0m
**Four Independent Coil Currents:**
1. `curr_front_L`: Current around front yoke's left (+Y) end
2. `curr_front_R`: Current around front yoke's right (-Y) end
3. `curr_back_L`: Current around back yoke's left (+Y) end
4. `curr_back_R`: Current around back yoke's right (-Y) end
**Current Range:** -15A to +15A (from Ansys CSV data)
- Negative current: Strengthens permanent magnet field → stronger attraction
- Positive current: Weakens permanent magnet field → weaker attraction
### Collision Geometry in URDF
**Yoke Ends (4 boxes):** Represent the tips of the U-yokes where gap is measured
- Front Left: (+0.1259m, +0.0508m, +0.08585m)
- Front Right: (+0.1259m, -0.0508m, +0.08585m)
- Back Left: (-0.1259m, +0.0508m, +0.08585m)
- Back Right: (-0.1259m, -0.0508m, +0.08585m)
**Sensors (4 cylinders):** Physical gap sensors at different locations
- Center Right: (0m, +0.0508m, +0.08585m)
- Center Left: (0m, -0.0508m, +0.08585m)
- Front: (+0.2366m, 0m, +0.08585m)
- Back: (-0.2366m, 0m, +0.08585m)
## RL Environment Interface
### Action Space
**Type:** `Box(4)`, Range: [-1, 1]
**Actions:** `[pwm_front_L, pwm_front_R, pwm_back_L, pwm_back_R]`
- PWM duty cycles for the 4 independent coils
- Converted to currents via a resistor-inductor (RL) circuit model of each coil: `dI/dt = (V_pwm - I*R) / L`
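
The PWM-to-current conversion can be sketched as a forward-Euler step of the coil equation above. This is a minimal sketch, not the environment's actual code: the supply voltage, resistance, inductance, and the function name `step_currents` are hypothetical placeholders, since this document does not give the electrical parameters.

```python
import numpy as np

# Assumed electrical parameters; the document does not specify them.
V_SUPPLY = 12.0   # V, hypothetical supply voltage
R_COIL = 1.0      # ohm, hypothetical coil resistance
L_COIL = 0.01     # H, hypothetical coil inductance
I_MAX = 15.0      # A, current limit matching the Ansys data range


def step_currents(currents, pwm, dt):
    """Advance the 4 coil currents by one forward-Euler step of dI/dt = (V - I*R) / L."""
    pwm = np.clip(np.asarray(pwm, dtype=float), -1.0, 1.0)
    currents = np.asarray(currents, dtype=float)
    v_pwm = pwm * V_SUPPLY                        # duty cycle -> average coil voltage
    di_dt = (v_pwm - currents * R_COIL) / L_COIL  # coil equation from the text
    return np.clip(currents + di_dt * dt, -I_MAX, I_MAX)


# Example: hold a constant action and let the currents settle.
currents = np.zeros(4)
for _ in range(1000):
    currents = step_currents(currents, [1.0, -1.0, 0.0, 0.5], dt=1e-3)
```

With these placeholder values the time constant is `L/R = 10 ms`, so the currents settle toward `pwm * V_SUPPLY / R_COIL`. Forward Euler is only stable here because `dt` is well below that time constant.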
### Observation Space
**Type:** `Box(4)`, Range: [-inf, inf]
**Observations:** `[sensor_center_right, sensor_center_left, sensor_front, sensor_back]`
- **Noisy sensor readings** (not direct yoke measurements)
- Noise: Gaussian with σ = 0.1mm (0.0001m)
- Agent must learn system dynamics from sensor data alone
- Velocities are not provided directly; the agent can infer them from a temporal sequence of observations if needed
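
As a rough illustration of the observation model described above, the snippet below adds zero-mean Gaussian noise with σ = 0.1 mm to four true sensor gaps. The function name `make_observation` and the use of a NumPy `Generator` are assumptions for this sketch, not necessarily how `LevPodEnv` implements it.

```python
import numpy as np

SENSOR_NOISE_STD = 1e-4  # m, sigma = 0.1 mm as stated above


def make_observation(true_gaps, rng):
    """Return the 4 sensor gaps (m) with zero-mean Gaussian noise added."""
    true_gaps = np.asarray(true_gaps, dtype=np.float32)
    noise = rng.normal(0.0, SENSOR_NOISE_STD, size=true_gaps.shape)
    return (true_gaps + noise).astype(np.float32)


rng = np.random.default_rng(0)
# Order: [sensor_center_right, sensor_center_left, sensor_front, sensor_back]
obs = make_observation([0.010, 0.010, 0.010, 0.010], rng)
```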
### Force Application Physics
For each timestep:
1. **Measure yoke end gap heights** (from 4 yoke collision boxes)
2. **Average left/right ends** for each U-yoke:
- `avg_gap_front = (gap_front_L + gap_front_R) / 2`
- `avg_gap_back = (gap_back_L + gap_back_R) / 2`
3. **Calculate roll angle** from yoke end positions:
```python
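import numpy as np

y_distance = 2 * 0.0508  # m, spacing between the left (+Y) and right (-Y) yoke ends
roll_front = np.arctan((gap_front_R - gap_front_L) / y_distance)
roll_back = np.arctan((gap_back_R - gap_back_L) / y_distance)
roll = (roll_front + roll_back) / 2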
```
4. **Predict forces** using maglev_predictor:
```python
force_front, torque_front = predictor.predict(
    curr_front_L, curr_front_R, roll_deg, gap_front_mm
)
force_back, torque_back = predictor.predict(
    curr_back_L, curr_back_R, roll_deg, gap_back_mm
)
```
5. **Apply forces at Y=0** (center of each U-yoke):
- Front force at: `[+0.1259, 0, 0.08585]`
- Back force at: `[-0.1259, 0, 0.08585]`
6. **Apply roll torques** from each yoke independently
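
A small worked example may help with the units in steps 2 to 4. The sketch below averages the two end gaps of one yoke, computes roll from their difference, and converts both to the degrees and millimetres that the predictor's argument names (`roll_deg`, `gap_front_mm`) suggest it expects. The helper name `yoke_gap_and_roll` is hypothetical, and the lateral spacing is taken from the yoke end coordinates listed earlier (±0.0508 m).

```python
import math

Y_DISTANCE = 2 * 0.0508  # m, spacing between the left (+Y) and right (-Y) yoke ends


def yoke_gap_and_roll(gap_left, gap_right):
    """Return (average gap in m, roll in rad) for one U-yoke from its two end gaps."""
    avg_gap = (gap_left + gap_right) / 2
    roll = math.atan((gap_right - gap_left) / Y_DISTANCE)
    return avg_gap, roll


# A 1 mm left/right gap difference around a 10 mm nominal gap.
avg_gap, roll = yoke_gap_and_roll(gap_left=0.0095, gap_right=0.0105)
roll_deg = math.degrees(roll)  # about 0.56 degrees
gap_mm = avg_gap * 1000.0      # 10.0 mm
```

At these scales `arctan(x)` is nearly `x`, so a 1 mm imbalance across 101.6 mm gives roughly 9.8 mrad of roll.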
### Key Design Decisions
**Why 4 actions instead of 2?**
- Physical system has 4 independent electromagnets (one per yoke end)
- Allows fine control of roll torque
- Left/right current imbalance on each yoke creates torque
**Why sensor observations instead of yoke measurements?**
- Realistic: sensors are at different positions than yokes
- Adds partial observability challenge
- Agent must learn system dynamics to infer unmeasured states
- Sensor noise simulates real measurement uncertainty
**Why not include velocities in observation?**
- Agent can learn velocities from temporal sequence (frame stacking)
- Reduces observation dimensionality
- Tests if agent can learn dynamic behavior from gap measurements alone
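
To make the velocity argument concrete, here is a minimal sketch of the finite-difference estimate that a frame-stacked policy could implicitly learn. The timestep `DT` is a hypothetical value, since this document does not state the control rate, and `estimate_velocity` is an illustrative helper rather than part of the environment.

```python
import numpy as np

DT = 1.0 / 240.0         # s, hypothetical control timestep (not given in this document)
SENSOR_NOISE_STD = 1e-4  # m, sensor noise sigma from the Observation Space section


def estimate_velocity(gap_prev, gap_curr, dt=DT):
    """Backward-difference velocity estimate (m/s) from two consecutive gap readings."""
    return (np.asarray(gap_curr) - np.asarray(gap_prev)) / dt


# Differencing two independent noisy readings scales the noise by sqrt(2) / dt.
velocity_noise_std = np.sqrt(2.0) * SENSOR_NOISE_STD / DT
```

Under these assumptions a single two-frame difference carries about 0.034 m/s of noise, which is one reason stacking more than two frames can be useful.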
**Current sign convention:**
- No conversion needed - currents fed directly to predictor
- Range: -15A to +15A (from Ansys model)
- Coil RL circuit naturally produces currents in this range
### Comparison with Original Design
| Feature | Original | Updated |
|---------|----------|---------|
| **Actions** | 2 (left/right coils) | 4 (front_L, front_R, back_L, back_R) |
| **Observations** | 5 (gaps, roll, velocities) | 4 (noisy sensor gaps) |
| **Gap Measurement** | Direct yoke positions | Noisy sensor positions |
| **Force Application** | Front & back yoke centers | Front & back yoke centers ✓ |
| **Current Range** | Assumed negative only | -15A to +15A |
| **Roll Calculation** | From yoke end heights | From yoke end heights ✓ |
## Physics Pipeline (Per Timestep)
1. **Action → Currents**
```
PWM[4] → RL Circuit Model → Currents[4]
```
2. **State Measurement**
```
Yoke End Positions[4] → Gap Heights[4] → Average per Yoke[2]
```
3. **Roll Calculation**
```
(Gap_Right - Gap_Left) / Y_distance → Roll Angle
```
4. **Force Prediction**
```
(currL, currR, roll, gap) → Maglev Predictor → (force, torque)
Applied separately for front and back yokes
```
5. **Force Application**
```
Forces at Y=0 for each yoke + Roll torques
```
6. **Observation Generation**
```
Sensor Positions[4] → Gap Heights[4] → Add Noise → Observation[4]
```
## Info Dictionary
Each `env.step()` call returns an `info` dictionary with the following diagnostics:
```python
{
    'curr_front_L': float,    # Front left coil current (A)
    'curr_front_R': float,    # Front right coil current (A)
    'curr_back_L': float,     # Back left coil current (A)
    'curr_back_R': float,     # Back right coil current (A)
    'gap_front_yoke': float,  # Front yoke average gap (m)
    'gap_back_yoke': float,   # Back yoke average gap (m)
    'roll': float,            # Roll angle (rad)
    'force_front': float,     # Front yoke force (N)
    'force_back': float,      # Back yoke force (N)
    'torque_front': float,    # Front yoke torque (mN·m)
    'torque_back': float      # Back yoke torque (mN·m)
}
```
## Testing
Run the updated test script:
```bash
cd "/Users/adipu/Documents/lev_control_4pt_small/RL Testing"
/opt/miniconda3/envs/RLenv/bin/python test_env.py
```
Expected behavior:
- 4 sensors report gap heights with small noise variations
- Yoke gaps (in info) match sensor gaps approximately
- All 4 coils build up current over time (RL circuit dynamics)
- Forces should be roughly 50-100 N upward at a 10 mm gap with moderate currents
- The pod should begin to levitate once the total upward force exceeds its weight (5.8 kg × 9.81 m/s² ≈ 56.9 N)
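
One way to check the levitation condition during a test run is to compare the summed yoke forces from the `info` dictionary against the pod's weight. This is a sketch under two assumptions: that `force_front` and `force_back` are reported as positive-upward values in newtons, and that the example numbers below are purely illustrative rather than simulator output. The helper name `net_vertical_force` is hypothetical.

```python
G = 9.81        # m/s^2, gravitational acceleration
POD_MASS = 5.8  # kg, pod mass used in the weight estimate above


def net_vertical_force(info, mass=POD_MASS):
    """Total yoke force minus pod weight, in N; positive means net upward force."""
    weight = mass * G
    return info["force_front"] + info["force_back"] - weight


# Hypothetical info values for illustration only.
example_info = {"force_front": 30.0, "force_back": 30.0}
margin = net_vertical_force(example_info)  # 60 N of lift against about 56.9 N of weight
```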
## Next Steps for RL Training
1. **Frame Stacking**: Use 3-5 consecutive observations to give agent velocity information
```python
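from stable_baselines3.common.vec_env import DummyVecEnv, VecFrameStack
# VecFrameStack wraps a vectorized env; assumes LevPodEnv is imported and needs no arguments
env = DummyVecEnv([lambda: LevPodEnv()])
env = VecFrameStack(env, n_stack=4)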
```
2. **Algorithm Selection**: PPO or SAC recommended
- PPO: Good for continuous control, stable training
- SAC: Better sample efficiency, handles stochastic dynamics
3. **Reward Tuning**: Current reward weights may need adjustment based on training performance
4. **Curriculum Learning**: Start with smaller gap errors, gradually increase difficulty
5. **Domain Randomization**: Vary sensor noise, mass, etc. for robust policy
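
For the domain randomization step, per-episode parameters could be drawn along the following lines. This is only a sketch: the spreads (±10% mass, ±50% noise) and the function `sample_episode_params` are hypothetical choices, and how the sampled values are passed into `LevPodEnv` is left open because this document does not describe that interface.

```python
import numpy as np

NOMINAL_MASS = 5.8        # kg, from the Testing section
NOMINAL_NOISE_STD = 1e-4  # m, from the Observation Space section


def sample_episode_params(rng, mass_spread=0.10, noise_spread=0.50):
    """Draw per-episode mass and sensor-noise sigma uniformly around nominal values."""
    mass = NOMINAL_MASS * rng.uniform(1.0 - mass_spread, 1.0 + mass_spread)
    noise_std = NOMINAL_NOISE_STD * rng.uniform(1.0 - noise_spread, 1.0 + noise_spread)
    return {"mass": mass, "sensor_noise_std": noise_std}


rng = np.random.default_rng(42)
params = sample_episode_params(rng)  # call once at each environment reset
```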