built environment

2025-12-10 15:50:20 -06:00
parent f2ae33db8c
commit c74c086ef7
11 changed files with 1433 additions and 6 deletions
--- a/Testing/ENV_UPDATE.md
+++ b/Testing/ENV_UPDATE.md
@@ -0,0 +1,208 @@
+# Updated LevPodEnv - Physical System Clarification
+
+## System Architecture
+
+### Physical Configuration
+
+**Two U-Shaped Magnetic Yokes:**
+- **Front Yoke**: Located at X = +0.1259m
+  - Has two ends: Left (+Y = +0.0508m) and Right (-Y = -0.0508m)
+  - Force is applied at center: X = +0.1259m, Y = 0m
+  
+- **Back Yoke**: Located at X = -0.1259m
+  - Has two ends: Left (+Y = +0.0508m) and Right (-Y = -0.0508m)
+  - Force is applied at center: X = -0.1259m, Y = 0m
+
+**Four Independent Coil Currents:**
+1. `curr_front_L`: Current around front yoke's left (+Y) end
+2. `curr_front_R`: Current around front yoke's right (-Y) end
+3. `curr_back_L`: Current around back yoke's left (+Y) end
+4. `curr_back_R`: Current around back yoke's right (-Y) end
+
+**Current Range:** -15A to +15A (from Ansys CSV data)
+- Negative current: Strengthens permanent magnet field → stronger attraction
+- Positive current: Weakens permanent magnet field → weaker attraction
+
+### Collision Geometry in URDF
+
+**Yoke Ends (4 boxes):** Represent the tips of the U-yokes where gap is measured
+- Front Left: (+0.1259m, +0.0508m, +0.08585m)
+- Front Right: (+0.1259m, -0.0508m, +0.08585m)
+- Back Left: (-0.1259m, +0.0508m, +0.08585m)
+- Back Right: (-0.1259m, -0.0508m, +0.08585m)
+
+**Sensors (4 cylinders):** Physical gap sensors at different locations
+- Center Right: (0m, +0.0508m, +0.08585m)
+- Center Left: (0m, -0.0508m, +0.08585m)
+- Front: (+0.2366m, 0m, +0.08585m)
+- Back: (-0.2366m, 0m, +0.08585m)
+
+## RL Environment Interface
+
+### Action Space
+**Type:** `Box(4)`, Range: [-1, 1]
+
+**Actions:** `[pwm_front_L, pwm_front_R, pwm_back_L, pwm_back_R]`
+- PWM duty cycles for the 4 independent coils
+- Converted to currents via RL circuit model: `di/dt = (V_pwm - I*R) / L`
+
+### Observation Space
+**Type:** `Box(4)`, Range: [-inf, inf]
+
+**Observations:** `[sensor_center_right, sensor_center_left, sensor_front, sensor_back]`
+- **Noisy sensor readings** (not direct yoke measurements)
+- Noise: Gaussian with σ = 0.1mm (0.0001m)
+- Agent must learn system dynamics from sensor data alone
+- Velocities not directly provided - agent can learn from temporal sequence if needed
+
+### Force Application Physics
+
+For each timestep:
+
+1. **Measure yoke end gap heights** (from 4 yoke collision boxes)
+2. **Average left/right ends** for each U-yoke:
+   - `avg_gap_front = (gap_front_L + gap_front_R) / 2`
+   - `avg_gap_back = (gap_back_L + gap_back_R) / 2`
+
+3. **Calculate roll angle** from yoke end positions:
+   ```python
+   roll_front = arctan((gap_right - gap_left) / y_distance)
+   roll_back = arctan((gap_right - gap_left) / y_distance)
+   roll = (roll_front + roll_back) / 2
+   ```
+
+4. **Predict forces** using maglev_predictor:
+   ```python
+   force_front, torque_front = predictor.predict(
+       curr_front_L, curr_front_R, roll_deg, gap_front_mm
+   )
+   force_back, torque_back = predictor.predict(
+       curr_back_L, curr_back_R, roll_deg, gap_back_mm
+   )
+   ```
+
+5. **Apply forces at Y=0** (center of each U-yoke):
+   - Front force at: `[+0.1259, 0, 0.08585]`
+   - Back force at: `[-0.1259, 0, 0.08585]`
+
+6. **Apply roll torques** from each yoke independently
+
+### Key Design Decisions
+
+**Why 4 actions instead of 2?**
+- Physical system has 4 independent electromagnets (one per yoke end)
+- Allows fine control of roll torque
+- Left/right current imbalance on each yoke creates torque
+
+**Why sensor observations instead of yoke measurements?**
+- Realistic: sensors are at different positions than yokes
+- Adds partial observability challenge
+- Agent must learn system dynamics to infer unmeasured states
+- Sensor noise simulates real measurement uncertainty
+
+**Why not include velocities in observation?**
+- Agent can learn velocities from temporal sequence (frame stacking)
+- Reduces observation dimensionality
+- Tests if agent can learn dynamic behavior from gap measurements alone
+
+**Current sign convention:**
+- No conversion needed - currents fed directly to predictor
+- Range: -15A to +15A (from Ansys model)
+- Coil RL circuit naturally produces currents in this range
+
+### Comparison with Original Design
+
+| Feature | Original | Updated |
+|---------|----------|---------|
+| **Actions** | 2 (left/right coils) | 4 (front_L, front_R, back_L, back_R) |
+| **Observations** | 5 (gaps, roll, velocities) | 4 (noisy sensor gaps) |
+| **Gap Measurement** | Direct yoke positions | Noisy sensor positions |
+| **Force Application** | Front & back yoke centers | Front & back yoke centers ✓ |
+| **Current Range** | Assumed negative only | -15A to +15A |
+| **Roll Calculation** | From yoke end heights | From yoke end heights ✓ |
+
+## Physics Pipeline (Per Timestep)
+
+1. **Action → Currents**
+   ```
+   PWM[4] → RL Circuit Model → Currents[4]
+   ```
+
+2. **State Measurement**
+   ```
+   Yoke End Positions[4] → Gap Heights[4] → Average per Yoke[2]
+   ```
+
+3. **Roll Calculation**
+   ```
+   (Gap_Right - Gap_Left) / Y_distance → Roll Angle
+   ```
+
+4. **Force Prediction**
+   ```
+   (currL, currR, roll, gap) → Maglev Predictor → (force, torque)
+   Applied separately for front and back yokes
+   ```
+
+5. **Force Application**
+   ```
+   Forces at Y=0 for each yoke + Roll torques
+   ```
+
+6. **Observation Generation**
+   ```
+   Sensor Positions[4] → Gap Heights[4] → Add Noise → Observation[4]
+   ```
+
+## Info Dictionary
+
+Each `env.step()` returns comprehensive diagnostics:
+
+```python
+{
+    'curr_front_L': float,      # Front left coil current (A)
+    'curr_front_R': float,      # Front right coil current (A)
+    'curr_back_L': float,       # Back left coil current (A)
+    'curr_back_R': float,       # Back right coil current (A)
+    'gap_front_yoke': float,    # Front yoke average gap (m)
+    'gap_back_yoke': float,     # Back yoke average gap (m)
+    'roll': float,              # Roll angle (rad)
+    'force_front': float,       # Front yoke force (N)
+    'force_back': float,        # Back yoke force (N)
+    'torque_front': float,      # Front yoke torque (mN·m)
+    'torque_back': float        # Back yoke torque (mN·m)
+}
+```
+
+## Testing
+
+Run the updated test script:
+```bash
+cd "/Users/adipu/Documents/lev_control_4pt_small/RL Testing"
+/opt/miniconda3/envs/RLenv/bin/python test_env.py
+```
+
+Expected behavior:
+- 4 sensors report gap heights with small noise variations
+- Yoke gaps (in info) match sensor gaps approximately
+- All 4 coils build up current over time (RL circuit dynamics)
+- Forces should be ~50-100N upward at 10mm gap with moderate currents
+- Pod should begin to levitate if forces overcome gravity (5.8kg × 9.81 = 56.898 N needed)
+
+## Next Steps for RL Training
+
+1. **Frame Stacking**: Use 3-5 consecutive observations to give agent velocity information
+   ```python
+   from stable_baselines3.common.vec_env import VecFrameStack
+   env = VecFrameStack(env, n_stack=4)
+   ```
+
+2. **Algorithm Selection**: PPO or SAC recommended
+   - PPO: Good for continuous control, stable training
+   - SAC: Better sample efficiency, handles stochastic dynamics
+
+3. **Reward Tuning**: Current reward weights may need adjustment based on training performance
+
+4. **Curriculum Learning**: Start with smaller gap errors, gradually increase difficulty
+
+5. **Domain Randomization**: Vary sensor noise, mass, etc. for robust policy