adipu/guadaloop_lev_control

Fork 0

Files

pulipakaa24 c74c086ef7 built environment

2025-12-10 15:50:20 -06:00

7.1 KiB

Raw Blame History

Updated LevPodEnv - Physical System Clarification

System Architecture

Physical Configuration

Two U-Shaped Magnetic Yokes:

Front Yoke: Located at X = +0.1259m
- Has two ends: Left (+Y = +0.0508m) and Right (-Y = -0.0508m)
- Force is applied at center: X = +0.1259m, Y = 0m
Back Yoke: Located at X = -0.1259m
- Has two ends: Left (+Y = +0.0508m) and Right (-Y = -0.0508m)
- Force is applied at center: X = -0.1259m, Y = 0m

Four Independent Coil Currents:

curr_front_L: Current around front yoke's left (+Y) end
curr_front_R: Current around front yoke's right (-Y) end
curr_back_L: Current around back yoke's left (+Y) end
curr_back_R: Current around back yoke's right (-Y) end

Current Range: -15A to +15A (from Ansys CSV data)

Negative current: Strengthens permanent magnet field → stronger attraction
Positive current: Weakens permanent magnet field → weaker attraction

Collision Geometry in URDF

Yoke Ends (4 boxes): Represent the tips of the U-yokes where gap is measured

Front Left: (+0.1259m, +0.0508m, +0.08585m)
Front Right: (+0.1259m, -0.0508m, +0.08585m)
Back Left: (-0.1259m, +0.0508m, +0.08585m)
Back Right: (-0.1259m, -0.0508m, +0.08585m)

Sensors (4 cylinders): Physical gap sensors at different locations

Center Right: (0m, +0.0508m, +0.08585m)
Center Left: (0m, -0.0508m, +0.08585m)
Front: (+0.2366m, 0m, +0.08585m)
Back: (-0.2366m, 0m, +0.08585m)

RL Environment Interface

Action Space

Type: Box(4), Range: [-1, 1]

Actions: [pwm_front_L, pwm_front_R, pwm_back_L, pwm_back_R]

PWM duty cycles for the 4 independent coils
Converted to currents via RL circuit model: di/dt = (V_pwm - I*R) / L

Observation Space

Type: Box(4), Range: [-inf, inf]

Observations: [sensor_center_right, sensor_center_left, sensor_front, sensor_back]

Noisy sensor readings (not direct yoke measurements)
Noise: Gaussian with σ = 0.1mm (0.0001m)
Agent must learn system dynamics from sensor data alone
Velocities not directly provided - agent can learn from temporal sequence if needed

Force Application Physics

For each timestep:

Measure yoke end gap heights (from 4 yoke collision boxes)
Average left/right ends for each U-yoke:
- avg_gap_front = (gap_front_L + gap_front_R) / 2
- avg_gap_back = (gap_back_L + gap_back_R) / 2

Calculate roll angle from yoke end positions:

roll_front = arctan((gap_right - gap_left) / y_distance)
roll_back = arctan((gap_right - gap_left) / y_distance)
roll = (roll_front + roll_back) / 2

Predict forces using maglev_predictor:

force_front, torque_front = predictor.predict(
    curr_front_L, curr_front_R, roll_deg, gap_front_mm
)
force_back, torque_back = predictor.predict(
    curr_back_L, curr_back_R, roll_deg, gap_back_mm
)

Apply forces at Y=0 (center of each U-yoke):
- Front force at: [+0.1259, 0, 0.08585]
- Back force at: [-0.1259, 0, 0.08585]
Apply roll torques from each yoke independently

Key Design Decisions

Why 4 actions instead of 2?

Physical system has 4 independent electromagnets (one per yoke end)
Allows fine control of roll torque
Left/right current imbalance on each yoke creates torque

Why sensor observations instead of yoke measurements?

Realistic: sensors are at different positions than yokes
Adds partial observability challenge
Agent must learn system dynamics to infer unmeasured states
Sensor noise simulates real measurement uncertainty

Why not include velocities in observation?

Agent can learn velocities from temporal sequence (frame stacking)
Reduces observation dimensionality
Tests if agent can learn dynamic behavior from gap measurements alone

Current sign convention:

No conversion needed - currents fed directly to predictor
Range: -15A to +15A (from Ansys model)
Coil RL circuit naturally produces currents in this range

Comparison with Original Design

Feature	Original	Updated
Actions	2 (left/right coils)	4 (front_L, front_R, back_L, back_R)
Observations	5 (gaps, roll, velocities)	4 (noisy sensor gaps)
Gap Measurement	Direct yoke positions	Noisy sensor positions
Force Application	Front & back yoke centers	Front & back yoke centers ✓
Current Range	Assumed negative only	-15A to +15A
Roll Calculation	From yoke end heights	From yoke end heights ✓

Physics Pipeline (Per Timestep)

Action → Currents

PWM[4] → RL Circuit Model → Currents[4]

State Measurement

Yoke End Positions[4] → Gap Heights[4] → Average per Yoke[2]

Roll Calculation

(Gap_Right - Gap_Left) / Y_distance → Roll Angle

Force Prediction

(currL, currR, roll, gap) → Maglev Predictor → (force, torque)
Applied separately for front and back yokes

Force Application

Forces at Y=0 for each yoke + Roll torques

Observation Generation

Sensor Positions[4] → Gap Heights[4] → Add Noise → Observation[4]

Info Dictionary

Each env.step() returns comprehensive diagnostics:

{
    'curr_front_L': float,      # Front left coil current (A)
    'curr_front_R': float,      # Front right coil current (A)
    'curr_back_L': float,       # Back left coil current (A)
    'curr_back_R': float,       # Back right coil current (A)
    'gap_front_yoke': float,    # Front yoke average gap (m)
    'gap_back_yoke': float,     # Back yoke average gap (m)
    'roll': float,              # Roll angle (rad)
    'force_front': float,       # Front yoke force (N)
    'force_back': float,        # Back yoke force (N)
    'torque_front': float,      # Front yoke torque (mN·m)
    'torque_back': float        # Back yoke torque (mN·m)
}

Testing

Run the updated test script:

cd "/Users/adipu/Documents/lev_control_4pt_small/RL Testing"
/opt/miniconda3/envs/RLenv/bin/python test_env.py

Expected behavior:

4 sensors report gap heights with small noise variations
Yoke gaps (in info) match sensor gaps approximately
All 4 coils build up current over time (RL circuit dynamics)
Forces should be ~50-100N upward at 10mm gap with moderate currents
Pod should begin to levitate if forces overcome gravity (5.8kg × 9.81 = 56.898 N needed)

Next Steps for RL Training

Frame Stacking: Use 3-5 consecutive observations to give agent velocity information

from stable_baselines3.common.vec_env import VecFrameStack
env = VecFrameStack(env, n_stack=4)

Algorithm Selection: PPO or SAC recommended
- PPO: Good for continuous control, stable training
- SAC: Better sample efficiency, handles stochastic dynamics
Reward Tuning: Current reward weights may need adjustment based on training performance
Curriculum Learning: Start with smaller gap errors, gradually increase difficulty
Domain Randomization: Vary sensor noise, mass, etc. for robust policy

7.1 KiB Raw Blame History Unescape Escape