* clean up achievements; fix value accrual time; report flows better
* use pause, remove value accrual time
* make clients sleep correct time, add more speed and pausing methods to instance, add tests
* server adminlist
* clean up code, add more Instance methods, render pause message, tests passing
* add tests for elapsed ticks
* fix run_eval
* game control
* tests
* tests
* task info
* game control and medium electric poles
* change prints, max achieved throughput
* sessions based
* try out caching + no sleep
* update fixture usage
* better reset usage
* state less on tech, probably breaking change
* better fixtures + decouple resets
* use pytest-xdist with 2 servers
* using diff grouping for dep
* formatting
* formatting
* caching for image
* formatting
* formatting
* use uv
* use uv caching
* remove docker caching (it's slower)
* how about 4 workers?
* no redundant resets
* parameterize
* change names
* update all_technologies_researched usage
change log:
- used uv and cache dependencies
- used 2 factorio headless server instances
- added pytest-xdist & used 2 pytest workers
- parametrized the slowest test, `test_sleep.py`, to balance it across workers
- clarified resets in `instance.py` so separate instances aren't needed for research testing
- better fixture usage, with autouse reset
- added a configure_game callback for per-test-file setup of inventories and research state
- updated the task ABC's all_technologies_researched usage; it's now a parameter for reset
- using 4 workers instead of 2, can probably double it again lol
- pytest-parametrized a slow test
- fixed redundant reset in conftest
final speedup: 9m 4s -> 1m, ≈9.07× faster
merging now because main is broken without it.
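The speedup figure above can be checked with a few lines of arithmetic. This is only a sanity check of the reported numbers; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes/seconds wall-clock duration to seconds."""
    return minutes * 60 + seconds


before = to_seconds(9, 4)  # test suite duration before the change (9m 4s)
after = to_seconds(1)      # test suite duration after the change (1m)
speedup = before / after

print(f"{before}s -> {after}s, about {speedup:.2f}x faster")
```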
* registry.py changes to dataclass
* Flatten JSON task definitions and update registry
- Remove config wrapper from all task definition JSON files
- Move all config fields to top level alongside task_type and num_agents
- Update registry.py to read flattened structure
- Applied to lab_play/, multiagent/, and unbounded/ directories
* Fix remaining config reference in get_environment_info
- Update get_environment_info to use flattened task_data structure
- Remove reference to task_data['config'] which no longer exists
* Fix TaskFactory to work with flattened JSON structure
- Remove dependency on config wrapper in task JSON files
- Extract task config by filtering out task_type and num_agents
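A minimal sketch of the flattened layout described above, assuming the task config is simply every top-level key other than `task_type` and `num_agents`. The function name `split_task_definition`, the default of one agent, and the sample fields `quota` and `throughput_entity` are illustrative assumptions, not the actual registry code.

```python
import json
from typing import Any

# The two keys the commit names as living alongside the config fields.
RESERVED_KEYS = {"task_type", "num_agents"}


def split_task_definition(task_data: dict[str, Any]) -> tuple[str, int, dict[str, Any]]:
    """Split a flattened task definition into (task_type, num_agents, config)."""
    task_type = task_data["task_type"]  # a missing task_type raises KeyError
    num_agents = task_data.get("num_agents", 1)  # default of 1 is an assumption
    config = {k: v for k, v in task_data.items() if k not in RESERVED_KEYS}
    return task_type, num_agents, config


# Hypothetical flattened task file: no "config" wrapper, everything at top level.
raw = json.loads(
    '{"task_type": "throughput", "num_agents": 1,'
    ' "quota": 16, "throughput_entity": "iron-plate"}'
)
task_type, num_agents, config = split_task_definition(raw)
print(task_type, num_agents, config)
```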
* Aug 14, 2025 at 13:15
* retain scope
* undo changes
* add back dataclass
* split scopes
* checkpoint
* intermediate
* more changes
* Aug 20, 2025 at 18:13
* model_dump
* Aug 20, 2025 at 18:27
* task_type
* first iteration
* change to support openai api endpoints
* Refactor APIFactory to use OpenAI-compatible endpoints
- Unified all providers to use OpenAI client format
- Eliminated provider-specific conditional branches
- Simplified provider detection using dict ordering
- Removed unused parameters and added missing return
- 90% reduction in code complexity
* Further simplify APIFactory
- Remove redundant MODELS_WITH_IMAGE_SUPPORT array
- Use provider config supports_images instead
- Inline _prepare_messages logic
- Extract _get_reasoning_length helper
- Add missing default return
- 20+ line reduction while maintaining functionality
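One way the "provider detection using dict ordering" and per-provider `supports_images` ideas above could look is sketched below. Everything here is hypothetical: the `ProviderConfig` fields other than `supports_images`, the marker strings, the placeholder URLs, and the environment variable names are assumptions for illustration, not the real APIFactory table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderConfig:
    base_url: str                  # OpenAI-compatible endpoint (placeholder values below)
    api_key_env: str               # name of the env var holding the key (hypothetical)
    supports_images: bool = False  # replaces a separate list of image-capable models


# Python dicts preserve insertion order, so the first matching marker wins.
# More specific markers go first: "open-router/gpt-x" must not be caught by "gpt".
PROVIDERS: dict[str, ProviderConfig] = {
    "open-router": ProviderConfig("https://router.example.invalid/v1", "ROUTER_API_KEY", True),
    "claude": ProviderConfig("https://claude.example.invalid/v1", "CLAUDE_API_KEY", True),
    "gpt": ProviderConfig("https://gpt.example.invalid/v1", "GPT_API_KEY", False),
}
DEFAULT_PROVIDER = PROVIDERS["gpt"]  # fallback choice is an assumption


def detect_provider(model: str) -> ProviderConfig:
    """Return the config of the first provider whose marker appears in the model name."""
    for marker, config in PROVIDERS.items():
        if marker in model:
            return config
    return DEFAULT_PROVIDER  # explicit default return instead of falling through to None


print(detect_provider("open-router/gpt-x").base_url)
```

Because the lookup is a single ordered scan, adding a provider means adding one dict entry rather than another conditional branch.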
* remove comment
* Inline reasoning length logic
- Remove _get_reasoning_length helper method
- Inline reasoning effort logic in o1/o3 handling
- Keep code simpler and more direct
* add provider sorting for openrouter to get fastest throughput
* add nitro
* add usage tracking
* usage
* undo changes that added logging
* update config paths
* remove offset
* offset
* Aug 20, 2025 at 20:25
* fix run_idx port offset
* make sure a KeyError is raised if there is no port
* fix
* fix: remove duplicate burner mining drill line from crafting statistics
* fix: Remove duplicate 'Useful statistics' from task descriptions
- Statistics already included in system prompt from instance.get_system_prompt()
- Removed CRAFTING_STATISTICS duplication from task goal_description
- Cleaned up unused include_stats parameter in UnboundedThroughputTask
* fix: Remove goal description duplication and restore useful statistics
- Remove duplicate {goal_description} from GYM_AGENT_INSTRUCTIONS template
- Add CRAFTING_STATISTICS to agent.md for system prompt inclusion
- Goal now appears only once in Task section, not duplicated in Instructions
- Statistics are properly included in the system prompt via agent.md
* refactor changes
* add back system prompt
* replace goal description in proper place
* remove redundant goal statement
* whitespace
* move rstrip to after string generation
* feat: Add modular system prompt architecture
Create flexible component-based system prompt generation allowing
agent designs to customize prompts based on specific needs.
Key Features:
• SystemPromptBuilder for flexible prompt composition
• Component-based architecture (task, stats, constraints, patterns)
• Agent-specific optimizations (from a minimal ~100-character prompt to a comprehensive one)
• Separation of task logic from prompt generation
• Backward compatibility with existing systems
Components:
- TaskDefinitionComponent: Task-specific instructions
- ProductionStatisticsComponent: Crafting/production rates
- ResponseFormatComponent: Different formats (Gym, MCTS, custom)
- MultiAgentComponent: Coordination instructions when needed
- ImplementationPatternsComponent: Code examples
- ConstraintsComponent: Behavioral rules and limitations
- APIReferenceComponent: Method docs (full or summary)
Changes:
- Enhanced ThroughputTask with build_system_prompt()
- Enhanced FactorioInstance with get_api_documentation()
- New fle.env.system_prompt package with builders and examples
Addresses need for customizable system prompts based on agent design,
similar to how observations can be tailored per agent type.
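The component-based composition described above might be sketched as follows. The class names `SystemPromptBuilder`, `TaskDefinitionComponent`, and `ConstraintsComponent` come from the commit message, but the `render()`, `add()`, and `build()` methods, the field names, and the section headings are assumptions made for this example rather than the actual fle.env.system_prompt API.

```python
from dataclasses import dataclass, field
from typing import Protocol


class PromptComponent(Protocol):
    """Anything that can render itself into a section of the system prompt."""

    def render(self) -> str: ...


@dataclass
class TaskDefinitionComponent:
    goal: str

    def render(self) -> str:
        return f"## Task\n{self.goal}"


@dataclass
class ConstraintsComponent:
    rules: list[str]

    def render(self) -> str:
        return "## Constraints\n" + "\n".join(f"- {rule}" for rule in self.rules)


@dataclass
class SystemPromptBuilder:
    components: list[PromptComponent] = field(default_factory=list)

    def add(self, component: PromptComponent) -> "SystemPromptBuilder":
        self.components.append(component)
        return self  # returning self allows chained calls

    def build(self) -> str:
        # Sections are joined with a blank line; trailing whitespace is stripped last.
        return "\n\n".join(c.render() for c in self.components).rstrip()


# A minimal agent only needs the task; a fuller agent adds more components.
minimal = SystemPromptBuilder().add(TaskDefinitionComponent("Produce iron plates")).build()
full = (
    SystemPromptBuilder()
    .add(TaskDefinitionComponent("Produce iron plates"))
    .add(ConstraintsComponent(["Do not remove existing entities"]))
    .build()
)
print(full)
```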
* gpt-5
* set_speed
* from tests
* set_speed
* gym registry: Uses instance_id parameter and direct indexing: tcp_ports[instance_id]
run_eval: Passes instance_id=run_idx to gym.make()
config: Added instance_id field to track which container to use
* Fix RCON client disconnection by eliminating duplicate gym.make() calls
- **Root Cause**: Two gym.make() calls were creating separate FactorioInstance objects
  trying to connect to the same container, causing RCON conflicts
- **Problem**:
  - Main process: gym.make() → creates FactorioInstance → connects to container
  - Subprocess: gym.make() → creates ANOTHER FactorioInstance → conflicts!
- **Solution**: Eliminate main process gym.make() by:
  - Getting task directly via TaskFactory.create_task()
  - Generating system prompts via SystemPromptGenerator
  - Only subprocess creates gym environment with correct instance_id
- **Changes**:
  - registry.py: Added instance_id parameter to make_factorio_env()
  - run_eval.py: Removed main process gym.make(), kept subprocess gym.make()
  - config.py: Added instance_id field to track container mapping
- **Result**: Each subprocess now connects to its own container without conflicts
  - run_idx=0 → container 0 (port 27000)
  - run_idx=1 → container 1 (port 27001)
  - run_idx=2 → container 2 (port 27002)
Fixes RCON disconnection errors in multi-container gym environments.
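The run_idx to container mapping listed above amounts to direct indexing into the list of discovered ports. A minimal sketch, assuming ports are contiguous from 27000 as in the example; the function name `select_port` is hypothetical, and no real RCON connection is made here.

```python
BASE_RCON_PORT = 27000  # first port in the mapping listed above


def select_port(tcp_ports: list[int], instance_id: int) -> int:
    """Pick the container port for one run by direct indexing (no wrap-around)."""
    return tcp_ports[instance_id]


# Three discovered containers, as in the example mapping.
tcp_ports = [BASE_RCON_PORT + i for i in range(3)]

# Each run_idx is passed through as instance_id, so every run gets its own port.
assignments = {run_idx: select_port(tcp_ports, run_idx) for run_idx in range(3)}
print(assignments)
```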
* Remove Path usage and eliminate redundant multiagent instructions
- **Path removal**: Replace Path() with os.path.join() in run_eval.py
  - File path now resolves to: /home/kian/factorio-learning-environment/fle
  - Eliminates Path dependency as requested
- **Redundancy fix**: Remove duplicate multiagent instructions from run_eval.py
  - run_eval.py was duplicating the same multiagent logic as instance.py
  - Now uses basic generator.generate('') in main process
  - Proper agent-specific system prompts handled by instance.get_system_prompt(agent_idx)
  - Eliminates code duplication between run_eval.py and instance.py
- **Result**: Cleaner code with single source of truth for multiagent instructions
* Fix outdated env/src paths in MCP protocol files
- Problem: MCP files were using non-existent path parent/env/src
This resolved to fle/env/protocols/env/src which does not exist
- Root cause: Legacy path structure assumption
Current: fle/env/ contains tools/, instance.py, etc.
Old assumption: fle/env/src/ never existed
- Solution: Update all MCP files to use correct path parent.parent
Before: Path(...).parent / env / src → fle/env/protocols/env/src ❌
After: Path(...).parent.parent → fle/env ✅
- Files fixed:
resources.py: 2 instances fixed
tools.py: 2 instances fixed
unix_tools.py: 2 instances fixed
Removed obsolete env.src. string replacement
- Verification: All paths now correctly point to fle/env/ with tools/ and instance.py
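The before/after path resolution above can be reproduced with `pathlib`. The commit elides the argument to `Path(...)`, so the starting location below (`fle/env/protocols/_mcp`) is a hypothetical stand-in chosen only so that its parent is `fle/env/protocols`; pure paths are used so nothing touches the filesystem.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the elided Path(...) argument in the commit message.
start = PurePosixPath("fle/env/protocols/_mcp")

# Before: one level up, then into env/src, which never existed under protocols/.
old_path = start.parent / "env" / "src"

# After: two levels up lands on fle/env, which holds tools/ and instance.py.
new_path = start.parent.parent

print(old_path)
print(new_path)
```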
* Add container mapping debug prints for 7-env test verification
- Container Discovery: Shows all discovered containers with IPs and ports
- Main Process: Logs each run_idx → instance_id assignment
- Subprocess: Verifies gym.make() uses correct instance_id
- Registry: Shows which container is selected for each instance_id
- Instance: Confirms actual RCON connection details
Debug output will show:
🐳 CONTAINER DISCOVERY: Found X containers
🔍 Container details: Container 0: ip:port
🚀 MAIN PROCESS: Starting run_idx=X with instance_id=X
🎯 SUBPROCESS X: Creating gym environment with instance_id=X
🏭 REGISTRY: Creating FactorioInstance for instance_id=X
📡 REGISTRY: Selecting container X: ip:port
🔌 INSTANCE: Successfully connected to ip at tcp/port
✅ SUBPROCESS X: Connected to ip:port
This will verify the fix for RCON conflicts across 7 parallel environments.
* Container selection fixes for multi-terminal runs
- Add --instance_offset CLI flag (or FLE_INSTANCE_OFFSET env) to shift instance_id per terminal
- Normalize instance_id modulo number of containers inside registry (supports any offset)
- Keep detailed debug prints for discovery, selection, and connection
This ensures parallel runs across terminals map to distinct containers.
* Make container selection explicit via instance_id offset
- Remove automatic modulo normalization in registry
- Require valid instance_id; raise if out of range
- Keep --instance_offset (and FLE_INSTANCE_OFFSET) to compose instance_id = run_idx + offset
- Debug prints reflect explicit selection
This matches previous trajectory runner behavior and avoids unintended cross-terminal overlap.
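The explicit selection rule above (instance_id = run_idx + offset, with no modulo wrap-around) could be sketched like this. The function name `resolve_instance_id` and the choice of `ValueError` are assumptions; the commit only says to raise when the id is out of range.

```python
import os
from typing import Mapping, Optional


def resolve_instance_id(
    run_idx: int,
    num_containers: int,
    offset: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Compose instance_id = run_idx + offset and reject out-of-range values."""
    env = os.environ if env is None else env
    if offset is None:
        # Fall back to the environment variable named in the commit; 0 if unset.
        offset = int(env.get("FLE_INSTANCE_OFFSET", "0"))
    instance_id = run_idx + offset
    if not 0 <= instance_id < num_containers:
        # Exception type is an assumption; the point is to fail loudly, not wrap.
        raise ValueError(
            f"instance_id {instance_id} out of range for {num_containers} containers"
        )
    return instance_id


# Terminal A runs indices 0..3 with offset 0; terminal B runs them with offset 4.
print([resolve_instance_id(i, 8, offset=0) for i in range(4)])
print([resolve_instance_id(i, 8, offset=4) for i in range(4)])
```

Failing on an out-of-range id, rather than wrapping with modulo, keeps two terminals from silently landing on the same container.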
* Centralize CLI parsing in fle/run.py
- Refactor run_eval.main to accept params (config_path, offset) and remove argparse
- Extend fle/run.py to parse --instance_offset and pass to run_eval
- Keep defaults for direct invocation of run_eval
This consolidates argument parsing in a single entrypoint as requested.
* Fix CLI offset parsing: use --offset in fle/run.py to pass to run_eval
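A rough sketch of the single-entrypoint arrangement described above: the CLI parses `--offset` once and hands plain parameters to the evaluation entry function. The `--config` flag name, the `prog` value, and the stub body of `run_eval_main` are hypothetical; only `--offset` and the `(config_path, offset)` parameters come from the commit messages.

```python
import argparse
from typing import Optional, Sequence


def run_eval_main(config_path: str, offset: int = 0) -> dict:
    """Stand-in for run_eval.main: takes parameters, does no argument parsing."""
    return {"config_path": config_path, "offset": offset}


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="fle")
    parser.add_argument("--config", required=True, help="path to the run config")
    parser.add_argument("--offset", type=int, default=0, help="instance_id offset")
    return parser.parse_args(argv)


def main(argv: Optional[Sequence[str]] = None) -> dict:
    args = parse_args(argv)
    # All CLI handling stays here; run_eval_main only sees typed values.
    return run_eval_main(config_path=args.config, offset=args.offset)


result = main(["--config", "run_config.json", "--offset", "4"])
print(result)
```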
* remove fluff
* remove fluff
* remove fluff
* put back things removed by mistake
* mcp: use importlib.resources.files('fle')/env instead of __file__-based execution_path; aligns with pkg-aware path used in run_eval and CLI
* run_eval: include multi-agent instructions in SystemPromptGenerator input to match instance.get_system_prompt
* unify system prompt construction: add SystemPromptGenerator.generate_for_agent(agent_idx, num_agents); use in instance.get_system_prompt and run_eval
* paths: one-line importlib.resources.files('fle')/env; unix_tools: pkg-aware tools base; instance.get_system_prompt uses generate_for_agent
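The package-aware lookup mentioned above relies on `importlib.resources.files`, which locates a package wherever it is installed instead of walking up from `__file__`. Since the `fle` package may not be importable here, the sketch below uses the standard-library `email` package and its `mime` subpackage purely as a stand-in; the helper name `package_subdir` is hypothetical.

```python
from importlib.resources import files


def package_subdir(package: str, subdir: str):
    """Return a Traversable for a subdirectory of an installed package."""
    return files(package) / subdir


# Stand-in for files('fle') / 'env': any installed package works the same way.
mime_dir = package_subdir("email", "mime")
print(mime_dir.name, mime_dir.is_dir())
```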
* patch
* num_agents
* Add external server support to gym interface via environment variables
Summary:
This PR adds support for external Factorio servers to the gym interface by checking environment variables before falling back to local Docker containers.
Problem
Currently, the gym interface (gym.make()) only supports local Docker containers through get_local_container_ips(). However, the underlying FactorioInstance class already supports external servers. This limitation prevents users from:
- Using cloud-hosted or remote Factorio servers
- Sharing servers between multiple users/researchers
- Running in environments where Docker isn't available
- Using dedicated Factorio servers with better performance
Solution
Modified make_factorio_env() to check for environment variables:
- FACTORIO_SERVER_ADDRESS: External server IP/hostname
- FACTORIO_SERVER_PORT: External server RCON port (default: 27000)
If these are set, the gym interface connects to the external server. Otherwise, it falls back to the existing local container behavior.
Changes:
- Added import os to enable environment variable access
- Modified make_factorio_env() to check for FACTORIO_SERVER_ADDRESS and FACTORIO_SERVER_PORT
- Added informative print statements showing which server type is being used
- Maintained full backward compatibility - existing code continues to work unchanged
Usage
# Using external server
export FACTORIO_SERVER_ADDRESS=your-server-hostname
export FACTORIO_SERVER_PORT=27000
python your_script.py # gym.make() now uses external server
# Using local containers (default behavior)
python your_script.py # Works exactly as before
Notes for Reviewers
- The `FactorioInstance` class already has full support for external servers, so this PR simply exposes that capability through the gym interface
- Environment variable names were chosen to be clear and consistent with common practices
- Print statements help users verify which server type is being used
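The environment-variable check described in this PR could look roughly like the sketch below. The variable names and the default port of 27000 come from the description; the function name `resolve_external_server` and its return shape are assumptions, and the real make_factorio_env() may structure this differently.

```python
import os
from typing import Mapping, Optional

DEFAULT_RCON_PORT = 27000  # default stated in the PR description


def resolve_external_server(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[tuple[str, int]]:
    """Return (address, port) for an external server, or None to use local containers."""
    env = os.environ if env is None else env
    address = env.get("FACTORIO_SERVER_ADDRESS")
    if not address:
        return None  # caller falls back to local Docker container discovery
    port = int(env.get("FACTORIO_SERVER_PORT", DEFAULT_RCON_PORT))
    return address, port


server = resolve_external_server({"FACTORIO_SERVER_ADDRESS": "your-server-hostname"})
if server is None:
    print("Using local containers")
else:
    print(f"Using external server at {server[0]}:{server[1]}")
```

Passing the environment in as a mapping keeps the check easy to test without mutating `os.environ`.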
* fix: resolve lint errors in registry.py
Fixed import ordering (stdlib → third-party → local) and split long print statement to comply with project's linting rules. No functional changes.
* fix 2: resolve lint errors in registry.py
missed one line and fixed it
* fix 3: resolve lint errors in registry.py
The key change is on lines 147-149, where the print statement is now split across multiple lines exactly as the formatter wants.
* Update registry.py