* fix: remove duplicate burner mining drill line from crafting statistics
* fix: Remove duplicate 'Useful statistics' from task descriptions
- Statistics already included in system prompt from instance.get_system_prompt()
- Removed CRAFTING_STATISTICS duplication from task goal_description
- Cleaned up unused include_stats parameter in UnboundedThroughputTask
* fix: Remove goal description duplication and restore useful statistics
- Remove duplicate {goal_description} from GYM_AGENT_INSTRUCTIONS template
- Add CRAFTING_STATISTICS to agent.md for system prompt inclusion
- Goal now appears only once in Task section, not duplicated in Instructions
- Statistics are properly included in the system prompt via agent.md
* refactor changes
* add back system prompt
* replace goal description in proper place
* remove redundant goal statement
* whitespace
* move rstrip to after string generation
* feat: Add modular system prompt architecture
Create flexible component-based system prompt generation allowing
agent designs to customize prompts based on specific needs.
Key Features:
• SystemPromptBuilder for flexible prompt composition
• Component-based architecture (task, stats, constraints, patterns)
• Agent-specific optimizations (minimal 100 chars to comprehensive)
• Separation of task logic from prompt generation
• Backward compatibility with existing systems
Components:
- TaskDefinitionComponent: Task-specific instructions
- ProductionStatisticsComponent: Crafting/production rates
- ResponseFormatComponent: Different formats (Gym, MCTS, custom)
- MultiAgentComponent: Coordination instructions when needed
- ImplementationPatternsComponent: Code examples
- ConstraintsComponent: Behavioral rules and limitations
- APIReferenceComponent: Method docs (full or summary)
Changes:
- Enhanced ThroughputTask with build_system_prompt()
- Enhanced FactorioInstance with get_api_documentation()
- New fle.env.system_prompt package with builders and examples
Addresses need for customizable system prompts based on agent design,
similar to how observations can be tailored per agent type.
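A minimal sketch of the component-based composition described above. The interface below (`PromptComponent`, `render`, `add`, `build`) and the field names are illustrative assumptions, not the actual `fle.env.system_prompt` API; only the component and builder names echo the list above.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class PromptComponent(Protocol):
    """Hypothetical interface: each component renders one prompt section."""

    def render(self) -> str: ...


@dataclass
class TaskDefinitionComponent:
    goal: str

    def render(self) -> str:
        return f"## Task\n{self.goal}"


@dataclass
class ConstraintsComponent:
    rules: List[str]

    def render(self) -> str:
        return "## Constraints\n" + "\n".join(f"- {rule}" for rule in self.rules)


@dataclass
class SystemPromptBuilder:
    components: List[PromptComponent] = field(default_factory=list)

    def add(self, component: PromptComponent) -> "SystemPromptBuilder":
        self.components.append(component)
        return self

    def build(self) -> str:
        # Join the non-empty sections and strip trailing whitespace once,
        # after the full string has been generated.
        sections = [component.render() for component in self.components]
        return "\n\n".join(s for s in sections if s).rstrip()


prompt = (
    SystemPromptBuilder()
    .add(TaskDefinitionComponent("Build an iron plate factory."))
    .add(ConstraintsComponent(["Do not remove existing entities."]))
    .build()
)
print(prompt)
```

An agent that needs only a minimal prompt would add fewer components; the task logic itself stays outside the builder.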
* gpt-5
* set_speed
* from tests
* set_speed
* gym registry: Uses instance_id parameter and direct indexing: tcp_ports[instance_id]
run_eval: Passes instance_id=run_idx to gym.make()
config: Added instance_id field to track which container to use
* Fix RCON client disconnection by eliminating duplicate gym.make() calls
- **Root Cause**: Two gym.make() calls were creating separate FactorioInstance objects
trying to connect to the same container, causing RCON conflicts
- **Problem**:
- Main process: gym.make() → creates FactorioInstance → connects to container
- Subprocess: gym.make() → creates ANOTHER FactorioInstance → conflicts!
- **Solution**: Eliminate main process gym.make() by:
- Getting task directly via TaskFactory.create_task()
- Generating system prompts via SystemPromptGenerator
- Only subprocess creates gym environment with correct instance_id
- **Changes**:
- registry.py: Added instance_id parameter to make_factorio_env()
- run_eval.py: Removed main process gym.make(), kept subprocess gym.make()
- config.py: Added instance_id field to track container mapping
- **Result**: Each subprocess now connects to its own container without conflicts
- run_idx=0 → container 0 (port 27000)
- run_idx=1 → container 1 (port 27001)
- run_idx=2 → container 2 (port 27002)
Fixes RCON disconnection errors in multi-container gym environments.
* Remove Path usage and eliminate redundant multiagent instructions
- **Path removal**: Replace Path() with os.path.join() in run_eval.py
- File path now resolves to: /home/kian/factorio-learning-environment/fle
- Eliminates Path dependency as requested
- **Redundancy fix**: Remove duplicate multiagent instructions from run_eval.py
- run_eval.py was duplicating the same multiagent logic as instance.py
- Now uses basic generator.generate('') in main process
- Proper agent-specific system prompts handled by instance.get_system_prompt(agent_idx)
- Eliminates code duplication between run_eval.py and instance.py
- **Result**: Cleaner code with single source of truth for multiagent instructions
* Fix outdated env/src paths in MCP protocol files
- Problem: MCP files were using non-existent path parent/env/src
This resolved to fle/env/protocols/env/src which does not exist
- Root cause: Legacy path structure assumption
Current: fle/env/ contains tools/, instance.py, etc.
Old assumption: fle/env/src/ never existed
- Solution: Update all MCP files to use correct path parent.parent
Before: Path(...).parent / env / src → fle/env/protocols/env/src ❌
After: Path(...).parent.parent → fle/env ✅
- Files fixed:
resources.py: 2 instances fixed
tools.py: 2 instances fixed
unix_tools.py: 2 instances fixed
Removed obsolete env.src. string replacement
- Verification: All paths now correctly point to fle/env/ with tools/ and instance.py
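The before/after resolution can be checked with `pathlib` alone. In this sketch `anchor` is a hypothetical stand-in for the elided `Path(...)` expression; it is chosen only so that its parent is `fle/env/protocols`, as the message above states.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the elided Path(...) in the MCP files.
anchor = PurePosixPath("fle/env/protocols/example")

# Old expression: appends env/src below the protocols directory.
before = anchor.parent / "env" / "src"

# New expression: walks up two levels to the env package itself.
after = anchor.parent.parent

print(before)
print(after)
```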
* Add container mapping debug prints for 7-env test verification
- Container Discovery: Shows all discovered containers with IPs and ports
- Main Process: Logs each run_idx → instance_id assignment
- Subprocess: Verifies gym.make() uses correct instance_id
- Registry: Shows which container is selected for each instance_id
- Instance: Confirms actual RCON connection details
Debug output will show:
🐳 CONTAINER DISCOVERY: Found X containers
🔍 Container details: Container 0: ip:port
🚀 MAIN PROCESS: Starting run_idx=X with instance_id=X
🎯 SUBPROCESS X: Creating gym environment with instance_id=X
🏭 REGISTRY: Creating FactorioInstance for instance_id=X
📡 REGISTRY: Selecting container X: ip:port
🔌 INSTANCE: Successfully connected to ip at tcp/port
✅ SUBPROCESS X: Connected to ip:port
This will verify the fix for RCON conflicts across 7 parallel environments.
* Container selection fixes for multi-terminal runs
- Add --instance_offset CLI flag (or FLE_INSTANCE_OFFSET env) to shift instance_id per terminal
- Normalize instance_id modulo number of containers inside registry (supports any offset)
- Keep detailed debug prints for discovery, selection, and connection
This ensures parallel runs across terminals map to distinct containers.
* Make container selection explicit via instance_id offset
- Remove automatic modulo normalization in registry
- Require valid instance_id; raise if out of range
- Keep --instance_offset (and FLE_INSTANCE_OFFSET) to compose instance_id = run_idx + offset
- Debug prints reflect explicit selection
This matches previous trajectory runner behavior and avoids unintended cross-terminal overlap.
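A minimal sketch of the explicit selection rule, assuming a helper named `resolve_instance_id` (a hypothetical name; the registry code itself is not shown in this log). It composes `instance_id = run_idx + offset` and raises instead of wrapping with modulo.

```python
import os
from typing import Optional


def resolve_instance_id(
    run_idx: int, num_containers: int, offset: Optional[int] = None
) -> int:
    """Compose instance_id = run_idx + offset and reject out-of-range values."""
    if offset is None:
        # Fall back to the environment variable named in the commit message.
        offset = int(os.environ.get("FLE_INSTANCE_OFFSET", "0"))
    instance_id = run_idx + offset
    if not 0 <= instance_id < num_containers:
        # No modulo wrap-around: an invalid id is an error, not a silent remap.
        raise ValueError(
            f"instance_id {instance_id} out of range for {num_containers} containers"
        )
    return instance_id


# Two terminals sharing four containers: offsets 0 and 2 keep them disjoint.
terminal_a = [resolve_instance_id(i, 4, offset=0) for i in range(2)]
terminal_b = [resolve_instance_id(i, 4, offset=2) for i in range(2)]
print(terminal_a, terminal_b)
```

With modulo normalization, an offset that is too large would silently land on a container another terminal already uses; raising makes that overlap visible.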
* Centralize CLI parsing in fle/run.py
- Refactor run_eval.main to accept params (config_path, offset) and remove argparse
- Extend fle/run.py to parse --instance_offset and pass to run_eval
- Keep defaults for direct invocation of run_eval
This consolidates argument parsing in a single entrypoint as requested.
* Fix CLI offset parsing: use --offset in fle/run.py to pass to run_eval
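A sketch of the split described above: one entrypoint parses arguments, and the evaluation function takes plain parameters. The flag names `--run_config` and `--offset` appear in this log; everything else (`parse_cli`, `run_eval_main`, the help strings, the required flag) is an assumption for illustration and may differ from `fle/run.py`.

```python
import argparse
from typing import List, Optional, Tuple


def parse_cli(argv: Optional[List[str]] = None) -> Tuple[str, int]:
    """Parse the two values that are handed on to the evaluation function."""
    parser = argparse.ArgumentParser(prog="fle.run")
    parser.add_argument("--run_config", required=True, help="path to the run config")
    parser.add_argument("--offset", type=int, default=0, help="instance_id offset")
    args = parser.parse_args(argv)
    return args.run_config, args.offset


def run_eval_main(config_path: str, offset: int = 0) -> str:
    """Stand-in for run_eval.main: plain parameters, no argparse inside."""
    return f"config={config_path} offset={offset}"


config_path, offset = parse_cli(["--run_config", "run.json", "--offset", "2"])
print(run_eval_main(config_path, offset))
```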
* remove fluff
* remove fluff
* remove fluff
* put back things removed by mistake
* mcp: use importlib.resources.files('fle')/env instead of __file__-based execution_path; aligns with pkg-aware path used in run_eval and CLI
* run_eval: include multi-agent instructions in SystemPromptGenerator input to match instance.get_system_prompt
* unify system prompt construction: add SystemPromptGenerator.generate_for_agent(agent_idx, num_agents); use in instance.get_system_prompt and run_eval
* paths: one-line importlib.resources.files('fle')/env; unix_tools: pkg-aware tools base; instance.get_system_prompt uses generate_for_agent
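For reference, a small sketch of the package-aware lookup with `importlib.resources.files`. The stdlib package `json` is used as a stand-in so the snippet runs without the project installed; the commits above use `files('fle') / 'env'`. The helper name `package_subpath` is hypothetical.

```python
from importlib.resources import files


def package_subpath(package: str, *parts: str):
    """Resolve a path inside an installed package without relying on __file__."""
    node = files(package)
    for part in parts:
        node = node / part
    return node


# Stand-in package; in the project this would be package_subpath("fle", "env").
decoder = package_subpath("json", "decoder.py")
print(decoder.name, decoder.is_file())
```

Unlike a `__file__`-based path, this resolves relative to wherever the package is installed, so it does not depend on the location of the calling module.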
* patch
* num_agents
* Add external server support to gym interface via environment variables
Summary:
This PR adds support for external Factorio servers to the gym interface by checking environment variables before falling back to local Docker containers.
Problem
Currently, the gym interface (gym.make()) only supports local Docker containers through get_local_container_ips(). However, the underlying FactorioInstance class
already supports external servers. This limitation prevents users from:
- Using cloud-hosted or remote Factorio servers
- Sharing servers between multiple users/researchers
- Running in environments where Docker isn't available
- Using dedicated Factorio servers with better performance
Solution
Modified make_factorio_env() to check for environment variables:
- FACTORIO_SERVER_ADDRESS: External server IP/hostname
- FACTORIO_SERVER_PORT: External server RCON port (default: 27000)
If these are set, the gym interface connects to the external server. Otherwise, it falls back to the existing local container behavior.
Changes:
- Added import os to enable environment variable access
- Modified make_factorio_env() to check for FACTORIO_SERVER_ADDRESS and FACTORIO_SERVER_PORT
- Added informative print statements showing which server type is being used
- Maintained full backward compatibility - existing code continues to work unchanged
Usage
# Using external server
export FACTORIO_SERVER_ADDRESS=your-server-hostname
export FACTORIO_SERVER_PORT=27000
python your_script.py # gym.make() now uses external server
# Using local containers (default behavior)
python your_script.py # Works exactly as before
### Notes for Reviewers
- The `FactorioInstance` class already has full support for external servers, so this PR simply exposes that capability through the gym interface
- Environment variable names were chosen to be clear and consistent with common practices
- Print statements help users verify which server type is being used
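The lookup order described in the Solution section can be sketched as follows. The two variable names and the default port come from the description above; the function name `external_server_from_env` and its return shape are assumptions, not the code in `make_factorio_env()`.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_RCON_PORT = 27000


def external_server_from_env(
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, int]]:
    """Return (address, port) for an external server, or None to use local containers."""
    address = env.get("FACTORIO_SERVER_ADDRESS")
    if not address:
        return None  # caller falls back to local Docker container discovery
    port = int(env.get("FACTORIO_SERVER_PORT", DEFAULT_RCON_PORT))
    return address, port


print(external_server_from_env({}))
print(external_server_from_env({"FACTORIO_SERVER_ADDRESS": "your-server-hostname"}))
```

Passing the environment as a mapping keeps the function easy to test without mutating `os.environ`.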
* fix: resolve lint errors in registry.py
Fixed import ordering (stdlib → third-party → local) and split long print statement to comply with project's linting rules. No functional changes.
* fix 2: resolve lint errors in registry.py
missed one line and fixed it
* fix 3: resolve lint errors in registry.py
The key change is on lines 147-149, where the print statement is now split across multiple lines exactly as the formatter wants
* Update registry.py
## Fix incorrect resource patch reference in README example
### Description
The README example incorrectly uses `Prototype.IronOre` when finding iron ore patches. According to the source code in `fle/env/game_types.py`, `Resource.IronOre` should be used for resource patches.
### Changes
- Changed `position=nearest(Prototype.IronOre))` to `position=nearest(Resource.IronOre))`
### Context
- `Prototype.IronOre` is for item operations (insert_item, extract_item, etc.)
- `Resource.IronOre` is for finding resource patches on the map (nearest(), get_resource_patch())
- All other uses of `Prototype` in the example are correct for entity placement
### Impact
This aligns the documentation with the actual API implementation and prevents confusion for developers using the FLE gym interface.
* Cli wrapper (#264)
* gitignore
* exclude only fle internals in hatch build
* fix: remove unused subparser variables for linter compliance
* style: format run.py after linter fix
* style: format run.py after CLI unification
* commit
* remove config
* style: format fle/api.py after adding CLI wrappers
* style: format run.py after CLI refactor
* revert to run.py only
* Fix probe.sh: use UDP check and proper exit codes
* nice
* Update README: CLI-first approach, remove git clone and manual Docker setup
* readme concise
* Clean up README formatting and restore quickstart section
* readme
* Update pyproject.toml
v0.2.1 was already used:
Caused by: Upload failed with status code 400 Bad Request. Server says: 400 This filename has already been used, use a different version. See https://pypi.org/help/#file-name-reuse for more information.
trying v0.2.2
* Update pyproject.toml
* fix: only deploy leaderboard build output, not README
* test
* modify name
* fix: set correct homepage for leaderboard subpath on gh-pages
* restore
* Fix leaderboard deployment: copy built React app instead of source files
* leaderboard
* leaderboard
* Trigger leaderboard deployment
* Clean up workflow: use rm instead of rm -rf for files
* change
* test 2
* Fix workflow: recreate leaderboard directory before copying build files
* test 2
* Fix: ensure README.md is removed from leaderboard deployment
* Force cache refresh
* Replace React leaderboard with beautiful static HTML version
* Make leaderboard data dynamic and simplify deployment workflow
* Remove fallback data and simplify HTML
* new
* Remove public folder - move data to processed/ and use static HTML leaderboard
* test
* Fix leaderboard: remove rounding, fix undefined model names, use monospace font
* Fix leaderboard: use correct 'name' field from JSON data
* revert
* Recover deploy-leaderboard.yml
* Fix Jekyll override: add .nojekyll and move README
* script
* Update leaderboard data
* yolo
* push
* change
* good enough
* these have already been added
* Merge main into leaderboard, preserving deletions
* revert changes on process results
* Simplify process-results.js and remove old version compatibility
* Simplify deploy-leaderboard.yml workflow
* test
* Remove pending/verified workflow - simplify to direct processing
* versions
* leaderboard
* Add submittedBy column and simplify CSS styling
* Simplify CSS with single data-cell class
* Simplify to single data-cell class with uniform styling
* final change
---------
Co-authored-by: GitHub Action <action@github.com>
* Spin up as many servers as environments in config if not running
* Implement batching for limited containers in run_eval.py
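One way to picture batching when there are fewer containers than runs: split the run indices into groups no larger than the container count and process the groups one after another. This is a sketch under that assumption; the helper name `batches` is hypothetical and the actual logic in run_eval.py is not shown in this log.

```python
from typing import Iterator, List


def batches(run_indices: List[int], num_containers: int) -> Iterator[List[int]]:
    """Yield consecutive groups of at most num_containers run indices."""
    if num_containers < 1:
        raise ValueError("need at least one container")
    for start in range(0, len(run_indices), num_containers):
        yield run_indices[start : start + num_containers]


# Seven runs on three containers need three sequential batches.
plan = list(batches(list(range(7)), 3))
print(plan)
```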
* push
* fixing
* fresh air
* massive simplification
* reset
* fluff
* better
* reset
* FIXED
* simplify parent
* remove docker container
* move import to top of file
* fixed small bug with tests
* fixing default config file
* Jul 21, 2025 at 18:27
* commit
* new changes
* revert changes
* Carry post_production_flows to next step for correct static/dynamic rewards
* undo superfluous changes
* undo superfluous changes
* task-key
* Debug BurnerMiningDrill instantiation and patch research_progress KeyError
* Fix import order for logging and all imports to resolve E402 errors
* Write entity instantiation debug info to file for reliable diagnostics
* Write research save errors to /tmp/research_save_error.log for diagnostics
* error handling
* Log all MiningDrill instantiations to /tmp/mining_drill_debug.log for diagnostics
* debugging
* Move entity validity check before all usage in inspect_inventory for crash prevention
* Add validity checks and log invalid entities before every access in inspect_inventory
* tmp Jul 22, 2025 at 19:51
* add to debug
* undo logging
* undo logging
* Remove all invalid entity logging; keep only validity checks for lean robust code
* add back
* take 2
* take 3
* Restore non-fast GUI open/close, remove redundant validity checks, keep only necessary ones
* Delete mining_drill_debug.log
merging since tests pass and it's just a version support PR
* started restructure
* mv scripts
* won't need sys.paths - breaking change
* rename env/src to env
* renamed `from env.src` imports to `from env.`
* mcp server changes
* added fle.env.__init__ with changed imports
* eval changes
* db_client to commons
* moved modules around and continuing to standardise and fix imports
* even more module import fixes + stdzn
* agents changes no bueno
* import fixes and standardization
* DANGEROUS SCRIPT REMOVED
* removed legacy_gym_env
* fixing imports
* removed rcon from env
* gym_eval works
* Update .gitignore
Remove s/venv type and add trajectory_logs which was added in gym pr.
* remove Exceptions pursuant to issue 229.
* move MANUAL_short.md to algorithms/beam since that’s where it’s used. And delete MANUAL.md since it’s unused.
* add back exception removed incorrectly
* fixing env -> fle/env
* updated for new structure
* leftover changes
* update project structure in readme
* removed legacy building scripts
* moved common.models.camera.py::Camera to entities.py
* reduce relative imports + move time TimeMetrics to fix circular imports
* move all file writes to .fle directory
* fix camera circular import + reduce relative imports
* clean entrypoint + -m fle.run --run_config runs out of the box
* fixed imports
* remove deprecated code from db_client.py
* type annotation error - the function signature says it returns List[str] but it actually returns a tuple of three lists.
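The kind of mismatch described here looks like the following. The function `partition_lines` is a hypothetical example, not the function fixed in this commit; it only shows a return annotation that matches a three-list tuple.

```python
from typing import List, Tuple


def partition_lines(lines: List[str]) -> Tuple[List[str], List[str], List[str]]:
    """Hypothetical helper: returns three lists, and the annotation says so."""
    comments = [ln for ln in lines if ln.lstrip().startswith("#")]
    blanks = [ln for ln in lines if not ln.strip()]
    code = [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]
    return comments, blanks, code


comments, blanks, code = partition_lines(["# note", "", "x = 1"])
print(comments, blanks, code)
```

Annotating this as `List[str]` would still run, but a type checker would flag every caller that unpacks the result.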
* moved camera.py out of entities
* fixed path issues
* fixed imports
* fixes to save_load but not for blueprint_based_policy
* fix code pushed by mistake
* fixed imports + brought some usage up to date
* import fixes
* fixed game.instance usage
* fixed imports
* fixed imports
* moved cluster_ips to commons
* remove unused section in readme
* fix: standardize all README naming to README.md
* added cluster dep
* Delete manifest since it is only used with setuptools, which has been removed, and migrate pytest.ini to pyproject.
* delete function in fle/cluster/scenarios/default_lab_scenario/control.lua tagged for deletion
* add missing quotations
* hotfix: render was using the old luaplayer; moved to agent character
* cluster commons change
* removed path
* default lab scenario small
added new
* add project urls
* rewrite clean.sh
* update action version number following best practices
* Update exclusion files in Py Project.
* Remove duplicate leaderboard directories and the deploy website workflow; add the publish-to-TestPyPI and publish-to-PyPI workflows.
* fix: deploy all docs and leaderboard build to gh-pages root
* fix docs for release
* fix the releases
* undo, it turns out the links break in releases
* update process-results workflow for new docs/leaderboard path
* last change to fix workflow 🤞
* fixing workflows to publish
* remove publishing, modify contributing and build, validate_installation was redundant because it's already in build, and update testpypi to push on commit
* increment patch to test testpypi upload
* trusted publishing
* bumping patch by 1
* third attempt at publishing to testpypi
* choosing truly unique version number
* README.md
* remove testpypi upload hook so that it stops trying to upload the same version with failure
* Update README.md
* update exclude
* changes to server.lua for can place entity
* refactor: clean up unused Docker scripts and redundant files
Remove unused and redundant files from fle/cluster/docker/:
- main.py: Redundant Python script that duplicates run_local.sh functionality
with hardcoded ports and basic container management. Not used in workflow.
- install-docker.sh: AWS EC2-specific setup script with hardcoded instance
(ec2-18-133-239-115) and outdated Amazon Linux commands. Current workflow
uses docker-compose instead.
- probe.sh: Simple port checking script using lsof. Redundant since
docker-compose handles health checks and container status monitoring.
- setup_docker_repo.sh: AWS ECR-specific setup with hardcoded AWS account
(216370203482). Contains mostly commented code and unused ECR repository
configuration. Not used in current workflow.
- requirements.txt: Redundant Python dependency file. Docker is a system
dependency, not a Python package. The Python docker SDK is already
included in pyproject.toml dependencies.
Kept essential files: Dockerfile, build scripts, run scripts, config/,
mods/, and README.md which are actively used in the Docker workflow.
* Fix test_assertion_exception to expect correct error output
* Add propagate_errors parameter to eval method for clearer API
* remove clean.sh from build, remove clean.sh since uv build will happen on github actions now exclusively. change imports to absolute in init.py, rename readmes, remove erroneous docstring in run.py, remove duplicate dev dependencies in pyproject.toml and add python 3.13
* add testpypi on merge
* merge resolution
* Make README install/run instructions compatible with both pip/python and uv. Add python -m alternatives and clarify both are supported.
* add git clone back
* Simplify test_assertion_exception to minimal robust form
* Fix all test assertions to match actual system behavior
* remove commented code
* pydantic-bump
* migrate get_validators to __get_pydantic_core_schema__ for Python type (pydantic v2)
* fix __get_pydantic_core_schema__ to use chain_schema for Python type
* revert parse_response to original logic, only fix None unpacking
* Fix formatting with pre-commit hooks
---------
Co-authored-by: hrshtt <harshsha64@gmail.com>
Co-authored-by: Jack Hopkins <jack.hopkins@me.com>
* add changes
* refactor: move lib/ and tools/ to mods/, clean up fle/env/utils/ (all deleted files were unused), prep for entrypoints/ refactor
* readd
* last undo
* refactor: move evaluator.py to algorithms/mcts, move experiment entrypoints to entrypoints/, update imports accordingly
* version info
* incorrect gitignore
* style: replace all relative imports in fle.agents with absolute imports
* simplify gitignore
* commit
* gitignore
* redo
* update
* No code is importing the fle.agents.data package/module
* exclude data/prompts from ruff lint/format in pre-commit
* yaml
* Files were cleared but not deleted
* finalize Neel's suggestions
* Jul 12, 2025 at 13:29
* Jul 12, 2025 at 15:47
* push
* Jul 12, 2025 at 16:03
* Jul 12, 2025 at 16:03
* Jul 12, 2025 at 16:03
* fix
* fix
* fix
* push
* push
* Jul 12, 2025 at 17:48
* remove publish on merge
v0.2.1 was already used:
Caused by: Upload failed with status code 400 Bad Request. Server says: 400 This filename has already been used, use a different version. See https://pypi.org/help/#file-name-reuse for more information.
trying v0.2.2
* Convert all relative imports to absolute imports
* Restore original module docstrings for llm and tools packages
* Restore missing exports and original docstrings that were accidentally removed during import conversion
* Restore all original inline comments in commons/models/__init__.py