832 Commits

Author SHA1 Message Date
Neel Kant
66f5f20e8d Env and tool improvements (#328)
* clean up achievements; fix value accrual time; report flows better

* use pause, remove value accrual time

* make clients sleep correct time, add more speed and pausing methods to instance, add tests

* server adminlist

* clean up code, add more Instance methods, render pause message, tests passing

* add tests for elapsed ticks

* fix run_eval

* game control

* tests

* tests

* task info

* game control and medium electric poles

* connect_entities and place_entity_next_to edits

* change prints, max achieved throughput

* ast fixes - augmented assignment and some others

* changes for get_entities (groups) and connect_entities error logging

* better connect_entities behavior when no new entities are placed, better grouped entity behavior, better error messages

* fixes for tests

* item-on-ground, grouped entities

* updated tests

* ast tests and some other tweaks, all tests passing

* add connect tests

* remove analysis directory

* reward override, prep for evals
2025-09-04 18:18:11 -07:00
Harshit Sharma
a47918a745 remove fle.env. from method signature (#331) 2025-09-04 19:55:34 +05:30
Neel Kant
1fb91880ab Eval cleanups - timing and env.step() (#308)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* clean up achievements; fix value accrual time; report flows better

* use pause, remove value accrual time

* make clients sleep correct time, add more speed and pausing methods to instance, add tests

* server adminlist

* clean up code, add more Instance methods, render pause message, tests passing

* add tests for elapsed ticks

* fix run_eval

* game control

* tests

* tests

* task info

* game control and medium electric poles

* change prints, max achieved throughput
2025-08-27 07:42:36 -07:00
Harshit Sharma
4a960edc80 Better init (#322)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* removed everything other than util

* explicit yet redundant initialise
2025-08-25 19:30:54 +01:00
Harshit Sharma
b6118115e9 Clean instance (#319)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* remove incorrect destroy logic

* remove old restart logic

* force amd flag

* run with saves

* clean_instance

* clear_entities flag

* formatting

* fixes for clear entities

* dup script loading bug fix

* merged

* cleaned up lua scripts

* more explicit mods structure

* more explicit mods structure

* cleaned serialize

* minor changes

* incorrect position reset flag

* formatting

* formatting

* formatting

* remove unused functions

* naively move setup tools to lua manager

* formatting

* formatting

* fix

* trying something

* use self.player_index

* fixed indexing
2025-08-23 13:10:37 -07:00
Harshit Sharma
6c6f22c916 observation_fix (#320)
Some checks failed
Lint and Format / lint (push) Has been cancelled
2025-08-22 18:10:42 +05:30
Harshit Sharma
81cd2a794f Naive saves (#318)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* remove incorrect destroy logic

* remove old restart logic

* force amd flag

* run with saves

fairly trivial stuff (maybe)
2025-08-22 03:49:01 +05:30
Kian Kyars
4b840b7114 docs(versions): fix relative asset paths in 0.1.0 2025-08-21 12:41:14 +00:00
Harshit Sharma
8143457e55 Faster ci cd (#311)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* sessions based

* try out caching + no sleep

* update fixture usage

* better reset usage

* state less on tech, probably breaking change

* better fixtures + decouple resets

* use pytest-xdist w 2 servers

* using diff grouping for dep

* formatting

* formatting

* caching for image

* formatting

* formatting

* use uv

* use uv caching

* remove docker caching (it's slower)

* how about 4 workers?

* no redundant resets

* parameterize

* change names

* update all_technologies_researched usage

change log:

- used uv and cache dependencies
- used 2 factorio headless server instances
- added pytest-xdist & used 2 pytest workers
- parametrized the slowest test, `test_sleep.py`, to balance it across workers
- clarified resets in `instance.py` so separate instances aren't needed for research testing
- better fixture usage, with autouse reset
- added configure_game callback for per-test-file setup of inventories & research state
- updated task abc all_technologies_researched usage; it's now a param for reset
- using 4 workers instead of 2, can probably double it again lol
- pytest parameterized a slow test
- fixed redundant reset in conftest

final speedup: 9m 4s -> 1m, ≈9.07× faster
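
The quoted speedup can be checked with a few lines of arithmetic. This is only a sanity check on the two wall-clock times given in the change log; the variable names are illustrative.

```python
# Wall-clock times quoted in the change log above.
before_s = 9 * 60 + 4  # 9m 4s expressed in seconds
after_s = 1 * 60       # 1m expressed in seconds

speedup = before_s / after_s
print(f"{speedup:.2f}x faster")  # prints "9.07x faster"
```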
2025-08-21 17:31:28 +05:30
kiankyars
2ae77b49cb refactor: simplify gym environment registry (#301)
merging now because main is broken without it.

* registry.py changes to dataclass

* Flatten JSON task definitions and update registry

- Remove config wrapper from all task definition JSON files
- Move all config fields to top level alongside task_type and num_agents
- Update registry.py to read flattened structure
- Applied to lab_play/, multiagent/, and unbounded/ directories

* Fix remaining config reference in get_environment_info

- Update get_environment_info to use flattened task_data structure
- Remove reference to task_data['config'] which no longer exists

* Fix TaskFactory to work with flattened JSON structure

- Remove dependency on config wrapper in task JSON files
- Extract task config by filtering out task_type and num_agents
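
A minimal sketch of the filtering step described above, assuming a flattened task definition is a plain JSON object. The helper name `split_task_definition`, the default of one agent, and the example fields `quota` and `trajectory_length` are hypothetical and not taken from the repository.

```python
import json

# Keys that sit at the top level next to the task's own config fields.
RESERVED_KEYS = {"task_type", "num_agents"}


def split_task_definition(task_data: dict) -> tuple[str, int, dict]:
    """Separate the reserved keys from the remaining (config) fields."""
    task_type = task_data["task_type"]
    num_agents = task_data.get("num_agents", 1)  # assumed default
    config = {k: v for k, v in task_data.items() if k not in RESERVED_KEYS}
    return task_type, num_agents, config


# Hypothetical flattened task definition (no "config" wrapper).
raw = json.loads(
    '{"task_type": "throughput", "num_agents": 1,'
    ' "quota": 16, "trajectory_length": 128}'
)
task_type, num_agents, config = split_task_definition(raw)
print(task_type, num_agents, config)
```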

* Aug 14, 2025 at 13:15

* retain scope

* undo changes

* add back dataclass

* split scopes

* checkpoint

* intermediate

* more changes

* Aug 20, 2025 at 18:13

* model_dump

* Aug 20, 2025 at 18:27

* task_type
2025-08-21 13:09:30 +03:00
kiankyars
5ee9586e1d llm_factory (#290)
* first iteration

* change to support openai api endpoints

* Refactor APIFactory to use OpenAI-compatible endpoints

- Unified all providers to use OpenAI client format
- Eliminated provider-specific conditional branches
- Simplified provider detection using dict ordering
- Removed unused parameters and added missing return
- 90% reduction in code complexity
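
One way to read "provider detection using dict ordering" is a first-match lookup over an insertion-ordered dict. The sketch below is an assumption about the idea, not the project's APIFactory: the provider markers, the `base_url` placeholders, and `detect_provider` are all hypothetical.

```python
# Python dicts preserve insertion order, so more specific markers can be
# listed first and the first match wins. All entries are placeholders.
PROVIDERS = {
    "open-router": {"base_url": "https://router.example.invalid/v1", "supports_images": True},
    "claude": {"base_url": "https://claude.example.invalid/v1", "supports_images": True},
    "gpt": {"base_url": "https://gpt.example.invalid/v1", "supports_images": False},
}


def detect_provider(model: str) -> tuple[str, dict]:
    """Return the first provider whose marker appears in the model name."""
    for marker, provider_config in PROVIDERS.items():
        if marker in model:
            return marker, provider_config
    raise ValueError(f"no provider configured for model {model!r}")


# "open-router/gpt-x" contains both "open-router" and "gpt";
# the earlier dict entry takes precedence.
name, cfg = detect_provider("open-router/gpt-x")
print(name, cfg["supports_images"])
```

Because every provider is described by the same config shape, a single code path can serve all of them without provider-specific branches.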

* Further simplify APIFactory

- Remove redundant MODELS_WITH_IMAGE_SUPPORT array
- Use provider config supports_images instead
- Inline _prepare_messages logic
- Extract _get_reasoning_length helper
- Add missing default return
- 20+ line reduction while maintaining functionality

* remove comment

* Inline reasoning length logic

- Remove _get_reasoning_length helper method
- Inline reasoning effort logic in o1/o3 handling
- Keep code simpler and more direct

* add provider sorting for openrouter to get fastest throughput

* add nitro

* add usage tracking

* usage

* undo changes that added logging

* update config paths

* remove offset

* offset

* Aug 20, 2025 at 20:25

* fix run_idx port offset

* make sure there is keyerror if no port

* fix
2025-08-21 12:58:39 +03:00
kiankyars
f9cfb74baf remove exit_on_task_success parameter (#313)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* runner requirement

* exit_on_task_success

* remove print statement
2025-08-20 19:32:42 +03:00
kiankyars
a0aa57a33e remove redundant validate_run (#314)
* remove redundant validate_run

* merge main
2025-08-20 19:31:39 +03:00
Harshit Sharma
75b450075f Merge pull request #309 from JackHopkins/lua_bug_fixes
Some checks failed
Lint and Format / lint (push) Has been cancelled
Lua bug fixes
2025-08-20 17:44:06 +05:30
hrshtt
892f551abc format 2025-08-20 17:33:06 +05:30
hrshtt
6b0c98bd85 breaking configs for the lore 2025-08-20 17:30:26 +05:30
hrshtt
65f927861c Merge remote-tracking branch 'upstream/main' into lua_bug_fixes 2025-08-20 17:04:08 +05:30
hrshtt
37db2d19ca move version db thingy 2025-08-20 16:58:51 +05:30
hrshtt
e5b07521df added agent_num 2025-08-20 16:56:27 +05:30
hrshtt
52cca6cca2 removed local & docker reference 2025-08-20 11:09:26 +05:30
hrshtt
06d27f89b6 narrow trigger 2025-08-20 11:01:28 +05:30
hrshtt
cdf26b02ee fix cluster usage 2025-08-20 10:45:14 +05:30
hrshtt
bc08e3af77 minor fixes 2025-08-20 10:28:16 +05:30
hrshtt
7ef3dd5b7e formatted finally 2025-08-20 10:23:12 +05:30
hrshtt
5ed1527a64 we will only have one container 2025-08-20 10:21:38 +05:30
hrshtt
06ef626c08 remove local directory 2025-08-20 10:18:12 +05:30
hrshtt
9f9b5a5d17 pull policy to missing 2025-08-20 09:55:52 +05:30
hrshtt
8265f4b311 fixed run-envs.sh usage 2025-08-20 09:49:52 +05:30
hrshtt
a19880fff5 port + path fixes 2025-08-20 09:45:12 +05:30
hrshtt
6d3f249a50 early failure 2025-08-20 09:34:45 +05:30
hrshtt
4d59645740 use run-envs.sh 2025-08-20 09:33:54 +05:30
hrshtt
833563bb6d updated workflow 2025-08-20 09:28:19 +05:30
hrshtt
5108deddea attach mods option 2025-08-20 09:17:18 +05:30
hrshtt
2d997ec2f7 remove unused/broken scripts 2025-08-20 09:16:51 +05:30
hrshtt
a5f6858e1b updated api usage 2025-08-20 01:15:46 +05:30
kiankyars
25f1ab7f1e runner requirement (#310)
Some checks failed
Lint and Format / lint (push) Has been cancelled
2025-08-19 11:34:24 -07:00
hrshtt
71a2971409 undo-ing local system changes 2025-08-19 20:31:07 +05:30
hrshtt
5caa34abe6 removed redundant clear_entities.lua 2025-08-19 18:14:54 +05:30
hrshtt
b72d797dad remove production_score.lua 2025-08-19 17:44:00 +05:30
hrshtt
f2d202c0b5 remove spammy log() 2025-08-19 16:54:31 +05:30
hrshtt
1951ee6a40 removed unused tool 2025-08-19 16:53:58 +05:30
hrshtt
e2585087ae fix pcall error parsing issue 2025-08-19 16:53:08 +05:30
hrshtt
11dc044bc5 fix loading dependencies issue 2025-08-19 16:52:47 +05:30
kiankyars
d3dd325792 npx prettier (#285)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* npx prettier

* main

* final formatting

* formatting
2025-08-18 16:43:34 +03:00
kiankyars
956db04490 Removing Duplication from System Prompt (#294)
* fix: remove duplicate burner mining drill line from crafting statistics

* fix: Remove duplicate 'Useful statistics' from task descriptions

- Statistics already included in system prompt from instance.get_system_prompt()
- Removed CRAFTING_STATISTICS duplication from task goal_description
- Cleaned up unused include_stats parameter in UnboundedThroughputTask

* fix: Remove goal description duplication and restore useful statistics

- Remove duplicate {goal_description} from GYM_AGENT_INSTRUCTIONS template
- Add CRAFTING_STATISTICS to agent.md for system prompt inclusion
- Goal now appears only once in Task section, not duplicated in Instructions
- Statistics are properly included in the system prompt via agent.md

* refactor changes

* add back system prompt

* replace goal description in proper place

* remove redundant goal statement

* whitespace

* move rstrip to after string generation
2025-08-18 16:03:27 +03:00
kiankyars
9ff49a02c5 speed => set_speed (#304)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* feat: Add modular system prompt architecture

Create flexible component-based system prompt generation allowing
agent designs to customize prompts based on specific needs.

Key Features:
• SystemPromptBuilder for flexible prompt composition
• Component-based architecture (task, stats, constraints, patterns)
• Agent-specific optimizations (minimal 100 chars to comprehensive)
• Separation of task logic from prompt generation
• Backward compatibility with existing systems

Components:
- TaskDefinitionComponent: Task-specific instructions
- ProductionStatisticsComponent: Crafting/production rates
- ResponseFormatComponent: Different formats (Gym, MCTS, custom)
- MultiAgentComponent: Coordination instructions when needed
- ImplementationPatternsComponent: Code examples
- ConstraintsComponent: Behavioral rules and limitations
- APIReferenceComponent: Method docs (full or summary)

Changes:
- Enhanced ThroughputTask with build_system_prompt()
- Enhanced FactorioInstance with get_api_documentation()
- New fle.env.system_prompt package with builders and examples

Addresses need for customizable system prompts based on agent design,
similar to how observations can be tailored per agent type.
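
The component-based composition described above could look roughly like the following. This is a sketch of the general pattern only; `PromptBuilder`, `TaskSection`, and `ConstraintsSection` are hypothetical stand-ins and do not reproduce the `fle.env.system_prompt` package.

```python
from dataclasses import dataclass, field
from typing import Protocol


class PromptComponent(Protocol):
    """Anything that can render itself into a section of the prompt."""

    def render(self) -> str: ...


@dataclass
class TaskSection:
    goal: str

    def render(self) -> str:
        return f"## Task\n{self.goal}"


@dataclass
class ConstraintsSection:
    rules: list[str]

    def render(self) -> str:
        return "## Constraints\n" + "\n".join(f"- {rule}" for rule in self.rules)


@dataclass
class PromptBuilder:
    components: list[PromptComponent] = field(default_factory=list)

    def add(self, component: PromptComponent) -> "PromptBuilder":
        self.components.append(component)
        return self  # returning self allows chained calls

    def build(self) -> str:
        # Sections are joined in the order they were added.
        return "\n\n".join(c.render() for c in self.components)


# A minimal agent gets only the task; a fuller one adds constraints.
minimal = PromptBuilder().add(TaskSection("Produce iron plates")).build()
full = (
    PromptBuilder()
    .add(TaskSection("Produce iron plates"))
    .add(ConstraintsSection(["Do not remove existing entities"]))
    .build()
)
print(full)
```

Keeping each section behind a single `render()` method is what lets different agent designs pick a subset of components without touching task logic.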

* gpt-5

* set_speed

* from tests

* set_speed
2025-08-14 11:09:25 -07:00
kiankyars
29bca3d5b3 remove redundant variable instantiation (#303)
Some checks failed
Lint and Format / lint (push) Has been cancelled
* cherry-pick

* add back path
2025-08-14 13:35:45 +03:00
kiankyars
0d731be2b5 Gym ports (#298)
* gym registry: Uses instance_id parameter and direct indexing: tcp_ports[instance_id]
run_eval: Passes instance_id=run_idx to gym.make()
config: Added instance_id field to track which container to use

* Fix RCON client disconnection by eliminating duplicate gym.make() calls

- **Root Cause**: Two gym.make() calls were creating separate FactorioInstance objects
  trying to connect to the same container, causing RCON conflicts

- **Problem**:
  - Main process: gym.make() → creates FactorioInstance → connects to container
  - Subprocess: gym.make() → creates ANOTHER FactorioInstance → conflicts!

- **Solution**: Eliminate main process gym.make() by:
  - Getting task directly via TaskFactory.create_task()
  - Generating system prompts via SystemPromptGenerator
  - Only subprocess creates gym environment with correct instance_id

- **Changes**:
  - registry.py: Added instance_id parameter to make_factorio_env()
  - run_eval.py: Removed main process gym.make(), kept subprocess gym.make()
  - config.py: Added instance_id field to track container mapping

- **Result**: Each subprocess now connects to its own container without conflicts
  - run_idx=0 → container 0 (port 27000)
  - run_idx=1 → container 1 (port 27001)
  - run_idx=2 → container 2 (port 27002)

Fixes RCON disconnection errors in multi-container gym environments.

* Remove Path usage and eliminate redundant multiagent instructions

- **Path removal**: Replace Path() with os.path.join() in run_eval.py
  - File path now resolves to: /home/kian/factorio-learning-environment/fle
  - Eliminates Path dependency as requested

- **Redundancy fix**: Remove duplicate multiagent instructions from run_eval.py
  - run_eval.py was duplicating the same multiagent logic as instance.py
  - Now uses basic generator.generate('') in main process
  - Proper agent-specific system prompts handled by instance.get_system_prompt(agent_idx)
  - Eliminates code duplication between run_eval.py and instance.py

- **Result**: Cleaner code with single source of truth for multiagent instructions

* Fix outdated env/src paths in MCP protocol files

- Problem: MCP files were using non-existent path parent/env/src
  This resolved to fle/env/protocols/env/src which does not exist

- Root cause: Legacy path structure assumption
  Current: fle/env/ contains tools/, instance.py, etc.
  Old assumption: fle/env/src/ never existed

- Solution: Update all MCP files to use correct path parent.parent
  Before: Path(...).parent / env / src → fle/env/protocols/env/src 
  After: Path(...).parent.parent → fle/env 

- Files fixed:
  resources.py: 2 instances fixed
  tools.py: 2 instances fixed
  unix_tools.py: 2 instances fixed
  Removed obsolete env.src. string replacement

- Verification: All paths now correctly point to fle/env/ with tools/ and instance.py
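
The before/after path resolution can be reproduced with `pathlib`. The module location `fle/env/protocols/tools.py` is an assumption inferred from the resolved paths quoted above; `PurePosixPath` is used so the example does not touch the filesystem.

```python
from pathlib import PurePosixPath

# Assumed location of one of the MCP protocol modules.
module_file = PurePosixPath("fle/env/protocols/tools.py")

# Old expression: appends env/src under the protocols directory.
old_base = module_file.parent / "env" / "src"

# New expression: steps up one more level to the env package itself.
new_base = module_file.parent.parent

print(old_base)  # prints "fle/env/protocols/env/src"
print(new_base)  # prints "fle/env"
```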

* Add container mapping debug prints for 7-env test verification

- Container Discovery: Shows all discovered containers with IPs and ports
- Main Process: Logs each run_idx → instance_id assignment
- Subprocess: Verifies gym.make() uses correct instance_id
- Registry: Shows which container is selected for each instance_id
- Instance: Confirms actual RCON connection details

Debug output will show:
🐳 CONTAINER DISCOVERY: Found X containers
🔍 Container details: Container 0: ip:port
🚀 MAIN PROCESS: Starting run_idx=X with instance_id=X
🎯 SUBPROCESS X: Creating gym environment with instance_id=X
🏭 REGISTRY: Creating FactorioInstance for instance_id=X
📡 REGISTRY: Selecting container X: ip:port
🔌 INSTANCE: Successfully connected to ip at tcp/port
 SUBPROCESS X: Connected to ip:port

This will verify the fix for RCON conflicts across 7 parallel environments.

* Container selection fixes for multi-terminal runs

- Add --instance_offset CLI flag (or FLE_INSTANCE_OFFSET env) to shift instance_id per terminal
- Normalize instance_id modulo number of containers inside registry (supports any offset)
- Keep detailed debug prints for discovery, selection, and connection

This ensures parallel runs across terminals map to distinct containers.

* Make container selection explicit via instance_id offset

- Remove automatic modulo normalization in registry
- Require valid instance_id; raise if out of range
- Keep --instance_offset (and FLE_INSTANCE_OFFSET) to compose instance_id = run_idx + offset
- Debug prints reflect explicit selection

This matches previous trajectory runner behavior and avoids unintended cross-terminal overlap.
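
A minimal sketch of the explicit selection rule, assuming the registry holds a list of discovered RCON ports. The function name `select_port` and the sample port list are hypothetical; only the composition `instance_id = run_idx + offset`, the direct indexing, and the out-of-range error come from the messages above.

```python
def select_port(tcp_ports: list[int], run_idx: int, offset: int = 0) -> int:
    """Map a run index plus a per-terminal offset to one container port."""
    instance_id = run_idx + offset
    # No modulo wrap-around: an out-of-range id is an error, so two
    # terminals can never silently land on the same container.
    if not 0 <= instance_id < len(tcp_ports):
        raise IndexError(
            f"instance_id {instance_id} out of range for {len(tcp_ports)} containers"
        )
    return tcp_ports[instance_id]


tcp_ports = [27000, 27001, 27002, 27003]  # example discovery result

# Terminal A runs with offset 0, terminal B with offset 2.
print(select_port(tcp_ports, run_idx=1))            # prints 27001
print(select_port(tcp_ports, run_idx=1, offset=2))  # prints 27003
```

The explicit bounds check also rejects negative ids, which plain list indexing would otherwise accept and wrap to the end of the list.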

* Centralize CLI parsing in fle/run.py

- Refactor run_eval.main to accept params (config_path, offset) and remove argparse
- Extend fle/run.py to parse --instance_offset and pass to run_eval
- Keep defaults for direct invocation of run_eval

This consolidates argument parsing in a single entrypoint as requested.

* Fix CLI offset parsing: use --offset in fle/run.py to pass to run_eval

* remove fluff

* remove fluff

* remove fluff

* put back things removed by mistake

* mcp: use importlib.resources.files('fle')/env instead of __file__-based execution_path; aligns with pkg-aware path used in run_eval and CLI

* run_eval: include multi-agent instructions in SystemPromptGenerator input to match instance.get_system_prompt

* unify system prompt construction: add SystemPromptGenerator.generate_for_agent(agent_idx, num_agents); use in instance.get_system_prompt and run_eval

* paths: one-line importlib.resources.files('fle')/env; unix_tools: pkg-aware tools base; instance.get_system_prompt uses generate_for_agent

* patch

* num_agents
2025-08-14 10:22:40 +03:00
kiankyars
94e5fad5ff Error in trajectory runner iteration 1: 'dict' object has no attribute '__dict__' (#300)
Some checks failed
Lint and Format / lint (push) Has been cancelled
2025-08-13 11:51:52 -07:00
Neel Kant
6042174da4 Multiagent gym env and minor fixes (#299)
* everything working

* default fast
2025-08-13 18:48:32 +02:00