* clean up achievements; fix value accrual time; report flows better
* use pause, remove value accrual time
* make clients sleep correct time, add more speed and pausing methods to instance, add tests
* server adminlist
* clean up code, add more Instance methods, render pause message, tests passing
* add tests for elapsed ticks
* fix run_eval
* game control
* tests
* tests
* task info
* game control and medium electric poles
* change prints, max achieved throughput
* sessions based
* try out caching + no sleep
* update fixture usage
* better reset usage
* stateless on tech, probably breaking change
* better fixtures + decouple resets
* use pytest-xdist with 2 servers
* using diff grouping for dep
* formatting
* formatting
* caching for image
* formatting
* formatting
* use uv
* use uv caching
* remove docker caching (it's slower)
* how about 4 workers?
* no redundant resets
* parameterize
* change names
* update all_technologies_researched usage
change log:
- used uv and cache dependencies
- used 2 factorio headless server instances
- added pytest-xdist & used 2 pytest workers
- parametrized the slowest test, `test_sleep.py`, to balance it across workers
- clarified resets in `instance.py` so separate instances aren't needed for research testing
- better fixture usage, with autouse reset
- added configure_game callback for per test file setup of inventories & research state.
- updated task ABC `all_technologies_researched` usage; it's now a param for reset
- using 4 workers instead of 2, can probably double it again lol
- pytest parameterized a slow test
- fixed redundant reset in conftest
final speedup: 9m 4s -> 1m, ≈9.07× faster
merging now because main is broken without it.
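
The reset and `configure_game` items in the change log above can be pictured with a small sketch. It is only an illustration: `FakeInstance`, `reset_for_test`, and `give_poles` are hypothetical names, not the code in `instance.py` or `conftest.py`. It assumes one shared instance is reset before each test and then customised by a per-test-file callback.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


# Hypothetical stand-in for the real instance class; all names are illustrative.
@dataclass
class FakeInstance:
    inventory: Dict[str, int] = field(default_factory=dict)
    researched: Set[str] = field(default_factory=set)
    all_techs: Set[str] = field(default_factory=lambda: {"automation", "logistics"})

    def reset(self, all_technologies_researched: bool = False) -> None:
        # Clear per-test state, then optionally unlock every technology.
        self.inventory.clear()
        self.researched = set(self.all_techs) if all_technologies_researched else set()


def reset_for_test(
    instance: FakeInstance,
    configure_game: Optional[Callable[[FakeInstance], None]] = None,
    all_technologies_researched: bool = False,
) -> FakeInstance:
    # Reset the shared instance first, then let the per-file callback customise it.
    instance.reset(all_technologies_researched=all_technologies_researched)
    if configure_game is not None:
        configure_game(instance)
    return instance


def give_poles(instance: FakeInstance) -> None:
    # Example per-test-file setup: seed the inventory with one item type.
    instance.inventory["medium-electric-pole"] = 10
```

In a pytest suite, a function like `reset_for_test` would typically be called from an autouse fixture, so every test starts from a known state without needing a separate server instance for research tests.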
* registry.py changes to dataclass
* Flatten JSON task definitions and update registry
- Remove config wrapper from all task definition JSON files
- Move all config fields to top level alongside task_type and num_agents
- Update registry.py to read flattened structure
- Applied to lab_play/, multiagent/, and unbounded/ directories
* Fix remaining config reference in get_environment_info
- Update get_environment_info to use flattened task_data structure
- Remove reference to task_data['config'] which no longer exists
* Fix TaskFactory to work with flattened JSON structure
- Remove dependency on config wrapper in task JSON files
- Extract task config by filtering out task_type and num_agents
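
A minimal sketch of the flattened layout described in the commits above, assuming a task definition is a single JSON object with `task_type` and `num_agents` at the top level. The function name `split_task_definition`, the fields `quota` and `goal`, and the default of 1 for `num_agents` are made up for illustration and are not taken from the actual task files or `registry.py`.

```python
import json
from typing import Any, Dict, Tuple

# Keys treated as routing metadata; everything else is passed on as task config.
META_KEYS = ("task_type", "num_agents")


def split_task_definition(task_data: Dict[str, Any]) -> Tuple[str, int, Dict[str, Any]]:
    # task_type is required, so a missing key raises KeyError here.
    task_type = task_data["task_type"]
    # Defaulting num_agents to 1 is an assumption made for this sketch.
    num_agents = task_data.get("num_agents", 1)
    # Filter the metadata keys out instead of reading a nested "config" wrapper.
    config = {k: v for k, v in task_data.items() if k not in META_KEYS}
    return task_type, num_agents, config


# Hypothetical flattened definition; "quota" and "goal" are illustrative fields.
flat = json.loads(
    '{"task_type": "throughput", "num_agents": 1, "quota": 16, "goal": "iron-plate"}'
)
```

Calling `split_task_definition(flat)` returns the task type, the agent count, and a dict holding only the remaining fields, which is roughly what a factory would forward to the task constructor.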
* Aug 14, 2025 at 13:15
* retain scope
* undo changes
* add back dataclass
* split scopes
* checkpoint
* intermediate
* more changes
* Aug 20, 2025 at 18:13
* model_dump
* Aug 20, 2025 at 18:27
* task_type
* first iteration
* change to support openai api endpoints
* Refactor APIFactory to use OpenAI-compatible endpoints
- Unified all providers to use OpenAI client format
- Eliminated provider-specific conditional branches
- Simplified provider detection using dict ordering
- Removed unused parameters and added missing return
- 90% reduction in code complexity
* Further simplify APIFactory
- Remove redundant MODELS_WITH_IMAGE_SUPPORT array
- Use provider config supports_images instead
- Inline _prepare_messages logic
- Extract _get_reasoning_length helper
- Add missing default return
- 20+ line reduction while maintaining functionality
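
The provider handling described in the two APIFactory commits above might look roughly like the sketch below. It assumes a single provider table whose insertion order decides which entry wins, and a per-provider `supports_images` flag in place of a separate model list. The table contents, the placeholder URLs, the `api_key_env` field, and the `detect_provider` helper are all hypothetical, not the real APIFactory code.

```python
from typing import Dict, Tuple, TypedDict


class ProviderConfig(TypedDict):
    base_url: str
    api_key_env: str
    supports_images: bool


# Insertion order matters: more specific markers come before generic ones.
# Every entry is an illustrative assumption; the URLs are placeholders.
PROVIDERS: Dict[str, ProviderConfig] = {
    "open-router": {
        "base_url": "https://example.invalid/openrouter/v1",
        "api_key_env": "OPEN_ROUTER_API_KEY",
        "supports_images": True,
    },
    "gpt": {
        "base_url": "https://example.invalid/openai/v1",
        "api_key_env": "OPENAI_API_KEY",
        "supports_images": True,
    },
    "deepseek": {
        "base_url": "https://example.invalid/deepseek/v1",
        "api_key_env": "DEEPSEEK_API_KEY",
        "supports_images": False,
    },
}


def detect_provider(model: str) -> Tuple[str, ProviderConfig]:
    # Dicts preserve insertion order, so the first matching marker wins.
    for marker, config in PROVIDERS.items():
        if marker in model:
            return marker, config
    # Explicit failure instead of silently returning None.
    raise ValueError(f"no provider configured for model {model!r}")
```

With this table, `detect_provider("open-router/gpt-4o")` resolves to the `open-router` entry even though `gpt` also matches, because `open-router` is listed first. The returned `base_url` could then be handed to any OpenAI-compatible client.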
* remove comment
* Inline reasoning length logic
- Remove _get_reasoning_length helper method
- Inline reasoning effort logic in o1/o3 handling
- Keep code simpler and more direct
* add provider sorting for openrouter to get fastest throughput
* add nitro
* add usage tracking
* usage
* undo changes that added logging
* update config paths
* remove offset
* offset
* Aug 20, 2025 at 20:25
* fix run_idx port offset
* make sure a KeyError is raised if there is no port
* fix
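
The port-offset commits above can be read as follows. This is only a sketch, assuming each run's port is a configured base port plus `run_idx`. The `resolve_port` helper and the `"port"` key are hypothetical names, not the actual config code.

```python
from typing import Mapping


def resolve_port(server_config: Mapping[str, int], run_idx: int) -> int:
    # Index directly rather than using .get(): a missing "port" entry
    # raises KeyError instead of silently falling back to a default.
    base_port = server_config["port"]
    # Each run gets its own port by offsetting from the base.
    return base_port + run_idx
```

Failing loudly on a missing port keeps a misconfigured run from connecting to another run's server.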