0.11 Awkward Target Reimplementation Plan
Summary of choices
- Branch strategy: Use the orphan
devbranch for the reimplementation. This keeps the 0.11 rewrite clean and lets us rebuild the repo structure intentionally. - Compiler source: Implement the new Awkward target inside
det-lab/kaitai_struct_compiler(Scala-based, following the style of other language targets). - Testing scope: Run all existing tests from
main(ported into the new repo) and add targeted tests where 0.11 changes behavior or types. - Golden data updates: Allowed. Any changes will be documented in a dedicated Markdown changelog.
- CI/testing environment: Align to Python 3.11 and disable auto-loaded pytest plugins in test runs to avoid pulling in NumPy-2-incompatible binary deps. This repo does not use those plugins, but pytest discovers them automatically via installed packages.
- Plugins observed:
pytest-astropy/pytest-astropy-header(importsastropy), which pulls inerfaand other compiled extensions. - Why they auto-load: pytest loads entry points from installed packages by default, even if tests do not import them.
- Why they are unnecessary here: tests only use core pytest features and the repo's own fixtures; no astropy-specific markers or utilities are referenced.
Phase 0 — Bootstrap the dev branch repo structure (1–2 days)
Goal: recreate the runtime repo skeleton on dev but pointing to the 0.11 compiler.
1) Bring back repo structure
- Restore directories: src/awkward_kaitai, tests, example_data, docs, pyproject.toml, Makefile, AGENTS.md, README.md.
- Ensure kaitai_struct_compiler submodule points to det-lab/kaitai_struct_compiler@0.11-awkward.
2) Initialize tooling
- Recreate noxfile.py, .pre-commit-config.yaml, pyproject.toml with current constraints.
- Update Makefile to match the new compiler flow (0.11 compiler build hooks).
3) Target baseline
- Confirm repo structure is functionally identical to main aside from compiler target changes.
Phase 1 — KSC 0.11 analysis & delta mapping (1–2 days)
Goal: identify all breaks vs 0.10, define exact adaptation work.
1) API and pipeline diff
- Compare 0.10 vs 0.11 for:
- LanguageCompiler hooks (instance*, attr*, array handling, type mapping).
- Translator behavior (value instances, casts, shifts, numeric ops).
- Runtime output expectations (C++ signatures, layouts).
2) Generate a “porting checklist”
- Map existing Awkward implementation pieces to 0.11 equivalents.
- Note any new 0.11 idioms used by other language targets.
Phase 2 — Reimplement Awkward target in KSC 0.11 (5–10 days)
Goal: port the compiler target cleanly.
2.1 C++ codegen scaffolding
- Rebuild
AwkwardCompiler.scalain 0.11 style. - Ensure
instanceandseqfield generation aligns with 0.11 base classes.
2.2 Type mapping
- Re-implement
kaitaiType2NativeTypemapping with 0.11 logic. - Ensure correct handling of:
CalcIntTypeand integer shifts.- arrays (repeat
expr/until/eos). - enums and unions.
2.3 Layout builder generation
- Re-create Awkward layout builder structures from 0.11’s AST flow:
RecordBuilder,ListOffsetBuilder,NumpyBuilder.- Ensure builder naming and structure remain stable across schema changes.
2.4 Value instances & expressions
- Align with 0.11 expression pipeline:
- Implement safe shifts for 32+ bit ops.
- Avoid UB in all integer math.
Phase 3 — Runtime interface alignment (2–3 days)
Goal: keep Python runtime working with new C++ output.
1) Reader integration
- Confirm symbols expected by Reader (fill, form, length, etc.) still match.
- If KSC 0.11 changes anything, add adapter or update runtime.
2) CMake / build system
- Make sure builder CLI still works (awkward-kaitai-build).
- Update src/awkward_kaitai/data/CMakeLists.txt if new compiler output requires it.
Phase 4 — Test & golden updates (2–4 days)
Goal: ensure all tests pass against new output.
1) Regenerate all schemas
- Rebuild test_artifacts/*.cpp and .so with 0.11.
2) Run full test suite
- make test
3) Golden changes
- Update tests/*.json.gz if output changes.
- Document all changes in a new file: docs/golden-output-changes.md.
Phase 5 — Stabilization & docs (1–2 days)
Goal: produce a clean implementation ready for merge.
1) Performance sanity checks
- Validate build times and memory usage.
2) Documentation
- Update README.md and AGENTS.md with 0.11 notes.
- Add “0.11 migration notes” section.
3) CI readiness
- Disable pytest auto-plugin loading in CI (e.g., PYTEST_DISABLE_PLUGIN_AUTOLOAD=1) and explicitly enable any required plugins.
Deliverables
- Fully reimplemented Awkward target in
det-lab/kaitai_struct_compiler0.11 branch. - Runtime repo (dev) with working tests.
docs/golden-output-changes.mddetailing all JSON output changes.- Updated Makefile/tooling/CI.
Risks & Mitigations
- KSC 0.11 API changes: mitigate with a direct diff & cross-reference to other language targets.
- Output drift in JSON: document changes and validate schema semantics.
- Builder incompatibilities: verify buffer/format contract using
Reader.