Niklas Hauser
salkinium

About me

I'm an embedded software and tooling specialist with a lot of experience in AVR and ARM Cortex-M devices and their related libraries. I love all kinds of engineering and technology, especially aviation, railways and robotics.

I studied computer science at RWTH Aachen University with a major in communication systems and an application subject in railway safety engineering. I wrote my master's thesis on extracting data from technical documentation using table processing. In my free time, I programmed autonomous robots for the Eurobot competition as a member of the RoboterClub Aachen e.V.

For work, I created electronics at The Media Computing Group and helped hang a 60×7 m display on the building's façade. I later rebuilt the railway lab electronics of the Institute of Transport Science, which included the design and manufacture of a modular 1:32 scale signal system made of PCBs and 3D prints.

During my 2+ years at Arm, I worked as an Embedded Software Engineer on the mbed OS core and HAL, on uVisor, a device security layer for Cortex-M microcontrollers, and on the arm-none-eabi-gcc toolchain. I also co-authored three patents. I have expert knowledge of and experience with all ARM Cortex-M architectures: v6-M, v7-M, and v8-M.

I'm currently working as a Senior Embedded Software Engineer at Auterion in Zurich on the PX4 Autopilot project.

Projects

modm: Project lead and co-maintainer of a C++23 barebone embedded library generator, which generates custom HALs for thousands of different AVRs and Cortex-M devices.

modm-data: Creator and maintainer of a modular data processor to extract and assemble semantic hardware description data for embedded devices from vendor-provided PDF technical documentation and machine-readable data sources.

lbuild: Maintainer of a Python3+Jinja2 modular code generator used by modm.

emdbg: Creator and maintainer of the Embedded Debug Tools containing GDB plugins and tracing converters to debug and profile the PX4 Autopilot firmware with the NuttX RTOS.

My other projects are hosted on GitHub.

Talks

At emBO++18 I gave a lightning talk on ARMv8-M and TrustZone-M (Slides with Notes).

At emBO++19 I gave a short talk about Modular Code Generation with lbuild (Slides with Notes).

At CCCamp23 I talked about Debugging Microcontrollers (Video, Slides with Notes).

At the NuttX International Workshop 2023 I gave a slightly updated talk on Debugging and Profiling NuttX and PX4 (Video).

At the PX4 Developer Summit 2023 I gave another talk on Debugging PX4 (Video, Slides with Notes).

At emBO++24 I presented Analyzing Cortex-M Firmware with the Perfetto Trace Processor (Slides with Notes).

Publications

Niklas Hauser and Jan Pennekamp.
Automatically Extracting Hardware Descriptions from PDF Technical Documentation.
Journal of Systems Research, 3(1), 10 2023.
[DOI] [PDF] [CODE]

ABSTRACT The ever-increasing variety of microcontrollers aggravates the challenge of porting embedded software to new devices through much manual work, whereas code generators can be used only in special cases. Moreover, only little technical documentation for these devices is available in machine-readable formats that could facilitate automating porting efforts. Instead, the bulk of documentation comes as print-oriented PDFs. We hence identify a strong need for a processor to access the PDFs and extract their data with a high quality to improve the code generation for embedded software. In this paper, we design and implement a modular processor for extracting detailed datasets from PDF files containing technical documentation using deterministic table processing for thousands of microcontrollers. Namely, we systematically extract device identifiers, interrupt tables, package and pinouts, pin functions, and register maps. In our evaluation, we compare the documentation from STMicro against existing machine-readable sources. Our results show that our processor matches 96.5% of almost 6 million reference data points, and we further discuss identified issues in both sources. Hence, our tool yields very accurate data with only limited manual effort and can enable and enhance a significant amount of existing and new code generation use cases in the embedded software domain that are currently limited by a lack of machine-readable data sources.
BIBTEX
@article{HP23,
  author = {Hauser, Niklas and Pennekamp, Jan},
  title = {{Automatically Extracting Hardware Descriptions from PDF Technical Documentation}},
  journal = {Journal of Systems Research},
  year = {2023},
  volume = {3},
  number = {1},
  publisher = {eScholarship Publishing},
  month = {10},
  doi = {10.5070/SR33162446},
  code = {https://github.com/salkinium/pdf-data-extraction-jsys-artifact},
  code2 = {https://github.com/modm-io/modm-data},
  meta = {},
}

The tool development is continued in the modm-data project.

Contact

You can reach me via electronic mail at niklas@salkinium.com.
Follow me on Mastodon for my brain dump.
For a few more structured thoughts refer to my blog.