# Hua Jiang

> Principal Software Engineer at AMD

Location: San Jose, California, United States

Profile: https://flows.cv/huajiang

Principal Systems Architect with 20+ years of expertise in AI Infrastructure, Compiler Engineering, and System Software. Specializes in Software-Hardware Co-design, bridging the gap between high-level AI frameworks and silicon performance. Proven track record in architecting MLIR-based compilers, high-performance bare-metal runtimes, and Linux kernel subsystems for heterogeneous computing.

Technical Specialties:

- AI Compilers & Frameworks: MLIR (dialect design, pass pipelines), LLVM backend, Apache TVM (contributor, heterogeneous execution), spatial computing optimization (auto-tiling, routing, scheduling).
- High-Performance Runtime: bare-metal/RTOS runtimes, heterogeneous pipeline orchestration, zero-overhead architectures, SRAM optimization, context-switch bypass.
- OS & Kernel Internals: Linux kernel development (network and memory subsystems), device drivers (NPU, GPU, PCIe), firmware, Hardware Abstraction Layer (HAL).
- Hardware Acceleration: hardware-software co-design, High-Level Synthesis (HLS), FPGA targeting.
- Languages: C/C++, Python, MLIR, LLVM IR, TableGen.

## Work Experience

### Principal Software Engineer @ AMD

Jan 2019 – Present | San Jose, California

- Architected and implemented an end-to-end MLIR-based compiler (AIEHLC) for spatial computing NPU architectures (AIE). Designed a progressive lowering pipeline that uses custom hierarchical dialects and specialized transformation passes to automate tiling, routing, and scheduling.
- Apache TVM contributor: developed and upstreamed the Heterogeneous Pipeline Runtime to enable parallel execution across CPU, GPU, and FPGA, resulting in 3 published papers and a 30% throughput increase for YOLO on AMD Ultra96.
- Spearheaded HW/SW co-design for the TVM VTA accelerator by customizing HLS hardware logic and implementing the corresponding runtime HAL extensions.
- Architected and engineered a low-latency bare-metal runtime (AEG API) for NPU accelerators (AIE), designed to bypass OS kernel mode-switching overhead for real-time inference and to enhance security by eliminating dependencies on complex system layers.
- Delivered a 5x improvement in kernel dispatch performance by re-architecting the kernel binary format and inventing a specialized loader that maximizes SRAM utilization and data locality, removing DRAM bandwidth bottlenecks during startup.
- Led driver development (AIE Driver) for multiple generations of NPU architectures (AIE), ensuring robustness and high throughput across Linux, bare-metal, and RTOS environments.

### MTS @ Riverbed Technology

Jan 2014 – Jan 2019 | Sunnyvale, California, USA

- Achieved a 3x throughput improvement in critical data-path processing for SD-WAN by architecting a next-generation data plane using DPDK (kernel bypass) and high-concurrency lockless queues, combined with deep system-level tuning: NUMA awareness, VM scheduling and affinity orchestration, and micro-architectural optimizations for cache and branch efficiency.
- Developed an internal source-to-source compiler that automatically converts C code into FPGA HLS code for rapid deployment on accelerated hardware.
- Engineered critical performance enhancements within the Linux kernel network subsystem. Integrated advanced protocol features such as TCP Fast Open (TFO) directly into the kernel model, significantly reducing connection establishment latency for network-intensive applications.
### Staff Software Engineer @ Juniper Networks

Jan 2013 – Jan 2014 | Sunnyvale, CA, USA

- Enhanced system robustness and reliability for Juniper SRX security platforms by developing and debugging critical low-level components, including a microkernel architecture, bootloaders, and the FreeBSD-based Junos OS kernel.

### Senior Software Engineer @ Dell

Jan 2013 – Jan 2013 | San Jose, CA, USA

- Optimized the performance of virtualized desktop environments, achieving a 70% reduction in boot time; filed 1 patent.

### Staff Software Engineer @ Fluke Corporation

Jan 2011 – Jan 2013 | Santa Clara, California, USA

- Designed and implemented a source-to-source compiler to automate the migration of Linux kernel drivers to the Windows platform. Developed translation rules to automatically convert Linux-specific kernel APIs and data structures into their Windows equivalents, significantly reducing manual porting effort and ensuring code consistency.
- Architected and implemented a high-performance OSPort runtime layer to bridge kernel execution models, dynamically mapping Linux bottom halves and performing real-time translation of kernel data structures.

### Senior Software Engineer @ Trend Micro

Jan 2008 – Jan 2011 | Cupertino, California, USA

- Architected and developed a source-to-source compiler for a security-focused Domain Specific Language (DSL). Designed the custom grammar, implemented the frontend (lexer and parser generating an AST), and built a semantic analysis engine to automate the detection of zero-day exploits based on behavioral patterns.
- Implemented low-level system hooking mechanisms for dynamic binary analysis. Developed kernel drivers to intercept system calls and I/O operations, enabling run-time monitoring of execution paths and memory access patterns to detect malicious behavior.

### Architect @ Suyaxing

Jan 2000 – Jan 2007

- Architected and optimized ultra-low latency real-time screen sharing solutions for enterprise meeting products.
- Engineered driver-level, hardware-assisted dirty rectangle detection within the GPU display driver to enable incremental screen capture, significantly reducing GPU encoding overhead and maximizing overall system responsiveness.
- Developed high-performance kernel drivers, including network and file filters. Optimized critical I/O paths to achieve transparent interception with negligible impact.

### Software Engineer @ Kemao

Jan 1999 – Jan 2000

- C/C++ development.

## Education

### BS in Mechanical Design and Manufacturing

Nanjing University of Aeronautics and Astronautics

## Contact & Social

- LinkedIn: https://linkedin.com/in/hua-jiang-52953334

---

Source: https://flows.cv/huajiang

JSON Resume: https://flows.cv/huajiang/resume.json

Last updated: 2026-04-12