High-performance edge artificial intelligence (Edge-AI) inference applications aim for high energy efficiency, memory density, and a small form factor, requiring design-space exploration across the whole stack: workloads, architecture, mapping, and co-optimization with emerging technology. In this article, we present a system-technology co-optimization (STCO) framework that bridges workload-driven system scaling challenges with physical-design-enabled technology offerings. The framework is built on three engines: a physical design characterization engine, a dataflow mapping optimizer, and a system efficiency predictor. It uses a systolic array accelerator to provide the design-technology characterization points, targeting the advanced imec A10 nanosheet CMOS node along with emerging, high-density voltage-gated spin-orbit torque (VGSOT) magnetic memories (MRAM), combined with memory-on-logic fine-pitch 3-D wafer-to-wafer hybrid bonding. We observe that 3-D system integration of the static random-access memory (SRAM)-based design leads to 9% power savings with a 53% footprint reduction at iso-frequency with respect to the 2-D implementation for the same memory capacity. Three-dimensional nonvolatile memory (NVM)-VGSOT allows $4\times$
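
To make the three-engine structure concrete, the Python sketch below shows one way such a flow could be composed: a physical-design characterization point, a dataflow-mapping stub, and a system efficiency predictor that compares a 2-D and a 3-D implementation at iso-frequency. This is a minimal illustration under assumed interfaces; all class names, functions, and numeric inputs are placeholders chosen only to reproduce the 9% power and 53% footprint trend quoted above, and it is not the authors' implementation.

```python
# Hypothetical sketch of the STCO framework's three engines.
# All names and numbers are illustrative assumptions, not the paper's code.
from dataclasses import dataclass

@dataclass
class DesignPoint:
    """One physical-design characterization point (engine 1)."""
    name: str
    power_mw: float       # total power at the target frequency
    footprint_mm2: float  # macro footprint
    freq_mhz: float       # operating frequency

def map_workload(design: DesignPoint, macs: int) -> float:
    """Dataflow-mapping stub (engine 2): toy energy estimate in mJ for a
    workload of `macs` multiply-accumulates at full array utilization."""
    runtime_s = macs / (design.freq_mhz * 1e6)   # simplistic latency model
    return design.power_mw * runtime_s           # mW * s = mJ

def compare(baseline: DesignPoint, candidate: DesignPoint) -> dict:
    """System efficiency predictor (engine 3): relative savings vs. baseline."""
    return {
        "power_saving_%": round(100 * (1 - candidate.power_mw / baseline.power_mw), 1),
        "footprint_reduction_%": round(100 * (1 - candidate.footprint_mm2 / baseline.footprint_mm2), 1),
    }

# Placeholder characterization points chosen to mirror the reported trend.
sram_2d = DesignPoint("2D-SRAM", power_mw=100.0, footprint_mm2=1.00, freq_mhz=500)
sram_3d = DesignPoint("3D-SRAM", power_mw=91.0, footprint_mm2=0.47, freq_mhz=500)
print(compare(sram_2d, sram_3d))
# -> {'power_saving_%': 9.0, 'footprint_reduction_%': 53.0}
```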