The growing adoption of edge AI in smart city applications such as traffic management, surveillance, and environmental monitoring demands efficient computational strategies to meet low-latency and high-accuracy requirements. This study investigates GPU sharing techniques to improve resource utilization and throughput when running multiple AI applications simultaneously on edge devices. Using the NVIDIA Jetson AGX Orin platform and object detection workloads with the YOLOv8 model, we explore the performance tradeoffs of threading and multiprocessing approaches. Our findings reveal distinct advantages and limitations for each approach: threading minimizes memory usage by sharing a single CUDA context, whereas multiprocessing achieves higher GPU utilization and shorter inference times by leveraging independent CUDA contexts. However, scalability challenges arise from resource contention and synchronization overheads. This study provides insights into optimizing GPU sharing for edge AI applications, highlighting key tradeoffs and opportunities for enhancing performance in resource-constrained environments.
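The structural difference between the two approaches can be illustrated with a minimal sketch. Here `infer` is a hypothetical stand-in for a YOLOv8 forward pass (real code would load the model via the `ultralytics` package and run it on the GPU); the threading variant keeps all workers inside one process, and thus one shared CUDA context, while the multiprocessing variant gives each worker process its own context:

```python
import threading
import multiprocessing as mp
import time


def infer(frame_id):
    # Hypothetical stand-in for a YOLOv8 forward pass on one frame.
    # In real code this would be a GPU inference call.
    time.sleep(0.01)
    return frame_id


def run_threaded(n_workers, frames):
    # All threads live in one process, so they share a single CUDA
    # context: lower memory footprint, but GPU work from different
    # threads contends on that one context.
    results, lock = [], threading.Lock()

    def worker(chunk):
        for f in chunk:
            r = infer(f)
            with lock:
                results.append(r)

    chunks = [frames[i::n_workers] for i in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def run_multiprocess(n_workers, frames):
    # Each worker process would create its own CUDA context:
    # higher per-process memory cost, but independent GPU work queues.
    with mp.Pool(n_workers) as pool:
        return pool.map(infer, frames)


if __name__ == "__main__":
    frames = list(range(8))
    print(sorted(run_threaded(2, frames)))
    print(sorted(run_multiprocess(2, frames)))
```

This sketch only mirrors the concurrency structure, not the GPU behavior itself: the memory and utilization tradeoffs described above arise from how CUDA contexts are allocated, which the placeholder `infer` does not exercise.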