Specific workloads are increasingly offloaded to accelerators such as a graphic processing unit (GPU) and field-programmable gate array (FPGA) for real-time processing and computing efficiency. Because accelerators are expensive and consume much power, it is desirable to increase the efficiency of accelerator utilization by sharing accelerators among multiple servers over a network. However, task offloading over a network has the problem of latency due to network processing overhead in remote offloading. This paper proposes a low-latency system for accelerator offloading over a network. To reduce the overhead of remote offloading, we propose a system composed of (1) fast recombination processing of chunked data with a simple protocol to reduce the number of memory copies, (2) polling-based packet receiving check to reduce overhead due to interrupts in interaction with a network interface card, and (3) a run-to-completion model in network processing and accelerator offloading to reduce overhead with context switching. We show that the system can improve performance by 66.40% compared with a simple implementation using kernel protocol stack and confirmed the performance improvement with a virtual radio access network use case as a low-latency application. Furthermore, we show that this performance can also be achieved in practical usage in data center networks.