Hi, first of all, thanks for this awesome project. We've been having a good time testing your library to understand it better. On Reddit you reported the latency of events to be around 2.5 µs, but when we test the events using two Python processes on Linux, we get 80 µs of latency on average. We've tested this on multiple different CPUs to make sure it's not a CPU artifact. Do you think 80 µs of one-way latency between Python processes is normal, or is something wrong on our end? Just for the record, here are the toy examples:

Publisher:

```python
import os
import threading
import time
import iceoryx2 as iox2
from multiprocessing import shared_memory
import numpy as np

print("PID:", os.getpid())
print("Thread ident:", threading.get_ident())
print("Native thread ID:", threading.get_native_id())

# Shared-memory array used to carry the payload (send timestamp + counter).
shape = (640, 360)
dtype = np.float64
nbytes = np.prod(shape) * np.dtype(dtype).itemsize
shm = shared_memory.SharedMemory(name="test", create=True, size=nbytes)
numpy_array = np.ndarray(shape, dtype=dtype, buffer=shm.buf)

node = iox2.NodeBuilder.new().create(iox2.ServiceType.Ipc)
event_service = (
    node.service_builder(iox2.ServiceName.new("test_event"))
    .event()
    .event_id_max_value(256)
    .open_or_create()
)
event_id = iox2.EventId.new(1)
notifier = (
    event_service.notifier_builder()
    .default_event_id(event_id)
    .create()
)

count = 0
while True:
    numpy_array[1, 1] = count
    numpy_array[0, 0] = time.time_ns()  # send timestamp
    notifier.notify_with_custom_event_id(event_id)
    time.sleep(.001)
    count += 1
```

Receiver:

```python
import os
import threading
import time
import iceoryx2 as iox2
from multiprocessing import shared_memory
import numpy as np

print("PID:", os.getpid())
print("Thread ident:", threading.get_ident())
print("Native thread ID:", threading.get_native_id())

shape = (640, 360)
dtype = np.float64
shm = shared_memory.SharedMemory(name="test")
numpy_array = np.ndarray(shape, dtype=dtype, buffer=shm.buf)

node = iox2.NodeBuilder.new().create(iox2.ServiceType.Ipc)
event_service = (
    node.service_builder(iox2.ServiceName.new("test_event"))
    .event()
    .open_or_create()
)
listener = event_service.listener_builder().create()

accum_elapsed = 0
num_samples = 0
for i in range(5000):
    event_id = listener.blocking_wait_one()
    time_ns = time.time_ns()
    send_time = numpy_array[0, 0]
    counter = numpy_array[1, 1]
    elapsed_ms = (time_ns - send_time) / 1e6  # one-way latency in ms
    accum_elapsed += elapsed_ms
    num_samples += 1
print("Average Latency:", accum_elapsed / num_samples)
```
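One caveat about a measurement loop like the receiver above: it reports only the arithmetic mean, and a handful of slow, deep-sleep wakeups can dominate that mean. A minimal sketch with made-up sample values (purely illustrative, not measured data) showing how a heavy tail can produce a mean near 80 µs even when the typical delivery is a few microseconds:

```python
import statistics

# Hypothetical one-way latencies in microseconds: mostly fast deliveries,
# plus a few slow ones where the receiver had to be woken from deep sleep.
samples_us = [2.5, 2.7, 2.8, 2.9, 3.0, 3.1, 150.0, 480.0]

print("mean  :", statistics.mean(samples_us), "µs")    # skewed by the outliers
print("median:", statistics.median(samples_us), "µs")  # close to the fast path
```

Reporting the median (or percentiles) alongside the mean would show whether the 80 µs figure reflects every event or just a few expensive wakeups.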
Replies: 1 comment 3 replies
@kiasar Multiple factors could be the cause of this. First of all, could you please run our Rust event benchmark so that we have a baseline for your machines? You just need to call `cargo run --bin benchmark-event --release -- --bench-all`. Do you compile iceoryx2 yourself, or do you use pip?

But I think the main thing is the `time.sleep(.001)` line in your publisher. As soon as you call it, the receiver sits idle between events, and the OS scheduler can put it into an ever deeper sleep. I could imagine that the latency might be reduced as soon as you remove that line.
It indirectly does, since the call `event_id = listener.blocking_wait_one()` puts the receiver side instantly to sleep if there is no data available. And the longer it sleeps, the deeper the sleep gets, from a scheduler point of view. At first, the process is constantly rescheduled to check if there is new data, but this only goes on for several hundred nanoseconds, maybe a few microseconds. Then the process is removed from the internal queue and put into deep sleep. When something happens later, the process is reloaded, put into the scheduler queue again, and can continue to work, and this last step is the time-intensive part.
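This effect can be observed without iceoryx2 at all. The sketch below is a plain-Python illustration (not a benchmark): a `threading.Event` stands in for the iceoryx2 listener, and we measure how long the blocked waiter takes to resume after the notify fires, depending on how long it sat idle first. Absolute numbers will vary wildly by machine and scheduler, so treat the output as qualitative only.

```python
import threading
import time

def wakeup_latency_us(idle_s: float) -> float:
    """Block a thread on an Event, leave it idle for `idle_s` seconds,
    then fire the event and measure how long the thread takes to resume."""
    ev = threading.Event()
    woke = {}

    def waiter():
        ev.wait()  # the kernel parks this thread while it waits
        woke["t"] = time.perf_counter_ns()

    t = threading.Thread(target=waiter)
    t.start()
    time.sleep(idle_s)             # let the waiter go idle first
    fired = time.perf_counter_ns()
    ev.set()                       # "notify": wake the parked thread
    t.join()
    return (woke["t"] - fired) / 1e3  # microseconds

for idle in (0.001, 0.5):
    print(f"idle {idle:>5} s -> wakeup took {wakeup_latency_us(idle):8.1f} µs")
```

Running each case many times and averaging would make the trend clearer; a single sample per idle duration is noisy.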