Hi xingyizhou, thanks for sharing the code! I have run into a problem.
With num_workers = 0, the network trains fine on the KITTI dataset. However, with num_workers > 0, training crashes. My environment:
Ubuntu 16.04
PyTorch 1.0.1.post2
Python 3.6
~/Downloads/qingqing_disk/p4600_disk/CenterNet/src/lib/trains/base_trainer.py(63)run_epoch()
58 num_iters = len(data_loader) if opt.num_iters < 0 else opt.num_iters
59 bar = Bar('{}/{}'.format(opt.task, opt.exp_id), max=num_iters)
60 end = time.time()
61 import pdb
62 pdb.set_trace()
63 -> for iter_id, batch in enumerate(data_loader):
64 if iter_id >= num_iters:
~/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py(818)__iter__()
818 def __iter__(self):
819 -> return _DataLoaderIter(self)
~/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py(560)__init__()
557 # it started, so that we do not call .join() if program dies
558 # before it starts, and __del__ tries to join but will get:
559 # AssertionError: can only join a started process.
560 -> w.start()
561 self.index_queues.append(index_queue)
562 self.workers.append(w)
~/anaconda3/lib/python3.6/multiprocessing/process.py(105)start()
102 assert not _current_process._config.get('daemon'), \
103 'daemonic processes are not allowed to have children'
104 _cleanup()
105 -> self._popen = self._Popen(self)
106 self._sentinel = self._popen.sentinel
~/anaconda3/lib/python3.6/multiprocessing/context.py(223)_Popen()
219 class Process(process.BaseProcess):
220 _start_method = None
221 @staticmethod
222 def _Popen(process_obj):
223 -> return _default_context.get_context().Process._Popen(process_obj)
~/anaconda3/lib/python3.6/multiprocessing/context.py(277)_Popen()
272 class ForkProcess(process.BaseProcess):
273 _start_method = 'fork'
274 @staticmethod
275 def _Popen(process_obj):
276 from .popen_fork import Popen
277 -> return Popen(process_obj)
~/anaconda3/lib/python3.6/multiprocessing/popen_fork.py(19)__init__()
16 def __init__(self, process_obj):
17 util._flush_std_streams()
18 self.returncode = None
19 -> self._launch(process_obj)
~/anaconda3/lib/python3.6/multiprocessing/popen_fork.py(66)_launch()
63 def _launch(self, process_obj):
64 code = 1
65 parent_r, child_w = os.pipe()
66 -> self.pid = os.fork()
67 if self.pid == 0:
68 try:
69 os.close(parent_r)
70 if 'random' in sys.modules:
71 import random
At self.pid = os.fork(), the debugger stops responding: I can neither step into os.fork() nor press n to continue training. However, os.fork() itself seems to work fine in a plain terminal session:
qingqing@qingqing-PowerEdge-T630:~$ python
Python 3.6.8 |Anaconda custom (64-bit)| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.fork()
23346
0
>>> >>>
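A check that sits closer to the call chain in the trace than a bare os.fork() might be to start one worker through multiprocessing's fork context, the same path the ForkProcess frame goes through. The sketch below is only an assumption about how to isolate the problem, not code from CenterNet or PyTorch; the names `_worker` and `fork_worker_starts` are hypothetical, and it uses only the standard library.

```python
import multiprocessing as mp
from queue import Empty


def _worker(result_queue):
    # Runs in the child process: report back so the parent knows the fork worked.
    result_queue.put("ok")


def fork_worker_starts(timeout=10.0):
    """Return True if a forked daemon worker starts and replies within `timeout` seconds."""
    # Assumption: "fork" is the start method in use, as the ForkProcess frame suggests.
    ctx = mp.get_context("fork")
    result_queue = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(result_queue,))
    proc.daemon = True  # mirror a daemonic worker process
    proc.start()        # this call ends up in Popen._launch() and os.fork()
    try:
        reply = result_queue.get(timeout=timeout)
    except Empty:
        reply = None    # the child never answered within the timeout
    proc.join(timeout)
    if proc.is_alive():
        proc.terminate()
    return reply == "ok"


if __name__ == "__main__":
    print("fork worker started:", fork_worker_starts())
```

If this succeeds when run as a plain script but the same step stalls under pdb.set_trace(), that would point at the debugger session rather than at fork itself.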
My problem looks similar to pytorch/pytorch#25302, although that report is on Windows 10.
I am stuck. Could you help me? Thanks!