ZeroDivisionError: float division by zero #4

Closed
yx018 opened this issue Jun 2, 2023 · 16 comments

@yx018

yx018 commented Jun 2, 2023

When I run the first run command in Xplace:
"python main.py --dataset ispd2005 --design_name adaptec1 --load_from_raw True --detail_placement True"
It gives ZeroDivisionError: float division by zero

The error:

(base) blithe@DESKTOP-DMBFNBD:~/eda/Xplace$ python main.py --dataset ispd2005 --design_name adaptec1 --load_from_raw True --detail_placement True
[   1.415] Command line: python main.py --dataset ispd2005 --design_name adaptec1 --load_from_raw True --detail_placement True
[   1.415] log file at result/2023-06-02-15:28:39/log/test.log
[   1.415]
[   1.415] dataset_root: data/raw
[   1.415] dataset: ispd2005
[   1.416] design_name: adaptec1
[   1.416] custom_path:
[   1.416] load_from_raw: True
[   1.416] run_all: False
[   1.416] seed: 0
[   1.416] gpu: 0
[   1.416] num_threads: 20
[   1.416] deterministic: True
[   1.416] lr: 0.01
[   1.416] inner_iter: 10000
[   1.416] wa_coeff: 4.0
[   1.417] num_bin_x: 512
[   1.417] num_bin_y: 512
[   1.417] threshold: 4.0
[   1.417] density_weight: 8e-05
[   1.417] density_weight_coef: 1.05
[   1.417] use_init_density_weight: True
[   1.417] target_density: 1.0
[   1.417] use_filler: True
[   1.417] noise_ratio: 0.025
[   1.417] ignore_net_degree: 100
[   1.417] scale_design: False
[   1.418] use_eplace_nesterov: True
[   1.418] clamp_node: True
[   1.418] use_precond: True
[   1.418] stop_overflow: 0.07
[   1.418] enable_skip_update: True
[   1.418] loss_type: direct
[   1.418] use_cell_inflate: False
[   1.418] use_route_force: False
[   1.418] route_freq: 1000
[   1.418] num_route_iter: 400
[   1.418] route_weight: 0
[   1.419] congest_weight: 0
[   1.419] pseudo_weight: 0
[   1.419] detail_placement: True
[   1.419] dp_engine: default
[   1.419] eval_by_external: False
[   1.419] eval_engine: ntuplace4dr
[   1.419] final_route_eval: False
[   1.419] log_freq: 100
[   1.419] result_dir: result
[   1.419] exp_id: 2023-06-02-15:28:39
[   1.419] log_dir: log
[   1.419] log_name: test.log
[   1.419] eval_dir: eval
[   1.419] draw_placement: False
[   1.420] write_placement: True
[   1.420] write_global_placement: False
[   1.420] output_dir: output
[   1.420] output_prefix: placement
[   1.420]
[   1.484] =================
[   1.484] Start place ispd2005/adaptec1
[   3.361] loading from original benchmark...
[   8.381] Use Nesterov optimizer!
[   8.435]
===================
#nodes = 211447, #nets = 221142, #pins = 944053
#Mov = 210904, #Fix = 543, #IOPin = 0, #Blkg = 0
#ConnMov = 210904, #FloatMov = 0, #ConnFix = 543, #FloatFix = 0, #ConnIOPin = 0, #FloatIOPin = 0
Core Info [0.0, 10692.0, 0.0, 10680.0]
Site Width = 100, Row Height = 1200
#Bins = (512, 512), UnitLen = (20.88281, 20.85938)
target density = 1.00
===================
[   8.435] PlaceData(ispd2005/adaptec1, die_info=[4], die_ll=[2], die_ur=[2], hpwl_scale=[2], hyperedge_index=[2, 944053], hyperedge_list=[944053], hyperedge_list_end=[221142], net_mask=[221142], net_to_num_pins=[221142], node2pin_index=[2, 944053], node2pin_list=[944053], node2pin_list_end=[211447], node_area=[211447, 1], node_id2region_id=[211447], node_lpos=[211447, 2], node_pos=[211447, 2], node_size=[211447, 2], node_to_num_pins=[211447, 1], pin_id2net_id=[944053], pin_id2node_id=[944053], pin_rel_cpos=[944053, 2], pin_rel_lpos=[944053, 2], pin_size=[944053, 2], region_boxes=[1, 4], region_boxes_end=[1], regions=[1], unit_len=[2])
[   8.436] [(0, 210904, 'Mov'), (210904, 210904, 'FloatMov'), (210904, 211447, 'Fix'), (211447, 211447, 'IOPin'), (211447, 211447, 'Blkg'), (211447, 211447, 'FloatIOPin'), (211447, 211447, 'FloatFix')]
[   8.527] #Fillers: 443755 Filler size: (1.4442e+01, 1.2000e+01)
[   8.527] DieArea: 1.142E+08 FixArea: 0.000E+00 (0.0%) PlaceableArea: 1.142E+08 (100.0%) MovArea: 3.729E+07 (32.7%) FillerArea: 7.690E+07 (67.3%)
[   8.553] start gp
[   8.572] iter: 0 | masked_hpwl: 0.00E+00 overflow: 0.0000 obj: NAN density_weight: NAN wa_coeff: 1.6697E+03
Traceback (most recent call last):
  File "main.py", line 95, in <module>
    main()
  File "main.py", line 91, in main
    run_placement_main(args, logger)
  File "/home/blithe/eda/Xplace/src/run_placement.py", line 43, in run_placement_main
    run_placement_single(args, logger)
  File "/home/blithe/eda/Xplace/src/run_placement.py", line 12, in run_placement_single
    res = run_placement_main_nesterov(args, logger)
  File "/home/blithe/eda/Xplace/src/run_placement_nesterov.py", line 146, in run_placement_main_nesterov
    if ps.need_to_early_stop():
  File "/home/blithe/eda/Xplace/src/param_scheduler.py", line 364, in need_to_early_stop
    if not self.enable_fence and self.check_divergence(
  File "/home/blithe/eda/Xplace/src/param_scheduler.py", line 407, in check_divergence
    wl_ratio = (wl_mean - self.best_metric["hpwl"]) / self.best_metric["hpwl"]
ZeroDivisionError: float division by zero

Could you please help me check why this error happens? Thanks a lot.
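
The traceback ends in `check_divergence`, where the wirelength change is divided by `self.best_metric["hpwl"]`. The `iter: 0` log line reports `masked_hpwl: 0.00E+00`, so that denominator is zero. Below is a minimal sketch of a guarded version of that expression. The helper name `safe_wl_ratio` and the decision to raise a descriptive error are assumptions for illustration, not Xplace's actual code. A zero HPWL at iteration 0 points to an upstream problem, so the guard only makes the failure easier to read; it does not fix the cause.

```python
import math


def safe_wl_ratio(wl_mean: float, best_hpwl: float) -> float:
    """Relative wirelength change with an explicit check on the denominator.

    Hypothetical helper: it mirrors the expression from the traceback,
    (wl_mean - best_hpwl) / best_hpwl, but fails with a clearer message.
    """
    if best_hpwl == 0.0 or math.isnan(best_hpwl):
        raise ValueError(
            "best HPWL is zero or NaN; the placement metrics were likely "
            "computed incorrectly before the divergence check"
        )
    return (wl_mean - best_hpwl) / best_hpwl
```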

@yx018
Author

yx018 commented Jun 2, 2023

I have installed torch 1.13.0+cu117

@liulixinkerry
Member

How do you build your project? Is there any error reported?

@liulixinkerry
Member

[   8.527] DieArea: 1.142E+08 FixArea: 0.000E+00 (0.0%) PlaceableArea: 1.142E+08 (100.0%) MovArea: 3.729E+07 (32.7%) FillerArea: 7.690E+07 (67.3%)

The FixArea should be non-zero in adaptec1
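
The same log reports `#Fix = 543` while `FixArea` is `0.000E+00`, which is the inconsistency pointed out here. A small sketch of how one might flag this automatically from the log line follows. The function names are hypothetical, and the rule "fixed nodes exist but their total area is zero" is an assumed heuristic that may not hold for every design.

```python
import re

# Matches "SomethingArea: 1.142E+08" pairs as they appear in the quoted log line.
_AREA_RE = re.compile(r"(\w+Area):\s*([0-9.]+E[+-]\d+)")


def parse_areas(log_line: str) -> dict:
    """Return e.g. {"DieArea": 1.142e8, "FixArea": 0.0, ...} from one log line."""
    return {name: float(value) for name, value in _AREA_RE.findall(log_line)}


def fix_area_suspicious(log_line: str, num_fixed_nodes: int) -> bool:
    """True when fixed nodes exist but their reported total area is zero."""
    areas = parse_areas(log_line)
    return num_fixed_nodes > 0 and areas.get("FixArea", 0.0) == 0.0


line = (
    "DieArea: 1.142E+08 FixArea: 0.000E+00 (0.0%) "
    "PlaceableArea: 1.142E+08 (100.0%) MovArea: 3.729E+07 (32.7%) "
    "FillerArea: 7.690E+07 (67.3%)"
)
print(fix_area_suspicious(line, num_fixed_nodes=543))  # prints True
```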

@yx018
Author

yx018 commented Jun 2, 2023

(base) blithe@DESKTOP-DMBFNBD:~/eda/Xplace/build$ cmake -DPYTHON_EXECUTABLE=$(which python) ..
-- The C compiler identification is GNU 11.3.0
-- The CXX compiler identification is GNU 11.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- CMAKE_BUILD_TYPE: Release
-- PROJECT_SOURCE_DIR=/home/blithe/eda/Xplace
-- CMAKE_CXX_ABI: _GLIBCXX_USE_CXX11_ABI=0
-- pybind11 v2.11.0 dev1
-- Found PythonInterp: /home/blithe/miniconda3/bin/python (found suitable version "3.8.16", minimum required is "3.6")
-- Found PythonLibs: /home/blithe/miniconda3/lib/libpython3.8.so
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- PYTHON_INCLUDE_DIRS: /home/blithe/miniconda3/include/python3.8
-- FLUTE_INCLUDE_DIR: /home/blithe/eda/Xplace/thirdparty/flute
-- LEMON_INCLUDE_DIRS: /home/blithe/eda/Xplace/thirdparty/lemon/include
-- LEMON_LIBRARIES: /home/blithe/eda/Xplace/thirdparty/lemon/lib/libemon.a
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2")
-- Checking for module 'cairo'
--   Found cairo, version 1.16.0
-- Found Cairo: /usr/include/cairo
-- CAIRO_INCLUDE_DIRS: /usr/include/cairo
-- CAIRO_LIBRARIES: /usr/lib/x86_64-linux-gnu/libcairo.so
-- TORCH_INSTALL_PREFIX=/home/blithe/miniconda3/lib/python3.8/site-packages/torch
-- TORCH_VERSION=1.13
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "11.7", minimum required is "11.4")
-- Found CUDAToolkit: /usr/local/cuda/include (found version "11.7.99")
-- TORCH_ENABLE_CUDA=1
-- CUDA_ARCH_FLAGS: -gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
-- CUDA_NVCC_FLAGS: -gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;--compiler-options;-fPIC;-std=c++17;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;--extended-lambda;--expt-relaxed-constexpr
-- TORCH_INCLUDE_DIRS=/home/blithe/miniconda3/include/python3.8/home/blithe/miniconda3/lib/python3.8/site-packages/torch/include/home/blithe/miniconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/blithe/eda/Xplace/build

@yx018
Author

yx018 commented Jun 2, 2023

When I run "make -j40 && make install" to specify the number of jobs, it gives me an error:
c++: fatal error: Killed signal terminated program cc1plus

But when I just run "make install", it is ok.
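
A `Killed signal terminated program cc1plus` message often means the compiler process was killed for running out of memory, which would be consistent with 40 parallel jobs failing while a single job succeeds. As a sketch under that assumption, the job count can be bounded by memory as well as by CPU count. The helper name `conservative_jobs` and the 2 GiB per-job budget are illustrative guesses, not measured figures for Xplace.

```python
import os


def conservative_jobs(cpu_count: int, mem_gib: float, gib_per_job: float = 2.0) -> int:
    """Pick a parallel job count bounded by both CPUs and memory.

    gib_per_job is an assumed rough budget per compiler process.
    """
    by_memory = int(mem_gib // gib_per_job)
    return max(1, min(cpu_count, by_memory))


if __name__ == "__main__":
    # Total physical memory via POSIX sysconf (available on Linux, including WSL).
    mem_gib = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30
    jobs = conservative_jobs(os.cpu_count() or 1, mem_gib)
    print(f"make -j{jobs} && make install")
```

On WSL2 the memory visible to the Linux guest can be smaller than the host's, so the computed value may be well below the number of CPU cores.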

@liulixinkerry
Member

Are you using WSL-ubuntu in Windows? I also hit the same errors in WSL...
The GPU-based init_density_map in calculator.py is not correctly computed. I am not sure whether this strange behavior is related to WSL.

Instead of running Xplace on WSL, I suggest using Linux (e.g. Ubuntu, CentOS) if it is possible for you.

@yx018
Author

yx018 commented Jun 2, 2023

Yes, I use WSL2-ubuntu22.04. I will change to Linux and try. Thanks.

@liulixinkerry
Member

BTW, what is your GPU architecture? Is it RTX 20xx?

@yx018
Author

yx018 commented Jun 2, 2023

GTX1080

@liulixinkerry
Member

liulixinkerry commented Jun 2, 2023

If you are using a GTX 1080, please remember to change the SM version to 61 in https://github.com/cuhk-eda/Xplace/blob/main/CMakeLists.txt#L79

set(CUDA_ARCH_LIST 6.1)

:) Just a quick fix, I will make it more robust later.
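
The cmake output earlier in this thread lists `CUDA_ARCH_FLAGS` with only `sm_80` and `sm_86`, while a GTX 1080 needs SM 61. A minimal sketch of a check that compares the two follows. The function names are hypothetical, and the comparison is deliberately strict (exact SM match only); it ignores any compatibility between related architectures.

```python
import re


def compiled_sm_versions(cuda_arch_flags: str) -> set:
    """Extract SM numbers (e.g. {80, 86}) from a CUDA_ARCH_FLAGS string."""
    return {int(m) for m in re.findall(r"code=sm_(\d+)", cuda_arch_flags)}


def arch_list_entry(major: int, minor: int) -> str:
    """Format a compute capability as a CUDA_ARCH_LIST value, e.g. (6, 1) -> '6.1'."""
    return f"{major}.{minor}"


def device_is_covered(major: int, minor: int, cuda_arch_flags: str) -> bool:
    """True if the build contains code for the device's exact SM version."""
    return major * 10 + minor in compiled_sm_versions(cuda_arch_flags)


# Flags copied from the cmake log in this thread.
flags = "-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86"

# GTX 1080: compute capability 6.1, i.e. SM 61 as stated in the comment above.
print(device_is_covered(6, 1, flags))  # prints False
print(arch_list_entry(6, 1))           # prints 6.1
```

On a machine with PyTorch and a visible GPU, `torch.cuda.get_device_capability(0)` returns the `(major, minor)` pair to pass in.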

@yx018
Author

yx018 commented Jun 2, 2023

thx

@yx018
Author

yx018 commented Jun 2, 2023

Btw, how do I use ntuplace_4dr? I cannot find the file in the repo.

@liulixinkerry
Member

liulixinkerry commented Jun 2, 2023

I found that I can use Xplace in WSL now after correctly setting the CUDA ARCH...

As for NTUplace4dr, you may need to contact Prof. Yao-Wen Chang and the other authors of NTUplace4dr, and see whether they can provide the binary for you.

@yx018
Author

yx018 commented Jun 2, 2023

Ok. I will try

@yx018
Author

yx018 commented Jun 2, 2023

Thanks!!! It runs now. Because there is no Qt plugin in WSL, it only does GP (global placement). Next I will try on Ubuntu desktop.

@liulixinkerry
Member

liulixinkerry commented Jun 3, 2023

I hit the same Qt plugin problem as you and fixed it. You can probably upgrade WSL to support the GUI.

Some references I used:
microsoft/WSL#9303 (comment) -> set X11 display socket
https://github.com/microsoft/wslg/wiki/Diagnosing-%22cannot-open-display%22-type-issues-with-WSLg -> set DISPLAY environment variable
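
Both references concern the `DISPLAY` variable and the X11 socket it points to. As a rough diagnostic sketch, the following checks whether `DISPLAY` is set and whether the matching Unix socket exists under `/tmp/.X11-unix`, the conventional location for local X11 displays. The function names are hypothetical, and this is not a substitute for the steps in the linked pages.

```python
import os
import re
from typing import Optional


def x11_socket_path(display: str) -> Optional[str]:
    """Map a local DISPLAY value such as ':0' or ':0.0' to its Unix socket path.

    Returns None for values with a host part (e.g. 'host:0'), which do not
    use the local socket directory.
    """
    match = re.fullmatch(r":(\d+)(?:\.\d+)?", display)
    if match is None:
        return None
    return f"/tmp/.X11-unix/X{match.group(1)}"


def diagnose_display(env: dict) -> str:
    """Describe the state of DISPLAY and its socket for a given environment."""
    display = env.get("DISPLAY")
    if not display:
        return "DISPLAY is not set"
    path = x11_socket_path(display)
    if path is None:
        return f"DISPLAY={display} is not a local display"
    if not os.path.exists(path):
        return f"DISPLAY={display} but {path} does not exist"
    return f"DISPLAY={display} and {path} exists"


if __name__ == "__main__":
    print(diagnose_display(dict(os.environ)))
```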
