Name | hadcm3n_101j_1940_40_007955595_2 |
Workunit | 8110707 |
Created | 11 May 2012, 17:54:24 UTC |
Sent | 11 May 2012, 17:54:54 UTC |
Report deadline | 11 Aug 2012, 1:22:05 UTC |
Received | 13 Jun 2012, 16:49:07 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1114543 |
Run time | 17 days 18 hours 1 min 17 sec |
CPU time | 14 days 8 hours 22 min 27 sec |
Validate state | Invalid |
Credit | 4,665.60 |
Device peak FLOPS | 2.28 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5236, iMonCtr=1 Model crash detected, will try to restart... 08:12:10 (5852): No heartbeat from core client for 30 sec - exiting 08:12:11 (5852): No heartbeat from core client for 30 sec - exiting 08:12:12 (5852): No heartbeat from core client for 30 sec - exiting 08:12:13 (5852): No heartbeat from core client for 30 sec - exiting 08:12:14 (5852): No heartbeat from core client for 30 sec - exiting 08:12:15 (5852): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CSuspended CPDN Monitor - Suspend request from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6072, iMonCtr=1 Model crash detected, will try to restart... 22:00:10 (3904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:15:19 (5584): No heartbeat from core client for 30 sec - exiting 10:15:20 (5584): No heartbeat from core client for 30 sec - exiting 10:15:21 (5584): No heartbeat from core client for 30 sec - exiting 10:15:22 (5584): No heartbeat from core client for 30 sec - exiting 10:15:23 (5584): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/101jko.pje4c10 Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3476, iMonCtr=1 Model crash detected, will try to restart... 21:30:12 (3128): No heartbeat from core client for 30 sec - exiting 21:30:13 (3128): No heartbeat from core client for 30 sec - exiting 21:30:14 (3128): No heartbeat from core client for 30 sec - exiting 21:30:15 (3128): No heartbeat from core client for 30 sec - exiting 21:30:16 (3128): No heartbeat from core client for 30 sec - exiting 21:30:17 (3128): No heartbeat from core client for 30 sec - exiting 21:30:18 (3128): No heartbeat from core client for 30 sec - exiting 21:30:19 (3128): No heartbeat from core client for 30 sec - exiting 21:30:20 (3128): No heartbeat from core client for 30 sec - exiting 21:30:21 (3128): No heartbeat from core client for 30 sec - exiting 21:30:22 (3128): No heartbeat from core client for 30 sec - exiting 21:30:23 (3128): No heartbeat from core client for 30 sec - exiting 21:30:24 (3128): No heartbeat from core client for 30 sec - exiting 21:30:25 (3128): No heartbeat from core client for 30 sec - exiting 21:30:26 (3128): No heartbeat from core client for 30 sec - exiting 21:30:27 (3128): No heartbeat from core client for 30 sec - exiting 21:30:28 (3128): No heartbeat from core client for 30 sec - exiting 21:30:29 (3128): No heartbeat from core client for 30 sec - exiting 21:30:30 (3128): No heartbeat from core client for 30 sec - exiting 21:30:31 (3128): No heartbeat from core client for 30 sec - exiting 21:30:32 (3128): No 
heartbeat from core client for 30 sec - exiting 21:30:33 (3128): No heartbeat from core client for 30 sec - exiting 21:30:34 (3128): No heartbeat from core client for 30 sec - exiting 21:30:36 (3128): No heartbeat from core client for 30 sec - exiting 21:30:37 (3128): No heartbeat from core client for 30 sec - exiting 21:30:38 (3128): No heartbeat from core client for 30 sec - exiting 21:30:39 (3128): No heartbeat from core client for 30 sec - exiting 21:30:40 (3128): No heartbeat from core client for 30 sec - exiting 21:30:41 (3128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:37:27 (6232): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2644, iMonCtr=1 Model crash detected, will try to restart... 08:55:32 (4476): No heartbeat from core client for 30 sec - exiting 08:55:33 (4476): No heartbeat from core client for 30 sec - exiting 08:55:34 (4476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1408, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2132, iMonCtr=1 Model crash detected, will try to restart... 07:45:59 (5744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4980, iMonCtr=1 Model crash detected, will try to restart... 10:28:45 (4832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2748, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3288, iMonCtr=1 Model crash detected, will try to restart... 08:13:10 (5908): No heartbeat from core client for 30 sec - exiting 08:13:11 (5908): No heartbeat from core client for 30 sec - exiting 08:13:12 (5908): No heartbeat from core client for 30 sec - exiting 08:13:13 (5908): No heartbeat from core client for 30 sec - exiting 08:13:14 (5908): No heartbeat from core client for 30 sec - exiting 08:13:15 (5908): No heartbeat from core client for 30 sec - exiting 08:13:17 (5908): No heartbeat from core client for 30 sec - exiting 08:13:18 (5908): No heartbeat from core client for 30 sec - exiting 08:13:19 (5908): No heartbeat from core client for 30 sec - exiting 08:13:20 (5908): No heartbeat from core client for 30 sec - exiting 08:13:21 (5908): No heartbeat from core client for 30 sec - exiting 08:13:22 (5908): No heartbeat from core client for 30 sec - exiting 08:13:23 (5908): No heartbeat from core client for 30 sec - exiting 08:13:24 (5908): No heartbeat from core client for 30 sec - exiting 08:13:25 (5908): No heartbeat from core client for 30 sec - exiting 08:13:26 (5908): No heartbeat from core client for 30 sec - exiting 08:13:28 (5908): No heartbeat from core client for 30 sec - exiting 08:13:29 (5908): No heartbeat from core client for 30 sec - exiting 08:13:30 
(5908): No heartbeat from core client for 30 sec - exiting 08:13:31 (5908): No heartbeat from core client for 30 sec - exiting 08:13:32 (5908): No heartbeat from core client for 30 sec - exiting 08:13:33 (5908): No heartbeat from core client for 30 sec - exiting 08:13:34 (5908): No heartbeat from core client for 30 sec - exiting 08:13:35 (5908): No heartbeat from core client for 30 sec - exiting 08:13:36 (5908): No heartbeat from core client for 30 sec - exiting 08:13:37 (5908): No heartbeat from core client for 30 sec - exiting 08:13:39 (5908): No heartbeat from core client for 30 sec - exiting 08:13:40 (5908): No heartbeat from core client for 30 sec - exiting 08:13:41 (5908): No heartbeat from core client for 30 sec - exiting 08:13:42 (5908): No heartbeat from core client for 30 sec - exiting 08:13:43 (5908): No heartbeat from core client for 30 sec - exiting 08:13:44 (5908): No heartbeat from core client for 30 sec - exiting 08:13:45 (5908): No heartbeat from core client for 30 sec - exiting 08:13:46 (5908): No heartbeat from core client for 30 sec - exiting 08:13:47 (5908): No heartbeat from core client for 30 sec - exiting 08:13:48 (5908): No heartbeat from core client for 30 sec - exiting 08:13:49 (5908): No heartbeat from core client for 30 sec - exiting 08:13:51 (5908): No heartbeat from core client for 30 sec - exiting 08:13:52 (5908): No heartbeat from core client for 30 sec - exiting 08:13:53 (5908): No heartbeat from core client for 30 sec - exiting 08:13:54 (5908): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5244, iMonCtr=1 Model crash detected, will try to restart... 
09:13:26 (5864): No heartbeat from core client for 30 sec - exiting 09:13:27 (5864): No heartbeat from core client for 30 sec - exiting 09:13:28 (5864): No heartbeat from core client for 30 sec - exiting 09:13:29 (5864): No heartbeat from core client for 30 sec - exiting 09:13:30 (5864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6844, iMonCtr=1 Model crash detected, will try to restart... 08:53:10 (5600): No heartbeat from core client for 30 sec - exiting 08:53:11 (5600): No heartbeat from core client for 30 sec - exiting 08:53:12 (5600): No heartbeat from core client for 30 sec - exiting 08:53:13 (5600): No heartbeat from core client for 30 sec - exiting 08:53:14 (5600): No heartbeat from core client for 30 sec - exiting 08:53:15 (5600): No heartbeat from core client for 30 sec - exiting 08:53:16 (5600): No heartbeat from core client for 30 sec - exiting 08:53:17 (5600): No heartbeat from core client for 30 sec - exiting 08:53:18 (5600): No heartbeat from core client for 30 sec - exiting 08:53:20 (5600): No heartbeat from core client for 30 sec - exiting 08:53:21 (5600): No heartbeat from core client for 30 sec - exiting 08:53:22 (5600): No heartbeat from core client for 30 sec - exiting 08:53:23 (5600): No heartbeat from core client for 30 sec - exiting 08:53:24 (5600): No heartbeat from core client for 30 sec - exiting 08:53:25 (5600): No heartbeat from core client for 30 sec - exiting 08:53:26 (5600): No heartbeat from core client for 30 sec - exiting 08:53:27 (5600): No heartbeat from core client for 30 sec - exiting 08:53:28 (5600): No heartbeat from core client for 30 sec - exiting 08:53:29 (5600): No heartbeat from core client for 30 sec - exiting 08:53:30 (5600): No heartbeat from core client for 30 sec - exiting 08:53:31 (5600): No heartbeat from core client for 30 sec - exiting 08:53:32 (5600): No heartbeat from core 
client for 30 sec - exiting 08:53:33 (5600): No heartbeat from core client for 30 sec - exiting 08:53:34 (5600): No heartbeat from core client for 30 sec - exiting 08:53:35 (5600): No heartbeat from core client for 30 sec - exiting 08:53:36 (5600): No heartbeat from core client for 30 sec - exiting 08:53:37 (5600): No heartbeat from core client for 30 sec - exiting 08:53:38 (5600): No heartbeat from core client for 30 sec - exiting 08:53:39 (5600): No heartbeat from core client for 30 sec - exiting 08:53:40 (5600): No heartbeat from core client for 30 sec - exiting 08:53:41 (5600): No heartbeat from core client for 30 sec - exiting 08:53:42 (5600): No heartbeat from core client for 30 sec - exiting 08:53:43 (5600): No heartbeat from core client for 30 sec - exiting 08:53:44 (5600): No heartbeat from core client for 30 sec - exiting 08:53:45 (5600): No heartbeat from core client for 30 sec - exiting 08:53:46 (5600): No heartbeat from core client for 30 sec - exiting 08:53:47 (5600): No heartbeat from core client for 30 sec - exiting 08:53:48 (5600): No heartbeat from core client for 30 sec - exiting 08:53:49 (5600): No heartbeat from core client for 30 sec - exiting 08:53:50 (5600): No heartbeat from core client for 30 sec - exiting 08:53:51 (5600): No heartbeat from core client for 30 sec - exiting 08:53:52 (5600): No heartbeat from core client for 30 sec - exiting 08:53:53 (5600): No heartbeat from core client for 30 sec - exiting 08:53:54 (5600): No heartbeat from core client for 30 sec - exiting 08:53:55 (5600): No heartbeat from core client for 30 sec - exiting 08:53:56 (5600): No heartbeat from core client for 30 sec - exiting 08:53:57 (5600): No heartbeat from core client for 30 sec - exiting 08:53:58 (5600): No heartbeat from core client for 30 sec - exiting 08:53:59 (5600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6308, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5700, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6060, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6024, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5912, iMonCtr=1 Model crash detected, will try to restart... 10:15:14 (3708): No heartbeat from core client for 30 sec - exiting 10:15:15 (3708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:19:50 (4860): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:19:52 (4860): No heartbeat from core client for 30 sec - exiting 18:19:53 (4860): No heartbeat from core client for 30 sec - exiting 18:19:54 (4860): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7436, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( 17:13:27 (5424): No heartbeat from core client for 30 sec - exiting 17:13:28 (5424): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=1 Model crash detected, will try to restart... 
18:35:44 (9604): No heartbeat from core client for 30 sec - exiting 18:35:45 (9604): No heartbeat from core client for 30 sec - exiting 18:35:47 (9604): No heartbeat from core client for 30 sec - exiting 18:35:48 (9604): No heartbeat from core client for 30 sec - exiting 18:35:49 (9604): No heartbeat from core client for 30 sec - exiting 18:35:50 (9604): No heartbeat from core client for 30 sec - exiting 18:35:51 (9604): No heartbeat from core client for 30 sec - exiting 18:35:52 (9604): No heartbeat from core client for 30 sec - exiting 18:35:53 (9604): No heartbeat from core client for 30 sec - exiting 18:35:54 (9604): No heartbeat from core client for 30 sec - exiting 18:35:55 (9604): No heartbeat from core client for 30 sec - exiting 18:35:56 (9604): No heartbeat from core client for 30 sec - exiting 18:35:57 (9604): No heartbeat from core client for 30 sec - exiting 18:35:59 (9604): No heartbeat from core client for 30 sec - exiting 18:36:00 (9604): No heartbeat from core client for 30 sec - exiting 18:36:01 (9604): No heartbeat from core client for 30 sec - exiting 18:36:02 (9604): No heartbeat from core client for 30 sec - exiting 18:36:03 (9604): No heartbeat from core client for 30 sec - exiting 18:36:04 (9604): No heartbeat from core client for 30 sec - exiting 18:36:05 (9604): No heartbeat from core client for 30 sec - exiting 18:36:06 (9604): No heartbeat from core client for 30 sec - exiting 18:36:07 (9604): No heartbeat from core client for 30 sec - exiting 18:36:08 (9604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9968, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
13 Jun 2012 09:46:38 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 388,800 | 1,225,266 | 3.1514 |
11 Jun 2012 15:11:48 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 362,880 | 1,166,880 | 3.2156 |
09 Jun 2012 19:52:56 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 336,960 | 1,106,870 | 3.2849 |
08 Jun 2012 11:03:13 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 311,040 | 1,047,299 | 3.3671 |
06 Jun 2012 17:30:23 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 285,120 | 989,979 | 3.4721 |
05 Jun 2012 08:16:18 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 259,200 | 930,067 | 3.5882 |
02 Jun 2012 19:53:46 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 233,280 | 871,607 | 3.7363 |
02 Jun 2012 01:53:37 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 207,360 | 813,983 | 3.9255 |
31 May 2012 20:35:09 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 181,440 | 756,072 | 4.1671 |
30 May 2012 15:44:57 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 155,520 | 698,704 | 4.4927 |
17 May 2012 19:23:06 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 129,600 | 292,755 | 2.2589 |
15 May 2012 18:38:53 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 103,680 | 234,802 | 2.2647 |
14 May 2012 11:01:02 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 77,760 | 177,006 | 2.2763 |
13 May 2012 08:07:02 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 51,840 | 118,412 | 2.2842 |
12 May 2012 12:25:26 | 1114543 | 14658714 | hadcm3n_101j_1940_40_007955595_2 | 25,920 | 59,632 | 2.3006 |
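
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is inferred from the numbers, not stated by the page, so the sketch below is an assumption. It recomputes the average for two rows copied from the table and also derives a per-interval rate between consecutive trickles, which is not shown on the page. The function names are hypothetical.

```python
# Rows copied from the trickle table above: (timestep, cumulative CPU seconds).
FIRST_TRICKLE = (25_920, 59_632)        # 12 May 2012
BEFORE_GAP = (129_600, 292_755)         # 17 May 2012
AFTER_GAP = (155_520, 698_704)          # 30 May 2012
LAST_TRICKLE = (388_800, 1_225_266)     # 13 Jun 2012


def cumulative_average(timestep: int, cpu_seconds: int) -> float:
    """Assumed definition of 'Average (sec/TS)': total CPU seconds per timestep."""
    return round(cpu_seconds / timestep, 4)


def interval_rate(earlier: tuple[int, int], later: tuple[int, int]) -> float:
    """CPU seconds per timestep spent between two trickles (derived, not on the page)."""
    delta_timesteps = later[0] - earlier[0]
    delta_cpu_seconds = later[1] - earlier[1]
    return delta_cpu_seconds / delta_timesteps


if __name__ == "__main__":
    for row in (FIRST_TRICKLE, LAST_TRICKLE):
        print(row[0], cumulative_average(*row))
    print("rate across the 17 May to 30 May gap:", interval_rate(BEFORE_GAP, AFTER_GAP))
```

Under this reading, the jump in the average after 17 May reflects a single slow interval rather than a lasting slowdown, since the cumulative average drifts back down in every later row.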