climateprediction.net home page
New error message

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53424 - Posted: 12 Feb 2016, 6:39:52 UTC

Well, I haven't seen it before.

Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.

task = t817

Dave Roberts
Joined: 15 Jan 11
Posts: 175
Credit: 6,242,691
RAC: 699
Message 53425 - Posted: 12 Feb 2016, 23:10:14 UTC - in response to Message 53424.  

Interesting. Since one could surmise that -ve pressures would be associated with either -ve volumes or -ve temperatures, perhaps the conditions simulate instants before the Big Bang.

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,403,322
RAC: 5,085
Message 53426 - Posted: 13 Feb 2016, 1:47:20 UTC

Obviously an unstable atmosphere result. We used to get those with the old HadCM3s back when we started running them. Could be parameters, except every one of that computer's models has crashed with that error.

My guess is the CPU and/or memory is being stressed too much and throwing errors. Time for a prime95 run for that computer.

It's been a long time since we've had to recommend that. I guess that shows that most people running the models nowadays have distributed computing experience of some type.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4314
Credit: 16,380,160
RAC: 3,563
Message 53428 - Posted: 13 Feb 2016, 7:52:57 UTC - in response to Message 53426.  

Obviously an unstable atmosphere result. We used to get those with the old HadCM3s back when we started running them. Could be parameters, except every one of that computer's models has crashed with that error.

Especially as this is with several different task types. Presumably, -ve pressure is the only impossible climate checked for, though I seem to remember that in the past it was always "Invalid Theta"?

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53429 - Posted: 13 Feb 2016, 8:05:35 UTC - in response to Message 53428.  

It could be that the new message is for something that gets checked for first.
Or it could be that the modelling is getting more detailed, and there are new messages.

Doesn't negative pressure occur in tornadoes/typhoons/hurricanes? If so, no big bang. Except from electrical activity in the clouds.

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,403,322
RAC: 5,085
Message 53430 - Posted: 13 Feb 2016, 16:11:52 UTC
Last modified: 13 Feb 2016, 16:13:41 UTC

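P = rho * R * T in the atmosphere (the ideal gas law written with density; equivalently P = (m * R * T) / V, since rho = m / V).

rho = density of gas (can't be negative, near 0 in outer space)
R = the specific gas constant for air, which is positive
T = temperature of that gas in kelvin (can't be negative in any real atmosphere)
V = volume of that gas, m = its mass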

So, it is a test for negative pressures, but may as well be a check for negative absolute temperatures. The following are comments on the purpose of the P_TH_ADJ subroutine in model code:

CLL   SUBROUTINE P_TH_ADJ -------------------------------------------      PTHADJ1A.3     
CLL                                                                        PTHADJ1A.4     
CLL   PURPOSE:  CALCULATES ADDS SURFACE PRESSURE INCREMENTS USING          PTHADJ1A.5     
CLL             EQUATION (27). CALCULATES AND ADDS POTENTIAL TEMPERATURE   PTHADJ1A.6     
CLL             INCREMENTS USING EQUATION (28).                            PTHADJ1A.7     


We used to get these error messages with the HadCM3s back in the day. A Google search will bring up a bunch of model results, and a few threads in this forum where those errors are mentioned. "Negative theta detected" is a much more common error message, for whatever reason.
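
To make the sign argument concrete, here is a minimal Python sketch of the ideal gas law in its density form, P = rho * R * T. It is not the model's Fortran: the function names and the guard are made up for illustration, and R is assumed to be the specific gas constant for dry air. With rho and R positive, the pressure can only come out negative if the temperature does.

```python
R_DRY_AIR = 287.05  # J/(kg*K), specific gas constant for dry air


def pressure(rho, temperature_k, gas_constant=R_DRY_AIR):
    """Ideal-gas pressure in Pa from density (kg/m^3) and temperature (K)."""
    return rho * gas_constant * temperature_k


def check_pressure(rho, temperature_k):
    """Hypothetical guard: raise if the computed pressure is negative."""
    p = pressure(rho, temperature_k)
    if p < 0:
        raise ValueError("NEGATIVE PRESSURE VALUE CREATED")
    return p


if __name__ == "__main__":
    # Plausible near-surface values: the pressure is positive.
    print(check_pressure(1.225, 288.15))
    # A corrupted (negative) absolute temperature trips the guard.
    try:
        check_pressure(1.225, -5.0)
    except ValueError as err:
        print("caught:", err)
```

So a check on the sign of the pressure catches a bad temperature too, which would fit with a flaky CPU or memory producing either message.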

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53432 - Posted: 14 Feb 2016, 8:14:24 UTC

I wonder if that computer is grossly overclocked? That may account for everything failing soon after starting, and may even cause a negative value in variables that can't be negative, due to the same incorrect value being returned from the FPU.

I think that I'll ask about this tomorrow, and also see if that person can be sent an email.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53463 - Posted: 18 Feb 2016, 0:22:05 UTC - in response to Message 53461.  

That error means that one or more of the needed files for those tasks couldn't be found on the server.
There are no other reports yet, so it may be an isolated incident.

What happens depends on the message in the Status column of the Tasks tab.
If they show as an error, then Abort them.

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 53464 - Posted: 18 Feb 2016, 2:22:51 UTC

I had one similar to that today, Les, also for a Wah2 Eu task. Fourteen lines of the 'Parse' error Misty mentioned. Next contact was normal and a new task snagged.

I tend not to worry about one-off messages on my machines -- until they recur, or someone else posts about the problem.

Jim

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 53466 - Posted: 18 Feb 2016, 16:28:03 UTC - in response to Message 53465.  

They will continue to show as "in progress" on the account page until they time out.

WB8ILI
Joined: 1 Sep 04
Posts: 161
Credit: 81,421,805
RAC: 1,225
Message 53467 - Posted: 18 Feb 2016, 18:36:09 UTC

Misty -

I haven't done this for a while, but it used to work.

Wait until all of your current models have completed (you have no work).
REMOVE the CPDN project from your computer.
Wait a few minutes.
ADD the CPDN project back to your computer.
CPDN will realize you can't possibly have any models on your computer and will abort all the models in the CPDN database that are currently in-progress on your computer.
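
For anyone who prefers the command line, the steps above could be sketched roughly like this in Python. It only builds and prints the boinccmd commands rather than running them. The project URL and account key are placeholders (assumptions, not values from this thread), and the helper name is made up.

```python
import shlex

PROJECT_URL = "https://example.org/cpdn/"  # placeholder: use the URL your BOINC client shows
ACCOUNT_KEY = "YOUR_ACCOUNT_KEY"           # placeholder: your own account key


def ghost_cleanup_commands(project_url, account_key):
    """Return the boinccmd invocations for the detach/re-attach procedure."""
    return [
        # Stop fetching new work so the queue can drain.
        ["boinccmd", "--project", project_url, "nomorework"],
        # Re-run this until it lists no tasks for the project.
        ["boinccmd", "--get_tasks"],
        # Only once no real work is left: remove the project...
        ["boinccmd", "--project", project_url, "detach"],
        # ...wait a few minutes, then add it back.
        ["boinccmd", "--project_attach", project_url, account_key],
    ]


if __name__ == "__main__":
    for argv in ghost_cleanup_commands(PROJECT_URL, ACCOUNT_KEY):
        print(shlex.join(argv))
```

Printing the commands instead of executing them is deliberate: the detach step throws away any work still on the machine, so it is safer to run each line by hand once the task list is empty.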

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53468 - Posted: 18 Feb 2016, 19:47:58 UTC - in response to Message 53465.  
Last modified: 18 Feb 2016, 19:48:32 UTC

Misty

That sort of thing is called a ghost model, and lots of people have them in their collection. Like Casper, they're friendly, and don't hurt anything, so the simplest idea is to just ignore them.

bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 24,489,403
RAC: 2,906
Message 53476 - Posted: 19 Feb 2016, 19:40:36 UTC - in response to Message 53468.  

I recently had a few of these ghost ones and did as WB8ILI suggested here: detached and re-attached to the CPDN project. Besides getting rid of ghosts, one frees WUs for others to crunch. In my case they were too old and labelled as No-Resubmission, but for newer ones it should work just fine.

Kevin
Joined: 5 Jul 09
Posts: 63
Credit: 6,091,274
RAC: 0
Message 53479 - Posted: 20 Feb 2016, 6:15:29 UTC - in response to Message 53476.  

I recently had a few of these ghost ones and did as WB8ILI suggested here: detached and re-attached to the CPDN project. Besides getting rid of ghosts, one frees WUs for others to crunch. In my case they were too old and labelled as No-Resubmission, but for newer ones it should work just fine.



It also clears out any old files left over from previous CPDN projects etc.


Kevin

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4314
Credit: 16,380,160
RAC: 3,563
Message 53480 - Posted: 20 Feb 2016, 8:27:06 UTC

Wait until all of your current models have completed (you have no work).


Is the important bit. Otherwise you lose real work! Depending on the speed of your computer, BOINC settings, etc., you may take a long time to reach the no-work position unless you set the project to "no new tasks."

WB8ILI
Joined: 1 Sep 04
Posts: 161
Credit: 81,421,805
RAC: 1,225
Message 53481 - Posted: 20 Feb 2016, 15:49:19 UTC

I know it has been posted many times that one missing result is no big deal. But, I am always afraid there is some scientist in Australia or New Zealand just waiting for this last result to complete a study. So, I like to get my ghosts back into the queue.

Sometimes I do have to wait 10 days or more before I have no work left before removing the project from the computer.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53482 - Posted: 20 Feb 2016, 16:46:23 UTC - in response to Message 53481.  

I suspect that there's no problem for the scientists, either with models like this, or with those that are "being sat on" by people so that their computer always has work way into the future.

If they don't get sufficient results back in what they consider to be a reasonable amount of time, then they just need to issue a new batch.
With batch numbers now included in the model names, this makes it even easier for them to write simple scripts to find out what is happening.
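
As a rough idea of what such a script might look like, here is a minimal Python sketch that counts returned results per batch. The naming scheme (a `_b<number>_` token in the task name), the status strings, and the function name are all invented for illustration. The real model names and server data will differ.

```python
import re
from collections import Counter

# Hypothetical naming scheme: the batch number appears as "_b<digits>_".
BATCH_RE = re.compile(r"_b(\d+)_")


def returned_per_batch(tasks):
    """Count tasks with status 'returned' per batch.

    tasks is an iterable of (name, status) pairs.
    """
    counts = Counter()
    for name, status in tasks:
        match = BATCH_RE.search(name)
        if match and status == "returned":
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("model_b12_0001", "returned"),
        ("model_b12_0002", "in progress"),
        ("model_b13_0001", "returned"),
    ]
    for batch, n in sorted(returned_per_batch(sample).items()):
        print(f"batch {batch}: {n} returned")
```

Comparing those counts with the number of tasks issued per batch would show which batches are short of results and might need re-issuing.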

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 53483 - Posted: 20 Feb 2016, 18:26:41 UTC - in response to Message 53482.  

If they don't get sufficient results back in what they consider to be a reasonable amount of time, then they just need to issue a new batch.
With batch numbers now included in the model names, this makes it even easier for them to write simple scripts to find out what is happening.

It seems to me that is the ultimate argument for shorter deadlines. Otherwise, the project will issue multiple copies of the same work just in the hope of getting one back in time. The "hoarders", if I may call them that, then may never know that their work doesn't count, and hence have no incentive to change their buffer sizes, or whatever else it takes. But the bigger problem is that the work could have been sent out to machines that don't delay so long to begin with.

At least one project on World Community Grid has a "reliable machine" designation for those that return the results early, which is a shorter time period than the formal deadline. It would appear to me that the scientists would be insisting that some such approach is implemented here. Or else they are forced to just issue multiple work as indicated above, assuming they are really not willing to delay their publications a year.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53484 - Posted: 20 Feb 2016, 18:53:43 UTC - in response to Message 53483.  

I recently suggested shorter times, but didn't get anywhere, which is why I think what I said in my post may be "how things are".

Given that we have a "distributed input" from all over the planet, it may be simpler than trying to get the work to reliable computers.
Although it should be possible to write scripts to look for hoarders and serial crashers, implementing a way for fast, reliable machines to get more of the work load may be tricky.

It would also explain the lack of work for Linux computers, where there's an increasing number of 64-bit users who don't bother checking even once to see if things are OK with the tasks that they get.

Anyway, it's only a problem for/with some computers. And the owners will get their credit without knowing that they're wasting electricity.


Gunde
Joined: 22 Feb 15
Posts: 3
Credit: 1,065,624
RAC: 0
Message 53485 - Posted: 20 Feb 2016, 19:50:04 UTC
Last modified: 20 Feb 2016, 19:51:09 UTC

Got this popup today when I got home.

http://imgur.com/cxboWF7