Thread Status: Active
Total posts in this thread: 14
Posts: 14   Pages: 2
This topic has been viewed 18080 times and has 13 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
pending verification

Have been running CEP2 tasks without problem until I got the following message:
"Killing job because cpu time has been exceeded"
Curious why it was terminated. Deadline for it was not until next month. I do leave tasks in memory when suspended (normally when tasks cycle).

Did I just waste 12 hours of CPU time (more if you look at the actual time, not set back to last checkpoint)?
[May 25, 2013 7:08:27 PM]
captainjack
Advanced Cruncher
Joined: Apr 14, 2008
Post Count: 140
Status: Offline
Re: pending verification

The Clean Energy Project has a 12-hour limit on tasks. If the task finishes in less time, hooray. If the task is still running at 12 hours, the software stops it and the intermediate results are sent back to the scientists. The scientists are supposed to be able to tell whether the experiment is worth pursuing by looking at the intermediate results. No time was wasted.
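
For anyone who finds the cut-off easier to follow in code, here is a rough sketch of the idea. It is not the actual science app or BOINC code: `run_with_cpu_cap`, `steps` and `cpu_clock` are made-up names for illustration, and the only figure taken from this thread is the 12-hour limit.

```python
import time

CPU_LIMIT_SECONDS = 12 * 60 * 60  # the 12-hour cap described above


def run_with_cpu_cap(steps, cpu_limit=CPU_LIMIT_SECONDS, cpu_clock=time.process_time):
    """Run steps until all are done or the CPU-time budget is used up.

    `steps` is a hypothetical iterable of zero-argument callables, each
    returning one intermediate result. Returns (results, finished).
    """
    start = cpu_clock()
    results = []
    for step in steps:
        if cpu_clock() - start >= cpu_limit:
            # Budget exhausted: stop here and hand back what was computed so far.
            return results, False
        results.append(step())
    return results, True
```

The point of the sketch is that hitting the cap returns the partial results instead of discarding them, which matches the "no time was wasted" explanation.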

Keep on crunchin'
[May 25, 2013 7:39:32 PM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 2955
Status: Offline
Re: pending verification

At one time, there was talk of giving the users an option to 'up' this cut-off limit to 24 hours - although I haven't seen anything of that suggestion for a very long time now...
[May 25, 2013 7:48:40 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: pending verification

Not going to happen. What the future upgrade of the science app brings, we'll learn when it's beta time... as always, it will likely come with hours' notice in the Beta forums.
[May 25, 2013 8:28:49 PM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 2955
Status: Offline
Re: pending verification

"Not going to happen"
SekeRob, I did tend to realise that it wasn't going to happen. After all, it must be at least 12, if not 18, months since I last saw anything of that idea.
[May 25, 2013 11:11:49 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: pending verification

We are working on upgrading to a new version of QCHEM now and will make some changes to the workunits when that version is released.

However, for complex reasons that I encourage the Harvard researchers to describe fully (I would explain my understanding but I would likely introduce some misinformation), the information created by running one workunit for 24 hours is less valuable than running 2 workunits for 12 hours each. As a result, it is not necessarily in the interests of advancing the science behind the project for us to change that limit.

However, what Harvard has told us would advance their project is this: they have some jobs that they do not currently send us and that they have been running on powerful grids, because those jobs require more RAM, more CPU power, more IO and more bandwidth. We looked at the higher-end devices connected to us, and they could certainly run these jobs. When we roll out the new version of QCHEM, we plan to provide a mechanism for people to sign up to receive these challenging jobs, which will allow Harvard to run more of these high-end jobs.
[May 26, 2013 2:35:46 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: pending verification

Well, trying again with 2 WUs. No other WUs running from any project on my box, which is ONLY running BOINC and not used for anything else. Up to almost 5 hours of CPU time (same on RUN time) and both WUs haven't checkpointed in over 4 & 1/2 hours! CPU time at last checkpoint was right around 24 minutes.

How often is it supposed to checkpoint?!? Every 4 hours? 5 hours? More?

If this is normal for CEP2 WUs, then it seems that information should be included on the "requirements for projects" page so users can either opt out or increase the "switch between applications time".
[May 26, 2013 5:09:38 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: pending verification

There's -no- regular interval between checkpoints. There are a maximum of 16, of which #3 and #12 take the longest to reach: depending on device power, from 3 to 6 hours. Some devices don't even get to checkpoint #3 before the 12-hour cut-off.

It is recommended, strongly, to -not- allow multiple CEP2 tasks to start at exactly the same moment. This compounds IO bottlenecking so badly that even the first checkpoint/setup phase can take very long.

There's a checkpoint Start Here FAQ which discusses the particulars for all the different sciences at WCG. CEP2 is opt-in, not opt-out. A special configuration / recommendation page was compiled by the CEP2 scientists.

Switch time you don't have to change. All BOINC clients since about v6 do not switch project unless a checkpoint has been made. It is on the recommendation sheet to activate LAIM (Leave application in memory when suspended) when running CEP2, so when they are interrupted, they can resume from where they left off.

edit: Obviously, if BOINC is restarted [unloaded from memory] or the system booted, the task(s) resumes from last checkpoint.
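
As a quick worked example of what a restart would cost, using the figures quoted earlier in this thread (almost 5 hours of CPU time, last checkpoint at around 24 minutes). This is plain arithmetic, not anything read from BOINC, and `cpu_time_at_risk` is a made-up helper name:

```python
from datetime import timedelta


def cpu_time_at_risk(cpu_time_now, cpu_time_at_last_checkpoint):
    """CPU time that would be redone if the task restarted from its last checkpoint."""
    if cpu_time_at_last_checkpoint > cpu_time_now:
        raise ValueError("checkpoint cannot be later than current CPU time")
    return cpu_time_now - cpu_time_at_last_checkpoint


# Rounded figures from the thread: ~5 hours of CPU time, last checkpoint at ~24 minutes.
at_risk = cpu_time_at_risk(timedelta(hours=5), timedelta(minutes=24))
print(at_risk)  # 4:36:00
```

With LAIM on and no BOINC restart or reboot, none of that time is lost on a suspend; the number only matters if the task really has to go back to the checkpoint.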
----------------------------------------
[Edit 1 times, last edit by Former Member at May 26, 2013 5:26:04 PM]
[May 26, 2013 5:24:05 PM]
Randzo
Senior Cruncher
Slovakia
Joined: Jan 10, 2008
Post Count: 339
Status: Offline
Re: pending verification

Thank you for the update Knreed.
Give us these challenging jobs; we like challenges. I will definitely opt in for them.
[May 26, 2013 6:28:06 PM]
SuDu2
Cruncher
Joined: Nov 13, 2013
Post Count: 9
Status: Offline
Re: pending verification

I think I have data files sitting in my Home folder (Ubuntu) that need to be fetched. I am new to Ubuntu and WCG, so I will need help. The files appear to be locked.
----------------------------------------
[Edit 1 times, last edit by SuDu2 at Nov 17, 2013 5:14:02 PM]
[Nov 17, 2013 2:22:57 PM]