Posts by Sorceress
21) Message boards : Number crunching : cas is suspended until problem is fixed (Message 159)
Posted 3166 days ago by Profile Sorceress
Thanks for your input. I hope he will. As I said, there are much better ways to handle redundancy than aborting all these work units. He just needs to implement them.

BTW, I am retired and have a lot of time to watch most of what goes on with my projects. Maybe this will help you understand how I came to my conclusions.

Regards
22) Questions and Answers : Windows : Aborted by project (Message 157)
Posted 3166 days ago by Profile Sorceress
simon-PC CAS@home 5.14 Short-Cut Threading wu_1280073360_250403_2 16/08/2010 5:47:31 AM 16/08/2010 7:15:48 AM Aborted by project
simon-PC CAS@home 5.14 Short-Cut Threading wu_1280073360_237133_0 15/08/2010 3:44:20 AM 15/08/2010 4:05:50 AM Aborted by project

Any idea as to why this is happening? Win7 Ultimate on an AMD 64-bit processor, 3 GB RAM, etc.


It is happening because CAS@H requires redundancy, so it sends out 3 replicas of the WU at a time. It then waits for two of the WUs to be validated, then automatically aborts the third WU. It's a problem I am trying to get resolved. Look in the 'Number Crunching' forums.
23) Message boards : Number crunching : cas is suspended until problem is fixed (Message 156)
Posted 3166 days ago by Profile Sorceress
Your Boinc manager says "cancelled by project" if the task you downloaded has already been finished by others and your result is no longer needed.
Many projects do this to prevent redundant results.


No, *MANY* do not! I am in 30 top projects, and none of them have this problem. Yes, occasionally I have seen the *aborted by project* message but it's very rare. Nor do I know if there had been any work started on the WU before it was aborted. I wasn't watching.

When the project communicates with your Boinc manager and sees that your task has not started yet and its result is not needed anymore, it will automatically jump to 100% and download a new work unit.


But! What if the WU has been started??

No CPU time will be used on that cancelled task, so you do not need to worry about wasted computation time.

This behavior happens more often with cas@home than with other projects because your Boinc client connects with the server once every minute, while other projects have longer time spans to connect with the server.
I hope my explanation answers your question.


What is the reason BOINC communicates with CAS@H more often than the other projects? Even if BOINC communicates with CAS@H every minute, given the short processing time for CAS@H WUs, processing is well underway on all the WUs. So if the WU has begun processing and the project checks on its status, what happens then? CAS@H is sending out 3 replicas of a WU. Then it must wait for two of them to be validated. The time between the WU being sent and the time it is validated is more than enough for all 3 to be completed. Why would CAS@H abort any WU until it's been validated? That would defeat the purpose of redundancy.

I'm sorry, but I find your explanations simply don't hold water. Because I was aware there may be problems with CAS@H WUs, I WATCHED the whole process completely, from start to finish. The WU was received, processing was started as normal, and I watched it through to completion. It took about 38 minutes, and at no time was the WU aborted (as would happen in your scenario) until I uploaded it. Nor was I paid for the WU! So *DO NOT* tell me the WUs are aborted *BEFORE* they have started. It's not true! Nor was this just one incident, but one of many. So while your explanation may work for some, it's not true in my case. And I KNOW this is happening to others.

MY point in this whole matter is not the redundancy but the credits for work done. IT *IS* being done and *NOT* being credited!! I listed 5 WUs earlier in 'Granted Credits' that I observed personally. Those alone contradict your explanations. What aggravates me is the fact that out of 20 WUs I *COMPLETED*, 9 of them were aborted. That's almost half! Because of this and other issues I have, this project is beginning to be a waste of my time. I do like CAS@H and the science behind it, but either these issues are resolved or I move on. There is no reason for us to have to deal with all these aborted WUs. The issue of redundancy can be handled in much easier ways, but ADMIN refuses to implement them.

What I do question is where you get your facts. Are you part of the CAS@H admin or just a volunteer? Or is there a page where I can find them? Since the 'facts of life' don't support your 'theories of evolution', why don't you do some real research? You could begin by reading earlier posts in the other forums.
24) Message boards : Number crunching : Granted Credits (Message 152)
Posted 3167 days ago by Profile Sorceress
Hi,
I also had a lot of wasted WUs. It's time to fix the problem. I hate crunching for nothing!


Thanks for your input Henery. I am hopeful we can resolve this issue soon.
25) Message boards : Number crunching : Granted Credits (Message 150)
Posted 3168 days ago by Profile Sorceress
As a footnote to my earlier reply to Jie Wu, I will admit that I am not familiar with BOINC server software. While indeed there may be ways of aborting WUs before they have started processing, I see time constraints as being a key factor in making that impossible. The project server would have to immediately contact the host directly (within seconds of sending out the WU) to abort it. Or certain controls could be embedded in the work unit to wait for a 'go/no-go' command from the server before processing. Again, this would require direct intervention from the server. I have been using BOINC for 4 years now. As far as I know, BOINC only contacts the project's server when requesting work, reporting completed work, or during a user update. It may be possible for a project's server to contact a host directly (outside the BOINC environment?), but not that I am aware of.

CAS@H is seeking redundancy by sending 3 replicas out at the same time, but only requiring 2 for a quorum. The first 2 hosts to report valid WUs get the credits. The third WU is then *automatically* aborted as not needed. The problem is, the third WU is most likely to have been already completed. Either the server is not checking the third WU's status or CAS@H is deliberately ignoring its status. Either way, CAS@H is not granting any credits on the third WU.
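
The sequence described above can be sketched as a toy model. This is only an illustration under stated assumptions: the names `Replica` and `settle_workunit` are hypothetical, the report times are made up, and the rule that the first two reports are valid and form the quorum is a simplification. It is not the BOINC server code.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    host: str            # hypothetical host label
    report_minute: int   # minutes after send-out when the result was reported


def settle_workunit(replicas, quorum=2):
    """Split replicas into credited and uncredited hosts.

    Assumption: the first `quorum` reports are all valid and form the quorum;
    every later report is treated as 'aborted by project' and earns nothing,
    even though the host finished the computation.
    """
    by_report = sorted(replicas, key=lambda r: r.report_minute)
    credited = [r.host for r in by_report[:quorum]]
    uncredited = [r.host for r in by_report[quorum:]]
    return credited, uncredited


if __name__ == "__main__":
    # Three replicas sent at once; all three hosts finish within minutes of each other.
    wu = [Replica("host_a", 30), Replica("host_b", 35), Replica("host_c", 38)]
    print(settle_workunit(wu))
```

With three replicas and a quorum of two, one host always ends up in the uncredited list whenever all three finish, which is the situation complained about here.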

In the beginning, CAS@H was paying for all 3, but I see that stopped a while back. Now they only pay for 2. That required direct server intervention by admin. As such, I can not help but feel that CAS@H is aware of the situation and refuses to rectify it. I am not sure why that is.

I would appreciate other users weighing in on this, so that I am not a voice of just one. I think Jie Wu may view me as a cranky ol' witch, but I assure you, that is not the case. (At least I don't think so LOL). Anyway, if he heard from others as well, he may find a solution that is a 'win-win' for us all. If any of you have uncredited WUs that you know were completed but aborted by the project, please post your findings here. Your help would be appreciated.
26) Message boards : Number crunching : Granted Credits (Message 149)
Posted 3168 days ago by Profile Sorceress
I have checked it, but I am sure it is handled in case 2! Is it a bug of BOINC?
Has anybody else run into this problem? And does anybody know how to fix it?


Jie Wu, you need to understand that you CAN NOT stop WUs from being processed once you have sent them out! Case #2 can't be done! This is not a Boinc problem or bug, it's the way you have the project server set up to run. Once it receives the two validated WUs it needs for quorum, the server *AUTOMATICALLY* aborts the third WU. Apparently, the server does not check to see if the third WU has been completed or not. I know for a fact that the WUs I indicated earlier were, indeed, completed and were aborted when I tried to upload them. I don't know how much plainer I can make that. Your server is aborting completed WUs! Thus, you are wasting our time and resources by not granting credits for our completed work. This has to stop! It is not my desire to be disrespectful, but all this 'beating around the bush' is getting us nowhere.

Here is the fix. You can...
1) Stop aborting the third WU if it has been completed and grant the proper credit for its successful completion. Or...

2) Stop sending out 3 replicas if you're not going to pay for all three if successfully completed. Or...

3) Send out 2 replicas and wait for their successful completion. If one or both fail, then send out another replica as needed to satisfy your 'quorum'.

This is the way the other projects handle their *redundancy* needs. They don't ask their users to work on WUs and then not credit them for their time. That's what this project is doing. It's unethical and shows a lack of respect for our efforts on your behalf.
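
Option 3 can be sketched in a few lines. This is a simplified model under stated assumptions, not the BOINC scheduler: the function name `replicas_needed` is hypothetical, and results are assumed to arrive one at a time as valid (`True`) or failed (`False`), with a replacement replica issued only after a failure.

```python
def replicas_needed(outcomes, quorum=2):
    """Count how many replicas must be sent to reach `quorum` valid results.

    Models option 3: start with `quorum` replicas and send one more only
    when a result fails. `outcomes` lists each replica's result in the
    order the replicas are issued (True = valid, False = failed).
    """
    valid = 0
    sent = 0
    for ok in outcomes:
        sent += 1
        if ok:
            valid += 1
        if valid == quorum:
            return sent
    raise ValueError("not enough valid results to reach quorum")


if __name__ == "__main__":
    print(replicas_needed([True, True]))         # both initial replicas valid
    print(replicas_needed([True, False, True]))  # one failure, one extra replica
```

In this model no replica is ever sent unless its result is still needed, so no completed work goes uncredited. The trade-off is that a work unit with a failed result takes longer to validate than when three replicas go out up front.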

Normally I would have already dropped this project and moved on (I'm sure others have already done so). But I like this project and the science behind it. I prefer to remain working for it. I understand it is new and has its problems. For these reasons, I am trying to resolve these problems in a polite manner. But I can not continue wasting my time and resources on uncredited work. Either we fix the problems or I will detach and move on. And I will inform others of what I see occurring here. There are forums for just this purpose.

It's your choice. Please advise us on your decisions.

Regards,
The Sorceress





27) Message boards : Number crunching : Granted Credits (Message 146)
Posted 3172 days ago by Profile Sorceress

1) Abort tasks regardless of status.

2) Abort tasks, only if they have not yet started crunching


In the case of 2), I don't see how you can abort any of the 3 WUs if they are all sent at the same time. All 3 will have already started processing. You can either hold the 3rd WU until the first two are validated, then send the 3rd WU ONLY if needed, or pay for all three, in which case scenario 1) is acceptable. As it now stands, the first two computers to upload a valid WU get paid while the 3rd computer does not. I find this unacceptable!

I strongly prefer NOT to process redundant WUs without being compensated for my work.
28) Message boards : Number crunching : Granted Credits (Message 144)
Posted 3173 days ago by Profile Sorceress
I am still not getting paid for work done. CAS is not paying for the third validation, only for two. I have suspended this project until this issue is fixed.
29) Message boards : Number crunching : Wingman stats (Message 142)
Posted 3175 days ago by Profile Sorceress
Thanks, Zombie, for clarifying that point. There is a difference between a wingman's information and your own information within a project. When someone looks at your computer information, they see what you see when you look at their information. They do not see what YOU see when YOU look at your own information. No personal info like IP address, domain name, BOINC use stats, etc. is seen in the wingman view. I know this may sound a bit confusing, but it's not once you understand the points of reference. It would be helpful if our wingman's info was available. TIA

What may also be happening, Jie Wu, is that as admin, you may be seeing more information about a user's computer than what is normally seen when we look at that user's information. A lot more. In truth, the wingman view has very limited information but it's what we need for our evaluations. Please give this some consideration. Thanks

PS I would be honored if you would join my friends list. I have sent you a request.
30) Message boards : Number crunching : Granted Credits (Message 141)
Posted 3175 days ago by Profile Sorceress
Work unit 227568 just finished being processed. When I tried to upload the work I got an 'aborted by project' message. This problem is caused by sending out 3 replicas of a WU when you only need 2 for validation. This is fine as long as you pay ALL three users for their work. That's not happening. You can also drop the 3rd replication and pay only for the two you require. No free pass on this issue. We should get paid for all WUs completed whether you need them or not. I have suspended further processing until this problem is resolved. Your urgent attention would be appreciated. TIA
31) Message boards : Number crunching : Operating Systems (Message 138)
Posted 3176 days ago by Profile Sorceress

Most of the big ones use Linux. :-)

Unfortunately, the page only shows Windows. Oh well, maybe next time.

Cheers, Mike.


Actually, Mike (ADHC), the most common DC OS platform is Windows based. Yes, there are 'some' large Linux 'farms' but they aren't 'most'. Look at any project's top participants' computer list (if they aren't hidden) and Windows leads the pack with Linux and Mac trailing. Sometimes it's a mix of both, with Windows generally being the leader.

I am not sure why not having a Linux app would be a problem. In looking at your BOINC@AUSTRALIA team members' computers, I see that most of them are Windows based anyway (including yours). I'm sure in time, CAS@H will be available on all OS platforms, but for now your team (and you) will do just fine attached to this project. And it's good science! Happy crunching!
32) Message boards : Number crunching : Granted Credits (Message 137)
Posted 3176 days ago by Profile Sorceress
Thanks for taking the time to reply Jie Wu.

In keeping on topic, I have another problem that concerns me. I have been watching the WUs that are aborted by the project and I have noticed that the abort command is happening after my computer has finished processing the WU and is uploading the results. In other words, I am wasting computer time on redundant work and losing credits I have earned in processing. The last 3 WUs: 209566, 217204, 222349 (+others), had all been completed and were being uploaded when the abort command came. And no credit was granted for my time.

Can you please give this your attention?
33) Message boards : Number crunching : Wingman stats (Message 136)
Posted 3176 days ago by Profile Sorceress
Jie Wu, wingman information is readily available on all the projects I am associated with. There is nothing in the information that violates his/her privacy that I am aware of. It contains no personal information, only the computer's. Anyone can access my information anytime they want. It's OK. There is nothing I need private about my system. It does help us determine who is working on the task with us and how long it may take them to complete their part. It's a common practice to allow this access and can easily be verified if you wish. TIA
34) Message boards : Number crunching : Wingman stats (Message 131)
Posted 3177 days ago by Profile Sorceress
SETI does this too. But only for the tasks that have not yet completed. What is it doing here?


I am not aware that S@H did that. I have always been able to access a wingman's information, regardless of the task's status. Is this something recent? I have dropped the Seti projects because of the long down times and the low credits, so I have not checked this lately.

As for CAS, when I try to access the wingman's info I get this message: "Sorry! You have no privilege to browse this page!" I was hoping Jie Wu could fix this.

BTW, Zombie I would be honored if you join my friends list here. My request is pending. Have a great day!
35) Message boards : Number crunching : Wingman stats (Message 125)
Posted 3180 days ago by Profile Sorceress
Would you allow us access to our wingman's computer information in the 'Task View' window? Right now access is denied. TIA
36) Message boards : Number crunching : Granted Credits (Message 121)
Posted 3183 days ago by Profile Sorceress
Would you increase the granted credits level? 18 credits/hr is fairly low compared to most of the projects. A 2-3x increase would be on par.
37) Message boards : Number crunching : Super User With Mega Host-Farm ? (Message 119)
Posted 3184 days ago by Profile Sorceress
How can 1 (one) user generate 31,583 credits in just 1 (one) day?


Easily. Try 400+ 8-processor CPUs w/ 20+ gigs of RAM each, 2 or more very high-end GPUs in each machine. Plus high-end bus interfaces and more money than you can imagine. They can process several thousand WUs a day without breaking a sweat. There are many users out there with that kind of power. I have seen them.

Many users are IT techs, working for large corporations, universities, etc. with huge computer systems that they use to process work units during the off-peak hours. And some are users with more time and money than most of us have.

Welcome to distributed computing!
38) Message boards : Number crunching : Profile problem (Message 114)
Posted 3186 days ago by Profile Sorceress
The profile problem can wait till you have the time. Also, could you place the friends list on the right of the profiles instead of at the bottom?


Sorry, I didn't understand this! Could you explain it again?

Looking at the 'User (account) Information' page, you see the third header down is 'Community'. Could you move the 'Community' information to the right-hand side of the 'Account Information' page, where the 'ICT,CAS - Tsinghua University' column is? Adding it under the 'ICT,CAS' information would look good. This is generally how most of the projects have the user information page set up. It makes the page more compact. Otherwise, if you have a lot of 'friends', as many of us do, the user information page can get quite long.
39) Message boards : Number crunching : Profile problem (Message 112)
Posted 3187 days ago by Profile Sorceress
Thank you for your attention and reply. The profile problem can wait till you have the time. Also, could you place the friends list on the right of the profiles instead of at the bottom? TIA

I did want to ask if 3 replications are needed for processing when you need 2 for a quorum. It would help with the server aborts if you used 2 for initial replication when 2 are required for a quorum, I would think.

I wish the project the best of success. :)
40) Message boards : Number crunching : Profile problem (Message 110)
Posted 3188 days ago by Profile Sorceress
Hello and Welcome Jie Wu! I have joined your project to help in your research.

I have two problems so far.

1) I am unable to add a picture to either my profile or message board. Can you fix this please?

2) Why are my WUs being aborted by the project? Is there something wrong on my end or yours?

Otherwise, I'm good to go :)

