Posts by Sorceress
1) Message boards : Number crunching : Validation Problem (Message 276)
Posted 2993 days ago by Profile Sorceress
I have had another WU fail after upload. Is it me, or are others having this problem? WUs are failing on 2 different computers, so the problem is with the WU or the server. I can't get Admin to fix the problem, so I've suspended CAS.
2) Message boards : Number crunching : Validation Problem (Message 274)
Posted 2996 days ago by Profile Sorceress
Is your OS 64bit?


No. WinXP 32bit
3) Message boards : Number crunching : Validation Problem (Message 272)
Posted 2997 days ago by Profile Sorceress
Hi, Sorceress, I will not change the WU state, just try to grant you credits for the WUs that were not handled correctly! So you will see no change in your WU state.


Please explain this. In the last 40 WUs I have been granted 34.2 credits for 105,676 secs computer time. I would appreciate some compensation for the time I spent on the bad WUs. I feel 400 credits added to my account would be fair. TIA

Sorceress
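As a rough check on the figures quoted in the post above (34.2 credits for 105,676 seconds of computer time), the effective credit rate can be worked out as below. This is only an arithmetic sketch; the function name `credits_per_hour` is made up for illustration and is not part of BOINC.

```python
def credits_per_hour(credits, cpu_seconds):
    """Convert a credit total and CPU time in seconds to credits per hour."""
    return credits / (cpu_seconds / 3600)

# Figures quoted in the post: 34.2 credits for 105,676 seconds.
rate = credits_per_hour(34.2, 105_676)
print(f"{rate:.2f} credits/hr")  # prints "1.17 credits/hr"
```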
4) Message boards : Number crunching : Validation Problem (Message 261)
Posted 3001 days ago by Profile Sorceress
This is a bug of BOINC, I will report it to the developing team of BOINC. As for credits, I will compensate you as soon as possible!


Thanks, Jie Wu. That is fair with me. I had no idea there was a problem with the CAS WUs until, as I routinely do, I checked on my completed tasks. I then saw that a large number had ended in errors. On my end, those WUs were successfully completed and marked 'ready to report', not 'computation error'. So the error occurred after they were uploaded. Had I known they were erroring out, I would have suspended CAS much earlier. I will watch to see how the 'failed' WUs are handled.

Sorceress
5) Message boards : Number crunching : Extreme short deadline (Message 257)
Posted 3001 days ago by Profile Sorceress
So sorry about this, it is the admin's fault! A zero was missing at the end of the deadline in the configuration file. I have asked someone to correct it! And sorry that it is so late to reply to everyone; because I was at a conference, I didn't look at the forum the other day. Sorry again!


Thanks for replying, Jie Wu. The big question now is what will be done with the WUs that have been completed but for which no credit has been granted. I have over 40 WUs that were completed without pay. Since the problem was on your end, we should have credits issued for the work that has been completed. Your attention to these matters would be appreciated.

Sorceress
6) Message boards : Number crunching : Extreme short deadline (Message 253)
Posted 3002 days ago by Profile Sorceress
Yes, the deadlines have become reasonable again. The one I just downloaded has a deadline of 8 November 2010. But I think we crunchers deserve some explanation from the admin of what has happened in the past few days, including the many computation errors we encountered.


I agree!! And grant credits for the work completed on the earlier 'failed' WUs. The task view says 'errors while computing', yet when I uploaded the WUs they said 'Ready to report', not 'computation error'. The computation error was on the server end, not my computer.

Any more problems with bad WUs and I will drop this project. I have spent enough time on corrupt WUs, and the failure of Admin to address our problems is disrespectful of our time and resources.
7) Message boards : Number crunching : Validation Problem (Message 244)
Posted 3003 days ago by Profile Sorceress
Is admin responding to any of our messages?? I have over 30 WUs that have errored out on two machines with a lot of crunch time on them. In examining the failed WUs I see a lot of computers (5-9 on many) also returned errors for the same WU!! I have suspended further work from this project until we get some answers.

Jie Wu, you asleep, dead or just ignoring us?? Your WUs or software is corrupt and costing us valuable machine time!! How about some answers, please!! And some credits for our work.

Sorceress
8) Message boards : Number crunching : Granted Credits (Message 210)
Posted 3032 days ago by Profile Sorceress
I see server upgrade... Maybe WU credit upgrade too? :)

DD,


We've been trying. So far they haven't answered. One can hope, can one not?
9) Message boards : Number crunching : PROFILES (Message 198)
Posted 3052 days ago by Profile Sorceress
Jie Wu,
FWIW, you need to implement an anti-spam program in the profile creation/editing process, or the spammers will flood your database with thousands of ads. A good program for that is called 'ReCaptcha', where a human response is required to match numbers/letters/words before a profile can be created or edited. Once the spammers find your database is unprotected, you'll have a mess on your hands.
10) Message boards : Number crunching : Profile problem (Message 190)
Posted 3063 days ago by Profile Sorceress
Looking at the 'User (account) Information' page, you see the third header down is 'Community'. Could you move the 'Community' information to the right-hand side of the 'Account Information' page, where the 'ICT,CAS - Tsinghua University' column is? Adding it under the 'ICT,CAS' information would look good. This is generally how most of the projects have the user information page set up. It makes the page more compact. Otherwise, if you have a lot of 'friends', as many of us do, the user information page can get quite long.



It is a good idea! I will try that later!

Thank you for your proposal!


I would like to get the friends list placed as described above. Thanks for fixing the 'add avatar' in the profile creation. In the profile galleries, the link for each user should link to their profile data. Right now, it links back to our own profile data instead. Please fix the links. Clicking on their profile will then take us to their profile, where we can add them to our friends list if desired.
11) Message boards : Number crunching : Granted Credits (Message 189)
Posted 3063 days ago by Profile Sorceress
Okay, back to the original thread topic. Can we please get an increase in the credits granted? Maybe a 2-3x increase would be fine. 12 credits/hr is pretty low. TIA
12) Message boards : Number crunching : cas is suspended until problem is fixed (Message 181)
Posted 3071 days ago by Profile Sorceress
Jie Wu, your assistance in finally resolving a very frustrating situation is much appreciated. The new WUs I received are now set appropriately to 2 'init_replicas'. I apologize for my hard-edged approach to resolving this problem, but it had to be fixed. It was never my intention to be disrespectful, in any event. I feel certain that operations will run much more smoothly now that it has been, especially given the misunderstandings and aggravations all those aborted WUs were causing. We still have a few more areas to work on, like increasing the credit levels and making some changes to the website GUI, but for now, those can wait upon your convenience. We will talk about those later. I am glad I don't have to detach from CAS@H. I do like the science you're doing and want to do my part to help. Thanks from all of us.

BTW, a word of advice. Wikipedia is really not a good place to get any kind of 'accurate' information, of ANY kind. While Wiki does provide useful information about a lot of things, you cannot take everything it says 'verbatim'. Caution and a little common sense are the key to using Wiki. In our case, contact with the other projects' admins would have been much more productive and reliable. Wikipedia is like the Christian Bible. Written by men with good intentions.

Cheers
13) Message boards : Number crunching : cas is suspended until problem is fixed (Message 176)
Posted 3072 days ago by Profile Sorceress
Please look at http://www.boinc-wiki.info/Min_quorum. Apparently, in some projects min_quorum is equal to target_nresults, while in others it is not. Actually, it is common in some projects to set target_nresults higher than min_quorum, because in these projects scientists usually want to get results quickly. Suppose min_quorum is equal to target_nresults and a volunteer gets a WU, then has to shut down his computer for a month-long business trip before the WU is finished. What will happen? The answer is that the project server has to wait until the WU exceeds the deadline (BOINC default value is 7 days) and then send another copy of the WU to another computer. That is not acceptable to scientists who want results quickly.


I looked at the link you posted and almost all of the 'initial replica' values are totally incorrect. If you're basing your project's 'mode of operations' on Wiki, no wonder you are so far off target. Wiki is a great information source, but it's completely unreliable for accuracy and rarely up to date. I crunch for five of the projects listed: SETI, Einstein, Simap, Climate Prediction, and Rosetta. The only 'initial replica' on that list that is correct is Climate Prediction. I can easily bet that the rest are inaccurate as well.

This issue is obviously going nowhere. I told you how to reduce the large number of aborted workunits you have by setting the 'initial replica' to 2 or less. Send a third only if you need validation for a min_quorum. You are wasting our time and resources with this nonsense, whether you think so or not. It's aggravating to see that many aborted projects.

You have a choice: stick by your 'guns' and continue to push the 3 'initial replicas' on us, with all those aborted WUs piling up, or change it to 2 and see if that works. I'm tired of the abortions and the lack of consideration for the users. Coupled with the other issues I have with how this project works, it's time for me to drop it. I do hope you will see reason. Otherwise, I'm outta here.

It's your call!

Let me say one more thing. I don't think you understand the project/participant symbiosis. We attach our computers to projects for several reasons. Some for the science and the desire to help in the research. Some for the 'bragging rights' in having the most credits. And some for a little of both. But in all cases, we do this for fun. Distributed computing is all 'volunteer'. When a project is causing or having problems, it's no longer FUN. Get my point? With all the aborted WUs, your project is not fun and is a source of irritation. When that happens, you will start losing your support.
14) Message boards : Number crunching : cas is suspended until problem is fixed (Message 170)
Posted 3074 days ago by Profile Sorceress
I have checked all 30 of my current projects (you can view my project list) and NONE require 3 for 'initial replica'. NONE. CAS@H is the only one. It's either 1 or 2 from what I see. Requiring 3 initial replicas is the cause of the large number of aborted WUs and wasted redundancy. I feel this is unnecessary aggravation and could easily be avoided by an initial replication of 2 or less. IMHO, that is the best solution.
15) Message boards : Number crunching : cas is suspended until problem is fixed (Message 169)
Posted 3075 days ago by Profile Sorceress
The issue of redundancy can be handled in much easier ways, but ADMIN refuses to implement them.


Firstly, tell me how! no philosophy, step by step!

Secondly, tell me when I said "refuse", or how you got this conclusion?


Jie Wu, I did not say you 'refused' at all. Sorry for my inconsideration. I meant 'refused' as in not 'acting' to resolve the problem. I apologize for the misunderstanding.

I feel that requiring 3 replicas to be sent out at a time invites trouble. There are too many different systems out here to use such an inefficient method for redundancy. Although I have no statistical facts to show how many WUs actually fail at successful processing, from my own personal experience, my system rarely fails. If that is the case with most hosts, then the chance of failing to get the necessary validations of a WU is fairly low.

So here's what I propose...

Try sending only 2 replicas out first, instead of your usual 3. Then see how many times you get validation on the first try. If, the majority of the time, you get the 2 you need for quorum on the first sending, stop sending the 3rd. Use a third sending ONLY as needed for a valid quorum. If you look at most of the other projects' requirements, you see 2 for quorum and 2 for replica. Rarely are 3 replicas sent at first. (This only applies to the projects I am attached to.)

By doing it this way, you reduce or eliminate the excessive number of aborted WUs. When I look at my task view and see all those aborted WUs, I get aggravated, believing, as I do, that those WUs had been completed and not granted any credit. And I'm sure other users are seeing and feeling the same way. Many probably just say 'screw it' and detach, rather than trying to fix the problems. I prefer to stay and fix them.

Try to see our point of view. When I look at any of my other projects' task views, I do not see a long list of aborted WUs. So, obviously, the other projects are handling redundancy in a different way. Their stats prove that. I am asking for your consideration in helping reduce the number of aborted WUs and the frustrations that may accompany them.

I mean no disrespect whatsoever, Jie Wu. Again, I apologize for any misunderstandings.

Regards,
Sorceress
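The proposal in the post above (send 2 replicas first, and a third only when needed to reach quorum) can be compared with sending 3 up front using a small simulation. This is a minimal sketch, not how the BOINC server actually schedules work: the function `simulate`, the 95% per-result success rate, and the one-at-a-time resend policy are all assumptions made for illustration, since the post itself notes that no failure statistics are available.

```python
import random

def simulate(initial_replicas, min_quorum=2, success_rate=0.95,
             n_workunits=10_000, seed=0):
    """Return (avg results sent per WU, avg valid results beyond quorum per WU).

    success_rate is an assumed probability that a host returns a valid result.
    """
    rng = random.Random(seed)
    total_sent = 0
    total_surplus = 0
    for _ in range(n_workunits):
        sent = initial_replicas
        valid = sum(rng.random() < success_rate for _ in range(initial_replicas))
        # Send one extra replica at a time, only while the quorum is not met.
        while valid < min_quorum:
            sent += 1
            if rng.random() < success_rate:
                valid += 1
        total_sent += sent
        total_surplus += valid - min_quorum
    return total_sent / n_workunits, total_surplus / n_workunits

for replicas in (2, 3):
    sent, surplus = simulate(replicas)
    print(f"initial replicas={replicas}: {sent:.2f} sent/WU, {surplus:.2f} surplus/WU")
```

Under these assumptions, an initial replication of 2 never produces a result beyond the quorum, while an initial replication of 3 usually does. Whether such a surplus result is aborted or credited is a server policy that this sketch does not model.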
16) Message boards : Number crunching : Granted Credits (Message 168)
Posted 3075 days ago by Profile Sorceress

Also, what I want to say is that this is happening in other projects too, not just CAS@home. My computer is also running WGC; you know, it is a successful project. I checked the list of results I had finished, and some results were not granted credits, either.

Please check your computer, network and BOINC client, although maybe it still can not resolve your problem.

Hopefully everything is OK!


Thanks for the information, Jie Wu. I am including a post I made to One World, One Dream earlier that might explain what may be happening:


I had suspended CAS@H until I could resolve the issues I am seeing. But I opened it up last night and started processing WUs again. I followed the first all the way thru and I was paid, as were all three. So right now, I am at a loss as to what is happening. One thing that might be happening is that BOINC sometimes switches between projects fairly often, even though I have it set to switch every 120 minutes. I do run a lot of projects (30) and some require 'high priority' processing at times. So BOINC will run a short time on one WU, then switch to another to run in 'hi pri' for a while, then move on to another. Somewhere in that shuffle, CAS@H's server may have checked the WU status, seen it had not been started, and sent the abort command, but BOINC didn't get it in time to stop the WU until later. So it was processed, and then when BOINC uploaded the WU, it saw it had been aborted and did so. In that case, the problem is on my end. Running this many projects with the processor I have may be pushing BOINC to its limits to keep up with which WU to process next. In other words, BOINC is at its 'wits' end'. :)



Can you see what I mean here? Could this be my problem? Too many projects? Or the switching?

I don't know what 'WGC' is or how it affects the overall processing/validation/granted credits of the project. I do know that none of my other projects have as many aborted WUs as you do. CAS@H has FAR more 'projected aborted' WUs than any other. You can look at my list of tasks and see that for yourself. I feel this is inappropriate and your redundancy requirements could be handled a lot more efficiently than they are right now.

Meanwhile, 'back at the ranch', I will continue to follow this problem and, hopefully, find the true cause of my 'aborted' WUs. In the meantime, I have started processing CAS@H WUs again. I do like the science you're doing and want to participate in it.

There is also still the issue of the 'granted credits' being so low. This was the original premise of the thread. Could you please up it a few levels more, say 3-4x what it is now?? I would appreciate you looking into this and the other requests I made earlier: allowing us to view our wingman information, moving the friends list to the right of our profiles, fixing the user profiles to allow avatars, and initiating the 'participant photo galleries'. I know you are busy, but these areas are just as important to us as your science is to you!

Thanks again for taking time to address our problems and answer questions.

Regards,
Sorceress
17) Message boards : Number crunching : Granted Credits (Message 167)
Posted 3075 days ago by Profile Sorceress

What kind of equipment do you use? At the moment I am only using one slow Pentium M 1.6 Ghz processor, and my Boinc manager version is 6.10.18 (I know I should update).
I am looking forward to your reply, hopefully this problem can be resolved soon!



I am running a 2.2 GHz dual core w/ 4 GB of RAM, BOINC 6.10.58. And yes, all 3 replicas were paid. I am slow to upgrade to newer versions myself, as I believe in the old saying... 'if it ain't broke, don't fix it'. My current version is working fine, so I am good to go for a while longer.

I have no idea why my 'abort' rate is that high compared to yours at this point.

I had suspended CAS@H until I could resolve the issues I am seeing. But I opened it up last night and started processing WUs again. I followed the first all the way thru and I was paid, as were all three. So right now, I am at a loss as to what is happening. One thing that might be happening is that BOINC sometimes switches between projects fairly often, even though I have it set to switch every 120 minutes. I do run a lot of projects (30) and some require 'high priority' processing at times. So BOINC will run a short time on one WU, then switch to another to run in 'hi pri' for a while, then move on to another. Somewhere in that shuffle, CAS@H's server may have checked the WU status, seen it had not been started, and sent the abort command, but BOINC didn't get it in time to stop the WU until later. So it was processed, and then when BOINC uploaded the WU, it saw it had been aborted and did so. In that case, the problem is on my end. Running this many projects with the processor I have may be pushing BOINC to its limits to keep up with which WU to process next. In other words, BOINC is at its 'wits' end'. :)

I think at this point, I will let it go and track the problem a while longer. Maybe somewhere along the tracks, I can define the problem better. I will keep you posted.

Sorceress
18) Message boards : Number crunching : Granted Credits (Message 162)
Posted 3076 days ago by Profile Sorceress
One World, One Dream, what is happening is that your work unit was one of the 2 required for validation, in which case you would get paid for it. But were all 3 replicas paid for? I doubt it. If you look at all three hosts, 2 are paid, one is not. I can almost guarantee the third work unit had been completed but was aborted and not paid, because its result was not needed. He did the work, but was not paid. The project will not abort any WU until the first two are validated. The time involved to validate would assure all three have been processed. This is the point I'm trying to make. We are not getting paid for work completed! Check it out for yourself. Remember, last one in buys the beer! :)
19) Message boards : Number crunching : Granted Credits (Message 161)
Posted 3076 days ago by Profile Sorceress
For me everything works fine. Redundant tasks are only aborted by the server if they have not yet started crunching. And if two people have already finished a task and I send in a third result for the same work unit, I still get credits.


Hmmm... your 'mileage' is different than mine. Maybe you should take a closer look at your facts. Out of 20 WUs I completed and uploaded, 9 were aborted without any credits granted. I KNOW they were completed. I watched them complete! That's almost half!

When all three replicas are sent at the same time, there is a 95% chance work will have started on all of them. Also, the time between being sent and the time validated is more than enough time for all three WU to be completed, given the short processing time of CAS@H WUs. Otherwise, aborting any WU until it's validated, would defeat the whole purpose of redundancy in the first place. Timing is the KEY factor in all of this!

BTW, I am retired and have hours of time to watch what goes on with my projects. If there are problems, I have the time to collect the facts. Hope this explains how I arrive at my conclusions.
20) Message boards : Number crunching : cas is suspended until problem is fixed (Message 160)
Posted 3076 days ago by Profile Sorceress
http://casathome.ihep.ac.cn/forum_thread.php?id=23

If you have a question or problem, please use the Questions & Answers section of the message boards. (Why does no one look there?)


Probably because Q&A rarely answers any questions!! :> Most likely it's 'cause most of the WU processing issues are handled in the 'Number Crunching' forums. That's where people are reading and replying to posts.

