May 5th, 2009
Debugging Jita Live is For Real Men
One of the problems we face in running a cluster on a massive scale, such as our beloved Tranquility, is that it’s extremely difficult to test specific load issues before code is deployed onto the cluster.
We have a series of load-inducing tests that we run on our test servers and we get players to participate in huge fights on our public test server in order to gauge the effects of new code.
In Apocrypha we had a staggering number of changes to the code base from dozens of programmers working on three continents. Keeping tabs on the changes was a daunting task and, as always in large software projects, a few bugs slip the net and make their way to the production server.
In this case, a bug caused Jita to start suffering from performance degradation with 300 people in the system and we had no idea why. Basically all the nodes were running hotter than they should be, and the Jita node was running at 100% CPU capacity under a load that should only have had it running at 30%.
Luckily our live team has become quite proficient in debugging issues on TQ and immediately sprang into action. We have extremely talented people in all departments and for this specific problem Programming, Quality Assurance and Virtual World Operations formed the backbone of Operation: Fix-whatever-is-wrong-on-TQ.
We have extremely good diagnostic tools in place on the cluster, but the first thing we noticed was that those tools did not properly report where the CPU cycles were bleeding. From studying the graphs, it looked as if the server was doing the same amount of work as before, only each work unit was more expensive than it used to be.
After much deliberation, we concluded that the fault must lie in one of the low level systems of the game that reside outside our timing logic (in layman’s terms: Planck-Scale Spacetime Fluctuations). Moving forward, we split the taskforce into two teams. The first team (Team Alpha) started to examine the code in search of the needle in a haystack while the second team (Team Bravo) started to look more closely into the program execution on the cluster itself.
Team Alpha looked at the code carefully and concluded that the fault must lie in a piece of logic that has to do with “bound object connections”. Following that route led them to an internal reproduction case. Once they had that, finding the actual problem was rather straightforward.
Team Bravo went onto the Jita node and paused the running process to see which segment of code the CPython interpreter was executing.
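For readers curious what "pause it and see where it is" can look like, here is a rough Python-level sketch of the same idea. It is not the tooling the live team used, which the post does not describe; the names `busy_worker` and `sample_hot_spots` are made up for illustration. It repeatedly looks at the innermost frame of a running thread through the standard `sys._current_frames()` call and counts which function it keeps landing in.

```python
import sys
import threading
import time
from collections import Counter


def busy_worker(stop):
    """Hypothetical stand-in for code that burns CPU until told to stop."""
    total = 0
    while not stop["flag"]:
        total += 1
    return total


def sample_hot_spots(thread_id, samples=20, interval=0.005):
    """Peek at one thread's innermost frame several times.

    Returns a Counter mapping function name to the number of samples
    in which the thread was found executing that function.
    """
    hits = Counter()
    for _ in range(samples):
        # sys._current_frames() maps thread id -> that thread's current frame.
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            hits[frame.f_code.co_name] += 1
        time.sleep(interval)
    return hits


if __name__ == "__main__":
    stop = {"flag": False}
    worker = threading.Thread(target=busy_worker, args=(stop,), daemon=True)
    worker.start()
    time.sleep(0.05)  # give the worker time to enter its loop
    hits = sample_hot_spots(worker.ident)
    stop["flag"] = True
    worker.join(timeout=2)
    for name, count in hits.most_common():
        print(name, count)
```

If a process spends most of its time in one place, that place dominates the samples, even when higher-level timing instrumentation does not cover it.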
By a startling coincidence both teams arrived at the same piece of code at the same time!
It was literally one line of code. Typical…
Fixing the problem was trivial, and the teams then deployed the fixed code directly onto a single node on the live running cluster. This had never been done before in this way and was extremely exciting. The Jita solar system was moved onto the node and we saw immediately that the fix was working.
The next day, during downtime, the fixed server build was deployed cluster-wide and space-time was finally allowed to heal itself.
This is how real men (and women) do it…
- CCP Atlas
Source: www.eveonline.com
Call for Candidates – The Third Council of Stellar Management
Is it that time again already?
Yes it is.
Soon we will open the candidate applications page for the third term of the CSM (details of dates below). Like the two terms before, the full name and location of the candidate will be published, along with the candidate’s character name, URL and campaign message. All candidates have to be 21 years of age or older and submit a scan of their valid passport in order to be considered eligible. Furthermore, it is very important to have 100% correct and up to date ownership information on all of your user accounts, as all of them will be checked.
Why should you run for the CSM?
Because if you do, you have a chance of affecting EVE, both in the short term and in the long run. So far the CSM, during the first term and the current second one, has brought 128 topics to CCP. Topics such as “add skill queue,” “crane needs more powergrid,” “switch ammo in all weapons at once,” “black ops improvements” and “bombs need a boost” have been approved by CCP and implemented already to some extent: weapon grouping, allowing Blockade runners to use Black ops jump portals and decreasing the mineral requirements for bombs are the changes made so far.
These might sound easy and trivial; we, however, believe that all topics that do not qualify as a bug report can be brought up by the CSM. Luckily, the number of easy sounding and trivial looking topics is finite and we are already seeing discussions between the CSM and CCP regarding the future of 0.0 from several perspectives: sovereignty, industry and station handling and management. Furthermore, the progress of the history in EVE (touching on Factional Warfare and written fiction) and the general mineral status, including where the minerals are coming from, have been discussed by the CSM.
Out of the 128 topics discussed, 50 have been added to and/or given increased priority in the backlog, 47 have been bumped into the pipeline from the backlog and/or given increased priority there, 20 topics are considered to have been implemented (some will appear in Apocrypha) and 9 have been denied implementation (mostly because of technical reasons). Two topics are awaiting further input from the CSM. Remember that EVE is a living game where many of its features are bound to change as the days go by — thus many items said to be in the pipeline may have addressed a topic brought up by the CSM in a partial manner. Small, incremental changes are the key — such as allowing Blockade runners to use black ops jump bridges — where a change was introduced, its effects monitored and further decisions made once data has been gathered.
The CSM white paper and the summary of it can be found if you chase these links.
Following are the important dates for the third CSM:
- March 17th. Candidacy applications for the third CSM open.
- March 31st. Candidacy applications close.
- May 12th. Voting for the third CSM opens.
- May 26th. Voting for the third CSM closes.
- May 29th, Thursday. A permanent announcement is made about the results of the third CSM elections.
- May 31st, Saturday. The third CSM meets online for the first time.
- August 19th – 23rd or 26th – 30th, Wednesday to Sunday. CSM arrives in Iceland to speak with CCP.
The meeting in Iceland has been moved to August because of summer vacations, and the final dates have not been confirmed yet. The meeting will be held either on the 19th – 23rd of August or the 26th – 30th of August.
Throw your hat in the ring – make history.
Source: www.eveonline.com