
Wikimedia Quarto/2/tech/Zh-tw

From Meta, a Wikimedia project coordination wiki


Technical Development

Tim Starling is the Developer Liaison, the primary contact between our Board and our community of developers. Most of the report below was written by James Day.
Information about our servers may be found at any time at Wikimedia servers. Developer activity falls into two main areas: server maintenance and development of the MediaWiki software, which is also used for many non-Wikimedia applications. Most developers (though not all, by their choice) are listed here. One may show appreciation of their dedication with thank-you notes or financial support. Thank you!
Until now, all developers have been working for free, but that may change in the future to support our amazing growth.

Installation of Squid caches in France

The cluster near Paris.
Our servers are the three in the middle:
(from top to bottom: bleuenn, chloe, ennael.)


On December 18, 2004, three donated servers were installed at a colocation facility in Aubervilliers, a suburb of Paris, France. They are named bleuenn, chloe, and ennael by donor request. For the technically minded, the machines are HP sa1100 1U servers with 640 MB of RAM, 20 GB ATA hard disks, and 600 MHz Celeron processors.

The machines are to be equipped with Squid caching software. They will be a testbed for the technique of adding Web caches nearer to users in order to reduce latency. Typically, users in France on DSL Internet connections can connect to these machines with a latency of about 30 ms, while they connect to the main cluster of Wikimedia servers in Florida in about 140 ms. The idea is that users from parts of Europe will use the Squid caches in France, which should cut access delays by about 1/10 second, both for multimedia content for all users and for page content for anonymous users. Logged-in users will not profit as much, since pages are generated specifically for them and, thus, are not cached across users. If a page is not in a Squid cache, or a page is for a logged-in user, the Apache web servers must take 1/5 to 3 or more seconds, plus database time, to make the page. Database time is about 1/20 second for simple things but can be many seconds for categories or even 100 seconds for a very big watchlist.
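The "1/10 second" figure follows from the two latencies quoted above (140 ms minus 30 ms is 110 ms). The short Python sketch below reproduces that arithmetic and shows how the average latency would vary with the share of requests the French caches can answer. The function names and the `hit_ratio` parameter are illustrative assumptions, not measurements from the report, and a cache miss is simplified to cost the same as connecting to Florida directly.

```python
# Latencies quoted in the report, in milliseconds.
LATENCY_FLORIDA_MS = 140.0  # DSL user in France to the main cluster in Florida
LATENCY_PARIS_MS = 30.0     # DSL user in France to the Squid caches near Paris


def saving_per_hit_ms() -> float:
    """Latency saved on each request answered by the Paris cache."""
    return LATENCY_FLORIDA_MS - LATENCY_PARIS_MS


def expected_latency_ms(hit_ratio: float) -> float:
    """Average latency when a fraction `hit_ratio` of requests is served
    from Paris and the rest from Florida.

    Assumption (not from the report): a miss costs exactly the Florida
    latency; a real miss relayed through Paris would cost somewhat more.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * LATENCY_PARIS_MS + (1.0 - hit_ratio) * LATENCY_FLORIDA_MS


if __name__ == "__main__":
    print(f"saving per cache hit: {saving_per_hit_ms():.0f} ms")
    # Hypothetical hit ratios, chosen only to illustrate the trend.
    for ratio in (0.0, 0.5, 1.0):
        print(f"hit ratio {ratio:.1f}: {expected_latency_ms(ratio):.0f} ms")
```

The sketch covers only network latency. Page generation time on the Apache servers and database time, described above, come on top of it for any request that is not served from a cache.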

The Telecity data center

The Squid caches are not yet active on the machines. Policies as to which clients will be directed to these caches have yet to be defined. After that, the system may require significant tuning to be efficient. If the experiment is successful, it may be generalized, with more Squid caches added outside of the Florida colocation.

Installation of more servers in Florida


In mid-October, two more dual Opteron database slave servers, with 6 drives in RAID 0 and 4 GB of RAM, plus five 3 GHz/1 GB RAM Apache servers were ordered. The vendor had to solve compatibility problems before shipping the database servers, and the resulting delays left the site short of database power. Until early December, search was sometimes turned off.

In November 2004, five Web servers experienced failures, four of them machines with high RAM (working memory) capacity used for Memcached or Squid caching. This resulted in very slow wikis at times.

Five 3 GHz/3 GB RAM servers were ordered in early December. Four of the December machines will provide Squid and Memcached service as improved replacements for the failing machines until they are repaired. One machine with SATA drives in RAID 0 will be used as a testbed to see how much load less costly database servers might be able to handle, as well as providing another option for a backup-only database slave also running Apache. These machines are equipped with a new option, a remote power and server health monitoring board at $60 extra cost. We took this option for this order to investigate its effectiveness compared to a remote power strip and more limited monitoring tools. Remote power and health reporting help to reduce the need for colocation facility labor, which can sometimes involve costs or delays.

A further order, of a master database server and five more Apaches, is planned for the end of the quarter or the first days of the next, using the remainder of the US$50,000 from the last fundraising. The new master will allow the database servers to be split into two sets, each made up of a master and a pair of slaves and each holding about half of the project activity. The database server split is to halve the amount of disk writing each server must do, leaving more capacity for the disk reads needed to serve requests. It is intended to happen in about three months, after the new master has proved its reliability during several months of service as a database slave.
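The reasoning behind the split can be illustrated with a small Python sketch. In a replicated set, every server applies all of the writes, so adding slaves does not reduce the write load per server, while splitting the projects across two sets does. The function name and the numbers in the example are hypothetical and serve only to show the arithmetic. The model also assumes that writes divide evenly between the sets and that reads and writes draw on one shared budget of disk operations.

```python
def read_capacity(disk_ops_per_sec: float,
                  write_ops_per_sec: float,
                  sets: int = 1) -> float:
    """Disk operations per second left for reads on one database server.

    Simplified model (an assumption, not from the report): each server
    can do `disk_ops_per_sec` operations in total, and total writes are
    divided evenly across `sets` independent master/slave sets.
    """
    if sets < 1:
        raise ValueError("sets must be at least 1")
    writes_per_server = write_ops_per_sec / sets
    return max(disk_ops_per_sec - writes_per_server, 0.0)


if __name__ == "__main__":
    # Hypothetical figures: 1000 disk ops/s per server, 400 writes/s overall.
    before = read_capacity(1000, 400, sets=1)
    after = read_capacity(1000, 400, sets=2)
    print(f"read capacity per server before split: {before:.0f} ops/s")
    print(f"read capacity per server after split:  {after:.0f} ops/s")
```

Under these assumed figures, halving the writes per server frees a further 200 operations per second for reads on every machine, which is the effect the planned split aims for.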

Increased traffic and connectivity


Traffic grew during the third quarter from about 400-500 requests per second at the start to about 800 per second at the end. In the early fourth quarter that rose further, often exceeding 900 requests per second, with daily peak traffic hours in the 1,000 to 1,100 requests per second range. It then steadied at about 900 and slowly rose, due to the end of the back-to-school surge, slower-than-desired response times, or both ([1]). Bandwidth use grew from averaging about 32 megabits per second at the start of the quarter to about 43 megabits per second at the end. Typical daily highs are about 65-75 megabits per second and sometimes briefly hit the 100 megabits per second limit of a single outgoing Ethernet connection. Dual 100 megabit connections were temporarily used, and a gigabit fiber connection has been arranged at the Florida colocation and the required parts ordered.
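To put the bandwidth figures in proportion, the Python sketch below computes the quarterly growth in average bandwidth and how much of a link the typical daily highs occupy. The helper names are illustrative, and the inputs are the approximate figures quoted above, so the results are rough estimates only.

```python
def percent_growth(start: float, end: float) -> float:
    """Growth from `start` to `end`, as a percentage of `start`."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (end - start) / start * 100.0


def utilization(peak_mbps: float, link_mbps: float) -> float:
    """Fraction of a link's capacity used at the given peak rate."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    return peak_mbps / link_mbps


if __name__ == "__main__":
    # Average bandwidth at the start and end of the quarter (Mbit/s).
    print(f"average bandwidth growth: {percent_growth(32, 43):.1f}%")

    # Upper end of the typical daily highs (Mbit/s) against each link size.
    for link in (100, 1000):
        share = utilization(75, link)
        print(f"75 Mbit/s peak on a {link} Mbit/s link: {share:.1%} used")
```

On a single 100 megabit connection, the typical daily highs already use most of the capacity, which is consistent with the brief saturation described above and with the move to a gigabit connection.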