Toolserver/MaintenanceLog

From Meta, a Wikimedia project coordination wiki

Admin maintenance and status log for the Toolserver.


2007


October 12

  • all river: the maintenance log has moved to [1]. this can be updated from the command line by running addlog text
  • all river: moved /etc/sudoers to LDAP

October 11

  • hemlock river: changed mkuser to add grants for u_username_% and u_username, instead of u_username%
  • zedler river: changed grants for zedler to hemlock.ts.wikimedia.org instead of hemlock.ts-local, to simplify administration. updated DNS and /etc/hosts.
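
The October 11 change to mkuser narrows which databases a user's grants cover. The following Python sketch illustrates why u_username% is broader than the pair u_username and u_username_%. It models the database-name wildcards MySQL accepts in GRANT (% for any run of characters, _ for exactly one character, a backslash to make the next character literal). The function names and the example user bob are hypothetical, and whether mkuser escaped the underscores is not recorded in this log.

```python
import re


def escape_literal(text):
    """Backslash-escape the characters that are special in a GRANT database pattern."""
    return re.sub(r"([\\_%])", r"\\\1", text)


def pattern_to_regex(pattern):
    """Translate a GRANT database pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: match it literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")   # any run of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern, db_name):
    """Return True if the whole database name is covered by the pattern."""
    return pattern_to_regex(pattern).fullmatch(db_name) is not None


def old_pattern(username):
    """Old scheme: a single pattern, u_username%."""
    return escape_literal("u_" + username) + "%"


def new_patterns(username):
    """New scheme: u_username itself plus everything under u_username_."""
    base = escape_literal("u_" + username)
    return [base, base + escape_literal("_") + "%"]


if __name__ == "__main__":
    for db in ("u_bob", "u_bob_stats", "u_bobby_data"):
        old = matches(old_pattern("bob"), db)
        new = any(matches(p, db) for p in new_patterns("bob"))
        print(db, "old:", old, "new:", new)
```

Under the old single pattern, user bob would also be granted rights on a database belonging to a user named bobby. The two new patterns cover only bob's own databases.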

October 7

  • hemlock dab: Created missing mysql-access for Simetrical too.

October 5

  • hemlock dab: Changed the mkuser script. New users can now create databases of their own, as long as the database name starts with "u_username".
  • hemlock dab: Created normal mysql access for "pietrodn". For some reason, he/she had no mysql rights anywhere.

October 4

  • hemlock dab: Switched sql-s1 to yarrow.

October 3

  • clematis&hemlock dab: Created /aux0/user-store/ on clematis, configured it for NFS, mounted it at /mnt/user-store on hemlock and made it writable by users.

October 1

  • yarrow river: started s1 import from vandale dump
  • yarrow river: confirmed bad disk in yarrow's array, channel 3, id 12:
Medium error during read
ASC: 0x13   ASCQ: 0x0
Repairing hard error on 127914882 (7962/84/60)...Warning: Block 127914882 zero-filled.
The new block also appears defective.
  • yarrow river: rebooted to add new lun for possibly failed disk, now running media scan
  • zedler river: rebooted accidentally, mysqld recovers
  • yarrow river: ran a RAID5 parity check on the array, one drive (channel 3, id 12) reported many media errors. replaced it with the hot spare and will run a full media scan later.

September 30

  • yarrow river: copy finished, restarted mysqld on yarrow
  • yarrow river: reformatted /aux0 as UFS, copying old data from clematis:/aux0/yarrow-backup

September 29

  • yarrow dab: Stopped mysql, it's useless. Asked TimStarling for a new dump of s3.

September 27

  • hemlock dab: Did maintenance and changed the disk structure (details: [2]).
  • yarrow dab: Started mysql_safe. mysql seems to have crashed without a message in the logs.

September 26

  • zedler dab/mark: Replaced a disk in the external array. Also installed a cache battery in the array. Restarted the array and zedler to make sure that no disk activity is lost.

September 17

  • zedler dab: Switched master for s2 to lomaria.

September 15

  • hemlock dab: Started Tomcat, was stopped

September 14

  • yarrow river: recreated RAID array as RAID-0 temporarily, to move s1 from vandale

September 11

  • clematis river: rebooted, missing drive reappeared

September

  • September 09
    • 18:55 hemlock dab: Deleted old kernel-sources.
    • clematis river: disk 1 failed
    • zedler river: recreated the /aux1 filesystem as ZFS, to identify the failed disk. re-loaded s2 dump and started replication.
  • September 2
    • Evening zedler dab: MySQL database completely broken. Reinstall needed.

August

  • August 27
    • 19:39 yarrow dab: Changed db-master to db5, restart replication.
  • August 24
    • 21:17 vandale dab: Manually added column image.img_sha1 to enwiki (replication stopped because it was missing). Restarted replication.

July

  • July 26
    • 16:35 hemlock daniel: added yarrow to phpmyadmin's config, so it can be accessed via the web interface
  • July 19
    • 15:40 hemlock daniel: created script /root/grantit that outputs grant statements for all users, ready to be piped into mysql. Usage is /root/grantit <priv> <target> [client-host] [mysql-server], so for example you can do /root/grantit select "toolserver.*" hemlock.ts.wikimedia.org yarrow to grant SELECT on the toolserver db on yarrow to all users logging in from hemlock.
    • 15:30 zedler/yarrow daniel: toolserver db (with wiki and namespace tables) is now available on yarrow. It contains views that delegate to toolserver_priv, which in turn contains federated tables that map to the respective tables on zedler. The user ts_federated on zedler is used for the federated queries. The views are necessary to hide the password of ts_federated from users, which otherwise would be visible through SHOW CREATE TABLE on the federated tables (in MySQL 5.1 this would not be necessary when using a pre-defined connection created via CREATE SERVER).
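
The /root/grantit script from the 15:40 entry is described only by its usage line, so the following is a minimal Python sketch of the same idea, not the script itself. It turns a privilege, a target and a client host into one GRANT statement per user. The function name, the explicit user list and the default host of % are assumptions made for illustration; how the real script enumerated users and what its defaults were is not recorded here.

```python
def grant_statements(users, priv, target, client_host="%"):
    """Build one GRANT statement per user, ready to be piped into mysql.

    users       -- iterable of MySQL account names (hypothetical input)
    priv        -- privilege name, e.g. "select"
    target      -- object the privilege applies to, e.g. "toolserver.*"
    client_host -- host the users connect from (default is an assumption)
    """
    statements = []
    quoted_host = client_host.replace("'", "''")
    for user in users:
        # Double any single quote so the name stays a valid SQL string literal.
        quoted_user = user.replace("'", "''")
        statements.append(
            f"GRANT {priv.upper()} ON {target} TO '{quoted_user}'@'{quoted_host}';"
        )
    return statements


if __name__ == "__main__":
    # Mirrors the example in the log: SELECT on the toolserver db for logins from hemlock.
    for line in grant_statements(
        ["alice", "bob"], "select", "toolserver.*", "hemlock.ts.wikimedia.org"
    ):
        print(line)
```

Printing the statements instead of executing them keeps the step reviewable: the output can be inspected before it is piped into mysql on the chosen server.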

June

  • June 22
    • 13:25 both river: changed root password
  • June 8
    • 16:45 hemlock river: upgraded kernel

May

  • May 27
    • 19:45 zedler DaB.: Restart mysql after a mysql crash.
  • May 26
    • 00:04 zedler river: gave normal users show view grant so EXPLAIN works
  • May 15
    • 09:27 hemlock river: reboot to upgrade kernel and enable auditing
  • May 2
    • 18:50 zedler DaB. Changed s2-master to db8 because master changed.
  • May 1
    • 3:03 zedler river: upgraded mysqld to 5.0.40

Apr

  • April 22
    • 12:38 hemlock river: removed per-user groups
  • April 18
    • 15:10 zwinger DaB.: Changed the Mysql-Master-Server for S2 to db1; restart the sqltunnel and restart all replication.
  • April 17
    • 20:03 zwinger DaB.: Changed the Mysql-Master-Server for S3 to db1; restart the sqltunnel and restart all replication.
  • April 16
    • 19:34 hemlock river: renamed xyrael -> swhitton
  • April 15
    • 23:03 hemlock DaB.: Changed /etc/alias. Emails to root@ now go to the roots again and not to OTRS.
  • April 14
    • 1:08 hemlock river: moved from stable back to testing
  • April 8
    • 16:05 Zedler DaB.: SQL tunnel didn't start; started it manually and restarted MySQL replication.
    • zedler (robchurch) : Seems to be down; mysql timing out on clients, unable to SSH from hemlock
  • April 2
    • zedler (robchurch) : Added user.user_editcount to user view, rebuilt views for all databases

Mar

  • Mar 30
    • ~19:00 zedler DaB.: The bewiki database and the old bewiki database seem to be wrong. A reimport may be needed.
  • Mar 25
    • 05:03 zedler river: something got unhappy at zedler, rebooted
  • Mar 15
    • ca 13:20 UTC: zedler Daniel: Restarted replication for s3, leaving s1 off for now (should that be running?)
    • 12:12 UTC: zedler Daniel: Restarted MySQL after crash ("page corruption", twice)
  • Mar 11
    • 1:30 UTC: zedler DaB.: Began writing a howto for the replication at wikitech.
  • Mar 10
    • ~17:06 UTC zedler Daniel: restarted replication for s3 (using startit script)
  • Mar 03
    • ~23:58 zedler Daniel: restarted replication for s3 after MySQL crash. Running in screen. Hope I did it right.
  • Mar 01
    • ~2:50 zedler DaB.: Started s3 replication for testing in a screen. Runs very throttled.

Feb

  • Feb 28
    • 20:10 zedler DaB: Changed master from samuel to db8 and restarted replication for s2. Why had nobody done this before?
  • Feb 27
    • 04:50 zedler river: upgraded mysql to 5.0.33 (compiled from source in /opt/mysql5, since mysql no longer provides binaries)
  • Feb 23
    • 06:18 zedler river: mysql copy finished, restarted mysqld
  • Feb 22
    • 12:57 hemlock DaB.: Removed last night's backup, which was created on hemlock instead of on zedler and filled the whole disk.
  • Feb 21
    • 12:57 hemlock river: upgraded to Linux 2.6.20.1; wrote new kernel build script (cd /usr/local/src; ./buildkernel.sh)
  • Feb 19
    • 4:17 zedler river: replaced the Solaris mpt driver (for LSI SCSI cards) with LSI's itmpt driver; installed current kernel update (118855-36)
  • Feb 18
    • 1:23 zedler DaB: Drop view for zh_cnwiki. zh_cnwiki seems incomplete, wm-server-admins do not know the database.
  • Feb 17
    • 23:10 (or so, before the WM crash) zedler DaB: Limited mysql connections to 15 simultaneous connections per user. Set the limit for daniel_www to 30.
  • Feb 08
    • 0:30 (or so) zedler Duesentrieb: created database u_orgullo_logs for CommonsDelinker logs.

Jan

  • Jan 27
    • 17:10 zedler DaB.: Created the view for page_restrictions on all databases except enwiki, where the table is missing at the moment.
  • Jan 26
    • 16:10 zedler DaB.: Reenable innodb doublewrite; restart mysql.
    • ~3:00 hemlock DaB.: Fixed the email config for bugzilla. Email settings can now be configured per user.
  • Jan 25
    • later zedler DaB.: Re-enabled his (magnus) mysql account; river already moved his public_html dir back and fixed his mysql rights.
    • 17:10 zedler DaB.: Deactivated all tools by magnus by moving his public_html dir. Revoked his mysql rights and changed his password. He should message the roots; one or more of his tools killed the database!
  • Jan 20
    • 11:19 zedler robchurch: Added view for page_restrictions table to /blah/blah/db_views and regenerated views for dewiki only
    • 00:10 zedler DaB.: Change no-en-master to adler. Restart replication.
  • Jan 19
    • 23:40 zedler DaB.: non-en replication has stopped because of a corrupted binlog file on samuel (there was a crash on samuel today).
  • Jan 10
    • 18:00 zedler DaB.: Create a new index "user" (create index user on archive (ar_user_text);) in archive. This should speed up the editcounter.
  • Jan 6
    • ~2:00 zedler DaB.: Start en-replication for testing in a screen on zedler (do not close the screen!).
  • Jan 5
    • 03:10 zedler river: split /aux2 into /aux2 (forcedirectio) and /aux3 (normal) to move MyISAM data from /aux1; takes a little more load off the ift
  • Jan 4
    • 18:50 zedler river: binlog was somehow corrupted and broke replication; restarted reading from the master and it seems fine
    • ~morning zedler river: moved data from /int to /ift; created RAID-10 UFS filesystem in place of RAID-Z and added additional InnoDB tablespace there. (mysql seems to want to fill the existing tablespace before it starts using the new one). re-mirrored / as /dev/md/dsk/d20
  • Jan 3
    • 20:18 hemlock DaB.: Changed the postfix config to reduce the spam to the user accounts a little bit, e.g. an existing From domain is now required.
    • 14:02 hemlock river: upgraded kernel/reboot
  • Jan 2
    • 17:29 zedler river: disabled innodb doublewrite again; reduced buffer pool size slightly; gave mysql user the priv_proc_memlock privilege for memlock; made new MySQL startup script /usr/local/sbin/fast_mysql which should be used instead of /usr/local/mysql/bin/safe_mysqld
    • 15:45 zedler DaB.: Drop a few views for databases, which are not public. Update the mkviews-script.
    • 15:17 zedler robchurch: Regenerating views for all databases
    • 15:16 zedler robchurch: Added redirect table to views
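
Several entries mention regenerating views for all databases after a table is added. The contents of the mkviews script are not recorded in this log, so the following Python sketch only illustrates the general pattern: for each database, emit one CREATE OR REPLACE VIEW statement per exposed table, optionally restricted to a list of columns. Placing the views in a database named with a _p suffix follows names such as enwiki_p that appear elsewhere in this log; the function name, the table-to-columns mapping and exposing every column of redirect are assumptions.

```python
def view_statements(databases, tables):
    """Generate CREATE OR REPLACE VIEW statements for every database.

    databases -- iterable of source database names, e.g. ["dewiki"]
    tables    -- dict mapping a table name to the list of columns to expose,
                 or to None to expose all columns
    """
    statements = []
    for db in databases:
        for table, columns in tables.items():
            # Restrict the view to the listed columns when a list is given.
            select_list = ", ".join(columns) if columns else "*"
            statements.append(
                f"CREATE OR REPLACE VIEW {db}_p.{table} "
                f"AS SELECT {select_list} FROM {db}.{table};"
            )
    return statements


if __name__ == "__main__":
    exposed = {
        "redirect": None,  # assumption: all columns are public
        "user": ["user_id", "user_name", "user_registration"],
    }
    for line in view_statements(["dewiki", "frwiki"], exposed):
        print(line)
```

Keeping the table list in one mapping means that adding a table is a one-line change followed by a regeneration run over all databases.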

2006


December

  • December 23:
    • 17:15 zedler DaB.: Change db-masters to samuel and db2
    • 17:00 zedler DaB.: Recreated views for all databases
  • December 5:
    • 14:00 zedler DaB.: Restart mysql-recovery
  • December 4:
    • 23:35 Both DaB.: Restart
    • 23:30 Both DaB.: Crashed because of a power failure at sara.
    • ~23:XX zedler DaB.: MySQL crashed. Started recovery.

November

  • November 25:
    • 13:52: DaB.: hemlock: Decreased the number of Apache threads to stop the annoying log emails.
    • 13:50: DaB.: hemlock: Fixed the singlelogin-rewriterule.


  • November 10: DaB.
    • Zedler: Change non-en-db-master to adler. The master was changed.
  • November 2: DaB.
    • Changed the apacheconfig to start more threads.
      This seems to be puking up errors in the daily reporting email. Check the configuration isn't fubared somewhere? -- Rob
    • Blocked 75.0.15.233. Downloads /media/wikipedia/commons/2/2d/Beethoven_concerto4_1.ogg 41310 times today.

October

  • October 29: DaB.
    • Hemlock: Change Apache to version 2.2.
  • October 27: DaB.
    • Zedler: Change non-en-db-master to samuel. The master was changed.
  • October 23: DaB.
    • Hemlock: The update of libapache2-mod-proxy-html broke apache2. Removed this module and restarted apache2. Waiting for a fix.
  • October 17: DaB.
    • Zedler: Drop database enwiki.
    • Zedler: Restarted the enwiki dump import in a screen.
    • Zedler: Drop view enwiki_p.
    • Zedler: Create empty database dummy.
    • Zedler: Create empty mediawiki-tables in dummy.
    • Zedler: Create view enwiki_p which points to dummy.
  • October 13: DaB.
    • Hemlock: Removed the MySQL-Server. All databases belong to zedler, why was it installed?
  • October 08: DaB.
    • Both: The CNAME tools.wikimedia.de points now to hemlock.
    • Zedler: Configured Apache so that it rewrites URLs for tools.wikimedia.de to hemlock (for people with slow caches).
  • October 03: DaB.
    • Zedler: Change the replication again and killed old ssh-tunnels for sql which blocked the new. Changed the logposition again. Restart the replication.
  • October 01: DaB.
    • Zedler: Change the mysql-replicationmaster from samuel to adler. Reset the logposition to 000001' position: 4 and start the replication.

September

  • September 30: DaB.
    • hemlock: Added proxy-scanner.eris.dk[193.163.220.4] to hosts.deny for all daemons.
  • September 29: DaB.
    • Zedler: Recreated the view for enwiki, because too many tools break (who needs this old data?). Left out "pagelinks".
    • Hemlock: Gave all users a .mytop file for mytop; added a host parameter to each user's .my.cnf file if it did not already exist.
    • Zedler: Removed the view for enwiki (enwiki_p) until the import of the en dump is finished.
  • September 26: DaB.
    • Hemlock: Removed /oldvar and /oldusr
  • September 26: DaB.
    • Hemlock: Removed the PARANOID entry for sshd from hosts.deny; tescali france is too stupid to do ip2dns things right.
  • September 25: DaB.
    • Hemlock: Repartitioned md0
      • Made a backup of /home at /mnt/aux0/backup
      • Made an LVM on md0
      • Created partition for /var: ~4GB (ext3)
      • Created partition for /usr: ~6GB (ext3)
      • Created partition for /tmp: ~1GB (ext2) (not used at the moment)
      • Created partition for /home: ~52GB (ext3)
      • Copied all data from /var /usr /home to the new partitions
      • Move /var /usr to /oldvar /oldusr
      • Restore backup in /home
      • Changed /etc/fstab
      • Reboot
  • September 24: DaB.
    • Hemlock: Set hosts.deny for sshd to PARANOID.
  • September 23: Brion
    • Hemlock: Moved the 2G apache error log to the 1TB partition for huge files.
    • Hemlock: Renamed /home/sk/public_html/cgi-bin/geo/ which was flooding the error log to /home/sk/public_html/cgi-bin/geo-broken/.
  • September 23: JeLuF
    • Zedler: Restarted the crashed mysql
  • September 19: DaB.
    • hemlock: Created a gzipped dd copy of /dev/sdb1 and saved it on /dev/sda1/root.
    • hemlock: Moved /aux0 to /mnt/aux0.
    • hemlock: Created a directory for big userthings (dumps and so) in /mnt/aux0/archiv
    • hemlock: Created a directory for backups in /mnt/aux0/backup


  • September 6: DaB.
    • Single-Login
      • Created a folder in Interiot's home for singlelogin. Created a symlink from /opt/apache/singlelogin to this folder
      • Allowed CGI-Execution in this folder
      • Created a new mysql user singlelogin with execution rights for Interiot's singlelogin.
  • September 1: DaB.
    • Created a database called u_leon_wikistats_p for the statstool. Gave leon full rights; all others should have select rights because of the _p suffix.

August

  • August 28: DaB.
    • Create a ramdisc (20MB) for the pgcounter. Moved the pgcounter-logs from /tmp/pgcounter/ to /var/log/apache/pgcounter/.
  • August 16: robchurch
    • Corrected /home/paulatz/.forward, owner was root
  • August 4: MySQL was stopped at night and nobody made an entry here. So I started it again. --DaB. 10:13, 4 August 2006 (UTC)

July

  • july 12, river: installed 122213-05 GNOME Image Editor Patch, 119060-13 Xsun patch, 118778-05 Sun GigaSwift Ethernet 1.0 driver patch, 122035-03 awk nawk Patch, 118919-16 Solaris Crypto Framework patch, 121127-02 umountall.sh Patch, 118344-11 Fault Manager Patch, 119558-04 tavor Patch, 118372-07 elfsign Patch, 120252-05 mt patch, 120759-06 Sun Compiler Common patch for x86 backend, 121616-02 Patch for Sun dbx 7.5_x86 Debugger, 120762-02 Patch for Performance Analyzer Tools
  • June 30: robchurch
    • Added account for chrislb
    • Added account for escaladix
    • Added account for orgullo
  • June 29: robchurch
    • Added account for archinform
  • June 28: robchurch
    • Added account for mdd4696

June

  • June 26: DaB.
    • Replaced the huge suexec log file in /var/log/apache with its last 50 lines; compressed the remaining lines into a bzip file.
  • June 24: robchurch
    • Introduced user view (user_id, user_name, user_registration) and updated views for all databases
  • June 23: river: installed 118668-06 J2SE 5.0_x86: update 7 patch (upgrade Java to 1.5.0_07)
  • June 21: robchurch
    • Added account for pathos
    • Added account for paulatz
  • June 20: robchurch
    • Added account for gunther
  • June 18: robchurch
    • Added site_stats to views and updated views for all databases
  • June 16: robchurch
    • Killed long-running SELECT running as root. Apologies if this was important.
  • June 12: robchurch
    • Added view in MySQL for incubatorwiki
  • June 10 river: installed 119255-24 Install and Patch Utilities Patch, 119116-18 Mozilla 1.7_x86 patch, 120759-05 Sun Compiler Common patch for x86 backend
  • June 10: robchurch
    • At some point, ~robchurch/cgi-bin/php.cgi managed to get deleted. Restored it.
    • Added view in SQL for fiwikisource
  • June 3: DaB.
    • Make php5 the default php. Move php version 4 to php4.
  • June 2: DaB.
    • Manually repaired u_daniel_cache/dewiki_cache. MySQL couldn't fix it automatically.
  • June 1: brion
    • disk filled up due to massive php error reportage from magnus' xml script running for several hours pumping out errors for every byte of a many-megabyte string repeatedly
    • killed the 154-gigabyte 'error' log file from apache logs dir. replaced it with an extract of the last 50 megabytes or so from it
    • restarted apache to kill everything with the old file open. might or might not have done it right
    • renamed magnus's php.cgi to php.cgi.broken to make sure it's not coming back until problem is resolved
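
Both the June 1 and June 26 entries deal with an oversized log by keeping only its tail. How this was done at the time is not recorded; the following Python sketch shows one possible way, reading the last part of the file and rewriting the file with only that part. The function name and the byte-based cut-off are assumptions made for illustration.

```python
import os


def keep_tail(path, keep_bytes):
    """Shrink the file at path to roughly its last keep_bytes bytes.

    A partial first line at the cut point is dropped so that the result
    starts on a line boundary. Returns the number of bytes removed.
    """
    size = os.path.getsize(path)
    if size <= keep_bytes:
        return 0  # already small enough, leave the file alone

    with open(path, "rb") as src:
        src.seek(size - keep_bytes)
        tail = src.read()

    # Skip ahead to the first complete line, if there is one.
    newline = tail.find(b"\n")
    if newline != -1:
        tail = tail[newline + 1:]

    # Rewrite the file in place with only the retained tail.
    with open(path, "wb") as dst:
        dst.write(tail)
    return size - len(tail)
```

For example, keep_tail("/var/log/apache/error.log", 50 * 1024 * 1024) would keep about the last 50 megabytes (the path here is illustrative). If the file is deleted and recreated instead of rewritten in place, a process that still has the old file open keeps its disk space allocated until it closes the file, which is presumably why the June 1 entry restarts apache afterwards.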

May

  • May 27: installed irssi (but note that zedler shouldn't be used as a general IRC host)
  • May 21: installed 121018-02 Patch for Sun C++ 5.8 compiler, 121020-02 Patch for x86 Fortran 95 8.2 Compiler
  • May 13: upgraded PHP to 5.1.4, changed php.cgi in user homedirs

April

  • Apr 30: Extended the expiry date for users to May 1 2007.
  • Apr 26: installed 119964-06, Shared library patch for C++_x86
  • Apr 24: / fs was corrupted by accident. changed primary root to /dev/dsk/c0t0d0s0. (SVM array for / needs rebuilding.) installed additional SCSI HBA.
  • Apr 21: sendmail patch removed postfix's aliases.dir, ran postalias /etc/mail/aliases + restarted
  • Apr 21: Installed 122035-01 (nawk patch), 122857-02 (sendmail patch), removed IDR122826-02 (sendmail IDR). restarted sendmail on login-services.
  • Apr 20: SunOS patch 118919-12 installed. There were a few hours of unexpected downtime due to a corrupted grub boot archive.