Objective Revision Evaluation Service/reverted

From Meta, a Wikimedia project coordination wiki

One of the most critical concerns about Wikimedia's open projects is the detection and removal of damaging contributions. This model predicts whether an edit is likely to need to be reverted. It is useful for quality control tools (e.g. en:WP:Huggle and en:User:ClueBot NG).

This model is trained to predict "reverted" edits, and not all reverted edits are "vandalism". Consume scores with this in mind.
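
Each wiki section on this page lists a model_info URL that follows one pattern. As a minimal sketch, the helper below rebuilds that URL from a wiki database name. The function name `model_info_url` and the constant are illustrative and not part of ORES; the code only builds the string and makes no request, since the endpoint may not be reachable.

```python
from urllib.parse import quote

# Base of the scores API, copied from the URLs listed on this page.
ORES_V2_SCORES = "https://ores.wmflabs.org/v2/scores"


def model_info_url(context: str, model: str = "reverted") -> str:
    """Return the model_info URL for a wiki database name such as 'arwiki'."""
    # quote() guards against characters that are not safe in a URL path.
    context_part = quote(context, safe="")
    model_part = quote(model, safe="")
    return f"{ORES_V2_SCORES}/{context_part}/{model_part}?model_info"


if __name__ == "__main__":
    for wiki in ("arwiki", "cswiki", "dewiki"):
        print(model_info_url(wiki))
```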

Contexts (wikis)

Arabic Wikipedia (arwiki)

https://ores.wmflabs.org/v2/scores/arwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: min_samples_leaf=1, max_features="log2", n_estimators=700, learning_rate=0.01, presort="auto", verbose=0, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, balanced_sample=false, center=true, max_leaf_nodes=null, max_depth=5, subsample=1.0, random_state=null, loss="deviance", init=null, warm_start=false, min_samples_split=2, scale=true
 - version: 0.3.0
 - trained: 2017-01-06T19:06:15.589011

Table:
                 ~False    ~True
        -----  --------  -------
        False     17964     1027
        True         72      615

Accuracy: 0.944
Precision:
        -----  -----
        False  0.996
        True   0.375
        -----  -----

Recall:
        -----  -----
        False  0.946
        True   0.896
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.514
        -----  -----

ROC-AUC:
        -----  -----
        False  0.963
        True   0.967
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.602     0.928  0.093
        True           0.229     0.926  0.088

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.068     0.984         0.98
        True           0.97      0.057         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.028     1            0.966
        True           0.97      0.057        0.987

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.028     1            0.966
        True           0.827     0.697        0.455

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.028     1            0.966
        True           0.114     0.951        0.177
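
The headline figures in the arwiki report can be recomputed from its confusion table. The sketch below assumes that rows are actual labels and columns are predicted labels, an orientation inferred from the reported precision and recall rather than stated on this page. The function names are illustrative.

```python
# Confusion table from the arwiki report.
# Assumption: outer key = actual label, inner key = predicted label.
table = {
    False: {False: 17964, True: 1027},
    True: {False: 72, True: 615},
}


def accuracy(t):
    """Share of all edits whose predicted label matches the actual label."""
    correct = t[False][False] + t[True][True]
    total = sum(sum(row.values()) for row in t.values())
    return correct / total


def precision(t, label):
    """Of the edits predicted as `label`, the share that really are `label`."""
    predicted = sum(t[actual][label] for actual in t)
    return t[label][label] / predicted


def recall(t, label):
    """Of the edits that really are `label`, the share predicted as `label`."""
    return t[label][label] / sum(t[label].values())


if __name__ == "__main__":
    print("accuracy", round(accuracy(table), 3))
    for label in (False, True):
        print(label, round(precision(table, label), 3), round(recall(table, label), 3))
```

Under this reading, accuracy and both precision values round to the reported 0.944, 0.996 and 0.375. Recall for True comes out at about 0.895 against the reported 0.896; the page does not say how the report aggregates its figures, so the small gap is left unexplained here.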

Czech Wikipedia (cswiki)

https://ores.wmflabs.org/v2/scores/cswiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: learning_rate=0.01, presort="auto", subsample=1.0, random_state=null, n_estimators=700, min_samples_leaf=1, verbose=0, max_depth=7, balanced_sample_weight=true, min_weight_fraction_leaf=0.0, loss="deviance", warm_start=false, init=null, min_samples_split=2, center=true, balanced_sample=false, max_leaf_nodes=null, scale=true, max_features="log2"
 - version: 0.3.0
 - trained: 2017-01-06T19:12:50.748800

Table:
                 ~False    ~True
        -----  --------  -------
        False     18141     1129
        True        180      395

Accuracy: 0.934
Precision:
        -----  -----
        False  0.99
        True   0.259
        -----  -----

Recall:
        -----  -----
        False  0.941
        True   0.685
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.376
        -----  -----

ROC-AUC:
        -----  -----
        False  0.919
        True   0.92
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.904     0.746  0.094
        True           0.272     0.773  0.096

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.147     0.987         0.98
        True           0.957     0.065         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.038     1            0.972
        True           0.957     0.065        1

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.038      1           0.972
        True           0.86       0.32        0.467

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.038     1            0.972
        True           0.191     0.824        0.159

German Wikipedia (dewiki)

https://ores.wmflabs.org/v2/scores/dewiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: min_weight_fraction_leaf=0.0, center=true, min_samples_split=2, balanced_sample=false, min_samples_leaf=1, subsample=1.0, scale=true, n_estimators=300, presort="auto", max_leaf_nodes=null, random_state=null, init=null, warm_start=false, max_depth=3, balanced_sample_weight=true, max_features="log2", learning_rate=0.1, verbose=0, loss="deviance"
 - version: 0.3.0
 - trained: 2017-01-06T19:17:16.259241

Table:
                 ~False    ~True
        -----  --------  -------
        False     16847     1983
        True        254      729

Accuracy: 0.887
Precision:
        -----  -----
        False  0.985
        True   0.269
        -----  -----

Recall:
        -----  -----
        False  0.895
        True   0.741
        -----  -----

PR-AUC:
        -----  -----
        False  0.99
        True   0.451
        -----  -----

ROC-AUC:
        -----  -----
        False  0.889
        True   0.888
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.835     0.586  0.096
        True           0.54      0.733  0.097

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.282     0.933         0.98
        True           0.967     0.075         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.023     1            0.952
        True           0.956     0.113        0.951

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.023     1            0.952
        True           0.876     0.425        0.471

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.023     1            0.952
        True           0.258     0.848        0.156

English Wikipedia (enwiki)

https://ores.wmflabs.org/v2/scores/enwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: min_samples_split=2, max_depth=7, balanced_sample_weight=true, warm_start=false, presort="auto", scale=true, learning_rate=0.01, random_state=null, max_features="log2", balanced_sample=false, init=null, verbose=0, center=true, loss="deviance", min_samples_leaf=1, n_estimators=700, min_weight_fraction_leaf=0.0, max_leaf_nodes=null, subsample=1.0
 - version: 0.3.0
 - trained: 2017-01-06T19:23:24.945358

Table:
                 ~False    ~True
        -----  --------  -------
        False     15554     2560
        True        457      962

Accuracy: 0.846
Precision:
        -----  -----
        False  0.971
        True   0.273
        -----  -----

Recall:
        -----  -----
        False  0.859
        True   0.68
        -----  -----

PR-AUC:
        -----  -----
        False  0.984
        True   0.424
        -----  -----

ROC-AUC:
        -----  -----
        False  0.866
        True   0.867
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.852     0.597  0.096
        True           0.605     0.583  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.688     0.768         0.98
        True           0.94      0.039         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.054     1            0.929
        True           0.926     0.071        0.947

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.054     1            0.929
        True           0.758     0.378        0.455

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.054     1            0.929
        True           0.157     0.898        0.155

English Wiktionary (enwiktionary)

https://ores.wmflabs.org/v2/scores/enwiktionary/reverted?model_info

ScikitLearnClassifier
 - type: RF
 - params: oob_score=false, verbose=0, min_samples_split=2, min_samples_leaf=3, class_weight=null, center=true, n_estimators=320, max_depth=null, max_leaf_nodes=null, warm_start=false, balanced_sample=false, max_features="log2", n_jobs=1, balanced_sample_weight=true, random_state=null, criterion="entropy", bootstrap=true, min_weight_fraction_leaf=0.0, scale=true
 - version: 0.3.0
 - trained: 2017-01-06T19:44:30.000302

Table:
                 ~False    ~True
        -----  --------  -------
        False     19808      183
        True        279      574

Accuracy: 0.978
Precision:
        -----  -----
        False  0.986
        True   0.76
        -----  -----

Recall:
        -----  -----
        False  0.991
        True   0.675
        -----  -----

PR-AUC:
        -----  -----
        False  0.995
        True   0.742
        -----  -----

ROC-AUC:
        -----  -----
        False  0.972
        True   0.974
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.863     0.915  0.094
        True           0.124     0.919  0.095

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.267     0.996        0.981
        True           0.944     0.126        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.038     1            0.961
        True           0.873     0.321        0.92

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.038     1            0.961
        True           0.2       0.832        0.459

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.038     1            0.961
        True           0.044     0.984        0.16

Spanish Wikipedia (eswiki)

https://ores.wmflabs.org/v2/scores/eswiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: min_weight_fraction_leaf=0.0, max_leaf_nodes=null, verbose=0, init=null, subsample=1.0, presort="auto", random_state=null, balanced_sample=false, loss="deviance", scale=true, learning_rate=0.01, min_samples_split=2, max_features="log2", warm_start=false, balanced_sample_weight=true, n_estimators=700, center=true, min_samples_leaf=1, max_depth=7
 - version: 0.3.0
 - trained: 2017-01-06T19:51:18.041685

Table:
                 ~False    ~True
        -----  --------  -------
        False     14751     2881
        True        485     1697

Accuracy: 0.83
Precision:
        -----  -----
        False  0.968
        True   0.37
        -----  -----

Recall:
        -----  -----
        False  0.837
        True   0.778
        -----  -----

PR-AUC:
        -----  -----
        False  0.983
        True   0.584
        -----  -----

ROC-AUC:
        -----  -----
        False  0.901
        True   0.901
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.75      0.726  0.099
        True           0.643     0.658  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.686     0.757         0.98
        True           0.957     0.047         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.067     0.997        0.901
        True           0.937     0.1          0.919

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.035     1            0.892
        True           0.643     0.657        0.453

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.035     1            0.892
        True           0.053     0.983        0.154

Spanish Wikibooks (eswikibooks)

https://ores.wmflabs.org/v2/scores/eswikibooks/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: subsample=1.0, balanced_sample_weight=true, init=null, min_samples_leaf=1, max_leaf_nodes=null, learning_rate=0.01, n_estimators=700, random_state=null, loss="deviance", center=true, verbose=0, warm_start=false, min_weight_fraction_leaf=0.0, presort="auto", max_depth=7, scale=true, balanced_sample=false, min_samples_split=2, max_features="log2"
 - version: 0.3.0
 - trained: 2017-01-06T19:57:03.343720

Table:
                 ~False    ~True
        -----  --------  -------
        False     16173     1164
        True        129     1573

Accuracy: 0.932
Precision:
        -----  -----
        False  0.992
        True   0.574
        -----  -----

Recall:
        -----  -----
        False  0.933
        True   0.924
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.818
        -----  -----

ROC-AUC:
        -----  -----
        False  0.976
        True   0.978
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.401     0.943  0.095
        True           0.166     0.96   0.096

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.196     0.97         0.981
        True           0.98      0.109        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.016     1            0.915
        True           0.96      0.359        0.908

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.016      1           0.915
        True           0.112      0.97        0.466

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.016     1            0.915
        True           0.015     0.996        0.178

Estonian Wikipedia (etwiki)

https://ores.wmflabs.org/v2/scores/etwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: presort="auto", max_depth=7, min_samples_split=2, balanced_sample_weight=true, learning_rate=0.01, init=null, balanced_sample=false, verbose=0, min_samples_leaf=1, scale=true, max_leaf_nodes=null, subsample=1.0, loss="deviance", max_features="log2", n_estimators=500, warm_start=false, random_state=null, min_weight_fraction_leaf=0.0, center=true
 - version: 0.3.0
 - trained: 2017-01-06T20:02:25.031504

Table:
                 ~False    ~True
        -----  --------  -------
        False     18666      810
        True        106      288

Accuracy: 0.954
Precision:
        -----  -----
        False  0.994
        True   0.263
        -----  -----

Recall:
        -----  -----
        False  0.958
        True   0.728
        -----  -----

PR-AUC:
        -----  -----
        False  0.995
        True   0.532
        -----  -----

ROC-AUC:
        -----  -----
        False  0.943
        True   0.942
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.858     0.834  0.089
        True           0.18      0.877  0.093

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.033     1            0.983
        True           0.959     0.211        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.025      1           0.982
        True           0.952      0.24        0.945

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.025      1           0.982
        True           0.819      0.49        0.478

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.025     1            0.982
        True           0.215     0.865        0.161

Persian Wikipedia (fawiki)

https://ores.wmflabs.org/v2/scores/fawiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: presort="auto", center=true, scale=true, subsample=1.0, max_features="log2", balanced_sample_weight=true, min_weight_fraction_leaf=0.0, random_state=null, min_samples_leaf=1, verbose=0, warm_start=false, learning_rate=0.01, max_depth=7, balanced_sample=false, max_leaf_nodes=null, min_samples_split=2, loss="deviance", n_estimators=700, init=null
 - version: 0.3.0
 - trained: 2017-01-06T20:09:09.014465

Table:
                 ~False    ~True
        -----  --------  -------
        False     18405      935
        True        167      297

Accuracy: 0.944
Precision:
        -----  -----
        False  0.991
        True   0.244
        -----  -----

Recall:
        -----  -----
        False  0.952
        True   0.646
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.319
        -----  -----

ROC-AUC:
        -----  -----
        False  0.933
        True   0.938
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.875     0.808  0.09
        True           0.254     0.814  0.097

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.093     0.995         0.98
        True           0.96      0.041         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.041     1            0.977
        True           0.96      0.041        1

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.041     1            0.977
        True           0.895     0.207        0.483

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.041     1            0.977
        True           0.233     0.837        0.157

French Wikipedia (frwiki)

https://ores.wmflabs.org/v2/scores/frwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: verbose=0, max_leaf_nodes=null, warm_start=false, scale=true, random_state=null, max_features="log2", presort="auto", min_samples_split=2, learning_rate=0.01, center=true, min_samples_leaf=1, max_depth=5, balanced_sample_weight=true, n_estimators=700, init=null, subsample=1.0, balanced_sample=false, loss="deviance", min_weight_fraction_leaf=0.0
 - version: 0.3.0
 - trained: 2017-01-06T20:26:23.424804

Table:
                 ~False    ~True
        -----  --------  -------
        False     17260     1951
        True        158      538

Accuracy: 0.894
Precision:
        -----  -----
        False  0.991
        True   0.216
        -----  -----

Recall:
        -----  -----
        False  0.898
        True   0.772
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.438
        -----  -----

ROC-AUC:
        -----  -----
        False  0.914
        True   0.914
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.799     0.733  0.09
        True           0.533     0.779  0.096

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.151     0.976         0.98
        True           0.954     0.086         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False           0.04     1            0.966
        True            0.95     0.102        0.976

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.04      1            0.966
        True           0.866     0.424        0.461

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.04      1            0.966
        True           0.273     0.858        0.158

Hebrew Wikipedia (hewiki)

https://ores.wmflabs.org/v2/scores/hewiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample_weight=true, balanced_sample=false, max_depth=7, warm_start=false, loss="deviance", subsample=1.0, max_features="log2", max_leaf_nodes=null, min_samples_split=2, learning_rate=0.01, verbose=0, init=null, random_state=null, min_weight_fraction_leaf=0.0, center=true, scale=true, n_estimators=500, min_samples_leaf=1, presort="auto"
 - version: 0.3.0
 - trained: 2017-01-06T20:31:51.924465

Table:
                 ~False    ~True
        -----  --------  -------
        False     17245     1689
        True        275      685

Accuracy: 0.901
Precision:
        -----  -----
        False  0.984
        True   0.289
        -----  -----

Recall:
        -----  -----
        False  0.911
        True   0.714
        -----  -----

PR-AUC:
        -----  -----
        False  0.991
        True   0.407
        -----  -----

ROC-AUC:
        -----  -----
        False  0.899
        True   0.901
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.818     0.708  0.095
        True           0.45      0.745  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.358     0.935         0.98
        True           0.942     0.046         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.053     1            0.953
        True           0.939     0.059        0.975

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.053      1           0.953
        True           0.855      0.34        0.467

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.053      1           0.953
        True           0.2        0.88        0.155

Hungarian Wikipedia (huwiki)

https://ores.wmflabs.org/v2/scores/huwiki/reverted?model_info

ScikitLearnClassifier
 - type: RF
 - params: verbose=0, bootstrap=true, min_weight_fraction_leaf=0.0, max_leaf_nodes=null, warm_start=false, min_samples_leaf=13, oob_score=false, n_estimators=320, n_jobs=1, min_samples_split=2, balanced_sample_weight=true, class_weight=null, scale=true, criterion="entropy", max_features="log2", max_depth=null, balanced_sample=false, random_state=null, center=true
 - version: 0.3.0
 - trained: 2017-01-06T20:53:22.021344

Table:
                 ~False    ~True
        -----  --------  -------
        False     38248      990
        True        218      372

Accuracy: 0.97
Precision:
        -----  -----
        False  0.994
        True   0.274
        -----  -----

Recall:
        -----  -----
        False  0.975
        True   0.627
        -----  -----

PR-AUC:
        -----  -----
        False  0.995
        True   0.37
        -----  -----

ROC-AUC:
        -----  -----
        False  0.929
        True   0.929
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.912     0.761  0.095
        True           0.147     0.8    0.093

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.067     1            0.986
        True           0.922     0.074        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.067     1            0.986
        True           0.922     0.074        0.992

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.067      1           0.986
        True           0.794      0.31        0.467

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.067     1            0.986
        True           0.221     0.749        0.169

Indonesian Wikipedia (idwiki)

https://ores.wmflabs.org/v2/scores/idwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: init=null, balanced_sample=false, learning_rate=0.01, scale=true, warm_start=false, subsample=1.0, max_leaf_nodes=null, max_depth=5, random_state=null, balanced_sample_weight=true, presort="auto", min_weight_fraction_leaf=0.0, loss="deviance", verbose=0, max_features="log2", min_samples_leaf=1, min_samples_split=2, n_estimators=700, center=true
 - version: 0.3.0
 - trained: 2017-01-06T21:32:36.621853

Table:
                 ~False    ~True
        -----  --------  -------
        False     85465    12234
        True        258     2014

Accuracy: 0.875
Precision:
        -----  -----
        False  0.997
        True   0.141
        -----  -----

Recall:
        -----  -----
        False  0.875
        True   0.886
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.27
        -----  -----

ROC-AUC:
        -----  -----
        False  0.94
        True   0.945
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.533     0.865  0.098
        True           0.616     0.849  0.099

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.076     0.995         0.98
        True           0.952     0.011         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.047     1            0.977
        True           0.952     0.011        1

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.047     1            0.977
        True           0.927     0.105        0.468

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.047     1            0.977
        True           0.554     0.869        0.152

Italian Wikipedia (itwiki)

https://ores.wmflabs.org/v2/scores/itwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: center=true, min_weight_fraction_leaf=0.0, balanced_sample=false, learning_rate=0.01, verbose=0, max_leaf_nodes=null, random_state=null, max_depth=7, scale=true, loss="deviance", subsample=1.0, min_samples_split=2, max_features="log2", balanced_sample_weight=true, min_samples_leaf=1, n_estimators=700, init=null, presort="auto", warm_start=false
 - version: 0.3.0
 - trained: 2017-01-06T21:38:17.924898

Table:
                 ~False    ~True
        -----  --------  -------
        False     16471     2397
        True        252      639

Accuracy: 0.866
Precision:
        -----  -----
        False  0.985
        True   0.211
        -----  -----

Recall:
        -----  -----
        False  0.873
        True   0.717
        -----  -----

PR-AUC:
        -----  -----
        False  0.992
        True   0.334
        -----  -----

ROC-AUC:
        -----  -----
        False  0.898
        True   0.902
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.853     0.724  0.094
        True           0.595     0.641  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.394     0.906         0.98
        True           0.936     0.042         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.053     1            0.956
        True           0.934     0.046        0.988

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.053      1           0.956
        True           0.844      0.21        0.474

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.053     1            0.956
        True           0.211     0.874        0.154

Dutch Wikipedia (nlwiki)

https://ores.wmflabs.org/v2/scores/nlwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: subsample=1.0, min_samples_split=2, n_estimators=700, verbose=0, balanced_sample_weight=true, warm_start=false, random_state=null, max_features="log2", init=null, max_leaf_nodes=null, presort="auto", center=true, learning_rate=0.01, min_weight_fraction_leaf=0.0, min_samples_leaf=1, scale=true, max_depth=7, balanced_sample=false, loss="deviance"
 - version: 0.3.0
 - trained: 2017-01-06T21:44:14.717645

Table:
                 ~False    ~True
        -----  --------  -------
        False     16884     1379
        True        277      924

Accuracy: 0.915
Precision:
        -----  -----
        False  0.984
        True   0.401
        -----  -----

Recall:
        -----  -----
        False  0.924
        True   0.77
        -----  -----

PR-AUC:
        -----  -----
        False  0.992
        True   0.593
        -----  -----

ROC-AUC:
        -----  -----
        False  0.928
        True   0.929
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.865     0.755  0.098
        True           0.305     0.831  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.349     0.942         0.98
        True           0.959     0.114         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.032     1            0.942
        True           0.943     0.196        0.922

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.032     1            0.942
        True           0.673     0.705        0.453

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.032     1            0.942
        True           0.101     0.931        0.156

Norwegian Wikipedia (nowiki)

https://ores.wmflabs.org/v2/scores/nowiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample=false, min_weight_fraction_leaf=0.0, min_samples_leaf=1, init=null, max_features="log2", scale=true, random_state=null, presort="auto", max_leaf_nodes=null, warm_start=false, verbose=0, loss="deviance", min_samples_split=2, center=true, balanced_sample_weight=true, n_estimators=500, max_depth=7, subsample=1.0, learning_rate=0.01
 - version: 0.3.0
 - trained: 2017-01-06T22:08:36.437323

Table:
                 ~False    ~True
        -----  --------  -------
        False     38123     1102
        True        141      626

Accuracy: 0.969
Precision:
        -----  -----
        False  0.996
        True   0.363
        -----  -----

Recall:
        -----  -----
        False  0.972
        True   0.817
        -----  -----

PR-AUC:
        -----  -----
        False  0.995
        True   0.581
        -----  -----

ROC-AUC:
        -----  -----
        False  0.964
        True   0.963
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.807     0.89   0.092
        True           0.182     0.902  0.091

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.022     1            0.982
        True           0.974     0.116        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.021     1            0.982
        True           0.97      0.175        0.923

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.021     1            0.982
        True           0.812     0.712        0.455

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.021     1            0.982
        True           0.183     0.902        0.165
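
Each model description above lists its hyperparameters on a single `params:` line as comma-separated `key=value` pairs, which makes it awkward to compare settings such as `n_estimators` or `max_depth` between wikis by eye. A minimal sketch for turning such a line into a dictionary follows. It assumes that every value is a valid JSON literal (as in the lines shown on this page) and that no string value contains a comma followed by a space. The function name `parse_params` is hypothetical.

```python
import json


def parse_params(line):
    """Parse a ' - params: k=v, k=v' line into a dict.

    Values are decoded as JSON, so true/false/null, numbers and
    quoted strings become their Python equivalents.
    """
    _, _, body = line.partition("params:")
    params = {}
    for pair in body.split(", "):
        key, _, raw = pair.strip().partition("=")
        params[key] = json.loads(raw)
    return params


# A shortened version of the nowiki params line shown above.
line = (' - params: n_estimators=500, max_depth=7, learning_rate=0.01, '
        'max_features="log2", init=null, scale=true')
params = parse_params(line)
```

Dictionaries produced this way for two wikis can then be compared key by key to find the settings that differ.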

Polish Wikipedia (plwiki)

https://ores.wmflabs.org/v2/scores/plwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: subsample=1.0, min_samples_leaf=1, max_leaf_nodes=null, init=null, balanced_sample=false, verbose=0, center=true, random_state=null, loss="deviance", presort="auto", n_estimators=700, learning_rate=0.01, min_weight_fraction_leaf=0.0, min_samples_split=2, balanced_sample_weight=true, warm_start=false, max_features="log2", max_depth=5, scale=true
 - version: 0.3.0
 - trained: 2017-01-06T22:22:13.325027

Table:
                 ~False    ~True
        -----  --------  -------
        False     34941     3588
        True        276     1155

Accuracy: 0.903
Precision:
        -----  -----
        False  0.992
        True   0.244
        -----  -----

Recall:
        -----  -----
        False  0.907
        True   0.807
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.42
        -----  -----

ROC-AUC:
        -----  -----
        False  0.928
        True   0.929
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.746     0.794  0.096
        True           0.471     0.819  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.121     0.973         0.98
        True           0.954     0.042         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.042     1            0.965
        True           0.953     0.054        0.969

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.042     1            0.965
        True           0.902     0.366        0.455

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.042     1            0.965
        True           0.259     0.898        0.153

Portuguese Wikipedia (ptwiki)

https://ores.wmflabs.org/v2/scores/ptwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: presort="auto", verbose=0, warm_start=false, min_samples_split=2, max_features="log2", random_state=null, max_leaf_nodes=null, loss="deviance", init=null, min_samples_leaf=1, subsample=1.0, center=true, n_estimators=700, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, balanced_sample=false, learning_rate=0.01, scale=true, max_depth=7
 - version: 0.3.0
 - trained: 2017-01-06T22:36:08.809695

Table:
                 ~False    ~True
        -----  --------  -------
        False     14777     3022
        True        370     1644

Accuracy: 0.829
Precision:
        -----  -----
        False  0.976
        True   0.352
        -----  -----

Recall:
        -----  -----
        False  0.83
        True   0.817
        -----  -----

PR-AUC:
        -----  -----
        False  0.985
        True   0.546
        -----  -----

ROC-AUC:
        -----  -----
        False  0.905
        True   0.907
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.721     0.757  0.097
        True           0.673     0.649  0.099

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.59      0.805         0.98
        True           0.952     0.035         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.056     0.999        0.903
        True           0.926     0.098        0.935

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.036     1            0.899
        True           0.701     0.602        0.456

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.036     1            0.899
        True           0.057     0.986        0.158

Russian Wikipedia (ruwiki)

https://ores.wmflabs.org/v2/scores/ruwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample=false, warm_start=false, min_weight_fraction_leaf=0.0, presort="auto", center=true, random_state=null, max_depth=5, loss="deviance", verbose=0, subsample=1.0, min_samples_leaf=1, balanced_sample_weight=true, init=null, max_leaf_nodes=null, min_samples_split=2, n_estimators=700, learning_rate=0.01, scale=true, max_features="log2"
 - version: 0.3.0
 - trained: 2017-01-06T22:52:45.773825

Table:
                 ~False    ~True
        -----  --------  -------
        False     15893     2796
        True        220      826

Accuracy: 0.847
Precision:
        -----  -----
        False  0.986
        True   0.229
        -----  -----

Recall:
        -----  -----
        False  0.85
        True   0.789
        -----  -----

PR-AUC:
        -----  -----
        False  0.991
        True   0.382
        -----  -----

ROC-AUC:
        -----  -----
        False  0.895
        True   0.897
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.767     0.709  0.097
        True           0.702     0.664  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.306     0.904         0.98
        True           0.929     0.054         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.062     1            0.949
        True           0.922     0.069        0.952

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.062     1            0.949
        True           0.855     0.283        0.461

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.062     1            0.949
        True           0.232     0.889        0.155

Swedish Wikipedia (svwiki)

https://ores.wmflabs.org/v2/scores/svwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: max_features="log2", warm_start=false, subsample=1.0, max_leaf_nodes=null, random_state=null, min_samples_split=2, n_estimators=500, init=null, max_depth=7, loss="deviance", learning_rate=0.01, scale=true, balanced_sample_weight=true, center=true, min_samples_leaf=1, verbose=0, min_weight_fraction_leaf=0.0, presort="auto", balanced_sample=false
 - version: 0.3.0
 - trained: 2017-01-06T23:13:40.455191

Table:
                 ~False    ~True
        -----  --------  -------
        False     37978     1244
        True        124      601

Accuracy: 0.966
Precision:
        -----  -----
        False  0.997
        True   0.326
        -----  -----

Recall:
        -----  -----
        False  0.968
        True   0.83
        -----  -----

PR-AUC:
        -----  -----
        False  0.995
        True   0.598
        -----  -----

ROC-AUC:
        -----  -----
        False  0.969
        True   0.971
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.761     0.904  0.093
        True           0.231     0.899  0.084

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.025     1            0.984
        True           0.971     0.165        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.024     1            0.983
        True           0.966     0.199        0.931

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.024     1            0.983
        True           0.834     0.691        0.47

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.024     1            0.983
        True           0.238     0.899        0.173

Turkish Wikipedia (trwiki)

https://ores.wmflabs.org/v2/scores/trwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: n_estimators=700, warm_start=false, learning_rate=0.01, max_features="log2", random_state=null, max_leaf_nodes=null, presort="auto", min_samples_leaf=1, min_samples_split=2, init=null, min_weight_fraction_leaf=0.0, center=true, scale=true, max_depth=7, balanced_sample=false, subsample=1.0, loss="deviance", verbose=0, balanced_sample_weight=true
 - version: 0.3.0
 - trained: 2017-01-06T23:19:07.190955

Table:
                 ~False    ~True
        -----  --------  -------
        False     14975     2496
        True        350     1910

Accuracy: 0.856
Precision:
        -----  -----
        False  0.977
        True   0.434
        -----  -----

Recall:
        -----  -----
        False  0.857
        True   0.845
        -----  -----

PR-AUC:
        -----  -----
        False  0.985
        True   0.554
        -----  -----

ROC-AUC:
        -----  -----
        False  0.916
        True   0.919
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.724     0.804  0.098
        True           0.712     0.738  0.099

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.586     0.842         0.98
        True           0.937     0.02          1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.101     0.99         0.901
        True           0.934     0.029        0.966

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.061     1            0.887
        True           0.591     0.808        0.452

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.061     1            0.887
        True           0.044     0.993        0.163

Ukrainian Wikipedia (ukwiki)

https://ores.wmflabs.org/v2/scores/ukwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: max_leaf_nodes=null, presort="auto", learning_rate=0.01, init=null, center=true, random_state=null, max_features="log2", verbose=0, min_samples_leaf=1, scale=true, warm_start=false, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, max_depth=7, balanced_sample=false, min_samples_split=2, loss="deviance", subsample=1.0, n_estimators=700
 - version: 0.3.0
 - trained: 2017-01-06T23:35:31.227174

Table:
                 ~False    ~True
        -----  --------  -------
        False     18519      928
        True        210      192

Accuracy: 0.943
Precision:
        -----  -----
        False  0.989
        True   0.172
        -----  -----

Recall:
        -----  -----
        False  0.952
        True   0.477
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.204
        -----  -----

ROC-AUC:
        -----  -----
        False  0.852
        True   0.853
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.935     0.511  0.087
        True           0.303     0.597  0.095

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.095     0.998        0.981
        True           0.938     0.073        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.059     1             0.98
        True           0.938     0.073         1

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.059     1            0.98
        True           0.874     0.132        0.494

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.059     1            0.98
        True           0.443     0.525        0.155

Vietnamese Wikipedia (viwiki)

https://ores.wmflabs.org/v2/scores/viwiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: verbose=0, n_estimators=700, scale=true, presort="auto", min_weight_fraction_leaf=0.0, max_leaf_nodes=null, min_samples_split=2, loss="deviance", min_samples_leaf=1, balanced_sample_weight=true, balanced_sample=false, center=true, subsample=1.0, init=null, max_depth=7, warm_start=false, learning_rate=0.01, random_state=null, max_features="log2"
 - version: 0.3.0
 - trained: 2017-01-07T00:21:37.945283

Table:
                 ~False    ~True
        -----  --------  -------
        False     90617     7589
        True        336     1458

Accuracy: 0.921
Precision:
        -----  -----
        False  0.996
        True   0.161
        -----  -----

Recall:
        -----  -----
        False  0.923
        True   0.813
        -----  -----

PR-AUC:
        -----  -----
        False  0.995
        True   0.457
        -----  -----

ROC-AUC:
        -----  -----
        False  0.956
        True   0.96
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.668     0.876  0.098
        True           0.415     0.86   0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.027      1           0.984
        True           0.966      0.14        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.027     1            0.984
        True           0.956     0.188        0.927

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.027     1            0.984
        True           0.892     0.415        0.463

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.027     1            0.984
        True           0.466     0.834        0.152

Wikidata (wikidatawiki)

https://ores.wmflabs.org/v2/scores/wikidatawiki/reverted?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: scale=true, balanced_sample=false, max_depth=7, max_features="log2", warm_start=false, min_samples_split=2, init=null, verbose=0, max_leaf_nodes=null, balanced_sample_weight=true, loss="deviance", center=true, subsample=1.0, learning_rate=0.1, random_state=null, presort="auto", min_weight_fraction_leaf=0.0, min_samples_leaf=1, n_estimators=700
 - version: 0.3.0
 - trained: 2017-01-07T00:46:53.835597

Table:
                 ~False    ~True
        -----  --------  -------
        False     11786     1035
        True        792    10819

Accuracy: 0.925
Precision:
        -----  -----
        False  0.937
        True   0.913
        -----  -----

Recall:
        -----  -----
        False  0.919
        True   0.932
        -----  -----

PR-AUC:
        -----  -----
        False  0.978
        True   0.972
        -----  -----

ROC-AUC:
        -----  -----
        False  0.977
        True   0.979
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.334     0.943  0.099
        True           0.397     0.948  0.099

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.886     0.799         0.98
        True           0.958     0.664         0.98

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.272     0.952        0.901
        True           0.422     0.945        0.9

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0             1        0.537
        True           0.001         1        0.516

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0             1        0.537
        True           0.001         1        0.516
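
The "Recall @ precision" tables above suggest how a quality control tool might pick a per-wiki cutoff for the `True` (reverted) label. The sketch below is one possible way to apply such cutoffs and is not taken from any existing tool. The threshold values are copied from the "Recall @ 0.9 precision" rows for the `True` label on this page. The function name `needs_review`, the dictionary name, and the choice of `>=` for the comparison are assumptions.

```python
# Thresholds for the True label at roughly 0.9 precision,
# copied from the tables on this page.
THRESHOLDS_AT_90_PRECISION = {
    "nlwiki": 0.943,
    "svwiki": 0.966,
    "wikidatawiki": 0.422,
}


def needs_review(wiki, p_true, thresholds=THRESHOLDS_AT_90_PRECISION):
    """Return True when the probability of the True label meets the wiki's threshold."""
    return p_true >= thresholds[wiki]


# The same score leads to different decisions on different wikis.
flags = [needs_review("wikidatawiki", 0.5), needs_review("svwiki", 0.5)]
```

Because the thresholds differ so much between wikis, a single global cutoff would trade precision for recall very differently in each context. Note also that the model predicts reverts, not vandalism, so a flagged edit still needs human judgment.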