Research:Revision scoring as a service/Word lists/la

ISO code: la
Language: Latina (Wikipedia)
Generated list: 250
Badwords: -
Informal words: -
Stopwords: -
Dictionary: -
Stemmer: -
Contact person: See: Word lists requested
Wiki labels (Interface): no
Wiki labels (Forms): no
Wiki labels (Campaign): no
Needs: -

Generated list [1]

Words in the generated list commonly appear in reverted revisions but not in others. This list is generated using a TF-IDF approach.

  1. adc
  2. addurl
  3. again
  4. ailelere
  5. aint
  6. alabamy
  7. aliens
  8. altars
  9. always
  10. answer
  11. anyhow
  12. ask
  13. ass
  14. ate
  15. audivi
  16. awake
  17. bacon
  18. been
  19. behold
  20. beneath
  21. bilindi
  22. birthright
  23. bless
  24. block
  25. bother
  26. bounties
  27. breaks
  28. breathe
  29. breeze
  30. broader
  31. broken
  32. brought
  33. but
  34. carry
  35. caveat
  36. charlotte
  37. children
  38. circumcisio
  39. claim
  40. clean
  41. climb
  42. clime
  43. colossea
  44. columnationes
  45. columnatum
  46. coming
  47. consiliabatur
  48. continue
  49. cool
  50. corinne
  51. dandy
  52. ddw
  53. degaen
  54. delectare
  55. delikli
  56. deliveraverunt
  57. dervi
  58. did
  59. didn
  60. dog
  61. doodle
  62. doubt
  63. eats
  64. encircling
  65. enormitate
  66. even
  67. ever
  68. every
  69. everywhere
  70. excited
  71. feather
  72. feeling
  73. fixed
  74. folks
  75. forget
  76. foroa
  77. fortunate
  78. forumul
  79. fought
  80. fterran
  81. fuck
  82. gay
  83. get
  84. ghost
  85. give
  86. giving
  87. gladly
  88. glorious
  89. gone
  90. googol
  91. gracious
  92. guarded
  93. guess
  94. guugal
  95. hail
  96. halls
  97. has
  98. hasty
  99. height
  100. hekim
  101. help
  102. helpful
  103. her
  104. here
  105. him
  106. hoca
  107. hollywood
  108. homework
  109. ibendi
  110. impearled
  111. innovative
  112. inscribe
  113. iskilipli
  114. iudicesque
  115. just
  116. kablosu
  117. kapat
  118. keep
  119. know
  120. kten
  121. lerdir
  122. like
  123. lol
  124. look
  125. looks
  126. lordy
  127. macaroni
  128. make
  129. meseleleri
  130. millionaires
  131. mister
  132. mockbuster
  133. mom
  134. money
  135. montezuma
  136. moral
  137. mortal
  138. much
  139. mudkipz
  140. native
  141. nazi
  142. neath
  143. neden
  144. nein
  145. off
  146. okr
  147. olan
  148. ooh
  149. orr
  150. overflow
  151. owned
  152. pagename
  153. pampered
  154. partake
  155. pelican
  156. penis
  157. pick
  158. pig
  159. piglet
  160. pilgrims
  161. pingues
  162. plug
  163. pot
  164. prolong
  165. protecta
  166. pudding
  167. quaesitoria
  168. rapture
  169. really
  170. referrit
  171. remember
  172. rills
  173. rips
  174. romanvm
  175. rostrorum
  176. rummage
  177. safeguard
  178. sale
  179. sanalritim
  180. says
  181. schelp
  182. scheme
  183. semi
  184. setting
  185. share
  186. sheet
  187. shit
  188. shoals
  189. should
  190. silence
  191. sires
  192. slapping
  193. smite
  194. snake
  195. soars
  196. southland
  197. spangled
  198. spark
  199. speeds
  200. spider
  201. spoon
  202. stallion
  203. starry
  204. sterge
  205. strife
  206. stuck
  207. stultissimus
  208. stupid
  209. sunny
  210. sure
  211. swampers
  212. swastika
  213. sway
  214. swell
  215. take
  216. tarragona
  217. taught
  218. taxman
  219. tell
  220. tells
  221. templed
  222. termos
  223. thank
  224. themselves
  225. then
  226. they
  227. theyre
  228. thick
  229. think
  230. this
  231. thrills
  232. thy
  233. tongues
  234. tried
  235. triumphant
  236. tropic
  237. truth
  238. unfurled
  239. upward
  240. usagod
  241. usawikipedia
  242. very
  243. visible
  244. vomitoria
  245. vomitorium
  246. vulgatur
  247. wants
  248. was
  249. watergate
  250. went
  251. wheels
  252. when
  253. wick
  254. wikiquette
  255. xbar
  256. yapm
  257. yea
  258. yeah
  259. yerinde
  260. yoh
  261. you
  262. your
  263. zamanda
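
The page only says that the list above comes from "a TF-IDF approach"; the exact procedure is not given here. As a rough sketch of how such a ranking could work, the Python below treats every revision as a document, takes term frequency from the reverted revisions only, and weights it by inverse document frequency over all revisions. The function names (`tokenize`, `reverted_word_scores`, `top_words`) and the tokenization rule are assumptions made for illustration, not the code that produced this list.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (an assumed rule)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def reverted_word_scores(reverted, others):
    """Score each word seen in reverted revisions with a TF-IDF style weight.

    reverted, others: lists of revision texts (hypothetical inputs).
    TF is the word's share of all tokens in reverted revisions.
    IDF is log(total revisions / revisions containing the word).
    """
    reverted_docs = [tokenize(text) for text in reverted]
    all_docs = reverted_docs + [tokenize(text) for text in others]

    # Document frequency: in how many revisions does each word occur?
    doc_freq = Counter()
    for doc in all_docs:
        doc_freq.update(set(doc))

    # Term frequency, counted over reverted revisions only.
    term_freq = Counter()
    for doc in reverted_docs:
        term_freq.update(doc)
    total_tokens = sum(term_freq.values())

    return {
        word: (count / total_tokens) * math.log(len(all_docs) / doc_freq[word])
        for word, count in term_freq.items()
    }


def top_words(scores, limit=250):
    """Return the highest-scoring words, breaking ties alphabetically."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:limit]
```

Under this weighting, a word that occurs in every revision scores zero, and a word that is frequent in reverted revisions but rare elsewhere rises to the top, which matches the description of the generated list.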
Generated common words

Common words appear in all revisions, whether reverted or not. In English, this would include words like 'the' or 'is', which carry little meaning on their own. This list is generated using a TF-IDF approach.

  1. administratio
  2. aedificia
  3. alia
  4. altitudo
  5. and
  6. anglice
  7. annexa
  8. anni
  9. anno
  10. annum
  11. antiqua
  12. aprilis
  13. apud
  14. area
  15. arms
  16. asp
  17. atque
  18. auctores
  19. augusti
  20. aut
  21. autem
  22. bibliographia
  23. bio
  24. blason
  25. books
  26. capsa
  27. carolus
  28. categoria
  29. category
  30. catholica
  31. circa
  32. circiter
  33. cives
  34. civitates
  35. civitatum
  36. clari
  37. class
  38. code
  39. collocatio
  40. com
  41. commune
  42. communia
  43. communiacat
  44. communis
  45. cuius
  46. cum
  47. decembris
  48. defaultsort
  49. deinde
  50. del
  51. der
  52. des
  53. descriptione
  54. desiderata
  55. desiderati
  56. despectus
  57. dicata
  58. die
  59. diescensusincolarum
  60. diessollemnis
  61. discretiva
  62. div
  63. dubsig
  64. ecclesia
  65. ecclesiae
  66. ehess
  67. eius
  68. erat
  69. esse
  70. est
  71. etiam
  72. externi
  73. externus
  74. false
  75. fasciculus
  76. februarii
  77. fiche
  78. file
  79. finium
  80. foederatarum
  81. fontes
  82. fractiones
  83. francia
  84. fuit
  85. gallery
  86. genetivusnominis
  87. geographia
  88. google
  89. gov
  90. graduslatitudinis
  91. graduslongitudinis
  92. haec
  93. historia
  94. hoc
  95. hodie
  96. homines
  97. hominis
  98. htm
  99. html
  100. http
  101. huius
  102. ianuarii
  103. image
  104. imago
  105. incolae
  106. incolarum
  107. index
  108. indicem
  109. inscriptioimaginis
  110. insigne
  111. inter
  112. interretialis
  113. ioannes
  114. isbn
  115. italia
  116. italiae
  117. italiana
  118. italiane
  119. italy
  120. iulii
  121. iunii
  122. joohr
  123. jpg
  124. latina
  125. latine
  126. latinitas
  127. left
  128. lifetime
  129. ling
  130. lingua
  131. link
  132. loci
  133. maii
  134. map
  135. maria
  136. martii
  137. minutalatitudinis
  138. minutalongitudinis
  139. mortui
  140. mortuus
  141. municipii
  142. municipio
  143. municipium
  144. myrias
  145. name
  146. nati
  147. natus
  148. nexus
  149. nomen
  150. nomenincolarum
  151. nomenitalianum
  152. nomenlatinum
  153. nomenlingualoci
  154. nomina
  155. nomine
  156. non
  157. notae
  158. nova
  159. novembris
  160. numerusincolarum
  161. numerustributarius
  162. octobris
  163. oeconomia
  164. old
  165. olim
  166. onepage
  167. opera
  168. oppidum
  169. org
  170. pagina
  171. paginainterretialis
  172. pars
  173. patronus
  174. per
  175. petrus
  176. php
  177. pinacotheca
  178. png
  179. post
  180. postea
  181. praefixumtelephonicum
  182. primum
  183. pro
  184. province
  185. provincia
  186. publicus
  187. quae
  188. quam
  189. qui
  190. quo
  191. quod
  192. ref
  193. references
  194. regio
  195. region
  196. regione
  197. regnum
  198. rerum
  199. res
  200. resultat
  201. rex
  202. right
  203. romana
  204. saeculo
  205. san
  206. sancti
  207. sanctus
  208. secundalatitudinis
  209. secundalongitudinis
  210. secundum
  211. sed
  212. select
  213. sententia
  214. septembris
  215. seu
  216. sib
  217. siglaprovinciae
  218. siglaregionis
  219. sine
  220. situm
  221. situs
  222. sive
  223. small
  224. stipula
  225. sunt
  226. svg
  227. tabula
  228. terra
  229. the
  230. thinsp
  231. thumb
  232. ubi
  233. una
  234. universitatis
  235. urbe
  236. urbes
  237. urbis
  238. urbs
  239. usque
  240. vel
  241. vexillum
  242. vici
  243. vicidata
  244. victio
  245. vide
  246. ville
  247. vita
  248. vivi
  249. vol
  250. www
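
The common-words list is described only as TF-IDF based. One way to approximate it, sketched below in Python, is to keep words that occur in both reverted and non-reverted revisions and rank them by the lowest inverse document frequency, so the most widespread words come first. The function names and the tokenization rule are assumptions for illustration, not the procedure actually used for this page.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (an assumed rule)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def common_words(reverted, others, limit=250):
    """Return words found in both revision classes, lowest IDF first.

    reverted, others: lists of revision texts (hypothetical inputs).
    """
    reverted_docs = [set(tokenize(text)) for text in reverted]
    other_docs = [set(tokenize(text)) for text in others]

    # Document frequency per class.
    df_reverted = Counter(word for doc in reverted_docs for word in doc)
    df_others = Counter(word for doc in other_docs for word in doc)

    n_docs = len(reverted_docs) + len(other_docs)
    shared = set(df_reverted) & set(df_others)

    # A low IDF means the word occurs in a large share of all revisions.
    idf = {
        word: math.log(n_docs / (df_reverted[word] + df_others[word]))
        for word in shared
    }
    return sorted(idf, key=lambda w: (idf[w], w))[:limit]
```

Template and markup tokens such as `http`, `ref` or `jpg` would naturally surface under this ranking, since they occur in nearly every revision regardless of whether it was reverted.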

Bad words

Bad words are unwelcome on any page. These include curse words, spam, and other content that would be reverted regardless of where it is inserted.

Needs bad words... Use |list-badwords=

Informal words

Informal words are unwelcome in the article namespace but acceptable on talk pages. Examples include 'hello' or 'hahaha', which are fine in discussions but not in articles.

Needs informal words... Use |list-informal=
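
The difference between the two list types can be illustrated with a short Python sketch: bad words are flagged in every namespace, while informal words are flagged only in the article namespace (namespace 0 in MediaWiki). The function names are hypothetical, and the word lists in the usage note are placeholders, since no Latin bad or informal words have been supplied on this page yet.

```python
import re

ARTICLE_NAMESPACE = 0  # MediaWiki's main (article) namespace


def build_matcher(words):
    """Compile a case-insensitive whole-word regex, or None for an empty list."""
    if not words:
        return None
    # Longer words first so that longer alternatives are tried before shorter ones.
    ordered = sorted(words, key=lambda w: (-len(w), w))
    alternation = "|".join(re.escape(word) for word in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def flag_words(added_text, namespace, badwords, informal):
    """Return flagged words in added_text: bad words first, then informal ones.

    Bad words are flagged everywhere; informal words only in articles.
    """
    flagged = []
    bad_matcher = build_matcher(badwords)
    if bad_matcher is not None:
        flagged += [m.group(0).lower() for m in bad_matcher.finditer(added_text)]

    informal_matcher = build_matcher(informal)
    if informal_matcher is not None and namespace == ARTICLE_NAMESPACE:
        flagged += [m.group(0).lower() for m in informal_matcher.finditer(added_text)]
    return flagged
```

For example, with the placeholder lists `{"placeholdercurse"}` and `{"hello", "hahaha"}`, the text "Hello, this is placeholdercurse" would have both words flagged in namespace 0, but only the bad word flagged on a talk page (namespace 1).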